Secure speech-recognition data transfer in the internet of things using a power system and a tried-and-true key generation technique

Zhe Wang¹, Shuangbai He¹ & Guoan Li²


Abstract

Secure Speech Recognition (SR) means accurately capturing and interpreting spoken language while applying security mechanisms that protect the privacy, confidentiality, and integrity of Speech Data (SD). As the Internet of Things (IoT) evolves rapidly, integrating SR capabilities into IoT devices is becoming increasingly important, yet securing private SD after integration remains a critical concern. Two practical obstacles stand out: the diversity of device types complicates integrating an SR model with IoT devices, and maintaining the security and privacy of SD in real IoT deployments is difficult. Seamless interoperability and robust security measures are therefore essential. We introduce the Reptile Search Optimized Hidden Markov Model (RSO-HMM) for SR, which uses the extracted features as its speech input. SD is gathered from speakers with varied linguistic backgrounds to improve the accuracy of the SR system. Preprocessing applies Z-score normalization to improve robustness and reduce the effect of outliers. The Perceptual Linear Prediction (PLP) technique then extracts the essential acoustic features from the speech signal. For data security, Elliptic Curve Cryptography (ECC) is used for encryption because it suits resource-constrained scenarios. The SR system is evaluated on accuracy, precision, recall, and F1 score, and it achieves an accuracy of 96%. The primary objective is to assess how capably and reliably the system transcribes speech signals.
By combining the RSO-HMM for SR, data preprocessing, and ECC encryption, this study supports wider adoption of SR technology within the IoT ecosystem. Addressing these critical data security concerns paves the way for a safer and more efficient interconnected society and encourages broader use of SR technology across applications.
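
As an illustration only, two of the more mechanical steps named in the abstract, Z-score normalization and the accuracy, precision, recall, and F1 metrics, can be sketched in Python. The function names and toy inputs below are hypothetical, the metrics are shown for the binary case, and nothing here should be read as the authors' implementation.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    """Rescale a feature to zero mean and unit population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread; map it to zeros (an assumption).
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators by reporting 0.0 (a common convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy feature values and toy labels, invented for this sketch.
    print(zscore_normalize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
    print(binary_metrics([1, 1, 1, 0, 0, 1], [1, 0, 1, 0, 1, 1]))
```

A multi-class SR evaluation would typically average these per-class values (macro or weighted); the abstract does not say which averaging was used.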

Privacy-Preserving Speech Recognition

Treating Speech as Personally Identifiable Information and Its Impact in Machine Translation

Encrypted Transmission Method of Network Speech Recognition Information Based on Big Data Analysis

Acknowledgements

This work was supported by the key information project of the China Southern Power Grid Energy Research Institute: Power planning basic database v1.0 construction project of China Southern Power Grid (0006200000081599).

Author information

Authors and Affiliations

Energy Research Institute of China Southern Power Grid Company Limited, Guangzhou, 510663, Guangdong, China
Zhe Wang & Shuangbai He
Hainan Power Grid Co., Ltd., Sanya Power Supply Bureau, Sanya, 572099, Hainan, China
Guoan Li

Contributions

All authors contributed to the study conception and design. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhe Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Z., He, S. & Li, G. Secure speech-recognition data transfer in the internet of things using a power system and a tried-and-true key generation technique. Cluster Comput 27, 14669–14684 (2024). https://doi.org/10.1007/s10586-024-04649-3

Received: 28 September 2023
Revised: 17 January 2024
Accepted: 02 February 2024
Published: 29 July 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s10586-024-04649-3
