A New Lightweight CRNN Model for Keyword Spotting with Edge Computing Devices

Yungen Wei¹², Zheng Gong¹², Shunzhi Yang¹², Kai Ye¹² & Yamin Wen¹³

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12486)

Included in the following conference series:

International Conference on Machine Learning for Cyber Security

Abstract

Keyword Spotting (KWS) is a significant branch of Automatic Speech Recognition (ASR) and is widely used in edge computing devices. The goal of KWS is to provide high accuracy at a low false alarm rate (FAR) while reducing the costs of memory, computation, and latency. However, the limited resources of edge computing devices make KWS applications challenging. Lightweight deep learning models and structures have achieved good results in KWS, maintaining high accuracy with low computational cost and low latency. In this paper, we present a new Convolutional Recurrent Neural Network (CRNN) architecture named EdgeCRNN for edge computing devices. EdgeCRNN is based on depthwise separable convolution (DSC) and a residual structure, and it uses a feature enhancement method. Experimental results on the Google Speech Commands Dataset show that EdgeCRNN processes 11.1 audio samples per second on a Raspberry Pi 3B+, 2.2 times the rate of Tpool2. Compared with Tpool2, EdgeCRNN reaches an accuracy of 98.05% while remaining competitive in performance.
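
To illustrate why depthwise separable convolution suits resource-limited devices, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (32 input channels, 64 output channels, 3 x 3 kernel) are hypothetical choices for illustration; they are not taken from EdgeCRNN, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 32, 64, 3
    std = standard_conv_params(c_in, c_out, k)
    dsc = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {dsc}, ratio: {dsc / std:.3f}")
```

For these assumed sizes the separable layer needs 2336 weights against 18432 for the standard layer. In general the ratio is 1/c_out + 1/k², so the saving grows with the number of output channels and the kernel size.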

This paper is supported by the National Natural Science Foundation of China (No. 61572028), the National Cryptography Development Fund (No. MMJJ20180206), the Project of Science and Technology of Guangzhou (No. 201802010044), and the Guangdong Basic and Applied Basic Research Foundation (No. 2019A1515011797).

Notes

1. https://github.com/genty1314/KWS.git

Author information

Authors and Affiliations

School of Computer Science, South China Normal University, Guangzhou, China
Yungen Wei, Zheng Gong, Shunzhi Yang & Kai Ye
School of Statistics and Mathematics, Guangdong University of Finance and Economics, Guangzhou, China
Yamin Wen

Corresponding author

Correspondence to Yamin Wen.

Editor information

Editors and Affiliations

Xidian University, Xi'an, China
Xiaofeng Chen
Guangzhou University, Guangzhou, China
Hongyang Yan
Michigan State University, East Lansing, MI, USA
Qiben Yan
Division of Computer, Electrical and Mathematical Sciences and Engineering, King Abdullah University of Science, Thuwal, Saudi Arabia
Xiangliang Zhang

About this paper

Cite this paper

Wei, Y., Gong, Z., Yang, S., Ye, K., Wen, Y. (2020). A New Lightweight CRNN Model for Keyword Spotting with Edge Computing Devices. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science, vol 12486. Springer, Cham. https://doi.org/10.1007/978-3-030-62223-7_17

DOI: https://doi.org/10.1007/978-3-030-62223-7_17
Published: 11 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62222-0
Online ISBN: 978-3-030-62223-7
eBook Packages: Computer Science, Computer Science (R0)