Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition

  • Conference paper
Intelligent Human Computer Interaction (IHCI 2018)

Abstract

Automatic Speech Recognition (ASR) is the computer-driven transcription of spoken language into human-readable text. This paper focuses on the development of an acoustic model for a medium-vocabulary, context-independent, isolated Malayalam speech recognizer using Hidden Markov Models (HMMs). In this work, the emission probabilities of the syllable-based HMMs are estimated with Gaussian Mixture Models (GMMs). Mel Frequency Cepstral Coefficients (MFCCs) are used for feature extraction from the input speech. The mixture weights of the GMMs are generated using the Dirichlet distribution. The efficiency of the resulting GMMs is verified with several information criteria: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the corrected AIC (AICc), the Kullback Linear Information Criterion (KIC), the corrected KIC (KICc), and the approximated KIC (AKICc). For a single Gaussian, the accuracy on the medium-vocabulary, speaker-dependent, isolated Malayalam speech corpus is 90.91% and the Word Error Rate (WER) is 11.9%. The word accuracy and WER of the system are calculated from experiments conducted with multivariate Gaussians. With five Gaussian mixtures, a better word accuracy of 95.24% and a WER of 4.76% are attained, and this result is verified using the information criteria.
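The two ideas in the abstract, drawing GMM mixture weights from a Dirichlet distribution and scoring candidate models with information criteria, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the symmetric concentration parameter `alpha`, the diagonal-covariance parameter count, and the 39-dimensional feature size are all assumptions made for illustration. Only AIC, AICc, and BIC are shown, using their standard textbook forms; the KIC variants are left out.

```python
import numpy as np


def dirichlet_weights(n_components, alpha=1.0, seed=0):
    """Draw one set of mixture weights from a symmetric Dirichlet.

    The weights are non-negative and sum to 1, so they can be used
    directly as GMM mixing coefficients. `alpha` is a hypothetical
    concentration value, not one taken from the paper.
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.full(n_components, alpha))


def gmm_param_count(n_components, dim):
    """Free parameters of a GMM, assuming diagonal covariances.

    (M - 1) free weights + M * dim means + M * dim variances.
    """
    return (n_components - 1) + 2 * n_components * dim


def information_criteria(log_likelihood, k, n):
    """Standard AIC, AICc, and BIC for k parameters and n observations."""
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = k * np.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


if __name__ == "__main__":
    weights = dirichlet_weights(5)
    k = gmm_param_count(5, dim=39)  # 39-dim features are an assumption
    # The log-likelihood below is a placeholder, not a measured value.
    scores = information_criteria(log_likelihood=-12000.0, k=k, n=5000)
    print(weights, k, scores)
```

Under each criterion, the candidate with the lowest score is preferred, so comparing scores across different numbers of mixture components gives a model-selection rule of the kind the abstract describes.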

Kerala State Council for Science, Technology and Environment (KSCSTE).

L. Krishna Ramachandran—Research Scholar

S. Elizabeth—Professor

Acknowledgment

This research is supported by the Kerala State Council for Science, Technology and Environment (KSCSTE). I thank KSCSTE for funding the project under the Back-to-lab scheme.

Corresponding author

Correspondence to Lekshmi Krishna Ramachandran.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Krishna Ramachandran, L., Elizabeth, S. (2018). Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition. In: Tiwary, U. (eds) Intelligent Human Computer Interaction. IHCI 2018. Lecture Notes in Computer Science, vol 11278. Springer, Cham. https://doi.org/10.1007/978-3-030-04021-5_11

  • DOI: https://doi.org/10.1007/978-3-030-04021-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04020-8

  • Online ISBN: 978-3-030-04021-5

  • eBook Packages: Computer Science (R0)