Development and Evaluation of Julius-Compatible Interface for Kaldi ASR

  • Conference paper
Advances in Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2017)

Abstract

In recent years, the use of Kaldi has grown rapidly because it has adopted a succession of DNN-based speech recognition technologies and has shown high recognition performance. Meanwhile, the speech recognition engine Julius has been widely used, especially in Japan, and is attracting further attention now that DNN-HMM is implemented in it. In this paper, we describe the design of interfaces that make the Kaldi speech recognition engine compatible with Julius, give a system overview, and detail the speech input unit and the recognition result output unit. We also describe the functions that we plan to implement.

Acknowledgment

Part of this work was supported by JSPS KAKENHI Grant Numbers JP26280055 and JP15H02720.

Corresponding author

Correspondence to Yusuke Yamada.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Yamada, Y., Nose, T., Chiba, Y., Ito, A., Shinozaki, T. (2018). Development and Evaluation of Julius-Compatible Interface for Kaldi ASR. In: Pan, JS., Tsai, PW., Watada, J., Jain, L. (eds) Advances in Intelligent Information Hiding and Multimedia Signal Processing. IIH-MSP 2017. Smart Innovation, Systems and Technologies, vol 82. Springer, Cham. https://doi.org/10.1007/978-3-319-63859-1_12

  • DOI: https://doi.org/10.1007/978-3-319-63859-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63858-4

  • Online ISBN: 978-3-319-63859-1

  • eBook Packages: Engineering (R0)