Animated Pronunciation Generated from Speech for Pronunciation Training

  • Conference paper
Intelligent Interactive Multimedia: Systems and Services

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 14)

Abstract

Computer-assisted pronunciation training (CAPT) has been introduced into language education in recent years. CAPT scores the learner’s pronunciation quality and points out mispronounced phonemes by using speech recognition technology. However, although the learner can thus recognize that his/her speech differs from the teacher’s, the learner still cannot control the articulatory organs to pronounce correctly. The learner does not know precisely how to correct the wrong articulatory gestures. We show these differences by visualizing both the learner’s incorrect pronunciation movements and the correct pronunciation movements with CG animation. We propose a system that generates animated pronunciation by automatically estimating a learner’s pronunciation movements from his/her speech. The proposed system uses multi-layer neural networks (MLN) to map speech to the coordinate values needed to generate the animations. We use MRI data to generate smooth animated pronunciations. Additionally, we verify through experimental evaluation whether the vocal tract area and articulatory features are suitable as characteristics of pronunciation movement.
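
The speech-to-coordinate mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature size, hidden-layer width, number of contour points, and the synthetic training data below are all assumptions chosen only to show how a small multi-layer network can regress per-frame coordinate values from per-frame speech features.

```python
import numpy as np

# All sizes below are hypothetical; the abstract does not specify them.
N_FEATS = 39    # assumed acoustic feature dimension per speech frame
N_HIDDEN = 64   # assumed hidden-layer width
N_POINTS = 10   # assumed number of articulator contour points
N_OUT = 2 * N_POINTS  # one (x, y) pair per contour point


def init_params(rng):
    """Create small random weights for a one-hidden-layer network."""
    return {
        "W1": rng.normal(0.0, 0.1, (N_FEATS, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(0.0, 0.1, (N_HIDDEN, N_OUT)),
        "b2": np.zeros(N_OUT),
    }


def forward(params, x):
    """Return hidden activations and predicted flat coordinate vectors."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    y = h @ params["W2"] + params["b2"]
    return h, y


def train_step(params, x, target, lr=0.5):
    """One full-batch gradient-descent step on mean squared error."""
    h, y = forward(params, x)
    err = y - target
    loss = float(np.mean(err ** 2))
    grad_y = 2.0 * err / err.size           # d(loss)/dy
    grad_h = grad_y @ params["W2"].T        # back through output layer
    grad_pre = grad_h * (1.0 - h ** 2)      # back through tanh
    params["W2"] -= lr * (h.T @ grad_y)
    params["b2"] -= lr * grad_y.sum(axis=0)
    params["W1"] -= lr * (x.T @ grad_pre)
    params["b1"] -= lr * grad_pre.sum(axis=0)
    return loss


def predict_coordinates(params, frames):
    """Map speech frames to an array of shape (frames, points, 2)."""
    _, y = forward(params, frames)
    return y.reshape(len(frames), N_POINTS, 2)


rng = np.random.default_rng(0)
params = init_params(rng)

# Synthetic stand-ins for (speech features, coordinate targets) pairs.
frames = rng.normal(size=(200, N_FEATS))
targets = np.tanh(frames @ rng.normal(0.0, 0.2, (N_FEATS, N_OUT)))

losses = [train_step(params, frames, targets) for _ in range(200)]
coords = predict_coordinates(params, frames[:5])
print(coords.shape, losses[0] > losses[-1])
```

In a real system the targets would come from measured articulator positions (the abstract mentions MRI data) rather than synthetic values, and the predicted per-frame coordinates would drive the animation.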

Author information

Correspondence to Yurie Iribe.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Iribe, Y., Manosavan, S., Katsurada, K., Nitta, T. (2012). Animated Pronunciation Generated from Speech for Pronunciation Training. In: Watanabe, T., Watada, J., Takahashi, N., Howlett, R., Jain, L. (eds) Intelligent Interactive Multimedia: Systems and Services. Smart Innovation, Systems and Technologies, vol 14. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29934-6_8

  • DOI: https://doi.org/10.1007/978-3-642-29934-6_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29933-9

  • Online ISBN: 978-3-642-29934-6

  • eBook Packages: Engineering, Engineering (R0)
