Automatic Generation of a Pronunciation Dictionary with Rich Variation Coverage Using SMT Methods

Panagiota Karanasou & Lori Lamel

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6609)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Constructing a pronunciation lexicon with variants in a fully automatic and language-independent way is a challenging task with many uses in human language technologies. Moreover, with the growing use of web data, there is a recurrent need to add words to existing pronunciation lexicons, and an automatic method can greatly reduce the effort required to generate pronunciations for these out-of-vocabulary words. In this paper, a machine translation approach is used to perform grapheme-to-phoneme (g2p) conversion, the task of finding the pronunciation of a word from its written form. Two alternative methods are proposed to derive pronunciation variants. In the first, an n-best pronunciation list is extracted directly from the g2p converter. The second is a novel method based on a pivot approach, traditionally used for the paraphrase extraction task, and applied as a post-processing step to the g2p converter. The performance of these two methods is compared under different training conditions. The range of applications that require pronunciation lexicons is discussed, and the generated pronunciations are further tested in preliminary automatic speech recognition experiments.
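The two variant-generation ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model details, so the sketch assumes the generic pivot formulation from paraphrase extraction, where two target-side units count as paraphrases when they share a source-side pivot, scored as p(v | u) = Σ_g p(v | g) · p(g | u). Here the pivot g is a grapheme unit and u, v are phoneme units. The tables `G2P` and `P2G`, their entries, and the function names `n_best` and `pivot_variants` are all hypothetical and invented for illustration.

```python
from collections import defaultdict
import heapq

# Hypothetical toy tables: every entry is invented for illustration only.
# G2P[g][p] stands for p(phoneme unit p | grapheme unit g).
G2P = {
    "ough": {"ah f": 0.5, "ow": 0.3, "uw": 0.2},
    "uff": {"ah f": 0.9, "uh f": 0.1},
}
# P2G[p][g] stands for p(grapheme unit g | phoneme unit p).
P2G = {
    "ah f": {"ough": 0.4, "uff": 0.6},
    "ow": {"ough": 1.0},
    "uw": {"ough": 1.0},
    "uh f": {"uff": 1.0},
}


def n_best(grapheme_unit, n):
    """Method 1 (sketch): keep the n highest-scoring pronunciations."""
    candidates = G2P.get(grapheme_unit, {})
    return heapq.nlargest(n, candidates.items(), key=lambda kv: kv[1])


def pivot_variants(phoneme_unit):
    """Method 2 (sketch): find variants of a pronunciation via grapheme pivots.

    Scores each alternative v as sum over pivots g of p(v | g) * p(g | u).
    """
    scores = defaultdict(float)
    for pivot, p_pivot in P2G.get(phoneme_unit, {}).items():
        for other, p_other in G2P.get(pivot, {}).items():
            if other != phoneme_unit:  # skip the trivial self-paraphrase
                scores[other] += p_other * p_pivot
    # Highest-scoring variants first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(n_best("ough", 2))
    print(pivot_variants("ah f"))
```

With these toy numbers, `n_best("ough", 2)` keeps "ah f" and "ow", while `pivot_variants("ah f")` ranks "ow" ahead of "uw" and "uh f". The pivot step can surface a variant such as "uh f" that never appears in the n-best list for "ough", which is the kind of extra coverage a post-processing step can add under these assumptions.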

Author information

Authors and Affiliations

Spoken Language Processing Group, LIMSI-CNRS, 91403, Orsay, France
Panagiota Karanasou & Lori Lamel

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Karanasou, P., Lamel, L. (2011). Automatic Generation of a Pronunciation Dictionary with Rich Variation Coverage Using SMT Methods. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19437-5_42

DOI: https://doi.org/10.1007/978-3-642-19437-5_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19436-8
Online ISBN: 978-3-642-19437-5
eBook Packages: Computer Science, Computer Science (R0)
