Abstract
In this paper, we present a new method of language detection based on language-pair discrimination, using neural networks as classifiers of acoustic features. No acoustic decomposition of the speech signal is needed. We present an improvement of our method applied to the detection of English for signal durations of less than 3 seconds (CallFriend corpus), as well as a comparison with a neural predictive model. The results show scores ranging from 74.7% to 76.9%, depending on the method used.
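The idea of language-pair discrimination can be illustrated with a minimal sketch. It is not the implementation described in the paper, and everything in it is an assumption made for illustration: a single sigmoid unit stands in for the neural network, synthetic 2-D Gaussian vectors stand in for acoustic feature frames, and the minimum-based fusion of pair scores, the 0.5 threshold, and all names (`train_pair_discriminator`, `utterance_score`, `detect_target`, `lang_a`, `lang_b`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_pair_discriminator(target_frames, rival_frames, epochs=30, lr=0.1, seed=0):
    """Train one sigmoid unit by SGD to separate target frames (label 1)
    from rival-language frames (label 0). Assumed stand-in for a neural network."""
    rng = random.Random(seed)
    dim = len(target_frames[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1.0) for x in target_frames] + [(x, 0.0) for x in rival_frames]
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # cross-entropy gradient w.r.t. the pre-activation
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def utterance_score(frames, model):
    # Average the frame-level posteriors over the whole utterance.
    w, b = model
    total = sum(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in frames)
    return total / len(frames)


def detect_target(frames, pair_models, threshold=0.5):
    # Assumed fusion rule: the target must win every pairwise contest,
    # so the weakest pair score decides.
    fused = min(utterance_score(frames, m) for m in pair_models)
    return fused, fused > threshold


def synthetic_frames(center, n, rng, spread=0.5):
    # Synthetic feature vectors; real systems would use acoustic features.
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(42)
english = synthetic_frames([1.0, 1.0], 200, rng)
rivals = {
    "lang_a": synthetic_frames([-1.0, 1.0], 200, rng),
    "lang_b": synthetic_frames([1.0, -1.0], 200, rng),
}
# One discriminator per (English, rival) language pair.
pair_models = [train_pair_discriminator(english, r) for r in rivals.values()]

test_english = synthetic_frames([1.0, 1.0], 50, rng)   # short "utterance"
test_other = synthetic_frames([-1.0, 1.0], 50, rng)
score_en, is_en = detect_target(test_english, pair_models)
score_other, is_other = detect_target(test_other, pair_models)
print(f"English utterance: score={score_en:.3f}, detected={is_en}")
print(f"Other utterance:   score={score_other:.3f}, detected={is_other}")
```

In this sketch each discriminator only has to separate the target language from one rival, which keeps every individual classification problem small; how the pair outputs are actually combined in the paper is not stated in the abstract.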
An erratum to this chapter can be found at http://dx.doi.org/10.1007/11550907_163 .
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Herry, S. (2005). Improvement in Language Detection by Neural Discrimination in Comparison with Predictive Models. In: Duch, W., Kacprzyk, J., Oja, E., Zadrożny, S. (eds) Artificial Neural Networks: Formal Models and Their Applications – ICANN 2005. ICANN 2005. Lecture Notes in Computer Science, vol 3697. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550907_127
DOI: https://doi.org/10.1007/11550907_127
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28755-1
Online ISBN: 978-3-540-28756-8
eBook Packages: Computer Science, Computer Science (R0)