Abstract
This paper compares three competitive learning methods for efficient vector quantization of speech data. The algorithms analyzed were two batch methods (the Lloyd LBG algorithm and the Neural Gas method) and one on-line technique (the K-means algorithm). These methods produce reduced sets of codewords that represent larger data sets. The experiments covered speaker-dependent and speaker-independent tests and consisted of evaluating the reduced training files for speech recognition. In all the cases studied, the results showed a reduction in the number of learning patterns of nearly two orders of magnitude with respect to the original training sets, without heavily affecting speech recognition accuracy. Given the time savings obtained with these quantization techniques, we consider the reduction results excellent, since they help bring speech matching response times close to real time. The main contribution of this work is an original use of competitive learning techniques for efficient vector quantization of speech data, and thus for reducing the memory size and computational cost of a speech recognizer.
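To make the idea of replacing a large training set with a small codebook concrete, the following is a minimal sketch, not the authors' implementation. It contrasts a batch Lloyd-style update (the iteration at the core of LBG) with an on-line, MacQueen-style K-means update. The function names, the squared Euclidean distortion, the random initialization, and the synthetic 2-D data are all assumptions made for illustration; the paper's actual speech features and parameter settings are not reproduced here, and Neural Gas is omitted.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vector):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vector))


def lloyd_codebook(data, k, iterations=20, seed=0):
    """Batch update: assign every vector, then move each codeword to its cell mean."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(data, k)]  # assumed random initialization
    for _ in range(iterations):
        cells = [[] for _ in range(k)]
        for v in data:
            cells[nearest(codebook, v)].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def online_kmeans_codebook(data, k, seed=0):
    """On-line update: the winning codeword moves toward each vector as it arrives."""
    rng = random.Random(seed)
    order = list(data)
    rng.shuffle(order)
    codebook = [list(v) for v in order[:k]]
    counts = [1] * k
    for v in order[k:]:
        i = nearest(codebook, v)
        counts[i] += 1
        rate = 1.0 / counts[i]  # running-mean learning rate
        codebook[i] = [c + rate * (x - c) for c, x in zip(codebook[i], v)]
    return codebook


def mean_distortion(data, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    total = sum(squared_distance(v, codebook[nearest(codebook, v)]) for v in data)
    return total / len(data)


if __name__ == "__main__":
    rng = random.Random(42)
    centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
    # Synthetic stand-in for feature vectors: 1000 points in 2-D.
    data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(250)]
    for name, build in (("batch Lloyd", lloyd_codebook),
                        ("on-line K-means", online_kmeans_codebook)):
        codebook = build(data, 10)
        print(name, len(data), "->", len(codebook), "codewords,",
              "mean distortion:", round(mean_distortion(data, codebook), 4))
```

In this toy setting, 1000 vectors are represented by 10 codewords, which mirrors the two-orders-of-magnitude reduction reported in the abstract. Whether recognition accuracy is preserved depends on the recognizer and the data, which this sketch does not model.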
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Curatelli, F., Mayora-Ibarra, O. (2000). Competitive Learning Methods for Efficient Vector Quantizations in a Speech Recognition Environment. In: Cairó, O., Sucar, L.E., Cantu, F.J. (eds) MICAI 2000: Advances in Artificial Intelligence. MICAI 2000. Lecture Notes in Computer Science, vol 1793. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10720076_10
DOI: https://doi.org/10.1007/10720076_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67354-5
Online ISBN: 978-3-540-45562-2
eBook Packages: Springer Book Archive