Abstract
Statistical phonetic data for Polish were collected. Phoneme durations vary from 30 ms to 200 ms, and average phoneme durations are presented. A corpus of spoken Polish was used to collect statistics of real language and was evaluated for use in automatic speech recognition and speaker identification systems. These duration patterns could be used in phoneme parametrisation and modelling. They also provide an additional source of information for speech segmentation. The collected data are presented in the paper (average values for all available male speakers and for some selected ones), along with comments on the corpus and the method used. The obtained data were compared with the values expected according to the phonetic literature.
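As a rough illustration of the kind of statistic described above, average phoneme durations can be computed from a time-aligned annotation. The sketch below is not the authors' method: the function name `mean_durations_ms`, the `(label, start_ms, end_ms)` segment format, and the toy values are all assumptions made for this example, and the numbers are invented rather than taken from the corpus.

```python
from collections import defaultdict
from statistics import mean


def mean_durations_ms(segments):
    """Return the average duration in milliseconds for each phoneme label.

    `segments` is an iterable of (label, start_ms, end_ms) tuples.
    This input format is hypothetical, not the corpus format.
    """
    durations = defaultdict(list)
    for label, start_ms, end_ms in segments:
        durations[label].append(end_ms - start_ms)
    return {label: mean(values) for label, values in durations.items()}


# Toy annotation, invented for illustration only (not data from the paper).
toy_segments = [
    ("a", 0, 80),
    ("t", 80, 120),
    ("a", 120, 220),
]

averages = mean_durations_ms(toy_segments)
for label, value in sorted(averages.items()):
    print(f"{label}: {value} ms")
```

Per-speaker averages, as mentioned in the abstract, could be obtained in the same way by grouping on a (speaker, label) pair instead of the label alone.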
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ziółko, B., Ziółko, M. (2011). Time Durations of Phonemes in Polish Language for Speech and Speaker Recognition. In: Vetulani, Z. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science, vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_10
DOI: https://doi.org/10.1007/978-3-642-20095-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20094-6
Online ISBN: 978-3-642-20095-3
eBook Packages: Computer Science (R0)