Time Durations of Phonemes in Polish Language for Speech and Speaker Recognition

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6562)

Abstract

Statistical phonetic data for Polish were collected. Phoneme durations vary from 30 ms to 200 ms, and average durations are presented. A corpus of spoken Polish was used to collect statistics of real language use, and these statistics were evaluated for application in automatic speech recognition and speaker identification systems. These natural phenomena could be used in phoneme parametrisation and modelling. The data also provide an additional source of information for speech segmentation. The paper presents the collected data (average values for all available male speakers and for some selected ones), along with comments on the corpus and the method used. The obtained data were compared with the values expected according to the phonetic literature.
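
As a rough illustration of the kind of statistic the abstract describes, the sketch below averages phoneme durations from time-aligned segments. It is not the authors' method: the `(label, start, end)` segment format, the function name `mean_durations_ms`, and the sample values are all assumptions made for this example, not data from the paper.

```python
from collections import defaultdict


def mean_durations_ms(segments):
    """Return the mean duration in milliseconds for each phoneme label.

    `segments` is an iterable of (label, start_s, end_s) tuples with times
    in seconds. This input format is hypothetical, not taken from the paper.
    """
    totals = defaultdict(float)  # summed duration per label, in ms
    counts = defaultdict(int)    # number of occurrences per label
    for label, start_s, end_s in segments:
        if end_s <= start_s:
            raise ValueError(f"segment for {label!r} has non-positive length")
        totals[label] += (end_s - start_s) * 1000.0
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Made-up segments for illustration only (not corpus data).
example_segments = [
    ("a", 0.00, 0.08),
    ("t", 0.08, 0.12),
    ("a", 0.12, 0.22),
]
averages = mean_durations_ms(example_segments)
for label, value in sorted(averages.items()):
    print(f"{label}: {value:.1f} ms")
```

Per-speaker averages, such as those the abstract mentions for selected male speakers, could be obtained by calling the same function separately on each speaker's segments.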

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ziółko, B., Ziółko, M. (2011). Time Durations of Phonemes in Polish Language for Speech and Speaker Recognition. In: Vetulani, Z. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science, vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_10

  • DOI: https://doi.org/10.1007/978-3-642-20095-3_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20094-6

  • Online ISBN: 978-3-642-20095-3

  • eBook Packages: Computer Science (R0)
