Distributed Speech Recognition of Mandarin Digits String

Yih-Ru Wang, Bo-Xuan Lu, Yuan-Fu Liao & Sin-Horng Chen

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

In this paper, the pitch detection algorithm of the ETSI ES-202-212 XAFE standard is evaluated on a Mandarin digit string recognition task. Experimental results show that the performance of the pitch detection algorithm degrades seriously when the SNR of the speech signal falls below 10 dB. As a result, in low-SNR environments the recognizer that uses pitch information performs worse than the original recognizer that does not. A modification of the pitch detection algorithm is therefore proposed to improve pitch detection in low-SNR environments. Recognition performance can be improved at most SNR levels by integrating the recognizers with and without pitch information. Overall recognition rates of 82.1% and 86.8% were achieved for the clean and multi-condition training cases, respectively.

Author information

Authors and Affiliations

National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, 300
Yih-Ru Wang, Bo-Xuan Lu & Sin-Horng Chen
National Taipei University of Technology, No.1, Sec. 3, Chunghsiao E. Rd., Taipei, 106
Yuan-Fu Liao

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Wang, YR., Lu, BX., Liao, YF., Chen, SH. (2006). Distributed Speech Recognition of Mandarin Digits String. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_40

DOI: https://doi.org/10.1007/11939993_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
