Abstract
The aim of the current study was to analyze articulatory and acoustic features of Mandarin sentences produced under different emotions. For the articulatory features, the movements of the lips and tongue during speech production, and in particular their velocities, were analyzed; for the acoustic features, formants, fundamental frequency, amplitude and speech rate were analyzed. Fourteen subjects with a standard Mandarin accent were recruited for the experiment. The subjects were asked to produce specified sentences under four emotions (anger, sadness, happiness and neutral) for subsequent articulatory and acoustic analysis. The results indicated that emotion clearly influenced the motion of the articulators (tongue and lips): the motion ranges of the tongue and lips under anger and happiness were larger than under sadness and neutral. The results are discussed with respect to the relations between the acoustic and articulatory features of sentences and the similarities and differences between multi-syllable items and vowels. This study can serve as a basis for constructing a functional relation between articulatory and acoustic parameters of emotional speech, with the aim of helping individuals with dysphonic disorders in speech training.
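To make the feature set concrete, the following is a minimal sketch, not the authors' pipeline: it estimates two of the acoustic features named in the abstract (the fundamental frequency contour and short-time amplitude) using librosa, and derives an articulator speed profile by numerically differentiating a tongue-coil trajectory. The audio file name, the EMA sampling rate and the placeholder trajectory are assumptions for illustration only.

```python
import numpy as np
import librosa  # assumed available; any F0/RMS extractor would serve

# --- acoustic features from one recorded sentence (file name is hypothetical) ---
y, sr = librosa.load("sentence_angry.wav", sr=16000)
f0, voiced, _ = librosa.pyin(y, fmin=75, fmax=400, sr=sr)  # F0 contour (NaN when unvoiced)
rms = librosa.feature.rms(y=y)[0]                          # short-time amplitude (RMS)
print("mean F0 over voiced frames (Hz):", np.nanmean(f0[voiced]))
print("mean RMS amplitude:", rms.mean())

# --- articulator speed from an EMA coil trajectory ---------------------------
# tongue_tip: (N, 3) positions in mm; a real trace would come from the EMA system
fs_ema = 200.0                                             # assumed EMA sampling rate (Hz)
tongue_tip = np.cumsum(np.random.randn(1000, 3), axis=0)   # placeholder trajectory
vel = np.gradient(tongue_tip, 1.0 / fs_ema, axis=0)        # velocity per axis (mm/s)
speed = np.linalg.norm(vel, axis=1)                        # tangential speed (mm/s)
print("peak tongue-tip speed (mm/s):", speed.max())
```

Comparing such speed and range summaries across the four emotion conditions is the kind of articulatory contrast the study reports, with anger and happiness showing larger motion ranges than sadness and neutral.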
Acknowledgements
Thanks are due to all the subjects in the current experiment, to Xueying Zhang and Shufei Duan for technical assistance, and to Jianzheng Yan and Dong Li for assistance in data collection.
Funding
This study was supported by the National Natural Science Foundation of China [Grant Number 61371193].
Ethics declarations
Conflict of interest
The authors report no conflicts of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ren, G., Zhang, X. & Duan, S. Articulatory and acoustic analyses of Mandarin sentences with different emotions for speaking training of dysphonic disorders. J Ambient Intell Human Comput 11, 561–571 (2020). https://doi.org/10.1007/s12652-018-0942-9