Abstract
The most popular features used in speech emotion recognition are prosodic and spectral. However, the performance of the system degrades substantially when these acoustic features are employed individually, i.e. either prosodic or spectral alone. In this paper a feature fusion method is proposed that combines the prosodic features energy and pitch with the spectral MFCC features. The fused features are classified using linear discriminant analysis (LDA), regularized discriminant analysis (RDA), support vector machine (SVM) and k-nearest neighbour (kNN) classifiers. The results are validated on the Berlin and Spanish emotional speech databases. The results show that feature fusion improves the performance of each classifier by approximately 20 % compared with its performance on the individual feature sets. The results also reveal that RDA is the better choice of classifier for emotion classification, because LDA suffers from the singularity problem that arises with high-dimensional, small-sample-size speech data, i.e. when the number of available training samples is small compared to the dimensionality of the sample space. RDA eliminates this singularity problem through a regularization criterion and gives better results.
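The singularity problem mentioned above has a simple source: LDA must invert the within-class covariance estimate, which is rank-deficient whenever the number of training samples n is smaller than the feature dimension d. RDA sidesteps this by shrinking the covariance estimates. As a worked illustration, one common formulation (Friedman-style RDA, consistent with the description here, though the paper's exact regularizer may differ) is

\hat{\Sigma}_k(\lambda) = (1 - \lambda)\,\hat{\Sigma}_k + \lambda\,\hat{\Sigma}, \qquad
\hat{\Sigma}_k(\lambda, \gamma) = (1 - \gamma)\,\hat{\Sigma}_k(\lambda) + \frac{\gamma}{d}\,\operatorname{tr}\!\left[\hat{\Sigma}_k(\lambda)\right] I,

where \hat{\Sigma}_k is the covariance estimate of class k, \hat{\Sigma} is the pooled covariance, and 0 \le \lambda, \gamma \le 1. Any \gamma > 0 makes the regularized estimate full rank, and hence invertible, even when n < d.

A minimal sketch of a comparable fused-feature pipeline in Python, using librosa and scikit-learn, is given below. The utterance-level statistics, frame parameters, and the shrinkage-LDA stand-in for RDA are illustrative assumptions, not the authors' exact implementation.

    # Sketch of a prosodic + spectral feature-fusion pipeline (assumptions noted inline).
    import numpy as np
    import librosa
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    def extract_fused_features(path, sr=16000, n_mfcc=13):
        # Fuse utterance-level statistics of spectral (MFCC) and
        # prosodic (energy, pitch) features into one vector.
        y, _ = librosa.load(path, sr=sr)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # spectral
        energy = librosa.feature.rms(y=y)[0]                     # prosodic: frame energy
        f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)            # prosodic: pitch contour
        return np.concatenate([
            mfcc.mean(axis=1), mfcc.std(axis=1),
            [energy.mean(), energy.std()],
            [f0.mean(), f0.std()],
        ])

    # wav_paths / labels are placeholders for utterances from a corpus
    # such as Berlin EMO-DB; X stacks one fused vector per utterance.
    # X = np.vstack([extract_fused_features(p) for p in wav_paths])

    classifiers = {
        "LDA": LinearDiscriminantAnalysis(),  # classical LDA; ill-posed when d exceeds n
        "RDA-like": LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
        "SVM": SVC(kernel="rbf"),
        "kNN": KNeighborsClassifier(n_neighbors=5),
    }
    # for name, clf in classifiers.items():
    #     print(name, cross_val_score(clf, X, labels, cv=5).mean())

The shrinkage LDA shown is scikit-learn's built-in covariance regularizer, chosen here because it reproduces the key property claimed for RDA: the estimator remains well-conditioned when the fused feature dimension exceeds the number of training utterances.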
Acknowledgments
This work was supported by the research project “Non-intrusive real time driving process ergonomics monitoring system to improve road safety in a car–PC environment”, funded by DST, New Delhi.