
A comparative analysis of classifiers in emotion recognition through acoustic features

International Journal of Speech Technology

Abstract

The most popular features used in speech emotion recognition are prosodic and spectral features. However, system performance degrades substantially when these acoustic features are employed individually, i.e., either prosodic or spectral features alone. In this paper, a feature-fusion method is proposed that combines energy and pitch prosodic features with MFCC spectral features. The fused features are classified with linear discriminant analysis (LDA), regularized discriminant analysis (RDA), support vector machines (SVM), and k-nearest neighbours (kNN). The results are validated on the Berlin and Spanish emotional speech databases. They show that, for each classifier, performance improves by approximately 20% over the same classifier trained on the individual feature sets. The results also indicate that RDA is the better choice of classifier for emotion classification: LDA suffers from the singularity problem that arises with high-dimensional, small-sample-size data, i.e., when the number of available training speech samples is small relative to the dimensionality of the feature space, whereas RDA eliminates this problem through a regularization criterion and gives better results.
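
The approach the abstract outlines can be sketched compactly. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: librosa for feature extraction, utterance-level mean/standard-deviation statistics, and scikit-learn classifiers are choices made here for illustration only, and scikit-learn's shrinkage LDA stands in for RDA. (In Friedman's formulation, RDA replaces each class covariance estimate Σ_k with the regularized blend Σ̂_k(λ) = (1 - λ) Σ_k + λ Σ, where Σ is the pooled covariance; the shrinkage keeps the estimate invertible when training samples are scarce relative to the feature dimension, which is exactly the singularity problem described above.)

```python
# Minimal sketch of the fusion-and-compare idea, not the authors' pipeline.
# Assumptions: librosa for MFCC/energy/pitch extraction, utterance-level
# mean/std statistics, and scikit-learn classifiers; shrinkage LDA is used
# as a stand-in for RDA.
import numpy as np
import librosa
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def fused_features(wav_path, n_mfcc=13):
    """Concatenate utterance-level spectral (MFCC) and prosody (energy, pitch) statistics."""
    sig, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=sig, sr=sr, n_mfcc=n_mfcc)   # spectral features
    energy = librosa.feature.rms(y=sig)                        # prosody: frame energy
    f0 = librosa.yin(sig, fmin=50, fmax=400, sr=sr)            # prosody: pitch contour
    f0 = f0[np.isfinite(f0)][None, :]                          # keep valid estimates
    stats = lambda m: np.hstack([m.mean(axis=-1), m.std(axis=-1)])
    return np.hstack([stats(mfcc), stats(energy), stats(f0)])  # fused feature vector

# The four classifiers compared in the paper; shrinkage LDA approximates RDA.
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "RDA (shrinkage LDA)": LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}

# With X = np.vstack([fused_features(p) for p in wav_paths]) and y the emotion
# labels (e.g. from the Berlin database), compare cross-validated accuracy:
# for name, clf in classifiers.items():
#     print(name, cross_val_score(clf, X, y, cv=5).mean())
```

On a labelled corpus such as the Berlin database, the commented loop at the end reproduces the kind of classifier comparison the paper reports, with the fused vector simply concatenating the prosody and spectral statistics.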

Figures 1–7 appear in the full-text article.



Acknowledgments

This work was supported by the research project "Non-intrusive real time driving process ergonomics monitoring system to improve road safety in a car–PC environment", funded by DST, New Delhi.

Author information


Corresponding author

Correspondence to Swarna Kuchibhotla.

About this article

Cite this article

Kuchibhotla, S., Vankayalapati, H.D., Vaddi, R.S. et al. A comparative analysis of classifiers in emotion recognition through acoustic features. Int J Speech Technol 17, 401–408 (2014). https://doi.org/10.1007/s10772-014-9239-3
