DOI: 10.1145/2663204.2666279
Research article · ICMI-MLMI Conference Proceedings
Emotion Recognition in Real-world Conditions with Acoustic and Visual Features

Published: 12 November 2014

Abstract

A system capable of recognizing human emotions has many potential applications, for example improving Spoken Dialogue Systems (SDSs) or monitoring agents in call centers. The Emotion Recognition In The Wild Challenge 2014 (EmotiW 2014) therefore focuses on estimating emotions in real-world situations. This study presents the results of multimodal emotion recognition based on a support vector classifier. The described approach achieves an overall classification accuracy of 41.77% in the multimodal case, more than 17% above the multimodal baseline.
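The abstract describes feature-based fusion of acoustic and visual features fed to a support vector classifier. A minimal sketch of that idea, assuming early fusion by concatenation and using synthetic stand-in data (the feature dimensions, labels, and the Pegasos-style training loop below are illustrative assumptions, not the authors' actual pipeline or features):

```python
# Hedged sketch (not the authors' code): feature-level fusion of acoustic
# and visual descriptors followed by a linear support-vector classifier,
# trained with a simple Pegasos-style sub-gradient loop. All data and
# feature dimensions below are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n = 400
acoustic = rng.normal(size=(n, 10))   # stand-in for prosodic/spectral statistics
visual = rng.normal(size=(n, 6))      # stand-in for appearance-based face features

# Toy binary labels in {-1, +1} that depend on both modalities.
y = np.where(acoustic[:, 0] + visual[:, 0] > 0, 1.0, -1.0)

# Early (feature-based) fusion: concatenate, then z-normalize per dimension.
fused = np.hstack([acoustic, visual])
fused = (fused - fused.mean(axis=0)) / fused.std(axis=0)

def train_linear_svm(X, y, lam=0.01, epochs=50):
    """Pegasos-style stochastic sub-gradient training of a linear SVM."""
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w) < 1:        # margin violated: hinge gradient step
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:                             # only the regularizer contributes
                w = (1 - eta * lam) * w
    return w

w = train_linear_svm(fused[:300], y[:300])
acc = np.mean(np.sign(fused[300:] @ w) == y[300:])
print(f"held-out accuracy: {acc:.2f}")
```

Early fusion keeps the classifier simple (one model over one joint feature vector), at the cost of requiring both modalities at test time; the paper's reported 41.77% is on the much harder 7-class EmotiW data, not on a toy task like this.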




      Published In

      ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction
      November 2014 · 558 pages
      ISBN: 9781450328852
      DOI: 10.1145/2663204

      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. audio-video data corpus
      2. facial expression
      3. feature-based fusion
      4. support vector machine

      Qualifiers

      • Research-article

      Conference

      ICMI '14

      Acceptance Rates

      ICMI '14 Paper Acceptance Rate: 51 of 127 submissions, 40%
      Overall Acceptance Rate: 453 of 1,080 submissions, 42%


      Cited By

      • (2022) A Conceptual Framework Based on Conversational Agents for the Early Detection of Cognitive Impairment. Proceedings of 2nd International Conference on Artificial Intelligence: Advances and Applications, pp. 801-813. DOI: 10.1007/978-981-16-6332-1_65. Online publication date: 14-Feb-2022
      • (2019) Facial Expression Recognition Using Computer Vision: A Systematic Review. Applied Sciences, 9(21):4678. DOI: 10.3390/app9214678. Online publication date: 2-Nov-2019
      • (2018) EmoTour: Estimating Emotion and Satisfaction of Users Based on Behavioral Cues and Audiovisual Data. Sensors, 18(11):3978. DOI: 10.3390/s18113978. Online publication date: 15-Nov-2018
      • (2016) Video Affective Content Analysis based on multimodal features using a novel hybrid SVM-RBM classifier. 2016 IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics Engineering (UPCON), pp. 416-421. DOI: 10.1109/UPCON.2016.7894690. Online publication date: 2016
      • (2016) Revisiting the EmotiW challenge: how wild is it really? Journal on Multimodal User Interfaces, 10(2):151-162. DOI: 10.1007/s12193-015-0202-7. Online publication date: 12-Feb-2016
