
DOI: 10.1145/2818346.2830588
Research Article

Contrasting and Combining Least Squares Based Learners for Emotion Recognition in the Wild

Published: 09 November 2015

Abstract

This paper presents our contribution to the ACM ICMI 2015 Emotion Recognition in the Wild Challenge (EmotiW 2015). We participate in both the static facial expression (SFEW) and the audio-visual (AFEW) emotion recognition challenges. In both challenges, we use a set of visual descriptors together with early and late fusion schemes. For AFEW, we additionally exploit several popular spatio-temporal modeling alternatives and carry out multi-modal fusion. For classification, we employ two least squares regression based learners that have been shown to be fast and accurate on earlier EmotiW Challenge corpora: Partial Least Squares regression (PLS) and Kernel Extreme Learning Machines (ELM), the latter being closely related to Kernel Regularized Least Squares. We use a Generalized Procrustes Analysis (GPA) based alignment for face registration. By varying alignments, descriptor types, video modeling strategies and classifiers, we diversify the learners to improve the final fusion performance. Test set accuracies in both challenges exceed the respective baselines by roughly 25% in relative terms.
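The Kernel ELM classifier the abstract refers to has a closed-form solution: with a training kernel matrix K, regularization constant C, and one-hot class targets T, the output weights are β = (I/C + K)⁻¹ T, and a test sample is scored by its kernel row against the training set. The sketch below illustrates this in Python with NumPy; the RBF kernel choice, the class name, and all parameter values are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel between rows of A and rows of B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KernelELM:
    """Kernel Extreme Learning Machine: equivalent to kernel
    regularized least squares regression onto one-hot class targets."""

    def __init__(self, C=1.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = X
        self.classes = np.unique(y)
        T = (y[:, None] == self.classes[None, :]).astype(float)  # one-hot targets
        K = rbf_kernel(X, X, self.gamma)
        n = K.shape[0]
        # Closed-form solution: beta = (I/C + K)^{-1} T
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, T)
        return self

    def decision_function(self, X):
        # Continuous class scores; these are what a late-fusion scheme
        # would combine (e.g. weighted-average) across diverse learners.
        return rbf_kernel(X, self.X, self.gamma) @ self.beta

    def predict(self, X):
        return self.classes[np.argmax(self.decision_function(X), axis=1)]
```

Because training reduces to a single linear solve against the kernel matrix, many such learners (over different alignments, descriptors and video models) can be trained cheaply and their decision scores fused afterwards.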



        Published In

        ICMI '15: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction
        November 2015
        678 pages
        ISBN:9781450339124
        DOI:10.1145/2818346
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Author Tags

        1. afew
        2. audio-visual emotion corpus
        3. audio-visual fusion
        4. emotion recognition in the wild
        5. feature extraction
        6. sfew

        Qualifiers

        • Research-article

        Conference

        ICMI '15
        Sponsor:
        ICMI '15: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
        November 9 - 13, 2015
Seattle, Washington, USA

        Acceptance Rates

        ICMI '15 Paper Acceptance Rate 52 of 127 submissions, 41%;
        Overall Acceptance Rate 453 of 1,080 submissions, 42%



        Cited By

        • (2024) Face Expression Recognition via transformer-based classification models. Balkan Journal of Electrical and Computer Engineering, 12(3):214-223. DOI: 10.17694/bajece.1486140. Online publication date: 30-Sep-2024.
        • (2024) A dual stream attention network for facial expression recognition in the wild. International Journal of Machine Learning and Cybernetics, 15(12):5863-5880. DOI: 10.1007/s13042-024-02287-0. Online publication date: 23-Jul-2024.
        • (2024) Modelling an efficient hybridized approach for facial emotion recognition using unconstraint videos and deep learning approaches. Soft Computing, 28(5):4593-4606. DOI: 10.1007/s00500-024-09668-1. Online publication date: 6-Feb-2024.
        • (2023) Multimodal Emotion Recognition in Noisy Environment Based on Progressive Label Revision. In Proceedings of the 31st ACM International Conference on Multimedia, pages 9571-9575. DOI: 10.1145/3581783.3612867. Online publication date: 26-Oct-2023.
        • (2023) Audio-Visual Group-based Emotion Recognition using Local and Global Feature Aggregation based Multi-Task Learning. In Proceedings of the 25th International Conference on Multimodal Interaction, pages 741-745. DOI: 10.1145/3577190.3616544. Online publication date: 9-Oct-2023.
        • (2023) Graph Regularized Structured Output SVM for Early Expression Detection With Online Extension. IEEE Transactions on Cybernetics, 53(3):1419-1431. DOI: 10.1109/TCYB.2021.3108143. Online publication date: Mar-2023.
        • (2023) Probabilistic Attribute Tree Structured Convolutional Neural Networks for Facial Expression Recognition in the Wild. IEEE Transactions on Affective Computing, 14(3):1927-1941. DOI: 10.1109/TAFFC.2022.3156920. Online publication date: 1-Jul-2023.
        • (2022) Comparing Approaches for Explaining DNN-Based Facial Expression Classifications. Algorithms, 15(10):367. DOI: 10.3390/a15100367. Online publication date: 3-Oct-2022.
        • (2022) A Multimodal Approach for Mania Level Prediction in Bipolar Disorder. IEEE Transactions on Affective Computing, 13(4):2119-2131. DOI: 10.1109/TAFFC.2022.3193054. Online publication date: 1-Oct-2022.
        • (2022) Smart Meet — Facial Recognition-based Conferencing Platform. In 2022 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), pages 351-355. DOI: 10.1109/ICSCDS53736.2022.9760973. Online publication date: 7-Apr-2022.
