DOI: 10.1145/3503161.3547960

Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition

Published: 10 October 2022

Abstract

Label distribution learning (LDL) has achieved great progress in facial expression recognition (FER), where generating label distributions is a key procedure for LDL-based FER. However, many existing studies have shown that noisy samples are a common problem in FER, especially on in-the-wild datasets. This issue may lead to unreliable generated label distributions (which can be seen as label noise) and further degrade the FER model. To this end, we propose a plug-and-play method, self-paced label distribution learning (SPLDL), for in-the-wild FER. Specifically, a simple yet efficient label distribution generator is adopted to generate label distributions that guide label distribution learning. We then introduce the self-paced learning (SPL) paradigm and develop a novel self-paced label distribution learning strategy that considers both classification losses and distribution losses. SPLDL first learns easy samples with reliable label distributions and gradually steps to complex ones, effectively suppressing the negative impact of noisy samples and unreliable label distributions. Extensive experiments on in-the-wild FER datasets (i.e., RAF-DB and AffectNet) with three backbone networks demonstrate the effectiveness of the proposed method.
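To make the training strategy described above concrete, the sketch below shows one way such self-paced weighting could look in PyTorch: each sample's combined classification and distribution loss is compared against a growing threshold, so easy samples with reliable label distributions dominate early training. This is a minimal illustration, not the authors' code; the function names, the hard 0/1 weighting, the alpha trade-off, and the linear threshold schedule are all assumptions.

```python
# Minimal sketch (not the authors' implementation) of a self-paced label
# distribution learning step, assuming a hard (0/1) self-paced weighting and
# a linearly growing age parameter `lam`. Model, generator, and loader names
# are hypothetical.
import torch
import torch.nn.functional as F

def spldl_step(model, images, labels, target_dists, lam, alpha=0.5):
    """One training step: weight each sample by whether its combined
    classification + distribution loss falls below the self-paced threshold."""
    logits = model(images)                      # (B, C) class scores
    log_probs = F.log_softmax(logits, dim=1)

    # Per-sample classification loss (hard labels) and distribution loss
    # (KL divergence to the generated target label distribution).
    cls_loss = F.cross_entropy(logits, labels, reduction="none")              # (B,)
    dist_loss = F.kl_div(log_probs, target_dists, reduction="none").sum(dim=1)  # (B,)
    per_sample = cls_loss + alpha * dist_loss

    # Hard self-paced weights: keep "easy" samples, drop hard/noisy ones.
    v = (per_sample.detach() < lam).float()
    if v.sum() == 0:            # avoid an empty batch while lam is still small
        v = torch.ones_like(v)
    return (v * per_sample).sum() / v.sum()

# Usage sketch: grow the threshold each epoch so harder samples enter later.
# for epoch in range(num_epochs):
#     lam = lam0 + epoch * lam_step
#     for images, labels, target_dists in loader:
#         loss = spldl_step(model, images, labels, target_dists, lam)
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
```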

Supplementary Material

MP4 File (MM22-fp0868.mp4)
Presentation video of the paper (ID No: mmfp0868) "Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition".




Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 October 2022


Author Tags

  1. facial expression recognition
  2. label distribution learning
  3. self-paced learning

Qualifiers

  • Research-article

Funding Sources

  • Medico-Engineering Cooperation Funds from University of Electronic Science and Technology of China
  • Sichuan Science and Technology Program

Conference

MM '22

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (last 12 months): 70
  • Downloads (last 6 weeks): 5
Reflects downloads up to 05 Jan 2025

Cited By

  • (2024) RobustFace: Adaptive Mining of Noise and Hard Samples for Robust Face Recognitions. Proceedings of the 32nd ACM International Conference on Multimedia, 5065-5073. DOI: 10.1145/3664647.3681231. Online publication date: 28-Oct-2024.
  • (2024) Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 4236-4245. DOI: 10.1145/3664647.3680747. Online publication date: 28-Oct-2024.
  • (2024) Active Clustering Ensemble With Self-Paced Learning. IEEE Transactions on Neural Networks and Learning Systems, 35(9), 12186-12200. DOI: 10.1109/TNNLS.2023.3252586. Online publication date: Sep-2024.
  • (2024) K-Face Net: A Two-Stage Framework for Balanced Feature Space in Facial Expression Recognition. 2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6. DOI: 10.1109/ICME57554.2024.10688346. Online publication date: 15-Jul-2024.
  • (2024) Enhancing Facial Expression Recognition Under Data Uncertainty Based on Embedding Proximity. IEEE Access, 12, 85324-85337. DOI: 10.1109/ACCESS.2024.3415154. Online publication date: 2024.
  • (2024) Label distribution feature selection based on neighborhood rough set. Concurrency and Computation: Practice and Experience, 36(23). DOI: 10.1002/cpe.8236. Online publication date: 22-Jul-2024.
  • (2023) Variance-Aware Bi-Attention Expression Transformer for Open-Set Facial Expression Recognition in the Wild. Proceedings of the 31st ACM International Conference on Multimedia, 862-870. DOI: 10.1145/3581783.3612546. Online publication date: 26-Oct-2023.
  • (2023) Freq-HD: An Interpretable Frequency-based High-Dynamics Affective Clip Selection Method for in-the-Wild Facial Expression Recognition in Videos. Proceedings of the 31st ACM International Conference on Multimedia, 843-852. DOI: 10.1145/3581783.3611972. Online publication date: 26-Oct-2023.
