Toward Facial Expression Recognition in the Wild via Noise-Tolerant Network

Published: 01 May 2023


Facial Expression Recognition (FER) has recently emerged as a crucial area in Human-Computer Interaction (HCI) systems for understanding the user’s inner state and intention. However, feature noise and label noise are the major challenges for FER in the wild, because the inherent ambiguity of facial expressions is worsened by low-quality images. To deal with this problem, we propose a simple but effective Facial Expression Noise-tolerant Network (FENN), which exploits inter-class correlations to mitigate the ambiguity that usually arises between morphologically similar classes. Specifically, FENN uses a multivariate normal distribution to model these correlations at the final hidden layer of the neural network, suppressing the heteroscedastic uncertainty caused by inter-class label noise. In addition, the discriminative ability of deep features is weakened by the subtle differences between expressions and by the presence of feature noise. FENN therefore uses a feature-noise mitigation module to extract compact intra-class feature representations under feature noise while preserving the intrinsic inter-class relationships. We conduct extensive experiments to evaluate the effectiveness of FENN on both the originally annotated images and synthetically noisy annotations from three in-the-wild facial expression datasets: RAF-DB, AffectNet, and FERPlus. The results show that FENN significantly outperforms state-of-the-art FER methods.
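The abstract does not give FENN's exact parameterization, so the following is only a minimal sketch of the general idea of placing a multivariate normal distribution over class logits so that correlated (inter-class) noise is modeled explicitly. It assumes a low-rank-plus-diagonal covariance and a Monte Carlo average of the softmax; the function name `mc_softmax_probs`, the parameters `factor` and `diag_std`, and the three-class example values are all hypothetical and are not taken from the paper.

```python
import numpy as np


def mc_softmax_probs(mean_logits, factor, diag_std, n_samples=1000, seed=0):
    """Monte Carlo estimate of E[softmax(z)] for z ~ N(mean, Sigma).

    Sigma is assumed (for illustration only) to have the form
    factor @ factor.T + diag(diag_std ** 2), so off-diagonal entries
    of Sigma encode correlations between classes.
    """
    rng = np.random.default_rng(seed)
    n_classes, rank = factor.shape

    # Shared latent noise: classes loading on the same factor co-vary.
    eps_shared = rng.standard_normal((n_samples, rank))
    # Independent per-class noise.
    eps_indep = rng.standard_normal((n_samples, n_classes))

    # Sampled logits, shape (n_samples, n_classes).
    logits = mean_logits + eps_shared @ factor.T + eps_indep * diag_std

    # Numerically stable softmax per sample.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    # Average over samples to get the predictive distribution.
    return probs.mean(axis=0)


# Hypothetical example: classes 0 and 1 are easily confused, so their
# noise is anti-correlated through a single shared factor.
mean_logits = np.array([2.0, 1.5, -1.0])
factor = np.array([[1.0], [-1.0], [0.0]])
diag_std = np.array([0.1, 0.1, 0.1])
probs = mc_softmax_probs(mean_logits, factor, diag_std)
```

In a trained network the mean and covariance parameters would be predicted per input rather than fixed by hand, which is what makes the modeled uncertainty heteroscedastic. With `factor` and `diag_std` set to zero, the function reduces to an ordinary softmax over `mean_logits`.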






Published In

IEEE Transactions on Circuits and Systems for Video Technology, Volume 33, Issue 5
May 2023
524 pages


IEEE Press



  • Research-article



Cited By

  • (2024) Learning Cognitive Features as Complementary for Facial Expression Recognition. International Journal of Intelligent Systems, vol. 2024. DOI: 10.1155/2024/7321175. Online publication date: 1-Jan-2024
  • (2024) Uncertain Facial Expression Recognition via Multi-Task Assisted Correction. IEEE Transactions on Multimedia, vol. 26, pp. 2531-2543. DOI: 10.1109/TMM.2023.3301209. Online publication date: 1-Jan-2024
  • (2024) Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition. IEEE Transactions on Image Processing, vol. 33, pp. 2514-2529. DOI: 10.1109/TIP.2024.3378459. Online publication date: 27-Mar-2024
  • (2023) AST-GCN: Augmented Spatial Temporal Graph Convolutional Neural Network for Gait Emotion Recognition. IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 4581-4595. DOI: 10.1109/TCSVT.2023.3341728. Online publication date: 12-Dec-2023
