Abstract
Facial expressions are a crucial aspect of human communication, providing information about emotions, intentions, interactions, and social relationships. They are a universal signal used daily to convey inner states in natural situations. With growing interest in automatic facial emotion recognition, deep neural networks have become a popular tool for recognizing emotions under challenging in-the-wild conditions that are closer to reality. However, these systems must contend with external factors that degrade the quality of facial features, making it difficult to determine the correct emotion class. In this paper, we first summarize the fields that use facial recognition systems in in-the-wild contexts. We then examine in depth the major datasets used for in-the-wild facial expression recognition, considering their suitability for this context, the challenges of applying them, their coverage of emotions, and their potential application domains. The analysis highlights the strengths and weaknesses of each dataset and assesses its relevance and effectiveness in real-life situations. We also present an expanded taxonomy of in-the-wild facial emotion recognition that focuses mainly on deep learning methods, covering each stage in building a facial emotion recognition system and the techniques available at each stage. Finally, we provide a discussion, insights, and a conclusion, making this survey a reference point for researchers interested in the in-the-wild context and giving a clearer picture of the composition and specificities of the different datasets. This survey can help advance research on deep facial emotion recognition in-the-wild and serve as a resource for methods, applications, and datasets in the field.
Data Availability Statement
References to the original sources are provided for all datasets described in this review paper. The availability of these datasets may be subject to restrictions or usage terms imposed by the original data providers.
Ethics declarations
Conflict of Interest
All authors declare that they have no conflicts of interest.
About this article
Cite this article
Boughanem, H., Ghazouani, H. & Barhoumi, W. Facial Emotion Recognition in-the-Wild Using Deep Neural Networks: A Comprehensive Review. SN COMPUT. SCI. 5, 96 (2024). https://doi.org/10.1007/s42979-023-02423-7
DOI: https://doi.org/10.1007/s42979-023-02423-7