Abstract
Facial Expression Recognition (FER) is an effortless task for humans, and such non-verbal communication is intricately tied to how we relate to others beyond the explicit content of our speech. Facial expressions can convey how we are feeling, as well as our intentions, and are thus a key element of multimodal social interaction. Recent computational advances, such as promising results from Convolutional Neural Networks (CNN), have drawn increasing attention to the potential of FER to enhance human–agent interaction (HAI) and human–robot interaction (HRI), but questions remain as to how "transferable" the learned knowledge is from one task environment to another. In this paper, we explore how FER can be deployed in HAI cooperative game paradigms, where a human subject interacts with a virtual avatar in a goal-oriented environment in which they must cooperate to survive. The primary question was whether transfer learning (TL) would offer an advantage for FER over pre-trained models trained on a similar (but not identical) task environment. TL achieved significantly improved results (94.3% accuracy) without the need for an extensive task-specific corpus. We discuss how such approaches could be used to flexibly create more life-like robots and avatars, capable of fluid social interactions within cooperative multimodal environments.
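The core idea behind the transfer learning approach described above — keep pre-trained feature-extraction layers frozen and fine-tune only a small task-specific classification head on limited data — can be sketched in toy form. Everything below is illustrative: the random-projection "backbone" stands in for pre-trained convolutional layers, and the synthetic 30-sample corpus, dimensions, and learning rate are assumptions for the sketch, not the authors' actual CNN pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

n_classes, n_pixels, n_features = 3, 64, 16

# Frozen "pre-trained" backbone: a fixed projection standing in for
# the convolutional layers of a pre-trained FER network. It receives
# no gradient updates during fine-tuning.
W_frozen = rng.normal(size=(n_pixels, n_features)) / np.sqrt(n_pixels)

def extract(x):
    """Frozen feature extractor (ReLU activation, no weight updates)."""
    return np.maximum(x @ W_frozen, 0.0)

# Small task-specific corpus: 30 synthetic samples whose pixel means
# depend on the class label (a stand-in for a few labeled game frames).
y = rng.integers(0, n_classes, size=30)
X = rng.normal(loc=y[:, None], size=(30, n_pixels))

F = extract(X)                              # features from frozen layers
W_head = np.zeros((n_features, n_classes))  # trainable classification head

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

onehot = np.eye(n_classes)[y]
losses = []
for _ in range(200):                        # fine-tune only the head
    p = softmax(F @ W_head)
    losses.append(-np.mean(np.sum(onehot * np.log(p + 1e-12), axis=1)))
    W_head -= 0.01 * F.T @ (p - onehot) / len(y)  # gradient step on head only

print(losses[0], "->", losses[-1])  # cross-entropy loss before vs. after fine-tuning
```

Because only the small head is trained, the few task-specific samples suffice — the same rationale that lets TL avoid collecting an extensive task-specific corpus.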
Data availability
The datasets generated and/or analyzed during the current study are not publicly available because the data comprise video and audio recordings of identifiable human subjects during gameplay. However, extracted de-identified data may be made available from the corresponding author upon reasonable request.
Funding
This work was supported through funding by a Grant from the National Research Foundation of Korea (NRF Grant# 2021R1G1A1003801).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Ethical approval
This study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of Hanyang University (protocol #HYU2021-138) for studies involving humans. Informed consent was obtained from all subjects involved in this study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sánchez, P.C., Bennett, C.C. Facial expression recognition via transfer learning in cooperative game paradigms for enhanced social AI. J Multimodal User Interfaces 17, 187–201 (2023). https://doi.org/10.1007/s12193-023-00410-z