Abstract
People express their emotions in myriad ways, and whole-body expression is among the most important, with applications in fields such as human-computer interaction (HCI). A central challenge in human emotion recognition is that people convey the same feeling in different ways through their face and body. Many recent methods address this challenge using Deep Neural Networks (DNNs). However, most operate on single images or on facial expressions alone and do not account for deformations in the image, such as scaling and rotation, which can adversely affect recognition accuracy. In this work, motivated by recent research on deformable convolutions, we incorporate deformable behavior into the core of the convolutional long short-term memory (ConvLSTM) to improve its robustness to these deformations and, consequently, its accuracy on emotion recognition from videos of arbitrary length. In experiments on the GEMEP dataset, we achieve a state-of-the-art validation accuracy of 98.8% on whole-body human emotion recognition.
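To make the architecture concrete, the following is a minimal NumPy sketch of a single ConvLSTM step in the standard formulation (Shi et al.), without peephole terms. It is an illustrative simplification, not the authors' implementation: in the deformable variant described above, the fixed-grid convolution `conv2d` below would instead sample its inputs at learned per-position offsets, as in deformable convolutions.

```python
import numpy as np

def conv2d(x, w):
    """'Same'-padded 2-D convolution: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    out = np.zeros((c_out, H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k]           # (C_in, k, k) window
            out[:, i, j] = np.tensordot(w, patch, axes=3)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, Wx, Wh, b):
    """One ConvLSTM step.

    x: input frame features (C_in, H, W); h, c: hidden and cell state
    (C_hid, H, W). Wx (4*C_hid, C_in, k, k), Wh (4*C_hid, C_hid, k, k) and
    b (4*C_hid,) stack the input, forget, candidate, and output gates.
    """
    gates = conv2d(x, Wx) + conv2d(h, Wh) + b[:, None, None]
    i, f, g, o = np.split(gates, 4, axis=0)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # gated cell update
    h_new = sigmoid(o) * np.tanh(c_new)                # new hidden state
    return h_new, c_new
```

Unrolling `convlstm_step` over the frames of a clip (feeding each `h_new, c_new` back in) is what lets the model consume videos of arbitrary length.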
© 2021 Springer Nature Switzerland AG
Cite this paper
Tahghighi, P., Koochari, A., Jalali, M. (2021). Deformable Convolutional LSTM for Human Body Emotion Recognition. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science(), vol 12663. Springer, Cham. https://doi.org/10.1007/978-3-030-68796-0_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68795-3
Online ISBN: 978-3-030-68796-0