Abstract
The rapid growth of online video content has increased the demand for effective video categorization methods. Video platforms currently rely on ratings from moderators, creators, and viewers. Such self-reported ratings, however, may not be the most efficient or insightful basis for categorization. Taking viewers’ physiological signals into account could make categorization more robust and give content creators, advertisers, and researchers a better understanding of viewers’ emotional responses and preferences. In this paper, we develop a hybrid MLP architecture called “ATT-MLP” that uses self-attention in its layers, and we test its performance on the AVDOS (Affective Video Dataset Online Study) dataset, a database in which viewers’ physiological signals were measured while they watched pre-classified videos. ATT-MLP outperformed a plain MLP and traditional ML algorithms (Gaussian Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Linear Ridge, and Random Forest) across all five data modalities of the AVDOS dataset (HRV, IMU, EMG-A, EMG-C, and ALL). Accuracy and F1 score were used as performance metrics; ATT-MLP recorded its highest accuracy and F1 score, 93.8% and 93.1% respectively, on the EMG-A data modality. This study shows that an MLP employing self-attention mechanisms within its hidden layers can be a powerful tool for classification tasks on affective datasets. The code for the model is publicly available on GitHub: https://github.com/IshtiaqHoque/ATT-MLP.
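To make the idea of self-attention inside an MLP's hidden layers concrete, the following is a minimal NumPy sketch of one possible forward pass. It is not the authors' implementation (see the linked repository for that). The token-splitting scheme, the residual connection, the layer sizes, and all names (`att_mlp_forward`, `init_params`, `n_tokens`, `n_classes`) are illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention over tokens of shape (batch, T, d)."""
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (batch, T, T)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def init_params(n_features, hidden, n_tokens, n_classes, seed=0):
    """Random (untrained) weights; all sizes are hypothetical."""
    assert hidden % n_tokens == 0, "hidden must be divisible by n_tokens"
    rng = np.random.default_rng(seed)
    d = hidden // n_tokens
    return {
        "w1": rng.normal(0.0, 0.1, (n_features, hidden)),
        "b1": np.zeros(hidden),
        "w_q": rng.normal(0.0, 0.1, (d, d)),
        "w_k": rng.normal(0.0, 0.1, (d, d)),
        "w_v": rng.normal(0.0, 0.1, (d, d)),
        "w2": rng.normal(0.0, 0.1, (hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }


def att_mlp_forward(x, params, n_tokens):
    """Dense layer -> self-attention over hidden chunks -> output layer.

    Assumption: the hidden vector is split into n_tokens equal chunks so
    that attention has a sequence to operate on.
    """
    h = np.maximum(0.0, x @ params["w1"] + params["b1"])  # ReLU, (batch, hidden)
    batch, hidden = h.shape
    tokens = h.reshape(batch, n_tokens, hidden // n_tokens)
    attended, weights = self_attention(
        tokens, params["w_q"], params["w_k"], params["w_v"]
    )
    h = h + attended.reshape(batch, hidden)  # residual connection (assumed)
    logits = h @ params["w2"] + params["b2"]
    return softmax(logits, axis=-1), weights


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(4, 10))  # 4 samples, 10 synthetic features
    params = init_params(n_features=10, hidden=16, n_tokens=4, n_classes=3)
    probs, attn = att_mlp_forward(x, params, n_tokens=4)
    print(probs.shape, attn.shape)
```

In this sketch the attention weights form a `(batch, n_tokens, n_tokens)` array whose rows sum to one, so each chunk of the hidden representation is re-expressed as a weighted mix of all chunks before classification.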
L.S. Shaiok and I. Hoque—Equal contribution.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shaiok, L.S., Hoque, I., Hasan, M.R., Ghosh, S., Gedeon, T., Hossain, M.Z. (2024). Attention-Based Multi-layer Perceptron to Categorize Affective Videos from Viewer’s Physiological Signals. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2024. Communications in Computer and Information Science, vol 2145. Springer, Singapore. https://doi.org/10.1007/978-981-97-5934-7_3
DOI: https://doi.org/10.1007/978-981-97-5934-7_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5933-0
Online ISBN: 978-981-97-5934-7
eBook Packages: Computer Science, Computer Science (R0)