Multi-feature and Multi-instance Learning with Anti-overfitting Strategy for Engagement Intensity Prediction

Authors: Jianming Wu, Zhiguang Zhou, Yanan Wang, Yi Li, Xin Xu, Yusuke Uchida

ICMI '19: 2019 International Conference on Multimodal Interaction

Pages 582 - 588

https://doi.org/10.1145/3340555.3355717

Published: 14 October 2019

Abstract

This paper proposes a novel approach to engagement intensity prediction, which was applied in the EmotiW Challenge 2019 and performed well. The task is to predict the engagement level of a student who is watching an educational video under diverse conditions and in various environments. Assuming that engagement intensity correlates strongly with facial movements, upper-body posture movements and overall environmental movements within a time interval, we extract these motion features and feed them into a deep regression model built from a combination of LSTM, Gated Recurrent Unit (GRU) and fully connected layers. To predict the engagement level precisely and robustly in a long video recorded in varied situations, such as darkness and complex backgrounds, a multi-feature engineering method extracts synchronized multi-modal features over a period of time, taking both short-term and long-term dependencies into account. Based on these processed features, we propose a strategy that maximizes validation accuracy to generate the best models across all model configurations. Furthermore, to counter the overfitting caused by the extremely small database, we propose a second strategy that trains a conservative model: it uses a single Bi-LSTM layer with only 16 units and splits the engagement dataset (train + validation) with stratified 5-fold cross-validation. By ensembling the above models, our method won second place in the challenge with an MSE of 0.06174 on the test set.
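The conservative strategy in the abstract combines stratified 5-fold splitting with model ensembling and an MSE score. The following is a minimal sketch of those three pieces in plain Python, not the authors' implementation. It assumes that engagement labels take a small set of discrete values (here 0, 0.33, 0.66 and 1, used only for illustration) and that the ensemble is an unweighted mean of per-model predictions; the abstract does not state the ensembling rule. All function names are hypothetical.

```python
from collections import defaultdict
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Split sample indices into k folds with similar label proportions.

    Assumption: labels are discrete, so samples can be grouped by exact value.
    """
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal indices round-robin; the running offset keeps small classes
        # from always landing in the first folds.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def train_validation_splits(folds):
    """Yield (train_indices, validation_indices), holding out one fold at a time."""
    for held_out, validation in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield sorted(train), sorted(validation)


def ensemble_mean(predictions_per_model):
    """Average per-sample predictions across models (unweighted, an assumption)."""
    n_models = len(predictions_per_model)
    return [sum(sample) / n_models for sample in zip(*predictions_per_model)]


def mean_squared_error(targets, predictions):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Illustrative labels only; these are not values from the challenge dataset.
labels = [0.0] * 5 + [0.33] * 10 + [0.66] * 20 + [1.0] * 15
folds = stratified_k_fold(labels, k=5)
splits = list(train_validation_splits(folds))
```

In a full pipeline, one model would be trained per split and the per-model test predictions passed to `ensemble_mean` before scoring with `mean_squared_error`.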

Published In

ICMI '19: 2019 International Conference on Multimodal Interaction

October 2019

601 pages

ISBN:9781450368605

DOI:10.1145/3340555

Editors: Wen Gao (Peking University, China), Helen Mei Ling Meng (Chinese University of Hong Kong, China), Matthew Turk (Toyota Technological Institute at Chicago, USA), Susan R. Fussell (Cornell University, USA), Björn Schuller (Imperial College London / University of Augsburg, UK), Yale Song (Microsoft Research, USA), Kai Yu (Shanghai Jiao Tong University, China)

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMI '19: International Conference on Multimodal Interaction

October 14 - 18, 2019

Suzhou, China

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
