Long short term memory recurrent neural network based multimodal dimensional emotion recognition

L Chao, J Tao, M Yang, Y Li, Z Wen - Proceedings of the 5th …, 2015 - dl.acm.org

Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015 • dl.acm.org

This paper presents our entry to the Audio/Visual+ Emotion Challenge (AV+EC2015), whose goal is to predict continuous values of the emotion dimensions arousal and valence from audio, visual, and physiological modalities. The state-of-the-art classifier for dimensional emotion recognition, the long short-term memory recurrent neural network (LSTM-RNN), is used. Beyond the regular LSTM-RNN prediction architecture, two techniques are investigated for the dimensional emotion recognition problem. The first is to use the ε-insensitive loss as the loss function to optimize. Compared with the squared loss, which is the most widely used loss function for dimensional emotion recognition, the ε-insensitive loss is more robust to label noise, and it can ignore small errors to obtain a stronger correlation between predictions and labels. The second is temporal pooling. This technique enables temporal modeling in the input features and increases the diversity of the features fed into the forward prediction architecture. Experimental results show the efficiency of the key points of the proposed method, and competitive results are obtained.
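The two techniques named in the abstract can be illustrated with a minimal NumPy sketch. The ε-insensitive loss below follows the standard definition max(0, |y − ŷ| − ε). The abstract does not say how temporal pooling is implemented, so the sliding-window mean shown here is only an assumption, and the function names, the `eps` default, and the `window` parameter are all hypothetical rather than taken from the paper.

```python
import numpy as np


def epsilon_insensitive_loss(pred, target, eps=0.1):
    """Mean of max(0, |pred - target| - eps).

    Errors smaller than eps cost nothing, so small label noise is ignored.
    The default eps is an illustrative value, not one from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.maximum(0.0, np.abs(pred - target) - eps)))


def squared_loss(pred, target):
    """Mean squared error, shown for comparison."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def temporal_mean_pool(features, window=3):
    """Replace each frame with the mean of itself and the preceding frames.

    ASSUMPTION: a causal sliding-window mean is one plausible reading of
    "temporal pooling"; the paper's actual pooling scheme may differ.
    features has shape (num_frames, num_features).
    """
    features = np.asarray(features, dtype=float)
    pooled = np.empty_like(features)
    for t in range(len(features)):
        start = max(0, t - window + 1)  # window is truncated at the sequence start
        pooled[t] = features[start:t + 1].mean(axis=0)
    return pooled
```

With `eps=0.1`, a prediction that misses its label by 0.05 contributes zero loss, whereas the squared loss still penalizes it slightly; this is the sense in which small errors are ignored.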
