DOI: 10.1609/aaai.v33i01.33016818
Free access

DialogueRNN: an attentive RNN for emotion detection in conversations

Published: 27 January 2019

Abstract

Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat histories, social media threads, and debates, argumentation mining, and understanding consumer feedback in live conversations. Current systems do not treat the parties in a conversation individually: they do not adapt to the speaker of each utterance. In this paper, we describe a new method based on recurrent neural networks that keeps track of the individual party states throughout the conversation and uses this information for emotion classification. Our model outperforms the state of the art by a significant margin on two different datasets.
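The abstract's core idea, keeping a separate recurrent state for each party and classifying every utterance from the current speaker's state, can be made concrete with a short sketch. The PyTorch snippet below is a minimal, hypothetical illustration, not the authors' actual architecture: the dimensions, module names, the single global-context GRU, and the simplified update rule are all assumptions made for the example.

```python
# Minimal sketch of per-party state tracking for emotion classification.
# All names, sizes, and the global-context GRU are illustrative assumptions,
# not the published DialogueRNN formulation.
import torch
import torch.nn as nn

class PartyStateTracker(nn.Module):
    """Keeps one hidden state per conversation party (illustrative only)."""

    def __init__(self, utt_dim=100, state_dim=100, n_emotions=6):
        super().__init__()
        self.state_dim = state_dim
        # context over the conversation so far (assumed component)
        self.global_cell = nn.GRUCell(utt_dim, state_dim)
        # per-party state, updated only when that party speaks
        self.party_cell = nn.GRUCell(utt_dim + state_dim, state_dim)
        self.classifier = nn.Linear(state_dim, n_emotions)

    def forward(self, utterances, speakers, n_parties):
        # utterances: (T, utt_dim) precomputed utterance features
        # speakers: party index of the speaker of each utterance
        g = utterances.new_zeros(1, self.state_dim)             # global context
        party = utterances.new_zeros(n_parties, self.state_dim) # party states
        logits = []
        for u, s in zip(utterances, speakers):
            g = self.global_cell(u.unsqueeze(0), g)
            # update only the current speaker's state, conditioned on context
            upd = self.party_cell(torch.cat([u.unsqueeze(0), g], dim=1),
                                  party[s:s + 1])
            party = torch.cat([party[:s], upd, party[s + 1:]], dim=0)
            # classify the utterance from the (updated) speaker state
            logits.append(self.classifier(upd.squeeze(0)))
        return torch.stack(logits)                              # (T, n_emotions)

# Toy usage on a four-utterance dyadic exchange with random features.
model = PartyStateTracker()
feats = torch.randn(4, 100)
scores = model(feats, speakers=[0, 1, 0, 1], n_parties=2)
print(scores.argmax(dim=-1))   # one predicted emotion label per utterance
```

The point the sketch makes concrete is that a party's state persists across the turns in which that party is silent, so the classifier sees how each speaker's state has evolved over the whole conversation rather than only the current utterance.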


Index Terms

  1. DialogueRNN: an attentive RNN for emotion detection in conversations
            Index terms have been assigned to the content through auto-classification.

            Recommendations

            Comments

            Please enable JavaScript to view thecomments powered by Disqus.

            Information & Contributors

            Information

Published In

AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence
January 2019
10088 pages
ISBN: 978-1-57735-809-1

Sponsors

• Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Publication History

Published: 27 January 2019

Qualifiers

• Research-article
• Research
• Refereed limited


Bibliometrics & Citations

Article Metrics

• Downloads (last 12 months): 89
• Downloads (last 6 weeks): 15

Reflects downloads up to 16 Nov 2024

Cited By
• (2024) On Multimodal Emotion Recognition for Human-Chatbot Interaction in the Wild. Proceedings of the 26th International Conference on Multimodal Interaction, 12-21. https://doi.org/10.1145/3678957.3685759. Online publication date: 4-Nov-2024.
• (2024) Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 9330-9339. https://doi.org/10.1145/3664647.3681648. Online publication date: 28-Oct-2024.
• (2024) A Unimodal Valence-Arousal Driven Contrastive Learning Framework for Multimodal Multi-Label Emotion Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 622-631. https://doi.org/10.1145/3664647.3681638. Online publication date: 28-Oct-2024.
• (2024) Multimodal Fusion via Hypergraph Autoencoder and Contrastive Learning for Emotion Recognition in Conversation. Proceedings of the 32nd ACM International Conference on Multimedia, 4341-4348. https://doi.org/10.1145/3664647.3681633. Online publication date: 28-Oct-2024.
• (2024) DQ-Former: Querying Transformer with Dynamic Modality Priority for Cognitive-aligned Multimodal Emotion Recognition in Conversation. Proceedings of the 32nd ACM International Conference on Multimedia, 4795-4804. https://doi.org/10.1145/3664647.3681599. Online publication date: 28-Oct-2024.
• (2024) Multimodal Emotion Recognition Calibration in Conversations. Proceedings of the 32nd ACM International Conference on Multimedia, 9621-9630. https://doi.org/10.1145/3664647.3681515. Online publication date: 28-Oct-2024.
• (2024) Personality-affected Emotion Generation in Dialog Systems. ACM Transactions on Information Systems 42(5), 1-27. https://doi.org/10.1145/3655616. Online publication date: 13-May-2024.
• (2024) Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation. Proceedings of the 2024 International Conference on Multimedia Retrieval, 533-542. https://doi.org/10.1145/3652583.3658104. Online publication date: 30-May-2024.
• (2024) Hypergraph Neural Network for Emotion Recognition in Conversations. ACM Transactions on Asian and Low-Resource Language Information Processing 23(2), 1-16. https://doi.org/10.1145/3638760. Online publication date: 8-Feb-2024.
• (2024) UniMPC: Towards a Unified Framework for Multi-Party Conversations. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2639-2649. https://doi.org/10.1145/3627673.3679864. Online publication date: 21-Oct-2024.
