[1] Zhang, Y.Z., Rong, L., Song, D.W. and Zhang, P. (2020) A Survey of Multimodal Sentiment Analysis. Pattern Recognition and Artificial Intelligence, 33, 426-438. (In Chinese)
[2] Morency, L.-P., et al. (2011) Towards Multimodal Sentiment Analysis: Harvesting Opinions from the Web. Proceedings of the 13th International Conference on Multimodal Interfaces, Alicante, 14-18 November 2011, 169-176.
[3] Perez-Rosas, V., Mihalcea, R. and Morency, L.-P. (2013) Utterance-Level Multimodal Sentiment Analysis. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Volume 1, 973-982.
[4] Wöllmer, M., Knaup, T., et al. (2013) YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context. IEEE Intelligent Systems, 28, 46-53. https://doi.org/10.1109/MIS.2013.34
[5] Park, S.S., et al. (2016) Multimodal Analysis and Prediction of Persuasiveness in Online Social Multimedia. ACM Transactions on Interactive Intelligent Systems, 6, Article 25. https://doi.org/10.1145/2897739
[6] Zadeh, A., Zellers, R., Pincus, E. and Morency, L.-P. (2016) MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos. arXiv: 1606.06259.
[7] Zadeh, A.A.B., Poria, S., Cambria, E. and Morency, L.-P. (2018) Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Volume 1, 2236-2246. https://doi.org/10.18653/v1/P18-1208
[8] Yu, W., et al. (2020) CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-Grained Annotations of Modality. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July 2020, 3718-3727. https://doi.org/10.18653/v1/2020.acl-main.343
[9] Dash, A.K., Rout, J.K. and Jena, S.K. (2016) Harnessing Twitter for Automatic Sentiment Identification Using Machine Learning Techniques. Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics, Vol. 44, 507-514. https://doi.org/10.1007/978-81-322-2529-4_53
[10] Vinodhini, G. and Chandrasekaran, R.M. (2019) A Comparative Performance Evaluation of a Neural Network-Based Approach for Sentiment Classification of Online Reviews. Journal of King Saud University - Computer and Information Sciences, 28, 2-12. https://doi.org/10.1016/j.jksuci.2014.03.024
[11] Kaibi, I. and Nfaoui, E.H. (2019) A Comparative Evaluation of Word Embeddings Techniques for Twitter Sentiment Analysis. 2019 International Conference on Wireless Technologies, Embedded and Intelligent Systems (WITS), Fez, 3-4 April 2019, 1-4. https://doi.org/10.1109/WITS.2019.8723864
[12] Ahuja, R., Chug, A., Kohli, S., Gupta, S. and Ahuja, P. (2019) The Impact of Features Extraction on the Sentiment Analysis. Procedia Computer Science, 152, 341-348. https://doi.org/10.1016/j.procs.2019.05.008
[13] Mohey, D. (2016) Enhancement Bag-of-Words Model for Solving the Challenges of Sentiment Analysis. International Journal of Advanced Computer Science and Applications, 7, 244-252. https://doi.org/10.14569/IJACSA.2016.070134
[14] Poria, S., Cambria, E., Hussain, A. and Huang, G.-B. (2015) Towards an Intelligent Framework for Multimodal Affective Data Analysis. Neural Networks, 63, 104-116. https://doi.org/10.1016/j.neunet.2014.10.005
[15] Piana, S., Staglianó, A., Odone, F., Verri, A. and Camurri, A. (2014) Real-Time Automatic Emotion Recognition from Body Gestures.
[16] Noroozi, F., Corneanu, C.A., Kaminska, D., Sapinski, T., Escalera, S. and Anbarjafari, G. (2018) Survey on Emotional Body Gesture Recognition.
[17] Yakaew, A., Dailey, M. and Racharak, T. (2021) Multimodal Sentiment Analysis on Video Streams Using Lightweight Deep Neural Networks. Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods, 4-6 February 2021, 442-451. https://doi.org/10.5220/0010304404420451
[18] Song, K., Yao, T., Ling, Q., et al. (2018) Boosting Image Sentiment Analysis with Visual Attention. Neurocomputing, 312, 218-228. https://doi.org/10.1016/j.neucom.2018.05.104
[19] Wang, R.W. and Meng, X.R. (2020) A Review of Image Sentiment Analysis. Documentation, Information & Knowledge, No. 3, 119-127. (In Chinese)
[20] Zhu, X.L. (2019) Research on Joint Image-Text Sentiment Analysis Based on Attention Mechanism. Master's Thesis, Southeast University, Nanjing. (In Chinese)
[21] You, Q.Z., Jin, H.L. and Luo, J.B. (2017) Visual Sentiment Analysis by Attending on Local Image Regions. Thirty-First AAAI Conference on Artificial Intelligence, 31, 231-237. https://doi.org/10.1609/aaai.v31i1.10501
[22] Mittal, N., Sharma, D., Joshi, M.L., et al. (2018) Image Sentiment Analysis Using Deep Learning. In: Proceedings of the 2018 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE, Piscataway, 684-687. https://doi.org/10.1109/WI.2018.00-11
[23] Vaswani, A., Shazeer, N., Parmar, N., et al. (2017) Attention Is All You Need. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, 4-9 December 2017, 5998-6008.
[24] Andayani, F., Theng, L.B., Tsun, M.T.K. and Chua, C. (2022) Hybrid LSTM-Transformer Model for Emotion Recognition from Speech Audio Files. IEEE Access, 10, 36018-36027. https://doi.org/10.1109/ACCESS.2022.3163856
[25] Heusser, V., Freymuth, N., Constantin, S. and Waibel, A. (2019) Bimodal Speech Emotion Recognition Using Pre-Trained Language Models.
[26] Jing, D., Manting, T. and Li, Z. (2021) Transformer-Like Model with Linear Attention for Speech Emotion Recognition. Journal of Southeast University, 37, 164-170.
[27] Sakatani, Y. (2021) Combining RNN with Transformer for Modeling Multi-Leg Trips. ACM WSDM WebTour 2021, Jerusalem, 12 March 2021, 50-52.
[28] Monkaresi, H., Hussain, M.S. and Calvo, R.A. (2012) Classification of Affects Using Head Movement, Skin Color Features and Physiological Signals. 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Seoul, 14-17 October 2012, 2664-2669. https://doi.org/10.1109/ICSMC.2012.6378149
[29] Cai, G. and Xia, B. (2015) Convolutional Neural Networks for Multimedia Sentiment Analysis. In: Li, J., Ji, H., Zhao, D. and Feng, Y., Eds., Natural Language Processing and Chinese Computing, Vol. 9362, Springer International Publishing, Nanchang, 159-167. https://doi.org/10.1007/978-3-319-25207-0_14
[30] Dobrisek, S., Gajsek, R., Mihelic, F., Pavesic, N. and Struc, V. (2013) Towards Efficient Multi-Modal Emotion Recognition. International Journal of Advanced Robotic Systems, 10, Article 53. https://doi.org/10.5772/54002
[31] Wöllmer, M., Weninger, F., Knaup, T., Schuller, B., Sun, C., Sagae, K. and Morency, L.-P. (2013) YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context. IEEE Intelligent Systems, 28, 46-53. https://doi.org/10.1109/MIS.2013.34
[32] Siddiquie, B., Chisholm, D. and Divakaran, A. (2015) Exploiting Multimodal Affect and Semantics to Identify Politically Persuasive Web Videos.
[33] Mansoorizadeh, M. and Charkari, M. (2010) Multimodal Information Fusion Application to Human Emotion Recognition from Face and Speech. Multimedia Tools and Applications, 49, 277-297. https://doi.org/10.1007/s11042-009-0344-2
[34] Lin, J.-C., Wu, C.-H. and Wei, W.-L. (2012) Error Weighted Semi-Coupled Hidden Markov Model for Audio-Visual Emotion Recognition. IEEE Transactions on Multimedia, 14, 142-156. https://doi.org/10.1109/TMM.2011.2171334
[35] Zeng, Z., Hu, Y., Liu, M., Fu, Y. and Huang, T.S. (2006) The Training Combination Strategy of Multi-Stream Fused Hidden Markov Model for Audio-Visual Affect Recognition. Proceedings of the 14th Annual ACM International Conference on Multimedia, Santa Barbara, 23-27 October 2006, 65. https://doi.org/10.1145/1180639.1180661
[36] Sebe, N., Cohen, I., Gevers, T. and Huang, T.S. (2006) Emotion Recognition Based on Joint Visual and Audio Cues. 18th International Conference on Pattern Recognition (ICPR'06), Hong Kong, 20-24 August 2006, 1136-1139. https://doi.org/10.1109/ICPR.2006.489
[37] Song, M., Bu, J., Chen, C. and Li, N. (2004) Audio-Visual-Based Emotion Recognition—A New Approach. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 1020-1025. https://doi.org/10.1109/CVPR.2004.1315276
[38] Al-Azani, S. and El-Alfy, E.-S.M. (2020) Enhanced Video Analytics for Sentiment Analysis Based on Fusing Textual, Auditory and Visual Information. IEEE Access, 8. https://doi.org/10.1109/ACCESS.2020.3011977
[39] Corradini, A., Mehta, M., Bernsen, N.O., Martin, J.C. and Abrilian, S. (2005) Multimodal Input Fusion in Human-Computer Interaction. In: Data Fusion for Situation Monitoring, Incident Detection, Alert and Response Management, IOS Press, Tsakhkadzor, 223-234.
[40] Iyengar, G., Nock, H.J. and Neti, C. (2003) Audio-Visual Synchrony for Detection of Monologues in Video Archives. IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, 6-10 April 2003, V-772. https://doi.org/10.1109/ICME.2003.1220921
[41] Liu, B. (2019) Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. China Machine Press, Beijing, 149-156. (In Chinese)
[42] Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G. and Hassabis, D. (2016) Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529, 484-489. https://doi.org/10.1038/nature16961
[43] Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y. and Manzagol, P.-A. (2010) Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research, 11, 3371-3408.
[44] Sak, H., Senior, A. and Beaufays, F. (2014) Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling. Proceedings of Interspeech 2014, Singapore, 14-18 September 2014, 338-342. https://doi.org/10.21437/Interspeech.2014-80
[45] Pal, S., Ghosh, S. and Nag, A. (2018) Sentiment Analysis in the Light of LSTM Recurrent Neural Networks. International Journal of Synthetic Emotions, 9, 33-39. https://doi.org/10.4018/IJSE.2018010103
[46] Tang, D., Qin, B. and Feng, X. (2016) Effective LSTMs for Target-Dependent Sentiment Classification. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, December 2016, 3298-3307.
[47] Basiri, M.E., Nemati, S. and Abdar, M. (2020) An Attention-Based Bidirectional CNN-RNN Deep Model for Sentiment Analysis. Future Generation Computer Systems, 115, 279-294. https://doi.org/10.1016/j.future.2020.08.005
[48] Letarte, G., Paradis, F. and Giguere, P. (2018) Importance of Self-Attention for Sentiment Analysis. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Brussels, November 2018, 267-275. https://doi.org/10.18653/v1/W18-5429
[49] Li, W., Qi, F. and Tang, M. (2020) Bidirectional LSTM with Self-Attention Mechanism and Multi-Channel Features for Sentiment Classification. Neurocomputing, 387, 63-77. https://doi.org/10.1016/j.neucom.2020.01.006
[50] Xu, Q., Zhu, L. and Dai, T. (2020) Aspect-Based Sentiment Classification with Multiattention Network. Neurocomputing, 388, 135-143. https://doi.org/10.1016/j.neucom.2020.01.024
[51] Cao, R., Ye, C. and Zhou, H. (2021) Multimodal Sentiment Analysis with Self-Attention. Proceedings of the Future Technologies Conference (FTC), Volume 1, 16-26. https://doi.org/10.1007/978-3-030-63128-4_2
[52] Shenoy, A. and Sardana, A. (2020) Multilogue-Net: A Context Aware RNN for Multi-Modal Emotion Detection and Sentiment Analysis in Conversation. The 58th Annual Meeting of the Association for Computational Linguistics, Seattle, 5-10 July 2020, 19-28. https://doi.org/10.18653/v1/2020.challengehml-1.3
[53] Mai, S., Hu, H. and Xing, S. (2019) Conquer and Combine: Hierarchical Feature Fusion Network with Local and Global Perspectives for Multimodal Affective Computing. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, July 2019, 481-492. https://doi.org/10.18653/v1/P19-1046
[54] Chauhan, D., Poria, S., Ekbal, A., et al. (2018) Contextual Inter-Modal Attention for Multi-Modal Sentiment Analysis. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, October-November 2018, 3454-3466.
[55] Kim, T. (2020) Multi-Attention Multimodal Sentiment Analysis. ICMR'20 Proceedings of the 2020 International Conference on Multimedia Retrieval, Dublin, 8-11 June 2020, 436-441.
[56] Liang, P.P., Kolter, J.Z., Morency, L.-P., et al. (2019) Multimodal Transformer for Unaligned Multimodal Language Sequences. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, 28 July-2 August 2019, 6558-6569.