
Humor Detection System for MuSE 2023: Contextual Modeling, Pseudo Labelling, and Post-smoothing

Published: 29 October 2023

Abstract

Humor detection has emerged as an active research area within the field of artificial intelligence, and it has made remarkable progress over the past few decades with the development of deep learning. This paper introduces a novel framework aimed at enhancing the model's understanding of humorous expressions. Specifically, we consider the impact of the correspondence between labels and features. To obtain more effective models with limited training samples, we employ pseudo labeling, a widely used semi-supervised learning technique. Furthermore, we apply a post-smoothing strategy to eliminate abnormally high predictions. In addition, to alleviate over-fitting on the validation set, we create 10 different random subsets of the training data, train a model on each, and aggregate their predictions. To verify the effectiveness of our strategy, we evaluate its performance on the Cross-Cultural Humour sub-challenge at MuSe 2023. Experimental results demonstrate that our system achieves an AUC score of 0.9112, surpassing the baseline models by a substantial margin.
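The abstract names three strategies: pseudo labeling on unlabeled data, post-smoothing of the predicted humor scores, and aggregation over 10 random training subsets. The sketch below is a minimal, hypothetical illustration of how these pieces could fit together, assuming frame-level humor probabilities in [0, 1]; the function names (pseudo_label, post_smooth, subset_ensemble) and parameters (confidence threshold, window size, subset fraction) are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the three strategies described in the abstract.
# All names and hyperparameters are illustrative, not the authors' code.
import numpy as np

def pseudo_label(predict_proba, unlabeled_x, threshold=0.9):
    """Keep unlabeled samples whose predicted humor probability is confident
    and assign them hard pseudo labels (assumed confidence threshold)."""
    probs = predict_proba(unlabeled_x)                      # shape (n,)
    confident = (probs >= threshold) | (probs <= 1.0 - threshold)
    labels = (probs >= 0.5).astype(np.int64)
    return unlabeled_x[confident], labels[confident]

def post_smooth(preds, window=5, clip_quantile=0.99):
    """Moving-average smoothing over time, then clip abnormally high scores."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(preds, kernel, mode="same")
    upper = np.quantile(smoothed, clip_quantile)
    return np.minimum(smoothed, upper)

def subset_ensemble(train_fn, predict_fn, x, y, x_test,
                    n_subsets=10, subset_frac=0.8, seed=0):
    """Train one model per random training subset and average their predictions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    all_preds = []
    for _ in range(n_subsets):
        idx = rng.choice(n, size=int(subset_frac * n), replace=False)
        model = train_fn(x[idx], y[idx])
        all_preds.append(predict_fn(model, x_test))
    return np.mean(all_preds, axis=0)
```

Under these assumptions, a final score sequence would be obtained by calling post_smooth on the output of subset_ensemble, with pseudo_label used beforehand to enlarge the training set passed to train_fn.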

Supplementary Material

MP4 File (045video.mp4)
Presentation video





      Information & Contributors

      Information

      Published In

      MuSe '23: Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, Humour and Personalisation
      November 2023
      113 pages
      ISBN:9798400702709
      DOI:10.1145/3606039
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 29 October 2023

      Author Tags

      1. contextual modeling
      2. humor detection
      3. post-smoothing
      4. pseudo labeling

      Qualifiers

      • Research-article

      Funding Sources

• National Natural Science Foundation of China
• Beijing Municipal Science & Technology Commission, Administrative Commission of Zhongguancun Science Park
• Open Research Projects of Zhejiang Lab

      Conference

      MM '23

      Acceptance Rates

      Overall Acceptance Rate 14 of 17 submissions, 82%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

• Downloads (Last 12 months): 274
• Downloads (Last 6 weeks): 19

Reflects downloads up to 13 Feb 2025

      Citations

Cited By

• (2024) AMTN: Attention-Enhanced Multimodal Temporal Network for Humor Detection. Proceedings of the 5th on Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, 65-69. https://doi.org/10.1145/3689062.3689375. Online publication date: 28-Oct-2024.
• (2024) The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition. Proceedings of the 5th on Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, 1-9. https://doi.org/10.1145/3689062.3689088. Online publication date: 28-Oct-2024.
• (2024) Social Perception Prediction for MuSe 2024: Joint Learning of Multiple Perceptions. Proceedings of the 5th on Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, 52-59. https://doi.org/10.1145/3689062.3689087. Online publication date: 28-Oct-2024.
• (2024) DPP: A Dual-Phase Processing Method for Cross-Cultural Humor Detection. Proceedings of the 5th on Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, 70-78. https://doi.org/10.1145/3689062.3689080. Online publication date: 28-Oct-2024.
• (2024) Generative Action Procedure Manzai Scenario Based on Maslow's Stages of Need Theory. Advances in Network-Based Information Systems, 319-327. https://doi.org/10.1007/978-3-031-72325-4_31. Online publication date: 20-Sep-2024.
• (2023) MuSe 2023 Challenge: Multimodal Prediction of Mimicked Emotions, Cross-Cultural Humour, and Personalised Recognition of Affects. Proceedings of the 31st ACM International Conference on Multimedia, 9723-9725. https://doi.org/10.1145/3581783.3610943. Online publication date: 26-Oct-2023.
