DOI: 10.1145/3595916.3626397
Research Article

Cross-modal Consistency Learning with Fine-grained Fusion Network for Multimodal Fake News Detection

Published: 01 January 2024

Abstract

Previous studies on multimodal fake news detection have observed the mismatch between text and images in fake news and attempted to explore the consistency of multimodal news based on global features of different modalities. However, they fail to investigate the relationship between fine-grained fragments in multimodal content. To gain public trust, fake news often includes relevant parts in both the text and the image, making such multimodal content appear consistent, and relying on global features may suppress potential inconsistencies in the irrelevant parts. Therefore, in this paper, we propose a novel Consistency-learning Fine-grained Fusion Network (CFFN) that separately explores consistency and inconsistency from high-relevant and low-relevant word-region pairs. Specifically, for a multimodal post, we divide word-region pairs into high-relevant and low-relevant parts based on their relevance scores. For the high-relevant part, we follow the cross-modal attention mechanism to explore consistency; for the low-relevant part, we calculate inconsistency scores to capture inconsistent points. Finally, a selection module chooses the primary clue (consistency or inconsistency) for identifying the credibility of multimodal news. Extensive experiments on two public datasets demonstrate that our CFFN substantially outperforms all baselines. Our code can be found at: https://github.com/uestc-lj/CFFN/.
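
The pipeline described in the abstract (relevance-based splitting of word-region pairs, cross-modal attention over the high-relevant part, an inconsistency score over the low-relevant part, and a selection module) can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository); the cosine-similarity relevance score, the threshold `tau`, and all function names below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def split_pairs_by_relevance(W, R, tau=0.5):
    """Score every word-region pair by cosine similarity and split the pairs
    into high-relevant and low-relevant masks (tau is an illustrative threshold)."""
    sim = F.normalize(W, dim=-1) @ F.normalize(R, dim=-1).T   # (n_words, n_regions)
    high_mask = sim >= tau
    return sim, high_mask, ~high_mask

def consistency_clue(W, R, sim, high_mask):
    """Cross-modal attention restricted to high-relevant pairs: each word
    attends only to its high-relevant regions; the result is pooled over words."""
    attn = sim.masked_fill(~high_mask, float("-inf"))
    attn = torch.softmax(attn, dim=-1)
    attn = torch.nan_to_num(attn)             # words with no high-relevant region
    return (attn @ R).mean(dim=0)             # pooled consistency representation

def inconsistency_clue(sim, low_mask):
    """Inconsistency score from low-relevant pairs: lower similarity in these
    pairs signals a stronger text-image mismatch."""
    return (1.0 - sim[low_mask]).mean() if low_mask.any() else sim.new_tensor(0.0)

# Toy usage: 12 word features and 36 region features in a shared 256-d space
# (e.g., projected outputs of a text encoder and a vision backbone).
W, R = torch.randn(12, 256), torch.randn(36, 256)
sim, hi, lo = split_pairs_by_relevance(W, R)
cons = consistency_clue(W, R, sim, hi)
incons = inconsistency_clue(sim, lo)
# A selection module (e.g., a learned gate) would then weigh `cons` against
# `incons` before the final real/fake classification.
```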




    Published In

    MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
    December 2023
    745 pages
    ISBN:9798400702051
    DOI:10.1145/3595916
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 January 2024


    Author Tags

    1. Cross-modal Consistency Learning
    2. Fine-grained Fusion
    3. Multimodal Fake News Detection
    4. Social Media

    Qualifiers

    • Research-article
    • Research
    • Refereed limited


    Conference

    MMAsia '23
    Sponsor: MMAsia '23: ACM Multimedia Asia
    December 6 - 8, 2023
    Tainan, Taiwan

    Acceptance Rates

    Overall Acceptance Rate: 59 of 204 submissions, 29%
