Multimodal Cheapfakes Detection by Utilizing Image Captioning for Global Context

Authors:

Quang-Tien Tran,

Thanh-Phuc Tran,

Duc-Tien Dang-Nguyen,

Minh-Son Dao

ICDAR '22: Proceedings of the 3rd ACM Workshop on Intelligent Cross-Data Analysis and Retrieval

Pages 9 - 16

https://doi.org/10.1145/3512731.3534210

Published: 27 June 2022

Abstract

The rapid development of technology on social media platforms has led to an abundance of misinformation and fake news spreading through the community. One of the most prevalent ways of spreading misleading information on social media is cheapfakes, which are more accessible and affordable to produce than deepfakes. Most existing approaches either extract features from text alone or concatenate visual and textual features and train a multimodal model to classify news. This paper proposes several strategies that leverage object, textual, and image-captioning features. These strategies focus on using image captioning to extract the correlation between images and captions. We also propose several boosting techniques to improve the results. Our methods are evaluated on the "MMSys'21 Grand Challenge" dataset and achieve 86.75% accuracy.
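As a rough illustration of the idea of using a generated image caption as global context, the sketch below compares a machine-generated caption with two news captions attached to the same image. It is not the authors' pipeline: the bag-of-words cosine similarity stands in for whatever learned text features a real system would use, the generated caption is assumed to come from some external captioning model, and the function names and the 0.2 threshold are hypothetical choices made only for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def caption_context_scores(generated_caption, caption1, caption2):
    """Compare the generated image caption with each news caption,
    and the two news captions with each other."""
    g = bag_of_words(generated_caption)
    c1 = bag_of_words(caption1)
    c2 = bag_of_words(caption2)
    return {
        "image_vs_caption1": cosine_similarity(g, c1),
        "image_vs_caption2": cosine_similarity(g, c2),
        "caption1_vs_caption2": cosine_similarity(c1, c2),
    }


def is_out_of_context(scores, threshold=0.2):
    """Hypothetical decision rule: flag the pair when the two captions
    have little in common with each other."""
    return scores["caption1_vs_caption2"] < threshold


if __name__ == "__main__":
    # The generated caption is assumed to be produced by an image captioning model.
    scores = caption_context_scores(
        "a man speaking at a podium",
        "a man speaking at a podium during a rally",
        "floods destroy homes in the city",
    )
    print(scores, is_out_of_context(scores))
```

In a real system the three scores would more plausibly be fed, together with object and textual features, into a trained classifier rather than compared against a fixed threshold.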

Supplementary Material

MP4 File (ICDAR_23.mp4)

Presentation video for "Multimodal Cheapfakes Detection by Utilizing Image Captioning for Global Context" at ICDAR, a workshop of ICMR.

Index Terms

Multimodal Cheapfakes Detection by Utilizing Image Captioning for Global Context
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
    2. Natural language processing

Published In

ICDAR '22: Proceedings of the 3rd ACM Workshop on Intelligent Cross-Data Analysis and Retrieval

June 2022

80 pages

ISBN:9781450392419

DOI:10.1145/3512731

General Chair:
Minh-Son Dao, National Institute of Information and Communications Technology, Japan

Program Chairs:
Duc-Tien Dang-Nguyen, Bergen University, Norway
Michael Riegler, Simula Metropolitan Center for Digital Engineering, Norway

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICMR '22

Sponsor:

SIGMM

ICMR '22: International Conference on Multimedia Retrieval

June 27 - 30, 2022

Newark, NJ, USA

Article Metrics

Total Citations: 13
Total Downloads: 237
Downloads (last 12 months): 53
Downloads (last 6 weeks): 7

Reflects downloads up to 14 Nov 2024
