DOI: 10.1145/3652583.3657601
research-article
Open access

A Multi-Stage Deep Learning Approach Incorporating Text-Image and Image-Image Comparisons for Cheapfake Detection

Published: 07 June 2024

Abstract

The advancement of multimedia and artificial intelligence (AI) technologies has lowered the barriers to information sharing, but it has also fueled a surge in the spread of fake information. In this context, there is a growing need for research on detecting 'cheapfakes': fake media that are inexpensive and easy to create. This paper proposes a multi-stage deep learning process designed to detect cheapfakes, which are diverse and evolve rapidly. A single-step deep learning model is limited in its ability to distinguish the various types of cheapfakes, so a more complex approach that combines several deep learning models is needed to detect subtle Out-of-Context (OOC) phenomena. This study uses models based on Bidirectional Encoder Representations from Transformers (BERT) and Stable Diffusion for cheapfake detection. The model was evaluated on a real dataset in the ACM ICMR 2024 challenge, where it achieved an accuracy of 71.9% in Task 1, an improvement of 7% over previous methods, and an accuracy of 55.7% in Task 2. These results are expected to make a significant contribution to the development of strategies for creating and countering cheapfakes. Additionally, this research aims to contribute to the detection of OOC media misuse through this challenge.
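To make the idea of a multi-stage check concrete, the following is a minimal sketch of a two-stage cascade that combines a text-image comparison with an image-image comparison, as named in the title. It is not the authors' implementation: the abstract does not specify the stages, their order, or any thresholds. The function names, the threshold values, and the assumption that captions and images are already embedded in a shared vector space (and that a candidate image has already been generated from the caption by a text-to-image model) are all hypothetical.

```python
from math import sqrt
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect_ooc(
    caption_vec: Sequence[float],    # embedding of the caption (assumed precomputed)
    image_vec: Sequence[float],      # embedding of the published image (assumed precomputed)
    generated_vec: Sequence[float],  # embedding of an image generated from the caption
    text_image_thr: float = 0.3,     # hypothetical threshold, not taken from the paper
    image_image_thr: float = 0.5,    # hypothetical threshold, not taken from the paper
) -> bool:
    """Return True if the caption is flagged as out-of-context for the image."""
    # Stage 1 (text-image): if the caption already matches the image, stop early.
    if cosine(caption_vec, image_vec) >= text_image_thr:
        return False
    # Stage 2 (image-image): compare the published image with the image
    # generated from the caption; flag only if they also disagree.
    return cosine(image_vec, generated_vec) < image_image_thr


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embeddings.
    print(detect_ooc([0.0, 1.0], [1.0, 0.0], [0.0, 1.0]))
```

The cascade flags a sample only when both comparisons disagree, which is one possible way to combine stages; a real system would calibrate the thresholds on validation data.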

Cited By

  • (2024) Overview of the Grand Challenge on Detecting Cheapfakes at ACM ICMR 2024. Proceedings of the 2024 International Conference on Multimedia Retrieval, pp. 1275-1281. DOI: 10.1145/3652583.3657587. Online publication date: 30-May-2024

    Published In

    ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
    May 2024
    1379 pages
    ISBN:9798400706196
    DOI:10.1145/3652583
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bert
    2. cheapfakes
    3. ground image captioning
    4. misinformation
    5. out-of-context
    6. semantic textual similarity
    7. stable diffusion

    Qualifiers

    • Research-article

    Funding Sources

    • Korea Institute of Police Technology
    • National Research Foundation of Korea

    Conference

    ICMR '24

    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%
