DOI: 10.1145/3512732.3533589
research-article

Automatic and Manual Detection of Generated News: Case Study, Limitations and Challenges

Published: 27 June 2022

Abstract

In this paper, we study the exploitation of language generation models for disinformation purposes from two viewpoints. Quantitatively, we argue that language models struggle with domain adaptation (i.e., the ability to generate text on topics that are not part of the training data, as is typically required for news). To support this claim, we show that both simple machine learning models and manual detection can spot machine-generated news in this practically relevant context. Qualitatively, we highlight the differences between these automatic and manual detection processes, and their potential for constructive interaction in order to limit the impact of automated disinformation campaigns. We also discuss the consequences of these findings for the constructive use of natural language generation to produce news items.



Published In

MAD '22: Proceedings of the 1st International Workshop on Multimedia AI against Disinformation
June 2022
93 pages
ISBN:9781450392426
DOI:10.1145/3512732


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. digital journalism
  2. fake news detection

Qualifiers

  • Research-article

Conference

ICMR '22
