Abstract
In today’s technologically driven world, the rapid spread of fake news, particularly during critical events like elections, poses a growing threat to the integrity of information. To tackle this challenge, we introduce FakeWatch, a comprehensive framework designed to detect fake news. Using a newly curated dataset of North American election-related news articles, we construct robust classification models. Our framework integrates a model hub comprising both traditional machine learning (ML) techniques and state-of-the-art language models (LMs) to discern fake news effectively. Our objective is to provide the research community with adaptable and precise classification models for identifying election-related fake news. Quantitative evaluations of fake news classifiers on our dataset reveal that, while state-of-the-art LMs exhibit a slight edge over traditional ML models, classical models remain competitive due to their balance of accuracy and computational efficiency. Additionally, qualitative analyses shed light on patterns within fake news articles. We provide our labeled data (https://huggingface.co/datasets/newsmediabias/fake_news_elections_labelled_data) and model (https://huggingface.co/newsmediabias/FakeWatch) for reproducibility and further research.
Notes
https://newspaper.readthedocs.io/en/latest/
https://textblob.readthedocs.io/en/dev/
https://www.fmsasg.com/socialnetworkanalysis/
Acknowledgements
Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute. The authors would also like to thank the anonymous reviewers for their constructive feedback.
Author information
Contributions
The study was designed by S.R., who also conducted the initial literature review. T.K. and V.C. contributed to the study design and conducted preliminary experiments. D.P.P. was responsible for data labeling and development of the primary model, while M.R. handled the data curation and additional data labeling. V.C. and T.K. reviewed the annotations and experimental procedures. The first draft of the paper was written by T.H., V.C., and S.R. Baseline experiments were carried out by O.B., and the data analysis was performed by V.C. and S.R. The manuscript underwent revisions by V.C. and S.R. All authors gave their approval for the final version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Raza, S., Khan, T., Chatrath, V. et al. FakeWatch: a framework for detecting fake news to ensure credible elections. Soc. Netw. Anal. Min. 14, 142 (2024). https://doi.org/10.1007/s13278-024-01290-1
DOI: https://doi.org/10.1007/s13278-024-01290-1