DOI: 10.1145/3097983.3098131

Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by ClaimBuster

Published: 13 August 2017


This paper describes how ClaimBuster, a fact-checking platform, uses natural language processing and supervised learning to detect important factual claims in political discourse. The claim-spotting model is built from a human-labeled dataset of check-worthy factual claims drawn from U.S. general election debate transcripts. The paper explains the system's architecture and components and evaluates the model. It presents a case study of how ClaimBuster covered the 2016 U.S. presidential election debates live and how it monitors social media and the Australian Hansard for factual claims. It also describes the current status and long-term goals of ClaimBuster as we continue to develop and expand it.
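To make the idea of supervised claim spotting concrete, the following is a minimal sketch and not ClaimBuster's actual model. It assumes a multinomial Naive Bayes classifier over bag-of-words tokens, chosen here only for brevity; the abstract does not specify the learner or the features. The training sentences, their labels, and the names `tokenize`, `train_naive_bayes`, and `check_worthiness` are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-written sentences (NOT taken from the ClaimBuster dataset).
# Label 1 = check-worthy factual claim, label 0 = anything else.
TRAIN = [
    ("Unemployment fell by two percent over the last four years.", 1),
    ("Our state cut taxes by 300 million dollars last year.", 1),
    ("Crime rose by ten percent in the last decade.", 1),
    ("Thank you all for being here tonight.", 0),
    ("I believe our best days are ahead of us.", 0),
    ("We need to work together for a better future.", 0),
]


def tokenize(sentence):
    """Lowercase, split into word and digit tokens, map every number to <num>."""
    tokens = re.findall(r"[a-z]+|\d+", sentence.lower())
    return ["<num>" if t.isdigit() else t for t in tokens]


def train_naive_bayes(examples):
    """Count tokens per class; smoothing is applied later at scoring time."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for sentence, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(sentence))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def check_worthiness(model, sentence):
    """Return P(label = 1 | sentence) under the toy model, a value in [0, 1]."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)  # class prior
        for tok in tokenize(sentence):
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed token likelihood.
            score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Normalize the two log scores into a probability.
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


MODEL = train_naive_bayes(TRAIN)

if __name__ == "__main__":
    for s in ["Taxes rose by 5 percent last year.", "Thank you for being here."]:
        print(f"{check_worthiness(MODEL, s):.3f}  {s}")
```

Sentences can then be ranked by the returned score so that a human fact-checker reviews the highest-scoring ones first. With only six invented training sentences the scores are illustrative, not meaningful.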






Published In

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. computational journalism
  2. fact-checking
  3. natural language processing
  4. text classification
  5. text mining


  • Research-article



KDD '17

Acceptance Rates

KDD '17 Paper Acceptance Rate: 64 of 748 submissions, 9%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%





Article Metrics

  • Downloads (Last 12 months): 853
  • Downloads (Last 6 weeks): 81
Reflects downloads up to 22 Dec 2024



Cited By

  • (2024) Capacitación tecnológica y formación en verificación en los medios de comunicación españoles (Technological and Verification Training in the Spanish Media). VISUAL REVIEW. International Visual Culture Review / Revista Internacional de Cultura Visual 16:4 (1-14). DOI: 10.62161/revvisual.v16.5208. Online publication date: 8-Jul-2024
  • (2024) Transformer-Based Tool for Automated Fact-Checking: A Pilot Study on Online Health Information (Preprint). JMIR Infodemiology. DOI: 10.2196/56831. Online publication date: 27-Jan-2024
  • (2024) Building a framework for fake news detection in the health domain. PLOS ONE 19:7 (e0305362). DOI: 10.1371/journal.pone.0305362. Online publication date: 8-Jul-2024
  • (2024) "The Data Says Otherwise" — Towards Automated Fact-checking and Communication of Data Claims. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-20). DOI: 10.1145/3654777.3676359. Online publication date: 13-Oct-2024
  • (2024) Investigating Characteristics, Biases and Evolution of Fact-Checked Claims on the Web. Proceedings of the 35th ACM Conference on Hypertext and Social Media (246-258). DOI: 10.1145/3648188.3675135. Online publication date: 10-Sep-2024
  • (2024) "Fact-checks are for the Top 0.1%": Examining Reach, Awareness, and Relevance of Fact-Checking in Rural India. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-34). DOI: 10.1145/3637333. Online publication date: 26-Apr-2024
  • (2024) Combat Greenwashing with GoalSpotter: Automatic Sustainability Objective Detection in Heterogeneous Reports. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (4752-4759). DOI: 10.1145/3627673.3680110. Online publication date: 21-Oct-2024
  • (2024) QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (4036-4040). DOI: 10.1145/3627673.3679985. Online publication date: 21-Oct-2024
  • (2024) A Comprehensive Cloud Architecture for Machine Learning-enabled Research. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing (1-8). DOI: 10.1145/3626203.3670525. Online publication date: 17-Jul-2024
  • (2024) Wildfire: A Twitter Social Sensing Platform for Layperson. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (1106-1109). DOI: 10.1145/3616855.3635704. Online publication date: 4-Mar-2024
