Toward computational fact-checking

Authors:

Pankaj K. Agarwal,

Cong Yu

Proceedings of the VLDB Endowment, Volume 7, Issue 7

Pages 589 - 600

https://doi.org/10.14778/2732286.2732295

Published: 01 March 2014

Abstract

Our news is saturated with claims of "facts" made from data. Database research has in the past focused on how to answer queries, but has not devoted much attention to discerning more subtle qualities of the resulting claims, e.g., is a claim "cherry-picking"? This paper proposes a framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate practical fact-checking tasks, such as reverse-engineering (often intentionally) vague claims and countering questionable claims, as computational problems. Along with the modeling framework, we develop an algorithmic framework that enables efficient instantiations of "meta" algorithms by supplying appropriate algorithmic building blocks. We present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
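The idea of perturbing a claim's parameters can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are invented, the function names (`claim_query`, `perturbations`, `fraction_at_least_as_strong`) are hypothetical, and the brute-force enumeration stands in for, and is not, the paper's algorithms. The claim is modeled as a window-comparison query with two parameters, an end position and a window width.

```python
from itertools import product

# Hypothetical yearly counts; made-up data for illustration only.
series = [100, 98, 97, 99, 96, 95, 90, 104, 93, 92]


def claim_query(data, end, width):
    """Parameterized claim: change between two back-to-back windows.

    Returns the sum of the `width` values ending at index `end` (inclusive)
    minus the sum of the `width` values immediately before them.
    """
    later = sum(data[end - width + 1 : end + 1])
    earlier = sum(data[end - 2 * width + 1 : end - width + 1])
    return later - earlier


def perturbations(data, widths):
    """Yield every (end, width) setting whose two windows fit inside the data."""
    for width, end in product(widths, range(len(data))):
        if end - 2 * width + 1 >= 0:
            yield end, width


def fraction_at_least_as_strong(data, end, width, widths):
    """Return the original result and the share of perturbed settings
    whose result is at least as large as the original claim's."""
    original = claim_query(data, end, width)
    results = [claim_query(data, e, w) for e, w in perturbations(data, widths)]
    stronger = sum(1 for r in results if r >= original)
    return original, stronger / len(results)


# Claim under scrutiny: "the count jumped in year 7 compared with year 6".
print(fraction_at_least_as_strong(series, end=7, width=1, widths=(1, 2, 3)))
```

In this toy data, the original setting is the only one among the enumerated perturbations that reaches the claimed increase, which is the kind of signal that might suggest the parameters were chosen selectively.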


Published In

Proceedings of the VLDB Endowment, Volume 7, Issue 7

March 2014

108 pages

ISSN: 2150-8097

Editors: H. V. Jagadish (University of Michigan), Aoying Zhou (East China Normal University, China)

Publisher: VLDB Endowment
