research-article

Toward computational fact-checking

Published: 01 March 2014

Abstract

Our news is saturated with claims of "facts" made from data. Database research has traditionally focused on how to answer queries, and has paid little attention to discerning subtler qualities of the resulting claims; for example, is a claim "cherry-picking"? This paper proposes a framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate practical fact-checking tasks, such as reverse-engineering vague (often intentionally vague) claims and countering questionable claims, as computational problems. Along with the modeling framework, we develop an algorithmic framework that enables efficient instantiations of "meta" algorithms by supplying appropriate algorithmic building blocks. We present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
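
The idea of treating a claim as a parameterized query and perturbing its parameters can be illustrated with a small sketch. This is not the paper's algorithm or its algorithmic building blocks; it is a brute-force illustration under stated assumptions. The data in `SERIES`, the window-comparison claim form, and the function names `window_change`, `perturbations`, and `fraction_at_least_as_strong` are all hypothetical, chosen only to show how a claim that looks strong at one parameter setting may look much weaker at nearby settings.

```python
from statistics import mean

# Hypothetical yearly counts, invented purely for illustration.
SERIES = [50, 52, 51, 49, 48, 47, 60, 46, 45, 44]


def window_change(series, end, width):
    """Parameterized query (assumed claim form): mean of the `width` values
    ending at index `end`, minus the mean of the `width` values just before."""
    later = series[end - width + 1 : end + 1]
    earlier = series[end - 2 * width + 1 : end - width + 1]
    return mean(later) - mean(earlier)


def perturbations(series, end, width, max_shift=2, max_stretch=1):
    """Yield (end, width, result) for nearby parameter settings,
    including the original one, that fit inside the series."""
    for d_end in range(-max_shift, max_shift + 1):
        for d_width in range(-max_stretch, max_stretch + 1):
            e, w = end + d_end, width + d_width
            if w >= 1 and e < len(series) and e - 2 * w + 1 >= 0:
                yield e, w, window_change(series, e, w)


def fraction_at_least_as_strong(series, end, width, **kwargs):
    """Share of perturbed settings whose result is at least as large as the
    original claim's result. A low value hints at cherry-picking."""
    original = window_change(series, end, width)
    results = [r for _, _, r in perturbations(series, end, width, **kwargs)]
    return sum(r >= original for r in results) / len(results)
```

With this invented data, the claim "the count rose by 13 over the previous year" corresponds to `window_change(SERIES, 6, 1)`. Calling `fraction_at_least_as_strong(SERIES, 6, 1)` shows that only one of the ten nearby parameter settings (the original itself) yields an increase that large, which is the kind of signal the perturbation idea is meant to surface. Exhaustive enumeration is used here only for clarity.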



Information & Contributors

Information

Published In

Proceedings of the VLDB Endowment  Volume 7, Issue 7
March 2014
108 pages
ISSN:2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 March 2014
Published in PVLDB Volume 7, Issue 7

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months)51
  • Downloads (Last 6 weeks)7
Reflects downloads up to 22 Dec 2024

Cited By

  • (2024) How to Avoid Jumping to Conclusions: Measuring the Robustness of Outstanding Facts in Knowledge Graphs. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3539-3550). doi:10.1145/3637528.3671763. Online publication date: 25-Aug-2024
  • (2023) Maximizing Neutrality in News Ordering. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (11-24). doi:10.1145/3580305.3599425. Online publication date: 6-Aug-2023
  • (2022) OREO. Proceedings of the VLDB Endowment 15:12 (3570-3573). doi:10.14778/3554821.3554846. Online publication date: 1-Aug-2022
  • (2022) On detecting cherry-picked generalizations. Proceedings of the VLDB Endowment 15:1 (59-71). doi:10.14778/3485450.3485457. Online publication date: 14-Jan-2022
  • (2022) Research on Fake News Detection Based on Diffusion Growth Rate. Wireless Communications & Mobile Computing 2022. doi:10.1155/2022/6329014. Online publication date: 1-Jan-2022
  • (2022) OpenTFV: An Open Domain Table-Based Fact Verification System. Proceedings of the 2022 International Conference on Management of Data (2405-2408). doi:10.1145/3514221.3520163. Online publication date: 10-Jun-2022
  • (2022) Divide-and-Conquer: Post-User Interaction Network for Fake News Detection on Social Media. Proceedings of the ACM Web Conference 2022 (1148-1158). doi:10.1145/3485447.3512163. Online publication date: 25-Apr-2022
  • (2022) Quantifying the impacts of online fake news on the equity value of social media platforms – Evidence from Twitter. International Journal of Information Management: The Journal for Information Professionals 64:C. doi:10.1016/j.ijinfomgt.2022.102474. Online publication date: 1-Jun-2022
  • (2022) Fighting post-truth using natural language processing. Expert Systems with Applications: An International Journal 141:C. doi:10.1016/j.eswa.2019.112943. Online publication date: 21-Apr-2022
  • (2022) Exploiting stance hierarchies for cost-sensitive stance detection of Web documents. Journal of Intelligent Information Systems 58:1 (1-19). doi:10.1007/s10844-021-00642-z. Online publication date: 1-Feb-2022