DOI: 10.1145/3534678.3539233
research-article

Evaluating Knowledge Graph Accuracy Powered by Optimized Human-machine Collaboration

Published: 14 August 2022

Abstract

Estimating the accuracy of an automatically constructed knowledge graph (KG) is challenging because the KG often contains a large number of entities and triples. KG construction generally involves two major components: information extraction (IE) and entity linking (EL). However, existing approaches focus only on triple accuracy, which reflects IE quality, and ignore entity accuracy entirely. Motivated by the observation that machines excel at computation while humans are skilled at verifying correctness, we propose an efficient interactive method that reduces the overall cost of evaluating KG quality and produces accuracy estimates with a statistical guarantee for both triples and entities. Instead of annotating triples and entities separately, we design a general annotation cost that blends triples and entities generated from the same source text. During human verification, the machine pre-computes and infers the triples to be annotated in the next round by speculating on human feedback. We optimize this human-machine collaborative mechanism by formulating an order selection problem over triples, which is NP-hard. We therefore propose a Monte Carlo Tree Search to guide the annotation process by finding an approximate solution. Extensive experiments demonstrate that our method incurs lower annotation cost while yielding higher-quality accuracy estimates than state-of-the-art approaches.
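
To make the phrase "accuracy estimates with a statistical guarantee" concrete, the following is a minimal sketch of a sampling-based accuracy estimator. It is not the method proposed in the paper: it assumes plain simple random sampling of triples, a normal-approximation (Wald) margin of error, and a stopping rule based on a target margin. The names `estimate_accuracy`, `annotate_until`, `oracle`, `target_moe`, `batch`, and `min_n` are hypothetical and chosen only for illustration.

```python
import math
import random


def estimate_accuracy(labels, z=1.96):
    """Return (point estimate, margin of error) from 0/1 correctness labels.

    Assumption: labels come from a simple random sample, and the margin uses
    the normal (Wald) approximation with critical value z (1.96 ~ 95%).
    """
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one annotated item")
    p = sum(labels) / n
    moe = z * math.sqrt(p * (1 - p) / n)
    return p, moe


def annotate_until(population, oracle, target_moe=0.05, batch=20, min_n=30, seed=0):
    """Annotate random batches until the margin of error is small enough.

    `oracle(item)` stands in for a human annotator and must return 1 (correct)
    or 0 (incorrect). `min_n` guards against stopping on a tiny sample.
    """
    order = list(population)
    if not order:
        raise ValueError("population is empty")
    random.Random(seed).shuffle(order)

    labels = []
    for start in range(0, len(order), batch):
        labels.extend(oracle(item) for item in order[start:start + batch])
        p, moe = estimate_accuracy(labels)
        if len(labels) >= min_n and moe <= target_moe:
            break
    return p, moe, len(labels)


if __name__ == "__main__":
    # Synthetic KG: every tenth triple is wrong, so the true accuracy is 0.9.
    triples = range(1000)
    p, moe, n = annotate_until(triples, lambda t: 0 if t % 10 == 0 else 1)
    print(f"estimated accuracy {p:.3f} +/- {moe:.3f} after {n} annotations")
```

The Wald interval is known to behave poorly when the sample accuracy is near 0 or 1, which is why the sketch enforces a minimum sample size; the interactive, cost-aware annotation ordering described in the abstract is outside the scope of this example.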

Supplemental Material

MP4 File
This is a video for paper "Evaluating Knowledge Graph Accuracy Powered by Optimized Human-machine Collaboration".

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accuracy estimation
  2. cost optimization
  3. human-machine collaboration
  4. knowledge graph

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 111
  • Downloads (last 6 weeks): 11
Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) Efficient and Reliable Estimation of Knowledge Graph Accuracy. Proceedings of the VLDB Endowment 17:9 (2392-2403). DOI: 10.14778/3665844.3665865. Online publication date: 6-Aug-2024
  • (2024) Veracity Estimation for Entity-Oriented Search with Knowledge Graphs. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (1649-1659). DOI: 10.1145/3627673.3679561. Online publication date: 21-Oct-2024
  • (2024) Towards assessing the quality of knowledge graphs via differential testing. Information and Software Technology 174 (107521). DOI: 10.1016/j.infsof.2024.107521. Online publication date: Oct-2024
  • (2024) Knowledge graph accuracy evaluation: an LLM-enhanced embedding approach. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-024-00661-3. Online publication date: 8-Oct-2024
