Abstract
As an effective means of knowledge representation and storage, knowledge graphs have been widely used in various fields. However, with the rapid growth in the scale and volume of knowledge graphs, knowledge quality issues inevitably arise. To evaluate the accuracy of a knowledge graph effectively and efficiently, a common paradigm is to match the facts in the knowledge graph against specific external knowledge. In this study, an LLM-enhanced (large language model enhanced) embedding framework is designed, integrating the verification ability of large language models to further assess the embedding results. First, an optimized embedding model is proposed that exploits the knowledge graph's internal structural information to measure whether the relation of a given triplet is likely to hold. Then, triplets with few supporting paths are selected as questionable ones, since their correctness cannot be determined confidently. Finally, the questionable triplets are filtered, and LLMs are adopted as external knowledge for further fact verification. These three parts are combined to achieve automated, accurate, and efficient evaluation of knowledge graphs.
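The three-stage pipeline described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names (`embedding_score`, `path_support`, `llm_verify`) and the support threshold are assumptions introduced here for illustration.

```python
# Hypothetical sketch of the three-stage LLM-enhanced evaluation pipeline.
# All component names and thresholds are illustrative assumptions.

def evaluate_graph(triplets, embedding_score, path_support, llm_verify,
                   support_threshold=2):
    """Label each (head, relation, tail) triplet as correct (True) or not."""
    results = {}
    for t in triplets:
        # Stage 1: structural plausibility from the embedding model.
        plausible = embedding_score(t)
        # Stage 2: triplets with enough supporting paths keep the
        # embedding verdict; the rest are "questionable".
        if path_support(t) >= support_threshold:
            results[t] = plausible
        else:
            # Stage 3: defer questionable triplets to LLM fact verification.
            results[t] = llm_verify(t)
    return results

# Toy usage with stub components standing in for the real models.
triplets = [("Shakespeare", "wrote", "Hamlet"),
            ("Shakespeare", "wrote", "War and Peace")]
score = lambda t: t[2] == "Hamlet"                 # stub embedding verdict
support = lambda t: 3 if t[2] == "Hamlet" else 0   # stub path count
verify = lambda t: False                           # stub LLM verdict
print(evaluate_graph(triplets, score, support, verify))
```

The design point the sketch captures is the division of labor: the cheap embedding model handles triplets whose internal path support is sufficient, and the expensive LLM is invoked only for the questionable remainder.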
Notes
In the embedding models, the number of training-set triplets is constrained by the BFS depth and the subgraph size.
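A depth- and size-bounded BFS extraction of this kind might look as follows; the parameter names (`max_depth`, `max_size`) and the adjacency representation are assumptions for illustration, not the paper's code.

```python
from collections import deque

# Illustrative sketch: collect training triplets reachable from a seed
# entity, bounded by BFS depth and by the number of collected triplets.
def bfs_subgraph(adjacency, seed, max_depth=2, max_size=5):
    """Return (head, relation, tail) triplets within the BFS bounds."""
    visited = {seed}
    triplets = []
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth bound
        for relation, neighbor in adjacency.get(node, []):
            if len(triplets) >= max_size:
                return triplets  # subgraph size bound reached
            triplets.append((node, relation, neighbor))
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triplets

adjacency = {
    "A": [("r1", "B"), ("r2", "C")],
    "B": [("r3", "D")],
    "D": [("r4", "E")],  # depth 3 from "A": excluded when max_depth=2
}
print(bfs_subgraph(adjacency, "A"))
```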
Acknowledgements
This research is funded by NSFC (72201275). The authors thank the researchers in AIBD and their teams for their very helpful discussions and suggestions.
Author information
Contributions
MZ, GY and XB conceived and designed the research. MZ and JS conducted the computer simulations. All authors analysed the results and wrote the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, M., Yang, G., Liu, Y. et al. Knowledge graph accuracy evaluation: an LLM-enhanced embedding approach. Int J Data Sci Anal (2024). https://doi.org/10.1007/s41060-024-00661-3