Abstract
Knowledge graphs have proven effective in modeling structured information and conceptual knowledge, especially in the medical domain. However, the lack of high-quality annotated corpora remains a crucial obstacle to advancing research and applications in this area. To accelerate research on domain-specific knowledge graphs in the medical domain, we introduce DiaKG, a high-quality Chinese dataset for diabetes knowledge graphs, which contains 22,050 entities and 6,890 relations in total. We implement recent typical methods for Named Entity Recognition and Relation Extraction as a benchmark to evaluate the proposed dataset thoroughly. Empirical results show that DiaKG is challenging for most existing methods, and we conduct further analysis to discuss future research directions for improvement. We hope the release of this dataset can assist the construction of diabetes knowledge graphs and facilitate AI-based applications.
Acknowledgments
We want to express gratitude to the anonymous reviewers for their hard work and kind comments. We also thank the Tianchi Platform for hosting DiaKG.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chang, D. et al. (2021). DiaKG: An Annotated Diabetes Dataset for Medical Knowledge Graph Construction. In: Qin, B., Jin, Z., Wang, H., Pan, J., Liu, Y., An, B. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers New Infrastructure Construction. CCKS 2021. Communications in Computer and Information Science, vol 1466. Springer, Singapore. https://doi.org/10.1007/978-981-16-6471-7_26
DOI: https://doi.org/10.1007/978-981-16-6471-7_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6470-0
Online ISBN: 978-981-16-6471-7
eBook Packages: Computer Science, Computer Science (R0)