Abstract
Knowledge graph construction (KGC) aims to organize knowledge into a semantic network that reveals relations between entities. It rests on two tasks: named entity recognition (NER) and relation extraction (RE). In recent years, KGC methods for Chinese have made great progress. However, most existing methods concentrate on modern Chinese and ignore classical Chinese due to its complexity, leaving research in this area relatively scarce. In this paper, we construct a high-quality labeled classical Chinese dataset for NER and RE tasks. More specifically, we conduct a series of experiments to select an optimal NER model that strengthens the whole pipeline for NER and RE, augmenting our dataset iteratively and automatically. Additionally, we propose an improved RE model that better incorporates the semantic entity information extracted by the NER model. Moreover, we construct a knowledge graph (KG) based on Chinese historical literature and design a visualization system with intuitive display and query functions.
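The two-stage pipeline the abstract describes (NER first, then RE over the recognized entity pairs, yielding triples for the KG) can be sketched as follows. This is a minimal, hypothetical illustration of the pipeline structure only: the toy dictionary tagger and rule-based classifier stand in for the paper's BERT-based models, and the entity labels (`PER`, `LOC`) and relation label (`born_in`) are illustrative placeholders, not the paper's actual annotation schema.

```python
# Sketch of a two-stage NER -> RE pipeline over classical Chinese text.
# A real system would replace toy_ner / toy_re with BERT-based models.
from dataclasses import dataclass

@dataclass
class Entity:
    text: str
    label: str
    start: int  # character offset in the sentence

def toy_ner(sentence, lexicon):
    """Stand-in for a BERT-CRF tagger: simple dictionary lookup."""
    entities = []
    for surface, label in lexicon.items():
        idx = sentence.find(surface)
        if idx != -1:
            entities.append(Entity(surface, label, idx))
    return sorted(entities, key=lambda e: e.start)

def toy_re(sentence, head, tail):
    """Stand-in for a relation classifier conditioned on an entity pair."""
    if head.label == "PER" and tail.label == "LOC":
        return "born_in"  # illustrative relation label
    return "no_relation"

def extract_triples(sentence, lexicon):
    """Pipeline: run NER first, then classify every ordered entity pair."""
    entities = toy_ner(sentence, lexicon)
    triples = []
    for head in entities:
        for tail in entities:
            if head is tail:
                continue
            rel = toy_re(sentence, head, tail)
            if rel != "no_relation":
                triples.append((head.text, rel, tail.text))
    return triples

lexicon = {"黄帝": "PER", "轩辕之丘": "LOC"}  # toy gazetteer
print(extract_triples("黄帝生于轩辕之丘", lexicon))
# -> [('黄帝', 'born_in', '轩辕之丘')]
```

The triples produced by such a pipeline are then the edges of the knowledge graph; the iterative augmentation the abstract mentions would feed high-confidence extractions back in as new training data.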
Notes
- 1.
GuwenBERT https://github.com/ethan-yt/guwenbert.
Acknowledgement
This work is supported by the China Universities Industry, Education and Research Innovation Foundation Project (2019ITA03006), and the National Training Programs of Innovation and Entrepreneurship for Undergraduates (202010056117).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, Q. et al. (2021). Constructing Chinese Historical Literature Knowledge Graph Based on BERT. In: Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C. (eds) Web Information Systems and Applications. WISA 2021. Lecture Notes in Computer Science(), vol 12999. Springer, Cham. https://doi.org/10.1007/978-3-030-87571-8_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87570-1
Online ISBN: 978-3-030-87571-8
eBook Packages: Computer Science, Computer Science (R0)