Abstract
Chinese Nested Named Entity Recognition (CNNER) is challenging because of linguistic diversity, the complexity of the Chinese language, and the imbalanced distribution of entity types in Chinese text. To address these challenges, we propose a new method called CPMFA (Character Pair-based method with Multi-feature representation and Attention mechanism). CPMFA predicts predefined relations between character pairs in a sentence and identifies nested named entities from these relations. First, the method combines the pre-trained language model LERT (Linguistically-motivated Bidirectional Encoder Representation from Transformer) with a Bidirectional Long Short-Term Memory (BiLSTM) network to generate comprehensive and precise character representations. Second, it uses a multi-feature representation to capture complex semantic information in the text and employs the Pyramid Squeeze Attention (PSA) module to emphasize key features. Finally, to counter the imbalanced distribution of entity types, the PolyLoss function is integrated into model training. Experimental results show that CPMFA achieves an F1 score of 83.79% and performs strongly on CNNER compared with other mainstream span-based methods.
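The abstract does not say which PolyLoss variant or hyperparameters are used in CPMFA. As a rough illustration only, the sketch below assumes the simplest form described in the PolyLoss paper (Leng et al., 2022), Poly-1, which adds a term `epsilon * (1 - p_t)` to the cross-entropy loss, where `p_t` is the predicted probability of the true class. The function names, the `epsilon` default, and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_loss(logits, target, epsilon=1.0):
    """Poly-1 loss for one example: cross-entropy + epsilon * (1 - p_t).

    `epsilon` is an assumed hyperparameter; the abstract gives no value.
    """
    p_t = softmax(logits)[target]      # probability assigned to the true class
    cross_entropy = -math.log(p_t)     # standard cross-entropy term
    return cross_entropy + epsilon * (1.0 - p_t)


# Hypothetical relation logits for a single character pair (3 relation classes).
example_logits = [2.0, 0.5, -1.0]
loss_confident = poly1_loss(example_logits, target=0)  # true class has high score
loss_uncertain = poly1_loss(example_logits, target=2)  # true class has low score
```

With `epsilon = 0` the function reduces to ordinary cross-entropy. A positive `epsilon` adds a penalty that grows as `p_t` falls, so examples on which the model is not yet confident contribute more to the loss.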
Acknowledgement
This study was supported by the Key Project of Regional Innovation and Development Joint Fund of National Natural Science Foundation of China (Grant No. U22A2025).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, X., Chen, L., Shen, F., Guo, H., Gao, H. (2023). CPMFA: A Character Pair-Based Method for Chinese Nested Named Entity Recognition. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14176. Springer, Cham. https://doi.org/10.1007/978-3-031-46661-8_14
DOI: https://doi.org/10.1007/978-3-031-46661-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46660-1
Online ISBN: 978-3-031-46661-8
eBook Packages: Computer Science, Computer Science (R0)