Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text

Published: 09 June 2021

Abstract

Open Relation Extraction (ORE) plays a significant role in the field of Information Extraction. It removes two limitations of traditional relation extraction, namely that relation types must be pre-defined in an annotated corpus and that extraction is restricted to specific domains, so that entities and the relations between them can be extracted in the open domain. However, as sentence complexity increases, the precision and recall of entity relation extraction drop significantly. To solve this problem, we present Clause_CORE, an unsupervised method based on Chinese grammar and dependency parsing features. Clause_CORE processes complex sentences by decomposing them and dynamically complementing missing sentence components, which reduces sentence complexity while maintaining the integrity of each sentence. We then perform dependency parsing on the completed sentences and implement open entity relation extraction based on a model constructed from Chinese grammar rules. The experimental results show that Clause_CORE outperforms other advanced Chinese ORE systems on Wikipedia and Sina news datasets, which demonstrates the correctness and effectiveness of the method. The results on mixed datasets of news and encyclopedia data demonstrate its generalization and portability.
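
As a rough illustration of why complementing sentence components helps rule-based extraction, the Python sketch below extracts (subject, relation, object) triples from clauses that have already been split and dependency-parsed. It is not the Clause_CORE implementation. The `Token` class, the function names, the hand-written parses, and the single completion rule (a clause without a subject inherits the subject of the preceding clause) are all assumptions made for this example. The labels HED, SBV, and VOB follow an LTP-style dependency tag set, and the example sentence is invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    idx: int   # 1-based position inside the clause
    word: str
    head: int  # index of the governing token, 0 for the clause root
    rel: str   # dependency label (assumed LTP-style: HED, SBV, VOB)


Triple = Tuple[str, str, str]


def find_dependent(tokens: List[Token], head_idx: int, rel: str) -> Optional[Token]:
    """Return the first token attached to head_idx with the given label."""
    for tok in tokens:
        if tok.head == head_idx and tok.rel == rel:
            return tok
    return None


def extract_triple(tokens: List[Token],
                   inherited_subject: Optional[str] = None) -> Optional[Triple]:
    """Apply one toy rule: root verb + its SBV subject + its VOB object.

    If the clause has no subject of its own, fall back to the subject
    inherited from the previous clause (hypothetical completion rule).
    """
    root = next((t for t in tokens if t.rel == "HED"), None)
    if root is None:
        return None
    subj = find_dependent(tokens, root.idx, "SBV")
    obj = find_dependent(tokens, root.idx, "VOB")
    subj_word = subj.word if subj is not None else inherited_subject
    if subj_word is None or obj is None:
        return None
    return (subj_word, root.word, obj.word)


def extract_from_clauses(clauses: List[List[Token]]) -> List[Triple]:
    """Walk the clauses in order, carrying the last seen subject forward."""
    triples: List[Triple] = []
    last_subject: Optional[str] = None
    for tokens in clauses:
        triple = extract_triple(tokens, last_subject)
        if triple is not None:
            triples.append(triple)
            last_subject = triple[0]
    return triples


# Invented sentence: "李明 创办 公司 , 担任 董事长"
# ("Li Ming founded a company, [and] serves as chairman"),
# already split at the comma and parsed by hand.
clause_1 = [
    Token(1, "李明", 2, "SBV"),
    Token(2, "创办", 0, "HED"),
    Token(3, "公司", 2, "VOB"),
]
clause_2 = [  # no explicit subject in this clause
    Token(1, "担任", 0, "HED"),
    Token(2, "董事长", 1, "VOB"),
]

triples = extract_from_clauses([clause_1, clause_2])
print(triples)
```

Without the inherited subject, the second clause yields no triple at all, which is the kind of recall loss on complex sentences that the abstract describes. A real system would need far richer rules for choosing which component to complement and from where.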

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 20, Issue 4
    July 2021
    419 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3465463

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 June 2021
    Accepted: 01 February 2021
    Revised: 01 January 2021
    Received: 01 September 2019
    Published in TALLIP Volume 20, Issue 4

    Author Tags

    1. Open entity relation extraction
    2. dependency parsing
    3. complex sentences processed
    4. Chinese grammar rules
    5. unsupervised

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Key Research and Development Program of China
