Abstract
Open Information Extraction is a crucial task in natural language processing with wide applications. Existing approaches extract only simple, flat triplets that are not minimized, neglecting other kinds of triplets and their nested combinations. As a result, they cannot provide comprehensive extraction results for downstream tasks. In this paper, we define three more fine-grained types of triplets and also consider the nested combinations of these triplets. In particular, we propose a novel end-to-end joint extraction model that jointly identifies the basic semantic elements, the comprehensive types of triplets, and their nested combinations from plain text. In this way, information is shared more thoroughly throughout the parsing process, which lets the model achieve more fine-grained knowledge extraction without relying on external NLP tools or resources. Our empirical study on datasets from two domains, Building Codes and Biomedicine, demonstrates the effectiveness of our model compared with state-of-the-art approaches.
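As a rough illustration of what a nested combination of triplets can look like, the following minimal sketch represents a triplet whose subject or object may itself be a triplet. The abstract does not name the three triplet types or the model's output format, so the `Triplet` class, the `depth` and `flatten` helpers, and the building-code sentence are all hypothetical and serve only to make the idea of nesting concrete.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass(frozen=True)
class Triplet:
    """Hypothetical (subject, relation, object) record; not the paper's schema."""
    subject: "Element"
    relation: str
    obj: "Element"


# An element is either a plain text span or another triplet (nesting).
Element = Union[str, Triplet]


def depth(element: Element) -> int:
    """Nesting depth: 0 for a text span, 1 for a flat triplet, and so on."""
    if isinstance(element, str):
        return 0
    return 1 + max(depth(element.subject), depth(element.obj))


def flatten(element: Element) -> Iterator[Triplet]:
    """Yield every triplet in the structure, innermost first."""
    if isinstance(element, Triplet):
        yield from flatten(element.subject)
        yield from flatten(element.obj)
        yield element


# Invented example sentence in the style of a building code:
# "The door width is at least 800 mm if the room is an exit route."
inner = Triplet("the door width", "is at least", "800 mm")
outer = Triplet(inner, "if", "the room is an exit route")
```

A flat extractor would return only something like `inner`; a nested representation keeps the condition attached to the fact it qualifies, and `flatten(outer)` recovers the flat triplets when a downstream task needs them.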
Acknowledgment
This research is partially supported by the National Key R&D Program of China (No. 2018AAA0101900), the National Natural Science Foundation of China (Grant Nos. 62072323, 61632016), the Natural Science Foundation of Jiangsu Province (No. BK20191420), the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Collaborative Innovation Center of Novel Software Technology and Industrialization.
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, J. et al. (2021). Towards Nested and Fine-Grained Open Information Extraction. In: Qin, B., Jin, Z., Wang, H., Pan, J., Liu, Y., An, B. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers New Infrastructure Construction. CCKS 2021. Communications in Computer and Information Science, vol 1466. Springer, Singapore. https://doi.org/10.1007/978-981-16-6471-7_14
DOI: https://doi.org/10.1007/978-981-16-6471-7_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6470-0
Online ISBN: 978-981-16-6471-7
eBook Packages: Computer Science (R0)