Research article | Open access | Just Accepted

UIE-Based Relational Extraction Task for Mine Hoist Fault Data

Online AM: 21 November 2024

Abstract

Information extraction is pivotal in natural language processing, where the goal is to convert unstructured text into structured information. A significant challenge in this domain is the diversity of extraction tasks and their task-specific requirements. Traditional approaches typically employ separate frameworks for different tasks, such as named entity recognition and relation extraction, which limits uniformity and scalability. This study introduces a Universal Information Extraction (UIE) framework combined with a prompt learning strategy that significantly improves the efficiency and accuracy of extracting mine hoist fault data. First, domain-specific data is manually labeled to fine-tune the model, and accuracy is further improved by constructing negative examples during fine-tuning. The model then targets fault information using the Structured Extraction Language (SEL) and a schema-based prompt syntax, the Structural Schema Instructor (SSI), which directs extraction toward the key information in fault records that the domain requires. Experimental results show that UIE substantially improves both processing efficiency and extraction quality on mine hoist fault data, with the F1 score rising from 23.59% to 92.51% after fine-tuning.
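
The paper itself includes no code, but the SSI/SEL workflow described above maps onto the open-source UIE implementation in PaddleNLP. The sketch below is illustrative only: the Taskflow interface is PaddleNLP's, while the schema names ("fault phenomenon", "fault cause", "faulty component") and the sample fault record are hypothetical stand-ins for the paper's hoist-fault schema, not its exact labels.

```python
# Minimal sketch of schema-driven extraction with UIE, assuming the
# open-source PaddleNLP implementation (paddlenlp.Taskflow). The schema
# and input sentence below are hypothetical examples.
from paddlenlp import Taskflow

# SSI in practice: the schema tells the model which entity types to
# "spot" and which relations to "associate" with each spotted entity.
schema = {"fault phenomenon": ["fault cause", "faulty component"]}

# Note: the default UIE checkpoint is Chinese; an English or multilingual
# checkpoint can be selected via the `model` argument where available.
ie = Taskflow("information_extraction", schema=schema)

record = ("The hoist brake failed to release because the hydraulic "
          "pump pressure was too low.")

# Internally the model generates an SEL sequence, conceptually
#   ((fault phenomenon: brake failed to release
#       (fault cause: hydraulic pump pressure was too low))),
# which Taskflow decodes into nested dicts with spans and confidences.
for item in ie(record):
    for hit in item.get("fault phenomenon", []):
        print(hit["text"], f'({hit["probability"]:.2f})')
        for rel, spans in hit.get("relations", {}).items():
            for span in spans:
                print("  ", rel, "->", span["text"])
```

Fine-tuning on manually labeled domain data (with constructed negative examples, as the abstract describes) would replace the generic checkpoint here with a domain-adapted one; the extraction call itself is unchanged.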



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Just Accepted. EISSN 2375-4702.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Online AM: 21 November 2024
Accepted: 15 November 2024
Revision received: 09 November 2024
Received: 02 June 2024

    Author Tags

    1. joint extraction
    2. mechanical problem
    3. mining sector
    4. prompt learning
