Research article | Open access | Just Accepted

UIE-Based Relational Extraction Task for Mine Hoist Fault Data

Online AM: 21 November 2024

Abstract

Information extraction is pivotal in natural language processing, where the goal is to convert unstructured text into structured information. A significant challenge in this domain is the diversity of extraction tasks and their task-specific requirements. Traditional approaches typically employ separate frameworks for different tasks, such as named entity recognition and relation extraction, which limits uniformity and scalability. This study introduces a Universal Information Extraction (UIE) framework combined with a prompt learning strategy that significantly improves the efficiency and accuracy of extracting mine hoist fault data. First, domain-specific data is manually labeled to fine-tune the model, and accuracy is further improved by constructing negative examples during fine-tuning. The model then targets fault information using the Structured Extraction Language (SEL) and a schema-based prompt syntax, the Structural Schema Instructor (SSI), which directs extraction toward the key information in fault records that the domain requires. Experimental results show that UIE substantially improves both processing efficiency and extraction quality on mine hoist fault data, with the F1 score rising from 23.59% to 92.51% after fine-tuning.
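
The paper itself includes no code, but the SSI/SEL workflow described above maps onto the open-source UIE implementation in PaddleNLP. The sketch below is illustrative only: the Taskflow interface is PaddleNLP's, while the schema names ("fault phenomenon", "fault cause", "faulty component") and the sample fault record are hypothetical stand-ins for the paper's hoist-fault schema, not its exact labels.

```python
# Minimal sketch of schema-driven extraction with UIE, assuming the
# open-source PaddleNLP implementation (paddlenlp.Taskflow). The schema
# and input sentence below are hypothetical examples.
from paddlenlp import Taskflow

# SSI in practice: the schema tells the model which entity types to
# "spot" and which relations to "associate" with each spotted entity.
schema = {"fault phenomenon": ["fault cause", "faulty component"]}

# Note: the default UIE checkpoint is Chinese; an English or multilingual
# checkpoint can be selected via the `model` argument where available.
ie = Taskflow("information_extraction", schema=schema)

record = ("The hoist brake failed to release because the hydraulic "
          "pump pressure was too low.")

# Internally the model generates an SEL sequence, conceptually
#   ((fault phenomenon: brake failed to release
#       (fault cause: hydraulic pump pressure was too low))),
# which Taskflow decodes into nested dicts with spans and confidences.
for item in ie(record):
    for hit in item.get("fault phenomenon", []):
        print(hit["text"], f'({hit["probability"]:.2f})')
        for rel, spans in hit.get("relations", {}).items():
            for span in spans:
                print("  ", rel, "->", span["text"])
```

Fine-tuning on manually labeled domain data (with constructed negative examples, as the abstract describes) would replace the generic checkpoint here with a domain-adapted one; the extraction call itself is unchanged.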



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Just Accepted. EISSN 2375-4702.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Online AM: 21 November 2024
Accepted: 15 November 2024
Revision received: 09 November 2024
Received: 02 June 2024

    Author Tags

    1. joint extraction
    2. mechanical problem
    3. mining sector
    4. prompt learning
