DOI: 10.1145/3459637.3481930
Research Article

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Published: 30 October 2021

Abstract

Knowledge-enhanced pre-trained language models (K-PLMs) have been shown to be effective on many public tasks in the literature, but few have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing model size so that K-PLMs can be deployed on resource-restricted devices (e.g., CPU) in real-world applications. Importantly, instead of capturing entity knowledge as the majority of existing K-PLMs do, our approach captures relational knowledge, which better improves sentence-level text classification and text matching, the tasks that play a key role in question answering (QA). We conducted experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach achieves substantial improvements on sentence-level question answering tasks and brings tangible business value in industrial settings.
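
The knowledge distillation component mentioned above follows the general teacher-student recipe; the abstract does not spell out its exact formulation, so the snippet below is only a minimal PyTorch sketch of the standard soft-label distillation objective (a large knowledge-infused teacher supervising a compact student). The function name, temperature, and weighting values are illustrative assumptions, not settings reported in the paper.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          labels: torch.Tensor,
                          temperature: float = 2.0,
                          alpha: float = 0.5) -> torch.Tensor:
        """Standard soft-label knowledge distillation objective (illustrative).

        A large teacher model supervises a smaller student through its
        temperature-softened output distribution, while the student also
        fits the gold labels. Hyperparameters here are placeholders.
        """
        # Soft targets: KL divergence between softened student and teacher distributions.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * (temperature ** 2)
        # Hard targets: ordinary cross-entropy against the gold labels.
        hard_loss = F.cross_entropy(student_logits, labels)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss

A smaller student trained with such a loss can then serve the sentence-level classification and matching models on CPU-only hardware, which is the deployment setting the abstract describes.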

Supplementary Material

MP4 File (K-AID.mp4)
Video presentation.


Cited By

  • PaperLM: A Pre-trained Model for Hierarchical Examination Paper Representation Learning. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (2023), 2178-2187. DOI: 10.1145/3583780.3615003. Online publication date: 21-Oct-2023.
  • KEBLM: Knowledge-Enhanced Biomedical Language Models. Journal of Biomedical Informatics, Vol. 143 (2023), 104392. DOI: 10.1016/j.jbi.2023.104392. Online publication date: Jul-2023.


        Published In

        CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
        October 2021
        4966 pages
        ISBN:9781450384469
        DOI:10.1145/3459637

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 30 October 2021


        Author Tags

        1. domain knowledge
        2. knowledge infusion
        3. pre-trained language models
        4. question answering

        Qualifiers

        • Research-article

        Conference

        CIKM '21

        Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

