DOI: 10.1145/3459637.3481930
Research Article

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Published: 30 October 2021

Abstract

Knowledge-enhanced pre-trained language models (K-PLMs) have been shown to be effective on many public tasks in the literature, but few have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing model size so that K-PLMs can be deployed on resource-restricted devices (e.g., CPU) in real-world applications. Importantly, instead of capturing entity knowledge as the majority of existing K-PLMs do, our approach captures relational knowledge, which better improves sentence-level text classification and text matching, the tasks that play a key role in question answering (QA). We conducted experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach achieves substantial improvements on sentence-level question answering tasks and brings tangible business value in industrial settings.
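
The knowledge distillation component mentioned above follows the general teacher-student recipe; the abstract does not spell out its exact formulation, so the snippet below is only a minimal PyTorch sketch of the standard soft-label distillation objective (a large knowledge-infused teacher supervising a compact student). The function name, temperature, and weighting values are illustrative assumptions, not settings reported in the paper.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          labels: torch.Tensor,
                          temperature: float = 2.0,
                          alpha: float = 0.5) -> torch.Tensor:
        """Standard soft-label knowledge distillation objective (illustrative).

        A large teacher model supervises a smaller student through its
        temperature-softened output distribution, while the student also
        fits the gold labels. Hyperparameters here are placeholders.
        """
        # Soft targets: KL divergence between softened student and teacher distributions.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * (temperature ** 2)
        # Hard targets: ordinary cross-entropy against the gold labels.
        hard_loss = F.cross_entropy(student_logits, labels)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss

A smaller student trained with such a loss can then serve the sentence-level classification and matching models on CPU-only hardware, which is the deployment setting the abstract describes.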

Supplementary Material

MP4 File (K-AID.mp4)
Video presentation.


Cited By

  • PaperLM: A Pre-trained Model for Hierarchical Examination Paper Representation Learning. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (2023), 2178-2187. DOI: 10.1145/3583780.3615003. Online publication date: 21-Oct-2023.
  • KEBLM: Knowledge-Enhanced Biomedical Language Models. Journal of Biomedical Informatics, Vol. 143 (2023), 104392. DOI: 10.1016/j.jbi.2023.104392. Online publication date: Jul-2023.


        Published In

        CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
        October 2021
        4966 pages
        ISBN:9781450384469
        DOI:10.1145/3459637

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 30 October 2021


        Author Tags

        1. domain knowledge
        2. knowledge infusion
        3. pre-trained language models
        4. question answering

        Qualifiers

        • Research-article

        Conference

        CIKM '21

        Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

