BERT-SMAP: Paying attention to Essential Terms in passage ranking beyond BERT

Published: 01 March 2022

Abstract

Passage ranking has attracted considerable attention due to its importance in information retrieval (IR) and question answering (QA). Prior works have shown that pre-trained language models (e.g., BERT) can improve ranking performance. However, these simple BERT-based methods tend to focus on passage terms that exactly match the question, which makes them easily fooled by overlapping but irrelevant (distracting) passages. To solve this problem, we propose a self-matching attention-pooling mechanism (SMAP) to highlight the Essential Terms in question-passage pairs. Further, we propose a hybrid passage ranking architecture, called BERT-SMAP, which combines SMAP with BERT to more effectively identify distracting passages and downplay their influence. BERT-SMAP uses the representations obtained through SMAP both to enhance BERT's classification mechanism as an interaction-focused neural ranker and as the inputs of a matching function. Experimental results on three evaluation datasets show that our model outperforms the previous best approaches based on BERT-base, and is comparable to the state-of-the-art method that utilizes a much stronger pre-trained language model.
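The mechanism described above can be pictured as a small pooling head on top of BERT's token representations. The following is a minimal, hypothetical PyTorch-style sketch, not the authors' implementation: the module names, the dot-product self-matching score, and the simple summation of the two scores are assumptions made for illustration, and the paper's exact formulation may differ.

```python
# Hypothetical sketch of the idea described in the abstract: a self-matching
# attention-pooling (SMAP) head on top of BERT token representations, whose
# pooled vector is combined with the usual [CLS]-based relevance score.
# Module names, score functions, and the final combination are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn


class SelfMatchingAttentionPooling(nn.Module):
    """Pools token vectors into one vector, weighting each token by how much
    the other tokens attend to it (a rough proxy for 'essential terms')."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)

    def forward(self, token_states, attention_mask):
        # token_states: (B, L, H); attention_mask: (B, L) with 1 = real token.
        mask = attention_mask.float()
        q = self.query(token_states)                      # (B, L, H)
        k = self.key(token_states)                        # (B, L, H)
        scores = torch.matmul(q, k.transpose(1, 2))       # token-vs-token matching, (B, L, L)
        scores = scores.masked_fill(mask[:, None, :] == 0, -1e9)
        # Importance of token j = average attention it receives from all tokens.
        importance = torch.softmax(scores, dim=-1).mean(dim=1)       # (B, L)
        importance = importance * mask
        importance = importance / importance.sum(-1, keepdim=True).clamp(min=1e-9)
        return torch.bmm(importance.unsqueeze(1), token_states).squeeze(1)  # (B, H)


class BertSmapRanker(nn.Module):
    """Hybrid ranker sketch: BERT's [CLS] score plus a score derived from the
    SMAP-pooled representation (summed here; the paper may combine them differently)."""

    def __init__(self, bert, hidden_size: int = 768):
        super().__init__()
        self.bert = bert                                  # e.g., a HuggingFace BertModel
        self.smap = SelfMatchingAttentionPooling(hidden_size)
        self.cls_score = nn.Linear(hidden_size, 1)
        self.match_score = nn.Linear(hidden_size, 1)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        cls_vec = out.last_hidden_state[:, 0]             # interaction-focused [CLS] signal
        pooled = self.smap(out.last_hidden_state, attention_mask)
        return self.cls_score(cls_vec) + self.match_score(pooled)
```

In a setup like this, the SMAP-pooled vector is meant to emphasize question-passage terms that match each other strongly, so that a passage with only superficial lexical overlap contributes a weaker matching signal than one whose essential terms align with the question.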

Highlights

We propose a hybrid architecture for passage ranking that addresses the problem of ranking models being easily misled by overlapping but irrelevant passages.
We propose a self-matching attention-pooling mechanism called SMAP.
SMAP is combined with a pre-trained language model to identify distracting passages.
An absolute improvement of approximately 5% is achieved on the WikiQA dataset over the prior best approach based on the same pre-trained language model.

Cited By

  • (2024) STMAP. Information Processing and Management: an International Journal 61(1). https://doi.org/10.1016/j.ipm.2023.103576
  • (2023) Who can verify this? Finding authorities for rumor verification in Twitter. Information Processing and Management: an International Journal 60(4). https://doi.org/10.1016/j.ipm.2023.103366
  • (2023) A novel DL-based algorithm integrating medical knowledge graph and doctor modeling for Q&A pair matching in OHP. Information Processing and Management: an International Journal 60(3). https://doi.org/10.1016/j.ipm.2023.103322
  • (2023) Techniques, datasets, evaluation metrics and future directions of a question answering system. Knowledge and Information Systems 66(4), 2235–2268. https://doi.org/10.1007/s10115-023-02019-w

Published In

Information Processing and Management: an International Journal, Volume 59, Issue 2, March 2022, 970 pages.

Publisher

Pergamon Press, Inc., United States

          Author Tags

          1. Passage ranking
          2. Attention mechanism
          3. Information retrieval
          4. Question answering
          5. Pre-trained model

          Qualifiers

          • Research-article
