BERT-SMAP: Paying attention to Essential Terms in passage ranking beyond BERT

Published: 01 March 2022

Abstract

Passage ranking has attracted considerable attention due to its importance in information retrieval (IR) and question answering (QA). Prior works have shown that pre-trained language models (e.g., BERT) can improve ranking performance. However, these simple BERT-based methods tend to focus on passage terms that exactly match the question, which makes them easily fooled by overlapping but irrelevant (distracting) passages. To solve this problem, we propose a self-matching attention-pooling mechanism (SMAP) to highlight the Essential Terms in question-passage pairs. Further, we propose a hybrid passage ranking architecture, called BERT-SMAP, which combines SMAP with BERT to more effectively identify distracting passages and downplay their influence. BERT-SMAP uses the representations obtained through SMAP both to enhance BERT's classification mechanism as an interaction-focused neural ranker and as the inputs of a matching function. Experimental results on three evaluation datasets show that our model outperforms the previous best approaches based on BERT-base, and is comparable to the state-of-the-art method that utilizes a much stronger pre-trained language model.
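The mechanism described above can be pictured as a small pooling head on top of BERT's token representations. The following is a minimal, hypothetical PyTorch-style sketch, not the authors' implementation: the module names, the dot-product self-matching score, and the simple summation of the two scores are assumptions made for illustration, and the paper's exact formulation may differ.

```python
# Hypothetical sketch of the idea described in the abstract: a self-matching
# attention-pooling (SMAP) head on top of BERT token representations, whose
# pooled vector is combined with the usual [CLS]-based relevance score.
# Module names, score functions, and the final combination are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn


class SelfMatchingAttentionPooling(nn.Module):
    """Pools token vectors into one vector, weighting each token by how much
    the other tokens attend to it (a rough proxy for 'essential terms')."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)

    def forward(self, token_states, attention_mask):
        # token_states: (B, L, H); attention_mask: (B, L) with 1 = real token.
        mask = attention_mask.float()
        q = self.query(token_states)                      # (B, L, H)
        k = self.key(token_states)                        # (B, L, H)
        scores = torch.matmul(q, k.transpose(1, 2))       # token-vs-token matching, (B, L, L)
        scores = scores.masked_fill(mask[:, None, :] == 0, -1e9)
        # Importance of token j = average attention it receives from all tokens.
        importance = torch.softmax(scores, dim=-1).mean(dim=1)       # (B, L)
        importance = importance * mask
        importance = importance / importance.sum(-1, keepdim=True).clamp(min=1e-9)
        return torch.bmm(importance.unsqueeze(1), token_states).squeeze(1)  # (B, H)


class BertSmapRanker(nn.Module):
    """Hybrid ranker sketch: BERT's [CLS] score plus a score derived from the
    SMAP-pooled representation (summed here; the paper may combine them differently)."""

    def __init__(self, bert, hidden_size: int = 768):
        super().__init__()
        self.bert = bert                                  # e.g., a HuggingFace BertModel
        self.smap = SelfMatchingAttentionPooling(hidden_size)
        self.cls_score = nn.Linear(hidden_size, 1)
        self.match_score = nn.Linear(hidden_size, 1)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        cls_vec = out.last_hidden_state[:, 0]             # interaction-focused [CLS] signal
        pooled = self.smap(out.last_hidden_state, attention_mask)
        return self.cls_score(cls_vec) + self.match_score(pooled)
```

In a setup like this, the SMAP-pooled vector is meant to emphasize question-passage terms that match each other strongly, so that a passage with only superficial lexical overlap contributes a weaker matching signal than one whose essential terms align with the question.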

Highlights

We propose a hybrid architecture for passage ranking that addresses the problem of ranking models being easily misled by overlapping but irrelevant passages.
We propose a self-matching attention-pooling mechanism called SMAP.
SMAP is combined with a pre-trained language model to identify distracting passages.
An absolute improvement of approximately 5% is achieved on the WikiQA dataset over the prior best approach based on the same pre-trained language model.

Cited By

  • (2024) STMAP. Information Processing and Management: an International Journal 61(1). https://doi.org/10.1016/j.ipm.2023.103576
  • (2023) Who can verify this? Finding authorities for rumor verification in Twitter. Information Processing and Management: an International Journal 60(4). https://doi.org/10.1016/j.ipm.2023.103366
  • (2023) A novel DL-based algorithm integrating medical knowledge graph and doctor modeling for Q&A pair matching in OHP. Information Processing and Management: an International Journal 60(3). https://doi.org/10.1016/j.ipm.2023.103322
  • (2023) Techniques, datasets, evaluation metrics and future directions of a question answering system. Knowledge and Information Systems 66(4), 2235–2268. https://doi.org/10.1007/s10115-023-02019-w

Published In

Information Processing and Management: an International Journal, Volume 59, Issue 2, March 2022, 970 pages.

Publisher

Pergamon Press, Inc., United States

          Author Tags

          1. Passage ranking
          2. Attention mechanism
          3. Information retrieval
          4. Question answering
          5. Pre-trained model

          Qualifiers

          • Research-article
