Modeling Diverse Relevance Patterns in Ad-hoc Retrieval

Authors:

Chengxiang Zhai,

Xueqi Cheng

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Pages 375 - 384

https://doi.org/10.1145/3209978.3209980

Published: 27 June 2018

Abstract

Assessing relevance between a query and a document is challenging in ad-hoc retrieval because relevance patterns are diverse: a document can be relevant to a query as a whole or only in part, as long as it provides sufficient information for the user's need. Such diverse relevance patterns require an ideal retrieval model to assess relevance adaptively at the right granularity. Unfortunately, most existing retrieval models compute relevance at a single granularity, either document-wide or passage-level, or use a fixed combination strategy, which restricts their ability to capture diverse relevance patterns. In this work, we propose a data-driven method that allows relevance signals at different granularities to compete with each other for the final relevance assessment. Specifically, we propose a HIerarchical Neural maTching model (HiNT), which consists of two stacked components: a local matching layer and a global decision layer. The local matching layer produces a set of local relevance signals by modeling the semantic matching between a query and each passage of a document. The global decision layer accumulates local signals into different granularities and allows them to compete with each other to decide the final relevance score. Experimental results demonstrate that our HiNT model significantly outperforms existing state-of-the-art retrieval models on benchmark ad-hoc retrieval datasets.
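The two-layer idea in the abstract can be illustrated with a small toy sketch. This is not the HiNT implementation: the exact-term-overlap matcher stands in for the learned semantic matching, the three pooled granularities (`best_passage`, `top_k`, `document`) are illustrative choices, and the `WEIGHTS` values are hypothetical placeholders for parameters that a data-driven model would learn. All function names are assumptions introduced for this example.

```python
from typing import Dict, List


def split_passages(tokens: List[str], size: int) -> List[List[str]]:
    """Cut a tokenized document into fixed-size, non-overlapping passages."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def local_signal(query: List[str], passage: List[str]) -> float:
    """Toy local matcher: fraction of distinct query terms found in the passage.

    Assumption: a neural model would learn semantic matching here; exact
    term overlap is only a placeholder.
    """
    terms = set(query)
    return len(terms & set(passage)) / len(terms) if terms else 0.0


def granularity_features(signals: List[float], k: int = 2) -> Dict[str, float]:
    """Accumulate passage-level signals into three granularities."""
    if not signals:
        return {"best_passage": 0.0, "top_k": 0.0, "document": 0.0}
    top = sorted(signals, reverse=True)[:k]
    return {
        "best_passage": max(signals),             # one strongly relevant passage
        "top_k": sum(top) / len(top),             # a few relevant passages
        "document": sum(signals) / len(signals),  # relevance spread over the whole text
    }


def global_decision(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine granularities into one score with per-granularity weights."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights: hand-set here, learned from data in a real model.
WEIGHTS = {"best_passage": 0.5, "top_k": 0.3, "document": 0.2}

query = ["neural", "ranking"]
document = "neural ranking models work cats sleep all day".split()
signals = [local_signal(query, p) for p in split_passages(document, size=4)]
features = granularity_features(signals)
score = global_decision(features, WEIGHTS)
print(signals, features, score)
```

In this toy document only the first passage matches the query, so the passage-level feature is high while the document-level average is lower; the weights decide which granularity dominates the final score.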

Index Terms

Modeling Diverse Relevance Patterns in Ad-hoc Retrieval
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Learning to rank

Published In

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

June 2018

1509 pages

ISBN:9781450356572

DOI:10.1145/3209978

General Chairs: Kevyn Collins-Thompson (University of Michigan, United States), Qiaozhu Mei (University of Michigan, United States)

Program Chairs: Brian Davison (Lehigh University, United States), Yiqun Liu (Tsinghua University, China), Emine Yilmaz (University College London, United Kingdom)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the National Natural Science Foundation of China (NSFC)
the National Key R&D Program of China
the 973 Program of China
the Youth Innovation Promotion Association CAS

Conference

SIGIR '18

Sponsor: SIGIR

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

July 8 - 12, 2018

Ann Arbor, MI, USA

Acceptance Rates

SIGIR '18 paper acceptance rate: 86 of 409 submissions (21%)

Overall acceptance rate: 792 of 3,983 submissions (20%)

Article Metrics

Total Citations: 44
Total Downloads: 688
Downloads (last 12 months): 27
Downloads (last 6 weeks): 2

Reflects downloads up to 16 Nov 2024
