DOI: 10.1145/3077136.3080761
short-paper

Improving Retrieval Performance for Verbose Queries via Axiomatic Analysis of Term Discrimination Heuristic

Published: 07 August 2017

Abstract

The number of terms in a query is a query-specific constant that is typically ignored in retrieval functions. However, previous studies have shown that the performance of retrieval models varies with query length and usually degrades as query length increases. A possible reason is that longer queries contain extraneous terms, which make it difficult for retrieval models to distinguish between the key and complementary concepts of the query. Inverse document frequency (IDF) signals the importance of a term and can therefore be used to discriminate between query terms. In this paper, we propose a constraint to model the interaction between query length and IDF. Our theoretical analysis shows that current state-of-the-art retrieval models, such as BM25, do not satisfy the proposed constraint. We further analyze the BM25 model and suggest a modification so that it adheres to the new constraint. Our experiments on three TREC collections demonstrate that the proposed modification outperforms the baselines, especially for verbose queries.
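
To make the terminology in the abstract concrete, the following is a minimal Python sketch of the standard Okapi BM25 scoring function, not the modified version proposed in the paper, which the abstract does not specify. The toy tokenized documents, the function names, the defaults k1 = 1.2 and b = 0.75, and the choice of the non-negative IDF variant are all assumptions made for illustration. The sketch shows that query length never enters the score: each query term contributes independently, weighted only by its IDF and its normalized term frequency.

```python
import math
from collections import Counter

# Hypothetical toy collection: each document is a list of tokens.
DOCS = [
    ["bm25", "ranking", "function"],
    ["query", "length", "ranking"],
    ["ranking", "the", "the"],
]

def idf(term, docs):
    """Inverse document frequency, using the non-negative BM25 variant
    log(1 + (N - df + 0.5) / (df + 0.5)). Rare terms get larger weights."""
    n = len(docs)
    df = sum(1 for d in docs if term in d)
    return math.log(1 + (n - df + 0.5) / (df + 0.5))

def bm25(query, doc, docs, k1=1.2, b=0.75):
    """Standard BM25 score of `doc` for `query` (both token lists).
    Note that len(query) appears nowhere in the computation."""
    avgdl = sum(len(d) for d in docs) / len(docs)
    tf = Counter(doc)
    score = 0.0
    for term in query:
        f = tf[term]
        if f == 0:
            continue  # terms absent from the document contribute nothing
        # Saturating term frequency with document-length normalization.
        tf_norm = f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
        score += idf(term, docs) * tf_norm
    return score

if __name__ == "__main__":
    short_query = ["bm25"]
    verbose_query = ["bm25", "ranking"]  # adds a low-IDF, less informative term
    print(idf("bm25", DOCS), idf("ranking", DOCS))
    print(bm25(short_query, DOCS[0], DOCS), bm25(verbose_query, DOCS[0], DOCS))
```

In this sketch, adding an extra term to the query simply adds that term's own contribution to the score, whatever the query length. This is the kind of length-independent behavior that the proposed constraint on the interaction between query length and IDF is meant to examine.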

    Published In

    SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
    August 2017
    1476 pages
    ISBN:9781450350228
    DOI:10.1145/3077136
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. axiomatic analysis
    2. query length
    3. term discrimination
    4. theoretical analysis
    5. verbose queries

    Qualifiers

    • Short-paper

    Conference

    SIGIR '17

    Acceptance Rates

    SIGIR '17 Paper Acceptance Rate 78 of 362 submissions, 22%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 11
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 23 Nov 2024

    Citations

    Cited By

    • (2024) Space-Efficient Indexes for Uncertain Strings. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 4828-4842. DOI: 10.1109/ICDE60146.2024.00367. Online publication date: 13-May-2024
    • (2023) Text Indexing for Long Patterns: Anchors are All you Need. Proceedings of the VLDB Endowment 16:9, pp. 2117-2131. DOI: 10.14778/3598581.3598586. Online publication date: 1-May-2023
    • (2022) Learning Relevant Questions for Conversational Product Search using Deep Reinforcement Learning. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, pp. 746-754. DOI: 10.1145/3488560.3498526. Online publication date: 11-Feb-2022
    • (2021) Effective Query Formulation in Conversation Contextualization: A Query Specificity-based Approach. Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 177-183. DOI: 10.1145/3471158.3472237. Online publication date: 11-Jul-2021
    • (2020) An axiomatic approach to corpus-based cross-language information retrieval. Information Retrieval 23:3, pp. 191-215. DOI: 10.1007/s10791-020-09372-2. Online publication date: 1-Jun-2020
    • (2019) BM25 Beyond Query-Document Similarity. String Processing and Information Retrieval, pp. 65-79. DOI: 10.1007/978-3-030-32686-9_5. Online publication date: 7-Oct-2019
    • (2019) Pseudo-Relevance Feedback Based on Locally-Built Co-occurrence Graphs. Advances in Databases and Information Systems, pp. 105-119. DOI: 10.1007/978-3-030-28730-6_7. Online publication date: 13-Aug-2019
    • (2019) An Axiomatic Approach to Diagnosing Neural IR Models. Advances in Information Retrieval, pp. 489-503. DOI: 10.1007/978-3-030-15712-8_32. Online publication date: 7-Apr-2019
    • (2018) Theoretical Analysis of Interdependent Constraints in Pseudo-Relevance Feedback. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 1249-1252. DOI: 10.1145/3209978.3210156. Online publication date: 27-Jun-2018
