DOI: 10.1145/1645953.1646116
research-article

Probabilistic models of ranking novel documents for faceted topic retrieval

Published: 02 November 2009

Abstract

Traditional models of information retrieval assume documents are independently relevant. But when the goal is retrieving diverse or novel information about a topic, retrieval models need to capture dependencies between documents. Such tasks require alternative evaluation and optimization methods that operate on different types of relevance judgments. We define faceted topic retrieval as a particular novelty-driven task with the goal of finding a set of documents that cover the different facets of an information need. A faceted topic retrieval system must be able to cover as many facets as possible with the smallest number of documents. We introduce two novel models for faceted topic retrieval, one based on pruning a set of retrieved documents and one based on retrieving sets of documents through direct optimization of evaluation measures. We compare the performance of our models to MMR and the probabilistic model due to Zhai et al. on a set of 60 topics annotated with facets, showing that our models are competitive.
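The goal stated in the abstract, covering as many facets as possible with as few documents as possible, can be illustrated with a small greedy selection. This is only a sketch under strong assumptions and is not either of the paper's two models: it assumes the facets contained in each document are already known (a real system would have to estimate them), and the names `greedy_facet_cover`, `subtopic_recall`, and the toy document IDs are hypothetical. The evaluation function computes the fraction of facets covered in the top k, in the spirit of the subtopic recall measure of Zhai et al. [16].

```python
from typing import Dict, List, Set


def greedy_facet_cover(doc_facets: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k documents, each time taking the one that adds the
    most facets not yet covered. Assumes facet labels are known."""
    covered: Set[str] = set()
    selected: List[str] = []
    remaining = dict(doc_facets)
    while remaining and len(selected) < k:
        # sorted() makes ties deterministic: the smallest ID wins.
        best = max(sorted(remaining), key=lambda d: len(remaining[d] - covered))
        if not remaining[best] - covered:
            break  # no remaining document contributes a new facet
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


def subtopic_recall(ranking: List[str], doc_facets: Dict[str, Set[str]],
                    all_facets: Set[str], k: int) -> float:
    """Fraction of all known facets covered by the top-k documents."""
    if not all_facets:
        raise ValueError("all_facets must be non-empty")
    covered: Set[str] = set()
    for doc_id in ranking[:k]:
        covered |= doc_facets.get(doc_id, set())
    return len(covered & all_facets) / len(all_facets)


# Toy data (hypothetical): four documents, three facets.
docs = {"d1": {"a", "b"}, "d2": {"b"}, "d3": {"c"}, "d4": {"a", "b"}}
picked = greedy_facet_cover(docs, k=3)
print(picked, subtopic_recall(picked, docs, {"a", "b", "c"}, k=2))
```

On this toy input the greedy pass stops after two documents because nothing left adds a new facet, whereas a ranking that treats documents as independently relevant could place the redundant `d1` and `d4` side by side and leave facet `c` uncovered.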

References

[1] R. Agrawal, S. Gollapudi, H. Halverson, and S. Ieong. Diversifying search results. In Proceedings of WSDM '09, pages 5-14.
[2] J. Allan, B. Carterette, and J. Lewis. When will information retrieval be "good enough"? In Proceedings of SIGIR, pages 433-440, 2005.
[3] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, Jan. 2003.
[4] J. Carbonell and J. Goldstein. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of SIGIR, pages 335-336, 1998.
[5] H. Chen and D. R. Karger. Less is more: Probabilistic models for retrieving fewer relevant documents. In Proceedings of SIGIR, pages 429-436, 2006.
[6] C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon. Novelty and diversity in information retrieval evaluation. In Proceedings of SIGIR, pages 659-666, 2008.
[7] W. Dakka and P. G. Ipeirotis. Automatic extraction of useful facet hierarchies from text databases. In Proceedings of the International Conference on Data Engineering, pages 466-475, 2008.
[8] W. Goffman. On relevance as a measure. Information Storage and Retrieval, 2(3):201-203, 1964.
[9] M. D. Gordon and P. Lenk. A utility theoretic examination of the probability ranking principle in information retrieval. JASIS, 42:703-714, 1991.
[10] K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In Proceedings of SIGIR, pages 23-28, 2000.
[11] V. Lavrenko and W. B. Croft. Relevance-based language models. In Proceedings of SIGIR, pages 120-127, 2001.
[12] F. Radlinski, R. Kleinberg, and T. Joachims. Learning diverse rankings with multi-armed bandits. In Proceedings of ICML '08, pages 784-791.
[13] S. E. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294-304, Dec. 1977.
[14] M. Sanderson. Ambiguous queries: Test collections need more sense. In Proceedings of SIGIR, pages 499-506, 2008.
[15] E. M. Voorhees and D. K. Harman, editors. TREC: Experiment and Evaluation in Information Retrieval. MIT Press, 2005.
[16] C. Zhai, W. W. Cohen, and J. Lafferty. Beyond independent relevance: Methods and evaluation metrics for subtopic retrieval. In Proceedings of SIGIR, pages 10-17, 2003.
[17] Y. Zhang, J. Callan, and T. Minka. Novelty and redundancy detection in adaptive filtering. In Proceedings of SIGIR, pages 81-88, 2002.

    Published In

    CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
    November 2009
    2162 pages
    ISBN:9781605585123
    DOI:10.1145/1645953
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. diversity
    2. information retrieval
    3. novelty
    4. probabilistic models

    Qualifiers

    • Research-article

    Conference

    CIKM '09
    Article Metrics

    • Downloads (last 12 months): 7
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 14 Nov 2024.

    Cited By

    • (2023) Result Diversification for Legal case Retrieval. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 158-168. DOI: 10.1145/3624918.3625319. Online publication date: 26-Nov-2023.
    • (2023) Search Result Diversification Using Query Aspects as Bottlenecks. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pages 3040-3051. DOI: 10.1145/3583780.3615050. Online publication date: 21-Oct-2023.
    • (2022) Iterative query selection for opaque search engines with pseudo relevance feedback. Expert Systems with Applications, 201 (117027). DOI: 10.1016/j.eswa.2022.117027. Online publication date: Sep-2022.
    • (2021) Learning Multiple Intent Representations for Search Queries. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 669-679. DOI: 10.1145/3459637.3482445. Online publication date: 26-Oct-2021.
    • (2021) Full coverage of a reader's interests in context‐based information filtering. Journal of the Association for Information Science and Technology, 72:8, pages 1011-1027. DOI: 10.1002/asi.24470. Online publication date: 5-Jul-2021.
    • (2020) Search Result Diversification with Guarantee of Topic Proportionality. Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval, pages 53-60. DOI: 10.1145/3409256.3409839. Online publication date: 14-Sep-2020.
    • (2019) Measuring the diversity of recommendations: a preference-aware approach for evaluating and adjusting diversity. Knowledge and Information Systems. DOI: 10.1007/s10115-019-01371-0. Online publication date: 19-Jun-2019.
    • (2018) Perceptual Similarity Ranking of Temporal Heatmaps Using Convolutional Neural Networks. Proceedings of the 2018 Workshop on Understanding Subjective Attributes of Data, with the Focus on Evoked Emotions, pages 25-31. DOI: 10.1145/3267799.3267803. Online publication date: 15-Oct-2018.
    • (2018) Exploring Diversification In Non-factoid Question Answering. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, pages 223-226. DOI: 10.1145/3234944.3234973. Online publication date: 10-Sep-2018.
    • (2018) Beyond Greedy Search. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, pages 99-106. DOI: 10.1145/3234944.3234967. Online publication date: 10-Sep-2018.
