DOI:10.1145/1099554.1099735
Article

Minimal document set retrieval

Published: 31 October 2005

Abstract

This paper presents a novel formulation of, and approach to, the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is a promising information retrieval task in which each query topic is assumed to comprise several subtopics; the goal is to retrieve and rank sets of relevant documents such that each set covers as many subtopics as possible with as little redundancy as possible. For this task, we propose three document set retrieval and ranking algorithms: a Novelty-Based method, a Cluster-Based method, and a Subtopic-Extraction-Based method. To evaluate system performance, we design a new evaluation framework for document set ranking that measures both the relevance of each set to the query topic and the redundancy within each set. Finally, we compare the three algorithms on the TREC interactive track dataset. Experimental results show the effectiveness of our algorithms.
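
To make the coverage and redundancy goals concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not say how the paper computes either quantity, and this is not one of the three proposed algorithms. The sketch assumes each document already carries known subtopic labels, and the function names (`coverage`, `redundancy`, `greedy_minimal_set`) and the greedy selection rule are hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def coverage(doc_set: List[str], subtopics: Dict[str, Set[str]],
             all_subtopics: Set[str]) -> float:
    """Fraction of the topic's subtopics covered by at least one document."""
    covered: Set[str] = set()
    for doc in doc_set:
        covered |= subtopics.get(doc, set())
    if not all_subtopics:
        return 0.0
    return len(covered & all_subtopics) / len(all_subtopics)


def redundancy(doc_set: List[str], subtopics: Dict[str, Set[str]]) -> float:
    """Fraction of subtopic mentions that repeat a subtopic seen in an earlier document."""
    seen: Set[str] = set()
    mentions = 0
    repeats = 0
    for doc in doc_set:
        labels = subtopics.get(doc, set())
        mentions += len(labels)
        repeats += len(labels & seen)
        seen |= labels
    return repeats / mentions if mentions else 0.0


def greedy_minimal_set(ranked_docs: List[str], subtopics: Dict[str, Set[str]],
                       all_subtopics: Set[str]) -> List[str]:
    """Assumed heuristic: repeatedly add the document that covers the most new subtopics.

    Ties are broken by the order of ranked_docs (earlier wins).
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = list(ranked_docs)
    while remaining and covered != all_subtopics:
        best = max(
            remaining,
            key=lambda d: len((subtopics.get(d, set()) & all_subtopics) - covered),
        )
        gain = (subtopics.get(best, set()) & all_subtopics) - covered
        if not gain:  # no remaining document adds a new subtopic
            break
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen


# Toy data (invented for illustration): four documents, three subtopics.
labels = {"d1": {"a", "b"}, "d2": {"b"}, "d3": {"c"}, "d4": {"a", "c"}}
topic = {"a", "b", "c"}
picked = greedy_minimal_set(["d1", "d2", "d3", "d4"], labels, topic)
print(picked, coverage(picked, labels, topic), redundancy(picked, labels))
```

On this toy data, the set `["d1", "d3"]` covers all three subtopics with no repeated subtopic, whereas `["d1", "d2", "d4"]` reaches the same coverage but repeats two of its five subtopic mentions.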

Published In

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN:1595931406
DOI:10.1145/1099554

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. document set retrieval
  2. information retrieval

Qualifiers

  • Article

Conference

CIKM05
CIKM05: Conference on Information and Knowledge Management
October 31 - November 5, 2005
Bremen, Germany

Acceptance Rates

CIKM '05 Paper Acceptance Rate 77 of 425 submissions, 18%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2022) Searching, Learning, and Subtopic Ordering: A Simulation-Based Analysis. Advances in Information Retrieval (142-156). DOI:10.1007/978-3-030-99736-6_10. Online publication date: 5-Apr-2022
  • (2013) Topic based photo set retrieval using user annotated tags. Multimedia Tools and Applications 64:1 (7-26). DOI:10.1007/s11042-011-0850-x. Online publication date: 1-May-2013
  • (2009) A Prototype Process-Based Search Engine. Proceedings of the 2009 IEEE International Conference on Semantic Computing (481-486). DOI:10.1109/ICSC.2009.8. Online publication date: 14-Sep-2009
  • (2009) Query operations of process-based searches. 2009 Fourth International Conference on Digital Information Management (1-6). DOI:10.1109/ICDIM.2009.5356786. Online publication date: Nov-2009
  • (2008) Learning to rank relational objects and its application to web search. Proceedings of the 17th international conference on World Wide Web (407-416). DOI:10.1145/1367497.1367553. Online publication date: 21-Apr-2008
  • (2008) A Graphical Model for Context-Aware Visual Content Recommendation. IEEE Transactions on Multimedia 10:1 (52-62). DOI:10.1109/TMM.2007.911226. Online publication date: 1-Jan-2008
  • (2008) An information-pattern-based approach to novelty detection. Information Processing and Management: an International Journal 44:3 (1159-1188). DOI:10.1016/j.ipm.2007.09.013. Online publication date: 1-May-2008
  • (2008) Conceptual Subtopic Identification in the Medical Domain. Advances in Artificial Intelligence – IBERAMIA 2008 (312-321). DOI:10.1007/978-3-540-88309-8_32. Online publication date: 14-Oct-2008
  • (2006) Improving novelty detection for general topics using sentence level information patterns. Proceedings of the 15th ACM international conference on Information and knowledge management (238-247). DOI:10.1145/1183614.1183652. Online publication date: 6-Nov-2006