DOI: 10.1145/258525.258561

Passage retrieval revisited

Published: 01 July 1997

Abstract

Ranking based on passages addresses some of the shortcomings of whole-document ranking. It provides convenient units of text to return to the user, avoids the difficulties of comparing documents of different length, and enables identification of short blocks of relevant material amongst otherwise irrelevant text. In this paper we explore the potential of passage retrieval, based on an experimental evaluation of the ability of passages to identify relevant documents. We compare our scheme of arbitrary passage retrieval to several other document retrieval and passage retrieval methods; we show experimentally that, compared to these methods, ranking via fixed-length passages is robust and effective. Our experiments also show that, compared to whole-document ranking, ranking via fixed-length arbitrary passages significantly improves retrieval effectiveness, by 8% for TREC disks 2 and 4 and by 18%-37% for the Federal Register collection.
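
As a rough illustration of ranking via fixed-length arbitrary passages, the sketch below slides a window of a fixed number of words over each document, starting at every word position, and ranks each document by its best-scoring window. The abstract does not specify the similarity measure, so the score used here (a count of query-term occurrences inside the window) is an assumption chosen for brevity, and the names `best_passage_score` and `rank_by_passage` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def best_passage_score(doc_words: List[str], query_terms: Set[str],
                       length: int) -> Tuple[int, int]:
    """Return (score, start) of the best fixed-length window in a document.

    The score is an assumed toy measure: the number of query-term
    occurrences inside the window. It is not the paper's similarity function.
    """
    if not doc_words:
        return 0, 0
    # Documents shorter than the passage length form a single passage.
    window = min(length, len(doc_words))
    hits = [1 if w in query_terms else 0 for w in doc_words]
    score = sum(hits[:window])
    best, best_start = score, 0
    # Passages may start at any word position, so slide one word at a time
    # and update the score incrementally.
    for start in range(1, len(doc_words) - window + 1):
        score += hits[start + window - 1] - hits[start - 1]
        if score > best:
            best, best_start = score, start
    return best, best_start


def rank_by_passage(docs: Dict[str, str], query: str,
                    length: int = 50) -> List[Tuple[str, int]]:
    """Rank documents by the score of their best fixed-length passage."""
    terms = set(query.lower().split())
    scored = [
        (doc_id, best_passage_score(text.lower().split(), terms, length)[0])
        for doc_id, text in docs.items()
    ]
    # Highest score first; ties are broken by document identifier.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    docs = {
        "a": "passage retrieval ranks short blocks of text",
        "b": "long " * 100 + "passage retrieval" + " filler" * 100,
        "c": "nothing relevant here",
    }
    for doc_id, score in rank_by_passage(docs, "passage retrieval", length=5):
        print(doc_id, score)
```

Because every document is scored on a window of the same size, the long document "b" is judged on its one relevant block rather than being penalised for the surrounding irrelevant text, which is the length-comparison issue the abstract raises.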

Published In

SIGIR '97: Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
July 1997
348 pages
ISBN: 0897918363
DOI: 10.1145/258525
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

SIGIR97

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 216
  • Downloads (Last 6 weeks): 28
Reflects downloads up to 16 Nov 2024

Cited By

  • (2022) Sparse Pairwise Re-ranking with Pre-trained Transformers. Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval, pages 72-80. DOI: 10.1145/3539813.3545140. Online publication date: 23-Aug-2022
  • (2022) Overview of Touché 2022: Argument Retrieval. Experimental IR Meets Multilinguality, Multimodality, and Interaction, pages 311-336. DOI: 10.1007/978-3-031-13643-6_21. Online publication date: 5-Sep-2022
  • (2021) Intra-Document Cascading. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1349-1358. DOI: 10.1145/3404835.3462889. Online publication date: 11-Jul-2021
  • (2021) A Principled Approach Using Fuzzy Set Theory for Passage-Based Document Retrieval. IEEE Transactions on Fuzzy Systems, 29:7, pages 1967-1977. DOI: 10.1109/TFUZZ.2020.2990110. Online publication date: Jul-2021
  • (2020) Context-Aware Document Term Weighting for Ad-Hoc Search. Proceedings of The Web Conference 2020, pages 1897-1907. DOI: 10.1145/3366423.3380258. Online publication date: 20-Apr-2020
  • (2020) A passage-based approach to learning to rank documents. Information Retrieval Journal, 23:2, pages 159-186. DOI: 10.1007/s10791-020-09369-x. Online publication date: 6-Mar-2020
  • (2019) Utilizing Passages in Fusion-based Document Retrieval. Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, pages 59-66. DOI: 10.1145/3341981.3344212. Online publication date: 26-Sep-2019
  • (2019) Investigating Passage-level Relevance and Its Role in Document-level Relevance Judgment. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 605-614. DOI: 10.1145/3331184.3331233. Online publication date: 18-Jul-2019
  • (2019) Figure Retrieval from Collections of Research Articles. Advances in Information Retrieval, pages 696-710. DOI: 10.1007/978-3-030-15712-8_45. Online publication date: 7-Apr-2019
  • (2018) Modeling Diverse Relevance Patterns in Ad-hoc Retrieval. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 375-384. DOI: 10.1145/3209978.3209980. Online publication date: 27-Jun-2018
