DOI: 10.1145/383952.383992
Article

Why batch and user evaluations do not give the same results

Published: 01 September 2001

Abstract

Much system-oriented evaluation of information retrieval systems has used the Cranfield approach based upon queries run against test collections in a batch mode. Some researchers have questioned whether this approach can be applied to the real world, but little data exists for or against that assertion. We have studied this question in the context of the TREC Interactive Track. Previous results demonstrated that improved performance as measured by relevance-based metrics in batch studies did not correspond with the results of outcomes based on real user searching tasks. The experiments in this paper analyzed those results to determine why this occurred. Our assessment showed that while the queries entered by real users into systems yielding better results in batch studies gave comparable gains in ranking of relevant documents for those users, they did not translate into better performance on specific tasks. This was most likely due to users being able to adequately find and utilize relevant documents ranked further down the output list.
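To make the abstract's reasoning concrete: a batch-mode gain in a relevance-based ranking metric need not improve a user's task outcome if relevant documents were already within the depth the user is willing to scan. The following Python sketch is purely illustrative; the rankings, relevance judgments, scan depth, and the choice of average precision as the batch metric are all assumptions for illustration, not data or methods from the paper or from TREC.

    # Hypothetical illustration: a batch metric (average precision) improves
    # between two runs, while a user-oriented outcome ("a relevant document
    # appears within the depth the user scans") is identical for both.

    def average_precision(ranking, relevant):
        """Batch-style metric: mean precision at the rank of each relevant document."""
        hits, precisions = 0, []
        for rank, doc in enumerate(ranking, start=1):
            if doc in relevant:
                hits += 1
                precisions.append(hits / rank)
        return sum(precisions) / len(relevant) if relevant else 0.0

    def task_success(ranking, relevant, scan_depth=10):
        """User-style outcome: at least one relevant document within the scanned depth."""
        return any(doc in relevant for doc in ranking[:scan_depth])

    relevant = {"d3", "d7"}  # hypothetical relevance judgments
    baseline = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
    improved = ["d3", "d1", "d2", "d7", "d4", "d5", "d6", "d8", "d9", "d10"]

    for name, run in [("baseline", baseline), ("improved", improved)]:
        print(name, round(average_precision(run, relevant), 3), task_success(run, relevant))
    # baseline 0.31 True
    # improved 0.75 True
    # The "improved" run more than doubles the batch metric, yet both runs
    # succeed on the task because the user scans deep enough either way.

This mirrors the paper's conclusion only in spirit: when users will examine ten or so documents anyway, moving relevant documents from ranks 3 and 7 up to ranks 1 and 4 raises the batch score without changing whether the task gets done.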




Published In

SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
September 2001
454 pages
ISBN: 1581133316
DOI: 10.1145/383952

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGIR '01 paper acceptance rate: 47 of 201 submissions (23%)
Overall acceptance rate: 792 of 3,983 submissions (20%)

