DOI: 10.1145/1571941.1572019
research-article

Towards methods for the collective gathering and quality control of relevance assessments

Published: 19 July 2009

Abstract

Growing interest in online collections of digital books and video content motivates the development and optimization of adequate retrieval systems. However, traditional methods for collecting relevance assessments to tune system performance are challenged by the nature of digital items in such collections, where assessors are faced with a considerable effort to review and assess content by extensive reading, browsing, and within-document searching. The extra strain is caused by the length and cohesion of the digital item and the dispersion of topics within it. We propose a method for the collective gathering of relevance assessments using a social game model to instigate participants' engagement. The game provides incentives for assessors to follow a predefined review procedure and makes provisions for the quality control of the collected relevance judgments. We discuss the approach in detail, and present the results of a pilot study conducted on a book corpus to validate the approach. Our analysis reveals intricate relationships between the affordances of the system, the incentives of the social game, and the behavior of the assessors. We show that the proposed game design achieves two designated goals: the incentive structure motivates endurance in assessors and the review process encourages truthful assessment.
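The paper itself details the game's review procedure and incentive structure; as a purely illustrative sketch of the kind of quality control such a pipeline needs, the snippet below computes Cohen's kappa, a standard chance-corrected agreement measure between two assessors' judgments. The function name and the sample judgments are hypothetical and do not come from the paper.

```python
# Illustrative only: pairwise assessor agreement (Cohen's kappa) is one
# standard quality-control check for collected relevance judgments.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two assessors' relevance judgments on the same
    items, corrected for the agreement expected by chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the two assessors match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same label independently.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    if expected == 1.0:  # both assessors used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)

# Two assessors judging the same ten book pages (1 = relevant, 0 = not).
a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
b = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(round(cohens_kappa(a, b), 3))  # → 0.583
```

A kappa near 0 means the assessors agree no more than chance would predict; values well above 0 suggest the judgments are consistent enough to pool into a test collection.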




    Published In

    cover image ACM Conferences
    SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
    July 2009
    896 pages
    ISBN:9781605584836
    DOI:10.1145/1571941
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. relevance assessments
    2. social game
    3. test collection construction

    Qualifiers

    • Research-article

    Conference

    SIGIR '09

    Acceptance Rates

    Overall Acceptance Rate: 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 4
    • Downloads (last 6 weeks): 0

    Reflects downloads up to 26 Jan 2025

    Cited By

    • (2023) Relevance Judgment Convergence Degree – A Measure of Inconsistency among Assessors for Information Retrieval. Proceedings of the 30th International Conference on Information Systems Development. DOI: 10.62036/ISD.2022.38
    • (2023) Relevance Judgment Convergence Degree – A Measure of Assessors Inconsistency for Information Retrieval Datasets. Advances in Information Systems Development, 149-168. DOI: 10.1007/978-3-031-32418-5_9
    • (2019) The Practice of Crowdsourcing. Synthesis Lectures on Information Concepts, Retrieval, and Services 11(1), 1-149. DOI: 10.2200/S00904ED1V01Y201903ICR066
    • (2019) Idiom-based features in sentiment analysis: Cutting the Gordian knot. IEEE Transactions on Affective Computing. DOI: 10.1109/TAFFC.2017.2777842
    • (2017) Building Test Collections. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1407-1410. DOI: 10.1145/3077136.3082064
    • (2016) Collaborative construction of metadata and full-text dataset. 2016 XI Latin American Conference on Learning Objects and Technology (LACLO), 1-6. DOI: 10.1109/LACLO.2016.7751767
    • (2015) Phrase detectives. Proceedings of the 24th International Conference on Artificial Intelligence, 4202-4206. DOI: 10.5555/2832747.2832841
    • (2015) On the Relation Between Assessor's Agreement and Accuracy in Gamified Relevance Assessment. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 605-614. DOI: 10.1145/2766462.2767727
    • (2015) Bibliography. Games with a Purpose (GWAPs), 127-134. DOI: 10.1002/9781119136309.biblio
    • (2014) PageFetch 2. Proceedings of the First International Workshop on Gamification for Information Retrieval, 38-41. DOI: 10.1145/2594776.2594784
