Hybrid Crowd-Machine Methods as Alternatives to Pooling and Expert Judgments

  • Conference paper
Information Retrieval Technology (AIRS 2014)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8870)


Abstract

Pooling is a document sampling strategy commonly used to collect relevance judgments when multiple retrieval/ranking algorithms are involved: a fixed number of top-ranking documents from each algorithm forms the pool, which expensive expert assessors then judge for relevance. We propose and test two hybrid algorithms as effective, lower-cost alternatives. The machine part selects documents to judge from the full set of retrieved documents; the human part uses inexpensive crowd workers to make the judgments. We present a clustered and a non-clustered approach for document selection, and two experiments testing our algorithms. The first is designed to be statistically robust, controlling for variation across crowd workers, collections, domains and topics. The second is designed along more natural lines and investigates more topics. Our results demonstrate that high quality can be achieved at low cost, and by judging far fewer documents than with pooling. Precision, recall, F-scores and LAM are very strong, indicating that our algorithms combined with crowdsourcing offer viable alternatives to collecting judgments via pooling with expert assessments.
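To make the setup concrete, the sketch below (not taken from the paper) illustrates depth-k pooling as described above, together with the evaluation measures the paper reports: precision, recall, F-score and LAM (logistic average misclassification, as used in the TREC spam/filtering literature). The function names, the pool depth k, and the clamping constant eps are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only: depth-k pooling plus the evaluation
    # measures named in the abstract. Names and parameters are hypothetical.
    from math import exp, log


    def depth_k_pool(run_rankings, k=100):
        """Union of the top-k documents from each system's ranked list.

        run_rankings: dict mapping a run id to its ranked list of doc ids.
        Returns the set of documents that would be sent to assessors.
        """
        pool = set()
        for ranking in run_rankings.values():
            pool.update(ranking[:k])
        return pool


    def _logit(p, eps=1e-6):
        p = min(max(p, eps), 1.0 - eps)  # clamp away from 0 and 1
        return log(p / (1.0 - p))


    def evaluate_judgments(predicted, gold):
        """Compare crowd/machine labels with expert labels on the same documents.

        predicted, gold: dicts doc_id -> 1 (relevant) or 0 (non-relevant).
        Returns precision, recall, F1 and LAM.
        """
        tp = fp = fn = tn = 0
        for doc, truth in gold.items():
            pred = predicted.get(doc, 0)
            if pred and truth:
                tp += 1
            elif pred and not truth:
                fp += 1
            elif not pred and truth:
                fn += 1
            else:
                tn += 1

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        # LAM: inverse logit of the mean logit of the two error rates
        # (false-positive rate and false-negative rate).
        fpr = fp / (fp + tn) if fp + tn else 0.0
        fnr = fn / (fn + tp) if fn + tp else 0.0
        lam = 1.0 / (1.0 + exp(-(_logit(fpr) + _logit(fnr)) / 2.0))

        return precision, recall, f1, lam

In this framing, the hybrid methods would replace the output of depth_k_pool with a machine-selected subset of the retrieved documents and the expert labels with aggregated crowd labels; evaluate_judgments then scores those labels against an expert gold standard, which is roughly how the quality figures above are to be read.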




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Harris, C.G., Srinivasan, P. (2014). Hybrid Crowd-Machine Methods as Alternatives to Pooling and Expert Judgments. In: Jaafar, A., et al. Information Retrieval Technology. AIRS 2014. Lecture Notes in Computer Science, vol 8870. Springer, Cham. https://doi.org/10.1007/978-3-319-12844-3_6

  • DOI: https://doi.org/10.1007/978-3-319-12844-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12843-6

  • Online ISBN: 978-3-319-12844-3

  • eBook Packages: Computer Science (R0)
