Abstract
Quality control is an important issue in crowdsourcing. In label collection tasks, requesters typically aggregate the redundant answers collected from multiple workers for each question in order to obtain a reliable answer. Researchers have proposed various statistical approaches to this crowd label aggregation problem. Intuitively, these approaches produce higher-quality aggregation results when the ability of the worker set is higher. To select a set of workers likely to have higher ability without additional effort from requesters, and in contrast to existing solutions that require designing a proper qualification test or using auxiliary information, we propose an iterative reduction approach for worker filtering that leverages the similarity between pairs of workers. The worker similarity we adopt remains feasible in the practical case of incomplete labels. We conduct experiments on both synthetic and real datasets to verify the effectiveness of our approach and discuss its capability in different cases.
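To make the high-level idea concrete, the following is a minimal Python sketch of iterative-reduction worker filtering followed by majority-vote aggregation. It is an illustration under stated assumptions, not the paper's exact method: the function names (pairwise_agreement, iterative_reduction_filter, majority_vote), the agreement-rate similarity, the mean-similarity scoring, and the stopping criterion (a fixed number of retained workers) are all hypothetical choices. The one property the sketch does preserve from the abstract is that similarity is computed only on questions both workers answered, which is what makes the scheme feasible under incomplete labels.

```python
from collections import Counter

def pairwise_agreement(labels_a, labels_b):
    """Agreement rate of two workers on the questions both answered.

    labels_a, labels_b: dicts mapping question id -> label. Intersecting
    the key sets handles incomplete labeling, where workers answer
    different subsets of the questions.
    """
    common = set(labels_a) & set(labels_b)
    if not common:
        return None  # no overlap: similarity is undefined for this pair
    return sum(labels_a[q] == labels_b[q] for q in common) / len(common)

def iterative_reduction_filter(worker_labels, keep):
    """Iteratively drop the worker least similar to the rest.

    worker_labels: dict mapping worker id -> {question id: label}.
    keep: number of workers to retain (an assumed stopping criterion).
    Each round scores every remaining worker by its mean pairwise
    agreement with the other remaining workers, then removes the
    lowest-scoring worker.
    """
    remaining = set(worker_labels)
    while len(remaining) > keep:
        scores = {}
        for w in remaining:
            sims = [pairwise_agreement(worker_labels[w], worker_labels[v])
                    for v in remaining if v != w]
            sims = [s for s in sims if s is not None]
            scores[w] = sum(sims) / len(sims) if sims else 0.0
        remaining.remove(min(scores, key=scores.get))
    return remaining

def majority_vote(worker_labels, workers):
    """Aggregate the retained workers' labels by per-question majority vote."""
    votes = {}
    for w in workers:
        for q, label in worker_labels[w].items():
            votes.setdefault(q, Counter())[label] += 1
    return {q: counter.most_common(1)[0][0] for q, counter in votes.items()}

# Toy usage: three workers with incomplete, partially conflicting labels.
labels = {
    "w1": {"q1": 1, "q2": 0, "q3": 1},
    "w2": {"q1": 1, "q2": 0},           # incomplete: skipped q3
    "w3": {"q1": 0, "q2": 1, "q3": 0},  # disagrees with the other two
}
kept = iterative_reduction_filter(labels, keep=2)
print(majority_vote(labels, kept))      # w3 is filtered out first
```

In this toy run, w3 has the lowest mean agreement and is removed, after which majority voting over the remaining workers yields the aggregated answers; any more sophisticated statistical aggregation model could replace the final voting step.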
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Li, J., Kashima, H. (2017). Iterative Reduction Worker Filtering for Crowdsourced Label Aggregation. In: Bouguettaya, A., et al. (eds.) Web Information Systems Engineering – WISE 2017. Lecture Notes in Computer Science, vol. 10570. Springer, Cham. https://doi.org/10.1007/978-3-319-68786-5_4
DOI: https://doi.org/10.1007/978-3-319-68786-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68785-8
Online ISBN: 978-3-319-68786-5
eBook Packages: Computer Science, Computer Science (R0)