DOI: 10.1145/3242587.3242598
Public Access

Sprout: Crowd-Powered Task Design for Crowdsourcing

Published: 11 October 2018

Abstract

While crowdsourcing enables data collection at scale, ensuring high-quality data remains a challenge. In particular, effective task design underlies nearly every reported crowdsourcing success, yet remains difficult to accomplish. Task design is hard because it involves a costly iterative process: identifying the kind of work output one wants, conveying this information to workers, observing worker performance, understanding what remains ambiguous, revising the instructions, and repeating the process until the resulting output is satisfactory. To facilitate this process, we propose a novel meta-workflow that helps requesters optimize crowdsourcing task designs and Sprout, our open-source tool, which implements this workflow. Sprout improves task designs by (1) eliciting points of confusion from crowd workers, (2) enabling requesters to quickly understand these misconceptions and the overall space of questions, and (3) guiding requesters to improve the task design in response. We report the results of a user study with two labeling tasks demonstrating that requesters strongly prefer Sprout and produce higher-rated instructions compared to current best practices for creating gated instructions (instructions plus a workflow for training and testing workers). We also offer a set of design recommendations for future tools that support crowdsourcing task design.
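
To make the meta-workflow concrete, the sketch below illustrates the iterative loop the abstract describes: elicit points of confusion from workers on the current task design, aggregate them so the requester can see recurring misconceptions, and fold clarifications back into the instructions before the next round. This is a minimal, hypothetical illustration, not Sprout's implementation; the names TaskDesign, WorkerQuestion, cluster_questions, and revise are assumptions introduced here for explanation only.

```python
# Hypothetical sketch of the iterative task-design loop from the abstract:
# elicit confusion -> understand it -> revise instructions -> repeat.
# All names are illustrative; they do not come from the paper or Sprout's code.
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaskDesign:
    instructions: str
    revision: int = 0
    gated_examples: List[str] = field(default_factory=list)  # training/test items


@dataclass
class WorkerQuestion:
    worker_id: str
    item_id: str
    text: str                    # the point of confusion, in the worker's words
    topic: Optional[str] = None  # optional coarse label assigned during review


def cluster_questions(questions: List[WorkerQuestion]) -> List[List[WorkerQuestion]]:
    """Group worker questions so the requester can scan recurring
    misconceptions instead of reading raw, redundant questions."""
    clusters = defaultdict(list)
    for q in questions:
        # Crude stand-in for real clustering: group by topic label if present,
        # otherwise by a normalized prefix of the question text.
        key = q.topic or q.text.lower()[:40]
        clusters[key].append(q)
    # Largest clusters first, so the most common confusions surface on top.
    return sorted(clusters.values(), key=len, reverse=True)


def revise(design: TaskDesign, clarifications: Dict[str, str]) -> TaskDesign:
    """Fold requester-written clarifications into the instructions and
    start the next revision for another round of worker feedback."""
    addendum = "\n".join(f"Clarification: {text}" for text in clarifications.values())
    return TaskDesign(
        instructions=design.instructions + "\n" + addendum,
        revision=design.revision + 1,
        gated_examples=design.gated_examples,
    )
```

In the actual system the elicitation step runs on a crowd platform and the review step is interactive; the sketch only makes the data flow of the meta-workflow concrete.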

Supplementary Material

  • suppl.mov (ufp1071.mp4): supplemental video
  • suppl.mov (ufp1071p.mp4): supplemental video
  • MP4 File (p165-bragg.mp4)




Published In

UIST '18: Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology
October 2018, 1016 pages
ISBN: 9781450359481
DOI: 10.1145/3242587
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States



Qualifiers

  • Research-article

Conference

UIST '18

Acceptance Rates

UIST '18 Paper Acceptance Rate: 80 of 375 submissions, 21%
Overall Acceptance Rate: 561 of 2,567 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 139
  • Downloads (last 6 weeks): 11
Reflects downloads up to 18 Nov 2024


Cited By

  • (2024) The State of Pilot Study Reporting in Crowdsourcing: A Reflection on Best Practices and Guidelines. Proceedings of the ACM on Human-Computer Interaction 8(CSCW1), 1-45. DOI: 10.1145/3641023. Online publication date: 26-Apr-2024.
  • (2024) Exploring the Future of Informed Consent: Applying a Service Design Approach. Proceedings of the ACM on Human-Computer Interaction 8(CSCW1), 1-31. DOI: 10.1145/3637330. Online publication date: 26-Apr-2024.
  • (2024) A clarity and fairness aware framework for selecting workers in competitive crowdsourcing tasks. Computing 106(9), 3005-3030. DOI: 10.1007/s00607-024-01316-8. Online publication date: 6-Jul-2024.
  • (2024) Quality Assured: Rethinking Annotation Strategies in Imaging AI. Computer Vision – ECCV 2024, 52-69. DOI: 10.1007/978-3-031-73229-4_4. Online publication date: 29-Sep-2024.
  • (2023) The Economics of Human Oversight: How Norms and Incentives Affect Costs and Performance of AI Workers. SSRN Electronic Journal. DOI: 10.2139/ssrn.4673217. Online publication date: 2023.
  • (2023) Judgment Sieve: Reducing Uncertainty in Group Judgments through Interventions Targeting Ambiguity versus Disagreement. Proceedings of the ACM on Human-Computer Interaction 7(CSCW2), 1-26. DOI: 10.1145/3610074. Online publication date: 4-Oct-2023.
  • (2023) Supporting Requesters in Writing Clear Crowdsourcing Task Descriptions Through Computational Flaw Assessment. Proceedings of the 28th International Conference on Intelligent User Interfaces, 737-749. DOI: 10.1145/3581641.3584039. Online publication date: 27-Mar-2023.
  • (2023) CoSINT: Designing a Collaborative Capture the Flag Competition to Investigate Misinformation. Proceedings of the 2023 ACM Designing Interactive Systems Conference, 2551-2572. DOI: 10.1145/3563657.3595997. Online publication date: 10-Jul-2023.
  • (2023) NLP-Crowdsourcing Hybrid Framework for Inter-Researcher Similarity Detection. IEEE Transactions on Human-Machine Systems 53(6), 1017-1026. DOI: 10.1109/THMS.2023.3319290. Online publication date: Dec-2023.
  • (2023) Labelling instructions matter in biomedical image analysis. Nature Machine Intelligence 5(3), 273-283. DOI: 10.1038/s42256-023-00625-5. Online publication date: 2-Mar-2023.
