DOI:10.1145/2818048.2835202
research-article

Assignment Techniques for Crowdsourcing Sensitive Tasks

Published: 27 February 2016

Abstract

Protecting the privacy of crowd workers has been an important topic in crowdsourcing. Task privacy, however, has largely been ignored, even though many tasks (e.g., form digitization, live audio transcription, or image tagging) often contain sensitive information. Although assigning an entire job to a single worker may leak private information, jobs can often be split into small components that individually do not. We study the problem of distributing such components to workers with the goal of maximizing task privacy.
We introduce information loss functions to formally measure the amount of private information leaked as a function of the task assignment. We then design assignment mechanisms for three different assignment settings: PUSH, PULL, and a new setting, Tug Of War (TOW), an intermediate approach that balances flexibility for both workers and requesters. Our assignment algorithms have zero privacy loss for PUSH and tight theoretical guarantees for PULL. For TOW, our assignment algorithm provably outperforms PULL; importantly, the privacy loss is independent of the number of tasks, even when workers collude. We further analyze the performance and privacy tradeoffs empirically on simulated and real-world collusion networks and find that our algorithms outperform the theoretical guarantees.
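To make the idea of splitting a job and measuring leakage concrete, here is a minimal Python sketch. It is not the paper's algorithm or its loss function: `push_assign` is a hypothetical round-robin rule in which the requester decides who gets which component, and `toy_loss` is an assumed, simplified measure that counts the most components of any single job visible to one group of colluding workers. All names and the example data are illustrative assumptions.

```python
from collections import defaultdict


def push_assign(jobs, workers):
    """Hypothetical requester-driven assignment (an assumption, not the paper's method).

    jobs:    dict mapping job id -> list of component ids
    workers: list of worker ids
    Components of the same job always go to different workers.
    """
    assignment = defaultdict(list)
    offset = 0
    for job_id, components in jobs.items():
        if len(components) > len(workers):
            raise ValueError("not enough workers to keep components apart")
        for i, comp in enumerate(components):
            # Consecutive indices modulo len(workers) are distinct within a job.
            worker = workers[(offset + i) % len(workers)]
            assignment[worker].append((job_id, comp))
        offset += len(components)
    return dict(assignment)


def toy_loss(assignment, groups):
    """Assumed toy measure: the largest number of components of one job
    that any single group of colluding workers can see together."""
    worst = 0
    for group in groups:
        seen = defaultdict(int)
        for worker in group:
            for job_id, _ in assignment.get(worker, []):
                seen[job_id] += 1
        worst = max(worst, max(seen.values(), default=0))
    return worst


# Illustrative data only: two forms, each split into three fields.
jobs = {"form1": ["name", "dob", "ssn"], "form2": ["name", "dob", "ssn"]}
workers = ["w1", "w2", "w3", "w4"]
assignment = push_assign(jobs, workers)

# No collusion: every worker is a group of one.
print(toy_loss(assignment, [[w] for w in workers]))
# Two workers pooling what they have seen.
print(toy_loss(assignment, [["w1", "w2"]]))
```

In this toy setup an isolated worker never sees more than one field of a form, while a colluding pair can combine fields, which is the kind of effect an information loss function over assignments is meant to capture.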



Information & Contributors

Information

Published In

CSCW '16: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing
February 2016
1866 pages
ISBN:9781450335928
DOI:10.1145/2818048
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Crowdsourcing
  2. Microtasks
  3. Privacy
  4. Social Networks

Qualifiers

  • Research-article

Conference

CSCW '16
CSCW '16: Computer Supported Cooperative Work and Social Computing
February 27 - March 2, 2016
San Francisco, California, USA

Acceptance Rates

CSCW '16 Paper Acceptance Rate 142 of 571 submissions, 25%;
Overall Acceptance Rate 2,235 of 8,521 submissions, 26%
