Preserving worker privacy in crowdsourcing

Published: 01 September 2014

Abstract

This paper proposes a crowdsourcing quality control method that preserves worker privacy. Crowdsourcing allows us to outsource tasks to a number of workers. Because workers differ in skill, the results obtained through crowdsourcing are often of low quality, so quality control methods are needed to estimate reliable results from them. In this paper, we point out a privacy problem for crowdsourcing workers: personal information about a worker can be inferred from the results that worker provides. To formulate and address this problem, we define the worker-private quality control problem, a variation of the quality control problem that preserves the privacy of workers. We propose a worker-private latent class protocol with which a requester can estimate the true results while worker privacy is preserved. The key ideas are the decentralization of computation and the introduction of secure computation. We theoretically guarantee the security of the proposed protocol and experimentally examine its computational efficiency and accuracy.
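To give a feel for how decentralization and secure computation can combine, the following is a minimal sketch and not the worker-private latent class protocol itself. It assumes a much simpler aggregation rule (majority voting), honest-but-curious parties, and additive secret sharing as the secure-computation primitive; the paper's actual protocol may use different building blocks. All names (`make_shares`, `secure_vote_counts`, `MODULUS`) are hypothetical, and the whole exchange is simulated inside one process.

```python
import secrets

# Public modulus (an assumption of this sketch); it only needs to exceed
# the largest possible vote total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_vote_counts(labels, n_classes, modulus=MODULUS):
    """Per-class vote counts for one task, computed from shares only.

    labels[i] is worker i's private label in range(n_classes).
    """
    n = len(labels)
    totals = []
    for c in range(n_classes):
        # Worker i splits the indicator 1[label_i == c] into one share per worker.
        sent = [make_shares(int(label == c), n, modulus) for label in labels]
        # Worker j sums the shares it received and publishes only that partial sum.
        partial = [sum(sent[i][j] for i in range(n)) % modulus for j in range(n)]
        # The requester adds the published partial sums to recover the count.
        totals.append(sum(partial) % modulus)
    return totals


labels = [1, 0, 1, 1, 2]          # private labels of five simulated workers
counts = secure_vote_counts(labels, n_classes=3)
majority = max(range(3), key=lambda c: counts[c])
print(counts, majority)           # [1, 3, 1] 1
```

In this sketch the requester sees only the aggregate counts, and each individual share is a uniformly random value that reveals nothing about a worker's label on its own. A latent class method would additionally estimate per-worker reliability, which this example does not attempt.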

Published In

Data Mining and Knowledge Discovery, Volume 28, Issue 5-6
September 2014
482 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Crowdsourcing
  2. EM algorithm
  3. Privacy-preserving data mining
  4. Quality control

Qualifiers

  • Article

Cited By

  • (2022) Privacy-preserving worker allocation in crowdsourcing. The VLDB Journal: The International Journal on Very Large Data Bases 31:4 (733-751). DOI: 10.1007/s00778-021-00713-1. Online publication date: 16-Jan-2022
  • (2020) Understanding Crowdsourcing Contest Fitness Strategic Decision Factors and Performance: An Expectation-Confirmation Theory Perspective. Information Systems Frontiers 22:5 (1227-1240). DOI: 10.1007/s10796-019-09926-w. Online publication date: 1-Oct-2020
  • (2019) Privacy-Preserving Truth Discovery in Crowd Sensing Systems. ACM Transactions on Sensor Networks 15:1 (1-32). DOI: 10.1145/3277505. Online publication date: 9-Jan-2019
  • (2018) An Efficient Two-Layer Mechanism for Privacy-Preserving Truth Discovery. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (1705-1714). DOI: 10.1145/3219819.3219998. Online publication date: 19-Jul-2018
  • (2018) Non-Interactive Privacy-Preserving Truth Discovery in Crowd Sensing Applications. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications (1988-1996). DOI: 10.1109/INFOCOM.2018.8486371. Online publication date: 16-Apr-2018
  • (2018) SybMatch: Sybil Detection for Privacy-Preserving Task Matching in Crowdsourcing. 2018 IEEE Global Communications Conference (GLOBECOM) (1-6). DOI: 10.1109/GLOCOM.2018.8647346. Online publication date: 9-Dec-2018
  • (2017) IOTURVA. Proceedings of the 12th Workshop on Challenged Networks (1-6). DOI: 10.1145/3124087.3124093. Online publication date: 20-Oct-2017
  • (2017) Poster: IoTURVA. Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking (552-554). DOI: 10.1145/3117811.3131262. Online publication date: 4-Oct-2017
  • (2015) Cloud-Enabled Privacy-Preserving Truth Discovery in Crowd Sensing Systems. Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (183-196). DOI: 10.1145/2809695.2809719. Online publication date: 1-Nov-2015