DOI: 10.1145/3219819.3219998

An Efficient Two-Layer Mechanism for Privacy-Preserving Truth Discovery

Published: 19 July 2018

Abstract

Soliciting answers from online users is an efficient and effective solution to many challenging tasks. Because user quality varies, it is important to infer each user's ability to provide correct answers during aggregation. Truth discovery methods do this automatically: they estimate user quality and aggregate user-contributed answers via a weighted combination. Although truth discovery is an effective tool for answer aggregation, existing work falls short of protecting the privacy of participating users. To fill this gap, we propose perturbation-based mechanisms that provide users with privacy guarantees while maintaining the accuracy of aggregated answers. We first present a one-layer mechanism, in which all users perturb their answers with the same probability. Aggregation is then conducted on the perturbed answers, but aggregation accuracy may drop as a result. To improve utility, we propose a two-layer mechanism in which users sample their own perturbation probabilities from a hyper distribution. We theoretically compare the one-layer and two-layer mechanisms and prove that they provide the same privacy guarantee while the two-layer mechanism delivers better utility. This advantage arises because the two-layer mechanism can use the user quality estimated by truth discovery to reduce the accuracy loss caused by perturbation, which is confirmed by experimental results on real-world datasets. Experimental results also demonstrate that the proposed two-layer mechanism protects privacy effectively with tolerable accuracy loss in aggregation.
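
The abstract gives no pseudocode, so the following is only a minimal sketch of the two ideas it describes, not the authors' algorithm. It assumes categorical answers, uses randomized response as the perturbation, draws per-user probabilities from a caller-supplied hyper distribution, and stands in a simple iterative weighted vote for truth discovery. All function names (`perturb`, `one_layer`, `two_layer`, `truth_discovery`) are hypothetical, and no privacy budget is calibrated here.

```python
import random
from collections import Counter, defaultdict


def perturb(answer, choices, flip_prob, rng):
    """Randomized response (assumed perturbation): with probability
    flip_prob, replace the answer with a different, uniformly chosen one."""
    if rng.random() < flip_prob:
        return rng.choice([c for c in choices if c != answer])
    return answer


def one_layer(answers, choices, flip_prob, rng):
    """One-layer idea: every user perturbs with the same probability."""
    return {
        user: {q: perturb(a, choices, flip_prob, rng) for q, a in qa.items()}
        for user, qa in answers.items()
    }


def two_layer(answers, choices, sample_prob, rng):
    """Two-layer idea: each user first samples a private probability from
    a hyper distribution (sample_prob), then perturbs with it."""
    perturbed = {}
    for user, qa in answers.items():
        p = sample_prob(rng)  # known only to this user
        perturbed[user] = {q: perturb(a, choices, p, rng) for q, a in qa.items()}
    return perturbed


def truth_discovery(answers, iterations=10):
    """Illustrative truth discovery: alternate between a weighted vote
    per question and re-estimating each user's weight from agreement."""
    weights = {user: 1.0 for user in answers}
    truths = {}
    for _ in range(iterations):
        votes = defaultdict(Counter)
        for user, qa in answers.items():
            for q, a in qa.items():
                votes[q][a] += weights[user]
        truths = {q: tally.most_common(1)[0][0] for q, tally in votes.items()}
        for user, qa in answers.items():
            agree = sum(truths[q] == a for q, a in qa.items())
            # Smoothed agreement rate, so no weight is exactly 0 or 1.
            weights[user] = (agree + 1) / (len(qa) + 2)
    return truths, weights
```

A call such as `two_layer(answers, choices, lambda r: r.betavariate(2, 6), random.Random(0))` would use a Beta hyper distribution, which is an arbitrary choice made for illustration. Users who happen to draw a large probability produce noisier answers, and the weight update in `truth_discovery` then tends to down-weight them, which is the intuition the abstract gives for the two-layer mechanism's better utility.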

Supplementary Material

MP4 File (li_truth_discovery.mp4)

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. differential privacy
  2. truth discovery
  3. two-layer mechanism

Qualifiers

  • Research-article

Conference

KDD '18
Acceptance Rates

KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 150
  • Downloads (last 6 weeks): 24
Reflects downloads up to 21 Nov 2024

Cited By

  • (2024) Privacy-Preserving Truth Discovery Based on Secure Multi-Party Computation in Vehicle-Based Mobile Crowdsensing. IEEE Transactions on Intelligent Transportation Systems 25:7 (7767-7779). DOI: 10.1109/TITS.2024.3350208. Online publication date: Jul-2024
  • (2024) Data Poisoning Attacks and Defenses to LDP-based Privacy-Preserving Crowdsensing. IEEE Transactions on Dependable and Secure Computing (1-18). DOI: 10.1109/TDSC.2024.3363507. Online publication date: 2024
  • (2023) PrivTDSI: A Local Differentially Private Approach for Truth Discovery via Sampling and Inference. IEEE Transactions on Big Data 9:2 (471-484). DOI: 10.1109/TBDATA.2022.3186175. Online publication date: 1-Apr-2023
  • (2023) Reliable and Streaming Truth Discovery in Blockchain-based Crowdsourcing. 2023 20th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON) (492-500). DOI: 10.1109/SECON58729.2023.10287465. Online publication date: 11-Sep-2023
  • (2023) Decentralized privacy-preserving truth discovery for crowd sensing. Information Sciences: an International Journal 632:C (730-741). DOI: 10.1016/j.ins.2023.03.046. Online publication date: 1-Jun-2023
  • (2022) Achieving Private and Fair Truth Discovery in Crowdsourcing Systems. Security and Communication Networks 2022. DOI: 10.1155/2022/9281729. Online publication date: 1-Jan-2022
  • (2022) Privacy-Enhanced and Practical Truth Discovery in Two-Server Mobile Crowdsensing. IEEE Transactions on Network Science and Engineering 9:3 (1740-1755). DOI: 10.1109/TNSE.2022.3151228. Online publication date: 1-May-2022
  • (2022) Towards Personalized Privacy-Preserving Truth Discovery Over Crowdsourced Data Streams. IEEE/ACM Transactions on Networking 30:1 (327-340). DOI: 10.1109/TNET.2021.3110052. Online publication date: Feb-2022
  • (2022) Disguised as Privacy: Data Poisoning Attacks against Differentially Private Crowdsensing Systems. IEEE Transactions on Mobile Computing (1-1). DOI: 10.1109/TMC.2022.3173642. Online publication date: 2022
  • (2022) Privacy-Preserving Streaming Truth Discovery in Crowdsourcing With Differential Privacy. IEEE Transactions on Mobile Computing 21:10 (3757-3772). DOI: 10.1109/TMC.2021.3062775. Online publication date: 1-Oct-2022
