DOI: 10.1145/1989323.1989345

No free lunch in data privacy

Published: 12 June 2011

Abstract

Differential privacy is a powerful tool for providing privacy-preserving noisy query answers over statistical databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. It is frequently accompanied by popularized claims that it provides privacy without any assumptions about the data and that it protects against attackers who know all but one record. In this paper we critically analyze the privacy protections offered by differential privacy.
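The per-tuple guarantee described above is most commonly realized with the Laplace mechanism: a count query has sensitivity 1 (adding or deleting one tuple changes the answer by at most 1), so adding Laplace(1/ε) noise yields ε-differential privacy. A minimal sketch, with invented function names and toy data (not from the paper):

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # Inverse-CDF sample from Laplace(0, scale).
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    # Sensitivity of a count is 1, so Laplace(1/epsilon) noise
    # suffices for epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [34, 29, 51, 62, 45, 38]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

The true count here is 3; the released value is the count plus Laplace noise of scale 1/ε = 2.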
First, we use a no-free-lunch theorem, which defines non-privacy as a game, to argue that it is not possible to provide privacy and utility without making assumptions about how the data are generated. Then we explain where assumptions are needed. We argue that privacy of an individual is preserved when it is possible to limit the inference of an attacker about the participation of the individual in the data generating process. This is different from limiting the inference about the presence of a tuple (for example, Bob's participation in a social network may cause edges to form between pairs of his friends, so that it affects more than just the tuple labeled as "Bob"). The definition of evidence of participation, in turn, depends on how the data are generated -- this is how assumptions enter the picture. We explain these ideas using examples from social network research as well as tabular data for which deterministic statistics have been previously released. In both cases the notion of participation varies, the use of differential privacy can lead to privacy breaches, and differential privacy does not always adequately limit inference about participation.
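The Bob example admits a simple back-of-the-envelope calculation via group privacy: if one person's participation can influence k tuples rather than one, a per-tuple ε guarantee only bounds inference about that person by k·ε. A hedged sketch (the helper name and the friend count are hypothetical, not from the paper):

```python
def effective_epsilon(per_tuple_epsilon: float, tuples_affected: int) -> float:
    # By group privacy, a per-tuple epsilon guarantee degrades to
    # tuples_affected * epsilon for an individual whose participation
    # influences that many tuples.
    return per_tuple_epsilon * tuples_affected

# If Bob's joining a social network can create edges between every
# pair of his 10 friends, up to C(10, 2) = 45 tuples carry evidence
# of his participation.
k = 10
edges = k * (k - 1) // 2
print(effective_epsilon(0.1, edges))
```

With ε = 0.1 per edge, the effective guarantee for Bob is 4.5 — far weaker than the nominal per-tuple bound, which is the sense in which differential privacy can fail to limit inference about participation.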





Published In

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011, 1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. differential privacy
  2. privacy

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11

Acceptance Rates

Overall acceptance rate: 785 of 4,003 submissions (20%)

Article Metrics

  • Downloads (last 12 months): 368
  • Downloads (last 6 weeks): 38
Reflects downloads up to 14 Nov 2024


Cited By

  • (2025) "Computationally efficient data synthesis for AC-OPF: Integrating Physics-Informed Neural Network solvers and active learning." Applied Energy, 378 (124714). DOI: 10.1016/j.apenergy.2024.124714
  • (2024) "LDCML: a novel AI-driven approach for privacy-preserving anonymization of quasi-identifiers." Data and Metadata, 3 (287). DOI: 10.56294/dm2024287
  • (2024) "Privacy-Preserving Transfer Learning Framework for Kidney Disease Detection." Applied Sciences, 14(19) (8629). DOI: 10.3390/app14198629
  • (2024) "30 Years of Synthetic Data." Statistical Science, 39(2). DOI: 10.1214/24-STS927
  • (2024) "Differentially private confidence intervals for proportions under stratified random sampling." Electronic Journal of Statistics, 18(1). DOI: 10.1214/24-EJS2234
  • (2024) "Collaborative learning from distributed data with differentially private synthetic data." BMC Medical Informatics and Decision Making, 24(1). DOI: 10.1186/s12911-024-02563-7
  • (2024) "Sensitivity by Parametricity." Proceedings of the ACM on Programming Languages, 8(OOPSLA2), 415-441. DOI: 10.1145/3689726
  • (2024) ""I inherently just trust that it works": Investigating Mental Models of Open-Source Libraries for Differential Privacy." Proceedings of the ACM on Human-Computer Interaction, 8(CSCW2), 1-39. DOI: 10.1145/3687011
  • (2024) "PrivatEyes: Appearance-based Gaze Estimation Using Federated Secure Multi-Party Computation." Proceedings of the ACM on Human-Computer Interaction, 8(ETRA), 1-23. DOI: 10.1145/3655606
  • (2024) "U.S.-U.K. PETs Prize Challenge: Anomaly Detection via Privacy-Enhanced Federated Learning." IEEE Transactions on Privacy, 1, 3-18. DOI: 10.1109/TP.2024.3392721
