DOI: 10.1145/1989323.1989345

No free lunch in data privacy

Published: 12 June 2011

Abstract

Differential privacy is a powerful tool for providing privacy-preserving noisy query answers over statistical databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. It is frequently accompanied by popularized claims that it provides privacy without any assumptions about the data and that it protects against attackers who know all but one record. In this paper we critically analyze the privacy protections offered by differential privacy.
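The per-tuple guarantee described above is most commonly realized with the Laplace mechanism: a count query has sensitivity 1 (adding or deleting one tuple changes the answer by at most 1), so adding Laplace(1/ε) noise yields ε-differential privacy. A minimal sketch, with invented function names and toy data (not from the paper):

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # Inverse-CDF sample from Laplace(0, scale).
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    # Sensitivity of a count is 1, so Laplace(1/epsilon) noise
    # suffices for epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [34, 29, 51, 62, 45, 38]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

The true count here is 3; the released value is the count plus Laplace noise of scale 1/ε = 2.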
First, we use a no-free-lunch theorem, which defines non-privacy as a game, to argue that it is not possible to provide privacy and utility without making assumptions about how the data are generated. Then we explain where assumptions are needed. We argue that privacy of an individual is preserved when it is possible to limit the inference of an attacker about the participation of the individual in the data generating process. This is different from limiting the inference about the presence of a tuple (for example, Bob's participation in a social network may cause edges to form between pairs of his friends, so that it affects more than just the tuple labeled as "Bob"). The definition of evidence of participation, in turn, depends on how the data are generated -- this is how assumptions enter the picture. We explain these ideas using examples from social network research as well as tabular data for which deterministic statistics have been previously released. In both cases the notion of participation varies, the use of differential privacy can lead to privacy breaches, and differential privacy does not always adequately limit inference about participation.
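The Bob example admits a simple back-of-the-envelope calculation via group privacy: if one person's participation can influence k tuples rather than one, a per-tuple ε guarantee only bounds inference about that person by k·ε. A hedged sketch (the helper name and the friend count are hypothetical, not from the paper):

```python
def effective_epsilon(per_tuple_epsilon: float, tuples_affected: int) -> float:
    # By group privacy, a per-tuple epsilon guarantee degrades to
    # tuples_affected * epsilon for an individual whose participation
    # influences that many tuples.
    return per_tuple_epsilon * tuples_affected

# If Bob's joining a social network can create edges between every
# pair of his 10 friends, up to C(10, 2) = 45 tuples carry evidence
# of his participation.
k = 10
edges = k * (k - 1) // 2
print(effective_epsilon(0.1, edges))
```

With ε = 0.1 per edge, the effective guarantee for Bob is 4.5 — far weaker than the nominal per-tuple bound, which is the sense in which differential privacy can fail to limit inference about participation.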





Published In

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011, 1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. differential privacy
  2. privacy

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11

Acceptance Rates

Overall acceptance rate: 785 of 4,003 submissions (20%)

Article Metrics

  • Downloads (last 12 months): 368
  • Downloads (last 6 weeks): 38
Reflects downloads up to 14 Nov 2024


Cited By

  • (2025) "Computationally efficient data synthesis for AC-OPF: Integrating Physics-Informed Neural Network solvers and active learning." Applied Energy, 378 (124714). DOI: 10.1016/j.apenergy.2024.124714
  • (2024) "LDCML: a novel AI-driven approach for privacy-preserving anonymization of quasi-identifiers." Data and Metadata, 3 (287). DOI: 10.56294/dm2024287
  • (2024) "Privacy-Preserving Transfer Learning Framework for Kidney Disease Detection." Applied Sciences, 14(19) (8629). DOI: 10.3390/app14198629
  • (2024) "30 Years of Synthetic Data." Statistical Science, 39(2). DOI: 10.1214/24-STS927
  • (2024) "Differentially private confidence intervals for proportions under stratified random sampling." Electronic Journal of Statistics, 18(1). DOI: 10.1214/24-EJS2234
  • (2024) "Collaborative learning from distributed data with differentially private synthetic data." BMC Medical Informatics and Decision Making, 24(1). DOI: 10.1186/s12911-024-02563-7
  • (2024) "Sensitivity by Parametricity." Proceedings of the ACM on Programming Languages, 8(OOPSLA2), 415-441. DOI: 10.1145/3689726
  • (2024) ""I inherently just trust that it works": Investigating Mental Models of Open-Source Libraries for Differential Privacy." Proceedings of the ACM on Human-Computer Interaction, 8(CSCW2), 1-39. DOI: 10.1145/3687011
  • (2024) "PrivatEyes: Appearance-based Gaze Estimation Using Federated Secure Multi-Party Computation." Proceedings of the ACM on Human-Computer Interaction, 8(ETRA), 1-23. DOI: 10.1145/3655606
  • (2024) "U.S.-U.K. PETs Prize Challenge: Anomaly Detection via Privacy-Enhanced Federated Learning." IEEE Transactions on Privacy, 1, 3-18. DOI: 10.1109/TP.2024.3392721
