Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3452021.3458328acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

Improved Differentially Private Euclidean Distance Approximation

Published: 20 June 2021 Publication History

Abstract

This work shows how to privately and more accurately estimate Euclidean distance between pairs of vectors. Input vectors x and y are mapped to differentially private sketches $x'$ and $y'$, from which one can estimate the distance between x and y. Our estimator relies on the Sparser Johnson-Lindenstrauss constructions by Kane & Nelson (Journal of the ACM 2014), which for any 0<α,β<1/2 have optimal output dimension k=Θ(α^-2 łog(1/β)) and sparsity s=O(α^-1 łog(1/β)). We combine the constructions of Kane & Nelson with either the Laplace or the Gaussian mechanism from the differential privacy literature, depending on the privacy parameters $\varepsilon$ and δ. We also suggest a differentially private version of Fast Johnson-Lindenstrauss Transform (FJLT) by Ailon & Chazelle (SIAM Journal of Computing 2009) which offers a tradeoff in speed for variance for certain parameters. We answer an open question by Kenthapadi et al. (Journal of Privacy and Confidentiality 2013) by analyzing the privacy and utility guarantees of an estimator for Euclidean distance, relying on Laplacian rather than Gaussian noise. We prove that the Laplace mechanism yields lower variance than the Gaussian mechanism whenever δ<β^O(1/α). Thus, our work poses an improvement over the work of Kenthapadi et al. by giving a more efficient estimator with lower variance for sufficiently small δ. Our sketch also achieves pure differential privacy as a neat side-effect of the Laplace mechanism rather than the approximate differential privacy guarantee of the Gaussian mechanism, which may not be sufficiently strong for some settings. Our main result is a special case of more general, technical results proving that one can generally construct unbiased estimators for Euclidean distance with a high level of utility even under the constraint of differential privacy. The bulk of our analysis is proving that the variance of the estimator does not suffer too much in the presence of differential privacy.

References

[1]
Dimitris Achlioptas. 2003. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci., Vol. 66, 4 (2003), 671--687. https://doi.org/10.1016/S0022-0000(03)00025--4
[2]
Nir Ailon and Bernard Chazelle. 2009. The Fast Johnson--Lindenstrauss Transform and Approximate Nearest Neighbors. SIAM J. Comput., Vol. 39, 1 (2009), 302--322. https://doi.org/10.1137/060673096
[3]
Michael Barbaro and Tom Zeller. 2006. A Face is exposed for AOL searcher no. 4417749. New York Times (01 2006).
[4]
Raghav Bhaskar, Abhishek Bhowmick, Vipul Goyal, Srivatsan Laxman, and Abhradeep Thakurta. 2011. Noiseless Database Privacy. In 17th International Conference on the Theory and Application of Cryptology and Information Security, ASIACRYPT (Lecture Notes in Computer Science, Vol. 7073). 215--232. https://doi.org/10.1007/978--3--642--25385-0_12
[5]
Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. 2012. The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy. In 53rd Symposium on Foundations of Computer Science, FOCS. 410--419. https://doi.org/10.1109/FOCS.2012.67
[6]
Christos Boutsidis, Anastasios Zouzias, Michael W. Mahoney, and Petros Drineas. 2015. Randomized Dimensionality Reduction for k-Means Clustering. IEEE Trans. Inf. Theory, Vol. 61, 2 (2015), 1045--1062. https://doi.org/10.1109/TIT.2014.2375327
[7]
Karl Bringmann, Fabian Kuhn, Konstantinos Panagiotou, Ueli Peter, and Henning Thomas. 2014. Internal DLA: Efficient Simulation of a Physical Growth Model. In Automata, Languages, and Programming, ICALP (Lecture Notes in Computer Science, Vol. 8572). 247--258. https://doi.org/10.1007/978--3--662--43948--7_21
[8]
Clé ment L. Canonne, Gautam Kamath, and Thomas Steinke. 2020. The Discrete Gaussian for Differential Privacy. CoRR, Vol. abs/2004.00010 (2020). arxiv: 2004.00010 https://arxiv.org/abs/2004.00010
[9]
Rui Chen, Qian Xiao, Yu Zhang, and Jianliang Xu. 2015. Differentially Private High-Dimensional Data Publication via Sampling-Based Inference. In 21st International Conference on Knowledge Discovery and Data Mining . 129--138. https://doi.org/10.1145/2783258.2783379
[10]
Kenneth L. Clarkson. 2008. Tighter bounds for random projections of manifolds. In 24th Symposium on Computational Geometry. 39--48. https://doi.org/10.1145/1377676.1377685
[11]
Kenneth L. Clarkson and David P. Woodruff. 2009. Numerical linear algebra in the streaming model. In 41st Symposium on Theory of Computing, STOC. 205--214. https://doi.org/10.1145/1536414.1536445
[12]
Kenneth L. Clarkson and David P. Woodruff. 2013. Low rank approximation and regression in input sparsity time. In Symposium on Theory of Computing Conference, STOC. 81--90. https://doi.org/10.1145/2488608.2488620
[13]
Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Madalina Persu. 2015. Dimensionality Reduction for k-Means Clustering and Low Rank Approximation. In 47th Symposium on Theory of Computing, STOC. 163--172. https://doi.org/10.1145/2746539.2746569
[14]
Anirban Dasgupta, Ravi Kumar, and Tamá s Sarló s. 2010. A sparse Johnson-Lindenstrauss transform. In 42nd Symposium on Theory of Computing, STOC. 341--350. https://doi.org/10.1145/1806689.1806737
[15]
Yves-Alexandre De Montjoye, César A Hidalgo, Michel Verleysen, and Vincent D Blondel. 2013. Unique in the crowd: The privacy bounds of human mobility. Scientific reports, Vol. 3 (2013), 1376.
[16]
Petros Drineas, Michael W. Mahoney, S. Muthukrishnan, and Tamá s Sarló s. 2011. Faster least squares approximation. Numer. Math., Vol. 117, 2 (2011), 219--249. https://doi.org/10.1007/s00211-010-0331--6
[17]
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006 a. Our Data, Ourselves: Privacy Via Distributed Noise Generation. In Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques (Lecture Notes in Computer Science, Vol. 4004). 486--503. https://doi.org/10.1007/11761679_29
[18]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. 2006 b. Calibrating Noise to Sensitivity in Private Data Analysis. In 3rd Theory of Cryptography Conference, TCC (Lecture Notes in Computer Science, Vol. 3876). 265--284. https://doi.org/10.1007/11681878_14
[19]
Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, Vol. 9, 3--4 (2014), 211--407. https://doi.org/10.1561/0400000042
[20]
Differential Privacy Team Google. 2020. Secure Noise Generation . Technical Report. Google.
[21]
Moritz Hardt and Kunal Talwar. 2010. On the geometry of differential privacy. In 42nd Symposium on Theory of Computing, STOC, Leonard J. Schulman (Ed.). 705--714. https://doi.org/10.1145/1806689.1806786
[22]
Justin Hsu, Sanjeev Khanna, and Aaron Roth. 2012. Distributed Private Heavy Hitters. In Automata, Languages, and Programming, ICALP (Lecture Notes in Computer Science, Vol. 7391). 461--472. https://doi.org/10.1007/978--3--642--31594--7_39
[23]
Piotr Indyk. 2006. Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM, Vol. 53, 3 (2006), 307--323. https://doi.org/10.1145/1147954.1147955
[24]
Piotr Indyk and Rajeev Motwani. 1998. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In 30th Annual Symposium on the Theory of Computing, STOC. 604--613. https://doi.org/10.1145/276698.276876
[25]
T. S. Jayram and David P. Woodruff. 2011. Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Sub-Constant Error. In 22nd Symposium on Discrete Algorithms, SODA. 1--10. https://doi.org/10.1137/1.9781611973082.1
[26]
William B Johnson and Joram Lindenstrauss. 1984. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics, Vol. 26, 189--206 (1984), 1.
[27]
Daniel M. Kane, Raghu Meka, and Jelani Nelson. 2011. Almost Optimal Explicit Johnson-Lindenstrauss Families. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 14th International Workshop, APPROX and 15th International Workshop, RANDOM (Lecture Notes in Computer Science, Vol. 6845). 628--639. https://doi.org/10.1007/978--3--642--22935-0_53
[28]
Daniel M. Kane and Jelani Nelson. 2014. Sparser Johnson-Lindenstrauss Transforms. J. ACM, Vol. 61, 1 (2014), 4:1--4:23. https://doi.org/10.1145/2559902
[29]
Krishnaram Kenthapadi, Aleksandra Korolova, Ilya Mironov, and Nina Mishra. 2013. Privacy via the Johnson-Lindenstrauss Transform. J. Priv. Confidentiality, Vol. 5, 1 (2013). https://doi.org/10.29012/jpc.v5i1.625
[30]
Andrew McGregor, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan. 2010. The limits of two-party differential privacy. In 51st Annual Symposium on Foundations of Computer Science. 81--90.
[31]
Frank McSherry and Ilya Mironov. 2009. Differentially Private Recommender Systems: Building Privacy into the Netflix Prize Contenders. In 15th International Conference on Knowledge Discovery and Data Mining. 627--636. https://doi.org/10.1145/1557019.1557090
[32]
Darakhshan J. Mir, S. Muthukrishnan, Aleksandar Nikolov, and Rebecca N. Wright. 2010. Pan-private Algorithms: When Memory Does Not Help. CoRR, Vol. abs/1009.1544 (2010). arxiv: 1009.1544 http://arxiv.org/abs/1009.1544
[33]
Darakhshan J. Mir, S. Muthukrishnan, Aleksandar Nikolov, and Rebecca N. Wright. 2011. Pan-private algorithms via statistics on sketches. In 30th Symposium on Principles of Database Systems, PODS. 37--48. https://doi.org/10.1145/1989284.1989290
[34]
Ilya Mironov. 2012. On significance of the least significant bits for differential privacy. In Conference on Computer and Communications Security, CCS. 650--661. https://doi.org/10.1145/2382196.2382264
[35]
Ilya Mironov. 2017. Ré nyi Differential Privacy. In 30th Computer Security Foundations Symposium, CSF. IEEE Computer Society, 263--275. https://doi.org/10.1109/CSF.2017.11
[36]
Arvind Narayanan and Vitaly Shmatikov. 2008. Robust De-anonymization of Large Sparse Datasets. In Symposium on Security and Privacy . 111--125. https://doi.org/10.1109/SP.2008.33
[37]
Jelani Nelson and Huy L. Nguyen. 2013. OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings. In 54th Symposium on Foundations of Computer Science, FOCS. IEEE Computer Society, 117--126. https://doi.org/10.1109/FOCS.2013.21
[38]
Rasmus Pagh and Nina Mesing Stausholm. 2020. Efficient Differentially Private F(_mbox0 ) Linear Sketching. CoRR, Vol. abs/2001.11932 (2020). arxiv: 2001.11932 https://arxiv.org/abs/2001.11932
[39]
Wahbeh H. Qardaji, Weining Yang, and Ninghui Li. 2014. PriView: practical differentially private release of marginal contingency tables. In International Conference on Management of Data, SIGMOD. 1435--1446. https://doi.org/10.1145/2588555.2588575
[40]
Benjamin I. P. Rubinstein, Peter L. Bartlett, Ling Huang, and Nina Taft. 2012. Learning in a Large Function Space: Privacy-Preserving Mechanisms for SVM Learning. J. Priv. Confidentiality, Vol. 4, 1 (2012). https://doi.org/10.29012/jpc.v4i1.612
[41]
Daniel A. Spielman and Nikhil Srivastava. 2011. Graph Sparsification by Effective Resistances. SIAM J. Comput., Vol. 40, 6 (2011), 1913--1926. https://doi.org/10.1137/080734029
[42]
Latanya Sweeney. 2015. Only you, your doctor, and many others may know. Technology Science, Vol. 2015092903, 9 (2015), 29.
[43]
Pang-Ning Tan, Michael S. Steinbach, and Vipin Kumar. 2005. Introduction to Data Mining .Addison-Wesley. http://www-users.cs.umn.edu/%7Ekumar/dmbook/
[44]
Jalaj Upadhyay. 2014. Randomness efficient fast-Johnson-Lindenstrauss transform with applications in differential privacy and compressed sensing. arXiv preprint arXiv:1410.2470 (2014).
[45]
Stanley L. Warner. 1965. Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias. J. Amer. Statist. Assoc., Vol. 60, 309 (1965), 63--69. http://www.jstor.org/stable/2283137
[46]
David P. Woodruff. 2014. Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends in Theoretical Computer Science, Vol. 10, 1--2 (2014), 1--157. https://doi.org/10.1561/0400000060
[47]
Chugui Xu, Ju Ren, Yaoxue Zhang, Zhan Qin, and Kui Ren. 2017. DPPro: Differentially Private High-Dimensional Data Release via Random Projection. IEEE Transactions on Information Forensics and Security, Vol. 12, 12 (2017), 3081--3093. https://doi.org/10.1109/TIFS.2017.2737966
[48]
Jun Zhang, Graham Cormode, Cecilia M. Procopiuc, Divesh Srivastava, and Xiaokui Xiao. 2014. PrivBayes: private data release via bayesian networks. In International Conference on Management of Data, SIGMOD. 1423--1434. https://doi.org/10.1145/2588555.2588573

Cited By

View all
  • (2024)Perturb-and-projectProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692435(9161-9179)Online publication date: 21-Jul-2024
  • (2022)Improved utility analysis of private CountSketchProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3602128(25631-25643)Online publication date: 28-Nov-2022
  • (2022)A Survey on Differential Privacy for Unstructured Data ContentACM Computing Surveys10.1145/349023754:10s(1-28)Online publication date: 13-Sep-2022
  • Show More Cited By

Index Terms

  1. Improved Differentially Private Euclidean Distance Approximation

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
    June 2021
    440 pages
    ISBN:9781450383813
    DOI:10.1145/3452021
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 June 2021

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. differential privacy
    2. dimensionality reduction
    3. johnson-lindenstrauss

    Qualifiers

    • Research-article

    Funding Sources

    • Villum Foundation

    Conference

    SIGMOD/PODS '21
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 642 of 2,707 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)20
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 08 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Perturb-and-projectProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692435(9161-9179)Online publication date: 21-Jul-2024
    • (2022)Improved utility analysis of private CountSketchProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3602128(25631-25643)Online publication date: 28-Nov-2022
    • (2022)A Survey on Differential Privacy for Unstructured Data ContentACM Computing Surveys10.1145/349023754:10s(1-28)Online publication date: 13-Sep-2022
    • (2022)Differential Privacy from Locally Adjustable Graph Algorithms: k-Core Decomposition, Low Out-Degree Ordering, and Densest Subgraphs2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)10.1109/FOCS54457.2022.00077(754-765)Online publication date: Oct-2022

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media