Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3576915.3623142acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article
Open access

Geometry of Sensitivity: Twice Sampling and Hybrid Clipping in Differential Privacy with Optimal Gaussian Noise and Application to Deep Learning

Published: 21 November 2023 Publication History

Abstract

We study the fundamental problem of the construction of optimal randomization in Differential Privacy (DP). Depending on the clipping strategy or additional properties of the processing function, the corresponding sensitivity set theoretically determines the necessary randomization to produce the required security parameters. Towards the optimal utility-privacy tradeoff, finding the minimal perturbation for properly-selected sensitivity sets stands as a central problem in DP research. In practice, l2/l1-norm clippings with Gaussian/Laplace noise mechanisms are among the most common setups. However, they also suffer from the curse of dimensionality. For more generic clipping strategies, the understanding of the optimal noise for a high-dimensional sensitivity set remains limited. This raises challenges in mitigating the worst-case dimension dependence in privacy-preserving randomization, especially for deep learning applications.
In this paper, we revisit the geometry of high-dimensional sensitivity sets and present a series of results to characterize the non-asymptotically optimal Gaussian noise for Rényi DP (RDP). Our results are both negative and positive: on one hand, we show the curse of dimensionality is tight for a broad class of sensitivity sets satisfying certain symmetry properties; but if, fortunately, the representation of the sensitivity set is asymmetric on some group of orthogonal bases, we show the optimal noise bounds need not be explicitly dependent on either dimension or rank. We also revisit sampling in the high-dimensional scenario, which is the key for both privacy amplification and computation efficiency in large-scale data processing. We propose a novel method, termed twice sampling, which implements both sample-wise and coordinate-wise sampling, to enable Gaussian noises to fit the sensitivity geometry more closely. With closed-form RDP analysis, we prove twice sampling produces asymptotic improvement of the privacy amplification given an additional l∞ -norm restriction, especially for small sampling rate. We also provide concrete applications of our results on practical tasks. Through tighter privacy analysis combined with twice sampling, we efficiently train ResNet22 in low sampling rate on CIFAR10, and achieve 69.7% and 81.6% test accuracy with (ε=2,δ=10-5) and (ε=8,δ=10-5) DP guarantee, respectively.

References

[1]
Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. 308--318.
[2]
Hervé Abdi and Lynne J Williams. 2010. Principal component analysis. Wiley interdisciplinary reviews: computational statistics, Vol. 2, 4 (2010), 433--459.
[3]
Jordan Awan and Aleksandra Slavković. 2021. Structure and sensitivity in differential privacy: Comparing k-norm mechanisms. J. Amer. Statist. Assoc., Vol. 116, 534 (2021), 935--954.
[4]
Borja Balle, Gilles Barthe, and Marco Gaboardi. 2018. Privacy amplification by subsampling: Tight analyses via couplings and divergences. Advances in Neural Information Processing Systems, Vol. 31 (2018).
[5]
Raef Bassily, Adam Smith, and Abhradeep Thakurta. 2014. Private empirical risk minimization: Efficient algorithms and tight error bounds. In 2014 IEEE 55th annual symposium on foundations of computer science. IEEE, 464--473.
[6]
Stephen Boyd, Stephen P Boyd, and Lieven Vandenberghe. 2004. Convex optimization. Cambridge university press.
[7]
Mark Bun, Jonathan Ullman, and Salil Vadhan. 2014. Fingerprinting codes and the price of approximate differential privacy. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing. 1--10.
[8]
Mirza Cilimkovic. 2015. Neural networks and back propagation algorithm. Institute of Technology Blanchardstown, Blanchardstown Road North Dublin, Vol. 15, 1 (2015).
[9]
Anindya De. 2012. Lower bounds in differential privacy. In Theory of Cryptography: 9th Theory of Cryptography Conference, TCC 2012, Taormina, Sicily, Italy, March 19-21, 2012. Proceedings 9. Springer, 321--338.
[10]
Soham De, Leonard Berrada, Jamie Hayes, Samuel L Smith, and Borja Balle. 2022. Unlocking high-accuracy differentially private image classification through scale. arXiv preprint arXiv:2204.13650 (2022).
[11]
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition. Ieee, 248--255.
[12]
Jinshuo Dong, Aaron Roth, and Weijie J Su. 2022. Gaussian differential privacy. Journal of the Royal Statistical Society Series B: Statistical Methodology, Vol. 84, 1 (2022), 3--37.
[13]
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006 a. Our data, ourselves: Privacy via distributed noise generation. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 486--503.
[14]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006 b. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference. Springer, 265--284.
[15]
Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., Vol. 9, 3--4 (2014), 211--407.
[16]
Cynthia Dwork, Guy N Rothblum, and Salil Vadhan. 2010. Boosting and differential privacy. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science. IEEE, 51--60.
[17]
Quan Geng and Pramod Viswanath. 2015a. The optimal noise-adding mechanism in differential privacy. IEEE Transactions on Information Theory, Vol. 62, 2 (2015), 925--951.
[18]
Quan Geng and Pramod Viswanath. 2015b. Optimal noise adding mechanisms for approximate differential privacy. IEEE Transactions on Information Theory, Vol. 62, 2 (2015), 952--969.
[19]
Manuel Gil, Fady Alajaji, and Tamas Linder. 2013. Rényi divergence measures for commonly used univariate continuous distributions. Information Sciences, Vol. 249 (2013), 124--131.
[20]
Moritz Hardt and Kunal Talwar. 2010a. On the geometry of differential privacy. In Proceedings of the forty-second ACM symposium on Theory of computing. 705--714.
[21]
Moritz Hardt and Kunal Talwar. 2010b. On the geometry of differential privacy. In Proceedings of the forty-second ACM symposium on Theory of computing. 705--714.
[22]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
[23]
Lijie Hu, Shuo Ni, Hanshen Xiao, and Di Wang. 2022. High dimensional differentially private stochastic optimization with heavy-tailed data. In Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 227--236.
[24]
Michel Journée, Yurii Nesterov, Peter Richtárik, and Rodolphe Sepulchre. 2010. Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, Vol. 11, 2 (2010).
[25]
Peter Kairouz, Monica Ribero Diaz, Keith Rush, and Abhradeep Thakurta. 2021. (Nearly) Dimension Independent Private ERM with AdaGrad Rates via Publicly Estimated Subspaces. In Conference on Learning Theory. PMLR, 2717--2746.
[26]
Alex Krizhevsky, Geoffrey Hinton, et al. 2009. Learning multiple layers of features from tiny images. (2009).
[27]
Ninghui Li, Wahbeh Qardaji, and Dong Su. 2012. On sampling, anonymization, and differential privacy or, k-anonymization meets differential privacy. In Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security. 32--33.
[28]
László Lovász and Miklós Simonovits. 1993. Random walks in a convex body and an improved volume algorithm. Random structures & algorithms, Vol. 4, 4 (1993), 359--412.
[29]
Zelun Luo, Daniel J Wu, Ehsan Adeli, and Li Fei-Fei. 2021. Scalable differential privacy with sparse network finetuning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5059--5068.
[30]
H Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, and Peter Kairouz. 2018a. A general approach to adding differential privacy to iterative training procedures. arXiv preprint arXiv:1812.06210 (2018).
[31]
H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. 2018b. Learning Differentially Private Recurrent Language Models. In International Conference on Learning Representations.
[32]
Ilya Mironov. 2017. Rényi differential privacy. In 2017 IEEE 30th computer security foundations symposium (CSF). IEEE, 263--275.
[33]
Ilya Mironov, Kunal Talwar, and Li Zhang. 2019. R\'enyi differential privacy of the sampled gaussian mechanism. arXiv preprint arXiv:1908.10530 (2019).
[34]
Shanmugavelayutham Muthukrishnan and Aleksandar Nikolov. 2012. Optimal private halfspace counting via discrepancy. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing. 1285--1292.
[35]
Aleksandar Nikolov, Kunal Talwar, and Li Zhang. 2013. The geometry of differential privacy: the sparse and approximate cases. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing. 351--360.
[36]
OpenAI. 2023. GPT-4 Technical Report. arxiv: 2303.08774 [cs.CL]
[37]
Nicolas Papernot, Steve Chien, Shuang Song, Abhradeep Thakurta, and Ulfar Erlingsson. 2020. Making the shoe fit: Architectures, initializations, and tuning for learning with privacy. (2020).
[38]
Shuang Song, Thomas Steinke, Om Thakkar, and Abhradeep Thakurta. 2021. Evading the curse of dimensionality in unconstrained private glms. In International Conference on Artificial Intelligence and Statistics. PMLR, 2638--2646.
[39]
Tim Van Erven and Peter Harremos. 2014. Rényi divergence and Kullback-Leibler divergence. IEEE Transactions on Information Theory, Vol. 60, 7 (2014), 3797--3820.
[40]
Di Wang, Hanshen Xiao, Srinivas Devadas, and Jinhui Xu. 2020. On differentially private stochastic convex optimization with heavy-tailed data. In International Conference on Machine Learning. PMLR, 10081--10091.
[41]
Yu-Xiang Wang, Borja Balle, and Shiva Prasad Kasiviswanathan. 2019. Subsampled rényi differential privacy and analytical moments accountant. In The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 1226--1235.
[42]
Hanshen Xiao, Jun Wan, and Srinivas Devadas. 2023 a. Geometry of Sensitivity: Twice Sampling and Hybrid Clipping in Differential Privacy with Optimal Gaussian Noise and Application to Deep Learning. arXiv preprint arXiv:2309.02672 (2023).
[43]
Hanshen Xiao, Zihang Xiang, Di Wang, and Srinivas Devadas. 2023 b. A Theory to Instruct Differentially-Private Learning via Clipping Bias Reduction. In 2023 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2170--2189.
[44]
Xiaokui Xiao and Yufei Tao. 2008. Output perturbation with query relaxation. Proceedings of the VLDB Endowment, Vol. 1, 1 (2008), 857--869.
[45]
Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, and Tie-Yan Liu. 2021. Large scale private learning via low-rank reparametrization. In International Conference on Machine Learning. PMLR, 12208--12218.
[46]
Huanyu Zhang, Ilya Mironov, and Meisam Hejazinia. 2021. Wide network learning with differential privacy. arXiv preprint arXiv:2103.01294 (2021).
[47]
Yingxue Zhou, Steven Wu, and Arindam Banerjee. 2021. Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification. In International Conference on Learning Representations.
[48]
Junyi Zhu and Matthew Blaschko. 2021. Differentially Private SGD with Sparse Gradients. arXiv preprint arXiv:2112.00845 (2021).
[49]
Yuqing Zhu and Yu-Xiang Wang. 2019. Poission subsampled rényi differential privacy. In International Conference on Machine Learning. PMLR, 7634--7642.

Cited By

View all
  • (2024)Enhancing Data Utility in Personalized Differential Privacy: A Fine-Grained Processing ApproachData Security and Privacy Protection10.1007/978-981-97-8546-9_3(47-66)Online publication date: 25-Oct-2024

Index Terms

  1. Geometry of Sensitivity: Twice Sampling and Hybrid Clipping in Differential Privacy with Optimal Gaussian Noise and Application to Deep Learning

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
    November 2023
    3722 pages
    ISBN:9798400700507
    DOI:10.1145/3576915
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 November 2023

    Check for updates

    Author Tags

    1. clipping
    2. deep learning
    3. differential privacy
    4. privacy amplification
    5. sensitivity geometry
    6. twice sampling

    Qualifiers

    • Research-article

    Conference

    CCS '23
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Upcoming Conference

    CCS '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)660
    • Downloads (Last 6 weeks)82
    Reflects downloads up to 21 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Enhancing Data Utility in Personalized Differential Privacy: A Fine-Grained Processing ApproachData Security and Privacy Protection10.1007/978-981-97-8546-9_3(47-66)Online publication date: 25-Oct-2024

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media