DOI: 10.1145/3627106.3627189
Research Article · Open Access

Mitigating Membership Inference Attacks via Weighted Smoothing

Published: 04 December 2023

Abstract

Recent advances in deep learning have spotlighted a critical privacy vulnerability: the membership inference attack (MIA), in which an adversary determines whether specific data was present in a model's training set, potentially revealing sensitive information. In this paper, we introduce weighted smoothing (WS), a technique for mitigating MIA risks. Our approach is anchored in the observation that training samples differ in their vulnerability to MIA, primarily according to their distance from clusters of similar samples; the intuition is that clusters make model predictions more confident and thereby increase MIA risk. WS therefore strategically adds noise to training samples depending on whether they lie near a cluster or are isolated. We evaluate WS against MIAs on multiple benchmark datasets and model architectures, demonstrating its effectiveness. We publish our code at https://github.com/BennyTMT/weighted-smoothing.
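
The abstract's core mechanism, noise whose magnitude depends on each sample's proximity to a cluster, can be made concrete with a short sketch. The Python/NumPy code below is an illustrative reading of that idea, not the authors' implementation (see the linked repository for that): the k-nearest-neighbour distance used as a density proxy, the linear weighting rule, and the sigma_max parameter are all assumptions made for this example.

import numpy as np

# Illustrative sketch of weighted smoothing (NOT the paper's implementation):
# samples sitting in dense clusters, which the abstract ties to higher MIA
# risk, receive the most Gaussian noise; isolated samples receive the least.
def weighted_smoothing(features, k=5, sigma_max=0.1, rng=None):
    # features : (n, d) array of training samples or their embeddings.
    # k        : neighbours used as a density proxy (assumed hyperparameter).
    # sigma_max: noise scale for the most clustered sample (assumed).
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise Euclidean distances; O(n^2) memory is fine for a sketch.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbours (column 0 is the self-distance).
    knn_dist = np.sort(dists, axis=1)[:, 1 : k + 1].mean(axis=1)
    # Small k-NN distance means a dense cluster: map it to a weight near 1.
    spread = knn_dist.max() - knn_dist.min()
    weights = 1.0 - (knn_dist - knn_dist.min()) / (spread + 1e-12)
    # Per-sample noise scale: clustered samples get the strongest smoothing.
    noise = rng.standard_normal(features.shape) * (sigma_max * weights)[:, None]
    return features + noise

# Toy usage: a tight cluster next to a scattered set of points.
x = np.concatenate([np.random.randn(100, 2) * 0.1, np.random.randn(100, 2) * 2.0])
x_smoothed = weighted_smoothing(x, k=5, sigma_max=0.2)

On this toy data the tightly clustered half receives noise close to sigma_max while the scattered half is left nearly untouched, matching the cluster-dependent weighting described in the abstract.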

Published In

ACSAC '23: Proceedings of the 39th Annual Computer Security Applications Conference, December 2023, 836 pages
ISBN: 9798400708862
DOI: 10.1145/3627106
Publisher: Association for Computing Machinery, New York, NY, United States
Overall Acceptance Rate: 104 of 497 submissions, 21%
This work is licensed under a Creative Commons Attribution 4.0 International License.
