Abstract
Recently, an increasing number of laws have come to govern the use of users’ private data. For example, Article 17 of the General Data Protection Regulation (GDPR), the right to be forgotten, requires machine learning applications to remove a portion of the data from a dataset and retrain the affected models if a user makes such a request. Furthermore, from a security perspective, the training data of machine learning models, which may contain users’ private information, should be effectively protected, including through appropriate erasure. To address these issues, researchers have proposed various privacy-preserving methods, collectively referred to as machine unlearning. This paper provides an in-depth review of the security and privacy concerns in machine learning models. First, we present how machine learning uses users’ private data in daily life and the role that the GDPR plays in this problem. Then, we introduce the concept of machine unlearning by describing the security threats to machine learning models and how users’ privacy can be protected from violation on machine learning platforms. As the core content of the paper, we introduce and analyze current machine unlearning approaches and several representative results, and discuss them in the context of data lineage. Finally, we discuss future research challenges in this field.
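To make the problem concrete, the sketch below illustrates the naive baseline that Article 17 implies and against which unlearning methods are commonly compared: exact deletion by retraining from scratch. This is a minimal illustration using scikit-learn, not code from any work reviewed here; the names `unlearn_by_retraining` and `forget_ids` are our own, and the toy data stands in for a dataset containing user records.

```python
# Minimal sketch of exact unlearning: drop the requested records,
# then refit the model on the remaining data from scratch.
import numpy as np
from sklearn.linear_model import LogisticRegression

def unlearn_by_retraining(X, y, forget_ids):
    """Remove the rows a user asked to forget and retrain from scratch."""
    keep = np.setdiff1d(np.arange(len(X)), forget_ids)  # indices to retain
    model = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
    return model, keep

# Toy data standing in for a dataset that contains user records.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# A user invokes the right to be forgotten for rows 10-19.
model, keep = unlearn_by_retraining(X, y, np.arange(10, 20))
print(f"accuracy on retained data: {model.score(X[keep], y[keep]):.3f}")
```

Retraining from scratch guarantees that the forgotten records leave no trace in the resulting model, but its cost grows with the size of the full dataset; the machine unlearning approaches reviewed in this paper aim to approximate this outcome at a fraction of the cost.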
Data availability
Interested parties can obtain the anonymized datasets that support the findings of this study from the corresponding author upon reasonable request.
Acknowledgements
This research was partially supported by the Japan Science and Technology Agency (JST) Strategic International Collaborative Research Program (SICORP). The first author was supported by JST SPRING, under Grant No. JPMJSP2136.
Funding
The work of Haibo Zhang was supported by the JST-Mirai Program under Grant No. JPMJSP2136.
Ethics declarations
Conflict of Interest
On behalf of all the authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, H., Nakamura, T., Isohara, T. et al. A Review on Machine Unlearning. SN COMPUT. SCI. 4, 337 (2023). https://doi.org/10.1007/s42979-023-01767-4
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s42979-023-01767-4