DOI: 10.1145/3555776.3578591

SOTERIA: Preserving Privacy in Distributed Machine Learning

Published: 07 June 2023

Abstract

We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g., Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed inside trusted enclaves, we introduce a hybrid scheme that combines computation done inside and outside these enclaves. Our experimental evaluation shows that this approach reduces the runtime of ML algorithms by up to 41% when compared to previous related work. The protocol is accompanied by a security proof, as well as a discussion of its resilience against a wide spectrum of ML attacks.
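
To make the hybrid scheme concrete, the sketch below splits one training step between a simulated enclave boundary and untrusted code. It is a minimal toy illustration, not Soteria's implementation: the enclave_call helper, the linear-model gradient, and the assumption that averaged gradients may leave the enclave are choices made for this example only.

# Illustrative sketch (not the Soteria implementation): computation that touches
# plaintext training data runs inside a (here simulated) enclave, while computation
# on values assumed safe to expose runs outside of it.

from typing import Callable, List

def enclave_call(fn: Callable, *args):
    """Hypothetical enclave dispatcher; a real system would enter an Intel SGX
    enclave via an SGX runtime. Here it is just a plain function call."""
    return fn(*args)

def sensitive_gradient(weights: List[float], record: List[float], label: float) -> List[float]:
    """Runs INSIDE the enclave: touches plaintext features and labels."""
    prediction = sum(w * x for w, x in zip(weights, record))
    error = prediction - label
    return [error * x for x in record]

def aggregate(gradients: List[List[float]]) -> List[float]:
    """Runs OUTSIDE the enclave: averages per-record gradients, which this toy
    example assumes may be exposed (or are protected by other means)."""
    n = len(gradients)
    return [sum(g[i] for g in gradients) / n for i in range(len(gradients[0]))]

def training_step(weights: List[float], batch: List[List[float]], labels: List[float], lr: float = 0.1) -> List[float]:
    # Sensitive per-record work is routed through the enclave boundary...
    grads = [enclave_call(sensitive_gradient, weights, x, y) for x, y in zip(batch, labels)]
    # ...while aggregation and the weight update stay outside of it.
    avg = aggregate(grads)
    return [w - lr * g for w, g in zip(weights, avg)]

if __name__ == "__main__":
    weights = [0.0, 0.0]
    batch = [[1.0, 2.0], [2.0, 1.0]]
    labels = [1.0, 0.0]
    print(training_step(weights, batch, labels))

In the paper's setting, which builds on Apache Spark (see the author tags below), such a split would presumably apply at the level of distributed data-processing operations rather than single function calls as above.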

Cited By

  • (2024) A Review on Privacy Enhanced Distributed ML Against Poisoning Attacks. In: AI Applications in Cyber Security and Communication Networks, pp. 173-186. https://doi.org/10.1007/978-981-97-3973-8_11. Online publication date: 18-Sep-2024.
  • (2023) Privacy-Preserving Machine Learning on Apache Spark. IEEE Access, vol. 11, pp. 127907-127930. https://doi.org/10.1109/ACCESS.2023.3332222. Online publication date: 2023.

    Published In

    SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing
    March 2023
    1932 pages
    ISBN: 9781450395175
    DOI: 10.1145/3555776

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. apache spark
    2. machine learning
    3. Intel SGX
    4. privacy-preserving

    Qualifiers

    • Research-article

    Conference

    SAC '23

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 66
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 30 Sep 2024
