
GuardNN: secure accelerator architecture for privacy-preserving deep learning

Published: 23 August 2022

Abstract

This paper proposes GuardNN, a secure DNN accelerator that provides hardware-based protection for user data and model parameters even in an untrusted environment. GuardNN shows that the architecture and protection can be customized for a specific application to provide strong confidentiality and integrity guarantees with negligible overhead. The design of the GuardNN instruction set reduces the TCB to just the accelerator and allows confidentiality protection even when the instructions from a host cannot be trusted. GuardNN minimizes the overhead of memory encryption and integrity verification by customizing the off-chip memory protection for the known memory access patterns of a DNN accelerator. GuardNN is prototyped on an FPGA, demonstrating effective confidentiality protection with ~3% performance overhead for inference.
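The abstract's key idea, customizing off-chip memory protection around the known, deterministic access pattern of a DNN accelerator, can be illustrated with a small software sketch. This is not the paper's hardware design: it is a minimal, hypothetical model in which the encryption counter is recomputed from the schedule (here from a `layer_id`, a per-layer `version`, and a buffer `offset`, all illustrative names) rather than stored off-chip, so no counter cache or integrity tree is needed. HMAC-SHA256 stands in for the AES engine a real design would use.

```python
import hmac
import hashlib


def _keystream(key: bytes, counter: bytes, nbytes: int) -> bytes:
    """Expand (key, counter) into a keystream, one hash block at a time."""
    out = b""
    block = 0
    while len(out) < nbytes:
        out += hmac.new(key, counter + block.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        block += 1
    return out[:nbytes]


def _counter(layer_id: int, version: int, offset: int) -> bytes:
    # The counter is a pure function of the accelerator's schedule,
    # so it is recomputed on every access, never stored in DRAM.
    return (layer_id.to_bytes(4, "big")
            + version.to_bytes(4, "big")
            + offset.to_bytes(8, "big"))


def protect(key: bytes, layer_id: int, version: int, offset: int,
            data: bytes):
    """Encrypt a tile and attach a truncated MAC before writing off-chip."""
    ctr = _counter(layer_id, version, offset)
    ks = _keystream(key, ctr, len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, ks))
    tag = hmac.new(key, ctr + ciphertext, hashlib.sha256).digest()[:8]
    return ciphertext, tag


def recover(key: bytes, layer_id: int, version: int, offset: int,
            ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the MAC and decrypt a tile read back from off-chip memory."""
    ctr = _counter(layer_id, version, offset)
    expect = hmac.new(key, ctr + ciphertext, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(tag, expect):
        raise ValueError("integrity check failed")
    ks = _keystream(key, ctr, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, ks))
```

Because the counter is derived from the schedule instead of fetched as per-block metadata, each access costs one decryption and one MAC check with no extra DRAM traffic, which is the intuition behind the ~3% overhead figure.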




Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
July 2022
1462 pages
ISBN:9781450391429
DOI:10.1145/3489517

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

  • Research-article

Conference

DAC '22: 59th ACM/IEEE Design Automation Conference
July 10-14, 2022
San Francisco, California, USA

Acceptance Rates

Overall acceptance rate: 1,770 of 5,499 submissions (32%)



Article Metrics

  • Downloads (last 12 months): 414
  • Downloads (last 6 weeks): 65
Reflects downloads up to 16 Dec 2024

Cited By

  • (2024) Machine Learning with Confidential Computing: A Systematization of Knowledge. ACM Computing Surveys 56(11): 1-40. https://doi.org/10.1145/3670007
  • (2024) TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment. Proceedings of the 32nd ACM International Conference on Multimedia, 3479-3488. https://doi.org/10.1145/3664647.3680786
  • (2024) GPU-based Private Information Retrieval for On-Device Machine Learning Inference. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 197-214. https://doi.org/10.1145/3617232.3624855
  • (2024) Co-designing Trusted Execution Environment and Model Encryption for Secure High-Performance DNN Inference on FPGAs. 2024 IEEE International Symposium on Circuits and Systems (ISCAS), 1-5. https://doi.org/10.1109/ISCAS58744.2024.10558579
  • (2024) sNPU: Trusted Execution Environments on Integrated NPUs. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 708-723. https://doi.org/10.1109/ISCA59077.2024.00057
  • (2024) Supporting Secure Multi-GPU Computing with Dynamic and Batched Metadata Management. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 204-217. https://doi.org/10.1109/HPCA57654.2024.00025
  • (2024) Securing AI Inference in the Cloud: Is CPU-GPU Confidential Computing Ready? 2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 164-175. https://doi.org/10.1109/CLOUD62652.2024.00028
  • (2023) Triton: Software-Defined Threat Model for Secure Multi-Tenant ML Inference Accelerators. Proceedings of the 12th International Workshop on Hardware and Architectural Support for Security and Privacy, 19-28. https://doi.org/10.1145/3623652.3623672
  • (2023) SecureLoop: Design Space Exploration of Secure DNN Accelerators. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 194-208. https://doi.org/10.1145/3613424.3614273
  • (2023) HuffDuff: Stealing Pruned DNNs from Sparse Accelerators. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, 385-399. https://doi.org/10.1145/3575693.3575738
