CUDA Leaks: A Detailed Hack for CUDA and a (Partial) Fix

Published: 13 January 2016

Abstract

Graphics processing units (GPUs) are increasingly common on desktops, servers, and embedded platforms. In this article, we report on new security issues related to CUDA, which is the most widespread platform for GPU computing. In particular, details and proofs-of-concept are provided about novel vulnerabilities to which CUDA architectures are subject. We show how such vulnerabilities can be exploited to cause severe information leakage. As a case study, we experimentally show how to exploit one of these vulnerabilities on a GPU implementation of the AES encryption algorithm. Finally, we also suggest software patches and alternative approaches to tackle the presented vulnerabilities.


Published In

ACM Transactions on Embedded Computing Systems, Volume 15, Issue 1
February 2016
530 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/2872313
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 January 2016
Accepted: 01 July 2015
Revised: 01 March 2015
Received: 01 September 2014
Published in TECS Volume 15, Issue 1

Author Tags

  1. GPGPU
  2. GPU
  3. information leakage
  4. registers

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Prevention of and Fight against Crime Programme of the European Union, European Commission, Directorate-General Home Affairs
  • European Antitrust Forensic IT Tools project (ref. HOME/2012/ISEC/FP/C2/4000003977)
