research-article

GPU virtualization for high performance general purpose computing on the ESX hypervisor

Authors:

Hari Sivaraman,

Rishi Bidarkar

HPC '14: Proceedings of the High Performance Computing Symposium

Article No.: 2, Pages 1 - 8

Published: 13 April 2014

Abstract

Graphics Processing Units (GPUs) have become important components in high performance computing (HPC) systems because of their massively parallel computing capability and energy efficiency. Virtualization technologies are increasingly applied to HPC to reduce administration costs and improve system utilization. However, virtualizing the GPU to support general purpose computing presents many challenges because of the complexity of the device. On VMware's ESX hypervisor, DirectPath I/O can give virtual machines (VMs) high performance access to physical GPUs. However, this technology does not allow GPUs to be multiplexed and shared among VMs, and it is not compatible with vMotion, VMware's technology for transparently migrating VMs among hosts within a cluster. In this paper, we address these issues by implementing a solution that uses "remote API execution" and takes advantage of DirectPath I/O to enable general purpose GPU computing on ESX. This solution, named vmCUDA, allows CUDA applications running concurrently in multiple VMs on ESX to share GPU(s). Our solution requires neither recompiling nor editing the source code of CUDA applications. Our performance evaluation shows that vmCUDA introduces an overhead of 0.6% - 3.5% for applications with moderate data sizes and 14% - 20% for those with large data (e.g., 12.5 GB - 237.5 GB in our experiments).
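The general idea behind "remote API execution" can be illustrated with a small sketch. It is not vmCUDA's implementation and uses no CUDA calls: the `Backend` and `Frontend` classes, the four toy calls (`alloc`, `copy_in`, `scale`, `copy_out`), and the JSON message format are all assumptions made for illustration. A guest-side stub serializes each call and hands it to a transport, and a single back end that owns the device state executes the calls on behalf of several guests.

```python
import json


class Backend:
    """Hypothetical stand-in for the component that owns the physical device."""

    # Only these names may be invoked remotely.
    EXPORTED = {"alloc", "copy_in", "scale", "copy_out"}

    def __init__(self):
        self._buffers = {}       # handle -> list of floats ("device memory")
        self._next_handle = 1

    def alloc(self, n):
        handle = self._next_handle
        self._next_handle += 1
        self._buffers[handle] = [0.0] * n
        return handle

    def copy_in(self, handle, values):
        buf = self._buffers[handle]
        if len(values) > len(buf):
            raise ValueError("copy exceeds buffer size")
        buf[:len(values)] = values
        return len(values)

    def scale(self, handle, factor):
        # Stands in for launching a kernel on the device.
        self._buffers[handle] = [x * factor for x in self._buffers[handle]]

    def copy_out(self, handle):
        return list(self._buffers[handle])

    def handle_request(self, message):
        """Decode one serialized call, run it, and serialize the result."""
        request = json.loads(message)
        if request["call"] not in self.EXPORTED:
            raise ValueError("unknown call: " + request["call"])
        result = getattr(self, request["call"])(*request["args"])
        return json.dumps({"result": result})


class Frontend:
    """Hypothetical guest-side stub exposing the same calls as the back end."""

    def __init__(self, transport):
        self._transport = transport  # any callable mapping str -> str

    def _remote(self, call, *args):
        # Serialize the call, send it, and decode the reply.
        reply = self._transport(json.dumps({"call": call, "args": list(args)}))
        return json.loads(reply)["result"]

    def alloc(self, n):
        return self._remote("alloc", n)

    def copy_in(self, handle, values):
        return self._remote("copy_in", handle, values)

    def scale(self, handle, factor):
        return self._remote("scale", handle, factor)

    def copy_out(self, handle):
        return self._remote("copy_out", handle)


# Two "guests" share one back end; the transport here is a direct function call.
backend = Backend()
guest_a = Frontend(backend.handle_request)
guest_b = Frontend(backend.handle_request)

handle_a = guest_a.alloc(3)
handle_b = guest_b.alloc(2)
guest_a.copy_in(handle_a, [1.0, 2.0, 3.0])
guest_a.scale(handle_a, 2.0)
result_a = guest_a.copy_out(handle_a)
```

Because the stub keeps the same call names as the back end, application code written against those calls does not change. In this sketch the transport is an in-process function call; a real system would replace it with some channel between the guest and the component that holds the device.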


Information

Published In

HPC '14: Proceedings of the High Performance Computing Symposium

April 2014

201 pages

Sponsors

(SCS): The Society for Modeling and Simulation International

In-Cooperation

SIGSIM: ACM Special Interest Group on Simulation and Modeling

Publisher

Society for Computer Simulation International

San Diego, CA, United States

Conference

SpringSim '14: 2014 Spring Simulation Multiconference

April 13 - 16, 2014

Tampa, Florida
