short-paper

Supporting High Performance Molecular Dynamics in Virtualized Clusters using IOMMU, SR-IOV, and GPUDirect

Authors:

Andrew J. Younge,

John Paul Walters,

Stephen P. Crago,

Geoffrey C. Fox

ACM SIGPLAN Notices, Volume 50, Issue 7

Pages 31 - 38

https://doi.org/10.1145/2817817.2731194

Published: 14 March 2015

Abstract

Cloud Infrastructure-as-a-Service paradigms have recently shown their utility for a vast array of computational problems, ranging from advanced web service architectures to high throughput computing. However, many scientific computing applications have been slow to adapt to virtualized cloud frameworks. This is due to performance impacts of virtualization technologies, coupled with the lack of advanced hardware support necessary for running many high performance scientific applications at scale.

By using KVM virtual machines that leverage both NVIDIA GPUs and InfiniBand, we show that molecular dynamics simulations with LAMMPS and HOOMD run at near-native speeds. This experiment also illustrates how virtualized environments can support the latest parallel computing paradigms, including both MPI+CUDA and new GPUDirect RDMA functionality. Specific findings show initial promise in scaling such applications to larger production deployments targeting large-scale computational workloads.

Index Terms

1. Computer systems organization
  1. Architectures

Published In

ACM SIGPLAN Notices, Volume 50, Issue 7

VEE '15

July 2015

221 pages

ISSN: 0362-1340

EISSN: 1558-1160

DOI: 10.1145/2817817

Editor: Andy Gill (University of Kansas, Lawrence, KS)

VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
March 2015
238 pages
ISBN: 9781450334501
DOI: 10.1145/2731186
General Chair: Ada Gavrilovska (Georgia Tech)
Program Chairs: Angela Demke Brown (University of Toronto), Bjarne Steensgaard (Microsoft)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
