short-paper

Supporting High Performance Molecular Dynamics in Virtualized Clusters using IOMMU, SR-IOV, and GPUDirect

Published: 14 March 2015 Publication History

Abstract

Cloud Infrastructure-as-a-Service paradigms have recently shown their utility for a vast array of computational problems, ranging from advanced web service architectures to high throughput computing. However, many scientific computing applications have been slow to adapt to virtualized cloud frameworks. This is due to performance impacts of virtualization technologies, coupled with the lack of advanced hardware support necessary for running many high performance scientific applications at scale.
By using KVM virtual machines that leverage both Nvidia GPUs and InfiniBand, we show that molecular dynamics simulations with LAMMPS and HOOMD run at near-native speeds. This experiment also illustrates how virtualized environments can support the latest parallel computing paradigms, including both MPI+CUDA and new GPUDirect RDMA functionality. Specific findings show initial promise in scaling of such applications to larger production deployments targeting large scale computational workloads.
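The abstract does not say how the GPUs and InfiniBand adapters are attached to the KVM guests. As an illustration only, one common route on KVM hosts is PCI passthrough declared through libvirt, which relies on the host IOMMU and, for InfiniBand, on an SR-IOV virtual function. The sketch below assumes libvirt is the management layer; the helper name `pci_hostdev_xml` and both PCI addresses are hypothetical placeholders, not values from the paper.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain: int, bus: int, slot: int, function: int) -> str:
    """Build a libvirt <hostdev> element assigning one host PCI function to a guest.

    Hypothetical helper for illustration; the paper does not describe its tooling.
    """
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(hostdev, "source")
    # libvirt expects the host PCI address as hexadecimal strings.
    ET.SubElement(
        source,
        "address",
        {
            "domain": f"0x{domain:04x}",
            "bus": f"0x{bus:02x}",
            "slot": f"0x{slot:02x}",
            "function": f"0x{function:x}",
        },
    )
    return ET.tostring(hostdev, encoding="unicode")


# Placeholder addresses (assumptions): a whole GPU, and one SR-IOV
# virtual function exposed by an InfiniBand adapter.
gpu_xml = pci_hostdev_xml(0x0000, 0x06, 0x00, 0x0)
ib_vf_xml = pci_hostdev_xml(0x0000, 0x21, 0x00, 0x1)

print(gpu_xml)
print(ib_vf_xml)
```

Each generated element would go inside the `<devices>` section of a guest definition. The real addresses depend on the host and can be read from `lspci` there.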

Cited By

  • (2022) Virtualizing GPU direct packet I/O on commodity Ethernet to accelerate GPU-NFV. Journal of Network and Computer Applications, 206 (103480). DOI: 10.1016/j.jnca.2022.103480. Online publication date: Oct-2022
  • (2019) GPU Accelerated Industrial Data Analysis in Private Cloud Environment. 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), (348-352). DOI: 10.1109/EIConRus.2019.8656751. Online publication date: Jan-2019
  • (2016) A user mode CPU-GPU scheduling framework for hybrid workloads. Future Generation Computer Systems, 63:C (25-36). DOI: 10.1016/j.future.2016.03.011. Online publication date: 1-Oct-2016

    Information & Contributors

    Information

    Published In

    ACM SIGPLAN Notices  Volume 50, Issue 7
    VEE '15
    July 2015
    221 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2817817
    Editor: Andy Gill

    VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
    March 2015
    238 pages
    ISBN:9781450334501
    DOI:10.1145/2731186
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 March 2015
    Published in SIGPLAN Volume 50, Issue 7

    Author Tags

    1. cloud computing
    2. gpudirect
    3. iaas
    4. iommu
    5. kvm
    6. molecular dynamics
    7. openstack
    8. sr-iov
    9. virtualization

    Qualifiers

    • Short-paper

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 14
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 18 Nov 2024

    Citations

    Cited By

    • (2016) Optimizations for High Performance Network Virtualization. Journal of Computer Science and Technology, 31:1 (107-116). DOI: 10.1007/s11390-016-1614-x. Online publication date: 8-Jan-2016
    • (2019) On the support of inter-node P2P GPU memory copies in rCUDA. Journal of Parallel and Distributed Computing, 127:C (28-43). DOI: 10.1016/j.jpdc.2018.12.011. Online publication date: 1-May-2019
    • (2019) Computational Drug Design Methods—Current and Future Perspectives. In Silico Drug Design, (19-44). DOI: 10.1016/B978-0-12-816125-8.00002-X. Online publication date: 2019
    • (2018) Performance Modeling towards Interrupt System of Virtualized Cryptography Device. 2018 Third International Conference on Security of Smart Cities, Industrial Control System and Communications (SSIC), (1-6). DOI: 10.1109/SSIC.2018.8556713. Online publication date: Oct-2018
    • (2017) GPU Virtualization and Scheduling Methods. ACM Computing Surveys, 50:3 (1-37). DOI: 10.1145/3068281. Online publication date: 29-Jun-2017
    • (2017) A Tale of Two Systems: Using Containers to Deploy HPC Applications on Supercomputers and Clouds. 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), (74-81). DOI: 10.1109/CloudCom.2017.40. Online publication date: Dec-2017
    • (2017) Enabling Diverse Software Stacks on Supercomputers Using High Performance Virtual Clusters. 2017 IEEE International Conference on Cluster Computing (CLUSTER), (310-321). DOI: 10.1109/CLUSTER.2017.92. Online publication date: Sep-2017