research-article

A Hybrid I/O Virtualization Framework for RDMA-capable Network Interfaces

Published: 14 March 2015

Abstract

RDMA-capable interconnects, providing ultra-low latency and high bandwidth, are increasingly being used in the context of distributed storage and data processing systems. However, the deployment of such systems in virtualized data centers is currently inhibited by the lack of a flexible and high-performance virtualization solution for RDMA network interfaces.
In this work, we present a hybrid virtualization architecture which builds upon the concept of separation of paths for control and data operations available in RDMA. With hybrid virtualization, RDMA control operations are virtualized using hypervisor involvement, while data operations are set up to bypass the hypervisor completely. We describe HyV (Hybrid Virtualization), a virtualization framework for RDMA devices implementing such a hybrid architecture. In the paper, we provide a detailed evaluation of HyV for different RDMA technologies and operations. We further demonstrate the advantages of HyV in the context of a real distributed system by running RAMCloud on a set of HyV-enabled virtual machines deployed across a 6-node RDMA cluster. All of the performance results we obtained illustrate that hybrid virtualization enables bare-metal RDMA performance inside virtual machines while retaining the flexibility typically associated with paravirtualization.
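
The split between control and data paths described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not HyV's implementation: the `Hypervisor` and `Guest` classes, the `create_queue` and `post_send` methods, and the use of a `deque` as a stand-in for a device queue are all hypothetical names chosen for illustration. The point it shows is that setup goes through the hypervisor once, while later data operations never touch it.

```python
from collections import deque


class Hypervisor:
    """Hypothetical stand-in for the host side; it only sees control operations."""

    def __init__(self):
        self.control_calls = 0

    def create_queue(self):
        # Control path: resource setup goes through the hypervisor.
        self.control_calls += 1
        return deque()  # models a device queue made directly accessible to the guest


class Guest:
    """Hypothetical guest that sets up a queue once and then posts work directly."""

    def __init__(self, hypervisor):
        self.queue = hypervisor.create_queue()  # one control operation

    def post_send(self, payload):
        # Data path: the guest appends to its queue without any hypervisor call.
        self.queue.append(payload)


hv = Hypervisor()
guest = Guest(hv)
for i in range(1000):
    guest.post_send(f"msg-{i}")

# One control call for setup, 1000 data operations that bypassed the hypervisor.
print(hv.control_calls, len(guest.queue))  # prints "1 1000"
```

In this model the hypervisor's involvement is constant regardless of how many data operations are issued, which is the property the abstract attributes to hybrid virtualization.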

    Information & Contributors

    Information

    Published In

    ACM SIGPLAN Notices, Volume 50, Issue 7
    VEE '15
    July 2015
    221 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2817817
    Editor: Andy Gill

    VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
    March 2015
    238 pages
    ISBN: 9781450334501
    DOI: 10.1145/2731186

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 March 2015
    Published in SIGPLAN Volume 50, Issue 7

    Author Tags

    1. rdma
    2. virtualization

    Qualifiers

    • Research-article

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 44
    • Downloads (Last 6 weeks): 5
    Reflects downloads up to 13 Feb 2025

    Citations

    Cited By

    • (2024) Feasibility study for containerization of parallel computing applications. Third International Conference on Electronic Information Engineering, Big Data, and Computer Technology (EIBDCT 2024). DOI: 10.1117/12.3031057 (71). Online publication date: 19-Jul-2024
    • (2024) Empowering Cloud Computing With Network Acceleration: A Survey. IEEE Communications Surveys & Tutorials 26:4 (2729-2768). DOI: 10.1109/COMST.2024.3377531. Online publication date: 1-Oct-2024
    • (2021) Sova: A Software-Defined Autonomic Framework for Virtual Network Allocations. IEEE Transactions on Parallel and Distributed Systems 32:1 (116-130). DOI: 10.1109/TPDS.2020.3012146. Online publication date: 1-Jan-2021
    • (2020) MasQ. Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication (1-14). DOI: 10.1145/3387514.3405849. Online publication date: 30-Jul-2020
    • (2020) Extending the Control Plane of Container Orchestrators for I/O Virtualization. 2020 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC) (1-7). DOI: 10.1109/CANOPIEHPC51917.2020.00006. Online publication date: Nov-2020
    • (2019) A Virtual Network Resource Allocation Framework Based on SR-IOV. Applied Sciences 9:1 (137). DOI: 10.3390/app9010137. Online publication date: 2-Jan-2019
    • (2019) Review of RDMA-enabled Consensus Protocols. 2019 International Symposium on Networks, Computers and Communications (ISNCC) (1-4). DOI: 10.1109/ISNCC.2019.8909103. Online publication date: Jun-2019
    • (2019) ZCopy-Vhost: Replacing Data Copy With Page Remapping in Virtual Packet I/O. IEEE Access 7 (51047-51057). DOI: 10.1109/ACCESS.2019.2911905. Online publication date: 2019
    • (2017) Host managed contention avoidance storage solutions for Big Data. Journal of Big Data 4:1. DOI: 10.1186/s40537-017-0080-9. Online publication date: 19-Jun-2017
    • (2017) Lightweight and Generic RDMA Engine Para-Virtualization for the KVM Hypervisor. 2017 International Conference on High Performance Computing & Simulation (HPCS) (737-744). DOI: 10.1109/HPCS.2017.112. Online publication date: Jul-2017
