research-article

Duet: cloud scale load balancing with hardware and software

Published: 17 August 2014

Abstract

Load balancing is a foundational function of datacenter infrastructures and is critical to the performance of online services hosted in datacenters. As the demand for cloud services grows, expensive and hard-to-scale dedicated hardware load balancers are being replaced with software load balancers that scale using a distributed data plane running on commodity servers. Software load balancers offer low cost, high availability, and high flexibility, but suffer from high latency and low capacity per load balancer, making them less than ideal for applications that demand high throughput, low latency, or both. In this paper, we present Duet, which offers all the benefits of a software load balancer, along with low latency and high availability, at next to no cost. We do this by exploiting a hitherto overlooked resource in data center networks: the switches themselves. We show how to embed the load balancing functionality into existing hardware switches, thereby achieving organic scalability at no extra cost. For flexibility and high availability, Duet seamlessly integrates the switch-based load balancer with a small deployment of software load balancers. We enumerate and solve several architectural and algorithmic challenges involved in building such a hybrid load balancer. We evaluate Duet using a prototype implementation, as well as extensive simulations driven by traces from our production data centers. Our evaluation shows that Duet provides 10x more capacity than a software load balancer at a fraction of the cost, reduces latency by a factor of 10 or more, and adapts quickly to network dynamics, including failures.

    Information & Contributors

    Information

    Published In

    ACM SIGCOMM Computer Communication Review  Volume 44, Issue 4
    SIGCOMM'14
    October 2014
    672 pages
    ISSN:0146-4833
    DOI:10.1145/2740070
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SDN
    2. datacenter
    3. load balancing

    Qualifiers

    • Research-article

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 240
    • Downloads (Last 6 weeks): 32
    Reflects downloads up to 14 Dec 2024

    Citations

    Cited By

    • (2024) Unraveling Physical Space Limits for LEO Network Scalability. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks (43-51). DOI: 10.1145/3696348.3696885. Online publication date: 18-Nov-2024
    • (2024) RD-Probe: Scalable Monitoring With Sufficient Coverage In Complex Datacenter Networks. Proceedings of the ACM SIGCOMM 2024 Conference (258-273). DOI: 10.1145/3651890.3672256. Online publication date: 4-Aug-2024
    • (2024) SFCache: Hybrid NF Synthesization in Runtime With Rule-Caching in Programmable Switches. IEEE Transactions on Network and Service Management 21:4 (4613-4624). DOI: 10.1109/TNSM.2024.3390140. Online publication date: 1-Aug-2024
    • (2024) Hermes: Low-Overhead Inter-Switch Coordination in Network-Wide Data Plane Program Deployment. IEEE/ACM Transactions on Networking 32:4 (2842-2857). DOI: 10.1109/TNET.2024.3361324. Online publication date: Aug-2024
    • (2024) BCLB: A Scalable and Cooperative Layer-4 Load Balancer for Data Centers. 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS) (993-1003). DOI: 10.1109/ICDCS60910.2024.00096. Online publication date: 23-Jul-2024
    • (2024) Performance evaluation of containers for low-latency packet processing in virtualized network environments. Performance Evaluation 166 (102442). DOI: 10.1016/j.peva.2024.102442. Online publication date: Nov-2024
    • (2023) Yuz: Improving Performance of Cluster-Based Services by Near-L4 Session-Persistent Load Balancing. IEEE Transactions on Network and Service Management 21:2 (1929-1942). DOI: 10.1109/TNSM.2023.3341964. Online publication date: 12-Dec-2023
    • (2023) Distributed Dispatching in the Parallel Server Model. IEEE/ACM Transactions on Networking 31:4 (1521-1534). DOI: 10.1109/TNET.2022.3220931. Online publication date: Aug-2023
    • (2023) NetHCF: Filtering Spoofed IP Traffic With Programmable Switches. IEEE Transactions on Dependable and Secure Computing 20:2 (1641-1655). DOI: 10.1109/TDSC.2022.3161015. Online publication date: 1-Mar-2023
    • (2023) On The Protection of A High Performance Load Balancer Against SYN Attacks. IEEE Transactions on Cloud Computing (1-14). DOI: 10.1109/TCC.2023.3234122. Online publication date: 2023
