research-article
Open access

Parallelizing packet processing in container overlay networks

Published: 21 April 2021

Abstract

Container networking, which provides connectivity among containers on multiple hosts, is crucial to building and scaling container-based microservices. While overlay networks are widely adopted in production systems, they cause significant performance degradation in both throughput and latency compared to physical networks. This paper seeks to understand the bottlenecks of in-kernel networking when running container overlay networks. Through profiling and code analysis, we find that a prolonged data path, due to packet transformation in overlay networks, is the culprit of performance loss. Furthermore, existing scaling techniques in the Linux network stack are ineffective for parallelizing the prolonged data path of a single network flow.
We propose Falcon, a fast and balanced container networking approach to scale the packet processing pipeline in overlay networks. Falcon pipelines software interrupts associated with different network devices of a single flow on multiple cores, thereby preventing execution serialization of excessive software interrupts from overloading a single core. Falcon further supports multiple network flows by effectively multiplexing and balancing software interrupts of different flows among available cores. We have developed a prototype of Falcon in Linux. Our evaluation with both micro-benchmarks and real-world applications demonstrates the effectiveness of Falcon, with significantly improved performance (by 300% for web serving) and reduced tail latency (by 53% for data caching).
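
The contrast between per-flow steering and the per-device pipelining described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Falcon's kernel implementation: the stage names in `STAGES`, the functions `flow_only_core` and `pipelined_core`, and the "flow hash plus stage index" mapping are all hypothetical and chosen only to show how the stages of a single flow might be spread over several cores.

```python
# Toy model only: every name and the mapping scheme here are assumptions
# made for illustration; none of this is taken from the Falcon prototype.
from typing import List

# Hypothetical sequence of network devices one overlay packet traverses.
STAGES = ["physical NIC", "overlay tunnel device", "container veth"]


def flow_only_core(flow_hash: int, num_cores: int) -> int:
    """Per-flow steering: every stage of a flow is processed on one core."""
    return flow_hash % num_cores


def pipelined_core(flow_hash: int, stage: int, num_cores: int) -> int:
    """Device-aware steering: offset the core choice by the stage index,
    so consecutive stages of the same flow land on different cores
    whenever num_cores >= len(STAGES)."""
    return (flow_hash + stage) % num_cores


def placement(flow_hash: int, num_cores: int) -> List[int]:
    """Return the core chosen for each stage of a single flow."""
    return [pipelined_core(flow_hash, s, num_cores) for s in range(len(STAGES))]


if __name__ == "__main__":
    # With per-flow steering, all three stages of flow 5 share one core.
    print([flow_only_core(5, 4)] * len(STAGES))
    # With the pipelined mapping, the stages occupy three distinct cores.
    print(placement(5, 4))
```

In this model, different flows start from different base cores, which loosely mirrors the idea of multiplexing several flows over the available cores; an actual balancing policy would also need to account for per-core load.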

    Published In

    EuroSys '21: Proceedings of the Sixteenth European Conference on Computer Systems
    April 2021
    631 pages
    ISBN:9781450383349
    DOI:10.1145/3447786
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Conference

    EuroSys '21
    EuroSys '21: Sixteenth European Conference on Computer Systems
    April 26 - 28, 2021
    Online Event, United Kingdom

    Acceptance Rates

    EuroSys '21 Paper Acceptance Rate 38 of 181 submissions, 21%;
    Overall Acceptance Rate 241 of 1,308 submissions, 18%
