DOI: 10.1145/3229584.3229586
research-article
Public Access

Catching the Microburst Culprits with Snappy

Published: 07 August 2018

Abstract

Short-lived traffic surges, known as microbursts, can cause periods of unexpectedly high packet delay and loss on a link. Today, preventing microbursts requires deploying switches with larger packet buffers (incurring higher cost) or running the network at low utilization (sacrificing efficiency). Instead, we argue that switches should detect microbursts as they form, and take corrective action before the situation gets worse. This requires an efficient way for switches to identify the particular flows responsible for a microburst, and handle them automatically (e.g., by pacing, marking, or rerouting the packets). However, collecting fine-grained statistics about queue occupancy in real time is challenging, even with emerging programmable data planes. We present Snappy, which identifies the flows responsible for a microburst in real time. Snappy maintains multiple snapshots of the occupants of the queue over time, where each snapshot is a compact data structure that makes efficient use of data-plane memory. As each new packet arrives, Snappy updates one snapshot and also estimates the fraction of the queue occupied by the associated flow. Our simulations with data-center packet traces show that Snappy can target the flows responsible for microbursts at the sub-millisecond level.
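
The snapshot mechanism described in the abstract can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the paper's data-plane implementation: the class name `SnapshotEstimator`, the parameters `num_snapshots` and `window_bytes`, the round-robin reuse of snapshots, and the exact per-flow byte counters (standing in for a compact data structure) are all hypothetical choices made for illustration. It also assumes a FIFO queue whose occupancy in bytes (`queue_bytes`, including the arriving packet) is known at arrival time.

```python
from collections import Counter


class SnapshotEstimator:
    """Toy model: byte-window snapshots that approximate a FIFO queue's contents."""

    def __init__(self, num_snapshots=4, window_bytes=3000):
        # Assumption: each snapshot covers roughly window_bytes of arrivals.
        self.window_bytes = window_bytes
        self.snapshots = [Counter() for _ in range(num_snapshots)]
        self.totals = [0] * num_snapshots  # bytes recorded in each snapshot
        self.current = 0                   # index of the snapshot being written

    def on_packet(self, flow_id, size, queue_bytes):
        """Record one arrival and return the estimated fraction of the
        queue (0.0 to 1.0) occupied by flow_id."""
        # Once the current window is full, move on and overwrite the oldest snapshot.
        if self.totals[self.current] >= self.window_bytes:
            self.current = (self.current + 1) % len(self.snapshots)
            self.snapshots[self.current].clear()
            self.totals[self.current] = 0

        # Update exactly one snapshot per packet.
        self.snapshots[self.current][flow_id] += size
        self.totals[self.current] += size

        if queue_bytes <= 0:
            return 0.0

        # In a FIFO queue the occupants are the most recent arrivals, so walk
        # from the newest snapshot backwards until the queue depth is covered.
        covered = 0
        flow_bytes = 0
        n = len(self.snapshots)
        for age in range(n):
            idx = (self.current - age) % n
            flow_bytes += self.snapshots[idx][flow_id]
            covered += self.totals[idx]
            if covered >= queue_bytes:
                break

        # Whole snapshots are counted, so the estimate can overshoot; clamp it.
        return min(1.0, flow_bytes / queue_bytes)


estimator = SnapshotEstimator(num_snapshots=4, window_bytes=3000)
share = estimator.on_packet("flow-A", 1500, queue_bytes=1500)
```

In such a model, a switch could compare the returned share against a threshold to decide whether to pace, mark, or reroute the packet. The window size trades memory against accuracy: smaller windows track the queue boundary more closely but need more snapshots to cover a deep queue.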


Information & Contributors

Information

Published In

SelfDN 2018: Proceedings of the Afternoon Workshop on Self-Driving Networks
August 2018
48 pages
ISBN: 9781450359146
DOI: 10.1145/3229584
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGCOMM '18
SIGCOMM '18: ACM SIGCOMM 2018 Conference
August 24, 2018
Budapest, Hungary

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 137
  • Downloads (Last 6 weeks): 23
Reflects downloads up to 26 Sep 2024

Citations

Cited By

  • (2024) μMon: Empowering Microsecond-level Network Monitoring with Wavelets. Proceedings of the ACM SIGCOMM 2024 Conference, 274-290. DOI: 10.1145/3651890.3672236. Online publication date: 4-Aug-2024.
  • (2024) In-Network Address Caching for Virtual Networks. Proceedings of the ACM SIGCOMM 2024 Conference, 735-749. DOI: 10.1145/3651890.3672213. Online publication date: 4-Aug-2024.
  • (2024) QALL: Distributed Queue-Behavior-Aware Load Balancing Using Programmable Data Planes. IEEE Transactions on Network and Service Management 21:2, 2303-2322. DOI: 10.1109/TNSM.2023.3345862. Online publication date: Apr-2024.
  • (2024) DynATOS+: A Network Telemetry System for Dynamic Traffic and Query Workloads. IEEE/ACM Transactions on Networking 32:4, 2810-2825. DOI: 10.1109/TNET.2024.3367432. Online publication date: Aug-2024.
  • (2023) MFGAD-INT: in-band network telemetry data-driven anomaly detection using multi-feature fusion graph deep learning. Journal of Cloud Computing: Advances, Systems and Applications 12:1. DOI: 10.1186/s13677-023-00492-w. Online publication date: 28-Aug-2023.
  • (2023) MARS: Fault Localization in Programmable Networking Systems with Low-cost In-Band Network Telemetry. Proceedings of the 52nd International Conference on Parallel Processing, 347-357. DOI: 10.1145/3605573.3605622. Online publication date: 7-Aug-2023.
  • (2023) Toward Low-Latency and Accurate State Synchronization for Programmable Networks. IEEE/ACM Transactions on Networking 31:3, 1400-1415. DOI: 10.1109/TNET.2022.3218446. Online publication date: Jun-2023.
  • (2023) An adaptive and efficient approach to detect microbursts leveraging per-packet telemetry in a production network. NOMS 2023-2023 IEEE/IFIP Network Operations and Management Symposium, 1-6. DOI: 10.1109/NOMS56928.2023.10154390. Online publication date: 8-May-2023.
  • (2023) FASTeller: A Hardware Partial Aggregator for Accurate Flow Counting in Cloud Networks. 2023 IEEE 31st International Conference on Network Protocols (ICNP), 1-12. DOI: 10.1109/ICNP59255.2023.10355603. Online publication date: 10-Oct-2023.
  • (2023) Taking Detours: An In-Network Fault-Tolerant Probing Planning for In-Band Network Telemetry. ICC 2023 - IEEE International Conference on Communications, 1934-1939. DOI: 10.1109/ICC45041.2023.10279199. Online publication date: 28-May-2023.
