DOI: 10.1145/3387514.3406591
research-article
Open access

Swift: Delay is Simple and Effective for Congestion Control in the Datacenter

Published: 30 July 2020 Publication History

Abstract

We report on experiences with Swift congestion control in Google datacenters. Swift targets an end-to-end delay by using AIMD control, with pacing under extreme congestion. With accurate RTT measurement and care in reasoning about delay targets, we find this design is a foundation for excellent performance when network distances are well-known. Importantly, its simplicity helps us to meet operational challenges. Delay is easy to decompose into fabric and host components to separate concerns, and effortless to deploy and maintain as a congestion signal while the datacenter evolves. In large-scale testbed experiments, Swift delivers a tail latency of <50μs for short RPCs, with near-zero packet drops, while sustaining ~100Gbps throughput per server. This is a tail of <3x the minimal latency at a load close to 100%. In production use in many different clusters, Swift achieves consistently low tail completion times for short RPCs, while providing high throughput for long RPCs. It has loss rates that are at least 10x lower than a DCTCP protocol, and handles O(10k) incasts that sharply degrade with DCTCP.
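The abstract's core mechanism (AIMD driven by a delay target, with pacing once the window falls below one packet) can be illustrated with a minimal sketch. This is not the paper's algorithm or code: the class name `DelayAimd`, the parameters `beta` and `max_decrease`, and all default values are assumptions chosen for illustration. It also omits details a real implementation would need, such as limiting decreases to once per RTT, loss recovery, and how the target delay is chosen.

```python
from dataclasses import dataclass


@dataclass
class DelayAimd:
    """Illustrative delay-based AIMD window controller (hypothetical names)."""

    target_delay_us: float          # end-to-end delay target (assumed fixed here)
    cwnd: float = 1.0               # congestion window in packets; may go below 1
    additive_increase: float = 1.0  # packets added per RTT while under target
    beta: float = 0.8               # assumed scaling of the multiplicative decrease
    max_decrease: float = 0.5       # assumed cap on a single decrease
    min_cwnd: float = 0.01
    max_cwnd: float = 1000.0

    def on_ack(self, rtt_us: float, acked: int = 1) -> None:
        if rtt_us < self.target_delay_us:
            # Additive increase: roughly `additive_increase` packets per RTT.
            if self.cwnd >= 1.0:
                self.cwnd += self.additive_increase * acked / self.cwnd
            else:
                self.cwnd += self.additive_increase * acked
        else:
            # Multiplicative decrease, proportional to how far the measured
            # delay exceeds the target, capped by `max_decrease`.
            excess = (rtt_us - self.target_delay_us) / rtt_us
            factor = max(1.0 - self.beta * excess, 1.0 - self.max_decrease)
            self.cwnd *= factor
        self.cwnd = min(max(self.cwnd, self.min_cwnd), self.max_cwnd)

    def pacing_delay_us(self, rtt_us: float) -> float:
        """Gap between packets when cwnd < 1 (extreme congestion), else 0."""
        return rtt_us / self.cwnd if self.cwnd < 1.0 else 0.0


cc = DelayAimd(target_delay_us=50.0, cwnd=10.0)
cc.on_ack(rtt_us=40.0)    # under target: window grows by 1/cwnd
cc.on_ack(rtt_us=100.0)   # over target: window shrinks multiplicatively
```

In this sketch, a window below one packet is interpreted as sending one packet every `rtt / cwnd`, which is one way to keep making progress when many senders share a bottleneck, as in a large incast.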



Published In

SIGCOMM '20: Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication
July 2020
814 pages
ISBN: 9781450379557
DOI: 10.1145/3387514
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. Congestion Control
      2. Datacenter Transport
      3. Performance Isolation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited


      Acceptance Rates

      Overall Acceptance Rate 462 of 3,389 submissions, 14%
