Research Article | Open Access

Using Trio: Juniper Networks' Programmable Chipset for Emerging In-Network Applications

Published: 22 August 2022

Abstract

This paper describes Trio, a programmable chipset used in Juniper Networks' MX-series routers and switches. Trio's architecture is based on a multi-threaded programmable packet processing engine and a hierarchy of high-capacity memory systems, making it fundamentally different from pipeline-based architectures. Trio gracefully handles non-homogeneous packet processing rates for a wide range of networking use cases and protocols, making it an ideal platform for emerging in-network applications. We begin by describing the Trio chipset's fundamental building blocks, including its multi-threaded Packet Forwarding and Packet Processing Engines. We then discuss Trio's programming language, called Microcode. To showcase Trio's flexible Microcode-based programming environment, we describe two use cases. First, we demonstrate Trio's ability to perform in-network aggregation for distributed machine learning. Second, we propose and design an in-network straggler mitigation technique using Trio's timer threads. We prototype both use cases on a testbed using three real DNN models (ResNet50, DenseNet161, and VGG11) to demonstrate Trio's ability to mitigate stragglers while performing in-network aggregation. Our evaluations show that when stragglers occur in the cluster, Trio outperforms today's pipeline-based solutions by up to 1.8x.
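The two use cases in the abstract combine naturally: an in-network aggregator accumulates gradient chunks from workers, and a timer releases a partial aggregate when stragglers hold up the reduction. The Python sketch below models that control logic at a high level; the class, its methods, and the average-over-contributors policy are illustrative assumptions for exposition, not Trio Microcode or any Juniper API.

```python
# Hypothetical model of switch-side gradient aggregation with a
# timer-based straggler cutoff (illustrative only; not Trio Microcode).

class AggregationSlot:
    """Aggregates one gradient chunk from n_workers; a timer may
    release a partial result so stragglers do not stall the job."""

    def __init__(self, n_workers, chunk_len):
        self.n_workers = n_workers
        self.chunk_len = chunk_len
        self.acc = [0.0] * chunk_len   # per-element running sum
        self.seen = set()              # workers whose chunk arrived

    def on_packet(self, worker_id, grad_chunk):
        """Invoked when a worker's gradient chunk arrives.
        Returns the aggregate once all workers have reported."""
        if worker_id in self.seen:
            return None                # duplicate packet; ignore
        self.seen.add(worker_id)
        for i, g in enumerate(grad_chunk):
            self.acc[i] += g
        if len(self.seen) == self.n_workers:
            return self._release()     # all arrived: emit full result
        return None

    def on_timer(self):
        """Invoked when the straggler timer fires before all workers
        have reported: emit the partial aggregate, if any."""
        if self.seen:
            return self._release()
        return None

    def _release(self):
        k = len(self.seen)
        # average over actual contributors, so a partial release
        # keeps the gradient on the same scale as a full one
        result = [v / k for v in self.acc]
        self.acc = [0.0] * self.chunk_len
        self.seen = set()
        return result
```

Dividing by the number of contributors rather than by `n_workers` is one possible policy choice: it keeps a partial aggregate on the same scale as a complete one, at the cost of dropping the stragglers' contributions for that round.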

Supplementary Material

PDF File (p633-yang-supp.pdf)
Supplemental material.





        Published In

        SIGCOMM '22: Proceedings of the ACM SIGCOMM 2022 Conference
        August 2022, 858 pages
        ISBN: 9781450394208
        DOI: 10.1145/3544216
        This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 22 August 2022


        Author Tags

        1. network hardware design
        2. network support for machine learning
        3. programmable dataplanes

        Qualifiers

        • Research-article


        Conference

        SIGCOMM '22: ACM SIGCOMM 2022 Conference
        August 22-26, 2022
        Amsterdam, Netherlands

        Acceptance Rates

        Overall Acceptance Rate 462 of 3,389 submissions, 14%


        Article Metrics

        • Downloads (Last 12 months)1,684
        • Downloads (Last 6 weeks)133
        Reflects downloads up to 18 Feb 2025


        Cited By

        • (2025) MimoSketch: A Framework for Frequency-Based Mining Tasks on Multiple Nodes With Sketches. IEEE Transactions on Knowledge and Data Engineering 37(3), 1311-1324. DOI: 10.1109/TKDE.2024.3523034. Mar 2025.
        • (2025) Function Placement for In-network Federated Learning. Computer Networks 256 (110900). DOI: 10.1016/j.comnet.2024.110900. Jan 2025.
        • (2024) CASSINI. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 1403-1420. DOI: 10.5555/3691825.3691903. 16 Apr 2024.
        • (2024) THC. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 1191-1211. DOI: 10.5555/3691825.3691891. 16 Apr 2024.
        • (2024) Empower programmable pipeline for advanced stateful packet processing. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 491-508. DOI: 10.5555/3691825.3691853. 16 Apr 2024.
        • (2024) Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training. Proceedings of the 2024 ACM Symposium on Cloud Computing, 977-994. DOI: 10.1145/3698038.3698541. 20 Nov 2024.
        • (2024) Rethinking the Switch Architecture for Stateful In-network Computing. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 273-281. DOI: 10.1145/3696348.3696897. 18 Nov 2024.
        • (2024) In-Network AllReduce Optimization with Virtual Aggregation Trees. Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing, 54-60. DOI: 10.1145/3672198.3673800. 4 Aug 2024.
        • (2024) OptimusPrime: Unleash Dataplane Programmability through a Transformable Architecture. Proceedings of the ACM SIGCOMM 2024 Conference, 904-920. DOI: 10.1145/3651890.3672214. 4 Aug 2024.
        • (2024) Straggler-Aware Gradient Aggregation for Large-Scale Distributed Deep Learning System. IEEE/ACM Transactions on Networking 32(6), 4917-4930. DOI: 10.1109/TNET.2024.3441039. Dec 2024.
