Research Article | Open Access

Using Trio: Juniper Networks' Programmable Chipset for Emerging In-Network Applications

Published: 22 August 2022

Abstract

This paper describes Trio, a programmable chipset used in Juniper Networks' MX-series routers and switches. Trio's architecture is based on a multi-threaded programmable packet processing engine and a hierarchy of high-capacity memory systems, making it fundamentally different from pipeline-based architectures. Trio gracefully handles non-homogeneous packet processing rates for a wide range of networking use cases and protocols, making it an ideal platform for emerging in-network applications. We begin by describing the Trio chipset's fundamental building blocks, including its multi-threaded Packet Forwarding and Packet Processing Engines. We then discuss Trio's programming language, called Microcode. To showcase Trio's flexible Microcode-based programming environment, we describe two use cases. First, we demonstrate Trio's ability to perform in-network aggregation for distributed machine learning. Second, we propose and design an in-network straggler mitigation technique using Trio's timer threads. We prototype both use cases on a testbed using three real DNN models (ResNet50, DenseNet161, and VGG11) to demonstrate Trio's ability to mitigate stragglers while performing in-network aggregation. Our evaluations show that when stragglers occur in the cluster, Trio outperforms today's pipeline-based solutions by up to 1.8x.
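The two use cases in the abstract combine naturally: an in-network aggregator accumulates gradient chunks from workers, and a timer releases a partial aggregate when stragglers hold up the reduction. The Python sketch below models that control logic at a high level; the class, its methods, and the average-over-contributors policy are illustrative assumptions for exposition, not Trio Microcode or any Juniper API.

```python
# Hypothetical model of switch-side gradient aggregation with a
# timer-based straggler cutoff (illustrative only; not Trio Microcode).

class AggregationSlot:
    """Aggregates one gradient chunk from n_workers; a timer may
    release a partial result so stragglers do not stall the job."""

    def __init__(self, n_workers, chunk_len):
        self.n_workers = n_workers
        self.chunk_len = chunk_len
        self.acc = [0.0] * chunk_len   # per-element running sum
        self.seen = set()              # workers whose chunk arrived

    def on_packet(self, worker_id, grad_chunk):
        """Invoked when a worker's gradient chunk arrives.
        Returns the aggregate once all workers have reported."""
        if worker_id in self.seen:
            return None                # duplicate packet; ignore
        self.seen.add(worker_id)
        for i, g in enumerate(grad_chunk):
            self.acc[i] += g
        if len(self.seen) == self.n_workers:
            return self._release()     # all arrived: emit full result
        return None

    def on_timer(self):
        """Invoked when the straggler timer fires before all workers
        have reported: emit the partial aggregate, if any."""
        if self.seen:
            return self._release()
        return None

    def _release(self):
        k = len(self.seen)
        # average over actual contributors, so a partial release
        # keeps the gradient on the same scale as a full one
        result = [v / k for v in self.acc]
        self.acc = [0.0] * self.chunk_len
        self.seen = set()
        return result
```

Dividing by the number of contributors rather than by `n_workers` is one possible policy choice: it keeps a partial aggregate on the same scale as a complete one, at the cost of dropping the stragglers' contributions for that round.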

Supplementary Material

PDF File (p633-yang-supp.pdf)
Supplemental material.





        Published In

        SIGCOMM '22: Proceedings of the ACM SIGCOMM 2022 Conference
        August 2022, 858 pages
        ISBN: 9781450394208
        DOI: 10.1145/3544216
        This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 22 August 2022


        Author Tags

        1. network hardware design
        2. network support for machine learning
        3. programmable dataplanes

        Qualifiers

        • Research-article


        Conference

        SIGCOMM '22: ACM SIGCOMM 2022 Conference
        August 22-26, 2022
        Amsterdam, Netherlands

        Acceptance Rates

        Overall Acceptance Rate 462 of 3,389 submissions, 14%


        Article Metrics

        • Downloads (Last 12 months)1,684
        • Downloads (Last 6 weeks)133
        Reflects downloads up to 18 Feb 2025


        Cited By

        • (2025) MimoSketch: A Framework for Frequency-Based Mining Tasks on Multiple Nodes With Sketches. IEEE Transactions on Knowledge and Data Engineering 37(3), 1311-1324. DOI: 10.1109/TKDE.2024.3523034. Mar 2025.
        • (2025) Function Placement for In-network Federated Learning. Computer Networks 256 (110900). DOI: 10.1016/j.comnet.2024.110900. Jan 2025.
        • (2024) CASSINI. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 1403-1420. DOI: 10.5555/3691825.3691903. 16 Apr 2024.
        • (2024) THC. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 1191-1211. DOI: 10.5555/3691825.3691891. 16 Apr 2024.
        • (2024) Empower programmable pipeline for advanced stateful packet processing. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 491-508. DOI: 10.5555/3691825.3691853. 16 Apr 2024.
        • (2024) Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training. Proceedings of the 2024 ACM Symposium on Cloud Computing, 977-994. DOI: 10.1145/3698038.3698541. 20 Nov 2024.
        • (2024) Rethinking the Switch Architecture for Stateful In-network Computing. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 273-281. DOI: 10.1145/3696348.3696897. 18 Nov 2024.
        • (2024) In-Network AllReduce Optimization with Virtual Aggregation Trees. Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing, 54-60. DOI: 10.1145/3672198.3673800. 4 Aug 2024.
        • (2024) OptimusPrime: Unleash Dataplane Programmability through a Transformable Architecture. Proceedings of the ACM SIGCOMM 2024 Conference, 904-920. DOI: 10.1145/3651890.3672214. 4 Aug 2024.
        • (2024) Straggler-Aware Gradient Aggregation for Large-Scale Distributed Deep Learning System. IEEE/ACM Transactions on Networking 32(6), 4917-4930. DOI: 10.1109/TNET.2024.3441039. Dec 2024.
