DOI: 10.1145/3466752.3480055
research-article
Public Access

Cerebros: Evading the RPC Tax in Datacenters

Published: 17 October 2021

Abstract

The emerging paradigm of microservices decomposes online services into fine-grained software modules that frequently communicate over the datacenter network, often using Remote Procedure Calls (RPCs). Ongoing advancements in the network stack have exposed the RPC layer itself as a bottleneck, which we show accounts for 40–90% of a microservice’s total execution cycles. We break down the underlying modules that comprise production RPC layers and demonstrate, based on prior evidence, that CPUs can only expect limited improvements for such tasks, mandating a shift to hardware to remove the RPC layer as a limiter of microservice performance. Although recently proposed accelerators can efficiently handle a portion of the RPC layer, their overall benefit is limited by unnecessary CPU involvement, which occurs because the accelerators are architected as co-processors under the CPU’s control. Instead, we show that conclusively removing the RPC layer bottleneck requires all of the RPC layer’s modules to be executed by a NIC-attached hardware accelerator. We introduce Cerebros, a dedicated RPC processor that executes the Apache Thrift RPC layer and acts as an intermediary stage between the NIC and the microservice running on the CPU. Our evaluation using the DeathStarBench microservice suite shows that Cerebros reduces the CPU cycles spent in the RPC layer by 37–64×, yielding a 1.8–14× reduction in total cycles expended per microservice request.
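As a rough sanity check on how the abstract's figures relate to one another, the sketch below applies an Amdahl's-law style estimate: if a fraction of a request's cycles is spent in the RPC layer and only that fraction is reduced, the remaining cycles bound the overall gain. The function name, the choice of Amdahl's law, and the pairing of endpoint values are assumptions made for illustration; none of them come from the paper.

```python
def overall_reduction(rpc_fraction: float, rpc_reduction: float) -> float:
    """Amdahl-style estimate of the reduction in total cycles per request.

    rpc_fraction:  share of total cycles spent in the RPC layer (0..1).
    rpc_reduction: factor by which RPC-layer cycles shrink (e.g. 37 for 37x).
    Hypothetical helper for illustration, not the paper's methodology.
    """
    if not 0.0 <= rpc_fraction <= 1.0:
        raise ValueError("rpc_fraction must be between 0 and 1")
    if rpc_reduction <= 0.0:
        raise ValueError("rpc_reduction must be positive")
    # Cycles left after acceleration, as a fraction of the original total:
    # the untouched application part plus the shrunken RPC part.
    remaining = (1.0 - rpc_fraction) + rpc_fraction / rpc_reduction
    return 1.0 / remaining


if __name__ == "__main__":
    # Assumed pairing of the abstract's range endpoints (not stated in the source).
    for fraction, reduction in [(0.40, 37.0), (0.90, 64.0)]:
        total = overall_reduction(fraction, reduction)
        print(f"RPC share {fraction:.0%}, RPC reduction {reduction:.0f}x "
              f"-> total reduction {total:.2f}x")
```

Under this simple model the two assumed pairings give roughly 1.6× and 8.8×, not the reported 1.8–14×. That gap suggests the reported ranges do not pair endpoint to endpoint, or that the per-service cycle shares differ from the rounded 40–90% figure; the paper's per-service measurements remain the authority.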

Published In

MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
October 2021
1322 pages
ISBN: 9781450385572
DOI: 10.1145/3466752
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Datacenters
  2. Hardware Accelerators
  3. Microservices
  4. Networked Systems
  5. Remote Procedure Calls

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

  • (2024) Poster: Reducing Data Movement Tax for Serialization in Microservices. Proceedings of the 20th International Conference on emerging Networking EXperiments and Technologies, pp. 17–18. DOI: 10.1145/3680121.3699882. Online publication date: 9-Dec-2024
  • (2024) In-Storage Domain-Specific Acceleration for Serverless Computing. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 530–548. DOI: 10.1145/3620665.3640413. Online publication date: 27-Apr-2024
  • (2024) SMART: Dual-channel Southbound Message Delivery in Clouds with Rate Estimation. 2024 IEEE/ACM 32nd International Symposium on Quality of Service (IWQoS), pp. 1–10. DOI: 10.1109/IWQoS61813.2024.10682836. Online publication date: 19-Jun-2024
  • (2024) Intel Accelerators Ecosystem: An SoC-Oriented Perspective: Industry Product. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 848–862. DOI: 10.1109/ISCA59077.2024.00066. Online publication date: 29-Jun-2024
  • (2024) SmartDIMM: In-Memory Acceleration of Upper Layer Protocols. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 312–329. DOI: 10.1109/HPCA57654.2024.00032. Online publication date: 2-Mar-2024
  • (2023) Turbo: SmartNIC-enabled Dynamic Load Balancing of µs-scale RPCs. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 1045–1058. DOI: 10.1109/HPCA56546.2023.10071135. Online publication date: Feb-2023
  • (2023) Rambda: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 499–515. DOI: 10.1109/HPCA56546.2023.10071127. Online publication date: Feb-2023
  • (2023) SpecFaaS: Accelerating Serverless Applications with Speculative Function Execution. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 814–827. DOI: 10.1109/HPCA56546.2023.10071120. Online publication date: Feb-2023
  • (2022) Accelerating Data Serialization/Deserialization Protocols with In-Network Compute. 2022 IEEE/ACM International Workshop on Exascale MPI (ExaMPI), pp. 22–30. DOI: 10.1109/ExaMPI56604.2022.00008. Online publication date: Nov-2022
