Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3132747.3132764acmconferencesArticle/Chapter ViewAbstractPublication PagessospConference Proceedingsconference-collections
research-article
Public Access

NetCache: Balancing Key-Value Stores with Fast In-Network Caching

Published: 14 October 2017 Publication History

Abstract

We present NetCache, a new key-value store architecture that leverages the power and flexibility of new-generation programmable switches to handle queries on hot items and balance the load across storage nodes. NetCache provides high aggregate throughput and low latency even under highly-skewed and rapidly-changing workloads. The core of NetCache is a packet-processing pipeline that exploits the capabilities of modern programmable switch ASICs to efficiently detect, index, cache and serve hot key-value items in the switch data plane. Additionally, our solution guarantees cache coherence with minimal overhead. We implement a NetCache prototype on Barefoot Tofino switches and commodity servers and demonstrate that a single switch can process 2+ billion queries per second for 64K items with 16-byte keys and 128-byte values, while only consuming a small portion of its hardware resources. To the best of our knowledge, this is the first time that a sophisticated application-level functionality, such as in-network caching, has been shown to run at line rate on programmable switches. Furthermore, we show that NetCache improves the throughput by 3-10x and reduces the latency of up to 40% of queries by 50%, for high-performance, in-memory key-value stores.

Supplementary Material

MP4 File (netcache.mp4)

References

[1]
David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan. 2009. FAWN: A Fast Array of Wimpy Nodes. In ACM SOSP.
[2]
Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2012. Workload Analysis of a Large-scale Key-value Store. In ACM SIGMETRICS.
[3]
Barefoot. 2017. Barefoot Capilano. (2017). https://www.barefootnetworks.com/technology/#capilano.
[4]
Barefoot. 2017. Barefoot Tofino. (2017). https://www.barefootnetworks.com/technology/#tofino.
[5]
M. Berezecki, E. Frachtenberg, M. Paleczny, and K. Steele. 2011. Many-core Key-value Store. In IEEE IGCC.
[6]
Pat Bosshart, Dan Daly, Glen Gibb, Martin Izzard, Nick McKeown, Jennifer Rexford, Cole Schlesinger, Dan Talayco, Amin Vahdat, George Varghese, and David Walker. 2014. P4: Programming Protocol-independent Packet Processors. ACM SIGCOMM CCR (July 2014).
[7]
Broadcom. 2017. Broadcom Tomahawk II. (2017). https://www.broadcom.com/.
[8]
Andrei Broder and Michael Mitzenmacher. 2004. Network applications of Bloom filters: A survey. Internet mathematics (2004).
[9]
Cavium. 2017. Cavium XPliant. (2017). https://www.cavium.com/.
[10]
Yue Cheng, Aayush Gupta, and Ali R. Butt. 2015. An In-memory Object Caching Framework with Adaptive Load Balancing. In EuroSys.
[11]
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In ACM SOCC.
[12]
Graham Cormode and S. Muthukrishnan. 2005. An improved data stream summary: The Count-Min sketch and its applications. Journal of Algorithms (April 2005).
[13]
Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. 2001. Wide-area Cooperative Storage with CFS. In ACM SOSP.
[14]
Jeffrey Dean and Luiz André Barroso. 2013. The Tail at Scale. ACM CACM (February 2013).
[15]
Aleksandar Dragojević, Dushyanth Narayanan, Miguel Castro, and Orion Hodson. 2014. FaRM: Fast Remote Memory. In USENIX NSDI.
[16]
Bin Fan, David G. Andersen, and Michael Kaminsky. 2013. MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing. In USENIX NSDI.
[17]
Bin Fan, Hyeontaek Lim, David G. Andersen, and Michael Kaminsky. 2011. Small Cache, Big Effect: Provable Load Balancing for Randomly Partitioned Cluster Services. In ACM SOCC.
[18]
Jim Gray, Prakash Sundaresan, Susanne Englert, Ken Baclawski, and Peter J. Weinberger. 1994. Quickly Generating Billion-record Synthetic Databases. In ACM SIGMOD.
[19]
Qi Huang, Helga Gudmundsdottir, Ymir Vigfusson, Daniel A. Freedman, Ken Birman, and Robbert van Renesse. 2014. Characterizing Load Imbalance in Real-World Networked Caches. In ACM SIGCOMM HotNets Workshop.
[20]
Intel. 2017. Intel Data Plane Development Kit (DPDK). (2017). http://dpdk.org/.
[21]
Jaeyeon Jung, Balachander Krishnamurthy, and Michael Rabinovich. 2002. Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites. In WWW.
[22]
Anuj Kalia, Michael Kaminsky, and David G Andersen. 2014. Using RDMA efficiently for key-value services. In ACM SIGCOMM.
[23]
Anuj Kalia, Michael Kaminsky, and David G. Andersen. 2016. Design Guidelines for High Performance RDMA Systems. In USENIX ATC.
[24]
David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin. 1997. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. In ACM STOC.
[25]
Markus Klems, Adam Silberstein, Jianjun Chen, Masood Mortazavi, Sahaya Andrews Albert, P.P.S. Narayan, Adwait Tumbde, and Brian Cooper. 2012. The Yahoo!: Cloud Datastore Load Balancer. In CloudDB.
[26]
Sheng Li, Hyeontaek Lim, Victor W Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G Andersen, O Seongil, Sukhan Lee, and Pradeep Dubey. 2015. Architecting to achieve a billion requests per second throughput on a single key-value store server platform. In ISCA.
[27]
Xiaozhou Li, David G Andersen, Michael Kaminsky, and Michael J Freedman. 2014. Algorithmic improvements for fast concurrent cuckoo hashing. In EuroSys.
[28]
Xiaozhou Li, Raghav Sethi, Michael Kaminsky, David G. Andersen, and Michael J. Freedman. 2016. Be Fast, Cheap and in Control with SwitchKV. In USENIX NSDI.
[29]
Hyeontaek Lim, Bin Fan, David G. Andersen, and Michael Kaminsky. 2011. SILT: A Memory-efficient, High-performance Key-value Store. In ACM SOSP.
[30]
Hyeontaek Lim, Dongsu Han, David G. Andersen, and Michael Kaminsky. 2014. MICA: A Holistic Approach to Fast In-memory Key-value Storage. In USENIX NSDI.
[31]
Ming Liu, Liang Luo, Jacob Nelson, Luis Ceze, Arvind Krishnamurthy, and Kishore Atreya. 2017. IncBricks: Toward In-Network Computation with an In-Network Cache. In ACM ASPLOS.
[32]
Memcached. 2017. Memcached key-value store. (2017). https://memcached.org/.
[33]
Michael Mitzenmacher. 2001. The Power of Two Choices in Randomized Load Balancing. IEEE Transactions on Parallel and Distributed Systems (October 2001).
[34]
Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, and Venkateshwaran Venkataramani. 2013. Scaling Memcache at Facebook. In USENIX NSDI.
[35]
NoviFlow. 2017. NoviFlow NoviSwitch. (2017). http://noviflow.com/products/noviswitch/.
[36]
John Ousterhout, Arjun Gopalan, Ashish Gupta, Ankita Kejriwal, Collin Lee, Behnam Montazeri, Diego Ongaro, Seo Jin Park, Henry Qin, Mendel Rosenblum, Stephen Rumble, Ryan Stutsman, and Stephen Yang. 2015. The RAMCloud Storage System. ACM Transactions on Computer Systems (August 2015).
[37]
K. V. Rashmi, Mosharaf Chowdhury, Jack Kosaian, Ion Stoica, and Kannan Ramchandran. 2016. EC-Cache: Load-Balanced, Low-Latency Cluster Caching with Online Erasure Coding. In USENIX OSDI.
[38]
Redis. 2017. Redis data structure store. (2017). https://redis.io/.
[39]
Redis. 2017. Using Redis as an LRU cache. (2017). https://redis.io/topics/lru-cache.
[40]
Rebecca Taft, Essam Mansour, Marco Serafini, Jennie Duggan, Aaron J. Elmore, Ashraf Aboulnaga, Andrew Pavlo, and Michael Stonebraker. 2014. E-Store: Fine-grained Elastic Partitioning for Distributed Transaction Processing Systems. VLDB (November 2014).
[41]
TommyDS. 2017. TommyDS C library. (2017). http//:www.tommyds.it/.
[42]
Vijay Vasudevan, Michael Kaminsky, and David G. Andersen. 2012. Using Vector Interfaces to Deliver Millions of IOPS from a Networked Key-value Storage Server. In ACM SOCC.
[43]
Venkateshwaran Venkataramani, Zach Amsden, Nathan Bronson, George Cabrera III, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Jeremy Hoon, Sachin Kulkarni, Nathan Lawrence, Mark Marchukov, Dmitri Petrov, and Lovro Puzar. 2012. TAO: How Facebook Serves the Social Graph. In ACM SIGMOD.

Cited By

View all
  • (2025)ML-NIC: accelerating machine learning inference using smart network interface cardsFrontiers in Computer Science10.3389/fcomp.2024.14933996Online publication date: 6-Jan-2025
  • (2025)Dalio: In-Kernel Centralized Replication for Key-Value StoresIEICE Transactions on Information and Systems10.1587/transinf.2024EDL8060E108.D:2(157-160)Online publication date: 1-Feb-2025
  • (2025)In Search of a Memory-Efficient Framework for Online Cardinality EstimationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348657137:1(392-407)Online publication date: Jan-2025
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SOSP '17: Proceedings of the 26th Symposium on Operating Systems Principles
October 2017
677 pages
ISBN:9781450350853
DOI:10.1145/3132747
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 October 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Caching
  2. Key-value stores
  3. Programmable switches

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

Conference

SOSP '17
Sponsor:

Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%

Upcoming Conference

SOSP '25
ACM SIGOPS 31st Symposium on Operating Systems Principles
October 13 - 16, 2025
Seoul , Republic of Korea

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,422
  • Downloads (Last 6 weeks)160
Reflects downloads up to 31 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)ML-NIC: accelerating machine learning inference using smart network interface cardsFrontiers in Computer Science10.3389/fcomp.2024.14933996Online publication date: 6-Jan-2025
  • (2025)Dalio: In-Kernel Centralized Replication for Key-Value StoresIEICE Transactions on Information and Systems10.1587/transinf.2024EDL8060E108.D:2(157-160)Online publication date: 1-Feb-2025
  • (2025)In Search of a Memory-Efficient Framework for Online Cardinality EstimationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348657137:1(392-407)Online publication date: Jan-2025
  • (2025)NetMod: Toward Accelerating Cloud RAN Distributed Unit Modulation Within Programmable SwitchesIEEE Transactions on Computers10.1109/TC.2024.350037974:2(665-677)Online publication date: Feb-2025
  • (2025)Enabling High Performance and Resource Utilization in Clustered Cache via Hotness Identification, Data Copying, and Instance MergingIEEE Transactions on Computers10.1109/TC.2024.347799474:2(371-385)Online publication date: Feb-2025
  • (2024)CyberStarProceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference10.5555/3691992.3692006(227-246)Online publication date: 10-Jul-2024
  • (2024)Fast and scalable in-network lock management using lock fissionProceedings of the 18th USENIX Conference on Operating Systems Design and Implementation10.5555/3691938.3691952(251-268)Online publication date: 10-Jul-2024
  • (2024)Multitenant in-network acceleration with SwitchVMProceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation10.5555/3691825.3691863(691-708)Online publication date: 16-Apr-2024
  • (2024)In-memory key-value store live migration with NetMigrateProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650710(209-224)Online publication date: 27-Feb-2024
  • (2024)Anomaly Detection in In-Network Fast ReRoute Systems2024 IFIP Networking Conference (IFIP Networking)10.23919/IFIPNetworking62109.2024.10619865(122-130)Online publication date: 3-Jun-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media