DOI: 10.1145/3555050.3569134
Research article · Public Access

Raven: Belady-guided, predictive (deep) learning for in-memory and content caching

Published: 30 November 2022

Abstract

The performance of caching algorithms not only determines the quality of experience for users, but also affects the operating and capital expenditures of cloud service providers. Today's production systems rely on heuristics such as LRU (least recently used) and its variants, which work well for certain types of workloads but cannot effectively cope with diverse and time-varying workload characteristics. While learning-based caching algorithms have been proposed to address these challenges, they still impose assumptions about workload characteristics and often suffer from poor generalizability.
In this paper, we propose Raven, a general learning-based caching framework that leverages insights from the offline-optimal Belady algorithm for both in-memory and content caching. Raven learns the distributions of objects' next-request arrival times without any prior assumptions, using Mixture Density Network (MDN)-based universal distribution estimation. It uses the estimated distributions to compute, for each cached object, the probability that its next request arrives later than that of any other object in the cache, and evicts the object with the largest such probability, weighted by object sizes where appropriate. Raven thus probabilistically approximates Belady while explicitly accounting for the stochastic, time-varying, and non-stationary nature of object arrival processes. Evaluation results on production workloads demonstrate that, compared with the best existing caching algorithms, Raven improves the object hit ratio and byte hit ratio by up to 7.3% and 7.1%, respectively, reduces the average access latency by up to 17.9%, and cuts traffic to the origin servers by up to 18.8%.
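The eviction rule described above lends itself to a short Monte Carlo approximation: sample each cached object's next-request arrival time from its estimated distribution, count how often each object is the one requested farthest in the future, and evict the argmax. The sketch below is not the authors' implementation; the function name, the use of Gaussian mixtures as a stand-in for whatever family the MDN emits, the sample count, and the size weighting are all illustrative assumptions.

```python
import numpy as np

def prob_farthest_next_request(mixtures, n_samples=20000, sizes=None, rng=None):
    """Estimate, for each cached object, the probability that its next request
    arrives later than every other cached object's, given a mixture-of-Gaussians
    estimate of each object's next-request arrival time (as an MDN could emit).

    mixtures -- list of (weights, means, stds) arrays, one triple per object
    sizes    -- optional object sizes used to bias eviction toward large objects
                (a simple stand-in for the size regulation mentioned above)
    """
    rng = np.random.default_rng(0) if rng is None else rng
    draws = []
    for weights, means, stds in mixtures:
        # Draw next-arrival-time samples from this object's mixture.
        comp = rng.choice(len(weights), size=n_samples, p=weights)
        draws.append(rng.normal(means[comp], stds[comp]))
    draws = np.stack(draws)              # shape: (n_objects, n_samples)
    farthest = draws.argmax(axis=0)      # per sample: which object is requested last
    p = np.bincount(farthest, minlength=len(mixtures)) / n_samples
    if sizes is not None:
        p = p * np.asarray(sizes, dtype=float)   # favor evicting large, far-in-the-future objects
        p = p / p.sum()
    return p

# Toy example with three cached objects and hand-picked mixture parameters.
mixtures = [
    (np.array([1.0]),      np.array([5.0]),       np.array([1.0])),   # likely requested soon
    (np.array([0.5, 0.5]), np.array([2.0, 60.0]), np.array([1.0, 5.0])),
    (np.array([1.0]),      np.array([40.0]),      np.array([4.0])),   # likely requested far in the future
]
p_evict = prob_farthest_next_request(mixtures)
victim = int(np.argmax(p_evict))   # evict the object most likely to be needed farthest in the future
print(p_evict, victim)
```

In a full system, the mixture parameters would come from the trained MDN's forward pass over each object's request history rather than being hand-picked as in this toy example.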


Cited By

  • (2023) An Efficient Cache Eviction Strategy based on Learning and Belady Algorithm. 2023 IEEE 12th International Conference on Cloud Networking (CloudNet), 433-437. DOI: 10.1109/CloudNet59005.2023.10490040. Online publication date: 1 Nov 2023.

Published In

CoNEXT '22: Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies
November 2022, 431 pages
ISBN: 9781450395083
DOI: 10.1145/3555050
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

CoNEXT '22 paper acceptance rate: 28 of 151 submissions (19%)
Overall acceptance rate: 198 of 789 submissions (25%)
