Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3603269.3604863acmconferencesArticle/Chapter ViewAbstractPublication PagescommConference Proceedingsconference-collections
research-article
Open access

Darwin: Flexible Learning-based CDN Caching

Published: 01 September 2023 Publication History

Abstract

Cache management is critical for Content Delivery Networks (CDNs), impacting their performance and operational costs. Most production CDNs apply static, hand-tuned caching policy parameters at cache servers, such as admission frequency or size thresholds for the Hot Object Caches (HOC) of their system. However, these static policies fall short when a server is faced with unpredictable traffic pattern changes, even when policies employ multiple control parameters/knobs. Recent approaches have proposed learning-based solutions to dynamically adjust policy parameters, but they are limited in action space, caching objectives, or impose high overhead. We propose Darwin, a CDN cache management system that is robust to traffic pattern changes and can flexibly optimize different caching objectives with unrestricted action spaces. Darwin employs a three-stage pipeline involving traffic pattern feature collection, unsupervised clustering for classification, and neural bandit expert selection to choose the optimal caching policy. Through extensive simulations, experiments using an Apache Traffic Server (ATS)-based prototype, and theoretical analysis, we show that Darwin achieves significant performance gain w.r.t. different objectives such as maximizing object hit rates and minimizing disk writes, while simultaneously adapting to traffic pattern shifts. Darwin imposes negligible overhead and achieves high throughput compared to the state-of-the-art.

References

[1]
Amin, K., Kearns, M., and Syed, U. Graphical models for bandit problems. arXiv preprint arXiv:1202.3782 (2012).
[2]
Ari, I., Amer, A., Gramacy, R. B., Miller, E. L., Brandt, S. A., and Long, D. D. Acme: Adaptive caching using multiple experts. In WDAS (2002), vol. 2, pp. 143--158.
[3]
Asiso. Overview of the cdn akamai. https://www.asioso.com/en/blog/overview-of-the-cdn-akamai-b520, Feb 2021.
[4]
Atsidakou, A., Papadigenopoulos, O., Caramanis, C., Sanghavi, S., and Shakkottai, S. Asymptotically-optimal gaussian bandits with side observations. In International Conference on Machine Learning (2022), PMLR, pp. 1057--1077.
[5]
Basat, R. B., Einziger, G., Friedman, R., and Kassner, Y. Randomized admission policy for efficient top-k and frequency estimation. In IEEE INFOCOM 2017-IEEE Conference on Computer Communications (2017), IEEE, pp. 1--9.
[6]
Beckmann, N., Chen, H., and Cidon, A. LHD: improving cache hit rate by maximizing hit density. In 15th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2018, Renton, WA, USA, April 9-11, 2018 (2018), S. Banerjee and S. Seshan, Eds., USENIX Association, pp. 389--403.
[7]
Belady, L. A. A study of replacement algorithms for a virtual-storage computer. IBM Systems journal 5, 2 (1966), 78--101.
[8]
Belady, L. A. A study of replacement algorithms for a virtual-storage computer. In IBM Systems journal (1996), vol. 5, pp. 78--101.
[9]
Berger, D. S. Towards lightweight and robust machine learning for CDN caching. In Proceedings of the 17th ACM Workshop on Hot Topics in Networks, HotNets 2018, Redmond, WA, USA, November 15-16, 2018 (2018), ACM, pp. 134--140.
[10]
Berger, D. S., Sitaraman, R. K., and Harchol-Balter, M. Adaptsize: Orchestrating the hot object memory cache in a content delivery network. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17) (2017), pp. 483--498.
[11]
Buccapatnam, S., Eryilmaz, A., and Shroff, N. B. Stochastic bandits with side observations on networks. In The 2014 ACM international conference on Measurement and modeling of computer systems (2014), pp. 289--300.
[12]
Chen, F., Sitaraman, R. K., and Torres, M. End-user mapping: Next generation request routing for content delivery. In ACM SIGCOMM Computer Communication Review (2015), vol. 45, ACM, pp. 167--181.
[13]
Cherkasova, L., and Ciardo, G. Role of aging, frequency, and size in web cache replacement policies. In High-Performance Computing and Networking: 9th International Conference, HPCN Europe 2001 Amsterdam, The Netherlands, June 25--27, 2001 Proceedings 9 (2001), Springer, pp. 114--123.
[14]
Dilley, J., Maggs, B. M., Parikh, J., Prokop, H., Sitaraman, R. K., and Weihl, W. E. Globally distributed content delivery. IEEE Internet Computing 6, 5 (2002), 50--58.
[15]
Dubhashi, D. P., and Panconesi, A. Concentration of measure for the analysis of randomized algorithms. Cambridge University Press, 2009.
[16]
Duplyakin, D., Ricci, R., Maricq, A., Wong, G., Duerig, J., Eide, E., Stoller, L., Hibler, M., Johnson, D., Webb, K., et al. The design and operation of cloudlab. In USENIX Annual Technical Conference (2019), pp. 1--14.
[17]
Einziger, G., Friedman, R., and Manes, B. Tinylfu: A highly efficient cache admission policy. ACM Transactions on Storage (ToS) 13, 4 (2017), 1--31.
[18]
Foundation, T. A. S. Apache traffic server. https://trafficserver.apache.org/, 2018. Accessed: 2023-01-30.
[19]
Garivier, A., and Kaufmann, E. Optimal best arm identification with fixed confidence. In Conference on Learning Theory (2016), PMLR, pp. 998--1027.
[20]
Guan, Y., Zhang, X., and Guo, Z. Caca: Learning-based content-aware cache admission for video content in edge caching. In Proceedings of the 27th ACM International Conference on Multimedia (2019), pp. 456--464.
[21]
Jeong, J., and Dubois, M. Cost-sensitive cache replacement algorithms. In The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. (2003), IEEE, pp. 327--337.
[22]
Kirilin, V., Sundarrajan, A., Gorinsky, S., and Sitaraman, R. K. Rl-cache: Learning-based cache admission for content delivery. In Proceedings of the 2019 Workshop on Network Meets AI & ML (2019), pp. 57--63.
[23]
Lattimore, F., Lattimore, T., and Reid, M. D. Causal bandits: Learning good interventions via causal inference. In Advances in Neural Information Processing Systems (2016), pp. 1181--1189.
[24]
Lattimore, T., and Szepesvári, C. Bandit algorithms. Cambridge University Press, 2020.
[25]
Li, L., Chu, W., Langford, J., and Schapire, R. E. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web (2010), pp. 661--670.
[26]
Li, Z., Ratliff, L., Nassif, H., Jamieson, K., and Jain, L. Instance-optimal pac algorithms for contextual bandits. arXiv preprint arXiv:2207.02357 (2022).
[27]
Maggs, B. M., and Sitaraman, R. K. Algorithmic nuggets in content delivery. ACM SIGCOMM Computer Communication Review 45, 3 (2015), 52--66.
[28]
McAllister, S., Berg, B., Tutuncu-Macias, J., Yang, J., Gunasekar, S., Lu, J., Berger, D. S., Beckmann, N., and Ganger, G. R. Kangaroo: Theory and practice of caching billions of tiny objects on flash. ACM Trans. Storage 18, 3 (2022), 21:1--21:33.
[29]
McDiarmid, C. Concentration. Probabilistic methods for algorithmic discrete mathematics (1998), 195--248.
[30]
Narayanan, A., Verma, S., Ramadan, E., Babaie, P., and Zhang, Z. Deepcache: A deep learning based framework for content caching. In Proceedings of the 2018 Workshop on Network Meets AI & ML, NetAI@SIGCOMM 2018, Budapest, Hungary, August 24, 2018 (2018), ACM, pp. 48--53.
[31]
Nygren, E., Sitaraman, R. K., and Sun, J. The Akamai Network: A platform for high-performance Internet applications. ACM SIGOPS Operating Systems Review 44, 3 (2010), 2--19.
[32]
Rizzo, L., and Vicisano, L. Replacement policies for a proxy cache. IEEE/ACM Transactions on networking 8, 2 (2000), 158--170.
[33]
Rodriguez, L. V., Yusuf, F. B., Lyons, S., Paz, E., Rangaswami, R., Liu, J., Zhao, M., and Narasimhan, G. Learning cache replacement with cacheus. In FAST (2021), pp. 341--354.
[34]
Sabnis, A., and Sitaraman, R. K. Tragen: a synthetic trace generator for realistic cache simulations. In Proceedings of the 21st ACM Internet Measurement Conference (2021), pp. 366--379.
[35]
Sazoglu, F. B., Cambazoglu, B. B., Ozcan, R., Altingovde, I. S., and Ulusoy, Ö. A financial cost metric for result caching. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (2013), pp. 873--876.
[36]
Schomp, K., Bhardwaj, O., Kurdoglu, E., Muhaimen, M., and Sitaraman, R. K. Akamai dns: Providing authoritative answers to the world's queries. In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication (2020), pp. 465--478.
[37]
Sen, R., Shanmugam, K., Kocaoglu, M., Dimakis, A., and Shakkottai, S. Contextual bandits with latent confounders: An nmf approach. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (2017), pp. 518--527.
[38]
Sharma, N., Basu, S., Shanmugam, K., and Shakkottai, S. On under-exploration in bandits with mean bounds from confounded data. arXiv preprint arXiv:2002.08405 (2020).
[39]
Song, Z., Berger, D. S., Li, K., and Lloyd, W. Learning relaxed belady for content distribution network caching. In 17th USENIX Symposium on Networked Systems Design and Implementation (2020).
[40]
Suksomboon, K., Tarnoi, S., Ji, Y., Koibuchi, M., Fukuda, K., Abe, S., Motonori, N., Aoki, M., Urushidani, S., and Yamada, S. Popcache: Cache more or less based on content popularity for information-centric networking. In 38th Annual IEEE conference on local computer networks (2013), IEEE, pp. 236--243.
[41]
Sundarrajan, A., Feng, M., Kasbekar, M., and Sitaraman, R. K. Footprint descriptors: Theory and practice of cache provisioning in a global cdn. In Proceedings of the 13th International Conference on emerging Networking EXperiments and Technologies (2017), pp. 55--67.
[42]
Sundarrajan, A., Kasbekar, M., Sitaraman, R. K., and Shukla, S. Midgress-aware traffic provisioning for content delivery. In 2020 USENIX Annual Technical Conference (USENIX ATC 20) (July 2020), USENIX Association, pp. 543--557.
[43]
Valko, M., Munos, R., Kveton, B., and Kocák, T. Spectral bandits for smooth graph functions. In International Conference on Machine Learning (2014), pp. 46--54.
[44]
Vietri, G., Rodriguez, L. V., Martinez, W. A., Lyons, S., Liu, J., Rangaswami, R., Zhao, M., and Narasimhan, G. Driving cache replacement with ml-based lecar. In HotStorage (2018), pp. 928--936.
[45]
Vietri, G., Rodriguez, L. V., Martinez, W. A., Lyons, S., Liu, J., Rangaswami, R., Zhao, M., and Narasimhan, G. Driving cache replacement with ml-based lecar. In 10th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2018, Boston, MA, USA, July 9-10, 2018 (2018), A. Goel and N. Talagala, Eds., USENIX Association.
[46]
Wikipedia. Stack distance, 2023. https://en.wikipedia.org/wiki/Cache_performance_measurement_and_metric.
[47]
Wu, K.-L., Yu, P. S., and Wolf, J. L. Segment-based proxy caching of multimedia streams. In Proceedings of the 10th international conference on World Wide Web (2001), pp. 36--44.
[48]
Wu, Y., György, A., and Szepesvári, C. Online learning with gaussian payoffs and side observations. Advances in Neural Information Processing Systems 28 (2015).
[49]
Yan, G., Li, J., and Towsley, D. Learning from optimal caching for content delivery. In CoNEXT '21: The 17th International Conference on emerging Networking EXperiments and Technologies, Virtual Event, Munich, Germany, December 7 - 10, 2021 (2021), G. Carle and J. Ott, Eds., ACM, pp. 344--358.
[50]
Yang, J., Sabnis, A., Berger, D. S., Rashmi, K., and Sitaraman, R. K. C2dn: How to harness erasure codes at the edge for efficient content delivery. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22) (2022), pp. 1159--1177.
[51]
Zhang, J., and Bareinboim, E. Transfer learning in multi-armed bandit: A causal approach. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (2017), pp. 1778--1780.
[52]
Zhang, Q., Xiang, Z., Zhu, W., and Gao, L. Cost-based cache replacement and server selection for multimedia proxy across wireless internet. IEEE Transactions on Multimedia 6, 4 (2004), 587--598.

Cited By

View all
  • (2024)Twist: A Multi-site Transmission Solution for On-demand Video StreamingProceedings of the ACM on Networking10.1145/36562972:CoNEXT2(1-19)Online publication date: 13-Jun-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ACM SIGCOMM '23: Proceedings of the ACM SIGCOMM 2023 Conference
September 2023
1217 pages
ISBN:9798400702365
DOI:10.1145/3603269
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 2023

Check for updates

Badges

Author Tags

  1. content delivery networks
  2. cache management
  3. machine learning

Qualifiers

  • Research-article

Funding Sources

  • CNS

Conference

ACM SIGCOMM '23
Sponsor:
ACM SIGCOMM '23: ACM SIGCOMM 2023 Conference
September 10, 2023
NY, New York, USA

Acceptance Rates

Overall Acceptance Rate 462 of 3,389 submissions, 14%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,430
  • Downloads (Last 6 weeks)189
Reflects downloads up to 22 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Twist: A Multi-site Transmission Solution for On-demand Video StreamingProceedings of the ACM on Networking10.1145/36562972:CoNEXT2(1-19)Online publication date: 13-Jun-2024

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media