Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3205289.3205299acmconferencesArticle/Chapter ViewAbstractPublication PagesicsConference Proceedingsconference-collections
research-article

Demystifying Cache Policies for Photo Stores at Scale: A Tencent Case Study

Published: 12 June 2018 Publication History

Abstract

Photo service providers are facing critical challenges of dealing with the huge amount of photo storage, typically in a magnitude of billions of photos, while ensuring national-wide or world-wide satisfactory user experiences. Distributed photo caching architecture is widely deployed to meet high performance expectations, where efficient still mysterious caching policies play essential roles. In this work, we present a comprehensive study on internet-scale photo caching algorithms in the case of QQPhoto from Tencent Inc., the largest social network service company in China. We unveil that even advanced cache algorithms can only perform at a similar level as simple baseline algorithms and there still exists a large performance gap between these cache algorithms and the theoretically optimal algorithm due to the complicated access behaviors in such a large multi-tenant environment. We then expound the behind reasons for that phenomenon via extensively investigating the characteristics of QQPhoto workloads. Finally, in order to realistically further improve QQPhoto cache efficiency, we propose to incorporate a prefetcher in the cache stack based on the observed immediacy feature that is unique to the QQPhoto workload. Evaluation results show that with appropriate prefetching we improve the cache hit ratio by up to 7.4%, while reducing the average access latency by 6.9% at a marginal cost of 4.14% backend network traffic compared to the original system that performs no prefetching.

References

[1]
2017. How QQPhoto works(Chinese). https://cloud.tencent.com/developer/article/1005732. (2017). {Online}.
[2]
2017. Tencent. http://www.tencent.com/. (2017). {Online}.
[3]
Dulcardo Arteaga, Jorge Cabrera, Jing Xu, Swaminathan Sundararaman, and Ming Zhao. 2016. Cloud Cache: On-demand Flash Cache Management for Cloud Computing. In Proceedings of the 14th USENIX Conference on File and Storage Technologies(FAST'16). Santa Clara, CA, 355--369.
[4]
Xiao Bai, B. Barla Cambazoglu, and Archie Russell. 2016. Improved Caching Techniques for Large-Scale Image Hosting Services. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR'16). Pisa, Italy, 639--648.
[5]
Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, and Peter Vajgel. 2010. Finding a Needle in Haystack: Facebook's Photo Storage. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation(OSDI'10). Vancouver, BC, Canada, 47--60.
[6]
Nathan Beckmann and Daniel Sanchez. 2015. Talus: A Simple Way to Remove Performance Cliffs in Caches. In Proceedings of the IEEE 21st International Symposium on High Performance Computer Architecture(HPCA'15). San Francisco, CA, 64--75.
[7]
Nathan Beckmann and Daniel Sanchez. 2016. Modeling Cache Performance Beyond LRU. In Proceedings of the 22nd International Symposium on High Performance Computer Architecture(HPCA'16). Barcelona, Spain, 225--236.
[8]
Nathan Beckmann and Daniel Sanchez. 2017. Maximizing Cache Performance Under Uncertainty. In Proceedings of the 23rd International Symposium on High Performance Computer Architecture (HPCA'2017). Austin, TX, 109--120.
[9]
L. A. Belady. 1966. A Study of Replacement Algorithms for a Virtual Storage Computer. IBM Systems Journal 5, 2 (1966), 78--101.
[10]
Abhishek Bhattacharjee. 2017. Translation-Triggered Prefetching. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems(ASPLOS'17). Xi'an, China, 63--76.
[11]
Aaron Blankstein, Siddhartha Sen, and Michael J. Freedman. 2017. Hyperbolic Caching: Flexible Caching for Web Applications. In Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC '17). Santa Clara, CA, 499--511.
[12]
Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, and Venkat Venkararamani. 2013. TAO: Facebook's Distributed Data Store for the Social Graph. In Proceedings of the 2013 USENIX Annual Technical Conference (USENIX ATC'13). San Jose, CA, 49--60.
[13]
Yue Cheng, Fred Douglis, Philip Shilane, Grant Wallace, Peter Desnoyers, and Kai Li. 2016. Erasing Belady's Limitations: In Search of Flash Cache Offline Optimality. In Proceedings of the 2016 USENIX Annual Technical Conference (USENIX ATC'16). Denver, CO, 379--392.
[14]
Asaf Cidon, Assaf Eisenman, Mohammad Alizadeh, and Sachin Katti. 2016. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches. In Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI '16). Santa Clara, CA, 379--392.
[15]
Asaf Cidon, Daniel Rushton, Stephen M. Rumble, and Ryan Stutsman. 2017. Memshare: a Dynamic Multi-tenant Key-value Cache. In Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC'17). Santa Clara, CA, 321--334.
[16]
Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. 2017. Caching with Dual Costs. In Proceedings of the 26th International Conference on World Wide Web(WWW'17). Perth, Australia, 643--652.
[17]
Lei Guo, Enhua Tan, Songqing Chen, Zhen Xiao, and Xiaodong Zhang. 2008. The Stretched Exponential Distribution of Internet Media Access Patterns. In Proceedings of the 27th Annual ACM Symposium on Principles of Distributed Computing (PODC'08). Toronto Canada, 283--294.
[18]
Ping Huang, Pradeep Subedi, Xubin He, Shuang He, and Ke Zhou. 2014. FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance. In Proceedings of the 2014 USENIX Annual Technical Conference (USENIX ATC'14). Philadelphia, PA, 489--500.
[19]
Ping Huang, Guanying Wu, Xubin He, and Weijun Xiao. 2014. An Aggressive Worn-out Flash Block Management Scheme to Alleviate SSD Performance Degradation. In Proceedings of the 9th European Conference on Computer Systems(Eurosys '14). Amsterdam, The Netherlands, 22:1--22:14.
[20]
Qi Huang, Ken Birman, Robbert van Renesse, Wyatt Lloyd, Sanjeev Kumar, and Harry C. Li. 2013. An Analysis of Facebook Photo Caching. In Proceedings of the 24th ACM Symposium on Operating Systems Principles(SOSP '13). Farminton, Pennsylvania, 167--181.
[21]
Sai Huang, Qingsong Wei, Dan Feng, Jianxi Chen, and Cheng Chen. 2013. Improving Flash-Based Disk Cache with Lazy Adaptive Replacement. In Proceedings of the IEEE 29th Symposium on Mass Storage Systems and Technologies(MSST'13). Long Beach, CA, 28:1--28:10.
[22]
Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, and Zeshan Chishti. 2016. Path Confidence Based Lookahead Prefetching. In Proceedings of the 49th Annual IEE/ACM International Symposium on Microarchitecture(MICRO '16). Taibei, Taiwan, 1--12.
[23]
Sangwook Kim, Hwanju Kim, Sang-Hoon Kimg, and Joonwon Lee adn Jinkyu Jeong. 2015. Request-Oriented Durable Write Caching for Application Performance. In Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC' 15). Santa Clara, CA, 193--206.
[24]
Cheng Li, Philip Shilane, Fred Douglis, Hyong Shim, Stephen Smaldone, and Grant Wallace. 2012. Nitro: A Capacity-Optimized SSD Cache for Primary Storage. In Proceedings of the 2014 USENIX Annual Technical Conference (USENIX ATC' 14). 501--512.
[25]
Cheng Li, Philip Shilane, Fred Douglis, and Grant Wallace. 2015. Pannier: A Container-based Flash Cache for Compound Objects. In Proceedings of the 16th Annual Middleware Conference(Middleware'15). New York, NY, USA, 50--62.
[26]
Wenji Li, Gregory Jean-Baptise, Juan Riveros, Giri Narasimhan, Tony Zhang, and Ming Zhao. 2016. CacheDedup: In-line Deduplication for Flash Caching. In Proceedings of the 14th USENIX Conference on File and Storage Technologies(FAST '16). Santa Clara, CA, 301--314.
[27]
Nimrod Megiddo and Dharmendra S. Modha. 2003. ARC: A Self-Tuning, Low Overhead Replacement Cache. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies(FAST'03). San Francisco, CA, 115--130.
[28]
Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, and Sanjeev Kumar. 2014. f4: Facebook's Warm BLOB Storage System. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation(OSDI'14). Broomfield, CO, 383--398.
[29]
Stavros Nikolaou, Robbert Van Renesse, and Nicolas Schiper. 2016. Proactive Cache Placement on Cooperative Client Caches for Online Social Networks. IEEE Transactions on Parallel and Distributed Systems 27, 4 (April 2016), 1174--1186.
[30]
Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, and Venkateshwaran Venkataramani. 2013. Scaling Memcache at Facebook. In Proceeding of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI '13). Lombard, IL, 385--398.
[31]
Elizabeth J. O'Neil, Patrick E. O'Neil, and Gerhard Weikum. 1993. The LRU-K Page Replacement Algorithm for Database Disk Buffering. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data(SIGMOD '93). New York, NY, USA, 297--306.
[32]
Qifan Pu, Haoyuan Li, Matei Zaharia, Ali Ghodsi, and Ion Stoica. 2016. FairRide: Near-Optimal, Fair Cache Sharing. In Proceedings of the 13 th USENIX Symposium on Networked Systems Design and Implementation (NSDI '16). Santa Clara, CA, 393--406.
[33]
Trausti Saemundsson, Hjortur Bjornsson, Gregory Chockler, and Ymir Vigfusson. 2014. Dynamic Performance Profiling of Cloud Caches. In Proceedings of the 5th ACM Symposium on Cloud Computing(SoCC'14). Seattle, WA, 28:1--28:14.
[34]
Mohit Saxena, Michael M.Swift, and Yiying Zhang. 2012. FlashTier: a Lightweight, Consistent and Durable Storage Cache. In Proceedings of the 7th ACM European Conference on Computer Systems(Eurosys '12). Bern, Switzerland, 267--280.
[35]
Zhaoyan Shen, Feng Chen, Yichen Jia, and Zili Shao. 2017. DIDACache: A Deep Integration of Device and Application for Flash Based Key-Value Caching. In Proceedings of the 15th USENIX Conference on File and Storage Technologies(FAST'17). Santa Clara, CA, 391--405.
[36]
Manjunath Shevgoor, Sahil Koladiya, Rajeev Balasubramonian, Chris Wilkerson, Seth H Pugsley, and Zeshan Chishti. 2015. Efficiently Prefetching Complex Address Patterns. In Proceedings of the 48th Annual IEE/ACM International Symposium on Microarchitecture(MICRO'15). Waikii, HI, 141--152.
[37]
Jiang Song and Xiaodong Zhang. 2001. LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance. In Proceedings of the 2002 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems(SIGMETRICS'02). Marina del Rey, CA, 31--42.
[38]
Ioan Stefanovici, Eno Thereska, Greg O'hea, Bianca Schroeder, Hitesh Ballani, Thomas Karagiannis, Antony Rowstron, and Tom Talpey. 2015. Software-Defined Caching: Managing Caches in Multi-tenant Data Centers. In Proceedings of the 6th ACM Symposium on Cloud Computing(SoCC '15). Hawaii, 174--181.
[39]
Linpeng Tang, Qi Huang, Wyatt Lloyd, Sanjeev Kumar, and Kai Li. 2015. RIPQ: Advanced Photo Caching on Flash for Facebook. In Proceedings the 13th USENIX Conference on File and Storage Technologies(FAST'15). Santa Clara, CA, 373--386.
[40]
Linpeng Tang, Qi Huang, Amit Puntambekar, Ymir Vigfusson, Wyatt Lloyd, and Kai Li. 2017. Popularity Prediction of Facebook Videos for Higher Quality Streaming. In Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC' 17). Santa Clara, CA, 111--123.
[41]
Jeffrey S. Vitter. 1985. Random Sampling with a Reservoir. ACM Trans. Math. Software 11, 1 (March 1985), 37--57.
[42]
Carl Waldspurger, Trausti Saemundson, Irfan Ahmad, and Nohhyun Park. 2017. Cache Modeling and Optimization using Miniature Simulations. In Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC' 17). Santa Clara, CA, 487--498.
[43]
Jiajun Wang, Reena Panda, and Lizy Kurian John. 2017. Prefetching for Cloud Workloads: An Analysis Based on Address Patterns. In Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software(ISPASS'17). Santa Rosa, CA, 163--172.
[44]
Ke Zhou, Shaofu Hu, Ping Huang, and Yuhong Zhao. 2017. LX-SSD: Enhancing the Lifespan of NAND Flash-based Memory via Recycling Invalid Pages. In Proceedings of the IEEE 33rd Symposium on Mass Storage Systems and Technologies(MSST' 17). Santa Clara, CA, 28:1--28:13.
[45]
Yuanyuan Zhou and James F. Philbin. 2001. The Multi-Queue Replacement Algorithm for Second Level Buffer Caches. In Proceedings of the 2001 USENIX Annual Technical Conference(USENIX ATC'01). Boston, MA, 91--104.

Cited By

View all
  • (2024)FLOWS: Balanced MRC Profiling for Heterogeneous Object-Size CacheProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650078(421-440)Online publication date: 22-Apr-2024
  • (2024)Beyond Belady to Attain a Seemingly Unattainable Byte Miss Ratio for Content Delivery NetworksIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345209635:11(1949-1963)Online publication date: Nov-2024
  • (2023)SHADEProceedings of the 21st USENIX Conference on File and Storage Technologies10.5555/3585938.3585947(135-151)Online publication date: 21-Feb-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICS '18: Proceedings of the 2018 International Conference on Supercomputing
June 2018
407 pages
ISBN:9781450357838
DOI:10.1145/3205289
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2018

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Caching
  2. Distributed Storage
  3. Performance Evaluation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICS '18
Sponsor:

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)66
  • Downloads (Last 6 weeks)5
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)FLOWS: Balanced MRC Profiling for Heterogeneous Object-Size CacheProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650078(421-440)Online publication date: 22-Apr-2024
  • (2024)Beyond Belady to Attain a Seemingly Unattainable Byte Miss Ratio for Content Delivery NetworksIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345209635:11(1949-1963)Online publication date: Nov-2024
  • (2023)SHADEProceedings of the 21st USENIX Conference on File and Storage Technologies10.5555/3585938.3585947(135-151)Online publication date: 21-Feb-2023
  • (2023)PC-Allocation: Performance Cliff-Aware Two-Level Cache Resource Allocation Scheme for Storage SystemApplied Sciences10.3390/app1306355613:6(3556)Online publication date: 10-Mar-2023
  • (2023)FIFO queues are all you need for cache evictionProceedings of the 29th Symposium on Operating Systems Principles10.1145/3600006.3613147(130-149)Online publication date: 23-Oct-2023
  • (2023)A delayed eviction caching replacement strategy with unified standard for edge serversComputer Networks: The International Journal of Computer and Telecommunications Networking10.1016/j.comnet.2023.109794230:COnline publication date: 1-Jul-2023
  • (2022)Lightweight Robust Size Aware Cache ManagementACM Transactions on Storage10.1145/350792018:3(1-23)Online publication date: 24-Aug-2022
  • (2022)Exploration and Exploitation for Buffer-Controlled HDD-Writes for SSD-HDD Hybrid Storage ServerACM Transactions on Storage10.1145/346541018:1(1-29)Online publication date: 29-Jan-2022
  • (2022)HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache2022 IEEE 38th International Conference on Data Engineering (ICDE)10.1109/ICDE53745.2022.00177(1754-1766)Online publication date: May-2022
  • (2022)Workload-aware storage policies for cloud object storageJournal of Parallel and Distributed Computing10.1016/j.jpdc.2022.01.026163(232-247)Online publication date: May-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media