research-article

Public Access

Fast Miss Ratio Curve Modeling for Storage Cache

Authors:

Chencheng YeAuthors Info & Claims

ACM Transactions on Storage (TOS), Volume 14, Issue 2

Article No.: 12, Pages 1 - 34

https://doi.org/10.1145/3185751

Published: 12 April 2018 Publication History

Abstract

The reuse distance (least recently used (LRU) stack distance) is an essential metric for performance prediction and optimization of storage cache. Over the past four decades, there have been steady improvements in the algorithmic efficiency of reuse distance measurement. This progress is accelerating in recent years, both in theory and practical implementation.

In this article, we present a kinetic model of LRU cache memory, based on the average eviction time (AET) of the cached data. The AET model enables fast measurement and use of low-cost sampling. It can produce the miss ratio curve in linear time with extremely low space costs. On storage trace benchmarks, AET reduces the time and space costs compared to former techniques. Furthermore, AET is a composable model that can characterize shared cache behavior through sampling and modeling individual programs or traces.

References

[1]

Arnold O. Allen. 2014. Probability, Statistics, and Queueing Theory. Academic Press.

[2]

George Almasi, Calin Cascaval, and David A. Padua. 2002. Calculating stack distances efficiently. In Proceedings of the ACM SIGPLAN Workshop on Memory System Performance. Berlin, Germany, 37--43.

Digital Library

[3]

Nathan Beckmann and Daniel Sanchez. 2017. Maximizing cache performance under uncertainty. In Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA’17). IEEE, 109--120.

[4]

Kristof Beyls and Erik H. D’Hollander. 2006. Discovery of locality-improving refactoring by reuse path analysis. In Proceedings of High Performance Computing and Communications. Springer. Lecture Notes in Computer Science, Vol. 4208. 220--229.

Digital Library

[5]

Hjortur Bjornsson, Gregory Chockler, Trausti Saemundsson, and Ymir Vigfusson. 2013. Dynamic performance profiling of cloud caches. In Proceedings of the 4th Annual Symposium on Cloud Computing. ACM, 59.

Digital Library

[6]

Jacob Brock, Yechen Li, Chencheng Ye, and Chen Ding. 2015. Optimal cache partition-sharing: Don’t ever take a fence down until you know why it was put up—Robert Frost. In Proceedings of the International Conference on Parallel Processing.

Digital Library

[7]

Mustafa Canim, George A. Mihaila, Bishwaranjan Bhattacharjee, Kenneth A. Ross, and Christian A. Lang. 2010. SSD bufferpool extensions for database systems. Proceedings of the VLDB Endowment 3, 1--2 (2010), 1435--1446.

Digital Library

[8]

Calin Cascaval, Evelyn Duesterwald, Peter F. Sweeney, and Robert W. Wisniewski. 2005. Multiple page size modeling and optimization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques. 339--349.

Digital Library

[9]

Dhruba Chandra, Fei Guo, Seongbeom Kim, and Yan Solihin. 2005. Predicting inter-thread cache contention on a chip multi-processor architecture. In Proceedings of the 11th International Symposium on High-Performance Computer Architecture (HPCA’11). IEEE, 340--351.

Digital Library

[10]

Zhifeng Chen, Yuanyuan Zhou, and Kai Li. 2003. Eviction-based cache placement for storage caches. In Proceedings of the USENIX Annual Technical Conference, General Track. 269--281.

[11]

Edward Grady Coffman and Peter J. Denning. 1973. Operating Systems Theory. Vol. 973. Prentice-Hall Englewood Cliffs, NJ.

Digital Library

[12]

Peter J. Denning. 1968. The working set model for program behavior. Commun. ACM 11, 5 (1968), 323--333.

Digital Library

[13]

Peter J. Denning. 1980. Working sets past and present. IEEE Trans. Software Eng.1 (1980), 64--84.

Digital Library

[14]

Peter J. Denning, Craig H. Martell, and Vint Cerf. 2015. Great Principles of Computing. MIT Press.

Digital Library

[15]

Peter J. Denning and Stuart C. Schwartz. 1972. Properties of the working-set model. Commun. ACM 15, 3 (1972), 191--198.

Digital Library

[16]

Peter J. Denning and Donald R. Slutz. 1978. Generalized working sets for segment reference strings. Commun. ACM 21, 9 (1978), 750--759.

Digital Library

[17]

Chen Ding, Xiaoya Xiang, Bin Bao, Hao Luo, Ying-Wei Luo, and Xiao-Lin Wang. 2014. Performance metrics and models for shared cache. Journal of Computer Science and Technology 29, 4 (2014), 692--712.

[18]

Zachary Drudi, Nicholas J. A. Harvey, Stephen Ingram, Andrew Warfield, and Jake Wires. 2015. Approximating hit rate curves using streaming algorithms. In LIPIcs-Leibniz International Proceedings in Informatics, Vol. 40. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.

[19]

E. Duesterwald, C. Cascaval, and S. Dwarkadas. 2003. Characterizing and predicting program behavior and its variability. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques. New Orleans, Louisiana.

Digital Library

[20]

David Eklov and Erik Hagersten. 2010. StatStack: Efficient modeling of LRU caches. In Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems 8 Software (ISPASS’10). IEEE, 55--65.

[21]

Éric Fusy, G. Olivier, and Frédéric Meunier. 2007. Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm. In Proceedings of the 2007 International Conference on Analysis of Algorithms (AofA’07).

[22]

Binny S. Gill. 2008. On multi-level exclusive caching: Offline optimality and why promotions are better than demotions. In Proceedings of the 6th USENIX Conference on File and Storage Technologies. USENIX Association, 4.

Digital Library

[23]

Xiameng Hu, Xiaolin Wang, Yechen Li, Lan Zhou, Yingwei Luo, Chen Ding, Song Jiang, and Zhenlin Wang. 2015. LAMA: Optimized locality-aware memory allocation for key-value cache. In Proceedings of USENIX Annual Technical Conference.

Digital Library

[24]

Song Jiang and Xiaodong Zhang. 2002. LIRS: An efficient low inter-reference recency set replacement policy to improve buffer cache performance. ACM SIGMETRICS Perform. Eval. Rev. 30, 1 (2002), 31--42.

Digital Library

[25]

Yunlian Jiang, Eddy Z. Zhang, Kai Tian, and Xipeng Shen. 2010. Is reuse distance applicable to data locality analysis on chip multiprocessors? In Compiler Construction. Springer, 264--282.

Digital Library

[26]

Taeho Kgil and Trevor Mudge. 2006. FlashCache: A NAND flash memory file cache for low power web servers. In Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems. ACM, 103--112.

Digital Library

[27]

Yul H. Kim, Mark D. Hill, and David A. Wood. 1991. Implementing stack simulation for highly-associative memories. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems. 212--213.

Digital Library

[28]

R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger. 1970. Evaluation techniques for storage hierarchies. IBM Syst. J. 9, 2 (1970), 78--117.

Digital Library

[29]

Nimrod Megiddo and Dharmendra S. Modha. 2003. ARC: A self-tuning, low overhead replacement cache. In FAST, Vol. 3. 115--130.

Digital Library

[30]

Dushyanth Narayanan, Austin Donnelly, and Antony Rowstron. 2008. Write off-loading: Practical power management for enterprise storage. ACM Trans. Storage (TOS) 4, 3 (2008), 10.

Digital Library

[31]

Frank Olken. 1981a. Efficient Methods for Calculating the Success Function of Fixed-space Replacement Policies. Technical Report. Lawrence Berkeley Lab, CA.

[32]

F. Olken. 1981b. Efficient Methods for Calculating the Success Function of Fixed Space Replacement Policies. Technical Report LBL-12370. Lawrence Berkeley Laboratory.

[33]

Derek L. Schuff, Milind Kulkarni, and Vijay S. Pai. 2010. Accelerating multicore reuse distance analysis with sampling and parallelization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques. 53--64.

Digital Library

[34]

Xipeng Shen, Jonathan Shaw, Brian Meeker, and Chen Ding. 2007. Locality approximation using time. In ACM SIGPLAN Notices, Vol. 42. ACM, 55--61.

Digital Library

[35]

Donald R. Slutz and Irving L. Traiger. 1974. A note on the calculation working set size. Commun. ACM 17, 10 (1974), 563--565.

Digital Library

[36]

G. Edward Suh, Srinivas Devadas, and Larry Rudolph. 2001. Analytical cache models with applications to cache partitioning. In Proceedings of the 15th International Conference on Supercomputing. ACM, 1--12.

Digital Library

[37]

G. Edward Suh, Srinivas Devadas, and Larry Rudolph. 2014. Analytical cache models with applications to cache partitioning. In Proceedings of the 25th Anniversary International Conference on Supercomputing Anniversary Volume. ACM, 323--334.

Digital Library

[38]

David K. Tam, Reza Azimi, Livio Soares, and Michael Stumm. 2009. RapidMRC: Approximating L2 miss rate curves on commodity systems for online optimizations. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 121--132.

Digital Library

[39]

Jeffrey S. Vitter. 1985. Random sampling with a reservoir. ACM Trans. Math. Software (TOMS) 11, 1 (1985), 37--57.

Digital Library

[40]

Carl Waldspurger, Trausti Saemundsson, Irfan Ahmad, and Nohhyun Park. 2017. Cache modeling and optimization using miniature simulations. In Proceedings of USENIX ATC. 487--498.

Digital Library

[41]

Carl A. Waldspurger, Nohhyun Park, Alexander Garthwaite, and Irfan Ahmad. 2015. Efficient MRC construction with SHARDS. In Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST’15). USENIX Association, 95--110.

Digital Library

[42]

Xiaolin Wang, Yechen Li, Yingwei Luo, Xiameng Hu, Jacob Brock, Chen Ding, and Zhenlin Wang. 2015. Optimal footprint symbiosis in shared cache. In CCGRID.

[43]

Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, Andrew Warfield, and Coho Data. 2014. Characterizing storage workloads with counter stacks. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 335--349.

Digital Library

[44]

Theodore M. Wong and John Wilkes. 2002. My cache or yours? Making storage more exclusive. In Proceedings of the USENIX Annual Technical Conference, General Track. 161--175.

Digital Library

[45]

Xiaoya Xiang, Bin Bao, Tongxin Bai, Chen Ding, and Trishul M. Chilimbi. 2011a. All-window profiling and composable models of cache sharing. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 91--102.

Digital Library

[46]

Xiaoya Xiang, Bin Bao, Chen Ding, and Yaoqing Gao. 2011b. Linear-time modeling of program working set in shared cache. In Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques (PACT’11). IEEE, 350--360.

Digital Library

[47]

Xiaoya Xiang, Chen Ding, Hao Luo, and Bin Bao. 2013. HOTL: A higher order theory of locality. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 343--356.

Digital Library

[48]

Gala Yadgar, Michael Factor, Kai Li, and Assaf Schuster. 2008. Mc2: Multiple clients on a multilevel cache. In Proceedings of the 28th International Conference on Distributed Computing Systems (ICDCS’08). IEEE, 722--730.

Digital Library

[49]

Xiao Zhang, Sandhya Dwarkadas, and Kai Shen. 2009. Towards practical page coloring-based multicore cache management. In Proceedings of the 4th ACM European Conference on Computer Systems. ACM, 89--102.

Digital Library

[50]

Yutao Zhong and Wentao Chang. 2008. Sampling-based program locality approximation. In Proceedings of the International Symposium on Memory Management. 91--100.

Digital Library

[51]

Yutao Zhong, Xipeng Shen, and Chen Ding. 2009. Program locality analysis using reuse distance. ACM Trans. Program. Lang. Syst. (TOPLAS) 31, 6 (2009), 20.

Digital Library

[52]

Pin Zhou, Vivek Pandey, Jagadeesan Sundaresan, Anand Raghuraman, Yuanyuan Zhou, and Sanjeev Kumar. 2004. Dynamic tracking of page miss ratio curve for memory management. In ACM SIGOPS Operating Systems Review, Vol. 38. ACM, 177--188.

Digital Library

Cited By

Shakiba KSultan SStumm MMa XWon Y(2024)KosmoProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650703(89-106)Online publication date: 27-Feb-2024
https://dl.acm.org/doi/10.5555/3650697.3650703
Liu FZhu YSun SDing CSmith WHosseini K(2024)Parallel Loop Locality Analysis for Symbolic Thread CountsProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676948(219-232)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3656019.3676948
Wang PLiu YLiu ZZhao ZLiu KZhou KHuang Z(2024)-LAP: A Lightweight and Adaptive Cache Partitioning Scheme With Prudent Resizing Decisions for Content Delivery NetworksIEEE Transactions on Cloud Computing10.1109/TCC.2024.342045412:3(942-953)Online publication date: Jul-2024
https://doi.org/10.1109/TCC.2024.3420454
Show More Cited By

Index Terms

Fast Miss Ratio Curve Modeling for Storage Cache
1. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
      1. Modeling methodologies
2. Information systems
  1. Information storage systems
    1. Storage management
      1. Hierarchical storage management

Recommendations

mPart: miss-ratio curve guided partitioning in key-value stores
ISMM 2018: Proceedings of the 2018 ACM SIGPLAN International Symposium on Memory Management

Web applications employ key-value stores to cache the data that is most commonly accessed. The cache improves an web application's performance by serving its requests from memory, avoiding fetching them from the backend database. Since the memory space ...
Reshaping cache misses to improve row-buffer locality in multicore systems
PACT '13: Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Optimizing cache locality has always been important since the emergence of caches, and numerous cache locality optimization schemes have been published in compiler literature. However, in modern architectures, cache locality is not the only factor that ...
Cache miss behavior: is it √2?
CF '06: Proceedings of the 3rd conference on Computing frontiers

It has long been empirically observed that the cache miss rate decreased as a power law of cache size, where the power was approximately -1/2. In this paper, we examine the dependence of the cache miss rate on cache size both theoretically and through ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Storage

ACM Transactions on Storage Volume 14, Issue 2

May 2018

210 pages

ISSN:1553-3077

EISSN:1553-3093

DOI:10.1145/3208078

Editor:
Sam H. Noh
Ulsan National Institute of Science and Technology, Ulsan, Korea

Issue’s Table of Contents

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 April 2018

Accepted: 01 January 2018

Revised: 01 January 2018

Received: 01 February 2017

Published in TOS Volume 14, Issue 2

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Science Foundation
IBM CAS Faculty Fellowship
National Science Foundation of China
Shenzhen Key Research

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

26
Total Citations
View Citations
1,089
Total Downloads

Downloads (Last 12 months)170
Downloads (Last 6 weeks)21

Reflects downloads up to 26 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Shakiba KSultan SStumm MMa XWon Y(2024)KosmoProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650703(89-106)Online publication date: 27-Feb-2024
https://dl.acm.org/doi/10.5555/3650697.3650703
Liu FZhu YSun SDing CSmith WHosseini K(2024)Parallel Loop Locality Analysis for Symbolic Thread CountsProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676948(219-232)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3656019.3676948
Wang PLiu YLiu ZZhao ZLiu KZhou KHuang Z(2024)-LAP: A Lightweight and Adaptive Cache Partitioning Scheme With Prudent Resizing Decisions for Content Delivery NetworksIEEE Transactions on Cloud Computing10.1109/TCC.2024.342045412:3(942-953)Online publication date: Jul-2024
https://doi.org/10.1109/TCC.2024.3420454
Liu SZhang CNie SDuan KWu W(2023)PC-Allocation: Performance Cliff-Aware Two-Level Cache Resource Allocation Scheme for Storage SystemApplied Sciences10.3390/app1306355613:6(3556)Online publication date: 10-Mar-2023
https://doi.org/10.3390/app13063556
Ding CReber BPatru DKneipp AFigorito M(2023)E=MC^2: Efficient Mobility Centric CachingProceedings of the International Symposium on Memory Systems10.1145/3631882.3631892(1-5)Online publication date: 2-Oct-2023
https://dl.acm.org/doi/10.1145/3631882.3631892
Ajdari MPeykani Sani PMoradi AKhanalizadeh Imani MBazkhanei AAsadi HAamodt TJerger NSwift M(2023)Re-architecting I/O Caches for Emerging Fast Storage DevicesProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3582016.3582041(542-555)Online publication date: 25-Mar-2023
https://dl.acm.org/doi/10.1145/3582016.3582041
Wang YYang JWang Z(2023)Multi-Tenant In-Memory Key-Value Cache Partitioning Using Efficient Random Sampling-Based LRU ModelIEEE Transactions on Cloud Computing10.1109/TCC.2023.330088911:4(3601-3618)Online publication date: Oct-2023
https://doi.org/10.1109/TCC.2023.3300889
Gomes CHempstead M(2023)CInC: Workload Characterization In Context of Resource Contention2023 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC59245.2023.00035(201-205)Online publication date: 1-Oct-2023
https://doi.org/10.1109/IISWC59245.2023.00035
Ding CChen DLiu FReber BSmith W(2022)CARL: Compiler Assigned Reference LeasingACM Transactions on Architecture and Code Optimization10.1145/349873019:1(1-28)Online publication date: 17-Mar-2022
https://dl.acm.org/doi/10.1145/3498730
Athni CVinod Chippalkatti VNandakumar ANandana APavitra Y(2022)Improved Cache Replacement Policy based on Recency Time Re-Reference Interval Prediction2022 IEEE 7th International conference for Convergence in Technology (I2CT)10.1109/I2CT54291.2022.9824298(1-6)Online publication date: 7-Apr-2022
https://doi.org/10.1109/I2CT54291.2022.9824298
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Figures

Tables

Media

View Issue’s Table of Contents