Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3366423.3380176acmconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

PG2S+: Stack Distance Construction Using Popularity, Gap and Machine Learning

Published: 20 April 2020 Publication History

Abstract

Stack distance characterizes temporal locality of workloads and plays a vital role in cache analysis since the 1970s. However, exact stack distance calculation is too costly, and impractical for online use. Hence, much work was done to optimize the exact computation, or approximate it through sampling or modeling.
This paper introduces a new approximation technique PG2S that is based on reference popularity and gap distance. This approximation is exact under the Independent Reference Model (IRM). The technique is further extended, using machine learning, to PG2S+ for non-IRM reference patterns. Extensive experiments show that PG2S+ is much more accurate and robust than other state-of-the-art algorithms for determining stack distance.
PG2S+ is the first technique to exploit the strong correlation among reference popularity, gap distance and stack distance.

References

[1]
George Almási, Cǎlin Caşcaval, and David A. Padua. 2002. Calculating Stack Distances Efficiently. In MSP. 37–43.
[2]
B. T. Bennett and V. J. Kruskal. 1975. LRU Stack Processing. IBM Journal of Research and Development 19, 4 (1975), 353–357.
[3]
E. Berg and E. Hagersten. 2004. StatCache: a probabilistic approach to efficient and accurate data locality analysis. In IEEE ISPASS, 2004. 20–27.
[4]
Daniel Byrne. 2018. A Survey of Miss-Ratio Curve Construction Techniques. CoRR abs/1804.01972(2018).
[5]
Asaf Cidon, Assaf Eisenman, Mohammad Alizadeh, and Sachin Katti. 2016. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches. In NSDI. 379–392.
[6]
Thomas M. Conte, Mary Ann Hirsch, and W-MW Hwu. 1998. Combining trace sampling with single pass methods for efficient cache simulation. IEEE Trans. Comput. 47, 6 (1998), 714–720.
[7]
Chen Ding and Yutao Zhong. 2003. Predicting Whole-program Locality Through Reuse Distance Analysis. SIGPLAN Not. 38, 5 (May 2003), 245–257.
[8]
D. Eklov and E. Hagersten. 2010. StatStack: Efficient modeling of LRU caches. In IEEE ISPASS. 55–65.
[9]
L. He, Z. Yu, and H. Jin. 2012. FractalMRC: Online Cache Miss Rate Curve Prediction on Commodity Systems. In IEEE IPDPS. 1341–1351.
[10]
Xiameng Hu, Xiaolin Wang, Yechen Li, Lan Zhou, Yingwei Luo, Chen Ding, Song Jiang, and Zhenlin Wang. 2015. LAMA: Optimized Locality-aware Memory Allocation for Key-value Cache. In 2015 USENIX Annual Technical Conference (USENIX ATC 15). 57–69.
[11]
Xiameng Hu, Xiaolin Wang, Lan Zhou, Yingwei Luo, Chen Ding, and Zhenlin Wang. 2016. Kinetic Modeling of Data Eviction in Cache. In Proceedings of the 2016 USENIX Conference on Usenix Annual Technical Conference. 351–364.
[12]
Cliff Young Kaz Sato and David Patterson. 2017. An in-depth look at Google’s first Tensor Processing Unit (TPU). https://cloud.google.com/blog/products/gcp/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
[13]
Yul H. Kim, Mark D. Hill, and David A. Wood. 1991. Implementing Stack Simulation for Highly-associative Memories. SIGMETRICS Perform. Eval. Rev. 19, 1 (April 1991), 212–213.
[14]
Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In SIGMOD(SIGMOD ’18). ACM, New York, NY, USA, 489–504.
[15]
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. 1970. Evaluation techniques for storage hierarchies. IBM Systems Journal 9, 2 (1970), 78–117.
[16]
Dushyanth Narayanan, Austin Donnelly, and Antony Rowstron. 2008. Write off-loading: Practical power management for enterprise storage. ACM TOS 4, 3 (2008), 10.
[17]
Qingpeng Niu, James Dinan, Qingda Lu, and Ponnuswamy Sadayappan. 2012. PARDA: A fast parallel reuse distance analysis algorithm. In IEEE IPDPS. 1284–1294.
[18]
Frank Olken. 1983. Efficient methods for calculating the success function of fixed space replacement polities. Performance Evaluation 3 (05 1983).
[19]
G. S. Paschos, G. Iosifidis, M. Tao, D. Towsley, and G. Caire. 2018. The Role of Caching in Future Communication Systems and Networks. IEEE Journal on Selected Areas in Communications 36, 6 (June 2018), 1111–1125.
[20]
Pavlos Petoumenos, Georgios Keramidas, Håkan Zeffer, Stefanos Kaxiras, and Erik Hagersten. 2006. Modeling Cache Sharing on Chip Multiprocessor Architectures. In IEEE IISWC. 160–171.
[21]
Moinuddin K. Qureshi and Yale N. Patt. 2006. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches. In International Symposium on Microarchitecture. 423–432.
[22]
Trausti Saemundsson, Hjortur Bjornsson, Gregory Chockler, and Ymir Vigfusson. 2014. Dynamic Performance Profiling of Cloud Caches. In SOCC. 28:1–28:14.
[23]
D. L. Schuff, M. Kulkarni, and V. S. Pai. 2010. Accelerating multicore reuse distance analysis with sampling and parallelization. In PACT. 53–63.
[24]
Xipeng Shen, Jonathan Shaw, Brian Meeker, and Chen Ding. 2007. Locality Approximation Using Time. SIGPLAN Not. 42, 1 (Jan. 2007), 55–61.
[25]
Rabin A. Sugumar and Santosh G. Abraham. 1991. Efficient Simulation of Multiple Cache Configurations using Binomial Trees. Technical Report.
[26]
David K. Tam, Reza Azimi, Livio B. Soares, and Michael Stumm. 2009. RapidMRC: Approximating L2 Miss Rate Curves on Commodity Systems for Online Optimizations. SIGPLAN Not. 44, 3, 121–132.
[27]
Y. C. Tay and Min Zou. 2006. A Page Fault Equation for Modeling the Effect of Memory Size. Perform. Eval. 63, 2 (Feb. 2006), 99–130.
[28]
Dominique Thiébaut. 1989. On the Fractal Dimension of Computer Programs and Its Application to the Prediction of the Cache Miss Ratio. IEEE Trans. Comput. 38, 7 (July 1989), 1012–1026.
[29]
Carl A. Waldspurger, Nohhyun Park, Alexander Garthwaite, and Irfan Ahmad. 2015. Efficient MRC Construction with SHARDS. In FAST. 95–110.
[30]
Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, and Andrew Warfield. 2014. Characterizing Storage Workloads with Counter Stacks. In OSDI. 335–349.
[31]
Liang Yuan, Chen Ding, Wesley Smith, Peter Denning, and Yunquan Zhang. 2019. A Relational Theory of Locality. ACM Trans. Archit. Code Optim.(2019).
[32]
Yutao Zhong and Wentao Chang. 2008. Sampling-based Program Locality Approximation. In ISMM. 91–100.
[33]
Yutao Zhong, Xipeng Shen, and Chen Ding. 2009. Program Locality Analysis Using Reuse Distance. ACM Trans. Program. Lang. Syst. 31, 6, Article 20 (Aug. 2009), 39 pages.
[34]
Pin Zhou, Vivek Pandey, Jagadeesan Sundaresan, Anand Raghuraman, Yuanyuan Zhou, and Sanjeev Kumar. 2004. Dynamic Tracking of Page Miss Ratio Curve for Memory Management. SIGOPS Oper. Syst. Rev. 38, 5 (Oct. 2004), 177–188.

Cited By

View all
  • (2024)TTLs Matter: Efficient Cache Sizing with TTL-Aware Miss Ratio Curves and Working Set SizesProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650066(387-404)Online publication date: 22-Apr-2024
  • (2023)Increment - and - Freeze: Every Cache, Everywhere, All of the TimeProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591085(129-139)Online publication date: 17-Jun-2023
  • (2022)Efficient Stack Distance Approximation Based on Workload CharacteristicsIEEE Access10.1109/ACCESS.2022.318032710(59792-59805)Online publication date: 2022

Index Terms

  1. PG2S+: Stack Distance Construction Using Popularity, Gap and Machine Learning
              Index terms have been assigned to the content through auto-classification.

              Recommendations

              Comments

              Please enable JavaScript to view thecomments powered by Disqus.

              Information & Contributors

              Information

              Published In

              cover image ACM Conferences
              WWW '20: Proceedings of The Web Conference 2020
              April 2020
              3143 pages
              ISBN:9781450370233
              DOI:10.1145/3366423
              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Sponsors

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              Published: 20 April 2020

              Permissions

              Request permissions for this article.

              Check for updates

              Qualifiers

              • Research-article
              • Research
              • Refereed limited

              Conference

              WWW '20
              Sponsor:
              WWW '20: The Web Conference 2020
              April 20 - 24, 2020
              Taipei, Taiwan

              Acceptance Rates

              Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

              Contributors

              Other Metrics

              Bibliometrics & Citations

              Bibliometrics

              Article Metrics

              • Downloads (Last 12 months)16
              • Downloads (Last 6 weeks)2
              Reflects downloads up to 13 Nov 2024

              Other Metrics

              Citations

              Cited By

              View all
              • (2024)TTLs Matter: Efficient Cache Sizing with TTL-Aware Miss Ratio Curves and Working Set SizesProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650066(387-404)Online publication date: 22-Apr-2024
              • (2023)Increment - and - Freeze: Every Cache, Everywhere, All of the TimeProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591085(129-139)Online publication date: 17-Jun-2023
              • (2022)Efficient Stack Distance Approximation Based on Workload CharacteristicsIEEE Access10.1109/ACCESS.2022.318032710(59792-59805)Online publication date: 2022

              View Options

              Get Access

              Login options

              View options

              PDF

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader

              HTML Format

              View this article in HTML Format.

              HTML Format

              Media

              Figures

              Other

              Tables

              Share

              Share

              Share this Publication link

              Share on social media