Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/166955.166974acmconferencesArticle/Chapter ViewAbstractPublication PagesmetricsConference Proceedingsconference-collections
Article
Free access

Efficient simulation of caches under optimal replacement with applications to miss characterization

Published: 01 June 1993 Publication History

Abstract

Cache miss characterization models such as the three Cs model are useful in developing schemes to reduce cache misses and their penalty. In this paper we propose the OPT model that uses cache simulation under optimal (OPT) replacement to obtain a finer and more accurate characterization of misses than the three Cs model. However, current methods for optimal cache simulation are slow and difficult to use. We present three new techniques for optimal cache simulation. First, we propose a limited lookahead strategy with error fixing, which allows one pass simulation of multiple optimal caches. Second, we propose a scheme to group entries in the OPT stack, which allows efficient tree based fully-associative cache simulation under OPT. Third, we propose a scheme for exploiting partial inclusion in set-associative cache simulation under OPT. Simulators based on these algorithms were used to obtain cache miss characterizations using the OPT model for nine SPEC benchmarks. The results indicate that miss ratios under OPT are substantially lower than those under LRU replacement, by up to 70% in fully-associative caches, and up to 32% in two-way set-associative caches.

References

[1]
A. Agarwal. Analysis of Cache Performance for Operating Systems and Mu{tiprogramming. PhD thesis, Stanford University, 1988. Available as Technical Report CSL-TR-87-332.
[2]
L. A. Belady. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal, 5(2):78-101, 1966.
[3]
B. T. Bennett and V. J. Kruskal. LRU stack processing. IBM J. of Research and Development, pages 353-357, July 1975.
[4]
D. Gannon, W. Jalby, and K. GalUvan. Strategies for cache and local memory management by global program transformation. J. of Par. and Dist. Comp., 5:587-616, 1988.
[5]
J. L. Hennessy emd D. A. Patterson. Computer Architecture -- A Quantitive Approach. Morgan Kaufmann Publishers Inc., 1990.
[6]
M. D. Hill. Aspects of Cache Memory and Instruction fornia, Berkeley, 1987. Available as Technical Report UCB/CSD 87/381.
[7]
M. D. Hill and A. J. Smith. Evaluating associativity in CPU caches. IEEE Trans. on Computers, 38(12):1612- 1630, December 1989.
[8]
W. W. Hwu and P. P. Chang. Achieving high instruction cache performance with an optimizing compiler. In Proc. of 16th Intl. Syrup. on Computer Architecture, pages 242-251, 1989.
[9]
W. W. Hwu and T. M. Conte. The susceptibility of programs to context switching. Technical Report CRHC- 91-14, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, April 1991.
[10]
N. P. Jouppi. Improving direct-mapped cache performmice by the addition of & small fully-associative cache and prefetch buffets. In Proc. of 17th Intl. Syrup. on Computer Architecture, pages 364-373, 1990.
[11]
Y. H. Kim, M. D. Hill, and D. A. Wood. Implementing stack simulation for highly-associative memories. In Proc. ACM SIGMETRICS Cont., pages 212-213, 1991.
[12]
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2):78-117, 1970.
[13]
S. McFarling. Program Analysis and Optimization for Machines with Instruction Cache. PhD thesis, Stanford University, 1988. Technical Report No. CSL-TR-91- 493.
[14]
S. McFarling. Program optimization for instruction caches. In Proceedings of ASPLOS III, 1989.
[15]
T. N. Mudge et al. The design of a microsupercomputer. IEEE Computer, pages 57-64, Jan 1991.
[16]
F. Olken. Efficient methods for calculating the success function of fixed space replacement policies. Technical Report LBL-12370, Lawrence Berkeley Laboratory, 1981.
[17]
D. D. Sleator and R. E. Taxjan. Serf adjusting binary search trees. J. of the A CM, 32(3):652-686, 1985.
[18]
H. S. Stone. High.Performance Computer Architecture. Addison-Wesley, 2"d edition, 1987.
[19]
R. A. Sugumar and S. G. Abraham. Efficient simulation of multiple cache configurations using binomial trees. Technical Report CSE-TR-111-91, CSE Division, University of Michigan, 1991.
[20]
R. A. Suguma~ and S. G. Abraham. Efficient simulation of caches under optimal replacement with applications to miss characterization. Technical Report CSE-TR- 143-92, CSE Division, University of Michigan, 1992.
[21]
D. Thiebaut. On the frsctal dimension of computer programs and its application to the prediction of the cache miss ratio. IEEE Trans. on Computers, 38(7):1012- 1026, July 1989.
[22]
J. G. Thompson. Efficient analysis of Caching Systems. PhD thesis, University of California, Berkeley, 1987.

Cited By

View all
  • (2022)CARL: Compiler Assigned Reference LeasingACM Transactions on Architecture and Code Optimization10.1145/349873019:1(1-28)Online publication date: 17-Mar-2022
  • (2022)Predicting Reuse Interval for Optimized Web Caching: An LSTM-Based Machine Learning ApproachSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00091(1-15)Online publication date: Nov-2022
  • (2022)Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO56248.2022.00017(19-34)Online publication date: Oct-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMETRICS '93: Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
June 1993
286 pages
ISBN:0897915801
DOI:10.1145/166955
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 1993

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMETRICS93
Sponsor:

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)99
  • Downloads (Last 6 weeks)16
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)CARL: Compiler Assigned Reference LeasingACM Transactions on Architecture and Code Optimization10.1145/349873019:1(1-28)Online publication date: 17-Mar-2022
  • (2022)Predicting Reuse Interval for Optimized Web Caching: An LSTM-Based Machine Learning ApproachSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00091(1-15)Online publication date: Nov-2022
  • (2022)Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO56248.2022.00017(19-34)Online publication date: Oct-2022
  • (2019)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3308809.330886046:1(100-102)Online publication date: 17-Jan-2019
  • (2019)Beating OPT with Statistical Clairvoyance and Variable Size CachingProceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3297858.3304067(243-256)Online publication date: 4-Apr-2019
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3292040.321965746:1(100-102)Online publication date: 12-Jun-2018
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationAbstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems10.1145/3219617.3219657(100-102)Online publication date: 12-Jun-2018
  • (2017)The CSI Framework for Compiler-Inserted Program InstrumentationProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/31545021:2(1-25)Online publication date: 19-Dec-2017
  • (2017)Optimal On-Line Computation of Stack Distances for MIN and OPTProceedings of the Computing Frontiers Conference10.1145/3075564.3075571(237-246)Online publication date: 15-May-2017
  • (2017)AutoMatch: An automated framework for relative performance estimation and workload distribution on heterogeneous HPC systems2017 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC.2017.8167754(32-42)Online publication date: Oct-2017
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media