article

Free access

Page placement algorithms for large real-indexed caches

Editor: Anita K. Jones Authors:

Mark D. HillAuthors Info & Claims

ACM Transactions on Computer Systems (TOCS), Volume 10, Issue 4

Pages 338 - 359

https://doi.org/10.1145/138873.138876

Published: 01 November 1992 Publication History

Abstract

When a computer system supports both paged virtual memory and large real-indexed caches, cache performance depends in part on the main memory page placement. To date, most operating systems place pages by selecting an arbitrary page frame from a pool of page frames that have been made available by the page replacement algorithm. We give a simple model that shows that this naive (arbitrary) page placement leads to up to 30% unnecessary cache conflicts. We develop several page placement algorithms, called careful-mapping algorithms, that try to select a page frame (from the pool of available page frames) that is likely to reduce cache contention. Using trace-driven simulation, we find that careful mapping results in 10–20% fewer (dynamic) cache misses than naive mapping (for a direct-mapped real-indexed multimegabyte cache). Thus, our results suggest that careful mapping by the operating system can get about half the cache miss reduction that a cache size (or associativity) doubling can.

References

[1]

AGARWAL, A., HOROWITZ, M., AND HENNESSY, J. An analytical cache model. ACM Trans. Comput. Syst. 7, 2 (May 1989), 184 215.

[2]

BABAOGLU, O., AND JoY, W. Converting a swap-based system to do paging m an architecture lacking page-referenced bits. In Proceedings of the 8th Symposium on Operating System Principles. 1981, pp. 78-86.

[3]

BAER, J., AND WANG, W. On the inclusion properties for multi-level cache hierarchies. In Procee&ngs of the 15th Annual International Symposium on Computer Architecture. 1988, pp. 73-80.

[4]

BORG, A., t~SSLER, R. E., L^ZAN& G., AND WALL, D.W. Long address traces from RISC machines: Generation and analysis. Res. Rep. 89/14, Western Research Laboratory, Digital Equipment Corporation, Palo Alto, Calif., 1989.

[5]

BORG, A., KESSLER, R. E., AND WALL, D.W. Generation and analysis of very long address traces. In Proceedings of the 17th Annual International Symposium on Computer Architecture. 1990, pp. 270-279.

[6]

BRAY, B. K., LYNCH, W. L., AND FLYNN, M. J. Page allocation to reduce access time of physical caches. Tech. Rep. CSL-TR-90-454, Computer Systems Laboratory, Stanford Univ., Stanford, Calif., 1990.

[7]

DENNING, P.J. The working set model for program behavior. Commun. ACM 11, 5 (May 1968), 323-333.

[8]

FERRARI, D. The improvement of program behavior. IEEE Comput. (Nov. 1976), 39 47.

[9]

GOODMAN, J.R. Coherency for multiprocessor virtual address caches. In Proceedings of the 2nd Internattonal Conference on Architectural Support for Programming' Languages and Operating Systems (Oct.). 1987, pp. 72-81.

[10]

GOODMAN, J. R., AND CHIANG, M. The use of static column RAM as a memory hierarchy In Proceedings of the llth Annual International Symposium on Computer Architecture. 1984, pp. 167-174.

[11]

HmL, M. D., AND SMITH, A J. Evaluating associatiwty in CPU caches. IEEE Trans. Comput. 38, 12 (Dec. 1989), 1612 1630.

[12]

HWU, W. W., AND CHANG, r. r. Achieving high instruction cache performance with an optimizing compiler. In Proceedings of the 1Gth International Symposium on Computer Architecture. 1989, pp. 242 251.

[13]

KAPLAN, K. R., AND WINDER, R.O. Cache-based computer systems. IEEE Comput. G, 3 (Mar. 1973), 30 36.

[14]

KELLY, E. Personal communications. 1990.

[15]

KESSLER, R.E. Analysis of multi-megabyte secondary CPU cache memories. Ph.D. thesis, Computer Sciences Tech. Rep. #1032, Univ. of Wisconsin--Madison, 1991.

[16]

KESSLER, R. E., AND HILL, M.D. Miss reduction in large, real-indexed caches. Computer Sciences Tech. Rep. #940, Univ. of Wisconsin--Madison, 1990.

[17]

L~, M. S., ROTHBE~G, E. E., AND WOLF, M.E. The cache performance and optimizations of blocked algorithms. In Proceedlngs of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems (Apr). 1991, pp. 63-74

[18]

MCFARLING, S. Program optimization for instruction caches. In Proceedings of the 3rd International Conference on Architectural Support for Programmzng Languages and Operating Systems (Apr.). 1989, pp. 183 191.

[19]

NIELSEN, M. J. K. Titan system manual. Res. Rep. 86/1, Western Research Laboratory, Digital Equipment Corporation, Palo Alto, Calif., 1986.

[20]

PRZYBYLSKI, S., HOROWITZ, M., AND HENNESSY, J. Characteristics of performance-optimal multi-level cache hierarchies. In Proceedings of the 16th Annual International Symposium on Computer Architecture. 1989, pp. 114-121.

[21]

SITES, R. L., AND AGARWAL, A. Multiprocessor cache analysis using ATUM. In Proceedings of the 14th Annual International Symposium on Computer Architecture. 1988, pp. 186 195.

[22]

SMITH, A.J. Cache memories. Comput. Surv. 14, 3 (Sept. 1982), 473 530.

[23]

STAMOS, J.W. Static grouping of small objects to enhance performance of a paged virtual memory. ACM Trans. Comput. Syst. 2, 2 (May 1984), 155 180.

[24]

TAYLOR, G., DAVIES, P., AND FARMWALD, M. The TLB slice--A low-cost high-speed address translation mechanism. In Proceedings of the 17th Annual International Symposium on Computer Architecture. 1990, pp. 355-363.

[25]

THmBAUT, D., AND STONE, H.S. Footprints in the cache. ACM Trans. Comp. Syst. 5, 4 (Nov. 1987), 305-329.

[26]

WANa, W., BAER, J., AND LEVY, H.M. Organization and performance of a two-level virtualreal cache hierarchy. In Proceedings of the 16th Annual International Sympostum on ComputerArch~tecture. 1989, pp. 140 148.

Cited By

Anand SFriedman MGiardino MAlonso GTsafrir DMusuvathi MGupta RAbu-Ghazaleh N(2024)Skip It: Take Control of Your Cache!Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640407(1077-1094)Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3620665.3640407
Lin YDeng QHan MFeng ZWang SPeng Q(2024)LAG-based schedulability analysis for preemptive global EDF scheduling with dynamic cache allocationJournal of Systems Architecture10.1016/j.sysarc.2023.103045147(103045)Online publication date: Feb-2024
https://doi.org/10.1016/j.sysarc.2023.103045
Mahmud FKim SChawla HKim ETsai CMuzahid A(2023)Attack of the Knights:Non Uniform Cache Side Channel AttackProceedings of the 39th Annual Computer Security Applications Conference10.1145/3627106.3627199(691-703)Online publication date: 4-Dec-2023
https://dl.acm.org/doi/10.1145/3627106.3627199
Show More Cited By

Index Terms

Recommendations

Improving Performance of Large Physically Indexed Caches by Decoupling Memory Addresses from Cache Addresses

Modern CPUs often use large physically indexed caches that are direct-mapped or have low associativities. Such caches do not interact well with virtual memory systems. An improperly placed physical page will end up in a wrong place in the cache, causing ...
Introducing hierarchy-awareness in replacement and bypass algorithms for last-level caches
PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques

The replacement policies for the last-level caches (LLCs) are usually designed based on the access information available locally at the LLC. These policies are inherently sub-optimal due to lack of information about the activities in the inner-levels of ...
Design and Optimization of Large Size and Low Overhead Off-Chip Caches

Large off-chip L3 caches can significantly improve the performance of memory-intensive applications. However, conventional L3 SRAM caches are facing two issues as those applications require increasingly large caches. First, an SRAM cache has a limited ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

Copyright © 1992 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 November 1992

Published in TOCS Volume 10, Issue 4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

231
Total Citations
View Citations
1,536
Total Downloads

Downloads (Last 12 months)174
Downloads (Last 6 weeks)41

Reflects downloads up to 21 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

Anand SFriedman MGiardino MAlonso GTsafrir DMusuvathi MGupta RAbu-Ghazaleh N(2024)Skip It: Take Control of Your Cache!Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640407(1077-1094)Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3620665.3640407
Lin YDeng QHan MFeng ZWang SPeng Q(2024)LAG-based schedulability analysis for preemptive global EDF scheduling with dynamic cache allocationJournal of Systems Architecture10.1016/j.sysarc.2023.103045147(103045)Online publication date: Feb-2024
https://doi.org/10.1016/j.sysarc.2023.103045
Mahmud FKim SChawla HKim ETsai CMuzahid A(2023)Attack of the Knights:Non Uniform Cache Side Channel AttackProceedings of the 39th Annual Computer Security Applications Conference10.1145/3627106.3627199(691-703)Online publication date: 4-Dec-2023
https://dl.acm.org/doi/10.1145/3627106.3627199
Gosakan KHan JKuszmaul WMubarek IMukherjee NSriram KTagliavini GWest EBender MBhattacharjee AConway AFarach-Colton MGandhi JJohnson RKannan SPorter DAamodt TJerger NSwift M(2023)Mosaic Pages: Big TLB Reach with Small PagesProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3582016.3582021(433-448)Online publication date: 25-Mar-2023
https://dl.acm.org/doi/10.1145/3582016.3582021
Shang XJia WShan JDing XBorcea C(2023)Reestablishing Page Placement Mechanisms for Nested VirtualizationIEEE Transactions on Cloud Computing10.1109/TCC.2023.3276368(1-12)Online publication date: 2023
https://doi.org/10.1109/TCC.2023.3276368
Wistoff NSchneider MGürkaynak FHeiser GBenini L(2023)Systematic Prevention of On-Core Timing Channels by Full Temporal PartitioningIEEE Transactions on Computers10.1109/TC.2022.321263672:5(1420-1430)Online publication date: 1-May-2023
https://dl.acm.org/doi/10.1109/TC.2022.3212636
Chen HWu XDai LLiu Y(2023)Brief Industry Paper: Latency-Driven Optimization of Instruction Blocks Orchestration on Memory2023 IEEE Real-Time Systems Symposium (RTSS)10.1109/RTSS59052.2023.00053(463-467)Online publication date: 5-Dec-2023
https://doi.org/10.1109/RTSS59052.2023.00053
Lin YSun JDeng QHan MFeng ZWang S(2023)LAG-Based Analysis for Preemptive Global Scheduling with Dynamic Cache Allocation2023 IEEE 29th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA)10.1109/RTCSA58653.2023.00022(107-116)Online publication date: 30-Aug-2023
https://doi.org/10.1109/RTCSA58653.2023.00022
Sison RBuckley SMurray TKlein GHeiser G(2023)Formalising the Prevention of Microarchitectural Timing Channels by Operating SystemsFormal Methods10.1007/978-3-031-27481-7_8(103-121)Online publication date: 6-Mar-2023
https://dl.acm.org/doi/10.1007/978-3-031-27481-7_8
Wan JBi YZhou ZLi Z(2022)MeshUp: Stateless Cache Side-channel Attack on CPU Mesh2022 IEEE Symposium on Security and Privacy (SP)10.1109/SP46214.2022.9833794(1506-1524)Online publication date: May-2022
https://doi.org/10.1109/SP46214.2022.9833794
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents