Rethinking belady's algorithm to accommodate prefetching
2018 ACM/IEEE 45th Annual International Symposium on Computer …, 2018•ieeexplore.ieee.org
This paper shows that in the presence of data prefetchers, cache replacement policies are
faced with a large unexplored design space. In particular, we observe that while Belady's
MIN algorithm minimizes the total number of cache misses-including those for prefetched
lines-it does not minimize the number of demand misses. To address this shortcoming, we
introduce Demand-MIN, a variant of Belady's algorithm that minimizes the number of
demand misses at the cost of increased prefetcher traffic. Together, MIN and Demand-MIN …
faced with a large unexplored design space. In particular, we observe that while Belady's
MIN algorithm minimizes the total number of cache misses-including those for prefetched
lines-it does not minimize the number of demand misses. To address this shortcoming, we
introduce Demand-MIN, a variant of Belady's algorithm that minimizes the number of
demand misses at the cost of increased prefetcher traffic. Together, MIN and Demand-MIN …
This paper shows that in the presence of data prefetchers, cache replacement policies are faced with a large unexplored design space. In particular, we observe that while Belady's MIN algorithm minimizes the total number of cache misses-including those for prefetched lines-it does not minimize the number of demand misses. To address this shortcoming, we introduce Demand-MIN, a variant of Belady's algorithm that minimizes the number of demand misses at the cost of increased prefetcher traffic. Together, MIN and Demand-MIN define the boundaries of an important design space, with many intermediate points lying between them. To reason about this design space, we introduce a simple conceptual framework, which we use to define a new cache replacement policy called Harmony. Our empirical evaluation shows that for a mix of SPEC 2006 benchmarks running on a 4-core system with a stride prefetcher, Harmony improves IPC by 7.7% over an LRU baseline, compared to 3.7% for the previous state-of-the-art. On an 8-core system, Harmony improves IPC by 9.4% compared to 4.4% for the previous state-of-the-art.
ieeexplore.ieee.org