Morphable Cache Architectures: Potential Benefits

Authors:

N. Vijaykrishnan,

J. Ramanujam

ACM SIGPLAN Notices, Volume 36, Issue 8

Pages 128 - 137

https://doi.org/10.1145/384196.384215

Published: 01 August 2001

Abstract

Computer architects have tried to mitigate the consequences of high memory latencies using a variety of techniques. One example is the multi-level cache, which counteracts the latency that results from memory being slower than the processor. Recent research has demonstrated that compiler optimizations that modify data layouts and restructure computation can be successful in improving memory system performance. However, in many cases, working with a fixed cache configuration prevents the application/compiler from obtaining the maximum performance. In addition, prompted by demands for portability, long battery life, and low-cost packaging, the computer industry has started viewing energy and power as decisive design factors, along with performance and cost. This makes the job of the compiler/user even more difficult, as one needs to strike a balance between low power/energy consumption and high performance. Consequently, adapting the code to the underlying cache/memory hierarchy is becoming increasingly difficult.

In this paper, we take an alternate approach and attempt to adapt the cache architecture to the software needs. We focus on array-dominated applications and measure the potential benefits that could be gained from a morphable (reconfigurable) cache architecture. Our results show not only that different applications work best with different cache configurations, but also that different loop nests within a given application demand different configurations. Our results also indicate that the most suitable cache configuration for a given application or a single nest depends strongly on the objective function being optimized. For example, minimizing cache memory energy requires a different cache configuration for each nest than minimizing the overall memory system energy does. Based on our experiments, we conclude that fine-grain (loop nest-level) cache configuration management is an important step toward a solution to the challenging architecture/software tradeoffs awaiting system designers in the future.
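As a rough illustration of the claim that the best configuration depends on both the loop nest and the objective function, the sketch below picks a configuration per nest by minimizing a chosen objective. It is not the authors' method. The nest names, configuration labels, energy numbers, and the helper names `Estimate`, `cache_only`, `whole_memory_system`, and `best_config_per_nest` are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical per-nest cost estimate for one cache configuration."""
    cache_energy: float   # energy spent inside the cache (arbitrary units)
    memory_energy: float  # energy spent in main memory due to misses (arbitrary units)


def cache_only(e: Estimate) -> float:
    """Objective 1 (illustrative): cache energy alone."""
    return e.cache_energy


def whole_memory_system(e: Estimate) -> float:
    """Objective 2 (illustrative): cache energy plus main memory energy."""
    return e.cache_energy + e.memory_energy


def best_config_per_nest(estimates, objective):
    """Map each loop nest to the configuration that minimizes the objective."""
    return {
        nest: min(configs, key=lambda name: objective(configs[name]))
        for nest, configs in estimates.items()
    }


# Made-up numbers, chosen only to show the effect; they are not measurements.
ESTIMATES = {
    "nest_1": {
        "4KB/1-way": Estimate(cache_energy=1.0, memory_energy=9.0),
        "16KB/2-way": Estimate(cache_energy=3.0, memory_energy=2.0),
    },
    "nest_2": {
        "4KB/1-way": Estimate(cache_energy=1.0, memory_energy=1.5),
        "16KB/2-way": Estimate(cache_energy=3.0, memory_energy=1.0),
    },
}

by_cache = best_config_per_nest(ESTIMATES, cache_only)
by_system = best_config_per_nest(ESTIMATES, whole_memory_system)
```

With these invented numbers, the cache-only objective selects the small cache for both nests, while the whole-system objective selects the larger cache for `nest_1` and keeps the small one for `nest_2`. In a real study the estimates would come from simulation or measurement rather than a hand-written table.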


Index Terms

Morphable Cache Architectures: Potential Benefits
1. Hardware
  1. Hardware validation
  2. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory
2. Software and its engineering
  1. Software notations and tools
    1. Compilers


Information & Contributors

Information

Published In

ACM SIGPLAN Notices Volume 36, Issue 8

Aug. 2001

245 pages

ISSN: 0362-1340

EISSN: 1558-1160

DOI: 10.1145/384196

Chairman: Ron K. Cytron, Washington Univ., St. Louis, MO

Editor: Cindy Norris, Appalachian State Univ., Boone, NC

LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems
August 2001
250 pages
ISBN: 1581134258
DOI: 10.1145/384197
Chairmen: Seongsoo Hong, Snow Bird, UT; Santosh Pande, Snow Bird, UT

Copyright © 2001 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article
