Morphable Cache Architectures: Potential Benefits

Published: 01 August 2001

Abstract

Computer architects have tried to mitigate the consequences of high memory latencies using a variety of techniques. One example is the multi-level cache, which counteracts the latency that results from having a memory that is slower than the processor. Recent research has demonstrated that compiler optimizations that modify data layouts and restructure computation can be successful in improving memory system performance. However, in many cases, working with a fixed cache configuration prevents the application/compiler from obtaining the maximum performance. In addition, prompted by demands for portability, long battery life, and low-cost packaging, the computer industry has started viewing energy and power as decisive design factors, along with performance and cost. This makes the job of the compiler/user even more difficult, as one needs to strike a balance between low power/energy consumption and high performance. Consequently, adapting the code to the underlying cache/memory hierarchy is becoming more and more difficult.
In this paper, we take an alternate approach and attempt to adapt the cache architecture to the software needs. We focus on array-dominated applications and measure the potential benefits that could be gained from a morphable (reconfigurable) cache architecture. Our results show not only that different applications work best with different cache configurations, but also that different loop nests in a given application demand different configurations. Our results also indicate that the most suitable cache configuration for a given application or a single nest depends strongly on the objective function being optimized. For example, minimizing cache memory energy requires a different cache configuration for each nest than an objective that tries to minimize the overall memory system energy. Based on our experiments, we conclude that fine-grain (loop nest-level) cache configuration management is an important step toward a solution to the challenging architecture/software tradeoffs awaiting system designers in the future.
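
The dependence on the objective function can be illustrated with a small sketch. The code below is not the paper's methodology: the energy model (a fixed energy per cache access plus a fixed energy per miss served by main memory), the configuration labels, and every number in `PROFILE` are hypothetical values chosen only for illustration. It picks, for each loop nest, the configuration that minimizes a given objective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical per-nest profile for one cache configuration."""
    config: str         # illustrative label, e.g. "4KB/1-way"
    accesses: int       # cache accesses issued by the loop nest
    misses: int         # cache misses, each served by main memory
    e_access_nj: float  # assumed energy per cache access (nJ)


# Assumed energy per main-memory access (nJ); illustrative only.
E_MAIN_NJ = 50.0


def cache_energy(m: Measurement) -> float:
    """Objective 1: energy spent in the cache alone."""
    return m.accesses * m.e_access_nj


def memory_system_energy(m: Measurement) -> float:
    """Objective 2: cache energy plus main-memory energy caused by misses."""
    return cache_energy(m) + m.misses * E_MAIN_NJ


def plan(profile, objective):
    """Choose, per loop nest, the configuration minimizing the objective."""
    return {
        nest: min(measurements, key=objective).config
        for nest, measurements in profile.items()
    }


# Made-up profiles: nest1 misses heavily in the small cache, nest2 does not.
PROFILE = {
    "nest1": [
        Measurement("4KB/1-way", 1000, 200, 0.5),
        Measurement("16KB/2-way", 1000, 20, 1.0),
    ],
    "nest2": [
        Measurement("4KB/1-way", 1000, 10, 0.5),
        Measurement("16KB/2-way", 1000, 8, 1.0),
    ],
}

if __name__ == "__main__":
    print(plan(PROFILE, cache_energy))
    print(plan(PROFILE, memory_system_energy))
```

Under these assumed numbers, minimizing cache energy selects the small configuration for both nests, while minimizing overall memory system energy selects the larger configuration for `nest1` only, because its extra per-access cost is outweighed by the main-memory accesses it avoids.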

Published In

ACM SIGPLAN Notices, Volume 36, Issue 8
Aug. 2001
245 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/384196

LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems
August 2001
250 pages
ISBN: 1581134258
DOI: 10.1145/384197

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
