Article

Optimization for the Intel® Itanium® architecture register stack

Authors:

Daniel A. Connors,

Gerolf Hoflehner,

Dan Lavery

CGO '03: Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization

Pages 115 - 124

Published: 23 March 2003

Abstract

The Intel® Itanium® architecture contains a number of innovative compiler-controllable features designed to exploit instruction-level parallelism. New code generation and optimization techniques are critical to applying these features to improve processor performance. For instance, the Itanium® architecture provides a compiler-controllable virtual register stack to reduce the penalty of memory accesses associated with procedure calls. The Itanium® Register Stack Engine (RSE) transparently manages the register stack, saving and restoring physical registers to and from memory as needed. Existing code generation techniques for the register stack aggressively allocate virtual registers without regard to the register pressure on different control-flow paths. As a result, applications with large data sets may stress the RSE and incur substantial execution delays due to the high number of register saves and restores. Since the Itanium® architecture is developed around Explicitly Parallel Instruction Computing (EPIC) concepts, solutions for increasing register stack efficiency favor code generation techniques rather than hardware approaches.
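The effect described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's method and not a faithful model of the RSE: the class `ToyRegisterStack`, the helper `simulate`, and the frame sizes are hypothetical, and the default of 96 physical stacked registers is used as an assumed capacity. It only shows how larger per-procedure frames on a deep call chain translate into more register saves and restores.

```python
class ToyRegisterStack:
    """Toy model of a register stack backed by memory (hypothetical, not the real RSE)."""

    def __init__(self, physical_regs=96):
        self.physical_regs = physical_regs  # assumed capacity of the stacked register file
        self.frames = []    # sizes of active frames, outermost first
        self.resident = []  # registers of each frame currently held in the register file
        self.spills = 0     # registers saved to memory
        self.fills = 0      # registers restored from memory

    def call(self, frame_size):
        """Allocate a frame for a callee, saving the oldest registers on overflow."""
        self.frames.append(frame_size)
        self.resident.append(frame_size)
        overflow = sum(self.resident) - self.physical_regs
        i = 0
        while overflow > 0:
            take = min(overflow, self.resident[i])
            self.resident[i] -= take
            self.spills += take
            overflow -= take
            i += 1

    def ret(self):
        """Release the current frame and restore the caller's frame if it was saved."""
        self.frames.pop()
        self.resident.pop()
        if self.frames:
            missing = self.frames[-1] - self.resident[-1]
            self.resident[-1] += missing
            self.fills += missing


def simulate(depth, frame_size, physical_regs=96):
    """Run a call chain of the given depth and return (spills, fills)."""
    stack = ToyRegisterStack(physical_regs)
    for _ in range(depth):
        stack.call(frame_size)
    for _ in range(depth):
        stack.ret()
    return stack.spills, stack.fills


if __name__ == "__main__":
    # Frame sized for the worst-case path versus a frame sized for the common path.
    print("frame of 24 registers:", simulate(20, 24))
    print("frame of 8 registers: ", simulate(20, 8))
```

In this model, a procedure that always requests the frame needed by its most register-hungry path pays for that frame on every call, even when the path is rarely taken. That is the kind of cost that allocation sensitive to individual control-flow paths would aim to reduce.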

Published In

CGO '03: Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization

March 2003

349 pages

ISBN: 076951913X

General Chairs: Richard Johnson (Transmeta), Tom Conte (NC State University)

Program Chair: Wen-mei Hwu (University of Illinois at Urbana-Champaign)

Copyright © 2003 Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Sponsors

IEEE Computer Society TC-uARCH
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

IEEE Computer Society

United States

Qualifiers

Article

Conference

CGO03

Sponsor:

SIGMICRO

CGO03: First Annual International IEEE/ACM Symposium on Code Generation and Optimization 2003

March 23 - 26, 2003

San Francisco, California, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,061 submissions, 29%
