DOI: 10.1145/318789.318806

Control flow optimization for supercomputer scalar processing

Published: 01 June 1989

Abstract

Control-intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications do. Function call/return and branch instructions disrupt the flow of instructions through the pipeline, degrading the utilization of the pipelined datapaths. This paper describes control flow optimization for scalar processing using an optimizing compiler. To obtain program control flow information, a system-independent profiler has been integrated into the IMPACT-I C compiler. The control flow information obtained is converted into a weighted control graph. Based on the weighted control graph, function inline expansion, multi-way branch layout, and software branch prediction can be implemented. Using better compiler technology results in a very low-cost hardware control unit (architecture) for high-performance scalar processors.
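
The abstract does not describe the profiler's output or the graph format, so the following is only a minimal sketch of the general idea, not the IMPACT-I implementation. It assumes a hypothetical profile given as a list of basic-block names in execution order, builds arc weights from it, and then picks the most frequent successor of each block as the software-predicted branch direction. The function names `build_weighted_graph` and `predict_branches` and the sample trace are illustrative assumptions.

```python
from collections import defaultdict


def build_weighted_graph(trace):
    """Count how often each (source, destination) block transition occurs.

    `trace` is a hypothetical profile: block names in execution order.
    """
    weights = defaultdict(int)
    for src, dst in zip(trace, trace[1:]):
        weights[(src, dst)] += 1
    return dict(weights)


def predict_branches(weights):
    """For each block with outgoing arcs, predict its heaviest successor."""
    best = {}
    for (src, dst), count in weights.items():
        if src not in best or count > best[src][1]:
            best[src] = (dst, count)
    return {src: dst for src, (dst, _) in best.items()}


# Illustrative trace of a loop that runs three iterations and then exits.
trace = ["entry", "loop", "body", "loop", "body", "loop", "body", "loop", "exit"]
weights = build_weighted_graph(trace)
prediction = predict_branches(weights)
```

In this sketch the arc from `loop` to `body` outweighs the arc from `loop` to `exit`, so the loop branch is predicted to continue. The same arc weights could, under similar assumptions, rank call sites for inline expansion or order the targets of a multi-way branch.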

Published In

ICS '89: Proceedings of the 3rd international conference on Supercomputing
June 1989
484 pages
ISBN:0897913094
DOI:10.1145/318789

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICS89

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Cited By

  • (1998) IMPACT. 25 years of the international symposia on Computer architecture (selected papers), pp. 408-417. DOI: 10.1145/285930.286000. Online publication date: 1-Aug-1998
  • (1992) Efficient Instruction Sequencing with Inline Target Insertion. IEEE Transactions on Computers 41:12, pp. 1537-1551. DOI: 10.1109/12.214662. Online publication date: 1-Dec-1992
  • (1991) IMPACT. ACM SIGARCH Computer Architecture News 19:3, pp. 266-275. DOI: 10.1145/115953.115979. Online publication date: 1-Apr-1991
  • (1991) IMPACT. Proceedings of the 18th annual international symposium on Computer architecture, pp. 266-275. DOI: 10.1145/115952.115979. Online publication date: 1-Apr-1991
  • (1991) Using profile information to assist classic code optimizations. Software—Practice & Experience 21:12, pp. 1301-1321. DOI: 10.1002/spe.4380211204. Online publication date: 1-Dec-1991
