Comparing static and dynamic code scheduling for multiple-instruction-issue processors

Authors:

Pohua P. Chang,

William Y. Chen,

Scott A. Mahlke,

Wen-mei W. Hwu

MICRO 24: Proceedings of the 24th annual international symposium on Microarchitecture

Pages 25 - 33

https://doi.org/10.1145/123465.123471

Published: 01 September 1991

Index Terms

Comparing static and dynamic code scheduling for multiple-instruction-issue processors
1. General and reference
  1. Cross-computing tools and techniques
    1. Performance
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
  2. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Scheduling

Recommendations

A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors
PACT '97: Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques

Several modern superscalar processors contain an out of order (OOO) instruction issue mechanism, which resolves dependencies between instructions to expose greater instruction level parallelism (ILP). How to extend a traditional instruction scheduler to ...

Static Scheduling for Out-of-order Instruction Issue Processors
ACAC '00: Proceedings of the 5th Australasian Computer Architecture Conference

Superscalar processors strive to increase the number of instructions issued in each processor cycle. Compilers therefore need to expose as much Instruction Level Parallelism (ILP) as possible by using increasingly complex code optimizations. However, ...

Optimal instruction scheduling and register allocation for multiple-issue processors

Information & Contributors

Information

Published In

MICRO 24: Proceedings of the 24th annual international symposium on Microarchitecture

September 1991

223 pages

ISBN:0897914600

DOI:10.1145/123465

Chairman:
Yashwant K. Malaiya

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 20
Total Downloads: 600
Downloads (Last 12 months): 77
Downloads (Last 6 weeks): 13

Reflects downloads up to 21 Nov 2024

Cited By

Alonso T, Sutter G, López-Buedo S, de Vergara J (2023). Enhancing Conditional Stalling to Boost Performance of Stream-Processing Logic With RAW Dependencies. IEEE Transactions on Circuits and Systems II: Express Briefs 70:7 (2620-2624). Online publication date: Jul-2023.
https://doi.org/10.1109/TCSII.2023.3237736
Huang Z, Hilton A, Lee B (2016). Decoupling loads for nano-instruction set computers. ACM SIGARCH Computer Architecture News 44:3 (406-417). Online publication date: 18-Jun-2016.
https://dl.acm.org/doi/10.1145/3007787.3001181
Huang Z, Hilton A, Lee B, Min S, Loh G (2016). Decoupling loads for nano-instruction set computers. Proceedings of the 43rd International Symposium on Computer Architecture (406-417). Online publication date: 18-Jun-2016.
https://dl.acm.org/doi/10.1109/ISCA.2016.43
McFarlin D, Tucker C, Zilles C (2013). Discerning the dominant out-of-order performance advantage. ACM SIGPLAN Notices 48:4 (241-252). Online publication date: 16-Mar-2013.
https://dl.acm.org/doi/10.1145/2499368.2451143
McFarlin D, Tucker C, Zilles C (2013). Discerning the dominant out-of-order performance advantage. ACM SIGARCH Computer Architecture News 41:1 (241-252). Online publication date: 16-Mar-2013.
https://dl.acm.org/doi/10.1145/2490301.2451143
McFarlin D, Tucker C, Zilles C, Sarkar V, Bodik R (2013). Discerning the dominant out-of-order performance advantage. Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems (241-252). Online publication date: 16-Mar-2013.
https://dl.acm.org/doi/10.1145/2451116.2451143
Sato T (2005). Data dependence path reduction with tunneling load instructions. High Performance Computing (119-130). Online publication date: 9-Jun-2005.
https://doi.org/10.1007/BFb0024210
Spadini F, Fahs B, Patel S, Lumetta S, Johnson R, Conte T, Hwu W (2003). Improving quasi-dynamic schedules through region slip. Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization (149-158). Online publication date: 23-Mar-2003.
https://dl.acm.org/doi/10.5555/776261.776278
Spadini F, Fahs B, Patel S, Lumetta S (2003). Improving quasi-dynamic schedules through region slip. International Symposium on Code Generation and Optimization, 2003. CGO 2003. (149-158). Online publication date: 2003.
https://doi.org/10.1109/CGO.2003.1191541
Zingirian N, Maresca M (2002). On the efficiency of image and video processing programs on instruction level parallel processors. Proceedings of the IEEE 90:7 (1230-1243). Online publication date: Jul-2002.
https://doi.org/10.1109/JPROC.2002.801445
