Article

Free access

Using a lookahead window in a compaction-based parallelizing compiler

Authors:

Toshio Nakatani,

Kemal EbcioğluAuthors Info & Claims

MICRO 23: Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture

Pages 57 - 68

Published: 30 November 1990 Publication History

Abstract

Lookahead is a common technique for high performance uniprocessor design. In general, however, hardware lookahead window is too small to exploit instruction-level parallelism at run time, while compaction-based parallelizing compilers must suffer from worst-case exponential code explosion at compile time. In this paper, we propose a software lookahead method, which allows inter-basic block code motions within the prespecified number of operations, called software lookahead window, on any path emanating from the currently processed instruction at compile time. By software lookahead, instruction-level parallelism can be exploited in a much greater code area than the hardware approach, but the lookahead region is still limited to a constant depth with a user-specifiable window, and thus code explosion is restricted. The proposed scheme has been implemented in our prototype parallelizing compiler, which can generate code for uniprocessors with multiple functional units and multiway conditional branches, such as VLIW machines, and potentially for superscalars as well. To study code explosion problem and instruction-level parallelism for branch intensive code, we compiled five AIX utilities: sort, fgrep, sed, yacc, and compress. It is demonstrated that, with software lookahead, code explosion problem is effectively alleviated, yet a substantial amount of inter-basic block parallelism is successfully extracted.

References

[1]

Aiken, A. (1988). Compaction-Based Parallelization, Ph.D. Dissertation, Department of Computer Science, Cornell University.

Digital Library

[2]

Aiken, A. and Nicolau, A. {1988}. Perfect Pipelining: A New Loop Parallelization Technique, In Proceedings of the 1988 European Symposium on Programming, pp. 221-235.

Digital Library

[3]

Ebcioglu, K. {1987}. Compilation Technique for Software Pipelining of Loops with Conditional Jumps, In Proceedings of the 20th Annual Workshop on Microprogramming, pp. 69-79, ACM Press.

Digital Library

[4]

Ebcioglu, K. (1988). Some Design Ideas for a VLIW Architecture for Sequential Natured Software, In Parallel Processing (Proceedings of IFIP WG 10.3 Working Conference on Parallel Processing), edited by M. Cosnard et al., North Holland.

[5]

Ebcioglu, K. and Nicolau, A. {1989} A Global Resource-Constrained Parallelization Technique, In Proceedings of the Third International Conference on Supercomputing, pp. 154-163, Crete.

Digital Library

[6]

Ebcioglu, K. and Nakatani, T. {1989}. A New Compilation Technique for Parallelizing Loops with Unpredictable Branches on a VLIW Architecture, In Languages and Compilers for Parallel Computing, edited by D. Gelernter et al., Research Monographs in Parallel and Distributed Computing, MIT Press.

Digital Library

[7]

Ellis, J. {1985}. Bulldog: A Compiler for VLIW Architectures, Ph.D. Dissertation, Department of Computer Science, Yale University.

Digital Library

[8]

Ferrante, J., Ottenstein, K., and Warren, J. {1987}. The Program Dependence Graph and Its Use in Optimization, ACM Transactions on Programming Languages and Systems, 9:3, pp. 319-349.

Digital Library

[9]

Fisher, J. {1981}. Trace Scheduling: A Technique for Global Microcode Compaction, IEEE Transactions on Computers, c-30:7, pp. 478-490.

Digital Library

[10]

Gupta, R. and Soffa, M. {1990}. Region Scheduling: An Approach for Detecting and Redistributing Parallelism, IEEE Transactions on Software Engineering, 16:4, pp. 421-431.

Digital Library

[11]

Hsu, P. {1985}. Highly Concurrent Scalar Processing, Ph.D. Dissertation, Department of Computer Science, University of Illinois at Urbana- Champaign.

Digital Library

[12]

Lam, M. {1988}. Software Pipelining: An Eflective Scheduling Technique for VLIW Machines, In Proceedings of the SIGPLAN 1988 Conference of Programming Language Design and Implementation, pp. 318-328, ACM Press.

Digital Library

[13]

Nakatani, T. and Ebcioglu, K. {1989}. "Combining" as u Compilation Technique for a VLIW Architecture, In Proceedings of the 22nd Annual International Workshop of Microprogramming and Microarchitecture, ACM and IEEE, pp. 43-55.

Digital Library

[14]

Nicolau, A. {1985}. Percolation Scheduling: A Parallel Compilation Technique, TR 85-678, Department of Computer Science, Cornell University.

Digital Library

[15]

Tomasulo, R. {1967}. An Efficient Algorithm for Exploiting Multiple Arithmetic Units, IBM Journal of Research and Development, 11:1, pp. 25-33.

Digital Library

[16]

Warren, S.H., Auslander, M.A., Chaitin, G.J., Chibib, A.C., Hopkins, M.E., and MacKay, A.L.{1986}. Final Code Generation in the PL.6 Compiler, Report No. RC 11974, IBM T.J. Watson Research Center.

Cited By

Aiken ANicolau ANovack S(1995)Resource-Constrained Software PipeliningIEEE Transactions on Parallel and Distributed Systems10.1109/71.4761676:12(1248-1270)Online publication date: 1-Dec-1995
https://dl.acm.org/doi/10.1109/71.476167
Dubey PAdams GFlynn M(1994)Instruction Window Size Trade-Offs and Characterization of Program ParallelismIEEE Transactions on Computers10.1109/12.27848143:4(431-442)Online publication date: 1-Apr-1994
https://dl.acm.org/doi/10.1109/12.278481
Ning QSlonim JGentleman W(1993)Register allocation for optimal loop schedulingProceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 210.5555/962367.962399(942-955)Online publication date: 24-Oct-1993
https://dl.acm.org/doi/10.5555/962367.962399
Show More Cited By

Using a lookahead window in a compaction-based parallelizing compiler
1. Software and its engineering
  1. Software notations and tools
2. Theory of computation
  1. Models of computation
    1. Concurrency

Recommendations

Using a lookahead window in a compaction-based parallelizing compiler

Lookahead is a common technique for high performance uniprocessor design. In general, however, hardware lookahead window is too small to exploit instruction-level parallelism at run time, while compaction-based parallelizing compilers must suffer from ...
Performance of OSCAR multigrain parallelizing compiler on SMP servers
LCPC'04: Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing

This paper describes performance of OSCAR multigrain parallelizing compiler on various SMP servers, such as IBM pSeries 690, Sun Fire V880, Sun Ultra 80, NEC TX7/i6010 and SGI Altix 3700. The OSCAR compiler hierarchically exploits the coarse grain task ...
Tuning Compiler Optimizations for Simultaneous Multithreading
Special issue on the 30th annual ACM/IEEE international symposium on microarchitecture, part II

Simultaneous Multithreading (SMT) is a processor architectural technique that promises to significantly improve the utilization and performance of modern wide-issue superscalar processors. An SM T processor is capable of issuing multiple instructions ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

MICRO 23: Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture

November 1990

299 pages

ISBN:0897914139

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS: Computer Society

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 30 November 1990

Check for updates

Qualifiers

Article

Conference

MICRO90

Sponsor:

SIGMICRO
IEEE-CS

MICRO90: 23nd Annual Workshop on Microprogramming and Microarchitecture

November 27 - 29, 1990

Florida, Orlando, USA

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

12
Total Citations
View Citations
206
Total Downloads

Downloads (Last 12 months)15
Downloads (Last 6 weeks)3

Reflects downloads up to 19 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Aiken ANicolau ANovack S(1995)Resource-Constrained Software PipeliningIEEE Transactions on Parallel and Distributed Systems10.1109/71.4761676:12(1248-1270)Online publication date: 1-Dec-1995
https://dl.acm.org/doi/10.1109/71.476167
Dubey PAdams GFlynn M(1994)Instruction Window Size Trade-Offs and Characterization of Program ParallelismIEEE Transactions on Computers10.1109/12.27848143:4(431-442)Online publication date: 1-Apr-1994
https://dl.acm.org/doi/10.1109/12.278481
Ning QSlonim JGentleman W(1993)Register allocation for optimal loop schedulingProceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 210.5555/962367.962399(942-955)Online publication date: 24-Oct-1993
https://dl.acm.org/doi/10.5555/962367.962399
Moon SEbcioglu KAgrawala A(1993)Selective Scheduling Framework for Speculative Operations in VLIW and Superscalar ProcessorsProceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism10.5555/647025.714252(229-242)Online publication date: 20-Jan-1993
https://dl.acm.org/doi/10.5555/647025.714252
Wang JEisenbeis C(1993)Decomposed Software PipeliningProceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism10.5555/647025.714249(3-14)Online publication date: 20-Jan-1993
https://dl.acm.org/doi/10.5555/647025.714249
Moon SEbcioğlu KWolfe AMangione-Smith W(1993)A study on the number of memory ports in multiple instruction issue machinesProceedings of the 26th annual international symposium on Microarchitecture10.5555/255235.255251(49-59)Online publication date: 1-Dec-1993
https://dl.acm.org/doi/10.5555/255235.255251
Ning QGao GVan Deusen MLang B(1993)A novel framework of register allocation for software pipeliningProceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages10.1145/158511.158519(29-42)Online publication date: 1-Mar-1993
https://dl.acm.org/doi/10.1145/158511.158519
Nakatani TEbcioglu K(1993)Making Compaction-Based Parallelization AffordableIEEE Transactions on Parallel and Distributed Systems10.1109/71.2435284:9(1014-1029)Online publication date: 1-Sep-1993
https://dl.acm.org/doi/10.1109/71.243528
Silberman GEbcioglu K(1993)An Architectural Framework for Supporting Heterogeneous Instruction-Set ArchitecturesComputer10.1109/2.21444126:6(39-56)Online publication date: 1-Jun-1993
https://dl.acm.org/doi/10.1109/2.214441
Moon SEbcioğlu KHwu W(1992)An efficient resource-constrained global scheduling technique for superscalar and VLIW processorsProceedings of the 25th annual international symposium on Microarchitecture10.5555/144953.145000(55-71)Online publication date: 10-Dec-1992
https://dl.acm.org/doi/10.5555/144953.145000
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents