Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/255237.255249acmconferencesArticle/Chapter ViewAbstractPublication PagesmicroConference Proceedingsconference-collections
Article
Free access

Using a lookahead window in a compaction-based parallelizing compiler

Published: 30 November 1990 Publication History

Abstract

Lookahead is a common technique for high performance uniprocessor design. In general, however, hardware lookahead window is too small to exploit instruction-level parallelism at run time, while compaction-based parallelizing compilers must suffer from worst-case exponential code explosion at compile time. In this paper, we propose a software lookahead method, which allows inter-basic block code motions within the prespecified number of operations, called software lookahead window, on any path emanating from the currently processed instruction at compile time. By software lookahead, instruction-level parallelism can be exploited in a much greater code area than the hardware approach, but the lookahead region is still limited to a constant depth with a user-specifiable window, and thus code explosion is restricted. The proposed scheme has been implemented in our prototype parallelizing compiler, which can generate code for uniprocessors with multiple functional units and multiway conditional branches, such as VLIW machines, and potentially for superscalars as well. To study code explosion problem and instruction-level parallelism for branch intensive code, we compiled five AIX utilities: sort, fgrep, sed, yacc, and compress. It is demonstrated that, with software lookahead, code explosion problem is effectively alleviated, yet a substantial amount of inter-basic block parallelism is successfully extracted.

References

[1]
Aiken, A. (1988). Compaction-Based Parallelization, Ph.D. Dissertation, Department of Computer Science, Cornell University.
[2]
Aiken, A. and Nicolau, A. {1988}. Perfect Pipelining: A New Loop Parallelization Technique, In Proceedings of the 1988 European Symposium on Programming, pp. 221-235.
[3]
Ebcioglu, K. {1987}. Compilation Technique for Software Pipelining of Loops with Conditional Jumps, In Proceedings of the 20th Annual Workshop on Microprogramming, pp. 69-79, ACM Press.
[4]
Ebcioglu, K. (1988). Some Design Ideas for a VLIW Architecture for Sequential Natured Software, In Parallel Processing (Proceedings of IFIP WG 10.3 Working Conference on Parallel Processing), edited by M. Cosnard et al., North Holland.
[5]
Ebcioglu, K. and Nicolau, A. {1989} A Global Resource-Constrained Parallelization Technique, In Proceedings of the Third International Conference on Supercomputing, pp. 154-163, Crete.
[6]
Ebcioglu, K. and Nakatani, T. {1989}. A New Compilation Technique for Parallelizing Loops with Unpredictable Branches on a VLIW Architecture, In Languages and Compilers for Parallel Computing, edited by D. Gelernter et al., Research Monographs in Parallel and Distributed Computing, MIT Press.
[7]
Ellis, J. {1985}. Bulldog: A Compiler for VLIW Architectures, Ph.D. Dissertation, Department of Computer Science, Yale University.
[8]
Ferrante, J., Ottenstein, K., and Warren, J. {1987}. The Program Dependence Graph and Its Use in Optimization, ACM Transactions on Programming Languages and Systems, 9:3, pp. 319-349.
[9]
Fisher, J. {1981}. Trace Scheduling: A Technique for Global Microcode Compaction, IEEE Transactions on Computers, c-30:7, pp. 478-490.
[10]
Gupta, R. and Soffa, M. {1990}. Region Scheduling: An Approach for Detecting and Redistributing Parallelism, IEEE Transactions on Software Engineering, 16:4, pp. 421-431.
[11]
Hsu, P. {1985}. Highly Concurrent Scalar Processing, Ph.D. Dissertation, Department of Computer Science, University of Illinois at Urbana- Champaign.
[12]
Lam, M. {1988}. Software Pipelining: An Eflective Scheduling Technique for VLIW Machines, In Proceedings of the SIGPLAN 1988 Conference of Programming Language Design and Implementation, pp. 318-328, ACM Press.
[13]
Nakatani, T. and Ebcioglu, K. {1989}. "Combining" as u Compilation Technique for a VLIW Architecture, In Proceedings of the 22nd Annual International Workshop of Microprogramming and Microarchitecture, ACM and IEEE, pp. 43-55.
[14]
Nicolau, A. {1985}. Percolation Scheduling: A Parallel Compilation Technique, TR 85-678, Department of Computer Science, Cornell University.
[15]
Tomasulo, R. {1967}. An Efficient Algorithm for Exploiting Multiple Arithmetic Units, IBM Journal of Research and Development, 11:1, pp. 25-33.
[16]
Warren, S.H., Auslander, M.A., Chaitin, G.J., Chibib, A.C., Hopkins, M.E., and MacKay, A.L.{1986}. Final Code Generation in the PL.6 Compiler, Report No. RC 11974, IBM T.J. Watson Research Center.

Cited By

View all
  • (1995)Resource-Constrained Software PipeliningIEEE Transactions on Parallel and Distributed Systems10.1109/71.4761676:12(1248-1270)Online publication date: 1-Dec-1995
  • (1994)Instruction Window Size Trade-Offs and Characterization of Program ParallelismIEEE Transactions on Computers10.1109/12.27848143:4(431-442)Online publication date: 1-Apr-1994
  • (1993)Register allocation for optimal loop schedulingProceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 210.5555/962367.962399(942-955)Online publication date: 24-Oct-1993
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
MICRO 23: Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture
November 1990
299 pages
ISBN:0897914139

Sponsors

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 30 November 1990

Check for updates

Qualifiers

  • Article

Conference

MICRO90
Sponsor:

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)15
  • Downloads (Last 6 weeks)3
Reflects downloads up to 19 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (1995)Resource-Constrained Software PipeliningIEEE Transactions on Parallel and Distributed Systems10.1109/71.4761676:12(1248-1270)Online publication date: 1-Dec-1995
  • (1994)Instruction Window Size Trade-Offs and Characterization of Program ParallelismIEEE Transactions on Computers10.1109/12.27848143:4(431-442)Online publication date: 1-Apr-1994
  • (1993)Register allocation for optimal loop schedulingProceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 210.5555/962367.962399(942-955)Online publication date: 24-Oct-1993
  • (1993)Selective Scheduling Framework for Speculative Operations in VLIW and Superscalar ProcessorsProceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism10.5555/647025.714252(229-242)Online publication date: 20-Jan-1993
  • (1993)Decomposed Software PipeliningProceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism10.5555/647025.714249(3-14)Online publication date: 20-Jan-1993
  • (1993)A study on the number of memory ports in multiple instruction issue machinesProceedings of the 26th annual international symposium on Microarchitecture10.5555/255235.255251(49-59)Online publication date: 1-Dec-1993
  • (1993)A novel framework of register allocation for software pipeliningProceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages10.1145/158511.158519(29-42)Online publication date: 1-Mar-1993
  • (1993)Making Compaction-Based Parallelization AffordableIEEE Transactions on Parallel and Distributed Systems10.1109/71.2435284:9(1014-1029)Online publication date: 1-Sep-1993
  • (1993)An Architectural Framework for Supporting Heterogeneous Instruction-Set ArchitecturesComputer10.1109/2.21444126:6(39-56)Online publication date: 1-Jun-1993
  • (1992)An efficient resource-constrained global scheduling technique for superscalar and VLIW processorsProceedings of the 25th annual international symposium on Microarchitecture10.5555/144953.145000(55-71)Online publication date: 10-Dec-1992
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media