Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Region-based compilation: Introduction, motivation, and initial experience

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

The most important task of a compiler designed to exploit instruction-level parallelism (ILP) is instruction scheduling. If higher levels of ILP are to be achieved, the compiler must use, as the unit of scheduling, regions consisting of multiple basic blocks—preferably those that frequently execute consecutively, and which capture cycles in the program’s execution. Traditionally, compilers have been built using the function as the unit of compilation. In this framework, function boundaries often act as barriers to the formation of the most suitable scheduling regions. Function inlining may be used to circumvent this problem by assembling strongly coupled functions into the same compilation unit, but at the cost of very large function bodies. Consequently, global optimizations whose compile time and space requirements are superlinear in the size of the compilation unit, may be rendered prohibitively expensive. This paper introduces a new approach, called region-based compilation, wherein the compiler, after inlining, repartitions the program into more desirable compilation units, termed regions. Region-based compilation allows the compiler to control problem size and complexity while exposing inter-procedural scheduling, optimization and code motion opportunities.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. J. A. Fisher, Trace Scheduling: A technique for Global Microcode Compaction,IEEE Trans. on Computers,C-30:478–490, (July 1981).

    Article  Google Scholar 

  2. J. Ellis,Bulldog: A Compiler for VLIW Architectures, Cambridge, Massachusetts, The MIT Press (1985).

    Google Scholar 

  3. B. R. Rau and C. D. Glaeser, Some Scheduling Techniques and an Easily Schedulable Horizontal Architecture for High Performance Scientific Computing,Proc. 20th Ann. Workshop on Microprogramming and Microarchitecture, pp. 183–198 (October 1981).

  4. B. R. Rau, Iterative Modulo Scheduling,Int. J. Pa. Pr. 24:3–64 (February 1996).

    Google Scholar 

  5. W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery, The Superblock: An Effective Technique for VLIW and Superscalar Compilation,J. Supercomputing,7:229–248 (January 1993).

    Article  Google Scholar 

  6. P. G. Lowney, S. M. Freudenberger, T. J. Karzes, W. D. Lichtenstein, R. P. Nix, J. S. O’Donell, and J. C. Ruttenberg, The Multiflow Trace Scheduling Compiler,J. Supercomputing,7, 51–142 (January 1993).

    Article  Google Scholar 

  7. J. C. Dehnert and R. A. Towle, Compiling for the Cydra 5,J. Supercomputing,7:181–227 (January 1993).

    Article  Google Scholar 

  8. R. Allen and S. Johnson, Compiling C for Vectorization, Parallelization, and Inline Expansion,Proc. ACMSIGPLAN 1988 Conf. Progr. Lang. Design and Implementation, pp. 241–249 (June 1988).

  9. J. W. Davidson and A. M. Holler, Subprogram Inlining: A study of Its Effects on Program Execution Time,IEEE Trans. Software Engineering,18:89–101 (February 1992).

    Article  Google Scholar 

  10. W. W. Hwu and P. P. Chang, Inline Function Expansion for Compiling Realistic C Programs,Proc. ACM SIGPLAN 1989 Conf. Progr. Lang. Design and Implementation, pp. 246–257 (June 1989).

  11. S. A. Mahlke, N. J. Warter, W. Y. Chen, P. P. Chang, and W. W. Hwu, The Effect of Compiler Optimizations on Available Parallelism in Scalar Programs,Proc. Int. Conf. Parallel Processing, pp. 142–145 (August 1991).

  12. S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A. Bringmann, Effective Compiler Support for Predicated Execution Using the Hyperblock,Proc. 25th Int. Symp. on Microarchitecture, pp. 45–54 (December 1992).

  13. S. A. Mahlke, W. Y. Chen, P. P. Chang, and W. W. Hwu, Scalar Program Performance on Multiple-Instruction-Issue Processors with a Limited Number of Registers,Proc. 25th Ann. Hawaii Int’l. Conf. Syst. Sci., pp. 34–44 (January 1992).

  14. M. Schlansker and V. Kathail, Critical Path Reduction for Scalar Programs,Proc. 28th Int’l. Symp. Microarchitecture, pp. 57–69 (December 1995).

  15. D. M. Lavery and W. W. Hwu, Unrolling-Based Optimizations for Modulo Scheduling,Proc. 28th Int’l. Symp. Microarchitecture, pp. 327–337 (November 1995).

  16. M. S. Schlansker, V. Kathail, and S. Anik, Parallelization of Control Recurrences for ILP Processors,Int. J. Parallel Processing 24:65–102 (February 1996).

    Google Scholar 

  17. A. Aho, R. Sethi, and J. Ullman,Compilers: Principles, Techniques, and Tools. Reading, Massachusetts, Addison-Wesley (1986).

    Google Scholar 

  18. R. Gupta and M. L. Soffa, Region Scheduling: An Approach for Detecting and Redistributing Parallelism,IEEE Trans. Software Engineering,16:421–431 (April 1990).

    Article  Google Scholar 

  19. R. S. Pressman,Software Engineering: A Practitioner’s Approach. New York, McGraw-Hill (1988).

    Google Scholar 

  20. G. J. Chaitin, Register Allocation and Spilling via Graph Coloring,Proc. ACM SIGPLAN Symp. Compiler Construction, pp. 98–105 (June 1982).

  21. F. C. Chow and J. L. Hennessy, The Priority-Based Coloring Approach to Register Allocation,ACM Trans. Prog. Lang. Syst. 12:501–536 (October 1990).

    Article  Google Scholar 

  22. S. A. Mahlke, Exploiting Instruction Level Parallelism in the Presence of Conditional Branches. Ph.D. Thesis, Department of Electrical and Computer Engineering, University of Illinois, Illinois (1996).

    Google Scholar 

  23. U. Mahadevan and S. Ramakrishnan, Instruction Scheduling Over Regions: A Framework for Scheduling Across Basic Blocks,Proc. 5th Int’l. Conf. Compiler Construction, pp. 419–434 (April 1994).

  24. R. E. Hank, S. A. Mahlke, R. A. Bringmann, J. C. Gyllenhaal, and W. W. Hwu, Superblock Formation Using Static Program Analysis,Proc. 26th Ann. Int’l. Symp. Microarchitecture (December 1993).

  25. M. S. Lam, Software Pipelining: An Effective Scheduling Technique for VLIW Machines,Proc. ACM SIGPLAN Conf. Progr. Lang. Design and Implementation, pp. 318–328 (June 1988).

  26. K. Ebcioglu, A Compilation Technique for Software Pipelining of Loops with Conditional Jumps,Proc. 20th Ann. Workshop on Microprogramming and Microarchitecture, pp. 69–79 (December 1987).

  27. A. Aiken and A. Nicolau, Optimal Loop Parallelization,Proc. ACMSIGPLAN Conf. Progr. Lang. Design and Implementation, pp. 308–317 (June 1988).

  28. K. Ebcioglu and T. Nakatani, A New Compilation Technique for Parallelizing Loops with Unpredictable Branches on a VLIW Architecture,Lang. and Compilers for Parallel Computing, pp. 213–229 (1989).

  29. P. Tirumalai, M. Lee, and M. Schlansker, Parallelization of Loops with Exits on Pipelined Architectures,Supercomputing (November 1990).

  30. B. R. Rau, Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops,Proc. 27th Int’l. Symp. Microarchitecture, pp. 63–74 (December 1994).

  31. M. Schlansker, V. Kathail, and S. Anik, Height Reduction of Control Recurrences for ILP Processors,Proc. 27th Int’l. Symp. Microarchitecture, pp. 40–51 (December 1994).

  32. D. M. Lavery and W. W. Hwu, Modulo Scheduling of Loops in Control-Intensive Non-Numeric Programs,Proc. 29th Int’l. Symp. Microarchitecture, pp. 126–137 (December 1996).

  33. P. P. Chang and W. W. Hwu, Trace Selection for Compiling Large C Application Programs to Microcode,Proc. 21st Int’l. Workshop on Microprogramming and microarchitecture, pp. 188–198 (November 1988).

  34. R. E. Hank,Region-Based Compilation, Ph.D. Thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana, Illinois (1996).

    Google Scholar 

  35. J. Ferrante, K. J. Ottenstein, and J. D. Warren, The Program Dependence Graph and Its Use in Optimization,ACM Trans. Progr. Lang. Syst. 9:319–349 (July 1987).

    Article  MATH  Google Scholar 

  36. T. Ball and J. R. Larus, Branch Prediction for Free,Proc. ACM SIGPLAN Conf. Prog. Lang. Design and Implementation, pp. 300–313 (June 1993).

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hank, R.E., Hwu, Wm.W. & Rau, B.R. Region-based compilation: Introduction, motivation, and initial experience. Int J Parallel Prog 25, 113–146 (1997). https://doi.org/10.1007/BF02700049

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF02700049

Key Words

Navigation