Percolation Scheduling (PS) is a new technique for compiling programs into parallel code. It attempts to overcome problems that limit the effectiveness and applicability of currently available techniques. PS globally rearranges code past basic block boundaries. Its core is a small set of simple, primitive program transformations defined in terms of adjacent nodes in a program graph. These transformations constitute the lowest level in a system of transformations and guidance rules. Higher levels of this hierarchy control and enhance the applicability of the core transformations and enable us to exploit both fine grained and coarse parallelism. Unlike other, more ad hoc approaches, PS is based on rigorous definitions of the computational model and of the core transformations. The correctness and termination of the transformations is proven here. The completeness of the transformations is also discussed. As a result our implementation, which is now underway, can proceed on a sound basis. In particular, PS enjoys greater adaptability and independence between the levels than would be possible otherwise. This paper describes PS in detail. The correctness aspects as well as illustrations of the effectiveness of our techniques are presented. Architectures which may benefit from the use of PS are also discussed.
Cited By
- Porpodas V and Cintra M CAeSaR Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, (1-10)
- Nicolau A, Li G and Kejariwal A Techniques for efficient placement of synchronization primitives Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, (199-208)
- Nicolau A, Li G and Kejariwal A (2009). Techniques for efficient placement of synchronization primitives, ACM SIGPLAN Notices, 44:4, (199-208), Online publication date: 14-Feb-2009.
- Milicev D and Jovanovic Z (2019). Control Flow Regeneration for Software Pipelined Loops with Conditions, International Journal of Parallel Programming, 30:3, (149-179), Online publication date: 1-Jun-2002.
- Goossens G, Van Praet J, Lanneer D, Geurts W, Kifli A, Liem C and Paulin P Embedded software in real-time signal processing systems Readings in hardware/software co-design, (433-451)
- Zhou H, Jennings M and Conte T Tree traversal scheduling Proceedings of the 14th international conference on Languages and compilers for parallel computing, (223-238)
- Amme W, Braun P, Thomasset F and Zehendner E (2019). Data Dependence Analysis of Assembly Code, International Journal of Parallel Programming, 28:5, (431-467), Online publication date: 1-Oct-2000.
- Ebcioğlu K, Altman E, Gschwind M and Sathaye S Optimizations and oracle parallelism with dynamic translation Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture, (284-295)
- Milicev D and Jovanovic Z A Formal Model of Software Pipelining Loops with Conditions Proceedings of the 11th International Symposium on Parallel Processing, (554-558)
- Abraham S, Kathail V and Deitrich B Meld scheduling Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, (308-321)
- Ando H, Nakanishi C, Hara T and Nakaya M Unconstrained speculative execution with predicated state buffering Proceedings of the 22nd annual international symposium on Computer architecture, (126-137)
- Ando H, Nakanishi C, Hara T and Nakaya M (1995). Unconstrained speculative execution with predicated state buffering, ACM SIGARCH Computer Architecture News, 23:2, (126-137), Online publication date: 1-May-1995.
- Schlansker M and Kathail V Critical path reduction for scalar programs Proceedings of the 28th annual international symposium on Microarchitecture, (57-69)
- Chen S, Fuchs W and Hwu W An Analytical Approach to Scheduling Code for Superscalar and VLIW Architectures Proceedings of the 1994 International Conference on Parallel Processing - Volume 01, (285-292)
- Abnous A and Bagherzadeh N (2019). Pipelining and Bypassing in a VLIW Processor, IEEE Transactions on Parallel and Distributed Systems, 5:6, (658-664), Online publication date: 1-Jun-1994.
- Gao G (2019). An Efficient Hybrid Dataflow Architecture Model, Journal of Parallel and Distributed Computing, 19:4, (293-307), Online publication date: 1-Dec-1993.
- Nakatani T and Ebcioglu K (2019). Making Compaction-Based Parallelization Affordable, IEEE Transactions on Parallel and Distributed Systems, 4:9, (1014-1029), Online publication date: 1-Sep-1993.
- Sano B and Despain A The 16-fold way Proceedings of the 26th annual international symposium on Microarchitecture, (60-69)
- Smotherman M, Chawla S, Cox S and Malloy B Instruction scheduling for the Motorola 88110 Proceedings of the 26th annual international symposium on Microarchitecture, (257-262)
- Newburn C, Huang A and Shen J Balancing Fine- and Medium-Grained Parallelism in Scheduling Loops for the XIMD Architecture Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, (39-52)
- Fernandes E and Barbosa F Effects of building blocks on the performance of super-scalar architecture Proceedings of the 19th annual international symposium on Computer architecture, (36-45)
- Smith M, Horowitz M and Lam M Efficient superscalar performance through boosting Proceedings of the fifth international conference on Architectural support for programming languages and operating systems, (248-259)
- Silberman G and Ebcioğlu K An architectural framework for migration from CISC to higher performance platforms Proceedings of the 6th international conference on Supercomputing, (198-215)
- Smith M, Horowitz M and Lam M (2019). Efficient superscalar performance through boosting, ACM SIGPLAN Notices, 27:9, (248-259), Online publication date: 1-Sep-1992.
- Moon S and Ebcioğlu K (2019). An efficient resource-constrained global scheduling technique for superscalar and VLIW processors, ACM SIGMICRO Newsletter, 23:1-2, (55-71), Online publication date: 10-Dec-1992.
- Bernstein D, Cohen D, Lavon Y and Rainish V (1992). Performance evaluation of instruction scheduling on the IBM RISC System/6000, ACM SIGMICRO Newsletter, 23:1-2, (226-235), Online publication date: 10-Dec-1992.
- Beaty S (1992). Lookahead scheduling, ACM SIGMICRO Newsletter, 23:1-2, (256-259), Online publication date: 10-Dec-1992.
- Fernandes E and Barbosa F (1992). Effects of building blocks on the performance of super-scalar architecture, ACM SIGARCH Computer Architecture News, 20:2, (36-45), Online publication date: 1-May-1992.
- Moon S and Ebcioğlu K An efficient resource-constrained global scheduling technique for superscalar and VLIW processors Proceedings of the 25th annual international symposium on Microarchitecture, (55-71)
- Bernstein D, Cohen D, Lavon Y and Rainish V Performance evaluation of instruction scheduling on the IBM RISC System/6000 Proceedings of the 25th annual international symposium on Microarchitecture, (226-235)
- Beaty S Lookahead scheduling Proceedings of the 25th annual international symposium on Microarchitecture, (256-259)
- Wolfe A and Shen J A variable instruction stream extension to the VLIW architecture Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, (2-14)
- Wolfe A and Shen J (1991). A variable instruction stream extension to the VLIW architecture, ACM SIGPLAN Notices, 26:4, (2-14), Online publication date: 2-Apr-1991.
- Wolfe A and Shen J (1991). A variable instruction stream extension to the VLIW architecture, ACM SIGOPS Operating Systems Review, 25:Special Issue, (2-14), Online publication date: 2-Apr-1991.
- Wolfe A and Shen J (1991). A variable instruction stream extension to the VLIW architecture, ACM SIGARCH Computer Architecture News, 19:2, (2-14), Online publication date: 2-Apr-1991.
- Schuette M and Shen J An instruction-level performance analysis of the Multiflow TRACE 14/300 Proceedings of the 24th annual international symposium on Microarchitecture, (2-11)
- Bernstein D, Cohen D and Krawczyk H Code duplication Proceedings of the 24th annual international symposium on Microarchitecture, (103-113)
- Breternitz M and Shen J Implementation optimization techniques for architecture synthesis of application-specific processors Proceedings of the 24th annual international symposium on Microarchitecture, (114-123)
- Corporaal H and Mulder H MOVE: a framework for high-performance processor design Proceedings of the 1991 ACM/IEEE conference on Supercomputing, (692-701)
- Beaty S, Whitley D and Johnson G (1991). Motivation and framework for using genetic algorithms for microcode compaction, ACM SIGMICRO Newsletter, 22:1, (20-27), Online publication date: 1-Jan-1991.
- Nakatani T and Ebcioǧlu K (1991). Using a lookahead window in a compaction-based parallelizing compiler, ACM SIGMICRO Newsletter, 22:1, (8-19), Online publication date: 1-Jan-1991.
- Moon S, Carson S and Agrawala A Hardware implementation of a general multi-way jump mechanism Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (38-45)
- Nakatani T and Ebcioğlu K Using a lookahead window in a compaction-based parallelizing compiler Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (57-68)
- Nicolau A and Potasman R Realistic scheduling Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (69-79)
- Beaty S, Whitley D and Johnson G Motivation and framework for using genetic algorithms for microcode compaction Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (117-124)
- Shih L Microprogramming heritage of RISC design Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (275-280)
- Warren H (2019). Instruction scheduling for the IBM RISC System/6000 processor, IBM Journal of Research and Development, 34:1, (85-92), Online publication date: 3-Jan-1990.
- Ebcioĝlu K and Kumar M (1990). A wide instruction word architecture for parallel execution of logic programs coded in BSL, New Generation Computing, 7:2-3, (219-242), Online publication date: 1-Feb-1990.
- Nakatani T and Ebcioğlu K “Combining” as a compilation technique for VLIW architectures Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture, (43-55)
- Nakatani T and Ebcioğlu K (1989). “Combining” as a compilation technique for VLIW architectures, ACM SIGMICRO Newsletter, 20:3, (43-55), Online publication date: 1-Aug-1989.
- Aiken A and Nicolau A (2019). A Development Environment for Horizontal Microcode, IEEE Transactions on Software Engineering, 14:5, (584-594), Online publication date: 1-May-1988.
- Ebcioğlu K (1988). A compilation technique for software pipelining of loops with conditional jumps, ACM SIGMICRO Newsletter, 19:3, (36-41), Online publication date: 1-Sep-1988.
- Ebcioğlu K A compilation technique for software pipelining of loops with conditional jumps Proceedings of the 20th annual workshop on Microprogramming, (69-79)
- Aiken A and Nicolau A (1986). A development environment for horizontal microcode programs, ACM SIGMICRO Newsletter, 17:4, (23-31), Online publication date: 21-Dec-1986.
- Aiken A and Nicolau A A development environment for horizontal microcode programs Proceedings of the 19th annual workshop on Microprogramming, (23-31)
Recommendations
Dynamic compilation of data-parallel kernels for vector processors
CGO '12: Proceedings of the Tenth International Symposium on Code Generation and OptimizationModern processors enjoy augmented throughput and power efficiency through specialized functional units leveraged via instruction set extensions. These functional units accelerate performance for specific types of operations but must be programmed ...
Lightweight verification of separate compilation
POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesMajor compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the ...
Symbolic Compilation of PSL
The IEEE standard property specification language (PSL) is increasingly used in many phases of the hardware design cycle, from specification to verification. PSL combines linear temporal logic (LTL) with sequential extended regular expressions (SEREs) ...