DOI: 10.1145/192724.192733

Minimizing register requirements under resource-constrained rate-optimal software pipelining

Published: 30 November 1994

Abstract

In this paper we address the following software pipelining problem: given a loop and a machine architecture with a fixed number of processor resources (e.g., function units), how can one construct a software-pipelined schedule which runs on the given architecture at the maximum possible iteration rate (à la rate-optimal) while minimizing the number of registers?
The main contributions of this paper are:
• First, we demonstrate that such a problem can be described by a simple mathematical formulation with precise optimization objectives under a periodic linear scheduling framework. The mathematical formulation provides a clear picture which permits one to visualize the overall solution space (for rate-optimal schedules) under different sets of constraints.
• Secondly, we show that a precise mathematical formulation and its solution do make a significant performance difference! We evaluated the performance of our method against three other leading contemporary heuristic methods: Huff's Slack Scheduling; Wang, Eisenbeis, Jourdan and Su's FRLC; and Gasperoni and Schwiegelshohn's modified list scheduling. Experimental results show that the method described in this paper performed significantly better than these methods.
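To make the problem statement concrete, the following is a minimal brute-force sketch, not the formulation used in the paper. It assumes a periodic linear schedule in which operation i of iteration k issues at cycle T*k + t[i], fully pipelined function units (each operation occupies its unit for one cycle), and a register cost measured as the maximum number of simultaneously live values, where a value is taken to hold a register from the issue of its producer until its last consumer issues. All names (`is_valid`, `max_live`, `rate_optimal_min_registers`, `horizon`) and the three-operation example loop are hypothetical.

```python
from itertools import product

def is_valid(T, t, lat, edges, fu_of, fu_count):
    """Check dependence and resource constraints for period T and offsets t."""
    # Dependence (i, j, dist): j of iteration k+dist must wait for i of iteration k.
    for i, j, dist in edges:
        if t[j] - t[i] < lat[i] - T * dist:
            return False
    # Resources: ops sharing a unit type and a slot (t mod T) collide in steady state.
    used = {}
    for op, start in enumerate(t):
        key = (fu_of[op], start % T)
        used[key] = used.get(key, 0) + 1
        if used[key] > fu_count[fu_of[op]]:
            return False
    return True

def max_live(T, t, edges):
    """Assumed register cost: peak number of live values per steady-state slot."""
    live = [0] * T
    for i in range(len(t)):
        ends = [t[j] + T * dist for src, j, dist in edges if src == i]
        if not ends:
            continue  # op produces no value that is consumed
        for cycle in range(t[i], max(ends)):
            live[cycle % T] += 1
    return max(live)

def rate_optimal_min_registers(lat, edges, fu_of, fu_count, horizon=8):
    """Smallest feasible period T first, then fewest registers at that T."""
    n = len(lat)
    for T in range(1, horizon + 1):
        best = None
        for t in product(range(horizon), repeat=n):
            if is_valid(T, t, lat, edges, fu_of, fu_count):
                regs = max_live(T, t, edges)
                if best is None or regs < best[1]:
                    best = (t, regs)
        if best is not None:
            return T, best[0], best[1]
    return None

# Hypothetical loop body: load -> add -> store, with the add feeding itself
# in the next iteration; one memory unit and one ALU.
lat = [2, 1, 1]
edges = [(0, 1, 0), (1, 2, 0), (1, 1, 1)]
fu_of = ["mem", "alu", "mem"]
fu_count = {"mem": 1, "alu": 1}
T, starts, regs = rate_optimal_min_registers(lat, edges, fu_of, fu_count)
print(T, starts, regs)
```

Exhaustive enumeration only works for toy loops; it is used here solely to show the two-level objective (first the shortest period, then the fewest registers among schedules at that period).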

References

[1] A. Aiken and A. Nicolau. A realistic resource-constrained software pipelining algorithm. In A. Nicolau, D. Gelernter, T. Gross, and D. Padua, editors, Advances in Languages and Compilers for Parallel Processing, Res. Monographs in Parallel and Distrib. Computing, chapter 14, pages 274-290. 1991.
[2] J. R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In Conf. Rec. of the Tenth Ann. ACM Symp. on Principles of Programming Languages, pages 177-189, Austin, TX, Jan. 24-26, 1983.
[3] E. R. Altman, R. Govindarajan, and G. R. Gao. Software pipelining to minimize registers and resources. ACAPS Technical Memo 79, School of Computer Science, McGill University, Montréal, Qué., 1994. Under preparation.
[4] J. C. Dehnert and R. A. Towle. Compiling for Cydra 5. Journal of Supercomputing, 7:181-227, May 1993.
[5] K. Ebcioğlu and T. Nakatani. A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture. In D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers for Parallel Computing, Res. Monographs in Parallel and Distrib. Computing, chapter 12, pages 213-229. 1990.
[6] F. Gasperoni and U. Schwiegelshohn. Efficient algorithms for cyclic scheduling. Res. Rep. RC 17068, IBM T. J. Watson Res. Center, Yorktown Heights, NY, 1991.
[7] M. B. Girkar, M. R. Haghighat, C. L. Lee, B. P. Leung, and D. A. Schouten. Parafrase-2 user's manual. TR RC-17068(#75743), Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL, 1991.
[8] R. Govindarajan, E. R. Altman, and G. R. Gao. Minimizing register requirement in resource-constrained software pipelining. ACAPS Technical Memo 80, School of Computer Science, McGill University, Montréal, Qué., 1994.
[9] R. A. Huff. Lifetime-sensitive modulo scheduling. In Proc. of the SIGPLAN '93 Conf. on Programming Language Design and Implementation, pages 258-267, Albuquerque, NM, Jun. 23-25, 1993.
[10] C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu. A formal approach to the scheduling problem in high-level synthesis. IEEE Trans. on Computer-Aided Design, 10(4):464-475, Apr. 1991.
[11] M. Lam. Software pipelining: An effective scheduling technique for VLIW machines. In Proc. of the SIGPLAN '88 Conf. on Programming Language Design and Implementation, pages 318-328, Atlanta, GA, Jun. 22-24, 1988.
[12] S. Moon and K. Ebcioğlu. An efficient resource-constrained global scheduling technique for superscalar and VLIW processors. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 55-71, Portland, OR, Dec. 1-4, 1992.
[13] Q. Ning and G. R. Gao. A novel framework of register allocation for software pipelining. In Conf. Rec. of the Twentieth Ann. ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pages 29-42, Charleston, SC, Jan. 10-13, 1993.
[14] M. Rajagopalan and V. H. Allan. Efficient scheduling of fine grain parallelism in loops. In Proc. of the 26th Ann. Intl. Symp. on Microarchitecture, pages 2-11, Austin, TX, Dec. 1-3, 1993.
[15] S. Ramakrishnan. Software pipelining in PA-RISC compilers. Hewlett-Packard J., pages 39-45, Jun. 1992.
[16] B. R. Rau and J. A. Fisher. Instruction-level parallel processing: History, overview and perspective. J. of Supercomputing, 7:9-50, May 1993.
[17] B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proc. of the 14th Ann. Microprogramming Work., pages 183-198, Chatham, MA, Oct. 12-15, 1981.
[18] B. R. Rau, M. Lee, P. P. Tirumalai, and M. S. Schlansker. Register allocation for software pipelined loops. In Proc. of the SIGPLAN '92 Conf. on Programming Language Design and Implementation, pages 283-299, San Francisco, CA, Jun. 17-19, 1992.
[19] B. R. Rau, M. S. Schlansker, and P. P. Tirumalai. Code generation schema for modulo scheduled loops. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 158-169, Portland, OR, Dec. 1-4, 1992.
[20] B. R. Rau, D. W. L. Yen, W. Yen, and R. A. Towle. The Cydra 5 departmental supercomputer. Computer, 22(1):12-35, Jan. 1989.
[21] R. Reiter. Scheduling parallel computations. J. of the ACM, 15(4):590-599, Oct. 1968.
[22] R. F. Touzeau. A Fortran compiler for the FPS-164 scientific computer. In Proc. of the SIGPLAN '84 Symp. on Compiler Construction, pages 48-57, Montréal, Qué., Jun. 17-22, 1984.
[23] J. Wang, C. Eisenbeis, M. Jourdan, and B. Su. DEcomposed Software Pipelining: A new approach to exploit instruction-level parallelism for loop programs. Res. Rep. RR-1838, INRIA-Rocquencourt, France, Jan. 1993.
[24] N. J. Warter, S. A. Mahlke, W. Hwu, and B. Ramakrishna Rau. Reverse If-Conversion. In Proc. of the SIGPLAN '93 Conf. on Programming Language Design and Implementation, pages 290-299, Albuquerque, NM, Jun. 23-25, 1993.

Published In

MICRO 27: Proceedings of the 27th annual international symposium on Microarchitecture
November 1994
233 pages
ISBN:0897917073
DOI:10.1145/192724

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

MICRO94: 27th Annual International Symposium on Microarchitecture
November 30 - December 2, 1994
San Jose, California, USA

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

  • (2023) Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS), pages 530-537. DOI: 10.1109/ICPADS56603.2022.00075. Online publication date: Jan-2023.
  • (2022) Adaptive Low-Cost Loop Expansion for Modulo Scheduling. Network and Parallel Computing, pages 30-41. DOI: 10.1007/978-3-031-21395-3_3. Online publication date: 1-Dec-2022.
  • (2019) Survey on Combinatorial Register Allocation and Instruction Scheduling. ACM Computing Surveys 52:3, pages 1-50. DOI: 10.1145/3200920. Online publication date: 18-Jun-2019.
  • (2016) Modulo Scheduling. Instruction Level Parallelism, pages 133-165. DOI: 10.1007/978-1-4899-7797-7_6. Online publication date: 30-Nov-2016.
  • (2014) Improving performance of loops on DIAM-based VLIW architectures. ACM SIGPLAN Notices 49:5, pages 135-144. DOI: 10.1145/2666357.2597825. Online publication date: 12-Jun-2014.
  • (2014) Improving performance of loops on DIAM-based VLIW architectures. Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, pages 135-144. DOI: 10.1145/2597809.2597825. Online publication date: 12-Jun-2014.
  • (2014) Flushing-Enabled Loop Pipelining for High-Level Synthesis. Proceedings of the 51st Annual Design Automation Conference, pages 1-6. DOI: 10.1145/2593069.2593143. Online publication date: 1-Jun-2014.
  • (2014) Optimum modulo schedules for minimum register requirements. ACM International Conference on Supercomputing 25th Anniversary Volume, pages 227-236. DOI: 10.1145/2591635.2667171. Online publication date: 10-Jun-2014.
  • (2014) Author retrospective for optimum modulo schedules for minimum register requirements. ACM International Conference on Supercomputing 25th Anniversary Volume, pages 35-36. DOI: 10.1145/2591635.2591653. Online publication date: 10-Jun-2014.
  • (2014) Predicate-aware, makespan-preserving software pipelining of scheduling tables. ACM Transactions on Architecture and Code Optimization 11:1, pages 1-26. DOI: 10.1145/2579676. Online publication date: 1-Feb-2014.
