article

Free access

Register allocation for software pipelined loops

Authors:

P. P. Tirumalai,

M. S. SchlanskerAuthors Info & Claims

ACM SIGPLAN Notices, Volume 27, Issue 7

Pages 283 - 299

https://doi.org/10.1145/143103.143141

Published: 01 July 1992 Publication History

Abstract

Software pipelining is an important instruction scheduling technique for efficiently overlapping successive iterations of loops and executing them in parallel. This paper studies the task of register allocation for software pipelined loops, both with and without hardware features that are specifically aimed at supporting software pipelines. Register allocation for software pipelines presents certain novel problems leading to unconventional solutions, especially in the presence of hardware support. This paper formulates these novel problems and presents a number of alternative solution strategies. These alternatives are comprehensively tested against over one thousand loops to determine the best register allocation strategy, both with and without the hardware support for software pipelining.

References

[1]

Allen, J.R., et al. Conversion of control dependence to data dependence. In Proceedings of the Tenth Annual ACM Symposium on Principles of Programming Languages, (January, 1983).

Digital Library

[2]

Berry, M., et al. The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers. The International Journal of Supercomputer Applications, 3 (1989), 5-40.

Digital Library

[3]

Callahan, D., Cart, S., and Kennedy, K. Improving Register Allocation for Subscripted Variables, In Proceedings of the A CM StGPLAN '90 Conference on Programming Language Design and Implementation, (June, 1990), 53- 65.

Digital Library

[4]

Chaitin, G.J. Register allocation and spilling via graph coloring. In Proceedings of the SIGPLAN82 Symposium on Compiler Construction, (June, 1982), 201-207.

Digital Library

[5]

Charlesworth, A.E. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS- 164 Family. IEEE Computer 14, 9 (September, 1981), 18- 27.

Digital Library

[6]

Dehnert, J.C., Hsu, P.Y.-T., and Bratt, J.P. Overlapped loop support in the Cydra 5. In Proceedings of the Third international Conference on Architectural Support for Programming Languages and Operating Systems, (Boston, Mass., April, 1989), 26-38.

Digital Library

[7]

Ebcioglu, K., and Nakatani, T. A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture. In Proceedings of the Second Workshop on Programming Languages and Compilers for Parallel Computing, (Urbana-Champaign, 1989), 213-229.

Digital Library

[8]

Hendren, L.J., et al. Register Allocation using Cyclic interval Graphs: A New Approach to an Old Problem. ACAPS Technical Memo 33. Advanced Computer Architecture and Program Structures Group, McGill University, Montreal, Canada, 1992.

[9]

Hsu, P.Y.T. Highly Concurrent Scalar Processing. CSG-49. Coordinated Science Lab., University of Illinois, Urbana, Illinois, 1986.

[10]

Jain, S. Circular scheduling: A new technique to perform software pipelining. In Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and implementation, (June, 1991), 219-228.

Digital Library

[11]

Lain, M. Software pipelining: an effective scheduling technique for VLIW machines. In Proceedings of the A CM SIGPLAN '88 Conference on Programming Language Design and Implementation, (June, 1988), 318-327.

Digital Library

[12]

Lee, R.L., Kwok, A.Y., and Briggs, F.A. The floating point performance of a superscalar SPARC processor. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, (Santa Clara, California, April, 1991), 28-37.

Digital Library

[13]

Nicolau, A., and Potasman, R. Realistic scheduling: compaction for pipelined architectures. In Proceedings of the 23th Annual Workshop on Microprogramming and Microarchitecture, (Orlando, Florida, November, 1990), 69-79.

Digital Library

[14]

Rau, B.R. Data flow and dependence analysis for instruction level parallelism. In Proceedings of the Fourth Workshop on Languages and Compilers for Parallel Computing, (Santa Clara, August, 1991).

Digital Library

[15]

Rau, B.R., and Glaeser, C.D. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proceedings of the Fourteenth Annual Workshop on Microprogramming, (October, 1981), 183-198.

Digital Library

[16]

Rau, B.R., et al. Register Allocation for Modulo Scheduled Loops: Strategies, Algorithms and Heuristics. HP Labs Technical Report HPL-92-48. Hewlett-Packard Laboratories, Palo Alto, California, 1992.

[17]

Rau, B.R., et al. Code Generation Schema for Modulo Scheduled DO-Loops and WHILE-Loops. HP Labs Technical Report HPL-92-47. Hewlett-Packard Laboratories, Palo Alto, California, 1992.

[18]

Rau, B.R., et al. The Cydra 5 departmental supercomputer: design philosophies, decisions and trade-offs, i EEE Computer 22, 1 (January, 1989).

Digital Library

[19]

Timmalai, P., Lee, M., and Schlansker, M.S. Parallelization of loops with exits on pipelined architectures, in Proceedings of the Supercomputing '90, (November, 1990), 200-212.

Digital Library

[20]

Uniejewski, .l. SPEC Benchmark Suite: Designed for Today's Advanced Systems. SPEC Newsletter 1, 1 (Fall, 1989).

Cited By

Zhong HLiu Z(2023)Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS)10.1109/ICPADS56603.2022.00075(530-537)Online publication date: Jan-2023
https://doi.org/10.1109/ICPADS56603.2022.00075
Li XJain AMaskell DFahmy S(2018)A time-multiplexed FPGA overlay with linear interconnect2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE.2018.8342171(1075-1080)Online publication date: Mar-2018
https://doi.org/10.23919/DATE.2018.8342171
Eichenberger ADavidson EAbraham S(2016)Minimizing Register Requirements of a Modulo Schedule via Optimum Stage SchedulingInternational Journal of Parallel Programming10.1007/BF0335674424:2(103-132)Online publication date: 26-May-2016
https://doi.org/10.1007/BF03356744
Show More Cited By

Index Terms

Register allocation for software pipelined loops
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Source code generation
    2. General programming languages
      1. Language features
2. Theory of computation
  1. Semantics and reasoning
    1. Program constructs

Recommendations

Register allocation for software pipelined multi-dimensional loops
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation

Software pipelining of a multi-dimensional loop is an important optimization that overlaps the execution of successive outermost loop iterations to explore instruction-level parallelism from the entire n-dimensional iteration space. This paper ...
Register allocation for software pipelined multi-dimensional loops
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation

Software pipelining of a multi-dimensional loop is an important optimization that overlaps the execution of successive outermost loop iterations to explore instruction-level parallelism from the entire n-dimensional iteration space. This paper ...
Register allocation for software pipelined multidimensional loops

This article investigates register allocation for software pipelined multidimensional loops where the execution of successive iterations from an n-dimensional loop is overlapped. For single loop software pipelining, the lifetimes of a loop variable in ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices

ACM SIGPLAN Notices Volume 27, Issue 7

July 1992

352 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/143103

Editor:
Richard Wexelblat
IDA/CDED, Alexandria, VA

Issue’s Table of Contents

PLDI '92: Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
July 1992
352 pages
ISBN:0897914759
DOI:10.1145/143095
Chairman:
Stuart I. Feldman
Bell Communications Research, Morristown, NJ
,
Editor:
Richard L. Wexelblat

Copyright © 1992 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 1992

Published in SIGPLAN Volume 27, Issue 7

Check for updates

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

170
Total Citations
View Citations
1,390
Total Downloads

Downloads (Last 12 months)228
Downloads (Last 6 weeks)31

Reflects downloads up to 21 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Zhong HLiu Z(2023)Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS)10.1109/ICPADS56603.2022.00075(530-537)Online publication date: Jan-2023
https://doi.org/10.1109/ICPADS56603.2022.00075
Li XJain AMaskell DFahmy S(2018)A time-multiplexed FPGA overlay with linear interconnect2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE.2018.8342171(1075-1080)Online publication date: Mar-2018
https://doi.org/10.23919/DATE.2018.8342171
Eichenberger ADavidson EAbraham S(2016)Minimizing Register Requirements of a Modulo Schedule via Optimum Stage SchedulingInternational Journal of Parallel Programming10.1007/BF0335674424:2(103-132)Online publication date: 26-May-2016
https://doi.org/10.1007/BF03356744
Aiken ABanerjee UKejariwal ANicolau AAiken ABanerjee UKejariwal ANicolau A(2016)Software Pipelining by Kernel RecognitionInstruction Level Parallelism10.1007/978-1-4899-7797-7_7(167-203)Online publication date: 30-Nov-2016
https://doi.org/10.1007/978-1-4899-7797-7_7
Rong HPark HWang CWu YFarrens MKozyrakis C(2013)Allocating rotating registers by schedulingProceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture10.1145/2540708.2540738(346-358)Online publication date: 7-Dec-2013
https://dl.acm.org/doi/10.1145/2540708.2540738
Bachir MCohen ATouati S(2012)On the effectiveness of register moves to minimise post-pass unrolling in software pipelined loops2012 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCSim.2012.6266972(551-558)Online publication date: Jul-2012
https://doi.org/10.1109/HPCSim.2012.6266972
Bachir MBrault FTouati SCohen A(2011)Loop unrolling minimisation in the presence of multiple register types: A viable alternative to modulo variable expansion2011 International Conference on High Performance Computing & Simulation10.1109/HPCSim.2011.5999826(207-214)Online publication date: Jul-2011
https://doi.org/10.1109/HPCSim.2011.5999826
Deschinkel KTouati SBriais S(2011)SIRALINAJournal of Combinatorial Optimization10.1007/s10878-010-9332-822:4(819-844)Online publication date: 1-Nov-2011
https://dl.acm.org/doi/10.1007/s10878-010-9332-8
Govindarajan R(2010)Instruction SchedulingThe Compiler Design Handbook10.1201/9781420040579.ch17Online publication date: 7-Mar-2010
https://doi.org/10.1201/9781420040579.ch17
Gopinath K(2010)Register AllocationThe Compiler Design Handbook10.1201/9781420040579.ch13Online publication date: 7-Mar-2010
https://doi.org/10.1201/9781420040579.ch13
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Issue’s Table of Contents