Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

Register allocation for software pipelined loops

Published: 01 July 1992 Publication History

Abstract

Software pipelining is an important instruction scheduling technique for efficiently overlapping successive iterations of loops and executing them in parallel. This paper studies the task of register allocation for software pipelined loops, both with and without hardware features that are specifically aimed at supporting software pipelines. Register allocation for software pipelines presents certain novel problems leading to unconventional solutions, especially in the presence of hardware support. This paper formulates these novel problems and presents a number of alternative solution strategies. These alternatives are comprehensively tested against over one thousand loops to determine the best register allocation strategy, both with and without the hardware support for software pipelining.

References

[1]
Allen, J.R., et al. Conversion of control dependence to data dependence. In Proceedings of the Tenth Annual ACM Symposium on Principles of Programming Languages, (January, 1983).
[2]
Berry, M., et al. The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers. The International Journal of Supercomputer Applications, 3 (1989), 5-40.
[3]
Callahan, D., Cart, S., and Kennedy, K. Improving Register Allocation for Subscripted Variables, In Proceedings of the A CM StGPLAN '90 Conference on Programming Language Design and Implementation, (June, 1990), 53- 65.
[4]
Chaitin, G.J. Register allocation and spilling via graph coloring. In Proceedings of the SIGPLAN82 Symposium on Compiler Construction, (June, 1982), 201-207.
[5]
Charlesworth, A.E. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS- 164 Family. IEEE Computer 14, 9 (September, 1981), 18- 27.
[6]
Dehnert, J.C., Hsu, P.Y.-T., and Bratt, J.P. Overlapped loop support in the Cydra 5. In Proceedings of the Third international Conference on Architectural Support for Programming Languages and Operating Systems, (Boston, Mass., April, 1989), 26-38.
[7]
Ebcioglu, K., and Nakatani, T. A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture. In Proceedings of the Second Workshop on Programming Languages and Compilers for Parallel Computing, (Urbana-Champaign, 1989), 213-229.
[8]
Hendren, L.J., et al. Register Allocation using Cyclic interval Graphs: A New Approach to an Old Problem. ACAPS Technical Memo 33. Advanced Computer Architecture and Program Structures Group, McGill University, Montreal, Canada, 1992.
[9]
Hsu, P.Y.T. Highly Concurrent Scalar Processing. CSG-49. Coordinated Science Lab., University of Illinois, Urbana, Illinois, 1986.
[10]
Jain, S. Circular scheduling: A new technique to perform software pipelining. In Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and implementation, (June, 1991), 219-228.
[11]
Lain, M. Software pipelining: an effective scheduling technique for VLIW machines. In Proceedings of the A CM SIGPLAN '88 Conference on Programming Language Design and Implementation, (June, 1988), 318-327.
[12]
Lee, R.L., Kwok, A.Y., and Briggs, F.A. The floating point performance of a superscalar SPARC processor. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, (Santa Clara, California, April, 1991), 28-37.
[13]
Nicolau, A., and Potasman, R. Realistic scheduling: compaction for pipelined architectures. In Proceedings of the 23th Annual Workshop on Microprogramming and Microarchitecture, (Orlando, Florida, November, 1990), 69-79.
[14]
Rau, B.R. Data flow and dependence analysis for instruction level parallelism. In Proceedings of the Fourth Workshop on Languages and Compilers for Parallel Computing, (Santa Clara, August, 1991).
[15]
Rau, B.R., and Glaeser, C.D. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proceedings of the Fourteenth Annual Workshop on Microprogramming, (October, 1981), 183-198.
[16]
Rau, B.R., et al. Register Allocation for Modulo Scheduled Loops: Strategies, Algorithms and Heuristics. HP Labs Technical Report HPL-92-48. Hewlett-Packard Laboratories, Palo Alto, California, 1992.
[17]
Rau, B.R., et al. Code Generation Schema for Modulo Scheduled DO-Loops and WHILE-Loops. HP Labs Technical Report HPL-92-47. Hewlett-Packard Laboratories, Palo Alto, California, 1992.
[18]
Rau, B.R., et al. The Cydra 5 departmental supercomputer: design philosophies, decisions and trade-offs, i EEE Computer 22, 1 (January, 1989).
[19]
Timmalai, P., Lee, M., and Schlansker, M.S. Parallelization of loops with exits on pipelined architectures, in Proceedings of the Supercomputing '90, (November, 1990), 200-212.
[20]
Uniejewski, .l. SPEC Benchmark Suite: Designed for Today's Advanced Systems. SPEC Newsletter 1, 1 (Fall, 1989).

Cited By

View all
  • (2023)Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS)10.1109/ICPADS56603.2022.00075(530-537)Online publication date: Jan-2023
  • (2018)A time-multiplexed FPGA overlay with linear interconnect2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE.2018.8342171(1075-1080)Online publication date: Mar-2018
  • (2016)Minimizing Register Requirements of a Modulo Schedule via Optimum Stage SchedulingInternational Journal of Parallel Programming10.1007/BF0335674424:2(103-132)Online publication date: 26-May-2016
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 27, Issue 7
July 1992
352 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/143103
Issue’s Table of Contents
  • cover image ACM Conferences
    PLDI '92: Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
    July 1992
    352 pages
    ISBN:0897914759
    DOI:10.1145/143095
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 1992
Published in SIGPLAN Volume 27, Issue 7

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)228
  • Downloads (Last 6 weeks)31
Reflects downloads up to 21 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS)10.1109/ICPADS56603.2022.00075(530-537)Online publication date: Jan-2023
  • (2018)A time-multiplexed FPGA overlay with linear interconnect2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE.2018.8342171(1075-1080)Online publication date: Mar-2018
  • (2016)Minimizing Register Requirements of a Modulo Schedule via Optimum Stage SchedulingInternational Journal of Parallel Programming10.1007/BF0335674424:2(103-132)Online publication date: 26-May-2016
  • (2016)Software Pipelining by Kernel RecognitionInstruction Level Parallelism10.1007/978-1-4899-7797-7_7(167-203)Online publication date: 30-Nov-2016
  • (2013)Allocating rotating registers by schedulingProceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture10.1145/2540708.2540738(346-358)Online publication date: 7-Dec-2013
  • (2012)On the effectiveness of register moves to minimise post-pass unrolling in software pipelined loops2012 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCSim.2012.6266972(551-558)Online publication date: Jul-2012
  • (2011)Loop unrolling minimisation in the presence of multiple register types: A viable alternative to modulo variable expansion2011 International Conference on High Performance Computing & Simulation10.1109/HPCSim.2011.5999826(207-214)Online publication date: Jul-2011
  • (2011)SIRALINAJournal of Combinatorial Optimization10.1007/s10878-010-9332-822:4(819-844)Online publication date: 1-Nov-2011
  • (2010)Instruction SchedulingThe Compiler Design Handbook10.1201/9781420040579.ch17Online publication date: 7-Mar-2010
  • (2010)Register AllocationThe Compiler Design Handbook10.1201/9781420040579.ch13Online publication date: 7-Mar-2010
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media