Abstract
The performance of scientific programs on superscalar processors can be significantly degraded by memory references that frequently arise due to load and store operations associated with array references. Therefore, register allocation techniques have been developed for allocating registers to array elements whose values are repeatedly referenced over one or more loop iterations. To place load, store, and register-to-register shift operations without introducing fully/partially redundant and dead memory operations, a detailed value flow analysis of array references is required. We present an analysis framework to efficiently solve various data flow problems required by array load-store optimizations. The framework determines the collective behavior of recurrent references spread over multiple loop iterations. We also demonstrate how our algorithms can be adapted for various fine-grain architectures.
Partially supported by National Science Foundation Presidential Young Investigator Award CCR-9157371 to the University of Pittsburgh and a grant from Hewlett-Packard Laboratories.
Preview
Unable to display preview. Download preview PDF.
References
M.E. Benitez and J.W. Davidson, “Code Generation for Streaming: an Access/Execute Mechanism,” Proceedings of Arch. Support for Programming Languages and Operating Systems-IV, pages 132–141, 1991.
R. Bodik and R. Gupta, “Optimal Placement of Load-Store Operations for Array Accesses in Loops,” Technical report 95-03, DCS, Univ. of Pittsburgh, 1995.
D. Callahan, S. Carr, and K. Kennedy, “Improving Register Allocation for Subscripted Variables,” Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, White Plains, New York, pages 53–65, June 1990.
S. Carr and K. Kennedy, “Scalar Replacement in the Presence of Conditional Control Flow,” Software-Practice and Experience, Vol. 24, No. 1, pages 51–77, Jan. 1994.
G.J. Chaitin, “Register Allocation and Spilling via Graph Coloring,” Proceedings of the SIGPLAN Symposium on Compiler Construction, SIGPLAN Notices, vol. 17, no. 6, pages 98–105, June 1982.
T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms, The MIT Press, Cambridge, Massachusetts, 1990.
J.C. Dehnert, P.Y.-T. Hsu, and J.P. Bratt, “Overlapped Loop Support in the Cydra 5,” Proceedings of ASPLOS-III, pages 26–39, 1989.
D.M. Dhamdhere, “Practical Adaptation of the Global Optimization Algorithm of Morel and Renvoise,” ACM Transactions on Programming Languages and Systems, Volume 13, No. 2, pages 291–294, April 1991.
D.M. Dhamdhere, B.K. Rosen and F.K. Zadeck, “How to Analyze Large Programs Efficiently and informatively,” Proc. of the SIGPLAN PLDI, San Francisco, California, pages 212–223, June 1992.
E. Duesterwald, R. Gupta, and M.L. Soffa, “Register Pipelining: An Integrated Approach to Register Allocation for Scalar and Subscripted Variables,” Proc. of International Workshop on Compiler Construction, LNCS 641 Springer Verlag, pages 192–206, Paderborn, Germany, October 1992.
E. Duesterwald, R. Gupta, and M.L. Soffa, “A Practical Data Flow Framework for Array Reference Analysis and its Application in Optimizations,” Proc. of ACM SIGPLAN Conference on Programming Language Design and Implementation, pp.68–77, Albuquerque, New Mexico, June 1993.
R. Gupta, “Generalized Dominators and Post-Dominators,” The Nineteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 246–257, Albuquerque, New Mexico, January 1992.
L. Hendren, G.R. Gao, E.R. Altman, and C. Mukerji, “A Register Allocation Framework based upon Hierarchical Cyclic Interval Graphs,” Intnl. Workshop on Compiler Construction, LNCS 641 Springer Verlag, pages 176–191, Germany, 1992.
V. Kathail, M. Schlansker, and B. Rau, HPL PlayDoh Architecture Specification: Version 1.0, HPL-93-80, February, 1994.
J. Knoop, O. Ruthing, and B. Steffen, “Optimal Code Motion: Theory and Practice,” ACM TOPLAS, vol. 16, num. 4, 1117–1155.
P. Kolte and M.J. Harrold, “Load/Store Range Analysis for Global Register Allocation,” Proc. of the SIGPLAN Conference on Programming Language Design and Implementation, Albuquerque, New Mexico, pages 268–277, June 1994.
E. Morel and C. Renvoise, “Global Optimization by Suppression of Partial Redundancies,” Communications of the ACM, Volume 22, No. 2, pages 96–103, 1979.
M. Wolfe and U. Banerjee, “Data Dependence and Its Application to Parallel Processing,” Intl. Journal of Parallel Programming, Vol. 16, No. 2, April 1987.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bodík, R., Gupta, R. (1996). Array data flow analysis for load-store optimizations in superscalar architectures. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014188
Download citation
DOI: https://doi.org/10.1007/BFb0014188
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1
eBook Packages: Springer Book Archive