Abstract
This paper applies unimodular transformations and tiling to improve data locality of a loop nest. Due to data dependences and reuse information, not all dimensions of the iteration space will and can be tiled. By using cones to represent data dependences and vector spaces to quantify data reuse in the program, a reuse-driven transformational approach is presented, which aims at maximizing the amount of data reuse carried in the tiled dimensions of the iteration space while keeping the number of tiled dimensions to a minimum (to reduce loop control overhead). In the special case of one single fully permutable loop nest, an algorithm is presented that tiles the program optimally so that all data reuse is carried in the tiled dimensions. In the general case of multiple fully permutable loop nests, data dependences can prevent all data reuse to be carried in the tiled dimensions. An algorithm is presented that aims at localizing data reuse in the tiled dimensions so that the reuse space localized has the largest dimensionality possible.
Similar content being viewed by others
REFERENCES
M. J. Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley (1996).
M. E. Wolf and M. S. Lam, A Data Locality Optimizing Algorithm, Proc. ACM SIGPLAN '91 Conf. Progr. Lang. Design and Implementation, ACM (June 1991).
M. J. Wolfe, More Iteration Space Tiling, Supercomputing '88, pp. 655–664 (November 1989).
M. E. Wolf and M. S. Lam, A Loop Transformation Theory and an Algorithm to Maximize Parallelism, IEEE Trans. Parallel Distrib. Syst., 2(4):452–471 (October 1991).
A. Schrijver, Theory of Linear and Integer Programming, Series in Discrete Mathematics, John Wiley (1986).
F. Irigoin, Loop Reordering with Dependence Direction Vectors, Technical Report EMPCAI-I A/184, Ecole Nationale Superieure des Mines de Paris (November 1988).
M. E. Dyer and L. G. Proll, An Algorithm for Determining All Extreme Points of a Convex Polytope, Math. Progr., 12:81–96 (1977).
H. Le Verge, A Note on Chernikova's Algorithm, Technical Report 635, IRISA ( INRIA-Rennes) (February 1992).
J. Xue, Communication-Minimal Tiling of Uniform Dependence Loops, J. Parallel Distrib. Computing, 42(1):42–59 (1997).
A. Darte and F. Vivien, A Comparison of Nested Loops Parallelization Algorithms, Technical Report 95–11, Ecole Normale Supérieure de Lyon (May 1995).
F. Irigoin and R. Triolet, Supernode Partitioning, Proc. 15th Ann. ACM Symp. Principles of Progr. Lang., San Diego, California, pp. 319–329 (January 1988).
J. Xue, On Tiling as a Loop Transformation, Parallel Processing Letters, 7(4):409–424 (1997).
A. Darte and F. Vivien, Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs, Proc. Int'l. Conf. Parallel Architectures and Compilation Techniques, Boston, Massachusetts, pp. 281–291 (1996).
Y. Q. Yang, C. Ancourt, and F. Irigoin, Minimal Data Dependence Abstractions for Loop Transformations, Proc. seventh Workshop on Languages and Compilers for Parallel Computing, Ithaca (August 1994).
A. Darte and F. Vivien, Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling, Technical Report 96–34, Ecole Normale Supérieure de Lyon (November 1996).
P. Boulet, A. Darte, T. Risset, and Y. Robert, (Pen)-Ultimate Tiling, Integration, the VLSI Journal, 17:33–51 (1994).
J. Ramanujam and P. Sadayappan, Tiling Multidimensional Iteration Spaces for Multicomputers, J. Parallel and Distributed Computing 16(2):108–230 (October 1992).
H. Ohta, Y. Saito, M. Kainaga, and H. Ono, Optimal Tile Size Adjustment in Compiling for General DOACROSS Loop Nests, ACM Int'l. Conf. Supercomputing, ACM Press, pp. 270–279 (1995).
Rights and permissions
About this article
Cite this article
Xue, J., Huang, CH. Reuse-Driven Tiling for Improving Data Locality. International Journal of Parallel Programming 26, 671–696 (1998). https://doi.org/10.1023/A:1018734612524
Issue Date:
DOI: https://doi.org/10.1023/A:1018734612524