Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Reuse-Driven Tiling for Improving Data Locality

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

This paper applies unimodular transformations and tiling to improve data locality of a loop nest. Due to data dependences and reuse information, not all dimensions of the iteration space will and can be tiled. By using cones to represent data dependences and vector spaces to quantify data reuse in the program, a reuse-driven transformational approach is presented, which aims at maximizing the amount of data reuse carried in the tiled dimensions of the iteration space while keeping the number of tiled dimensions to a minimum (to reduce loop control overhead). In the special case of one single fully permutable loop nest, an algorithm is presented that tiles the program optimally so that all data reuse is carried in the tiled dimensions. In the general case of multiple fully permutable loop nests, data dependences can prevent all data reuse to be carried in the tiled dimensions. An algorithm is presented that aims at localizing data reuse in the tiled dimensions so that the reuse space localized has the largest dimensionality possible.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

REFERENCES

  1. M. J. Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley (1996).

  2. M. E. Wolf and M. S. Lam, A Data Locality Optimizing Algorithm, Proc. ACM SIGPLAN '91 Conf. Progr. Lang. Design and Implementation, ACM (June 1991).

  3. M. J. Wolfe, More Iteration Space Tiling, Supercomputing '88, pp. 655–664 (November 1989).

  4. M. E. Wolf and M. S. Lam, A Loop Transformation Theory and an Algorithm to Maximize Parallelism, IEEE Trans. Parallel Distrib. Syst., 2(4):452–471 (October 1991).

    Google Scholar 

  5. A. Schrijver, Theory of Linear and Integer Programming, Series in Discrete Mathematics, John Wiley (1986).

  6. F. Irigoin, Loop Reordering with Dependence Direction Vectors, Technical Report EMPCAI-I A/184, Ecole Nationale Superieure des Mines de Paris (November 1988).

  7. M. E. Dyer and L. G. Proll, An Algorithm for Determining All Extreme Points of a Convex Polytope, Math. Progr., 12:81–96 (1977).

    Google Scholar 

  8. H. Le Verge, A Note on Chernikova's Algorithm, Technical Report 635, IRISA ( INRIA-Rennes) (February 1992).

  9. J. Xue, Communication-Minimal Tiling of Uniform Dependence Loops, J. Parallel Distrib. Computing, 42(1):42–59 (1997).

    Google Scholar 

  10. A. Darte and F. Vivien, A Comparison of Nested Loops Parallelization Algorithms, Technical Report 95–11, Ecole Normale Supérieure de Lyon (May 1995).

  11. F. Irigoin and R. Triolet, Supernode Partitioning, Proc. 15th Ann. ACM Symp. Principles of Progr. Lang., San Diego, California, pp. 319–329 (January 1988).

  12. J. Xue, On Tiling as a Loop Transformation, Parallel Processing Letters, 7(4):409–424 (1997).

    Google Scholar 

  13. A. Darte and F. Vivien, Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs, Proc. Int'l. Conf. Parallel Architectures and Compilation Techniques, Boston, Massachusetts, pp. 281–291 (1996).

  14. Y. Q. Yang, C. Ancourt, and F. Irigoin, Minimal Data Dependence Abstractions for Loop Transformations, Proc. seventh Workshop on Languages and Compilers for Parallel Computing, Ithaca (August 1994).

  15. A. Darte and F. Vivien, Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling, Technical Report 96–34, Ecole Normale Supérieure de Lyon (November 1996).

  16. P. Boulet, A. Darte, T. Risset, and Y. Robert, (Pen)-Ultimate Tiling, Integration, the VLSI Journal, 17:33–51 (1994).

    Google Scholar 

  17. J. Ramanujam and P. Sadayappan, Tiling Multidimensional Iteration Spaces for Multicomputers, J. Parallel and Distributed Computing 16(2):108–230 (October 1992).

    Google Scholar 

  18. H. Ohta, Y. Saito, M. Kainaga, and H. Ono, Optimal Tile Size Adjustment in Compiling for General DOACROSS Loop Nests, ACM Int'l. Conf. Supercomputing, ACM Press, pp. 270–279 (1995).

Download references

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Xue, J., Huang, CH. Reuse-Driven Tiling for Improving Data Locality. International Journal of Parallel Programming 26, 671–696 (1998). https://doi.org/10.1023/A:1018734612524

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1018734612524

Navigation