Abstract
Data-parallel languages such as High Performance Fortran, Vienna Fortran and Fortran D include directives for alignment and distribution that describe how data and computation are mapped onto the processors in a distributed-memory multiprocessor. A compiler for these languages that generates code for each processor has to compute the sequence of local memory addresses accessed by each processor and the sequence of sends and receives that a given processor needs in order to access non-local data. While the address generation problem has received much attention, issues in communication have not been dealt with extensively. A novel approach for the management of communication sets and strategies for local storage of remote references is presented. Algorithms for deriving communication patterns are discussed first. Then, two schemes are presented that extend the notion of a local array by providing storage for non-local elements (called overlap regions) interspersed throughout the storage for the local portion. The two schemes, namely course padding and column padding, enhance locality of reference significantly at the cost of a small overhead due to unpacking of messages. The performance of these schemes is compared to the traditional buffer-based approach, and improvements of up to 30% in total time are demonstrated. Several message optimizations such as offset communication, message aggregation and coalescing are also discussed.
Supported in part by NSF Young Investigator Award CCR-9457768, by NSF grant CCR-9210422, and by the Louisiana Board of Regents through contract LEQSF (1991–94)-RD-A-09.
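To make the two compiler tasks named in the abstract concrete, the following minimal sketch computes local addresses and communication sets for a one-dimensional array under a cyclic(k) distribution. It is not the authors' algorithm: it enumerates indices by brute force, and the function names (`owner`, `local_index`, `receive_sets`) are hypothetical. It assumes 0-based indices, an identity alignment, the owner-computes rule, and the shifted assignment A(i) = B(i + shift).

```python
from collections import defaultdict


def owner(i, k, p):
    """Processor owning global index i under cyclic(k) on p processors."""
    return (i // k) % p


def local_index(i, k, p):
    """Offset of global index i within its owner's local array."""
    return (i // (k * p)) * k + i % k


def receive_sets(n, k, p, shift):
    """For A(i) = B(i + shift), 0 <= i < n - shift, map each
    (receiver, sender) pair to the global B indices that must be sent.

    Assumption: the owner of A(i) performs the assignment.
    """
    sets = defaultdict(list)
    for i in range(n - shift):
        dst = owner(i, k, p)          # processor executing the assignment
        src = owner(i + shift, k, p)  # processor holding the B element
        if src != dst:
            sets[(dst, src)].append(i + shift)
    return dict(sets)


if __name__ == "__main__":
    # 16 elements, block size 2, 2 processors, shift of 1.
    for pair, indices in sorted(receive_sets(16, 2, 2, 1).items()):
        print(pair, indices)
```

With a shift of 1, each processor in this example needs exactly one remote element immediately past each of its local blocks. That is the kind of non-local reference an overlap region placed next to the local data could hold, although the paper's padding schemes themselves are not reproduced here.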
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Venkatachar, A., Ramanujam, J., Thirumalai, A. (1997). Generalized overlap regions for communication optimization in data-parallel programs. In: Sehr, D., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1996. Lecture Notes in Computer Science, vol 1239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0017266
DOI: https://doi.org/10.1007/BFb0017266
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63091-3
Online ISBN: 978-3-540-69128-0
eBook Packages: Springer Book Archive