Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/209936.209948acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
Article
Free access

A linear-time algorithm for computing the memory access sequence in data-parallel programs

Published: 01 August 1995 Publication History

Abstract

Data-parallel languages, such as High Performance Fortran, are widely regarded as a promising means for writing portable programs for distributed-memory machines. Novel features of these languages call for the development of new techniques in both compilers and run-time systems. In this paper, we present an improved algorithm for finding the local memory access sequence in computations involving regular sections of arrays with cyclic(k) distributions. After establishing the fact that regular section indices correspond to elements of an integer lattice, we show how to find a lattice basis that allows for simple and fast enumeration of memory accesses. The complexity of our algorithm is shown to be lower than that of the previous solution for the same problem. In addition, the experimental results demonstrate the efficiency of our method in practice.

References

[1]
C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Proceedings of the Fourth Workshop on Compilers for Parallel Computers, Delft, The Netherlands, December 1993.
[2]
B. Chapman, P. Mehrotra, and H. Zima. Programming in Vienna Fortran. Scientific Programming, 1(1):31-50, Fall 1992.
[3]
S. Chatterjee. Private communication, October 1994.
[4]
S. Chatterjee, J. Gilbert, F. Long, R. Schreiber, and S. Teng. Generating local addresses and communication sets for dataparallel programs. In Proceedings of the Fourth A CM SIG- PLAN Symposium on Principles and Practice of Parallel Programming, San Diego, CA, May 1993.
[5]
T.H. Cormen, C,E. Lelserson, and R.L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, MA, 1990.
[6]
J. Dongarra, R. van de Geijn, and D. Walker. A look at scalable dense linear algebra libraries. In Proceedings of the 1992 Scalable High Performance Computing Conference, pages 372-379, Williamsburg, VA, April 1992.
[7]
S.K.S. Gupta, S.D. Kaushik, C.-H. Huang, and P. Sadayappan. On compiling array expressions for efficient execution on distributed-memory machines. Teclmical Report OSE- CISRC-4/94-TR19, Department of Computer and Information Science, The Ohio State University, April 1994.
[8]
High Performance Fortran Forum. High Performance Fortran language specification. Scientific Programming, 2(1- 2):1-170, 1993.
[9]
S. Hiranandani, K. Kennedy, J. Mellor-Crummey, and A. Sethi. Compilation techniques for block-cyclic distributions. In Proceedings of the 199# ACM international Conference on Supercomputlng, Manchester, England, July 1994.
[10]
S. Hiranandani, K. Kemmdy, and C.-W. Tseng. Compiling Fortran D for MIMD distributed-memory machines. Communications of the A CM, 35(8):66-80, August 1992.
[11]
R. Kannan. Algorithmic geometry of numbers. In J. Traub, editor, Annual Review of Computer Science. Annual Reviews hlc., Polo Alto, CA, 1987.
[12]
K. Kennedy, N. Nedeljkovid, and A. Sethi. Efficient address generation for block-cyclic distributions. In Proceedings of the 1995 A CM International Conference on Supercomputing, Barcelona, Spain, July 1995.
[13]
A. Knies, M. O'Keefe, and T. MacDonald. High Performance Fortran: A practical analysis. Scientific Programming, 3(3):187-199, Fall 1994.
[14]
C. Koelbel, D. Loveman, R. Schreiber# G. Steele, Jr., and M. Zosel. The High Performance Fortran Handbook. The MIT Press, Cambridge, MA, 1994.
[15]
C. van Reeuwijk, H. J. Sips, W. Denissen, a#ld E. M. Paalvast. Implementing HPF distributed arrays on a mesagepassing parallel computer. Tectutical report, Advanced School of Computing and Imaging, Delft University of Technology, February 1995.
[16]
J. Stichnoth, D. O'Hallaxon, aald T. Gross. Generating communication for array statements: Design, implementation, and evaluation. Journal of Parallel and Distributed Computing, 21(1):150-159, April 1994.

Cited By

View all
  • (2022)DeinsumProceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis10.5555/3571885.3571918(1-15)Online publication date: 13-Nov-2022
  • (2022)Deinsum: Practically I/O Optimal Multi-Linear AlgebraSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00030(1-15)Online publication date: Nov-2022
  • (2014)Author retrospectiveACM International Conference on Supercomputing 25th Anniversary Volume10.1145/2591635.2591651(29-31)Online publication date: 10-Jun-2014
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PPOPP '95: Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
August 1995
234 pages
ISBN:0897917006
DOI:10.1145/209936
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 1995

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

PPoPP95
Sponsor:
PPoPP95: Principles & Practices of Parallel Programming
July 19 - 21, 1995
California, Santa Barbara, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)81
  • Downloads (Last 6 weeks)19
Reflects downloads up to 19 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)DeinsumProceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis10.5555/3571885.3571918(1-15)Online publication date: 13-Nov-2022
  • (2022)Deinsum: Practically I/O Optimal Multi-Linear AlgebraSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00030(1-15)Online publication date: Nov-2022
  • (2014)Author retrospectiveACM International Conference on Supercomputing 25th Anniversary Volume10.1145/2591635.2591651(29-31)Online publication date: 10-Jun-2014
  • (2005)Code generation for complex subscripts in data-parallel programsLanguages and Compilers for Parallel Computing10.1007/BFb0032683(49-63)Online publication date: 9-Jun-2005
  • (2005)Optimizing the representation of local iteration sets and access sequences for block-cyclic distributionsLanguages and Compilers for Parallel Computing10.1007/BFb0017267(420-434)Online publication date: 10-Jun-2005
  • (2005)Generalized overlap regions for communication optimization in data-parallel programsLanguages and Compilers for Parallel Computing10.1007/BFb0017266(404-419)Online publication date: 10-Jun-2005
  • (2005)A communication backend for parallel language compilersLanguages and Compilers for Parallel Computing10.1007/BFb0014202(224-238)Online publication date: 9-Jun-2005
  • (2005)Compiling array statements for efficient execution on distributed-memory machines: Two-level mappingsLanguages and Compilers for Parallel Computing10.1007/BFb0014201(209-223)Online publication date: 9-Jun-2005
  • (2005)Fast address sequence generation for data-parallel programs using integer latticesLanguages and Compilers for Parallel Computing10.1007/BFb0014200(191-208)Online publication date: 9-Jun-2005
  • (2004)An efficient algorithm for communication set generation of data parallel programs with block-cyclic distributionParallel Computing10.1016/j.parco.2004.02.00130:4(473-501)Online publication date: 1-Apr-2004
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media