Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/377792.377818acmconferencesArticle/Chapter ViewAbstractPublication PagesicsConference Proceedingsconference-collections
Article

Optimizing locality for ODE solvers

Published: 17 June 2001 Publication History

Abstract

Runge-Kutta methods are popular methods for the solution of systems of ordinary differential equations and are provided by many scientific libraries. The performance of Runge-Kutta methods does not only depend on the specific application problem to be solved but also on the characteristics of the target machine. For processors with memory hierarchy, the locality of data referencing pattern has a large impact on the efficiency of a program. In this paper, we describe program transformations for Runge-Kutta methods resulting in programs with improved locality behavior. The transformations are based on properties of the solution method but are independent from the specific application problem or the specific target machine, so that the resulting implementation is suitable as library function. We show that the locality improvement leads to performance gains on different target machines. We also demonstrate how the locality of memory references can be further increased by exploiting the dependence structure of the right hand side function of specific ordinary differential equations.

References

[1]
E. Anderson, Z. Bai, C. Bischof, L. S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarlin, A. McKenney, and D. Sorensen. LAPACK Users' Guide, Third Edition. SIAM, 1999.
[2]
R. Berrendorf and B. Mohr. PCL - The Performance Counter Library, Version 2.0. Research Centre J. ulich, September 2000.
[3]
J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel. Optimizing Matrix Multiply using PHiPAC: a Portable, High-Performance, ANSI C Coding Methodology. In 11th ACM Int. Conf. on Supercomputing, 1997.
[4]
R.W. Brankin, I. Gladwell, and L.F. Shampine. RKSUITE release 1.0, 1991.
[5]
K. Burrage. Parallel and Sequential Methods for Ordinary Differential Equations. Oxford Science Publications, 1995.
[6]
J. Choi, J.J. Dongarra, L.S. Ostrouchov, A.P. Petitet, D.W. Walker, and R.C. Whaley. Design and Implementation of the ScaLAPACK LU, QR and Cholesky Factorization Routines. Scientific Programming, 5:173-184, 1996.
[7]
Wayne H. Enright, Desmond J. Higham, Brynjulf Owren, and Philip W. Sharp. A Survey of the Explicit Runge-Kutta Method. Technical Report 94-291, University of Toronto, Department of Computer Science, 1995.
[8]
E. Fehlberg. Classical fifth-, sixth-, seventh- and eighth order Runge-Kutta formulas with step size control. Computing, 4:93-106, 1969.
[9]
K.S. Gatlin and L. Carter. Architecture-Cognizant Divide and Conquer Algorithms. In Proc. of Supercomputing'99 Conference, 1999.
[10]
E. Hairer, S.P. N.rsett, and G. Wanner. Solving Ordinary Differential Equations I: Nonsti Problems. Springer-Verlag, Berlin, 1993.
[11]
Fran.cois Irigoin and Remi Triolet. Supernode partitioning. In 15th Annual ACM Symposium on Principles of Programming Languages, pages 319-329, San Diego, Calif., January 1988.
[12]
M. Kandemira, J. Ramanujam, and A. Choudhary. Compiler algorithms for optimizing locality and parallelism on shared and distributed memory machines. Journal of Parallel and Distributed Computing, 60:924-965, August 2000.
[13]
P.J. Prince and J.R. Dormand. High order embedded Runge-Kutta formulae. J. Comp. Appl. Math., 7(1):67-75, 1981.
[14]
T. Rauber and G. R. unger. Diagonal-Implicitly Iterated Runge-Kutta Methods on Distributed Memory Machines. Int. Journal of High Speed Computing, 10(2):185-207, 1999.
[15]
T. Rauber and G. R. unger. Parallel Execution of Embedded and Iterated Runge-Kutta Methods. Concurrency: Practice andExperience, 11(7):367-385, 1999.
[16]
L. Stals and U. R. ude. Data Local Iterative Methods for the Efficient SolutionofPartial Differential Equations. In Proc. of Computational Techniques and Applications, 1997.
[17]
C. Wei., W. Karl, M. Kowarschik, and U. R. ude. Memory Characteristics of Iterative Methods. In Proceedings of the ACM/IEEE SC99 Conference, Portland, Oregon, November 1999.
[18]
R.C. Whaley and J.J. Dongarra. Automatically tuned linear algebra software. Technical report, University of Tennessee, 1999.
[19]
M. Wolfe. High Performance Compilers for Parallel Computing. Addison Wesley, 1996.

Cited By

View all
  • (2021)Modeling the effect of application-specific program transformations on energy and performance improvements of parallel ODE solversJournal of Computational Science10.1016/j.jocs.2021.10135651(101356)Online publication date: Apr-2021
  • (2018)How do Loop Transformations Affect the Energy Consumption of Multi-Threaded Runge-Kutta Methods?2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)10.1109/PDP2018.2018.00085(499-507)Online publication date: Mar-2018
  • (2018)Energy and Performance Improvement of Parallel ODE Solvers by Application-Specific Program Transformations2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW.2018.00151(967-976)Online publication date: May-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICS '01: Proceedings of the 15th international conference on Supercomputing
June 2001
510 pages
ISBN:158113410X
DOI:10.1145/377792
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 June 2001

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

ICS01
Sponsor:

Acceptance Rates

ICS '01 Paper Acceptance Rate 45 of 133 submissions, 34%;
Overall Acceptance Rate 629 of 2,180 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Modeling the effect of application-specific program transformations on energy and performance improvements of parallel ODE solversJournal of Computational Science10.1016/j.jocs.2021.10135651(101356)Online publication date: Apr-2021
  • (2018)How do Loop Transformations Affect the Energy Consumption of Multi-Threaded Runge-Kutta Methods?2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)10.1109/PDP2018.2018.00085(499-507)Online publication date: Mar-2018
  • (2018)Energy and Performance Improvement of Parallel ODE Solvers by Application-Specific Program Transformations2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW.2018.00151(967-976)Online publication date: May-2018
  • (2008)Performance effects of gram-schmidt orthogonalization on multi-core infiniband clusters2008 IEEE International Symposium on Parallel and Distributed Processing10.1109/IPDPS.2008.4536474(1-8)Online publication date: Apr-2008
  • (2008)Cache optimization for mixed regular and irregular computations2008 IEEE International Symposium on Parallel and Distributed Processing10.1109/IPDPS.2008.4536184(1-8)Online publication date: Apr-2008
  • (2004)Performance optimization of RK methods using block-based pipeliningPerformance analysis and grid computing10.5555/976094.976098(41-56)Online publication date: 1-Jan-2004
  • (2004)Simulation-based analysis of parallel runge-kutta solversProceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing10.1007/11558958_133(1105-1114)Online publication date: 20-Jun-2004
  • (2003)A distributed hierarchical programming model for heterogeneous cluster of SMPsProceedings International Parallel and Distributed Processing Symposium10.1109/IPDPS.2003.1213307(8)Online publication date: 2003
  • (2003)Program-based locality measures for scientific computingProceedings International Parallel and Distributed Processing Symposium10.1109/IPDPS.2003.1213306(8)Online publication date: 2003
  • (2003)Scalable Parallel RK Solvers for ODEs Derived by the Method of LinesEuro-Par 2003 Parallel Processing10.1007/978-3-540-45209-6_113(830-839)Online publication date: 2003
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media