research-article

Performance analysis of the high-performance conjugate gradient benchmark on GPUs

Published: 24 November 2019

Abstract

Supercomputers accelerated by graphics processing units (GPUs) have proved very effective, especially in terms of power efficiency, at accelerating compute-intensive applications such as the high-performance Linpack (HPL) benchmark used for the TOP500 list. This paper presents the details of a CUDA implementation of the high-performance conjugate gradient (HPCG) benchmark, a newly proposed benchmark that better represents modern application workloads, which rely more heavily on memory system and network performance than HPL does. Results obtained at full scale on the largest GPU supercomputers in the world, Titan (the Cray XK7 at ORNL) and Piz Daint (the Cray XC30 at CSCS), indicate that GPU-accelerated supercomputers are also very effective for this type of workload. A comparison with other architectures is also presented, showing that GPUs, with their high memory bandwidth, are the highest-performing devices for this new benchmark.
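To illustrate why a conjugate gradient workload stresses the memory system, the following is a minimal sketch of an unpreconditioned conjugate gradient solver in plain Python. It is not the paper's CUDA implementation and omits the preconditioner and distributed communication that the full benchmark involves. The function names (`spmv`, `dot`, `conjugate_gradient`, `laplacian_1d_csr`) and the small 1D Laplacian test matrix are assumptions chosen for illustration only. Each iteration consists of one sparse matrix-vector product, two dot products, and three vector updates, all of which perform only a few floating-point operations per value loaded from memory.

```python
import math


def spmv(row_ptr, col_idx, vals, x):
    """Return y = A @ x for a matrix stored in CSR form.

    Every nonzero is read once and used for a single multiply-add,
    so the kernel is limited by memory traffic rather than arithmetic.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(row_ptr, col_idx, vals, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A.

    Returns (x, iterations). Unpreconditioned; illustrative only.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(rr) / b_norm < tol:
            return x, it
        Ap = spmv(row_ptr, col_idx, vals, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


def laplacian_1d_csr(n):
    """Build the n x n tridiagonal matrix with stencil (-1, 2, -1) in CSR form."""
    row_ptr, col_idx, vals = [0], [], []
    for i in range(n):
        if i > 0:
            col_idx.append(i - 1)
            vals.append(-1.0)
        col_idx.append(i)
        vals.append(2.0)
        if i < n - 1:
            col_idx.append(i + 1)
            vals.append(-1.0)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals


if __name__ == "__main__":
    n = 16
    row_ptr, col_idx, vals = laplacian_1d_csr(n)
    b = spmv(row_ptr, col_idx, vals, [1.0] * n)   # right-hand side for x = all ones
    x, iters = conjugate_gradient(row_ptr, col_idx, vals, b)
    print(iters, max(abs(xi - 1.0) for xi in x))
```

In this sketch the sparse matrix-vector product reads each stored value and column index once per iteration, which is the access pattern that makes memory bandwidth, rather than peak arithmetic rate, the likely limiting resource.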

Cited By

  • (2024) DBSR: An Efficient Storage Format for Vectorizing Sparse Triangular Solvers on Structured Grids. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-14. doi:10.1109/SC41406.2024.00065. Online publication date: 17-Nov-2024
  • (2023) HPCG on long-vector architectures. Future Generation Computer Systems 143:C, pp. 152-162. doi:10.1016/j.future.2023.01.015. Online publication date: 1-Jun-2023

Published In

International Journal of High Performance Computing Applications, Volume 30, Issue 1
February 2016
131 pages

Publisher

Sage Publications, Inc.

United States

Author Tags

  1. CUDA
  2. GPU computing
  3. HPC
  4. parallel computing
  5. performance analysis

Qualifiers

  • Research-article
