research-article

Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

Published: 29 March 2011

Abstract

The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers are a natural fit for hybrid programs, in which OpenMP handles data sharing among the cores that comprise a node and MPI handles communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as the basis for implementing hybrid MPI/OpenMP versions of SP and BT, and we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.
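The two-level structure described in the abstract can be illustrated with a small sketch. It is not the decomposition used by the paper's SP and BT codes, which the abstract does not describe. It is a generic block split, written in plain Python with no MPI or OpenMP calls, and the names `block_range` and `hybrid_ranges` are hypothetical. The outer split stands in for the work assigned to each MPI rank (one per node, by assumption) and the inner split for the share each OpenMP thread takes within that rank.

```python
def block_range(n, parts, idx):
    """Return the half-open range [lo, hi) of block `idx` when n items
    are split into `parts` contiguous, near-equal blocks."""
    base, extra = divmod(n, parts)
    lo = idx * base + min(idx, extra)
    hi = lo + base + (1 if idx < extra else 0)
    return lo, hi


def hybrid_ranges(n, ranks, threads):
    """Map (rank, thread) to the index range it would own.

    Assumption for illustration: one rank per node, `threads` threads
    per rank. Real hybrid codes may partition quite differently.
    """
    layout = {}
    for r in range(ranks):
        # Outer level: the block a rank would own (MPI between nodes).
        r_lo, r_hi = block_range(n, ranks, r)
        for t in range(threads):
            # Inner level: the share a thread would take (OpenMP within a node).
            t_lo, t_hi = block_range(r_hi - r_lo, threads, t)
            layout[(r, t)] = (r_lo + t_lo, r_lo + t_hi)
    return layout


if __name__ == "__main__":
    # 10 grid planes, 2 ranks, 4 threads per rank.
    for key, rng in sorted(hybrid_ranges(10, 2, 4).items()):
        print(key, rng)
```

In this sketch only the outer boundaries (between ranks) would require message passing. The inner boundaries fall within shared memory, which is where a hybrid version can reduce the number of MPI processes taking part in communication.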

Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 38, Issue 4
Special issue on the 1st international workshop on performance modeling, benchmarking and simulation of high performance computing systems (PMBS 10)
March 2011
93 pages
ISSN:0163-5999
DOI:10.1145/1964218

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Hybrid MPI/OpenMP
  2. NAS parallel benchmarks
  3. benchmarks
  4. multicore
  5. performance characteristics
  6. supercomputers
