Performance comparison of MPI and three OpenMP programming styles on shared memory multiprocessors

DOI: 10.1145/777412.777433
Published: 07 June 2003

Abstract

When using a shared memory multiprocessor, the programmer must select the portable programming model that will deliver the best performance. Even when the choice is restricted to the standard programming environments (MPI and OpenMP), a broad range of programming approaches remains. To help the programmer make this selection, we compare MPI with three OpenMP programming styles (loop level, loop level with large parallel sections, SPMD) using a subset of the NAS benchmarks (CG, MG, FT, LU), two dataset sizes (A and B), and two shared memory multiprocessors (IBM SP3 NightHawk II, SGI Origin 3800). We also present a path from MPI to OpenMP SPMD to guide programmers starting from an existing MPI code. We present the first SPMD OpenMP version of the NAS benchmarks and compare it with other OpenMP versions from independent sources (PBN, SDSC and RWCP). Experimental results demonstrate that OpenMP provides competitive performance compared to MPI for a large set of experimental conditions. However, the price of this performance is a strong programming effort on dataset adaptation and inter-thread communications. MPI still provides the best performance under some conditions. We present breakdowns of the execution times and measurements of hardware performance counters to explain the performance differences.




Published In

SPAA '03: Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
June 2003
374 pages
ISBN:1581136617
DOI:10.1145/777412
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. MPI
  2. multiprocessors
  3. OpenMP
  4. performance evaluation
  5. shared memory

Qualifiers

  • Article

Conference

SPAA '03

Acceptance Rates

SPAA '03 Paper Acceptance Rate 38 of 106 submissions, 36%;
Overall Acceptance Rate 447 of 1,461 submissions, 31%


Article Metrics

  • Downloads (last 12 months): 56
  • Downloads (last 6 weeks): 12
Reflects downloads up to 07 Mar 2025


Cited By

  • (2024) Designing and prototyping extensions to the Message Passing Interface in MPICH. The International Journal of High Performance Computing Applications. DOI: 10.1177/10943420241263544. Online publication date: 19 Aug 2024.
  • (2023) Frustrated With MPI+Threads? Try MPIxThreads! Proceedings of the 30th European MPI Users' Group Meeting, pages 1-10. DOI: 10.1145/3615318.3615320. Online publication date: 11 Sep 2023.
  • (2022) Coupling of a multi-GPU accelerated elasto-visco-plastic fast Fourier transform constitutive model with the implicit finite element method. Computational Materials Science, 208:111348. DOI: 10.1016/j.commatsci.2022.111348. Online publication date: Jun 2022.
  • (2021) Parallel Implementation of Real-Time Object Detection using OpenMP. 2021 International Conference on Information Science and Communications Technologies (ICISCT), pages 1-4. DOI: 10.1109/ICISCT52966.2021.9670146. Online publication date: 3 Nov 2021.
  • (2020) parSMURF, a high-performance computing tool for the genome-wide detection of pathogenic variants. GigaScience, 9:5. DOI: 10.1093/gigascience/giaa052. Online publication date: 23 May 2020.
  • (2020) A Comparison of the Scalability of OpenMP Implementations. Euro-Par 2020: Parallel Processing, pages 83-97. DOI: 10.1007/978-3-030-57675-2_6. Online publication date: 24 Aug 2020.
  • (2020) A Methodology Approach to Compare Performance of Parallel Programming Models for Shared-Memory Architectures. Numerical Computations: Theory and Algorithms, pages 318-325. DOI: 10.1007/978-3-030-39081-5_28. Online publication date: 14 Feb 2020.
  • (2017) Accommodating Thread-Level Heterogeneity in Coupled Parallel Applications. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 469-478. DOI: 10.1109/IPDPS.2017.13. Online publication date: May 2017.
  • (2017) OpenMP Tasking Model for Ada: Safety and Correctness. Reliable Software Technologies – Ada-Europe 2017, pages 184-200. DOI: 10.1007/978-3-319-60588-3_12. Online publication date: 30 May 2017.
  • (2016) IMPACC. Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, pages 189-201. DOI: 10.1145/2907294.2907302. Online publication date: 31 May 2016.
