DOI: 10.1145/3149412.3149416

Performance and Power Characteristics and Optimizations of Hybrid MPI/OpenMP LULESH Miniapps under Various Workloads

Published: 12 November 2017

Abstract

Energy-efficient execution of scientific applications requires insight into how HPC system features affect the performance and power of the applications. In this paper, we analyze and model the performance and power characteristics of hybrid MPI/OpenMP LULESH (Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics) miniapps under various workloads using MuMMI (Multiple Metrics Modeling Infrastructure). Output from these models is then used to guide code optimizations for performance and power. Our optimization methods result in performance improvement and energy savings of up to approximately 10%. Further, based on the insight gained from our models and measurements under various workloads, applying DCT (Dynamic Concurrency Throttling) to the optimized codes yields energy savings of 43.12% to 58.30% for different problem sizes, compared with the baseline results on 27 nodes with 32 threads per node, on Shepard, a 36-node Intel Haswell testbed cluster.
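
The idea behind DCT, running with the thread count that minimizes energy rather than the maximum available, can be illustrated with a minimal sketch. It assumes energy is approximated as average power multiplied by runtime. The `Measurement` type, the helper names, and all numbers are hypothetical illustrations; they are not taken from the paper, from MuMMI, or from measurements on Shepard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One hypothetical run: threads per node, runtime, and average node power."""
    threads: int
    runtime_s: float
    avg_power_w: float

    @property
    def energy_j(self) -> float:
        # Energy approximated as average power times runtime.
        return self.runtime_s * self.avg_power_w


def pick_thread_count(measurements):
    """Return the measurement with the lowest energy."""
    return min(measurements, key=lambda m: m.energy_j)


def energy_savings_pct(baseline, candidate):
    """Percentage of baseline energy saved by the candidate configuration."""
    return 100.0 * (baseline.energy_j - candidate.energy_j) / baseline.energy_j


# Made-up sample data, for illustration only.
samples = [
    Measurement(threads=32, runtime_s=100.0, avg_power_w=300.0),  # baseline
    Measurement(threads=16, runtime_s=90.0, avg_power_w=200.0),
    Measurement(threads=8, runtime_s=150.0, avg_power_w=150.0),
]

best = pick_thread_count(samples)
savings = energy_savings_pct(samples[0], best)
print(f"threads={best.threads}, savings={savings:.1f}%")
```

With these made-up inputs, the 16-thread run uses the least energy. In practice the selection would be driven by measured or model-predicted runtime and power for each problem size.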

Cited By

  • (2024) ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales. Concurrency and Computation: Practice and Experience. DOI: 10.1002/cpe.8322. Online publication date: 30-Oct-2024
  • (2021) Performance and Energy Improvement of ECP Proxy App SW4lite under Various Workloads. 2021 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC), pp. 17-24. DOI: 10.1109/MCHPC54807.2021.00009. Online publication date: Nov-2021
  • (2019) Evaluating LULESH Kernels on OpenCL FPGA. Intelligent Information and Database Systems, pp. 199-213. DOI: 10.1007/978-3-030-17227-5_15. Online publication date: 29-Mar-2019

        Published In

        E2SC'17: Proceedings of the 5th International Workshop on Energy Efficient Supercomputing
        November 2017
        84 pages
        ISBN:9781450351324
        DOI:10.1145/3149412
        © 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. MPI
        2. OpenMP
        3. power modeling
        4. energy optimization
        5. performance modeling
        6. performance optimization

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        SC '17

        Acceptance Rates

        E2SC'17 Paper Acceptance Rate 10 of 21 submissions, 48%;
        Overall Acceptance Rate 17 of 33 submissions, 52%
