Abstract
As the HPC community attempts to reach exascale performance, power will be one of the most critical constrained resources. Achieving practical exascale computing will therefore rely on optimizing performance subject to a power constraint. However, this additional complication should not add to the burden of application developers; optimizing the run-time environment given restricted power will primarily be the job of high-performance system software.
This paper introduces Conductor, a run-time system that intelligently distributes available power to nodes and cores to improve performance. The key techniques used are configuration space exploration and adaptive power balancing. Configuration exploration dynamically selects the optimal thread concurrency level and DVFS state subject to a hardware-enforced power bound. Adaptive power balancing efficiently determines where critical paths are likely to occur so that more power is distributed to those paths. Greater power, in turn, allows higher thread concurrency levels, higher DVFS states, or both. We describe these techniques in detail and show that, compared to the state-of-the-art technique of using statically predetermined, per-node power caps, Conductor leads to a best-case performance improvement of up to 30% and an average improvement of 19.1%.
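The two techniques named in the abstract can be illustrated with a minimal sketch. This is not Conductor's implementation: the `Config` record, the `select_config` and `rebalance` functions, the step size, the minimum cap, and all numbers are hypothetical, chosen only to show the idea of picking the fastest (thread count, DVFS state) pair under a power bound and then shifting power toward the node that is likely on the critical path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    threads: int     # thread concurrency level
    freq_ghz: float  # DVFS state, expressed as a clock frequency
    power_w: float   # power draw in watts (hypothetical measurement)
    time_s: float    # time per step in seconds (hypothetical measurement)


def select_config(configs, power_cap_w):
    """Return the fastest configuration whose power fits under the cap."""
    feasible = [c for c in configs if c.power_w <= power_cap_w]
    if not feasible:
        raise ValueError("no configuration fits under the power cap")
    return min(feasible, key=lambda c: c.time_s)


def rebalance(caps_w, times_s, step_w=5.0, min_cap_w=40.0):
    """Move up to step_w watts from the fastest node to the slowest node.

    The slowest node is treated as lying on the critical path. The sum of
    the per-node caps is preserved, so the job-level bound still holds.
    """
    slow = max(range(len(times_s)), key=times_s.__getitem__)
    fast = min(range(len(times_s)), key=times_s.__getitem__)
    new_caps = list(caps_w)
    if slow == fast:
        return new_caps  # all nodes finished together; nothing to shift
    moved = min(step_w, new_caps[fast] - min_cap_w)
    if moved > 0:
        new_caps[fast] -= moved
        new_caps[slow] += moved
    return new_caps


# Hypothetical per-node profile: three candidate configurations.
profile = [
    Config(threads=8, freq_ghz=2.6, power_w=95.0, time_s=10.0),
    Config(threads=8, freq_ghz=2.0, power_w=70.0, time_s=12.0),
    Config(threads=4, freq_ghz=2.6, power_w=60.0, time_s=15.0),
]
best = select_config(profile, power_cap_w=80.0)
caps = rebalance([80.0, 80.0], times_s=[10.0, 14.0])
```

In this sketch the node that receives extra power could then rerun `select_config` with its larger cap, which is how greater power would translate into a higher thread count or DVFS state.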
Acknowledgements
Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF-667408).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Marathe, A., Bailey, P.E., Lowenthal, D.K., Rountree, B., Schulz, M., de Supinski, B.R. (2015). A Run-Time System for Power-Constrained HPC Applications. In: Kunkel, J., Ludwig, T. (eds) High Performance Computing. ISC High Performance 2015. Lecture Notes in Computer Science, vol 9137. Springer, Cham. https://doi.org/10.1007/978-3-319-20119-1_28
DOI: https://doi.org/10.1007/978-3-319-20119-1_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20118-4
Online ISBN: 978-3-319-20119-1
eBook Packages: Computer Science (R0)