Energy-efficient Application Resource Scheduling using Machine Learning Classifiers

Published: 13 August 2018

Abstract

Resource scheduling in high performance computing (HPC) usually aims to minimize application runtime rather than optimize for energy efficiency. Most existing research on reducing power and energy consumption imposes the constraint that little or no performance loss is allowed, which improves but still does not maximize energy efficiency. By optimizing for energy efficiency instead of application turnaround time, we can reduce the cost of running scientific applications. We propose using machine learning classification, driven by low-level hardware performance counters, to predict the most energy-efficient resource settings to use during application runtime; unlike static resource scheduling, this approach dynamically adapts to changing application behavior. We evaluate our approach on a large shared-memory system using four complex bioinformatics HPC applications, decreasing energy consumption over the naive race scheduler by 20% on average, and by as much as 38%. An average increase in runtime of 31% is dominated by a 39% reduction in power consumption, from which we extrapolate the potential for a 24% increase in throughput on future over-provisioned, power-constrained clusters. This work demonstrates that low-overhead classification is suitable for dynamically optimizing energy efficiency during application runtime.
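To make the classification step concrete, below is a minimal, hypothetical sketch of the kind of pipeline the abstract describes: a tree-ensemble classifier (scikit-learn's ExtraTreesClassifier is assumed here) is trained offline on hardware-counter features and then queried periodically at runtime to select a resource configuration. The feature set, label encoding, and model choice are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch (not the authors' implementation): train a classifier
# offline on hardware performance-counter samples, then query it at runtime
# to select an energy-efficient resource configuration.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Offline training data: each row is one sampling interval, described by
# counter-derived features such as instructions per cycle, memory bandwidth
# (GB/s), and last-level-cache miss rate. Values below are made up.
X_train = np.array([
    [1.8, 12.0, 0.02],   # compute-bound interval
    [0.4, 85.0, 0.31],   # memory-bound interval
    [1.1, 40.0, 0.12],   # mixed interval
])
# Labels index the most energy-efficient configuration found offline for that
# behavior (hypothetical encoding: 0 = all cores / high frequency,
# 1 = fewer cores / low frequency, 2 = an intermediate setting).
y_train = np.array([0, 1, 2])

clf = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def choose_setting(counter_sample):
    """Online step: map the latest counter sample to a configuration label."""
    return int(clf.predict(np.asarray(counter_sample).reshape(1, -1))[0])

# Example query: classify a fresh counter sample taken during execution.
print(choose_setting([0.5, 80.0, 0.28]))
```

At runtime, a monitoring loop would collect counters over each interval (e.g., with a tool such as Intel PCM), call choose_setting, and apply the predicted configuration before the next interval; the prediction itself is cheap relative to the sampling period, which is what makes low-overhead online classification feasible. One way to read the throughput extrapolation from the reported numbers: under a fixed cluster power budget, a 39% reduction in per-job power admits roughly 1/(1 − 0.39) ≈ 1.64× as many concurrent jobs, and dividing by the 1.31× longer per-job runtime leaves about 1.25× aggregate throughput, consistent with the stated 24%.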

Information

Published In

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing
August 2018
945 pages
ISBN:9781450365109
DOI:10.1145/3225058
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • University of Oregon

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 August 2018

Author Tags

  1. Energy Efficiency
  2. Machine Learning
  3. Power Management
  4. Runtime Control

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2018

Acceptance Rates

ICPP '18 Paper Acceptance Rate: 91 of 313 submissions, 29%
Overall Acceptance Rate: 91 of 313 submissions, 29%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 288
  • Downloads (Last 6 weeks): 32
Reflects downloads up to 30 Nov 2024

Cited By
  • (2024) Planter: Rapid Prototyping of In-Network Machine Learning Inference. ACM SIGCOMM Computer Communication Review 54(1), 2-21. DOI: 10.1145/3687230.3687232. Online publication date: 6-Aug-2024.
  • (2023) Energy-Aware Scheduling for High-Performance Computing Systems: A Survey. Energies 16(2), 890. DOI: 10.3390/en16020890. Online publication date: 12-Jan-2023.
  • (2023) Fusion Orchestration Guidelines (FOG) for Collaborative Computing and Network Data Fusion. NAECON 2023 - IEEE National Aerospace and Electronics Conference, 286-293. DOI: 10.1109/NAECON58068.2023.10365788. Online publication date: 28-Aug-2023.
  • (2022) GOAL: Supporting General and Dynamic Adaptation in Computing Systems. Proceedings of the 2022 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, 16-32. DOI: 10.1145/3563835.3567655. Online publication date: 29-Nov-2022.
  • (2022) Penelope: Peer-to-peer Power Management. Proceedings of the 51st International Conference on Parallel Processing, 1-11. DOI: 10.1145/3545008.3545047. Online publication date: 29-Aug-2022.
  • (2022) A Survey of Machine Learning for Computer Architecture and Systems. ACM Computing Surveys 55(3), 1-39. DOI: 10.1145/3494523. Online publication date: 3-Feb-2022.
  • (2022) Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters. IEEE Transactions on Parallel and Distributed Systems 33(12), 4083-4099. DOI: 10.1109/TPDS.2022.3181096. Online publication date: 1-Dec-2022.
  • (2022) Online Power Management for Multi-Cores: A Reinforcement Learning Based Approach. IEEE Transactions on Parallel and Distributed Systems 33(4), 751-764. DOI: 10.1109/TPDS.2021.3092270. Online publication date: 1-Apr-2022.
  • (2022) Amphis: Managing Reconfigurable Processor Architectures With Generative Adversarial Learning. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41(11), 3993-4003. DOI: 10.1109/TCAD.2022.3197980. Online publication date: Nov-2022.
  • (2022) Comparative Study on Energy-Efficiency for Wireless Body Area Network using Machine Learning Approach. 2022 Seventh International Conference on Parallel, Distributed and Grid Computing (PDGC), 372-377. DOI: 10.1109/PDGC56933.2022.10053368. Online publication date: 25-Nov-2022.
