Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques

Published: 25 March 2016

Abstract

Power and thermal dissipation constrain multicore performance scaling. Modern processors are built such that they could sustain damaging levels of power dissipation, creating a need for systems that can implement processor power caps. A particular challenge is developing systems that maximize performance within a power cap, and approaches have been proposed in both software and hardware. Software approaches are flexible, allowing multiple hardware resources to be coordinated for maximum performance, but software is slow, requiring a long time to converge to the power target. In contrast, hardware power capping quickly converges to the power cap, but only manages voltage and frequency, limiting its potential performance. In this work we propose PUPiL, a hybrid software/hardware power capping system. Unlike previous approaches, PUPiL combines hardware's fast reaction time with software's flexibility. We implement PUPiL on a real Linux/x86 platform and compare it to Intel's commercial hardware power capping system for both single and multi-application workloads. We find PUPiL provides the same reaction time as Intel's hardware with significantly higher performance. On average, PUPiL outperforms hardware by 1.18× to 2.4×, depending on workload and power target. Thus, PUPiL provides a promising way to enforce power caps with greater performance than current state-of-the-art hardware-only approaches.
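The division of labor described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not PUPiL's actual algorithm: the power model, the performance model, the frequency levels, the core counts, and every function name below are hypothetical. A stand-in "hardware" capper picks the highest frequency that fits under the cap, while a "software" layer searches a second resource (core count) that hardware alone does not manage.

```python
# Toy illustration only: all models, constants, and names are hypothetical.
FREQS_GHZ = [1.2, 1.6, 2.0, 2.4, 2.8]   # assumed DVFS levels
CORE_COUNTS = [1, 2, 4, 8]              # assumed core allocations


def power_watts(cores, freq):
    """Invented model: static power plus a per-core term growing as f^3."""
    return 10.0 + cores * (1.0 + 0.5 * freq ** 3)


def performance(cores, freq, parallel_fraction):
    """Invented model: Amdahl-style speedup scaled by clock frequency."""
    speedup = 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
    return freq * speedup


def hardware_cap(cores, cap_watts):
    """Stand-in for a hardware capper: highest frequency under the cap.

    Falls back to the lowest frequency when no level fits.
    """
    feasible = [f for f in FREQS_GHZ if power_watts(cores, f) <= cap_watts]
    return max(feasible) if feasible else min(FREQS_GHZ)


def hardware_only(cap_watts, parallel_fraction):
    """Hardware-only capping: all cores stay on, only frequency changes."""
    cores = max(CORE_COUNTS)
    freq = hardware_cap(cores, cap_watts)
    return cores, freq, performance(cores, freq, parallel_fraction)


def hybrid_cap(cap_watts, parallel_fraction):
    """Hybrid capping: software tries core counts, hardware sets frequency.

    Returns (cores, freq, perf) for the best configuration under the cap,
    or None if no configuration respects the cap.
    """
    best = None
    for cores in CORE_COUNTS:
        freq = hardware_cap(cores, cap_watts)
        if power_watts(cores, freq) > cap_watts:
            continue  # even the lowest frequency exceeds the cap
        perf = performance(cores, freq, parallel_fraction)
        if best is None or perf > best[2]:
            best = (cores, freq, perf)
    return best


if __name__ == "__main__":
    for p in (0.5, 0.99):
        print(p, hardware_only(40.0, p), hybrid_cap(40.0, p))
```

Under these invented models, a 40 W cap and a parallel fraction of 0.5 lead the hybrid search to 2 cores at 2.8 GHz, while the hardware-only stand-in keeps 8 cores at 1.6 GHz. For a highly parallel workload (fraction 0.99) both settle on the same configuration, which matches the intuition that the benefit of coordinating extra resources depends on the workload.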

Published In

ACM SIGARCH Computer Architecture News  Volume 44, Issue 2
ASPLOS'16
May 2016
774 pages
ISSN:0163-5964
DOI:10.1145/2980024
    ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2016
    824 pages
    ISBN:9781450340915
    DOI:10.1145/2872362
    • General Chair: Tom Conte
    • Program Chair: Yuanyuan Zhou
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adaptive systems
  2. decision-tree
  3. power management

Qualifiers

  • Research-article

Funding Sources

  • Dept. of Energy
  • NSF

Cited By

  • (2024) Synergistically Rebalancing the EDP of Container-Based Parallel Applications. IEEE Transactions on Parallel and Distributed Systems 35:3 (484-498). DOI: 10.1109/TPDS.2024.3357353. Online publication date: Mar-2024
  • (2023) DPS: Adaptive Power Management for Overprovisioned Systems. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-14). DOI: 10.1145/3581784.3607091. Online publication date: 12-Nov-2023
  • (2022) MAPPER: Managing Application Performance via Parallel Efficiency Regulation*. ACM Transactions on Architecture and Code Optimization 19:2 (1-26). DOI: 10.1145/3501767. Online publication date: 24-Mar-2022
  • (2022) The Impact of CPU Voltage Margins on Power-Constrained Execution. IEEE Transactions on Sustainable Computing 7:1 (221-234). DOI: 10.1109/TSUSC.2020.3045195. Online publication date: 1-Jan-2022
  • (2022) Online Power Management for Multi-Cores: A Reinforcement Learning Based Approach. IEEE Transactions on Parallel and Distributed Systems 33:4 (751-764). DOI: 10.1109/TPDS.2021.3092270. Online publication date: 1-Apr-2022
  • (2021) Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs. ACM Transactions on Embedded Computing Systems 20:6 (1-35). DOI: 10.1145/3478288. Online publication date: 18-Oct-2021
  • (2021) Strategies for Heterogeneous Multi-Core Processing Based on Graph Programming. Proceedings of the 2021 7th International Conference on Computing and Data Engineering (54-60). DOI: 10.1145/3456172.3456196. Online publication date: 15-Jan-2021
  • (2021) UBAR. ACM Transactions on Embedded Computing Systems 20:3 (1-25). DOI: 10.1145/3441644. Online publication date: 27-Mar-2021
  • (2021) Flex. Proceedings of the 48th Annual International Symposium on Computer Architecture (319-332). DOI: 10.1109/ISCA52012.2021.00033. Online publication date: 14-Jun-2021
  • (2021) An approach to reduce energy consumption and performance losses on heterogeneous servers using power capping. Journal of Scheduling 24:5 (489-505). DOI: 10.1007/s10951-020-00649-4. Online publication date: 1-Oct-2021
