GPUWattch: enabling energy optimizations in GPGPUs

Published: 23 June 2013

Abstract

General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and performance per watt has emerged as a more crucial evaluation metric than peak performance. As such, GPU architects require robust tools that will enable them to quickly explore new ways to optimize GPGPUs for energy efficiency. We propose a new GPGPU power model that is configurable, capable of cycle-level calculations, and carefully validated against real hardware measurements. To achieve configurability, we use a bottom-up methodology and abstract parameters from the microarchitectural components as the model's inputs. We developed a rigorous suite of 80 microbenchmarks that we use to bound any modeling uncertainties and inaccuracies. The power model is comprehensively validated against measurements of two commercially available GPUs, and the measured error is within 9.9% and 13.4% for the two target GPUs (GTX 480 and Quadro FX5600). The model also accurately tracks the power consumption trend over time. We integrated the power model with the cycle-level simulator GPGPU-Sim and demonstrate the energy savings by utilizing dynamic voltage and frequency scaling (DVFS) and clock gating. Traditional DVFS reduces GPU energy consumption by 14.4% by leveraging within-kernel runtime variations. Finer-grained SM cluster-level DVFS improves the energy savings from 6.6% to 13.6% for those benchmarks that show clustered execution behavior. We also show that clock gating inactive lanes during divergence reduces dynamic power by 11.2%.
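To make the bottom-up idea concrete, the following is a minimal sketch of an activity-based power estimate, not GPUWattch's actual implementation. It assumes that each microarchitectural component contributes (access count × energy per access), that static power is a single constant, and that dynamic power follows the textbook CMOS scaling with V²·f under DVFS. All names (`Component`, `dynamic_power_w`, `scale_dynamic_power`, `relative_error`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    energy_per_access_nj: float  # hypothetical per-access energy in nanojoules
    accesses: int                # activity count, e.g. reported by a simulator


def dynamic_power_w(components, elapsed_s):
    """Sum per-component energy (accesses x energy per access), divide by time."""
    total_energy_j = sum(
        c.accesses * c.energy_per_access_nj * 1e-9 for c in components
    )
    return total_energy_j / elapsed_s


def total_power_w(components, elapsed_s, static_w):
    """Total power = activity-based dynamic power + a constant static term."""
    return dynamic_power_w(components, elapsed_s) + static_w


def scale_dynamic_power(p_dyn_w, v_ratio, f_ratio):
    """Textbook approximation: dynamic power scales with V^2 * f."""
    return p_dyn_w * v_ratio ** 2 * f_ratio


def relative_error(modeled_w, measured_w):
    """Absolute relative error of a modeled value against a measurement."""
    return abs(modeled_w - measured_w) / measured_w


if __name__ == "__main__":
    # Made-up activity counts and energies, not taken from the paper.
    parts = [
        Component("register_file", energy_per_access_nj=0.5, accesses=2_000_000),
        Component("alu", energy_per_access_nj=1.0, accesses=1_000_000),
    ]
    p_dyn = dynamic_power_w(parts, elapsed_s=1e-3)
    print("dynamic power (W):", p_dyn)
    print("total power (W):", total_power_w(parts, 1e-3, static_w=1.0))
    print("after DVFS (W):", scale_dynamic_power(p_dyn, v_ratio=0.9, f_ratio=0.8))
```

In a sketch like this, the component list and per-access energies are the configurable inputs, and `relative_error` is the kind of figure one would compare against hardware measurements during validation.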

Published In

ACM SIGARCH Computer Architecture News, Volume 41, Issue 3
ISCA '13
June 2013
666 pages
ISSN: 0163-5964
DOI: 10.1145/2508148

ISCA '13: Proceedings of the 40th Annual International Symposium on Computer Architecture
June 2013
686 pages
ISBN: 9781450320795
DOI: 10.1145/2485922

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CUDA
  2. GPU architecture
  3. energy
  4. power
  5. power estimation

Qualifiers

  • Research-article
