Optimizing total power of many-core processors considering voltage scaling limit and process variations

Authors:

Nam Sung Kim

ISLPED '09: Proceedings of the 2009 ACM/IEEE international symposium on Low power electronics and design

Pages 201 - 206

https://doi.org/10.1145/1594233.1594283

Published: 19 August 2009

Abstract

Recently, processor manufacturers have integrated more than a hundred cores in a single die to deliver extremely high throughput for highly parallel, data-intensive applications such as physics simulations and 3D graphics. Meanwhile, excessive power consumption, rather than silicon area, will limit the performance of many-core processors running these applications. In this paper, to optimize the total power of many-core processors, we analyze the impact of 1) the number of cores, 2) parallelism in applications, and 3) the supply voltage scaling limit due to on-die memory failure at low supply voltage. Our analysis shows that doubling the number of cores at a lower-than-nominal supply voltage offers the most cost-effective power reduction, resulting in up to 65% less power consumption for highly parallel applications even when supply voltage scaling is limited to 0.7V. The reduced power, in turn, can be used to improve throughput at higher voltage in power-constrained many-core processors. Furthermore, we extend our analysis to consider within-die core-to-core frequency and leakage variations. When only a subset of cores in a many-core processor is to be chosen to achieve a demanded throughput, moderately fast and leaky cores always provide optimal power consumption. In addition, frequency-island clocking, which allows an independent frequency for each core, leads to 7% less power consumption than global clocking, and it prefers the fastest core (among the chosen ones) to process the fully sequential portion of the workload.
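The trade-off described in the abstract (more cores at a lower supply voltage, subject to a voltage floor) can be illustrated with a toy model. The sketch below is not the paper's model: it assumes an alpha-power-law frequency model, Amdahl's law for speedup, and dynamic power proportional to N·V²·f with all cores active and leakage ignored. The threshold voltage `v_th`, the exponent `alpha`, the nominal voltage `v_nom`, and all function names are hypothetical choices made for illustration.

```python
def max_frequency(v, v_th=0.3, alpha=1.3):
    """Relative maximum clock frequency at supply voltage v.

    Assumed alpha-power-law model; v_th and alpha are illustrative values,
    not parameters taken from the paper.
    """
    return (v - v_th) ** alpha / v


def amdahl_speedup(n_cores, parallel_fraction):
    """Amdahl's-law speedup over one core running at the same frequency."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)


def power_ratio_when_doubling(n_cores, parallel_fraction,
                              v_nom=1.0, v_min=0.7, step=0.001):
    """Dynamic power of 2N cores relative to N cores at equal throughput.

    The baseline runs N cores at v_nom and full frequency. The doubled
    design runs 2N cores at the lowest voltage in [v_min, v_nom] whose
    maximum frequency still meets the baseline throughput.
    Returns None if no voltage in that range is sufficient.
    """
    f_nom = max_frequency(v_nom)
    # Per-core frequency the 2N-core design needs to match the baseline.
    f_req = (f_nom * amdahl_speedup(n_cores, parallel_fraction)
             / amdahl_speedup(2 * n_cores, parallel_fraction))

    steps = int(round((v_nom - v_min) / step))
    candidates = [v_min + i * step for i in range(steps)] + [v_nom]
    for v in candidates:
        if max_frequency(v) >= f_req:
            # Dynamic power ~ cores * V^2 * f (leakage ignored, an assumption).
            baseline = n_cores * v_nom ** 2 * f_nom
            doubled = 2 * n_cores * v ** 2 * f_req
            return doubled / baseline
    return None


if __name__ == "__main__":
    for p in (1.0, 0.99, 0.9, 0.5):
        print(p, power_ratio_when_doubling(64, p))
```

Under these assumed parameters, a fully parallel workload hits the 0.7V floor and the ratio works out to 2 × 0.7² × 0.5 = 0.49, while a workload that is only 50% parallel ends up with a ratio above 1, meaning doubling the cores costs power. These numbers only show the qualitative trend; they are not expected to match the paper's 65% figure, which comes from its own circuit and variation models.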

Index Terms

Optimizing total power of many-core processors considering voltage scaling limit and process variations
1. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces

Published In

ISLPED '09: Proceedings of the 2009 ACM/IEEE international symposium on Low power electronics and design

August 2009

452 pages

ISBN:9781605586847

DOI:10.1145/1594233

General Chairs: Jörg Henkel (University of Karlsruhe), Ali Keshavarzi (Taiwan Semiconductor Manufacturing Company)

Program Chairs: Naehyuck Chang (Seoul National University), Tahir Ghani (Intel Corporation)

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISLPED'09: International Symposium on Low Power Electronics and Design

August 19 - 21, 2009

San Francisco, CA, USA

Acceptance Rates

ISLPED '09 Paper Acceptance Rate 72 of 208 submissions, 35%

Overall Acceptance Rate 398 of 1,159 submissions, 34%

Bibliometrics

Total Citations: 24
Total Downloads: 853
Downloads (Last 12 months): 4
Downloads (Last 6 weeks): 2

Reflects downloads up to 02 Oct 2024

Cited By

Bahrami F, Ranjbar B, Rohbani N, Ejlali A (2021). PVMC: Task Mapping and Scheduling under Process Variation Heterogeneity in Mixed-Criticality Systems. IEEE Transactions on Emerging Topics in Computing, pp. 1-1. Online publication date: 2021.
https://doi.org/10.1109/TETC.2021.3072286
Majzoub S, Saleh R, Ashraf I, Taouil M, Hamdioui S (2019). Energy Optimization for Large-Scale 3D Manycores in the Dark-Silicon Era. IEEE Access, 7, pp. 33115-33129. Online publication date: 2019.
https://doi.org/10.1109/ACCESS.2019.2900477
Kapadia N, Pasricha S (2017). A Runtime Framework for Robust Application Scheduling With Adaptive Parallelism in the Dark-Silicon Era. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25:2, pp. 534-546. Online publication date: 1-Feb-2017.
https://dl.acm.org/doi/10.1109/TVLSI.2016.2594238
Raparti V, Kapadia N, Pasricha S (2017). ARTEMIS: An Aging-Aware Runtime Application Mapping Framework for 3D NoC-Based Chip Multiprocessors. IEEE Transactions on Multi-Scale Computing Systems, 3:2, pp. 72-85. Online publication date: 1-Apr-2017.
https://doi.org/10.1109/TMSCS.2017.2686856
Kapadia N, Pasricha S (2017). Robust Application Scheduling with Adaptive Parallelism in Dark-Silicon Constrained Multicore Systems. The Dark Side of Silicon, pp. 217-236. Online publication date: 1-Jan-2017.
https://doi.org/10.1007/978-3-319-31596-6_8
Mittal S (2016). A Survey of Architectural Techniques for Managing Process Variation. ACM Computing Surveys, 48:4, pp. 1-29. Online publication date: 9-Feb-2016.
https://dl.acm.org/doi/10.1145/2871167
Raparti V, Kapadia N, Pasricha S (2016). CHARM: A checkpoint-based resource management framework for reliable multicore computing in the dark silicon era. 2016 IEEE 34th International Conference on Computer Design (ICCD), pp. 201-208. Online publication date: Oct-2016.
https://doi.org/10.1109/ICCD.2016.7753281
Kapadia N, Pasricha S, Nebel W, Atienza D (2015). VARSHA. Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, pp. 1060-1065. Online publication date: 9-Mar-2015.
https://dl.acm.org/doi/10.5555/2755753.2757059
Pan G, Yang J, Jou J, Lai B (2015). Scalable Global Power Management Policy Based on Combinatorial Optimization for Multiprocessors. ACM Transactions on Embedded Computing Systems, 14:4, pp. 1-24. Online publication date: 8-Dec-2015.
https://dl.acm.org/doi/10.1145/2811404
Jevtic R, Hanh-Phuc Le, Blagojevic M, Bailey S, Asanovic K, Alon E, Nikolic B (2015). Per-Core DVFS With Switched-Capacitor Converters for Energy Efficiency in Manycore Processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:4, pp. 723-730. Online publication date: Apr-2015.
https://doi.org/10.1109/TVLSI.2014.2316919
