DOI: 10.1145/1594233.1594283
research-article

Optimizing total power of many-core processors considering voltage scaling limit and process variations

Published: 19 August 2009

Abstract

Recently, processor manufacturers have integrated more than a hundred cores on a single die to deliver extremely high throughput for highly parallel, data-intensive applications such as physics simulation and 3D graphics. For such applications, excessive power consumption, rather than silicon area, will limit the performance of many-core processors. In this paper, to optimize the total power of many-core processors, we analyze the impact of 1) the number of cores, 2) the parallelism in applications, and 3) the supply voltage scaling limit imposed by on-die memory failure at low supply voltage. Our analysis shows that doubling the number of cores while operating below the nominal supply voltage offers the most cost-effective power reduction, resulting in up to 65% less power consumption for highly parallel applications even when supply voltage scaling is limited to 0.7V. The reduced power, in turn, can be used to improve throughput at a higher voltage in power-constrained many-core processors. Furthermore, we extend our analysis to consider within-die core-to-core frequency and leakage variations. When only a subset of the cores in a many-core processor is to be chosen to achieve a demanded throughput, moderately fast and leaky cores always provide optimal power consumption. In addition, frequency-island clocking, which allows an independent frequency for each core, leads to 7% less power consumption than global clocking, and it prefers the fastest core (among the chosen ones) to process the fully sequential portion of the workload.

    Published In

    ISLPED '09: Proceedings of the 2009 ACM/IEEE international symposium on Low power electronics and design
    August 2009
    452 pages
    ISBN:9781605586847
    DOI:10.1145/1594233
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. many-core processor
    2. parallel applications
    3. process variations
    4. voltage and frequency scaling

    Qualifiers

    • Research-article

    Conference

    ISLPED'09

    Acceptance Rates

    ISLPED '09 Paper Acceptance Rate 72 of 208 submissions, 35%;
    Overall Acceptance Rate 398 of 1,159 submissions, 34%

    Cited By

    • (2021) PVMC: Task Mapping and Scheduling under Process Variation Heterogeneity in Mixed-Criticality Systems. IEEE Transactions on Emerging Topics in Computing, pp. 1-1. DOI: 10.1109/TETC.2021.3072286. Online publication date: 2021
    • (2019) Energy Optimization for Large-Scale 3D Manycores in the Dark-Silicon Era. IEEE Access, vol. 7, pp. 33115-33129. DOI: 10.1109/ACCESS.2019.2900477. Online publication date: 2019
    • (2017) A Runtime Framework for Robust Application Scheduling With Adaptive Parallelism in the Dark-Silicon Era. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25:2, pp. 534-546. DOI: 10.1109/TVLSI.2016.2594238. Online publication date: 1-Feb-2017
    • (2017) ARTEMIS: An Aging-Aware Runtime Application Mapping Framework for 3D NoC-Based Chip Multiprocessors. IEEE Transactions on Multi-Scale Computing Systems, 3:2, pp. 72-85. DOI: 10.1109/TMSCS.2017.2686856. Online publication date: 1-Apr-2017
    • (2017) Robust Application Scheduling with Adaptive Parallelism in Dark-Silicon Constrained Multicore Systems. The Dark Side of Silicon, pp. 217-236. DOI: 10.1007/978-3-319-31596-6_8. Online publication date: 1-Jan-2017
    • (2016) A Survey of Architectural Techniques for Managing Process Variation. ACM Computing Surveys, 48:4, pp. 1-29. DOI: 10.1145/2871167. Online publication date: 9-Feb-2016
    • (2016) CHARM: A checkpoint-based resource management framework for reliable multicore computing in the dark silicon era. 2016 IEEE 34th International Conference on Computer Design (ICCD), pp. 201-208. DOI: 10.1109/ICCD.2016.7753281. Online publication date: Oct-2016
    • (2015) VARSHA. Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, pp. 1060-1065. DOI: 10.5555/2755753.2757059. Online publication date: 9-Mar-2015
    • (2015) Scalable Global Power Management Policy Based on Combinatorial Optimization for Multiprocessors. ACM Transactions on Embedded Computing Systems, 14:4, pp. 1-24. DOI: 10.1145/2811404. Online publication date: 8-Dec-2015
    • (2015) Per-Core DVFS With Switched-Capacitor Converters for Energy Efficiency in Manycore Processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:4, pp. 723-730. DOI: 10.1109/TVLSI.2014.2316919. Online publication date: Apr-2015
