DOI: 10.1145/1375457.1375498
research-article

Software-directed combined CPU/link voltage scaling for NoC-based CMPs

Published: 02 June 2008

Abstract

Network-on-Chip (NoC) based chip multiprocessors (CMPs) are expected to become more widespread in the future, in both high-performance scientific computing and low-end embedded computing. For many execution environments that employ these systems, reducing power consumption is an important goal. This paper presents a software approach for reducing power consumption in such systems through compiler-directed voltage/frequency scaling. The distinguishing characteristic of this approach is that it scales the voltages and frequencies of selected CPUs and communication links in a coordinated manner to maximize energy savings without degrading performance. Our approach has three components. The first is the identification of phases in the application. The second is the determination of the critical execution paths and slacks in each phase. To implement these two components, our approach employs a novel parallel program representation. The third is the assignment of voltages and frequencies to CPUs and communication links to maximize energy savings. We solve this voltage/frequency assignment problem using integer linear programming (ILP). To test our approach, we implemented it within a compilation framework and conducted experiments with applications from the SPEComp suite and SPECjbb. Our results show that the proposed combined CPU/link scaling is much more effective than scaling the voltages of CPUs or communication links in isolation. In addition, we observed that the energy savings obtained are consistent across a wide range of values of our major simulation parameters, such as the number of CPUs, the number of voltage/frequency levels, and the thread-to-CPU mapping.
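The slack and assignment steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the four-node phase `PHASE`, the three normalized voltage/frequency `LEVELS`, the energy model (cycles times voltage squared), and the function names. The paper formulates the assignment as an ILP; this toy replaces the ILP solver with an exhaustive search over the same kind of decision variables (one level per CPU or link), which is only practical for very small phases.

```python
from itertools import product

# Hypothetical (frequency, voltage) levels, normalized to the fastest setting.
LEVELS = [(1.0, 1.0), (0.75, 0.8), (0.5, 0.6)]

# Hypothetical phase, listed in topological order:
# name -> (kind, cycles at full speed, predecessors)
PHASE = {
    "cpu0": ("cpu", 100, []),
    "cpu1": ("cpu", 40, []),
    "link1to2": ("link", 20, ["cpu1"]),
    "cpu2": ("cpu", 30, ["cpu0", "link1to2"]),
}


def finish_times(choice):
    """Earliest finish time of every node for a {name: level index} choice."""
    finish = {}
    for name, (_, cycles, preds) in PHASE.items():
        start = max((finish[p] for p in preds), default=0.0)
        finish[name] = start + cycles / LEVELS[choice[name]][0]
    return finish


def energy(choice):
    """Assumed dynamic energy model: cycles * V^2 summed over all nodes."""
    return sum(cycles * LEVELS[choice[n]][1] ** 2
               for n, (_, cycles, _) in PHASE.items())


def slacks():
    """Slack of each node at full speed: latest finish minus earliest finish."""
    finish = finish_times({n: 0 for n in PHASE})
    deadline = max(finish.values())
    latest_finish = {n: deadline for n in PHASE}
    # Backward pass in reverse topological order.
    for name in reversed(list(PHASE)):
        _, cycles, preds = PHASE[name]
        latest_start = latest_finish[name] - cycles
        for p in preds:
            latest_finish[p] = min(latest_finish[p], latest_start)
    return {n: latest_finish[n] - finish[n] for n in PHASE}


def best_assignment(scalable_kinds):
    """Exhaustively pick levels for nodes of the given kinds.

    Nodes of other kinds stay at full speed. A choice is feasible only if
    the phase finishes no later than it does with every node at full speed.
    """
    deadline = max(finish_times({n: 0 for n in PHASE}).values())
    names = [n for n, (kind, _, _) in PHASE.items() if kind in scalable_kinds]
    best = None
    for levels in product(range(len(LEVELS)), repeat=len(names)):
        choice = {n: 0 for n in PHASE}
        choice.update(zip(names, levels))
        if max(finish_times(choice).values()) > deadline + 1e-9:
            continue  # this choice would lengthen the phase
        if best is None or energy(choice) < energy(best):
            best = choice
    return best


if __name__ == "__main__":
    print("slack:", slacks())
    for kinds in ({"cpu", "link"}, {"cpu"}, {"link"}):
        choice = best_assignment(kinds)
        print(sorted(kinds), choice, round(energy(choice), 2))
```

In this toy phase, `cpu0` and `cpu2` lie on the critical path and have zero slack, so only `cpu1` and `link1to2` can be slowed. Scaling both together reaches a lower energy under the assumed model than scaling only CPUs or only links, which mirrors the qualitative claim of the abstract; the numbers themselves are properties of the made-up example, not results from the paper.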

Cited By

  • (2015) Phase Detection with Hidden Markov Models for DVFS on Many-Core Processors. 2015 IEEE 35th International Conference on Distributed Computing Systems, pp. 185-195. DOI: 10.1109/ICDCS.2015.27. Online publication date: Jun-2015
  • (2011) Communication-aware VFI partitioning for GALS-based networks-on-chip. Design Automation for Embedded Systems, 15:2, pp. 89-109. DOI: 10.1007/s10617-011-9070-x. Online publication date: 1-Jun-2011
  • (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. ACM SIGPLAN Notices, 45:4, pp. 85-94. DOI: 10.1145/1755951.1755902. Online publication date: 13-Apr-2010
  • (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems, pp. 85-94. DOI: 10.1145/1755888.1755902. Online publication date: 13-Apr-2010

Published In

SIGMETRICS '08: Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2008
486 pages
ISBN:9781605580050
DOI:10.1145/1375457
  • ACM SIGMETRICS Performance Evaluation Review, Volume 36, Issue 1
    SIGMETRICS '08
    June 2008
    469 pages
    ISSN:0163-5999
    DOI:10.1145/1384529
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CMP
  2. NoC
  3. communication link
  4. compiler
  5. cpu
  6. voltage scaling

Qualifiers

  • Research-article

Conference

SIGMETRICS08

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%
