DOI: 10.1145/1086297.1086333

Software-directed power-aware interconnection networks

Published: 24 September 2005

Abstract

Interconnection networks have been deployed as the communication fabric in a wide range of parallel computer systems. With recent technological trends allowing growing quantities of chip resources and faster clock rates, there is prevailing concern that increasing power consumption is a major limiting factor in the design of parallel computer systems, from multiprocessor SoCs to multi-chip embedded systems and parallel servers. To tackle this, power-aware networks must become inherent components of single-chip and multi-chip systems.

On the hardware design side, while there has been some recent research on reducing interconnection network power, especially targeting communication links, the techniques presented are ad hoc and are not tailored to the application running on the network. We show that with these ad hoc techniques, power savings and the corresponding impact on network latency vary significantly from one application to the next; in many cases network performance can suffer severely. On the software side, extensive research on compile-time optimization has produced parallelizing compilers that can efficiently map an application onto hardware for high performance. However, research into power-aware parallelizing compilers is in its infancy, and none has addressed communication power.

In this paper, we take the first steps towards tailoring applications' communication needs at run-time for low power. We propose software techniques that extend the flow of a parallelizing compiler in order to direct run-time network power optimization. We target network links, the dominant power consumer in these systems, allowing dynamic voltage scaling (DVS) instructions extracted during static compilation to orchestrate link voltage and frequency transitions for power savings during application runtime. Concurrently, a hardware online mechanism measures network congestion levels and adapts these off-line DVS settings to optimize network performance. Our simulations show that link power consumption can be reduced by up to 76.3%, with a minor increase in network latency ranging from 0.23% to 6.78%, across a number of benchmark suites running on three existing parallel architectures, from very fine-grained single-chip to coarse-grained multi-chip architectures.
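The interplay the abstract describes, a compile-time DVS schedule that a run-time congestion monitor may override, can be illustrated with a minimal sketch. Everything below is a hypothetical illustration rather than the paper's method: the voltage/frequency levels, the congestion threshold, the one-step-up adjustment policy, and the function names are all assumptions. The only physical fact relied on is that CMOS dynamic power scales roughly with V² · f.

```python
# Illustrative sketch only: levels, threshold and policy are assumptions,
# not values or algorithms taken from the paper.

# Hypothetical link operating points as (voltage in volts, frequency in GHz),
# ordered from slowest/lowest-power to fastest/highest-power.
LEVELS = [(1.0, 0.25), (1.5, 0.5), (2.0, 0.75), (2.5, 1.0)]


def relative_dynamic_power(level):
    """Dynamic power of a level relative to the top level, using P ~ V^2 * f."""
    v, f = LEVELS[level]
    v_max, f_max = LEVELS[-1]
    return (v * v * f) / (v_max * v_max * f_max)


def adapt_level(static_level, congestion, threshold=0.5):
    """Assumed online policy: step one level up from the compiler-chosen
    setting whenever measured congestion exceeds the threshold."""
    if congestion > threshold:
        return min(static_level + 1, len(LEVELS) - 1)
    return static_level


def run_schedule(static_schedule, congestion_trace):
    """Apply the online adjustment to each program phase.

    Returns the levels actually used and their mean relative power."""
    chosen = [adapt_level(lvl, c) for lvl, c in zip(static_schedule, congestion_trace)]
    mean_power = sum(relative_dynamic_power(lvl) for lvl in chosen) / len(chosen)
    return chosen, mean_power


# Hypothetical per-phase levels emitted at compile time, and a congestion
# reading (0.0 to 1.0) observed at run time for each phase.
static_schedule = [0, 1, 3, 0]
congestion_trace = [0.1, 0.8, 0.9, 0.2]
chosen, mean_power = run_schedule(static_schedule, congestion_trace)
```

In this toy run the second phase is bumped from level 1 to level 2 because its congestion reading exceeds the threshold, while the third phase is already at the top level and stays there. A real mechanism would also have to account for the latency and energy cost of each voltage/frequency transition, which this sketch ignores.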

        Published In

        CASES '05: Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
        September 2005
        326 pages
        ISBN: 159593149X
        DOI: 10.1145/1086297
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. communication links
        2. dynamic voltage
        3. interconnection networks
        4. networks on-a-chip (NoC)
        5. scaling
        6. simulation
        7. software-directed power reduction

        Qualifiers

        • Article

        Conference

        CASES05

        Cited By

        • (2014) A survey on techniques for improving the energy efficiency of large-scale distributed systems. ACM Computing Surveys, 46:4 (1-31). DOI: 10.1145/2532637. Online publication date: 1-Mar-2014
        • (2008) Application mapping for chip multiprocessors. Proceedings of the 45th annual Design Automation Conference (620-625). DOI: 10.1145/1391469.1391628. Online publication date: 8-Jun-2008
        • (2008) Software-directed combined CPU/link voltage scaling for NoC-based CMPs. ACM SIGMETRICS Performance Evaluation Review, 36:1 (359-370). DOI: 10.1145/1384529.1375498. Online publication date: 2-Jun-2008
        • (2008) Software-directed combined CPU/link voltage scaling for NoC-based CMPs. Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems (359-370). DOI: 10.1145/1375457.1375498. Online publication date: 2-Jun-2008
        • (2008) Ring data location prediction scheme for Non-Uniform Cache Architectures. 2008 IEEE International Conference on Computer Design (693-698). DOI: 10.1109/ICCD.2008.4751936. Online publication date: Oct-2008
        • (2008) Communication Based Proactive Link Power Management. Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers (198-215). DOI: 10.1007/978-3-540-92990-1_16. Online publication date: 24-Dec-2008
        • (2007) Software-directed power-aware interconnection networks. ACM Transactions on Architecture and Code Optimization, 4:1 (5-es). DOI: 10.1145/1216544.1216548. Online publication date: 1-Mar-2007
        • (2007) Enhancing Locality in Two-Dimensional Space through Integrated Computation and Data Mappings. Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference: Embedded Systems (227-232). DOI: 10.1109/VLSID.2007.77. Online publication date: 6-Jan-2007
        • (2007) Compiler-directed power optimization of high-performance interconnection networks for load-balancing MPI applications. Frontiers of Computer Science in China, 1:1 (94-105). DOI: 10.1007/s11704-007-0008-1. Online publication date: Feb-2007
        • (2006) Dynamic power saving in fat-tree interconnection networks using on/off links. Proceedings of the 20th international conference on Parallel and distributed processing (299-299). DOI: 10.5555/1898699.1898826. Online publication date: 25-Apr-2006
