DOI: 10.1109/MICRO.2010.21
Article

LOFT: A High Performance Network-on-Chip Providing Quality-of-Service Support

Published: 04 December 2010

Abstract

Providing quality-of-service (QoS) for concurrent tasks in many-core architectures is becoming important, especially for real-time applications. QoS support for on-chip shared resources (such as shared caches, buses, and memory controllers) in chip-multiprocessors has been investigated in recent years. Unlike other shared resources, a network-on-chip (NoC) typically has no central arbiter for accesses to the shared resource. Instead, each router shares the responsibility for resource allocation. While this distributed nature benefits the scalable performance of the NoC, it also dramatically complicates the problem of providing QoS support for individual flows. Existing approaches to this problem suffer from various shortcomings, such as low network utilization and weak QoS guarantees. In this work, we propose LOFT, a NoC architecture that features both high network utilization and strong QoS guarantees. LOFT combines two mechanisms: a) locally-synchronized frames (LSF), a distributed frame-based scheduling mechanism that provides flexible QoS guarantees to different flows, and b) flit-reservation (FRS), a flow-control mechanism integrated into LSF that improves network utilization. The experimental results show that LOFT delivers flexible and reliable QoS guarantees while utilizing the available network capacity well enough to achieve high overall throughput.
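
To make the idea of frame-based scheduling concrete, here is a minimal sketch of a generic per-frame quota scheduler for a single shared link. It is not LOFT's LSF or FRS mechanism, whose details the abstract does not give. The function name `frame_schedule`, its parameters, and the round-robin service order within a frame are all assumptions made for illustration. Each flow may send at most its reserved number of flits per frame, and quotas are replenished at every frame boundary.

```python
from collections import deque


def frame_schedule(queues, quotas, frame_size, num_frames):
    """Serve one shared link frame by frame (illustrative only, not LOFT's LSF).

    queues:     dict mapping flow id -> deque of pending flits
    quotas:     dict mapping flow id -> max flits the flow may send per frame
    frame_size: number of flit slots in one frame
    Returns a list of frames; each frame lists the (flow, flit) pairs sent.
    """
    if sum(quotas.values()) > frame_size:
        raise ValueError("reserved quotas exceed frame capacity")

    schedule = []
    for _ in range(num_frames):
        budget = dict(quotas)  # quotas are replenished at each frame boundary
        sent = []
        progress = True
        while len(sent) < frame_size and progress:
            progress = False
            for flow in queues:  # round-robin over flows within the frame
                if len(sent) == frame_size:
                    break
                if budget[flow] > 0 and queues[flow]:
                    sent.append((flow, queues[flow].popleft()))
                    budget[flow] -= 1
                    progress = True
        schedule.append(sent)
    return schedule


if __name__ == "__main__":
    queues = {"A": deque(range(6)), "B": deque(range(2))}
    for i, frame in enumerate(frame_schedule(queues, {"A": 3, "B": 1}, 4, 2)):
        print(f"frame {i}: {frame}")
```

In this naive sketch, a flow with a deep backlog cannot exceed its quota, which is what gives every other flow a guaranteed share. The cost is that slots reserved for an idle flow go unused until the next frame.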

Published In

MICRO '43: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
December 2010
542 pages
ISBN: 9780769542997

Publisher

IEEE Computer Society, United States

Acceptance Rates

Overall Acceptance Rate: 484 of 2,242 submissions, 22%
