research-article

Analytical Performance Models for NoCs with Multiple Priority Traffic Classes

Published: 07 October 2019

Abstract

Networks-on-chip (NoCs) have become the standard for interconnect solutions in industrial designs ranging from client CPUs to many-core chip-multiprocessors. Since NoCs play a vital role in system performance and power consumption, pre-silicon evaluation environments include cycle-accurate NoC simulators. Long simulations increase the execution time of evaluation frameworks, which are already notoriously slow, and prohibit design-space exploration. Existing analytical NoC models, which assume fair arbitration, cannot replace these simulations since industrial NoCs typically employ priority schedulers and multiple priority classes. To address this limitation, we propose a systematic approach to construct priority-aware analytical performance models using micro-architecture specifications and input traffic. Our approach decomposes the given NoC into individual queues with modified service time to enable accurate and scalable latency computations. Specifically, we introduce novel transformations along with an algorithm that iteratively applies these transformations to decompose the queuing system. Experimental evaluations using real architectures and applications show high accuracy of 97% and up to 2.5× speedup in full-system simulation.
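As background for the priority-scheduling idea mentioned in the abstract, the sketch below computes mean waiting times for a single queue with non-preemptive priority classes, using the classic textbook formula for an M/G/1 priority queue (Cobham's formula). This is not the decomposition algorithm or the transformations proposed in the paper, which are not described on this page. The names `TrafficClass` and `priority_waiting_times` are hypothetical, and Poisson arrivals in continuous time are an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrafficClass:
    """One priority class at a single queue (hypothetical helper type)."""
    arrival_rate: float   # mean arrivals per cycle (lambda)
    mean_service: float   # E[S], mean service time in cycles
    second_moment: float  # E[S^2], second moment of the service time


def priority_waiting_times(classes: List[TrafficClass]) -> List[float]:
    """Mean queueing delay per class, highest priority first.

    Textbook non-preemptive M/G/1 priority result:
        W_k = R / ((1 - sigma_{k-1}) * (1 - sigma_k))
    where R = 0.5 * sum_i lambda_i * E[S_i^2] is the mean residual service
    time and sigma_k is the total utilization of classes 1..k.
    """
    residual = 0.5 * sum(c.arrival_rate * c.second_moment for c in classes)
    waits = []
    sigma_prev = 0.0
    for c in classes:
        sigma = sigma_prev + c.arrival_rate * c.mean_service
        if sigma >= 1.0:
            raise ValueError("utilization >= 1: this class has no finite mean wait")
        waits.append(residual / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


if __name__ == "__main__":
    # Two classes with deterministic 2-cycle service (so E[S^2] = 4).
    example = [TrafficClass(0.1, 2.0, 4.0), TrafficClass(0.1, 2.0, 4.0)]
    for rank, wait in enumerate(priority_waiting_times(example), start=1):
        print(f"class {rank}: mean wait = {wait:.4f} cycles")
```

With a single class the formula reduces to the Pollaczek-Khinchine mean waiting time. A network-level model would also have to account for how traffic leaving one queue affects the queues downstream, which this single-queue sketch ignores.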

Information

Published In

ACM Transactions on Embedded Computing Systems, Volume 18, Issue 5s
Special Issue ESWEEK 2019, CASES 2019, CODES+ISSS 2019 and EMSOFT 2019
October 2019
1423 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3365919
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 October 2019
Accepted: 01 July 2019
Revised: 01 June 2019
Received: 01 April 2019
Published in TECS Volume 18, Issue 5s

Author Tags

  1. NoC performance analysis
  2. priority-based NoC
  3. queuing networks

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (Last 12 months): 116
  • Downloads (Last 6 weeks): 23
Reflects downloads up to 24 Sep 2024

Cited By

  • (2024) Subnetwork Based Traffic Aware Rerouting for CMesh Bufferless Network-on-Chip. Journal of Circuits, Systems and Computers 33:12. DOI: 10.1142/S0218126624502074. Online publication date: 16-Feb-2024
  • (2023) FARSI: An Early-stage Design Space Exploration Framework to Tame the Domain-specific System-on-chip Complexity. ACM Transactions on Embedded Computing Systems 22:2 (1-35). DOI: 10.1145/3544016. Online publication date: 24-Jan-2023
  • (2023) Fast Performance Analysis for NoCs With Weighted Round-Robin Arbitration and Finite Buffers. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 31:5 (670-683). DOI: 10.1109/TVLSI.2023.3250662. Online publication date: 16-Mar-2023
  • (2023) MNSIM 2.0: A Behavior-Level Modeling Tool for Processing-In-Memory Architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:11 (4112-4125). DOI: 10.1109/TCAD.2023.3251696. Online publication date: 1-Nov-2023
  • (2023) Traffic Characterization Based Stochastic Modelling of Network-on-Chip. IEEE Transactions on Computers 72:4 (1215-1222). DOI: 10.1109/TC.2022.3191965. Online publication date: 1-Apr-2023
  • (2023) Fast Analysis Using Finite Queuing Model for Multilayer NoCs. IEEE Design & Test 40:6 (112-124). DOI: 10.1109/MDAT.2023.3310167. Online publication date: Dec-2023
  • (2023) Adaptive distribution of control messages for improving bandwidth utilization in multiple NoC. The Journal of Supercomputing 79:15 (17208-17246). DOI: 10.1007/s11227-023-05208-0. Online publication date: 7-May-2023
  • (2023) In-Memory Computing for AI Accelerators: Challenges and Solutions. Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing (199-224). DOI: 10.1007/978-3-031-19568-6_7. Online publication date: 1-Oct-2023
  • (2021) MGait: Model-Based Gait Analysis Using Wearable Bend and Inertial Sensors. ACM Transactions on Internet of Things 3:1 (1-24). DOI: 10.1145/3485434. Online publication date: 27-Oct-2021
  • (2021) Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks. ACM Journal on Emerging Technologies in Computing Systems 18:2 (1-22). DOI: 10.1145/3460233. Online publication date: 31-Dec-2021