tutorial

Performance Evaluation of NoC-Based Multicore Systems: From Traffic Analysis to NoC Latency Modeling

Published: 11 May 2016

Abstract

In this survey, we review several approaches for predicting the performance of Network-on-Chip (NoC)-based multicore systems, from traffic models to complex NoC models for latency evaluation. We first review typical traffic models used to represent application workloads in NoCs. Specifically, we review Markovian and non-Markovian (e.g., self-similar or long-range memory-dependent) traffic models and discuss their applications to multicore platform design. Then, we review the analytical techniques for predicting NoC performance under a given input traffic. We investigate analytical models for both average and maximum delay evaluation. We also review the developments and design challenges of NoC simulators. One interesting research direction in NoC performance evaluation is to combine simulation and analytical models in order to exploit the advantages of both. Toward this end, we discuss several newly proposed approaches that use hardware-based or learning-based techniques. Finally, we summarize several open problems and offer our perspective on addressing these challenges.



      Published In

      ACM Transactions on Design Automation of Electronic Systems, Volume 21, Issue 3
      Special Section on New Physical Design Techniques for the Next Generation Integration Technology and Regular Papers
      July 2016
      434 pages
      ISSN: 1084-4309
      EISSN: 1557-7309
      DOI: 10.1145/2926747
      Editor: Naehyuck Chang
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 May 2016
      Accepted: 01 December 2015
      Revised: 01 October 2015
      Received: 01 June 2015
      Published in TODAES Volume 21, Issue 3

      Author Tags

      1. Performance evaluation
      2. analytical model
      3. average and maximum delay
      4. network-on-chips (NoCs)
      5. simulation
      6. traffic models

      Qualifiers

      • Tutorial
      • Research
      • Refereed

      Funding Sources

      • US National Science Foundation (NSF)
      • Hong Kong Research Grants Council (RGC)

      Cited By

      • (2023) Fast and Accurate NoC Latency Estimation for Application-Specific Traffics via Machine Learning. IEEE Transactions on Circuits and Systems II: Express Briefs 70:9 (3569-3573). DOI: 10.1109/TCSII.2023.3258700. Online publication date: Sep-2023
      • (2023) Fast Analysis Using Finite Queuing Model for Multilayer NoCs. IEEE Design & Test 40:6 (112-124). DOI: 10.1109/MDAT.2023.3310167. Online publication date: Dec-2023
      • (2023) An Artificial Bee Colony Based Mapping Method for Three Dimensional Network-on-Chip. 2023 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC) (1-6). DOI: 10.1109/ETNCC59188.2023.10284966. Online publication date: 16-Aug-2023
      • (2023) Fast NoC Router Latency Estimation Using Machine Learning. 2023 China Semiconductor Technology International Conference (CSTIC) (1-3). DOI: 10.1109/CSTIC58779.2023.10219215. Online publication date: 26-Jun-2023
      • (2023) Non-Gaussian Traffic Modeling for Multicore Architecture Using Wavelet Based Rosenblatt Process. IEEE Access 11 (38523-38533). DOI: 10.1109/ACCESS.2023.3265572. Online publication date: 2023
      • (2021) NoC Performance Model for Efficient Network Latency Estimation. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE) (994-999). DOI: 10.23919/DATE51398.2021.9474101. Online publication date: 1-Feb-2021
      • (2021) ILP formulation and heuristic method for energy-aware application mapping on 3D-NoCs. The Journal of Supercomputing 77:3 (2667-2680). DOI: 10.1007/s11227-020-03365-0. Online publication date: 1-Mar-2021
      • (2020) Analysis of Performance Bottlenecks in SoC Interconnect Subsystems. 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) (1911-1914). DOI: 10.1109/EIConRus49466.2020.9039237. Online publication date: Jan-2020
      • (2020) Performance analysis of network-on-chip in many-core processors. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2020.09.013. Online publication date: Sep-2020
      • (2018) Hybrid Network-on-Chip. Complexity 2018. DOI: 10.1155/2018/1040869. Online publication date: 30-Jul-2018
