Accelerating UNISIM-Based Cycle-Level Microarchitectural Simulations on Multicore Platforms

Published: 01 June 2011

Abstract

UNISIM has been shown to ease the development of simulators for multi-/many-core systems. However, UNISIM cycle-level simulations of large-scale multiprocessor systems can be very time-consuming. In this article, we propose a systematic framework for accelerating UNISIM cycle-level simulations on multicore platforms. The framework exploits the fine-grained parallelism within the simulated cycles using POSIX threads. A multithreaded simulation engine has been derived from the single-threaded UNISIM SystemC engine to exploit this inherent parallelism. We propose an adaptive technique that manages the overall computation workload by adjusting the number of threads employed at any given time. In addition, we introduce a technique to balance the workloads of multithreaded executions. This load balancing involves distributing SystemC objects among threads, and a graph-partitioning-based technique automates the distribution. Finally, two strategies are proposed for realizing nonautomated and fully automated adaptive multithreaded simulations, respectively. Our investigations show that notable acceleration can be achieved by deploying the proposed framework. In particular, simulations on an 8-core platform achieve speedups of close to 6X when simulating many-core systems with a large number of cores.
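To make the load-balancing idea concrete, the following is a minimal sketch of distributing simulated modules among threads. It is not the article's method: it uses a simple greedy heuristic (heaviest module first, onto the least-loaded thread) rather than graph partitioning. The module names, per-cycle costs, and the helper functions `greedy_balance`, `thread_loads`, and `edge_cut` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def greedy_balance(costs: Dict[str, float], num_threads: int) -> List[List[str]]:
    """Assign modules to threads, heaviest first, onto the least-loaded thread."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    bins: List[List[str]] = [[] for _ in range(num_threads)]
    loads = [0.0] * num_threads
    # Descending cost; ties are broken by name so the result is deterministic.
    for name in sorted(costs, key=lambda n: (-costs[n], n)):
        target = min(range(num_threads), key=lambda t: loads[t])
        bins[target].append(name)
        loads[target] += costs[name]
    return bins


def thread_loads(costs: Dict[str, float], bins: List[List[str]]) -> List[float]:
    """Total estimated per-cycle cost carried by each thread."""
    return [sum(costs[m] for m in b) for b in bins]


def edge_cut(edges: List[Tuple[str, str]], bins: List[List[str]]) -> int:
    """Number of module-to-module connections that cross a thread boundary."""
    owner = {m: t for t, b in enumerate(bins) for m in b}
    return sum(1 for a, b in edges if owner[a] != owner[b])


# Hypothetical 4-tile system: each core talks to its router, routers form a ring.
costs = {f"core{i}": 4.0 for i in range(4)}
costs.update({f"router{i}": 1.0 for i in range(4)})
edges = [(f"core{i}", f"router{i}") for i in range(4)]
edges += [(f"router{i}", f"router{(i + 1) % 4}") for i in range(4)]

bins = greedy_balance(costs, num_threads=2)
print(thread_loads(costs, bins), edge_cut(edges, bins))
```

On this toy input the greedy assignment equalizes the per-thread cost but places neighboring routers on different threads, so every ring link crosses a thread boundary. A graph-partitioning approach, as named in the abstract, would weigh such connections alongside the load when choosing a distribution.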


Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 16, Issue 3
June 2011
330 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/1970353
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2011
Accepted: 01 February 2011
Revised: 01 October 2010
Received: 01 March 2010
Published in TODAES Volume 16, Issue 3

Author Tags

  1. Simulation
  2. UNISIM
  3. cycle-level microarchitectural simulation
  4. multi-/many-core systems
  5. parallelization
  6. parallelization of SystemC engine

Qualifiers

  • Research-article
  • Research
  • Refereed
