Abstract
The market is moving toward multiple cores on the same chip (chip multiprocessors, CMPs). CMPs with 8 cores are already available, and future CMPs will have more. Typical implementations use a multi-sliced L2 cache in which each slice is shared by 2 cores. We investigate the interaction between the degree of L2 sharing and the L2 cache size. To do so, we model two CMP organizations: (i) tiles with private L1 and private L2 caches, interconnected through a router; and (ii) tiles with private L1 caches and an L2 shared among processors. Organization (ii) is evaluated with different numbers of cores (2 and 4) sharing the same L2 slice and with different shared-slice sizes (1 MB, 2 MB, and 4 MB). With a total of 32 cores, the proposed configurations of organization (ii) are evaluated by full-system simulation running SPLASH-2 benchmarks. Applying both techniques, the results show that execution time improves by about 18.9% for Ocean, 88.8% for Raytrace, and 31.8% for Volrend.
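The design space of organization (ii) can be enumerated with a short sketch. It assumes that every combination of sharing degree and slice size is evaluated and that the 32 cores divide evenly among the slices; the abstract does not state either explicitly. The function and field names are illustrative, and the slice count and aggregate L2 capacity are derived arithmetic rather than figures reported in the paper.

```python
from itertools import product

# Parameters taken from the abstract.
TOTAL_CORES = 32
CORES_PER_SLICE = (2, 4)      # cores sharing one L2 slice
SLICE_SIZES_MB = (1, 2, 4)    # size of each shared L2 slice


def shared_l2_configs(total_cores=TOTAL_CORES):
    """Enumerate shared-L2 configurations (assumes a full cross product)."""
    configs = []
    for sharers, slice_mb in product(CORES_PER_SLICE, SLICE_SIZES_MB):
        # Assumption: cores are split evenly, one slice per group of sharers.
        num_slices = total_cores // sharers
        configs.append({
            "cores_per_slice": sharers,
            "slice_mb": slice_mb,
            "num_slices": num_slices,
            # Derived value, not a number reported in the abstract.
            "total_l2_mb": num_slices * slice_mb,
        })
    return configs


if __name__ == "__main__":
    for cfg in shared_l2_configs():
        print(cfg)
```

Under these assumptions, doubling the sharing degree at a fixed slice size halves both the number of slices and the aggregate L2 capacity, so the sharing degree and slice size need to be considered together.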
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marino, M.D. (2006). L2-Cache Hierarchical Organizations for Multi-core Architectures. In: Min, G., Di Martino, B., Yang, L.T., Guo, M., Rünger, G. (eds) Frontiers of High Performance Computing and Networking – ISPA 2006 Workshops. ISPA 2006. Lecture Notes in Computer Science, vol 4331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11942634_9
DOI: https://doi.org/10.1007/11942634_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49860-5
Online ISBN: 978-3-540-49862-9
eBook Packages: Computer Science (R0)