L2-Cache Hierarchical Organizations for Multi-core Architectures

Mario Donato Marino

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4331)

Included in the following conference series:

International Symposium on Parallel and Distributed Processing and Applications

Abstract

The market is moving toward multiple cores on the same chip (Chip Multiprocessors, CMPs) with a multi-sliced L2 cache in which each slice is shared by 2 cores. CMPs with 8 cores can already be found, and future CMPs will have more than 8 cores. Typical CMP implementations share the L2 cache among the processors, with 2 cores sharing the same L2. We are interested in investigating how the degree of L2 sharing interacts with L2 cache size. To this end, we construct models of two different CMP organizations: (i) tiles with private L1 and private L2, interconnected through a router; (ii) tiles with private L1 and an L2 shared among processors. Organization (ii) is evaluated with different numbers of cores (2 and 4) sharing the same L2 slice and with different shared-slice sizes (1 MB, 2 MB, and 4 MB). With a total of 32 cores, the proposed configurations of organization (ii) are evaluated in a full-system simulation running SPLASH-2 benchmarks. With both techniques applied, the results show that execution time improves by about 18.9% for Ocean, 88.8% for Raytrace, and 31.8% for Volrend.
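
The design space described in the abstract for organization (ii) can be enumerated with a short sketch. The core count, sharing degrees, and slice sizes come from the abstract. The function name `enumerate_configs`, the full cross product of the two parameters, and the derived quantities (number of slices, aggregate L2 capacity) are illustrative assumptions and are not taken from the paper, which may not have evaluated every combination.

```python
from itertools import product

TOTAL_CORES = 32                # total core count stated in the abstract
CORES_PER_SLICE = (2, 4)        # sharing degrees stated in the abstract
SLICE_SIZES_MB = (1, 2, 4)      # shared-slice sizes stated in the abstract


def enumerate_configs(total_cores=TOTAL_CORES,
                      sharing=CORES_PER_SLICE,
                      sizes_mb=SLICE_SIZES_MB):
    """List every (sharing degree, slice size) pair with derived totals.

    Assumption: all slices in one configuration have the same size, so the
    aggregate L2 capacity is simply n_slices * slice_mb.
    """
    configs = []
    for cores_per_slice, slice_mb in product(sharing, sizes_mb):
        if total_cores % cores_per_slice != 0:
            raise ValueError("cores must divide evenly across slices")
        n_slices = total_cores // cores_per_slice
        configs.append({
            "cores_per_slice": cores_per_slice,
            "slice_mb": slice_mb,
            "n_slices": n_slices,
            "aggregate_l2_mb": n_slices * slice_mb,
        })
    return configs


if __name__ == "__main__":
    for cfg in enumerate_configs():
        print(cfg)
```

Under these assumptions, doubling the sharing degree at a fixed slice size halves the number of slices and therefore the aggregate L2 capacity, which is one reason sharing degree and slice size are worth studying together.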

Author information

Authors and Affiliations

Computing Engineering Department, Polytechnic School, University of Sao Paulo
Mario Donato Marino

Editor information

Editors and Affiliations

Department of Computing, School of Informatics, University of Bradford, BD7 1DP, Bradford, U.K.
Geyong Min
Dipartimento di Ingegneria dell'Informazione, Second University of Naples, Real Casa dell'Annunziata, via Roma 29, 81031 Aversa (CE), Italy
Beniamino Di Martino
Department of Computer Science, St. Francis Xavier University, Antigonish, Canada
Laurence T. Yang
Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200030, Shanghai, China
Minyi Guo
Department of Computer Science, Chemnitz University of Technology, Germany
Gudula Rünger

Cite this paper

Marino, M.D. (2006). L2-Cache Hierarchical Organizations for Multi-core Architectures. In: Min, G., Di Martino, B., Yang, L.T., Guo, M., Rünger, G. (eds) Frontiers of High Performance Computing and Networking – ISPA 2006 Workshops. ISPA 2006. Lecture Notes in Computer Science, vol 4331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11942634_9

DOI: https://doi.org/10.1007/11942634_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49860-5
Online ISBN: 978-3-540-49862-9
eBook Packages: Computer Science, Computer Science (R0)
