Cache Memory and On-Chip Cache Architecture: A Survey

  • Conference paper
  • First Online:
Advanced Computing, Machine Learning, Robotics and Internet Technologies (AMRIT 2023)

Abstract

Processing speed is one of the most important performance measures of modern multicore CPUs, and various components, including the cache, are employed to increase it. Cache memory is crucial to the speed of a multi-core system: because CPU speed is increasing rapidly, a very fast cache is required to keep up with the processor. The on-chip cache acts as a buffer between the CPU and main memory, storing frequently used information and synchronizing the rate at which data flows between the central processor and main memory. Storing data in the cache rather than in main memory (RAM) gives faster retrieval times, but at the cost of additional on-chip energy consumption. In this survey, the performance of cache memory is evaluated using three variables: miss rate, miss penalty, and cache access time.
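The paper itself contains no code, but the three variables named in the abstract combine into the standard average memory access time (AMAT) figure of merit: AMAT = hit time + miss rate × miss penalty. The sketch below is a minimal illustration of that relationship; the function name, the numeric values, and the two-level extension are assumptions chosen for illustration, not results from the paper.

```python
def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Illustrative (assumed) numbers: 1 ns L1 hit time, 5% miss rate, 60 ns main-memory penalty.
l1_only = amat(hit_time_ns=1.0, miss_rate=0.05, miss_penalty_ns=60.0)

# With an L2 cache, the L1 miss penalty is itself the AMAT of the L2 level.
l2_amat = amat(hit_time_ns=8.0, miss_rate=0.30, miss_penalty_ns=60.0)
two_level = amat(hit_time_ns=1.0, miss_rate=0.05, miss_penalty_ns=l2_amat)

print(f"L1 only : {l1_only:.2f} ns")    # 4.00 ns
print(f"L1 + L2 : {two_level:.2f} ns")  # 2.30 ns
```

The example shows why the survey's three variables are sufficient to characterize a cache level: lowering any one of them, or adding a level that shrinks the effective miss penalty, directly reduces the average access time.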



Author information

Corresponding author

Correspondence to Nurulla Mansur Barbhuiya.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Barbhuiya, N.M., Das, P., Roy, B.R. (2024). Cache Memory and On-Chip Cache Architecture: A Survey. In: Das, P., Begum, S.A., Buyya, R. (eds) Advanced Computing, Machine Learning, Robotics and Internet Technologies. AMRIT 2023. Communications in Computer and Information Science, vol 1954. Springer, Cham. https://doi.org/10.1007/978-3-031-47221-3_12

  • DOI: https://doi.org/10.1007/978-3-031-47221-3_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47220-6

  • Online ISBN: 978-3-031-47221-3

  • eBook Packages: Computer Science, Computer Science (R0)
