research-article

Open access

Understanding Cache Compression

Authors:

Daniel Rodrigues Carvalho,

André Seznec

ACM Transactions on Architecture and Code Optimization (TACO), Volume 18, Issue 3

Article No.: 36, Pages 1 - 27

https://doi.org/10.1145/3457207

Published: 08 June 2021

Abstract

Hardware cache compression derives from software-compression research; yet its implementation is not a straightforward translation, since it must comply with area, power, and latency constraints. This study sheds light on the challenges of adopting compression in cache design, from the shrinking of the data to its physical placement. The goal of this article is not to summarize proposals but to highlight the solutions they employ to handle those challenges. An in-depth description of the main characteristics of multiple methods is provided, as well as criteria that can be used as a basis for assessing such schemes. It is expected that this article will ease the understanding of the decisions to be made when designing compressed systems and provide directions for future work.

Index Terms

Understanding Cache Compression
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Processors and memory architectures
2. Information systems
  1. Data management systems
    1. Data structures
      1. Data layout
        Data compression

Information

Published In

ACM Transactions on Architecture and Code Optimization Volume 18, Issue 3

September 2021

370 pages

ISSN:1544-3566

EISSN:1544-3973

DOI:10.1145/3460978

Editor:
David Kaeli
Northeastern University, USA

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 June 2021

Accepted: 01 March 2021

Revised: 01 February 2021

Received: 01 September 2020

Published in TACO Volume 18, Issue 3

Qualifiers

Research-article
Research
Refereed

Bibliometrics

Article Metrics

Total Citations: 6
Total Downloads: 6,905
Downloads (Last 12 months): 2,825
Downloads (Last 6 weeks): 420

Reflects downloads up to 20 Nov 2024

Cited By

Surchenko A, Nedbailo Y (2024) Hardware Compression Method for On-Chip and Interprocessor Networks with Wide Channels and Wormhole Flow Control Policy. Informatics and Automation 23:3 (859-885). Online publication date: 28-May-2024
https://doi.org/10.15622/ia.23.3.8
Nath A, Kapoor H (2024) AmLuCEP: Amalgamating LUT-based Compression and Adaptive Encoding Assisted Block Placement To Improve Lifetime of PCM-based Main Memories. ACM Transactions on Design Automation of Electronic Systems 29:6 (1-24). Online publication date: 20-Aug-2024
https://dl.acm.org/doi/10.1145/3689334
Wang Y, Paccagnella R, Gang Z, Vasquez W, Kohlbrenner D, Shacham H, Fletcher C (2024) GPU.zip: On the Side-Channel Implications of Hardware-Based Graphical Data Compression. 2024 IEEE Symposium on Security and Privacy (SP) (3716-3734). Online publication date: 19-May-2024
https://doi.org/10.1109/SP54263.2024.00084
Buyuktosunoglu A, Trilla D, Abali B, Berger D, Walters C, Lee J (2024) Enterprise-Class Cache Compression Design. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (996-1011). Online publication date: 2-Mar-2024
https://doi.org/10.1109/HPCA57654.2024.00080
Loloeyan P, Nikmehr H, Rezaei M (2024) A novel approximate cache block compressor for error-resilient image data. Computers and Electrical Engineering 115:C. Online publication date: 2-Jul-2024
https://dl.acm.org/doi/10.1016/j.compeleceng.2024.109106
Kim J, Kang M, Hong J, Kim S (2022) Exploiting Inter-block Entropy to Enhance the Compressibility of Blocks with Diverse Data. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (1100-1114). Online publication date: Apr-2022
https://doi.org/10.1109/HPCA53966.2022.00084
Carvalho D, Seznec A (2021) Conciliating Speed and Efficiency on Cache Compressors. 2021 IEEE 39th International Conference on Computer Design (ICCD) (442-446). Online publication date: Oct-2021
https://doi.org/10.1109/ICCD53106.2021.00075
