research-article
Open access

Understanding Cache Compression

Published: 08 June 2021

Abstract

Hardware cache compression derives from software-compression research, yet its implementation is not a straightforward translation, because it must comply with area, power, and latency constraints. This study sheds light on the challenges of adopting compression in cache design, from shrinking the data to physically placing it. The goal of this article is not to summarize proposals but to highlight the solutions they employ to handle those challenges. It provides an in-depth description of the main characteristics of multiple methods, as well as criteria that can serve as a basis for assessing such schemes. The article is expected to ease the understanding of the decisions involved in designing compressed systems and to provide directions for future work.

Information & Contributors

Information

Published In

ACM Transactions on Architecture and Code Optimization  Volume 18, Issue 3
September 2021
370 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/3460978

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 June 2021
Accepted: 01 March 2021
Revised: 01 February 2021
Received: 01 September 2020
Published in TACO Volume 18, Issue 3

Author Tags

  1. Caches
  2. cache compression

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Hardware Compression Method for On-Chip and Interprocessor Networks with Wide Channels and Wormhole Flow Control Policy. Informatics and Automation. DOI: 10.15622/ia.23.3.8. 23:3 (859-885). Online publication date: 28-May-2024
  • (2024) AmLuCEP: Amalgamating LUT-based Compression and Adaptive Encoding Assisted Block Placement To Improve Lifetime of PCM-based Main Memories. ACM Transactions on Design Automation of Electronic Systems. DOI: 10.1145/3689334. 29:6 (1-24). Online publication date: 20-Aug-2024
  • (2024) GPU.zip: On the Side-Channel Implications of Hardware-Based Graphical Data Compression. 2024 IEEE Symposium on Security and Privacy (SP). DOI: 10.1109/SP54263.2024.00084. (3716-3734). Online publication date: 19-May-2024
  • (2024) Enterprise-Class Cache Compression Design. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA). DOI: 10.1109/HPCA57654.2024.00080. (996-1011). Online publication date: 2-Mar-2024
  • (2024) A novel approximate cache block compressor for error-resilient image data. Computers and Electrical Engineering. DOI: 10.1016/j.compeleceng.2024.109106. 115:C. Online publication date: 2-Jul-2024
  • (2022) Exploiting Inter-block Entropy to Enhance the Compressibility of Blocks with Diverse Data. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). DOI: 10.1109/HPCA53966.2022.00084. (1100-1114). Online publication date: Apr-2022
  • (2021) Conciliating Speed and Efficiency on Cache Compressors. 2021 IEEE 39th International Conference on Computer Design (ICCD). DOI: 10.1109/ICCD53106.2021.00075. (442-446). Online publication date: Oct-2021
