research-article

Bit-plane compression: transforming data for better compression in many-core architectures

Authors:

Michael Sullivan,

Mattan Erez

ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture

Pages 329 - 340

https://doi.org/10.1109/ISCA.2016.37

Published: 18 June 2016

Abstract

As key applications become more data-intensive and the computational throughput of processors increases, the amount of data transferred in modern memory subsystems grows. Increasing physical bandwidth to keep pace with this demand is challenging, however, because of strict area and energy limitations. This paper presents a novel, lightweight compression algorithm, Bit-Plane Compression (BPC), that increases effective memory bandwidth. BPC targets homogeneously-typed memory blocks, which are prevalent in many-core architectures, and applies a data transformation that both improves the inherent compressibility of the data and reduces the complexity of the compression hardware. We demonstrate that BPC achieves a compression ratio of 4.1:1 for integer benchmarks and significantly reduces memory bandwidth requirements.
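
The abstract does not spell out the transformation, so the following is only an illustrative sketch of the general idea behind a bit-plane transform, not the paper's exact pipeline. It assumes a block of unsigned integers of one type, takes the difference between neighbouring values, and then regroups the bits so that each output word (a "plane") holds one bit position from every difference. The names `to_bit_planes` and `from_bit_planes` and the `width` parameter are hypothetical.

```python
def to_bit_planes(words, width=32):
    """Delta-encode a block of same-typed unsigned integers, then transpose
    the deltas into bit-planes (most-significant plane first).

    Assumption: a generic delta + bit-plane transpose, used here only to
    show why homogeneous data becomes easier to compress.
    """
    nbits = width + 1                      # a delta of width-bit values needs one extra bit
    mask = (1 << nbits) - 1
    deltas = [(b - a) & mask for a, b in zip(words, words[1:])]
    planes = []
    for bit in range(nbits - 1, -1, -1):
        plane = 0
        for d in deltas:                   # first delta ends up in the highest bit
            plane = (plane << 1) | ((d >> bit) & 1)
        planes.append(plane)
    return words[0], planes


def from_bit_planes(base, planes, count, width=32):
    """Invert to_bit_planes: rebuild the original block of `count` words."""
    nbits = width + 1
    words = [base]
    for j in range(count - 1):
        shift = count - 2 - j              # position of delta j inside each plane
        d = 0
        for plane in planes:
            d = (d << 1) | ((plane >> shift) & 1)
        if d >> width:                     # sign bit set: interpret as negative
            d -= 1 << nbits
        words.append(words[-1] + d)
    return words


block = [100, 101, 102, 103, 104, 105, 106, 107]
base, planes = to_bit_planes(block)
zero_planes = sum(1 for p in planes if p == 0)
print(base, zero_planes, bin(planes[-1]))
```

For a block of slowly changing values such as `block`, almost every plane is all zeros, and long runs of identical words are what a simple run-length or pattern encoder can exploit in hardware. The transform itself is lossless, as `from_bit_planes` recovers the original block.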

Information & Contributors

Information

Published In

ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture

June 2016

756 pages

ISBN:9781467389471

General Chairs: Sang Lyul Min (Seoul National University), Gabriel Loh (AMD Research)

ACM SIGARCH Computer Architecture News Volume 44, Issue 3
ISCA'16
June 2016
730 pages
ISSN:0163-5964
DOI:10.1145/3007787
Editor: Doug DeGroot

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

IEEE Press

Qualifiers

Research-article

Conference

ISCA '16

ISCA '16: The 43rd Annual International Symposium on Computer Architecture

June 18 - 22, 2016

Seoul, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Citations

Cited By

Shao Q, Arelakis A, Stenström P (2024). HMComp: Extending Near-Memory Capacity using Compression in Hybrid Memory. Proceedings of the 38th ACM International Conference on Supercomputing, pp. 74-84. doi:10.1145/3650200.3656612. Online publication date: 30-May-2024
https://dl.acm.org/doi/10.1145/3650200.3656612
Lascorz A, Mahmoud M, Zadeh A, Nikolic M, Ibrahim K, Giannoula C, Abdelhadi A, Moshovos A, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). Atalanta: A Bit is Worth a “Thousand” Tensor Values. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 85-102. doi:10.1145/3620665.3640356. Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3620665.3640356
Tomei M, Das S, Seyedzadeh M, Bedoukian P, Beckmann B, Kumar R, Wood D (2021). Byte-Select Compression. ACM Transactions on Architecture and Code Optimization 18:4, pp. 1-27. doi:10.1145/3462209. Online publication date: 3-Sep-2021
https://dl.acm.org/doi/10.1145/3462209
Tsai P, Sanchez A, Fletcher C, Sanchez D, Larus J, Ceze L, Strauss K (2020). Safecracker: Leaking Secrets through Compressed Caches. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 1125-1140. doi:10.1145/3373376.3378453. Online publication date: 9-Mar-2020
https://dl.acm.org/doi/10.1145/3373376.3378453
Sriraman A, Dhanotia A, Larus J, Ceze L, Strauss K (2020). Accelerometer. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 733-750. doi:10.1145/3373376.3378450. Online publication date: 9-Mar-2020
https://dl.acm.org/doi/10.1145/3373376.3378450
Wang Z, Kara K, Zhang H, Alonso G, Mutlu O, Zhang C (2019). Accelerating generalized linear models with MLWeaving. Proceedings of the VLDB Endowment 12:7, pp. 807-821. doi:10.14778/3317315.3317322. Online publication date: 1-Mar-2019
https://dl.acm.org/doi/10.14778/3317315.3317322
Eldstål-Damlin A, Trancoso P, Sourdis I (2019). AVR. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. doi:10.1145/3337821.3337824. Online publication date: 5-Aug-2019
https://dl.acm.org/doi/10.1145/3337821.3337824
Oh Y, Koo G, Annavaram M, Ro W, Manne S, Hunter H, Altman E (2019). Linebacker. Proceedings of the 46th International Symposium on Computer Architecture, pp. 183-196. doi:10.1145/3307650.3322222. Online publication date: 22-Jun-2019
https://dl.acm.org/doi/10.1145/3307650.3322222
Li C, Ausavarungnirun R, Rossbach C, Zhang Y, Mutlu O, Guo Y, Yang J, Bahar I, Herlihy M, Witchel E, Lebeck A (2019). A Framework for Memory Oversubscription Management in Graphics Processing Units. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 49-63. doi:10.1145/3297858.3304044. Online publication date: 4-Apr-2019
https://dl.acm.org/doi/10.1145/3297858.3304044
Tsai P, Sanchez D, Bahar I, Herlihy M, Witchel E, Lebeck A (2019). Compress Objects, Not Cache Lines. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 229-242. doi:10.1145/3297858.3304006. Online publication date: 4-Apr-2019
https://dl.acm.org/doi/10.1145/3297858.3304006
