Article

A space- and energy-efficient code compression/decompression technique for coarse-grained reconfigurable architectures

Authors:

Bernhard Egger,

Mansureh S. Moghaddam,

Kiyoung Choi

CGO '17: Proceedings of the 2017 International Symposium on Code Generation and Optimization

Pages 197 - 209

Published: 04 February 2017

Abstract

We present an effective code compression technique to reduce the area and energy overhead of the configuration memory for coarse-grained reconfigurable architectures (CGRAs). Based on a statistical analysis of existing code, the proposed method reorders the storage locations of the reconfigurable entities and splits the wide configuration memory into a number of partitions. Code compression is achieved by removing consecutive duplicated lines in each partition. Compressibility is increased by an optimization phase in the compiler that minimizes the number of configuration changes for individual reconfigurable entities. Decompression is performed by simple hardware decoder logic that decodes lines with no additional latency and negligible area overhead. Experiments with over 190 loop kernels from different application domains show that the proposed method achieves a memory reduction of over 40% on average with four partitions. For previously unseen code, compressibility is only slightly lower, at 35% on average. In addition, executing compressed code results in a 22% to 47% reduction in the configuration logic's energy consumption.
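The core idea in the abstract, removing consecutive duplicated lines within each partition, can be illustrated with a minimal Python sketch. This is not the authors' encoding or decoder design: the per-cycle "advance" flag, the function names, and the representation of a configuration line as a tuple of integers are all assumptions made for illustration. The sketch also leaves out the reordering of entity storage locations and the compiler optimization phase.

```python
from typing import List, Sequence, Tuple

# Assumption: one configuration line holds one integer per reconfigurable entity.
Line = Tuple[int, ...]


def split_line(line: Line, bounds: Sequence[int]) -> List[Line]:
    """Cut one wide line into partitions at the given column boundaries."""
    edges = [0, *bounds, len(line)]
    return [line[a:b] for a, b in zip(edges, edges[1:])]


def compress(lines: Sequence[Line], bounds: Sequence[int]):
    """Per partition, store a line only when it differs from the previous one.

    Returns (stored, advance). stored[p] is the deduplicated memory of
    partition p; advance[p][t] is True when cycle t must fetch the next
    stored line (hypothetical side information, not taken from the paper).
    """
    n_parts = len(bounds) + 1
    stored: List[List[Line]] = [[] for _ in range(n_parts)]
    advance: List[List[bool]] = [[] for _ in range(n_parts)]
    for line in lines:
        for p, part in enumerate(split_line(line, bounds)):
            is_new = not stored[p] or stored[p][-1] != part
            if is_new:
                stored[p].append(part)
            advance[p].append(is_new)
    return stored, advance


def decompress(stored, advance) -> List[Line]:
    """Rebuild the wide lines by replaying the advance flags per partition."""
    n_cycles = len(advance[0]) if advance else 0
    pointers = [-1] * len(stored)  # index of the current line in each partition
    out: List[Line] = []
    for t in range(n_cycles):
        wide: Line = ()
        for p in range(len(stored)):
            if advance[p][t]:
                pointers[p] += 1
            wide += stored[p][pointers[p]]
        out.append(wide)
    return out


# Four lines of width 4, split into two partitions of width 2.
lines = [(1, 2, 3, 4), (1, 2, 5, 6), (1, 2, 5, 6), (7, 8, 5, 6)]
stored, advance = compress(lines, bounds=[2])
print(sum(len(s) for s in stored), "partition lines stored instead of", 2 * len(lines))
```

In this toy input, each partition changes only once, so four partition lines are stored instead of eight. Splitting helps because a change in one group of entities no longer forces the unchanged groups to be stored again.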

Cited By

Kim C, Jeong S, Cho S, Lee Y, Song W, Kim Y, Kim H, Lee J (2021). Thread-aware area-efficient high-level synthesis compiler for embedded devices. Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization, pages 327-339. DOI: 10.1109/CGO51591.2021.9370341. Online publication date: 27-Feb-2021.
https://dl.acm.org/doi/10.1109/CGO51591.2021.9370341

Dave S, Kim Y, Avancha S, Lee K, Shrivastava A (2019). dMazeRunner. ACM Transactions on Embedded Computing Systems, 18(5s):1-27. DOI: 10.1145/3358198. Online publication date: 8-Oct-2019.
https://dl.acm.org/doi/10.1145/3358198

Lee H, Moghaddam M, Suh D, Egger B (2018). Improving Energy Efficiency of Coarse-Grain Reconfigurable Arrays Through Modulo Schedule Compression/Decompression. ACM Transactions on Architecture and Code Optimization, 15(1):1-26. DOI: 10.1145/3162018. Online publication date: 22-Mar-2018.
https://dl.acm.org/doi/10.1145/3162018

Information & Contributors

Information

Published In

CGO '17: Proceedings of the 2017 International Symposium on Code Generation and Optimization

February 2017

317 pages

ISBN:9781509049318

General Chair: Vijay Janapa Reddi (University of Texas at Austin, USA)

Program Chairs: Aaron Smith (Microsoft Research, UK / University of Edinburgh, UK), Lingjia Tang (University of Michigan, USA)

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS: Computer Society

Publisher

IEEE Press

Qualifiers

Article

Conference

CGO '17: 15th Annual IEEE/ACM International Symposium on Code Generation and Optimization

February 4 - 8, 2017

Austin, USA

Acceptance Rates

CGO '17 Paper Acceptance Rate: 26 of 116 submissions, 22%

Overall Acceptance Rate: 312 of 1,061 submissions, 29%
