poster

Design of a Loeffler DCT using Xilinx Vivado HLS (Abstract Only)

Authors:

Seung Yeol Baik,

Seokjin Jeong,

Hyeong-Cheol Oh

FPGA '15: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Page 278

https://doi.org/10.1145/2684746.2694735

Published: 22 February 2015

Abstract

The Loeffler discrete cosine transform (DCT) algorithm is recognized as the most efficient one because it requires the theoretical minimum number of multiplications. However, many applications still have difficulty performing the 11 multiplications that the algorithm requires to calculate a 1D eight-point DCT. To avoid expensive multipliers in hardware, we designed the DCT accelerator with two methods: distributed arithmetic (DA) and shift-and-add (SAA). The memory bandwidth is 60 bits: 24 bits for reading the R (red), G (green), and B (blue) data of a pixel and 36 bits for writing the three corresponding 12-bit DCT coefficients. Thus, the 1D eight-point DCT accelerator for each of R, G, and B can have one 12-bit input port and one 12-bit output port, so that it can calculate a 2D DCT by the row-column decomposition method. The designs are adjusted to have the same latency and initiation interval. DA seems promising because the Loeffler DCT requires only three small tables with four input bits each. However, our experiments using Xilinx Vivado HLS show that the SAA design is better than the DA design for the considered applications. Furthermore, simulation results suggest that the optimal accelerator design can be obtained by tuning the SAA design to the considered applications. The resulting SAA design requires only 13 adders per color component and can calculate one DCT coefficient per clock cycle. The precision of the internal hardware has been adjusted so that the reconstructed images have PSNR values of at least 39.1 dB for all test images (Lenna, Pepper, House, and Cameraman). If a precision of 13 bits is allowed, the PSNR becomes at least 44.8 dB. Our presentation describes the architecture and operation of the optimized SAA design.
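The shift-and-add idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' design: the 8-bit fractional precision (`FRAC_BITS`), the helper names, and the choice of cos(π/4) as the example constant are all assumptions made for illustration. The sketch quantizes a constant to fixed point and then replaces the multiplication with a sum of shifted copies of the input, one per set bit of the constant.

```python
import math

# Hypothetical fixed-point precision; the abstract does not state the
# fractional width used for the constants.
FRAC_BITS = 8


def to_fixed(c, frac_bits=FRAC_BITS):
    """Quantize a real constant to an integer with frac_bits fractional bits."""
    return round(c * (1 << frac_bits))


def shift_add_terms(k):
    """Return the positions of the set bits of the non-negative integer k."""
    return [i for i in range(k.bit_length()) if (k >> i) & 1]


def shift_add_multiply(x, k):
    """Compute x * k for a non-negative integer constant k using only shifts and adds."""
    acc = 0
    for shift in shift_add_terms(k):
        acc += x << shift  # one shifted copy of x per set bit of k
    return acc


def scale(x, c, frac_bits=FRAC_BITS):
    """Approximate x * c in fixed point; the final right shift rounds toward -infinity."""
    return shift_add_multiply(x, to_fixed(c, frac_bits)) >> frac_bits


C4 = math.cos(math.pi / 4)        # example constant, about 0.7071
K4 = to_fixed(C4)                 # 181 = 0b10110101
print(K4, shift_add_terms(K4))    # 181 [0, 2, 4, 5, 7]
print(scale(100, C4))             # 70, versus the exact 70.71...
```

With this plain binary decomposition, the constant 181 has five set bits and therefore costs four adders. A hardware design would typically reduce the adder count further, for example by sharing partial sums across constants or by using signed-digit encodings; the 13-adder figure in the abstract refers to the authors' optimized design, not to this sketch.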

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Hardware
  1. Communication hardware, interfaces and storage
    1. Signal processing systems

Recommendations

A High-throughput Architecture for Lossless Decompression on FPGA Designed Using HLS (Abstract Only)
FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

In the field of big data applications, lossless data compression and decompression can play an important role in improving the data center's efficiency in storage and distribution of data. To avoid becoming a performance bottleneck, they must be ...

Fast DCT domain filtering using the DCT and the DST

A method for efficient spatial domain filtering, directly in the discrete cosine transform (DCT) domain, is developed and proposed. It consists of using the discrete sine transform (DST) and the DCT for transform-domain processing on the in JPEG basis ...

Quality and Power Efficient Architecture for the Discrete Cosine Transform

In recent years, the demand for multimedia mobile battery-operated devices has created a need for low power implementation of video compression. Many compression standards require the discrete cosine transform (DCT) function to perform image/video ...

Published In

FPGA '15: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 2015

292 pages

ISBN:9781450333153

DOI:10.1145/2684746

General Chair: George A. Constantinides, Imperial College

Program Chair: Deming Chen, University of Illinois at Urbana-Champaign

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Funding Sources

National Research Foundation of Korea

Conference

FPGA '15

Sponsor:

SIGDA

FPGA '15: The 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 22 - 24, 2015

Monterey, California, USA

Acceptance Rates

FPGA '15 Paper Acceptance Rate 20 of 102 submissions, 20%;

Overall Acceptance Rate 125 of 627 submissions, 20%
