Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Computer Generation of Hardware for Linear Digital Signal Processing Transforms

Published: 01 April 2012 Publication History

Abstract

Linear signal transforms such as the discrete Fourier transform (DFT) are very widely used in digital signal processing and other domains. Due to high performance or efficiency requirements, these transforms are often implemented in hardware. This implementation is challenging due to the large number of algorithmic options (e.g., fast Fourier transform algorithms or FFTs), the variety of ways that a fixed algorithm can be mapped to a sequential datapath, and the design of the components of this datapath. The best choices depend heavily on the resource budget and the performance goals of the target application. Thus, it is difficult for a designer to determine which set of options will best meet a given set of requirements.
In this article we introduce the Spiral hardware generation framework and system for linear transforms. The system takes a problem specification as input as well as directives that define characteristics of the desired datapath. Using a mathematical language to represent and explore transform algorithms and datapath characteristics, the system automatically generates an algorithm, maps it to a datapath, and outputs a synthesizable register transfer level Verilog description suitable for FPGA or ASIC implementation. The quality of the generated designs rivals the best available handwritten IP cores.

References

[1]
4DSP, LLC. 2007. 4DSP floating point fast Fourier transform V2.6. 4DSP, LLC.
[2]
Astola, J. and Akopian, D. 1999. Architecture-oriented regular algorithms for discrete sine and cosine transforms. IEEE Trans. Signal Process. 47, 4, 1109--1124.
[3]
Bluestein, L. I. 1970. A linear filtering approach to computation of discrete Fourier transform. IEEE Trans. Audio Electroacoust. 18, 4, 451--455.
[4]
Cohen, D. 1976. Simplified control of FFT hardware. IEEE Trans. Acoust. Speech, Signal Process. 24, 6, 577--579.
[5]
Cooley, J. W. and Tukey, J. W. 1965. An algorithm for the machine calculation of complex Fourier series. Math. Computat. 19, 90, 297--301.
[6]
Cortés, A. and Vélez, I. 2009. Radix rk FFTs: Matricial representation and SDC/SDF pipeline implementation. IEEE Trans. Signal Process. 57, 7, 2824--2839.
[7]
Dave, N., Pellauer, M., Gerding, S., and Arvind. 2006. 802.11a transmitter: A case study in microarchitectural exploration. In Proceedings of the ACM/IEEE International Conference on Formal Methods and Models for Codesign. 59--68.
[8]
Dershowitz, N. and Plaisted, D. A. 2001. Rewriting. In Handbook of Automated Reasoning, Vol. 1, A. Robinson and A. Voronkov Eds., Elsevier, 535--610.
[9]
Franchetti, F., Voronenko, Y., and Püschel, M. 2006a. FFT program generation for shared memory: SMP and multicore. In Proceedings of the ACM/IEEE Conference on Supercomputing.
[10]
Franchetti, F., Voronenko, Y., and Püschel, M. 2006b. A rewriting system for the vectorization of signal transforms. In Proceedings of High Performance Computing for Computational Science (VECPAR). Lecture Notes in Computer Science, vol. 4395, Springer, 363--377.
[11]
Gorman, S. F. and Wills, J. M. 1995. Partial column FFT pipelines. IEEE Trans. Circuits Syst. II: Analog Digital Signal Process. 42, 6, 414--423.
[12]
He, S. and Torkelson, M. 1996. A new approach to pipeline FFT processor. In Proceedings of the International Parallel Processing Symposium. 766--770.
[13]
Järvinen, T. S., Salmela, P., Sorokin, H., and Takala, J. H. 2004. Stride permutation networks for array processors. In Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors. 51--71.
[14]
Johnson, J. R., Johnson, R. W., Rodriguez, D., and Tolimieri, R. 1990. A methodology for designing, modifying, and implementing Fourier transform algorithms on various architectures. Circuits, Syst. Signal Process. 9, 449--500.
[15]
Kee, H., Petersen, N., Kornerup, J., and Bhattacharyya, S. S. 2008. Systematic generation of FPGA-based FFT implementations. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. 1413--1416.
[16]
Kumhom, P., Johnson, J., and Nagvajara, P. 2000. Design, optimization, and implementation of a universal FFT processor. In Proceedings of the 13th IEEE ASIC/SOC Conference. 182--186.
[17]
Milder, P. A. 2010. A mathematical approach for compiling and optimizing hardware implementations of DSP transforms. Ph.D. dissertation, Electrical and Computer Engineering, Carnegie Mellon University.
[18]
Milder, P. A., Ahmad, M., Hoe, J. C., and Püschel, M. 2006. Fast and accurate resource estimation of automatically generated custom DFT IP cores. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA). 211--220.
[19]
Milder, P. A., Franchetti, F., Hoe, J. C., and Püschel, M. 2008. Formal datapath representation and manipulation for implementing DSP transforms. In Proceedings of the 45th Annual ACM/IEEE Conference on Design Automation (DAC). 385--390.
[20]
Milder, P. A., Hoe, J. C., and Püschel, M. 2009. Automatic generation of streaming datapaths for arbitrary fixed permutations. In Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1118--1123.
[21]
Muralimanohar, N., Balasubramonian, R., and Jouppi, N. P. 2009. CACTI 6.0: A tool to model large caches. Tech. rep. HPL-2009-85, Hewlett-Packard Laboratories.
[22]
Nikara, J., Takala, J. H., and Astola, J. 2006. Discrete cosine and sine transforms---regular algorithms and pipeline architectures. Signal Process., 230--249.
[23]
Pease, M. C. 1968. An adaptation of the fast Fourier transform for parallel processing. J. ACM 15, 2, 252--264.
[24]
Püschel, M., Moura, J. M. F., Johnson, J., Padua, D., Veloso, M., Singer, B. W., Xiong, J., Franchetti, F., Gačić, A., Voronenko, Y., Chen, K., Johnson, R. W., and Rizzolo, N. 2005. SPIRAL: Code generation for DSP transforms. Proc. IEEE 93, 2, 232--275.
[25]
Püschel, M., Milder, P. A., and Hoe, J. C. 2009. Permuting streaming data using RAMs. J. ACM 56, 2.
[26]
Sukhsawas, S. and Benkrid, K. 2004. A high-level implementation of a high performance pipeline FFT on Virtex-E FPGAs. In Proceedings of the IEEE Symposium on VLSI. 229--232.
[27]
Takala, J. H., Akopian, D. A., Astola, J., and Saarinen, J. P. P. 2000. Constant geometry algorithm for discrete cosine transform. IEEE Trans. Signal Process. 48, 6, 1840--1843.
[28]
Van Loan, C. 1992. Computational Frameworks for the Fast Fourier Transform. SIAM.
[29]
Voronenko, Y. and Püschel, M. 2009. Algebraic signal processing theory: Cooley-Tukey type algorithms for real DFTs. IEEE Trans. Signal Process. 57, 1, 205--222.
[30]
Xilinx, Inc. 2010. Xilinx LogiCore IP fast Fourier transform v7.1. Xilinx, Inc.
[31]
Xiong, J., Johnson, J., Johnson, R., and Padua, D. 2001. SPL: A language and compiler for DSP algorithms. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 298--308.
[32]
Zapata, E. L. and Argüello, F. 1992. A VLSI constant geometry architecture for the fast Hartley and Fourier transforms. IEEE Trans. Parall. Distrib. Syst. 3, 1, 58--70.

Cited By

View all
  • (2024)4-Gbps low-latency FPGA-based underwater wireless optical communicationOptics Express10.1364/OE.53055132:21(36207)Online publication date: 24-Sep-2024
  • (2024)Allo: A Programming Model for Composable Accelerator DesignProceedings of the ACM on Programming Languages10.1145/36564018:PLDI(593-620)Online publication date: 20-Jun-2024
  • (2024)Hardware Acceleration and Implementation of Fully Homomorphic Encryption Over the TorusIEEE Transactions on Circuits and Systems I: Regular Papers10.1109/TCSI.2023.333895371:3(1116-1129)Online publication date: Mar-2024
  • Show More Cited By

Index Terms

  1. Computer Generation of Hardware for Linear Digital Signal Processing Transforms

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Design Automation of Electronic Systems
    ACM Transactions on Design Automation of Electronic Systems  Volume 17, Issue 2
    April 2012
    170 pages
    ISSN:1084-4309
    EISSN:1557-7309
    DOI:10.1145/2159542
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Journal Family

    Publication History

    Published: 01 April 2012
    Accepted: 01 November 2011
    Revised: 01 November 2011
    Received: 01 July 2011
    Published in TODAES Volume 17, Issue 2

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Digital signal processing transform
    2. discrete Fourier transform
    3. fast Fourier transform
    4. hardware generation
    5. high-level synthesis
    6. linear transform

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)66
    • Downloads (Last 6 weeks)7
    Reflects downloads up to 14 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)4-Gbps low-latency FPGA-based underwater wireless optical communicationOptics Express10.1364/OE.53055132:21(36207)Online publication date: 24-Sep-2024
    • (2024)Allo: A Programming Model for Composable Accelerator DesignProceedings of the ACM on Programming Languages10.1145/36564018:PLDI(593-620)Online publication date: 20-Jun-2024
    • (2024)Hardware Acceleration and Implementation of Fully Homomorphic Encryption Over the TorusIEEE Transactions on Circuits and Systems I: Regular Papers10.1109/TCSI.2023.333895371:3(1116-1129)Online publication date: Mar-2024
    • (2024)SlidingConv: Domain-Specific Description of Sliding Discrete Cosine Transform Convolution for HalideIEEE Access10.1109/ACCESS.2023.334566012(7563-7583)Online publication date: 2024
    • (2024)RETRACTED: An improved hybrid network‐on‐chip with flexible topology and frugal routingThe Journal of Engineering10.1049/tje2.123952024:6Online publication date: 11-Jun-2024
    • (2023)Supply Chain Aware Computer ArchitectureProceedings of the 50th Annual International Symposium on Computer Architecture10.1145/3579371.3589052(1-15)Online publication date: 17-Jun-2023
    • (2023)FPT: A Fixed-Point Accelerator for Torus Fully Homomorphic EncryptionProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security10.1145/3576915.3623159(741-755)Online publication date: 15-Nov-2023
    • (2023)Frequency-Domain Inference Acceleration for Convolutional Neural Networks Using ReRAMsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.332290734:12(3133-3146)Online publication date: 1-Dec-2023
    • (2023)Towards a Flexible Hardware Implementation for Mixed-Radix Fourier Transforms2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363540(1-7)Online publication date: 25-Sep-2023
    • (2022)Pushing the Level of Abstraction of Digital System Design: A Survey on How to Program FPGAsACM Computing Surveys10.1145/353298955:5(1-48)Online publication date: 3-Dec-2022
    • Show More Cited By

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media