Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Compressor tree synthesis on commercial high-performance FPGAs

Published: 28 December 2011 Publication History

Abstract

Compressor trees are a class of circuits that generalizes multioperand addition and the partial product reduction trees of parallel multipliers using carry-save arithmetic. Compressor trees naturally occur in many DSP applications, such as FIR filters, and, in the more general case, their use can be maximized through the application of high-level transformations to arithmetically intensive data flow graphs. Due to the presence of carry-chains, it has long been thought that trees of 2- or 3-input carry-propagate adders are more efficient than compressor trees for FPGA synthesis; however, this is not the case. This article presents a heuristic for FPGA synthesis of compressor trees that outperforms adder trees and exploits carry-chains when possible. The experimental results show that, on average, the use of compressor trees can reduce critical path delay by 33% and 45% respectively, compared to adder trees synthesized on the Xilinx Virtex-5 and Altera Stratix III FPGAs.

References

[1]
Altera Corporation. Stratix II, III, and IV device handbooks. http://www.altera.com/.
[2]
Dadda, L. 1965. Some schemes for parallel multipliers. Alta Frequenza 34, 349--356.
[3]
Hutton, M., Schleicher, J., Lewis, D. M. Pederson, Yuan, S., Kaptanoglu, S., Baeckler, G., Ratchev, B., Padalia, K., Bourgeault, M., Lee, A., Kim, H., and Saini, R. 2004. Improving FPGA performance and area using an adaptive logic module. In Proceedings of the 14th International Conference on Field Programmable Logic and Applications. 135--144.
[4]
Kamp, W., Bainbridge-Smith, A., and Hayes, M. 2009. Efficient implementation of fast redundant number addrs for long word-lengths in FPGAs. In Proceedings of the International Conference on Field-Programmable Technology. 239--246.
[5]
Matsunaga, T., Kimura, S., and Matsunaga, Y. 2010. Multi-Operand adder synthesis on FPGAs using generalized parallel counters. In Proceedings of the 15th Asia and South Pacific Design Automation Conference. 337--342.
[6]
Oklobdzija, V. G. and Villeger, D. 1995. Improving multiplier design by using improved column compression tree and optimized final adder in CMOS technology. IEEE Trans. VLSI Syst. 3. 292--301.
[7]
Ortiz, M., Quiles, F., Jormigo, J., Jaime, F. J., Villalba, J., and Zapata, E. L. 2009. Efficient implementation of carry-save adders in FPGAs. In Proceedings of the 20th IEEE International Conference on Application-specific Systems, Architectures, and Processors. 207--210.
[8]
Paidimarri, A., Cevrero, A., Brisk, P., and Ienne, P. 2009. FPGA implementation of a single-precision floating-point multiply accumulator with single-cycle accumulation. In Proceedings of the 17th IEEE Symposium on Field Programmable Custom Computing Machines. 267--270.
[9]
Parandeh-Afshar, H., Brisk, P., and Ienne, P. 2008a. Efficient synthesis of compressor trees on FPGAs. In Proceedings of the Asia-South Pacific Design Automation Conference. 138--143.
[10]
Parandeh-Afhsar, H., Brisk, P., and Ienne, P. 2008b. Improving synthesis of compressor trees on FPGAs via integer linear programming. In Proceedings of the International Conference on Design Automation and Test in Europe. 1256--1262.
[11]
Parandeh-Afshar, H., Brisk, P., and Ienne, P. 2009. Exploiting fast carry-chains of FPGAs for designing compressor trees. In Proceedings of the 19th International Conference on Field Programmable Logic and Applications. 242--249.
[12]
Poldre, J. and Tammemae, K. 1999. Reconfigurable multiplier for Virtex FPGA family. In Proceedings of the 9th International Workshop on Field-Programmable Logic and Applications. 359--364.
[13]
Stelling, P. F., Martel, C. U., Oklobdzija, V. J., and Ravi, R. 1998. Optimal circuits for parallel multipliers. IEEE Trans. Comput. 47, 273--285.
[14]
Stelling, P. F. and Oklobdzija, V. J. 1996. Design strategies for optimal hybrid final adders in a parallel multiplier. J. VLSI Signal Process. 14, 321--331.
[15]
Stenzel, W. J., Kubitz, W. J., and Garcia, G. H. 1977. A compact high-speed parallel multiplication scheme. IEEE Trans. Comput. C-26, 948--957.
[16]
Swartzlander Jr., E. E. 1973. Parallel counters. IEEE Trans. Comput. C-22, 1021--1024.
[17]
Um, J. and Kim, T. 2002. Layout-aware synthesis of arithmetic circuits. In Proceedings of the 39th Design Automation Conference. 207--212.
[18]
Verma, A. K., Brisk, P., and Ienne, P. 2008. Data-flow transformations to maximise the use of carry-save representation in arithmetic circuits. IEEE Trans. Comput.-Aided Des. 27, 1761--1774.
[19]
Wallace, C. S. 1964. A suggestion for a fast multiplier. IEEE Trans. Elec. Comput. 13, 14--17.
[20]
Weinberger, A. 1981. A 4:2 carry save adder module. IBM Tech. Disclos. Bull. 23.
[21]
Xilinx Corporation. Virtex 4, 5, and 6 device handbooks. http://www.xilinx.com.
[22]
Lee, C., Potkonjak, M., and Mangione-Smith, W. H. 1997. MediaBench: A tool for evaluating and synthesizing multimedia and communications systems. In Proceedings of the 30th International Symposium on Microarchitecture. 330--335.
[23]
Mirzaei, S., Hosangadi, A., and Kastner, R. 2006. FPGA implementation of high speed FIR filters using add and shift method. In Proceedings of the International Conference on Computer Design. 308--313.
[24]
Chen, C.-Y., Chien, S.-Y., Huang, Y.-W., Chen, T.-C., Wang, T.-C., and Chen, L.-G. 2006. Analysis and architecture design of variable block-size motion estimation for H.264/AVC. IEEE Trans. Circ. Syst. 53, 578--593.
[25]
Shams, A., Pan, W., Chandanandan, A., and Bayoumi, M. 2000. A high-performance 1D-DCT architecture. In Proceedings of IEEE International Symposium on Circuits and Systems. 521--524.
[26]
Synopsys. 2001. Creating high-speed data-path components—Application note. Tech. rep. Mountain View, CA, version 2001.08.

Cited By

View all
  • (2024)High-efficiency Compressor Trees for Latest AMD FPGAsACM Transactions on Reconfigurable Technology and Systems10.1145/364509717:2(1-32)Online publication date: 10-Feb-2024
  • (2024)Multiplier Design Addressing Area-Delay Trade-offs by using DSP and Logic resources on FPGAs2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP)10.1109/ASAP61560.2024.00051(217-225)Online publication date: 24-Jul-2024
  • (2024)Small Logic-based Multipliers with Incomplete Sub-Multipliers for FPGAs2024 IEEE 31st Symposium on Computer Arithmetic (ARITH)10.1109/ARITH61463.2024.00029(124-131)Online publication date: 10-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Reconfigurable Technology and Systems
ACM Transactions on Reconfigurable Technology and Systems  Volume 4, Issue 4
December 2011
179 pages
ISSN:1936-7406
EISSN:1936-7414
DOI:10.1145/2068716
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 December 2011
Accepted: 01 March 2011
Revised: 01 December 2010
Received: 01 June 2010
Published in TRETS Volume 4, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Field Programamble Gate Array (FPGA)
  2. carry chain
  3. compressor tree
  4. look-up table (LUT)

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)33
  • Downloads (Last 6 weeks)1
Reflects downloads up to 17 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)High-efficiency Compressor Trees for Latest AMD FPGAsACM Transactions on Reconfigurable Technology and Systems10.1145/364509717:2(1-32)Online publication date: 10-Feb-2024
  • (2024)Multiplier Design Addressing Area-Delay Trade-offs by using DSP and Logic resources on FPGAs2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP)10.1109/ASAP61560.2024.00051(217-225)Online publication date: 24-Jul-2024
  • (2024)Small Logic-based Multipliers with Incomplete Sub-Multipliers for FPGAs2024 IEEE 31st Symposium on Computer Arithmetic (ARITH)10.1109/ARITH61463.2024.00029(124-131)Online publication date: 10-Jun-2024
  • (2023)Towards Globally Optimal Design of Multipliers for FPGAsIEEE Transactions on Computers10.1109/TC.2023.323812872:5(1261-1273)Online publication date: 1-May-2023
  • (2023)Sums of Weighted BitsApplication-Specific Arithmetic10.1007/978-3-031-42808-1_7(151-201)Online publication date: 23-Aug-2023
  • (2022)Fixed–Point Arithmetic for Implementing Massive MIMO Systems2022 24th International Conference on Advanced Communication Technology (ICACT)10.23919/ICACT53585.2022.9728867(1-11)Online publication date: 13-Feb-2022
  • (2022)Lossless Differential Table Compression for Hardware Function EvaluationIEEE Transactions on Circuits and Systems II: Express Briefs10.1109/TCSII.2021.313140569:3(1642-1646)Online publication date: Mar-2022
  • (2021)Resource Optimal Truncated Multipliers for FPGAs2021 IEEE 28th Symposium on Computer Arithmetic (ARITH)10.1109/ARITH51176.2021.00029(102-109)Online publication date: Jun-2021
  • (2020)LUXORProceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays10.1145/3373087.3375303(161-171)Online publication date: 23-Feb-2020
  • (2020)Optimizing FPGA Logic Block Architectures for ArithmeticIEEE Transactions on Very Large Scale Integration (VLSI) Systems10.1109/TVLSI.2020.2965772(1-14)Online publication date: 2020
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media