An FPGA Logic Cell and Carry Chain Configurable as a 6:2 or 7:2 Compressor

Published: 01 September 2009

Abstract

To improve FPGA performance for arithmetic circuits that are dominated by multi-input addition operations, an FPGA logic block is proposed that can be configured as a 6:2 or 7:2 compressor. Compressors have been used successfully in the past to realize parallel multipliers in VLSI technology; however, the peculiar structure of FPGA logic blocks, coupled with the high cost of the routing network relative to ASIC technology, renders compressors ineffective when mapped onto the general logic of an FPGA. On the other hand, current FPGA logic cells have already been enhanced with carry chains to improve arithmetic functionality, for example, to realize fast ternary carry-propagate addition. The contribution of this article is a new FPGA logic cell that is specialized to help realize efficient compressor trees on FPGAs. The new FPGA logic cell has two variants that can respectively be configured as a 6:2 or a 7:2 compressor using additional carry chains that, coupled with lookup tables, provide the necessary functionality. Experiments show that the use of these modified logic cells significantly reduces the delay of compressor trees synthesized on FPGAs compared to state-of-the-art synthesis techniques, with a moderate increase in area and power consumption.
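As a rough functional illustration of what a 7:2 compressor computes, the Python sketch below reduces seven non-negative integers to two integers with the same total, using only bitwise 3:2 carry-save steps, so that a single carry-propagate addition remains at the end. This is a behavioral model only, not the logic cell or carry-chain structure proposed in the article; the names `carry_save_add` and `compress_to_two` are hypothetical and chosen for this example.

```python
from typing import List, Tuple


def carry_save_add(a: int, b: int, c: int) -> Tuple[int, int]:
    """3:2 compression: three operands in, two out, total preserved.

    Assumes non-negative integers. Each bit position is handled
    independently, so no carry ripples across the word.
    """
    sum_bits = a ^ b ^ c                             # per-column sum bit
    carry_bits = ((a & b) | (a & c) | (b & c)) << 1  # per-column carry, moved up one rank
    return sum_bits, carry_bits


def compress_to_two(operands: List[int]) -> Tuple[int, int]:
    """Reduce any number of operands to two with the same total."""
    work = list(operands)
    while len(work) < 2:
        work.append(0)  # pad so that two values can always be returned
    while len(work) > 2:
        a, b, c = work[0], work[1], work[2]
        # Replace three operands by two; the list shrinks by one each pass.
        work = work[3:] + list(carry_save_add(a, b, c))
    return work[0], work[1]


operands = [13, 7, 200, 45, 99, 1, 63]   # seven inputs, as in a 7:2 compressor
s, c = compress_to_two(operands)
# Only this final step needs a carry-propagate adder.
assert s + c == sum(operands)
```

The model takes five 3:2 steps to go from seven operands to two. The article's contribution is a hardware cell that performs this kind of reduction more directly; the sketch only shows the arithmetic invariant that any such compressor must preserve.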


Cited By

  • (2020) LUXOR. Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 161-171. DOI: 10.1145/3373087.3375303. Online publication date: 23-Feb-2020
  • (2018) FPGA Architecture Enhancements for Efficient BNN Implementation. 2018 International Conference on Field-Programmable Technology (FPT), 214-221. DOI: 10.1109/FPT.2018.00039. Online publication date: Dec-2018
  • (2012) Rethinking FPGAs. Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 119-128. DOI: 10.1145/2145694.2145715. Online publication date: 22-Feb-2012

Reviews

Srinivasa R Vemuru

Field-programmable gate arrays (FPGAs) are ideal platforms for prototyping hardware, due to the fast turnaround time and inherently reconfigurable nature of the devices. FPGAs are increasingly being used in low-to-medium-volume markets. Although their performance is superior to that of software implementations, FPGA implementations are still significantly slower than application-specific integrated circuit implementations. To reduce this performance gap, FPGAs are equipped with additional hardware and programmable features to improve the performance of arithmetic blocks. Compressor trees are well suited to multiple digit addition and fast multiplication applications. The authors present new enhancements to commercial FPGA architectures that make it easier to implement 6:2 and 7:2 compressors. The paper has a good introduction to arithmetic primitives and their mapping onto the FPGA hardware resources. The authors describe the cell modifications to improve the performance of arithmetic primitives. They discuss in detail the heuristics to map compressor trees to the modified FPGA cells. They study the critical path delay and power consumption of four different implementations of multiple benchmark arithmetic circuits on the modified FPGA architecture. The four implementations are ternary, generalized parallel counters (GPC), GPC with 6:2 compressors, and GPC with 7:2 compressors. Overall, the implementations based on compressors have significantly reduced delays, with an increase in the use of FPGA resources and power consumption. The paper should be of interest to researchers in the areas of FPGA architectures and computer arithmetic.

Online Computing Reviews Service
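For readers unfamiliar with the generalized parallel counters (GPCs) mentioned in the review, the short Python sketch below models one behaviorally: it accepts input bits of several weights and returns the binary encoding of their weighted count. It assumes the usual definition of a GPC from the computer arithmetic literature; the function name `gpc` and the column-list input format are hypothetical choices for this example and do not come from the article.

```python
from typing import List


def gpc(columns: List[List[int]]) -> List[int]:
    """Behavioral model of a generalized parallel counter.

    columns[i] holds the input bits of weight 2**i.
    Returns the output bits, least significant bit first.
    """
    # Weighted count of all input bits that are set.
    total = sum(bit << rank for rank, bits in enumerate(columns) for bit in bits)
    # Largest value the counter can produce fixes the output width.
    max_total = sum(len(bits) << rank for rank, bits in enumerate(columns))
    width = max(1, max_total.bit_length())
    return [(total >> i) & 1 for i in range(width)]


# Five bits of weight 1 and one bit of weight 2: at most 5 + 2 = 7,
# so three output bits suffice.
out = gpc([[1, 1, 0, 1, 1], [1]])
assert out == [0, 1, 1]   # 4 + 2 = 6, which is 110 in binary, LSB first
```

A plain 6:3 counter is the special case with a single column of six bits, for example `gpc([[1] * 6])`.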

Information & Contributors

Information

Published In

ACM Transactions on Reconfigurable Technology and Systems, Volume 2, Issue 3
September 2009
121 pages
ISSN:1936-7406
EISSN:1936-7414
DOI:10.1145/1575774
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 2009
Accepted: 01 June 2009
Revised: 01 February 2009
Received: 01 August 2008
Published in TRETS Volume 2, Issue 3

Author Tags

  1. 6:2 compressor
  2. 7:2 compressor
  3. FPGA
  4. carry chain
  5. compressor tree

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 8
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 17 Nov 2024
