DOI: 10.1145/2145694.2145738
research-article

Reducing the cost of floating-point mantissa alignment and normalization in FPGAs

Published: 22 February 2012

Abstract

In floating-point datapaths synthesized on FPGAs, the shifters that perform mantissa alignment and normalization consume a disproportionate number of LUTs. Shifters are implemented using several rows of small multiplexers; unfortunately, multiplexer-based logic structures map poorly onto LUTs. FPGAs, meanwhile, contain a large number of multiplexers in the programmable routing network; these multiplexers are placed under static control of the FPGA's configuration bitstream. In this work, we modify some of the routing multiplexers in the intra-cluster routing network of a CLB to implement shifters for floating-point mantissa alignment and normalization; the number of CLBs required for these operations is reduced by 67%. If shifting is not required, the modified routing multiplexers can be configured to operate as normal routing multiplexers, so no functionality is sacrificed. The area overhead incurred by these modifications is small, and there is no need to modify every routing multiplexer in the FPGA. Experiments show that there is no negative impact on clock frequency or routability for benchmarks that do not use the dynamic multiplexers.
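
To make the "several rows of small multiplexers" concrete, the following is a minimal software sketch of a logarithmic barrel shifter and of how it could serve alignment and normalization. It is an illustration only, not the circuit or the architecture modification from the paper. All names (`mux2`, `barrel_shift_right`, `align`, `normalize`, and the helpers) are hypothetical, and bit vectors are assumed to be Python lists of 0/1 values with the most significant bit first.

```python
def mux2(sel, a, b):
    """2:1 multiplexer: returns b when sel is 1, otherwise a."""
    return b if sel else a


def to_bits(value, width):
    """Unsigned integer -> list of bits, most significant bit first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (MSB first) -> unsigned integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def barrel_shift_right(bits, amount):
    """Logical right shift built from ceil(log2(n)) rows of 2:1 multiplexers.

    Row k passes its input through or shifts it by 2**k positions,
    selected by bit k of the shift amount. Higher bits of `amount`
    are ignored in this sketch.
    """
    n = len(bits)
    rows = max(1, (n - 1).bit_length())
    for k in range(rows):
        sel = (amount >> k) & 1
        step = 1 << k
        # One row: n multiplexers, each choosing bit i or the bit `step` to its left.
        bits = [mux2(sel, bits[i], bits[i - step] if i >= step else 0)
                for i in range(n)]
    return bits


def barrel_shift_left(bits, amount):
    """Left shift obtained by mirroring the right shifter."""
    return barrel_shift_right(bits[::-1], amount)[::-1]


def align(mant_a, exp_a, mant_b, exp_b):
    """Alignment: right-shift the mantissa with the smaller exponent."""
    def shifted(mant, diff):
        if diff >= len(mant):          # everything shifted out
            return [0] * len(mant)
        return barrel_shift_right(mant, diff)

    if exp_a >= exp_b:
        return mant_a, shifted(mant_b, exp_a - exp_b), exp_a
    return shifted(mant_a, exp_b - exp_a), mant_b, exp_b


def normalize(mant):
    """Normalization: left-shift until the leading bit is 1.

    Returns the shifted mantissa and the shift amount (0 for an all-zero input).
    """
    lz = 0
    for b in mant:
        if b:
            break
        lz += 1
    if lz == len(mant):
        return mant, 0
    return barrel_shift_left(mant, lz), lz
```

In this model an 8-bit shifter uses 3 rows of 8 multiplexers, and a 24-bit mantissa would use 5 rows of 24. Every one of those multiplexers has a select input that changes at run time, which is the kind of structure the abstract describes as mapping poorly onto LUTs.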




    Published In

    FPGA '12: Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
    February 2012
    352 pages
    ISBN: 9781450311557
    DOI: 10.1145/2145694

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. field programmable gate array (FPGA)
    2. floating-point
    3. mantissa alignment
    4. normalization

    Qualifiers

    • Research-article

    Conference

    FPGA '12

    Acceptance Rates

    FPGA '12 Paper Acceptance Rate 20 of 87 submissions, 23%;
    Overall Acceptance Rate 125 of 627 submissions, 20%

    Cited By

    • (2024) Efficient Data Extraction Circuit for Posit Number System: LDD-Based Posit Decoder. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:6 (1919-1923). DOI: 10.1109/TCAD.2023.3347295. Online publication date: Jun-2024
    • (2023) Area-latency efficient floating point adder using interleaved alignment and normalization. Microprocessors and Microsystems 99 (104842). DOI: 10.1016/j.micpro.2023.104842. Online publication date: Jun-2023
    • (2023) Shifters and Leading Bit Counters. Application-Specific Arithmetic (307-327). DOI: 10.1007/978-3-031-42808-1_10. Online publication date: 23-Aug-2023
    • (2020) LeAp: Leading-One Detection-Based Softcore Approximate Multipliers with Tunable Accuracy. Proceedings of the 25th Asia and South Pacific Design Automation Conference (605-610). DOI: 10.1109/ASP-DAC47756.2020.9045171. Online publication date: 17-Jan-2020
    • (2014) Accelerating FPGA debug. ACM Transactions on Design Automation of Electronic Systems 19:2 (1-23). DOI: 10.1145/2566668. Online publication date: 28-Mar-2014
    • (2013) Towards simulator-like observability for FPGAs. Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays (19-28). DOI: 10.1145/2435264.2435272. Online publication date: 11-Feb-2013
    • (2012) On the difficulty of pin-to-wire routing in FPGAs. 22nd International Conference on Field Programmable Logic and Applications (FPL) (83-90). DOI: 10.1109/FPL.2012.6339245. Online publication date: Aug-2012
