research-article

Automatically Optimizing the Latency, Area, and Accuracy of C Programs for High-Level Synthesis

Authors:

John Wickerson,

George A. ConstantinidesAuthors Info & Claims

FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Pages 234 - 243

https://doi.org/10.1145/2847263.2847282

Published: 21 February 2016 Publication History

Abstract

Loops are pervasive in numerical programs, so high-level synthesis (HLS) tools use state-of-the-art scheduling techniques to pipeline them efficiently. Still, the run time performance of the resultant FPGA implementation is limited by data dependences between loop iterations. Some of these dependence constraints can be alleviated by rewriting the program according to arithmetic identities (e.g. associativity and distributivity), memory access reductions, and control flow optimisations (e.g. partial loop unrolling). HLS tools cannot safely enable such rewrites by default because they may impact the accuracy of floating-point computations and increase area usage. In this paper, we introduce the first open-source program optimizer for automatically rewriting a given program to optimize latency while controlling for accuracy and area. Our tool, SOAP3, reports a multi-dimensional Pareto frontier that the programmer can use to resolve the trade-off according to their needs. When applied to a suite of PolyBench and Livermore Loops benchmarks, our tool has generated programs that enjoy up to a 12x speedup, with a simultaneous 7x increase in accuracy, at a cost of up to 4x more LUTs.

References

[1]

W. Meeus, K. Van Beeck, T. Goedemé, J. Meel, and D. Stroobandt, "An Overview of Today's High-Level Synthesis Tools," Design Automation for Embedded Systems, vol. 16, no. 3, pp. 31--51, 2012.

Digital Library

[2]

Berkeley Design Technology, Inc., "An Independent Evaluation of: High-Level Synthesis Tools for Xilinx FPGAs," 2010. {Online}. Available: http://www.xilinx.com/technology/dsp/BDTI_techpaper.pdf

[3]

A. Canis, S. D. Brown, and J. H. Anderson, "Modulo SDC scheduling with recurrence minimization in high-level synthesis," in FPL, 2014.

[4]

N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2002.

Digital Library

[5]

Xilinx, Inc., "Vivado Design Suite User Guide--High-Level Synthesis," 2015.

[6]

X. Gao, S. Bayliss, and G. A. Constantinides, "SOAP\@: Structural optimization of arithmetic expressions for high-level synthesis," in FPT, 2013.

[7]

C. Mouilleron, "Efficient computation with structured matrices and arithmetic expressions," Ph.D. dissertation, ENS LYON, 2011.

[8]

X. Gao and G. A. Constantinides, "Numerical program optimization for high-level synthesis," in FPGA, 2015.

Digital Library

[9]

B. R. Rau, "Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops," in MICRO, 1994.

Digital Library

[10]

J. Dongarra and P. Luszczek, "Livermore Loops," in Encyclopedia of Parallel Computing. Springer US, 2011, pp. 1041--1043.

[11]

L.-N. Pouchet, "PolyBench/C--the Polyhedral Benchmark suite," http://web.cse.ohio-state.edu/ pouchet/software/polybench/.

[12]

S. Verdoolaege, "isl: An Integer Set Library for the Polyhedral Model," in ICMS, 2010.

Digital Library

[13]

R. Bridson, "Fast poisson disk sampling in arbitrary dimensions," in SIGGRAPH Sketches, 2007.

Digital Library

[14]

A. Canis, J. Choi, M. Aldham, V. Zhang, A. Kammoona, J. H. Anderson, S. Brown, and T. Czajkowski, "LegUp: High-level Synthesis for FPGA-based Processor/Accelerator Systems," in FPGA, 2011.

Digital Library

[15]

G. Wang, W. Gong, and R. Kastner, "Operation Scheduling: Algorithms and Applications," in High-Level Synthesis. Springer, 2008.

[16]

P. Li, P. Zhang, L.-N. Pouchet, and J. Cong, "Resource-aware throughput optimization for high-level synthesis," in FPGA, 2015.

Digital Library

[17]

Xilinx, Inc., "Vivado Design Suite User Guide--Synthesis," 2015.

[18]

B. Rau, "Data Flow and Dependence Analysis for Instruction Level Parallelism," in Languages and Compilers for Parallel Computing, ser. LNCS. Springer, 1992, vol. 589, pp. 236--250.

Digital Library

[19]

R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, "Efficiently computing static single assignment form and the control dependence graph," ACM TOPLAS, vol. 13, no. 4, pp. 451--490, Oct. 1991.

Digital Library

[20]

D. D. Gajski and L. Ramachandran, "Introduction to high-level synthesis," IEEE Design and Test of Computers, vol. 11, no. 4, pp. 44--54, Oct. 1994.

Digital Library

[21]

A. Nicolau and R. Potasmann, "Incremental tree height reduction for high level synthesis," in DAC, 1991.

Digital Library

[22]

N. Damouche, M. Martel, and A. Chapoutot, "Intra-procedural optimization of the numerical accuracy of programs," in Formal Methods for Industrial Critical Systems, ser. LNCS. Springer, 2015, vol. 9128, pp. 31--46.

[23]

P. Panchekha, A. Sanchez-Stern, J. R. Wilcox, and Z. Tatlock, "Automatically improving accuracy for floating point expressions," in PLDI, 2015.

Digital Library

[24]

A. Hosangadi, F. Fallah, and R. Kastner, "Factoring and eliminating common subexpressions in polynomial expressions," in ICCAD, 2004.

Digital Library

[25]

A. Peymandoust and G. De Micheli, "Using symbolic algebra in algorithmic level DSP synthesis," in DAC, 2001.

Digital Library

[26]

D. Boland and G. Constantinides, "Revisiting the reduction circuit: A case study for simultaneous architecture and precision optimisation," in FPT, 2013.

Cited By

甘芷(2022)A Review of Spatial Exploration of FPGA DesignComputer Science and Application10.12677/CSA.2022.12410712:04(1043-1053)Online publication date: 2022
https://doi.org/10.12677/CSA.2022.124107
Bannwart Perina ASilitonga ABecker JBonato V(2021)Fast Resource and Timing Aware Design Optimisation for High-Level SynthesisIEEE Transactions on Computers10.1109/TC.2021.3112260(1-1)Online publication date: 2021
https://doi.org/10.1109/TC.2021.3112260
Ferres BMuller ORousseau F(2021)Integrating Quick Resource Estimators in Hardware Construction Framework for Design Space Exploration2021 IEEE International Workshop on Rapid System Prototyping (RSP)10.1109/RSP53691.2021.9806276(64-70)Online publication date: 14-Oct-2021
https://doi.org/10.1109/RSP53691.2021.9806276
Show More Cited By

Index Terms

Automatically Optimizing the Latency, Area, and Accuracy of C Programs for High-Level Synthesis

Recommendations

Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis
FPGA '17: Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Current pipelining approach in high-level synthesis (HLS) achieves high performance for applications with regular and statically analyzable memory access patterns. However, it cannot effectively handle infrequent data-dependent structural and data ...
Numerical Program Optimization for High-Level Synthesis
FPGA '15: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

This paper introduces a new technique, and its associated open source tool, SOAP2, to automatically perform source-to-source optimization of numerical programs, specifically targeting the trade-off between numerical accuracy and resource usage as a high-...
Non-iterative SDC modulo scheduling for high-level synthesis
Abstract
High-level synthesis is a powerful tool for increasing productivity in digital hardware design. However, as digital systems become larger and more complex, designers have to consider an increased number of optimizations and directives ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 2016

298 pages

ISBN:9781450338561

DOI:10.1145/2847263

General Chair:
Deming Chen
University of Illinois at Urbana-Champaign, USA
,
Program Chair:
Jonathan Greene
Microsemi, USA

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 February 2016

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

FPGA'16

Sponsor:

SIGDA

FPGA'16: The 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 21 - 23, 2016

California, Monterey, USA

Acceptance Rates

FPGA '16 Paper Acceptance Rate 20 of 111 submissions, 18%;

Overall Acceptance Rate 125 of 627 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

12
Total Citations
View Citations
344
Total Downloads

Downloads (Last 12 months)14
Downloads (Last 6 weeks)1

Reflects downloads up to 21 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

甘芷(2022)A Review of Spatial Exploration of FPGA DesignComputer Science and Application10.12677/CSA.2022.12410712:04(1043-1053)Online publication date: 2022
https://doi.org/10.12677/CSA.2022.124107
Bannwart Perina ASilitonga ABecker JBonato V(2021)Fast Resource and Timing Aware Design Optimisation for High-Level SynthesisIEEE Transactions on Computers10.1109/TC.2021.3112260(1-1)Online publication date: 2021
https://doi.org/10.1109/TC.2021.3112260
Ferres BMuller ORousseau F(2021)Integrating Quick Resource Estimators in Hardware Construction Framework for Design Space Exploration2021 IEEE International Workshop on Rapid System Prototyping (RSP)10.1109/RSP53691.2021.9806276(64-70)Online publication date: 14-Oct-2021
https://doi.org/10.1109/RSP53691.2021.9806276
Santos TCardoso J(2020)Automatic Selection and Insertion of HLS Directives Via a Source-to-Source Compiler2020 International Conference on Field-Programmable Technology (ICFPT)10.1109/ICFPT51103.2020.00039(227-232)Online publication date: Dec-2020
https://doi.org/10.1109/ICFPT51103.2020.00039
Huang LLi DWang KGao TTavares A(2020)A Survey on Performance Optimization of High-Level Synthesis ToolsJournal of Computer Science and Technology10.1007/s11390-020-9414-835:3(697-720)Online publication date: 29-May-2020
https://doi.org/10.1007/s11390-020-9414-8
Santiago da Silva JBoyer FLanglois J(2019)Module-per-Object: A Human-Driven Methodology for C++-Based High-Level Synthesis Design2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)10.1109/FCCM.2019.00037(218-226)Online publication date: Apr-2019
https://doi.org/10.1109/FCCM.2019.00037
Wang SPrakash AMitra T(2018)Software Support for Heterogeneous Computing2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)10.1109/ISVLSI.2018.00142(756-762)Online publication date: Jul-2018
https://doi.org/10.1109/ISVLSI.2018.00142
Zhao JFeng LSinha SZhang WLiang YHe BParameswaran S(2017)COMBAProceedings of the 36th International Conference on Computer-Aided Design10.5555/3199700.3199757(430-437)Online publication date: 13-Nov-2017
https://dl.acm.org/doi/10.5555/3199700.3199757
Zhong GPrakash AWang SLiang YMitra TNiar S(2017)Design space exploration of FPGA-based accelerators with multi-level parallelismProceedings of the Conference on Design, Automation & Test in Europe10.5555/3130379.3130649(1141-1146)Online publication date: 27-Mar-2017
https://dl.acm.org/doi/10.5555/3130379.3130649
Zhong GPrakash AWang SLiang YMitra TNiar S(2017)Design Space exploration of FPGA-based accelerators with multi-level parallelismDesign, Automation & Test in Europe Conference & Exhibition (DATE), 201710.23919/DATE.2017.7927161(1141-1146)Online publication date: Mar-2017
https://doi.org/10.23919/DATE.2017.7927161
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents