DOI: 10.1145/3535044.3535051

A single-source C++20 HLS flow for function evaluation on FPGA and beyond.

Published: 09 June 2022

Abstract

This paper presents a framework for reusing the intelligence of RTL generators in a single-source HLS setting. The framework is illustrated by a C++ fixed-point library that generates mathematical function evaluators. A compiler flow from C++20 to Vivado IPs has been developed to make the library usable with Vitis HLS. This flow is demonstrated on two applications: an adder for the logarithmic number system, and additive sound synthesis. These experiments show that the approach makes it easy to tune the precision of the types used in the application. They also demonstrate the ability to generate arbitrary function evaluators at the required precision.
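To make the idea of a fixed-point function evaluator with tunable precision concrete, here is a minimal C++ sketch. It is not the paper's library or its API: the class `TableEvaluator`, its template parameters `InBits` and `OutFracBits`, and the helper `sine_period` are hypothetical names chosen for illustration. The sketch assumes a plain lookup table filled at construction time, whereas the generators discussed in the paper are presumably more elaborate.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only, not the API of the library described above.
// Tabulates f on [0, 1). The input is an unsigned fixed-point value with
// InBits fractional bits; the output is a signed fixed-point value with
// OutFracBits fractional bits, rounded to nearest.
template <unsigned InBits, unsigned OutFracBits>
class TableEvaluator {
    static_assert(InBits <= 16, "table would be too large for this sketch");
    static_assert(OutFracBits <= 30, "output must fit in 32 bits");

public:
    // Assumption: |f(x)| <= 1 on [0, 1), so every entry fits in std::int32_t.
    explicit TableEvaluator(double (*f)(double)) {
        for (std::size_t i = 0; i < table_.size(); ++i) {
            const double x =
                static_cast<double>(i) / static_cast<double>(table_.size());
            const double y = f(x);
            assert(std::fabs(y) <= 1.0);
            // Scale by 2^OutFracBits and round to the nearest integer.
            table_[i] = static_cast<std::int32_t>(
                std::lround(std::ldexp(y, static_cast<int>(OutFracBits))));
        }
    }

    // Evaluate: keep the low InBits bits of the input and read the table.
    std::int32_t operator()(std::uint32_t x_fixed) const {
        return table_[x_fixed & (table_.size() - 1)];
    }

    // Convert a fixed-point result back to double for inspection.
    static double to_double(std::int32_t y) {
        return std::ldexp(static_cast<double>(y),
                          -static_cast<int>(OutFracBits));
    }

private:
    std::array<std::int32_t, (std::size_t{1} << InBits)> table_{};
};

// One period of a sine wave mapped onto [0, 1), as an oscillator in
// additive synthesis might use.
inline double sine_period(double x) {
    return std::sin(2.0 * 3.14159265358979323846 * x);
}
```

With this sketch, `TableEvaluator<10, 12> sine(sine_period);` builds a 1024-entry table with 12 fractional output bits. Changing the two template arguments is all it takes to trade table size against accuracy, which is the kind of precision tuning the abstract refers to.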


Cited By

  • (2023) Fortran High-Level Synthesis: Reducing the Barriers to Accelerating HPC Codes on FPGAs. 2023 33rd International Conference on Field-Programmable Logic and Applications (FPL), 10–18. https://doi.org/10.1109/FPL60245.2023.00010. Online publication date: 4 Sep 2023.


Published In

HEART '22: Proceedings of the 12th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies
June 2022
114 pages
ISBN:9781450396608
DOI:10.1145/3535044
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HEART2022

Acceptance Rates

HEART '22 Paper Acceptance Rate 10 of 21 submissions, 48%;
Overall Acceptance Rate 22 of 50 submissions, 44%
