
DeepBurning: automatic generation of FPGA-based learning accelerators for the neural network family

Published: 05 June 2016

Abstract

Recent advances in Neural Networks (NNs) are enabling more and more innovative applications. As an energy-efficient hardware solution, machine learning accelerators for CNNs or traditional ANNs are also gaining popularity in embedded vision, robotics, and cyber-physical systems. However, the design parameters of NN models vary significantly from application to application, so it is hard to provide one general, highly efficient hardware solution that accommodates all of them, and it is also impractical for domain-specific developers to customize their own hardware for a specific NN model. To resolve this dilemma, this study proposes a design automation tool, DeepBurning, that allows application developers to build, from scratch, learning accelerators targeting their specific NN models, with custom configurations and optimized performance. DeepBurning includes an RTL-level accelerator generator and a coordinated compiler that generates the control flow and data layout under user-specified constraints. The results can be used to implement an FPGA-based NN accelerator or to guide chip design at an early design stage. In general, DeepBurning supports a large family of NN models and greatly simplifies the accelerator design flow for machine learning and AI application developers. The evaluation shows that the generated learning accelerators, burnt onto our FPGA board, exhibit superior power efficiency compared to state-of-the-art FPGA-based solutions.
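
The abstract describes a two-part flow: an RTL-level generator plus a coordinated compiler that fixes control flow and data layout under user-specified constraints. As a minimal sketch of that idea, the hypothetical Python below shows the kind of model description and resource budget such a flow might take as input. Every field name, the layer schema, and the generate_accelerator stub are illustrative assumptions for this sketch, not DeepBurning's actual interface.

    # Hypothetical sketch of the inputs a DeepBurning-style flow consumes:
    # a user-specified NN model plus resource constraints, from which the
    # tool would emit RTL and a control/data-layout plan. All field names
    # below are invented for illustration.

    nn_model = {
        "name": "lenet-like-cnn",
        "layers": [
            {"type": "conv", "in_channels": 1,  "out_channels": 6,  "kernel": 5},
            {"type": "pool", "kind": "max",     "window": 2},
            {"type": "conv", "in_channels": 6,  "out_channels": 16, "kernel": 5},
            {"type": "pool", "kind": "max",     "window": 2},
            {"type": "fc",   "in_features": 400, "out_features": 10},
        ],
    }

    constraints = {
        "target": "fpga",       # or an early-stage chip-design estimate
        "max_dsp_slices": 220,  # resource budget on the target board
        "max_bram_kb": 2048,
        "clock_mhz": 150,
    }

    def generate_accelerator(model, constraints):
        """Placeholder for the two cooperating components the abstract
        describes: an RTL-level accelerator generator and a compiler that
        derives the control flow and data layout under the constraints."""
        rtl = f"// RTL stub for {model['name']} within {constraints['max_dsp_slices']} DSPs"
        schedule = [f"layer {i}: {layer['type']}"
                    for i, layer in enumerate(model["layers"])]
        return rtl, schedule

    rtl, schedule = generate_accelerator(nn_model, constraints)
    print(rtl)
    print("\n".join(schedule))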





          Reviews

          Stewart Mark Godwin

Technically complex, this paper uses numerous acronyms common in specialist areas like electronics and engineering. The topic, however, can be summarized simply: DeepBurning is a design automation tool that lets application developers build a learning accelerator for a specific neural network. The process uses field-programmable gate arrays (FPGAs), which are designed to be modified and configured to suit problems in areas like machine learning and artificial intelligence. In the paper, the DeepBurning framework is evaluated on eight neural networks, with comparisons of performance, power consumption, and accuracy. In conclusion, the authors indicate that DeepBurning enables the near-instant generation of hardware and software solutions for specific neural networks. Neither the paper nor the general topic is for the layperson; it would be of interest mainly to industry experts and academics in this field.


Information

          Published In

          DAC '16: Proceedings of the 53rd Annual Design Automation Conference
          June 2016
          1048 pages
          ISBN:9781450342360
          DOI:10.1145/2897937

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 05 June 2016


          Qualifiers

          • Research-article

          Conference

          DAC '16

          Acceptance Rates

          Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


          Bibliometrics & Citations


          Article Metrics

• Downloads (last 12 months): 142
• Downloads (last 6 weeks): 17
Reflects downloads up to 27 Nov 2024.


          Cited By

• (2024) "Dielectric Elastomer-Based Actuators: A Modeling and Control Review for Non-Experts," Actuators, 13(4), 151. DOI: 10.3390/act13040151. Online publication date: 17-Apr-2024.
• (2024) "CUTE: A scalable CPU-centric and Ultra-utilized Tensor Engine for convolutions," Journal of Systems Architecture, 149, 103106. DOI: 10.1016/j.sysarc.2024.103106. Online publication date: Apr-2024.
• (2024) "Review of neural network model acceleration techniques based on FPGA platforms," Neurocomputing, article 128511. DOI: 10.1016/j.neucom.2024.128511. Online publication date: Aug-2024.
• (2023) "Smart Embedded System for Skin Cancer Classification," Future Internet, 15(2), 52. DOI: 10.3390/fi15020052. Online publication date: 29-Jan-2023.
• (2023) "An Approach to the Implementation of a Neural Network for Cryptographic Protection of Data Transmission at UAV," Drones, 7(8), 507. DOI: 10.3390/drones7080507. Online publication date: 2-Aug-2023.
• (2023) "Designing Deep Learning Models on FPGA with Multiple Heterogeneous Engines," ACM Transactions on Reconfigurable Technology and Systems, 17(1), 1-30. DOI: 10.1145/3615870. Online publication date: 10-Oct-2023.
• (2023) "FPGA-based Deep Learning Inference Accelerators: Where Are We Standing?," ACM Transactions on Reconfigurable Technology and Systems, 16(4), 1-32. DOI: 10.1145/3613963. Online publication date: 9-Oct-2023.
• (2023) "Accelerating Deformable Convolution Networks with Dynamic and Irregular Memory Accesses," ACM Transactions on Design Automation of Electronic Systems, 28(4), 1-23. DOI: 10.1145/3597431. Online publication date: 18-Jul-2023.
• (2023) "Scaling Qubit Readout with Hardware Efficient Machine Learning Architectures," Proceedings of the 50th Annual International Symposium on Computer Architecture, 1-13. DOI: 10.1145/3579371.3589042. Online publication date: 17-Jun-2023.
• (2023) "CNNFlow: Memory-driven Data Flow Optimization for Convolutional Neural Networks," ACM Transactions on Design Automation of Electronic Systems, 28(3), 1-36. DOI: 10.1145/3577017. Online publication date: 19-Mar-2023.
