Lian, 2016 - Google Patents
A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality
- Document ID
- 16093079018434018335
- Author
- Lian R
- Publication year
- 2016
Snippet
Deep neural networks (DNNs) are achieving state-of-the-art performance in many artificial intelligence tasks, such as computer vision and speech recognition. Due to the high computational requirements of DNNs, there is an increasing need to design custom hardware …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Ma et al. | ALAMO: FPGA acceleration of deep learning algorithms with a modularized RTL compiler | |
Liu et al. | Throughput-optimized FPGA accelerator for deep convolutional neural networks | |
Azarkhish et al. | Neurostream: Scalable and energy efficient deep learning with smart memory cubes | |
Ma et al. | Scalable and modularized RTL compilation of convolutional neural networks onto FPGA | |
US11615297B2 (en) | Structured weight based sparsity in an artificial neural network compiler | |
Ma et al. | An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks | |
CN108268943B (en) | Hardware accelerator engine | |
Chen et al. | Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning | |
Mittal | A survey of accelerator architectures for 3D convolution neural networks | |
Chen et al. | A small-footprint accelerator for large-scale neural networks | |
US20200285892A1 (en) | Structured Weight Based Sparsity In An Artificial Neural Network | |
US20200279133A1 (en) | Structured Sparsity Guided Training In An Artificial Neural Network | |
US11934308B2 (en) | Processor cluster address generation | |
Basalama et al. | FlexCNN: An end-to-end framework for composing CNN accelerators on FPGA | |
Shahrouzi et al. | Optimized hardware accelerators for data mining applications on embedded platforms: Case study principal component analysis | |
Lian | A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality | |
Fan et al. | DT-CGRA: Dual-track coarse-grained reconfigurable architecture for stream applications | |
Hosseiny et al. | Hardware acceleration of YOLOv7-tiny using high-level synthesis tools | |
Saidani et al. | Hardware Acceleration for Object Detection using YOLOv5 Deep Learning Algorithm on Xilinx Zynq FPGA Platform | |
Hong et al. | Survey of convolutional neural network accelerators on field-programmable gate array platforms: architectures and optimization techniques | |
Santoro | Exploring New Computing Paradigms for Data-Intensive Applications | |
Gonçalves et al. | Exploring data size to run convolutional neural networks in low density FPGAs | |
Li | Acceleration of Deep Learning on FPGA | |
Miro | FPGA-Based Accelerators for Convolutional Neural Networks on Embedded Devices | |
Ma | Hardware Acceleration of Deep Convolutional Neural Networks on FPGA |