Lian, 2016 - Google Patents

A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality

Document ID: 16093079018434018335
Author: Lian R
Publication year: 2016

Snippet

Deep neural networks (DNN) are achieving state-of-the-art performance in many artificial intelligence tasks, such as computer vision and speech recognition. Due to the high computational requirements of DNN, there is an increasing need to design custom hardware …
Full text: central.bac-lac.gc.ca (PDF)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for programme control, e.g. control unit
    • G06F9/06: Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30: Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored programme computers
    • G06F15/78: Architectures of general purpose stored programme computers comprising a single central processing unit
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50: Computer-aided design
    • G06F17/5045: Circuit design
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50: Computer-aided design
    • G06F17/5009: Computer-aided design using simulation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored programme computers
    • G06F15/80: Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007: Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/16: Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163: Interprocessor communication
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00: Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10: Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36: Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46: Extraction of features or characteristics of the image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00: Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing

Similar Documents

Publication | Title
Ma et al. | ALAMO: FPGA acceleration of deep learning algorithms with a modularized RTL compiler
Liu et al. | Throughput-optimized FPGA accelerator for deep convolutional neural networks
Azarkhish et al. | Neurostream: Scalable and energy efficient deep learning with smart memory cubes
Ma et al. | Scalable and modularized RTL compilation of convolutional neural networks onto FPGA
US11615297B2 (en) | Structured weight based sparsity in an artificial neural network compiler
Ma et al. | An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks
CN108268943B (en) | Hardware accelerator engine
Chen et al. | DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning
Mittal | A survey of accelerator architectures for 3D convolution neural networks
Chen et al. | A small-footprint accelerator for large-scale neural networks
US20200285892A1 (en) | Structured Weight Based Sparsity In An Artificial Neural Network
US20200279133A1 (en) | Structured Sparsity Guided Training In An Artificial Neural Network
US11934308B2 (en) | Processor cluster address generation
Basalama et al. | FlexCNN: An end-to-end framework for composing CNN accelerators on FPGA
Shahrouzi et al. | Optimized hardware accelerators for data mining applications on embedded platforms: Case study principal component analysis
Lian | A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality
Fan et al. | DT-CGRA: Dual-track coarse-grained reconfigurable architecture for stream applications
Hosseiny et al. | Hardware acceleration of YOLOv7-tiny using high-level synthesis tools
Saidani et al. | Hardware Acceleration for Object Detection using YOLOv5 Deep Learning Algorithm on Xilinx Zynq FPGA Platform
Hong et al. | Survey of convolutional neural network accelerators on field-programmable gate array platforms: architectures and optimization techniques
Santoro | Exploring New Computing Paradigms for Data-Intensive Applications
Gonçalves et al. | Exploring data size to run convolutional neural networks in low density FPGAs
Li | Acceleration of Deep Learning on FPGA
Miro | FPGA-Based Accelerators for Convolutional Neural Networks on Embedded Devices
Ma | Hardware Acceleration of Deep Convolutional Neural Networks on FPGA