
Sudrajat et al., 2019 - Google Patents

GEMM-Based Quantized Neural Network FPGA Accelerator Design


Document ID
1817960969226360832
Author
Sudrajat M
Adiono T
Syafalni I
Publication year
2019
Publication venue
2019 International Symposium on Electronics and Smart Devices (ISESD)

External Links

Snippet

In this study, we will explore Neural Network based FPGA acceleration based on accelerating General Matrix Multiplication (GEMM). GEMM acceleration allows regularized and modular implementation of accelerator design, as well as providing the benefits of …
Full text available at ieeexplore.ieee.org.
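The snippet describes lowering neural-network inference to General Matrix Multiplication with quantized operands, so that a regular integer MAC array can do the work. A minimal sketch of that idea in NumPy (the quantization scheme, function names, and parameters here are illustrative assumptions, not the paper's actual design):

```python
import numpy as np

np.random.seed(0)

def quantize(x, num_bits=8):
    """Affine-quantize a float matrix to signed integers (illustrative scheme)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = float(x.max() - x.min()) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate constant input
    zero_point = int(round(qmin - float(x.min()) / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def quantized_gemm(qa, sa, za, qb, sb, zb):
    """Integer GEMM plus dequantization: approximates A @ B."""
    # Accumulate in wide integers, as an FPGA MAC array would,
    # then rescale the accumulator back to floating point.
    acc = (qa - za) @ (qb - zb)   # int32 accumulation
    return sa * sb * acc          # dequantize

A = np.random.randn(4, 8).astype(np.float32)
B = np.random.randn(8, 3).astype(np.float32)
qa, sa, za = quantize(A)
qb, sb, zb = quantize(B)
approx = quantized_gemm(qa, sa, za, qb, sb, zb)
```

With 8-bit quantization, `approx` stays close to the exact `A @ B` while the inner loop uses only integer multiply-accumulates, which is the property that makes the GEMM formulation attractive on FPGA DSP blocks.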

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/53 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G06F15/78 Architectures of general purpose stored programme computers comprising a single central processing unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • H ELECTRICITY
    • H03 BASIC ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Similar Documents

Publication / Title
Jang et al. Sparsity-aware and re-configurable NPU architecture for Samsung flagship mobile SoC
US20210374503A1 (en) Network-centric architecture and algorithms to accelerate distributed training of neural networks
US20180197084A1 (en) Convolutional neural network system having binary parameter and operation method thereof
WO2020057161A1 (en) Split accumulator for convolutional neural network accelerator
CN110555516B (en) Method for realizing low-delay hardware accelerator of YOLOv2-tiny neural network based on FPGA
CN110543939B (en) Hardware acceleration realization device for convolutional neural network backward training based on FPGA
US11948069B2 (en) Compression of neural network activation data
EP3637327B1 (en) Computing device and method
CN110110852B (en) Method for transplanting deep learning network to FPGA platform
Struharik et al. CoNNA: compressed CNN hardware accelerator
Piyasena et al. Reducing dynamic power in streaming CNN hardware accelerators by exploiting computational redundancies
Li et al. An efficient CNN accelerator using inter-frame data reuse of videos on FPGAs
Wu et al. SkeletonGCN: a simple yet effective accelerator for GCN training
Niu et al. SPEC2: Spectral sparse CNN accelerator on FPGAs
Zhan et al. Field programmable gate array‐based all‐layer accelerator with quantization neural networks for sustainable cyber‐physical systems
Sudrajat et al. GEMM-Based Quantized Neural Network FPGA Accelerator Design
Xiao et al. Research on FPGA-based convolutional neural network acceleration method
Zhou et al. Design and implementation of YOLOv3-Tiny accelerator based on PYNQ-Z2 heterogeneous platform
Xiao et al. A MobileNet accelerator with high processing-element efficiency on FPGA
Zhao et al. HDSuper: High-Quality and High Computational Utilization Edge Super-Resolution Accelerator With Hardware-Algorithm Co-Design Techniques
Jo et al. Bit-serial multiplier based neural processing element with approximate adder tree
Li et al. A 0.13 mJ/Prediction CIFAR-100 Raster-Scan-Based Wired-Logic Processor Using Non-Linear Neural Network
Sharma et al. Hardware accelerator for object detection using tiny YOLO-v3
US20220121915A1 (en) Configurable bnn asic using a network of programmable threshold logic standard cells
Huang et al. A low-bit quantized and hls-based neural network fpga accelerator for object detection