Nothing Special   »   [go: up one dir, main page]

Meng et al., 2021 - Google Patents

How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodology

Meng et al., 2021

View PDF
Document ID
3969322703362294321
Author
Meng Y
Kuppannagari S
Kannan R
Prasanna V
Publication year
Publication venue
2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC)

External Links

Snippet

Fractionally Strided Convolution (FSC) is a key operation in popular image-based Deep Learning models, for example, back propagation in CNN training, the decoding stage of convolutional auto-encoders and generative CNNs (GAN), etc. FSC typically performs up …
Continue reading at par.nsf.gov (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • G06F17/142Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/80Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5045Circuit design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/78Architectures of general purpose stored programme computers comprising a single central processing unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/11Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/12Simultaneous equations, e.g. systems of linear equations
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52Multiplying; Dividing
    • G06F7/523Multiplying only
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2217/00Indexing scheme relating to computer aided design [CAD]
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models

Similar Documents

Publication Publication Date Title
Liang et al. Evaluating fast algorithms for convolutional neural networks on FPGAs
Gschwend Zynqnet: An fpga-accelerated embedded convolutional neural network
JP7329533B2 (en) Method and accelerator apparatus for accelerating operations
Liu et al. Optimizing CNN-based segmentation with deeply customized convolutional and deconvolutional architectures on FPGA
Albericio et al. Cnvlutin: Ineffectual-neuron-free deep neural network computing
Mittal A survey of accelerator architectures for 3D convolution neural networks
Kim et al. FPGA-based CNN inference accelerator synthesized from multi-threaded C software
Zhao et al. Dnn-chip predictor: An analytical performance predictor for dnn accelerators with various dataflows and hardware architectures
Liu et al. Towards an efficient accelerator for DNN-based remote sensing image segmentation on FPGAs
Kästner et al. Hardware/software codesign for convolutional neural networks exploiting dynamic partial reconfiguration on PYNQ
Kim et al. A fully pipelined fpga architecture of a factored restricted boltzmann machine artificial neural network
Gu et al. DLUX: A LUT-based near-bank accelerator for data center deep learning training workloads
Chen et al. Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3d reram
Liu et al. WinoCNN: Kernel sharing Winograd systolic array for efficient convolutional neural network acceleration on FPGAs
Xu et al. CaFPGA: An automatic generation model for CNN accelerator
Zhang et al. Fitnn: A low-resource fpga-based cnn accelerator for drones
Nguyen et al. ShortcutFusion: From tensorflow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
Huang et al. IECA: An in-execution configuration CNN accelerator with 30.55 GOPS/mm² area efficiency
Lee et al. Specializing CGRAs for light-weight convolutional neural networks
Meng et al. Dynamap: Dynamic algorithm mapping framework for low latency cnn inference
Meng et al. How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodology
Shrivastava et al. A survey of hardware architectures for generative adversarial networks
Fan et al. Hardware and Algorithm Co-Optimization for pointwise convolution and channel shuffle in ShuffleNet V2
Yang et al. AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP
Park et al. An FPGA-based accelerator for cortical object classification