Meng et al., 2021 - Google Patents

How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodology

Document ID: 3969322703362294321
Author: Meng Y; Kuppannagari S; Kannan R; Prasanna V
Publication year: 2021
Publication venue: 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC)

Snippet

Fractionally Strided Convolution (FSC) is a key operation in popular image-based deep learning models; it appears in, for example, backpropagation in CNN training and the decoding stage of convolutional auto-encoders and generative CNNs (GANs). FSC typically performs up …
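
As background for the "zero-spacing" named in the title, the following is a minimal 1-D sketch in plain Python. It is not the method of Meng et al.; it only illustrates the commonly described equivalence between a fractionally strided (transposed) convolution and a dense convolution over an input with zeros inserted between samples. The function names (fsc_scatter, fsc_zero_spaced, zero_multiply_fraction) and the example values are hypothetical and chosen for illustration.

```python
def fsc_scatter(x, w, stride):
    """Transposed 1-D convolution, scatter form: each input sample
    adds a scaled copy of the kernel at offset i * stride."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def fsc_zero_spaced(x, w, stride):
    """Same result via zero insertion followed by a dense convolution."""
    k = len(w)
    # Insert (stride - 1) zeros between neighbouring input samples.
    spaced = [0.0] * ((len(x) - 1) * stride + 1)
    spaced[::stride] = x
    # Pad k - 1 zeros on both sides so every kernel offset is covered.
    padded = [0.0] * (k - 1) + spaced + [0.0] * (k - 1)
    flipped = w[::-1]  # convolution = correlation with the flipped kernel
    return [
        sum(padded[j + t] * flipped[t] for t in range(k))
        for j in range(len(padded) - k + 1)
    ]


def zero_multiply_fraction(n, k, stride):
    """Fraction of multiplications in the zero-spaced form whose input
    operand is an inserted or padded zero (n inputs, kernel size k)."""
    total = ((n - 1) * stride + k) * k  # one k-wide dot product per output
    useful = n * k                      # each input meets each weight once
    return 1 - useful / total


x = [1.0, 2.0, 3.0]
w = [1.0, 10.0, 100.0]
assert fsc_scatter(x, w, 2) == fsc_zero_spaced(x, w, 2)
```

Under these assumptions, zero_multiply_fraction(4, 3, 2) evaluates to 1 - 12/27, so roughly 56% of the multiplications in the zero-spaced form act on zeros. That wasted work is the general motivation for avoiding zero-spacing.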

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models

Similar Documents

Publication	Publication Date	Title
Liang et al.	2019	Evaluating fast algorithms for convolutional neural networks on FPGAs
Gschwend	2020	Zynqnet: An FPGA-accelerated embedded convolutional neural network
JP7329533B2 (en)	2023-08-18	Method and accelerator apparatus for accelerating operations
Liu et al.	2018	Optimizing CNN-based segmentation with deeply customized convolutional and deconvolutional architectures on FPGA
Albericio et al.	2016	Cnvlutin: Ineffectual-neuron-free deep neural network computing
Mittal	2021	A survey of accelerator architectures for 3D convolution neural networks
Kim et al.	2017	FPGA-based CNN inference accelerator synthesized from multi-threaded C software
Zhao et al.	2020	DNN-chip predictor: An analytical performance predictor for DNN accelerators with various dataflows and hardware architectures
Liu et al.	2019	Towards an efficient accelerator for DNN-based remote sensing image segmentation on FPGAs
Kästner et al.	2018	Hardware/software codesign for convolutional neural networks exploiting dynamic partial reconfiguration on PYNQ
Kim et al.	2014	A fully pipelined FPGA architecture of a factored restricted Boltzmann machine artificial neural network
Gu et al.	2020	DLUX: A LUT-based near-bank accelerator for data center deep learning training workloads
Chen et al.	2019	Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3D ReRAM
Liu et al.	2021	WinoCNN: Kernel sharing Winograd systolic array for efficient convolutional neural network acceleration on FPGAs
Xu et al.	2018	CaFPGA: An automatic generation model for CNN accelerator
Zhang et al.	2022	Fitnn: A low-resource FPGA-based CNN accelerator for drones
Nguyen et al.	2022	ShortcutFusion: From TensorFlow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
Huang et al.	2021	IECA: An in-execution configuration CNN accelerator with 30.55 GOPS/mm² area efficiency
Lee et al.	2021	Specializing CGRAs for light-weight convolutional neural networks
Meng et al.	2021	Dynamap: Dynamic algorithm mapping framework for low latency CNN inference
Meng et al.	2021	How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodology
Shrivastava et al.	2021	A survey of hardware architectures for generative adversarial networks
Fan et al.	2021	Hardware and Algorithm Co-Optimization for pointwise convolution and channel shuffle in ShuffleNet V2
Yang et al.	2023	AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP
Park et al.	2012	An FPGA-based accelerator for cortical object classification