Meng et al., 2021 - Google Patents
How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodologyMeng et al., 2021
View PDF- Document ID
- 3969322703362294321
- Author
- Meng Y
- Kuppannagari S
- Kannan R
- Prasanna V
- Publication year
- Publication venue
- 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC)
External Links
Snippet
Fractionally Strided Convolution (FSC) is a key operation in popular image-based Deep Learning models, for example, back propagation in CNN training, the decoding stage of convolutional auto-encoders and generative CNNs (GAN), etc. FSC typically performs up …
- 238000000034 method 0 title abstract description 38
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liang et al. | Evaluating fast algorithms for convolutional neural networks on FPGAs | |
Gschwend | Zynqnet: An fpga-accelerated embedded convolutional neural network | |
JP7329533B2 (en) | Method and accelerator apparatus for accelerating operations | |
Liu et al. | Optimizing CNN-based segmentation with deeply customized convolutional and deconvolutional architectures on FPGA | |
Albericio et al. | Cnvlutin: Ineffectual-neuron-free deep neural network computing | |
Mittal | A survey of accelerator architectures for 3D convolution neural networks | |
Kim et al. | FPGA-based CNN inference accelerator synthesized from multi-threaded C software | |
Zhao et al. | Dnn-chip predictor: An analytical performance predictor for dnn accelerators with various dataflows and hardware architectures | |
Liu et al. | Towards an efficient accelerator for DNN-based remote sensing image segmentation on FPGAs | |
Kästner et al. | Hardware/software codesign for convolutional neural networks exploiting dynamic partial reconfiguration on PYNQ | |
Kim et al. | A fully pipelined fpga architecture of a factored restricted boltzmann machine artificial neural network | |
Gu et al. | DLUX: A LUT-based near-bank accelerator for data center deep learning training workloads | |
Chen et al. | Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3d reram | |
Liu et al. | WinoCNN: Kernel sharing Winograd systolic array for efficient convolutional neural network acceleration on FPGAs | |
Xu et al. | CaFPGA: An automatic generation model for CNN accelerator | |
Zhang et al. | Fitnn: A low-resource fpga-based cnn accelerator for drones | |
Nguyen et al. | ShortcutFusion: From tensorflow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data | |
Huang et al. | IECA: An in-execution configuration CNN accelerator with 30.55 GOPS/mm² area efficiency | |
Lee et al. | Specializing CGRAs for light-weight convolutional neural networks | |
Meng et al. | Dynamap: Dynamic algorithm mapping framework for low latency cnn inference | |
Meng et al. | How to avoid zero-spacing in fractionally-strided convolution? a hardware-algorithm co-design methodology | |
Shrivastava et al. | A survey of hardware architectures for generative adversarial networks | |
Fan et al. | Hardware and Algorithm Co-Optimization for pointwise convolution and channel shuffle in ShuffleNet V2 | |
Yang et al. | AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP | |
Park et al. | An FPGA-based accelerator for cortical object classification |