Zhou et al., 2018 - Google Patents
Addressing sparsity in deep neural networks
- Document ID
- 14673694380208102852
- Author
- Zhou X
- Du Z
- Zhang S
- Zhang L
- Lan H
- Liu S
- Li L
- Guo Q
- Chen T
- Chen Y
- Publication year
- 2018
- Publication venue
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Snippet
Neural networks (NNs) have been demonstrated to be useful in a broad range of applications, such as image recognition, automatic translation, and advertisement recommendation. State-of-the-art NNs are known to be both computationally and memory …
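
The snippet and the sum-of-products classification below point at sparsity-aware acceleration of NN layers. As a purely illustrative sketch (not the patent's circuitry; every name here is hypothetical), the Python below shows the software analogue of that idea: compressing a weight matrix so that only nonzero entries are stored, and computing the sum of products while skipping zero operands.

```python
import numpy as np

def to_sparse(weights, threshold=0.0):
    """Compress a weight matrix into (values, column indices, row pointers),
    a CSR-like layout storing only entries whose magnitude exceeds threshold."""
    values, col_idx, row_ptr = [], [], [0]
    for row in weights:
        for j, w in enumerate(row):
            if abs(w) > threshold:
                values.append(w)
                col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def sparse_matvec(values, col_idx, row_ptr, x):
    """Sum-of-products for one fully connected layer, touching only nonzero weights."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# Example: a roughly 70%-sparse layer; dense and sparse results agree,
# but the sparse version performs multiply-accumulates only for nonzeros.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128)) * (rng.random((64, 128)) > 0.7)
x = rng.standard_normal(128)
vals, cols, ptrs = to_sparse(W)
assert np.allclose(W @ x, sparse_matvec(vals, cols, ptrs, x))
```

The point of the sketch is only the trade-off it exposes: extra index bookkeeping in exchange for fewer multiply-accumulate operations, which is the trade-off sparsity-oriented accelerators realize in hardware.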
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G06F9/3891—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Publication Date | Title
---|---|---
Zhang et al. | | Cambricon-X: An accelerator for sparse neural networks
Conti et al. | | XNOR neural engine: A hardware accelerator IP for 21.6-fJ/op binary neural network inference
Dhilleswararao et al. | | Efficient hardware architectures for accelerating deep neural networks: Survey
Shawahna et al. | | FPGA-based accelerators of deep learning networks for learning and classification: A review
Boutros et al. | | You cannot improve what you do not measure: FPGA vs. ASIC efficiency gaps for convolutional neural network inference
CN209560590U (en) | | Electronic Devices and Integrated Circuits
CN106940815B (en) | | A Programmable Convolutional Neural Network Coprocessor IP Core
Han | | Efficient methods and hardware for deep learning
Zhou et al. | | Addressing sparsity in deep neural networks
Du et al. | | An accelerator for high efficient vision processing
Mohaidat et al. | | A survey on neural network hardware accelerators
Chen et al. | | A high-throughput neural network accelerator
Fan et al. | | Stream processing dual-track CGRA for object inference
Xu et al. | | CaFPGA: An automatic generation model for CNN accelerator
Nguyen et al. | | ShortcutFusion: From TensorFlow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
Wang et al. | | A low-latency sparse-Winograd accelerator for convolutional neural networks
Hsiao et al. | | Design of a sparsity-aware reconfigurable deep learning accelerator supporting various types of operations
Zhao et al. | | Cambricon-Q: A hybrid architecture for efficient training
Huang et al. | | IECA: An in-execution configuration CNN accelerator with 30.55 GOPS/mm² area efficiency
Akkad et al. | | Embedded deep learning accelerators: A survey on recent advances
CN116822600A (en) | | A neural network search chip based on RISC-V architecture
Lou et al. | | RV-CNN: Flexible and efficient instruction set for CNNs based on RISC-V processors
Yantır et al. | | IMCA: An efficient in-memory convolution accelerator
You et al. | | New paradigm of FPGA-based computational intelligence from surveying the implementation of DNN accelerators
Potocnik et al. | | Optimizing foundation model inference on a many-tiny-core open-source RISC-V platform