Jiao et al., 2017 - Google Patents
Accelerating low bit-width convolutional neural networks with embedded FPGA
- Document ID
- 6872230262167175501
- Author
- Jiao L
- Luo C
- Cao W
- Zhou X
- Wang L
- Publication year
- 2017
- Publication venue
- 2017 27th International Conference on Field Programmable Logic and Applications (FPL)
Snippet
Convolutional Neural Networks (CNNs) can achieve high classification accuracy while they require complex computation. Binarized Neural Networks (BNNs) with binarized weights and activations can simplify computation but suffer from obvious accuracy loss. In this paper …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/533—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/499—Denomination or exception handling, e.g. rounding, overflow
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Jiao et al. | Accelerating low bit-width convolutional neural networks with embedded FPGA | |
Faraone et al. | AddNet: Deep neural networks using FPGA-optimized multipliers | |
US10810484B2 (en) | Hardware accelerator for compressed GRU on FPGA | |
CN111062472B (en) | Sparse neural network accelerator based on structured pruning and acceleration method thereof | |
CN107704916B (en) | Hardware accelerator and method for realizing RNN neural network based on FPGA | |
US9411726B2 (en) | Low power computation architecture | |
US20180046897A1 (en) | Hardware accelerator for compressed rnn on fpga | |
Ortiz et al. | Low-precision floating-point schemes for neural network training | |
CN111582451A (en) | Image recognition interlayer parallel pipeline type binary convolution neural network array architecture | |
Lentaris et al. | Combining arithmetic approximation techniques for improved CNN circuit design | |
Lu et al. | ETA: An efficient training accelerator for DNNs based on hardware-algorithm co-optimization | |
US20240311626A1 (en) | Asynchronous accumulator using logarithmic-based arithmetic | |
KR20190089685A (en) | Method and apparatus for processing data | |
Duan et al. | Energy-efficient architecture for FPGA-based deep convolutional neural networks with binary weights | |
Shivapakash et al. | A power efficiency enhancements of a multi-bit accelerator for memory prohibitive deep neural networks | |
Zhao et al. | Optimizing FPGA-Based DNN accelerator with shared exponential floating-point format | |
Guo et al. | BOOST: block minifloat-based on-device CNN training accelerator with transfer learning | |
Shivapakash et al. | A power efficient multi-bit accelerator for memory prohibitive deep neural networks | |
Yu et al. | Optimizing FPGA-based convolutional encoder-decoder architecture for semantic segmentation | |
CN111930681A (en) | Computing device and related product | |
Zhan et al. | Field programmable gate array‐based all‐layer accelerator with quantization neural networks for sustainable cyber‐physical systems | |
Chen et al. | SmartDeal: Remodeling Deep Network Weights for Efficient Inference and Training | |
Xu et al. | A low-power arithmetic element for multi-base logarithmic computation on deep neural networks | |
Moshovos et al. | Value-based deep-learning acceleration | |
Zhang et al. | Yolov3-tiny Object Detection SoC Based on FPGA Platform |