Jiao et al., 2017 - Google Patents
Accelerating low bit-width convolutional neural networks with embedded FPGA
- Document ID
- 6872230262167175501
- Author
- Jiao L
- Luo C
- Cao W
- Zhou X
- Wang L
- Publication year
- 2017
- Publication venue
- 2017 27th international conference on field programmable logic and applications (FPL)
Snippet
Convolutional Neural Networks (CNNs) can achieve high classification accuracy while they require complex computation. Binarized Neural Networks (BNNs) with binarized weights and activations can simplify computation but suffer from obvious accuracy loss. In this paper …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/533—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/499—Denomination or exception handling, e.g. rounding, overflow
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
Similar Documents
Publication | Title |
---|---|
Jiao et al. | Accelerating low bit-width convolutional neural networks with embedded FPGA |
Faraone et al. | AddNet: Deep neural networks using FPGA-optimized multipliers |
US10698657B2 (en) | Hardware accelerator for compressed RNN on FPGA |
CN107704916B (en) | Hardware accelerator and method for realizing RNN neural network based on FPGA |
US9411726B2 (en) | Low power computation architecture |
CN107423816B (en) | Multi-calculation-precision neural network processing method and system |
Ortiz et al. | Low-precision floating-point schemes for neural network training |
Geng et al. | CQNN: a CGRA-based QNN framework |
Lentaris et al. | Combining arithmetic approximation techniques for improved CNN circuit design |
Lu et al. | ETA: An efficient training accelerator for DNNs based on hardware-algorithm co-optimization |
KR20190089685A (en) | Method and apparatus for processing data |
US20240311626A1 (en) | Asynchronous accumulator using logarithmic-based arithmetic |
Duan et al. | Energy-efficient architecture for FPGA-based deep convolutional neural networks with binary weights |
US12141225B2 (en) | Inference accelerator using logarithmic-based arithmetic |
Guo et al. | BOOST: block minifloat-based on-device CNN training accelerator with transfer learning |
Lee et al. | Tender: Accelerating large language models via tensor decomposition and runtime requantization |
Yu et al. | Optimizing FPGA-based convolutional encoder-decoder architecture for semantic segmentation |
CN111930681A (en) | Computing device and related product |
Shivapakash et al. | A power efficient multi-bit accelerator for memory prohibitive deep neural networks |
Xu et al. | A low-power arithmetic element for multi-base logarithmic computation on deep neural networks |
Zhan et al. | Field programmable gate array‐based all‐layer accelerator with quantization neural networks for sustainable cyber‐physical systems |
Wong et al. | Low bitwidth CNN accelerator on FPGA using Winograd and block floating point arithmetic |
Zhang et al. | YOLOv3-tiny object detection SOC based on FPGA platform |
Moshovos et al. | Value-based deep-learning acceleration |
Chen et al. | Smartdeal: Remodeling deep network weights for efficient inference and training |