Tang et al., 2021 - Google Patents
Low-memory and high-performance CNN inference on distributed systems at the edge
- Document ID
- 5852992927569293463
- Author
- Tang E
- Stefanov T
- Publication year
- 2021
- Publication venue
- Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion
Snippet
Nowadays, some applications need CNN inference on resource-constrained edge devices that may have very limited memory and computation capacity to fit a large CNN model. In such application scenarios, to deploy a large CNN model and perform inference on a single …
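The snippet describes the core problem: a CNN too large for any single edge device must be split across several cooperating devices. Purely as an illustration of that idea (this is not the algorithm from the paper, and the names `Layer`, `greedy_partition`, and all sizes are hypothetical), the following sketch greedily groups consecutive layers so that each group fits a fixed per-device memory budget:

```python
# Minimal illustrative sketch of layer-wise CNN partitioning across
# memory-constrained edge devices. NOT the method of Tang et al.;
# all names and sizes here are invented for illustration.

from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    mem_bytes: int  # memory footprint of weights + activations


def greedy_partition(layers, device_mem):
    """Assign consecutive layers to devices so that each device's
    total memory stays within its budget. Returns one layer group
    per device."""
    groups, current, used = [], [], 0
    for layer in layers:
        if layer.mem_bytes > device_mem:
            raise ValueError(f"{layer.name} alone exceeds device memory")
        if used + layer.mem_bytes > device_mem:
            groups.append(current)       # close the current device
            current, used = [], 0
        current.append(layer)
        used += layer.mem_bytes
    if current:
        groups.append(current)
    return groups


# Toy model: a VGG-like stack, sizes in MB for readability.
MB = 1 << 20
model = [Layer("conv1", 8 * MB), Layer("conv2", 16 * MB),
         Layer("conv3", 32 * MB), Layer("fc1", 48 * MB),
         Layer("fc2", 12 * MB)]

for i, group in enumerate(greedy_partition(model, device_mem=64 * MB)):
    print(f"device {i}: {[l.name for l in group]}")
```

A greedy consecutive split like this is only one point in the design space; the paper's classifications (below) suggest a search-based approach over partitionings rather than a single greedy pass.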
Classifications
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/5061—Partitioning or combining of resources
- G06F17/30286—Information retrieval in structured data stores
- G06F8/41—Compilation
- G06F17/50—Computer-aided design
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system (a generic illustration follows this list)
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06K9/6217—Design or setup of recognition systems and techniques; extraction of features in feature space; clustering techniques; blind source separation
- G06N3/02—Computer systems based on biological models using neural network models
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06Q10/00—Administration; Management
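The G06N3/126 and scheduling classifications above indicate a genetic-algorithm component for finding partitionings or schedules. As a generic illustration of that classified technique only (not the patent's actual algorithm; every constant, rate, and function name is hypothetical), the sketch below evolves CNN layer cut points to minimize the peak per-device memory:

```python
# Purely illustrative GA over CNN partition cut points, prompted by the
# G06N3/126 classification. NOT the algorithm of this patent; fitness,
# population size, and layer sizes are made up.

import random

LAYER_MEM = [8, 16, 32, 48, 12]   # per-layer memory (MB), hypothetical
N_DEVICES = 3                     # devices => N_DEVICES - 1 cut points


def decode(cuts):
    """Split the layer list at the (sorted) cut indices into device groups."""
    bounds = [0] + sorted(cuts) + [len(LAYER_MEM)]
    return [LAYER_MEM[a:b] for a, b in zip(bounds, bounds[1:])]


def fitness(cuts):
    """Lower is better: peak per-device memory, a load-balance proxy."""
    return max(sum(group) for group in decode(cuts))


def random_individual():
    """Distinct cut points drawn from the interior layer boundaries."""
    return random.sample(range(1, len(LAYER_MEM)), N_DEVICES - 1)


def mutate(cuts):
    """Move one cut point to a boundary not already used."""
    out = list(cuts)
    i = random.randrange(len(out))
    choices = [c for c in range(1, len(LAYER_MEM)) if c not in out]
    if choices:
        out[i] = random.choice(choices)
    return out


def evolve(pop_size=20, generations=50):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=fitness)


best = evolve()
print("cut points:", sorted(best), "-> groups:", decode(best),
      "peak MB:", fitness(best))
```

A real system would fold communication cost and per-device compute speed into the fitness function; this sketch optimizes memory balance only to keep the GA mechanics visible.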
Similar Documents
Publication | Title
---|---
Besta et al. | Parallel and distributed graph neural networks: An in-depth concurrency analysis
Tang et al. | Low-memory and high-performance CNN inference on distributed systems at the edge
Khaleghzadeh et al. | A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous HPC platforms
EP4428754A1 (en) | Neural network model processing method and device
US10642610B2 | Scalable cloud-based time series analysis
CN106953862A | Perception method and device for network security situation, and perception model training method and device
Christidis et al. | Enabling serverless deployment of large-scale AI workloads
Du et al. | Model parallelism optimization for distributed inference via decoupled CNN structure
Zhang et al. | Adaptive distributed convolutional neural network inference at the network edge with ADCNN
Yadav et al. | An opposition-based hybrid evolutionary approach for task scheduling in fog computing network
Sarathambekai et al. | Task scheduling in distributed systems using heap intelligent discrete particle swarm optimization
Deng et al. | A parallel version of differential evolution based on resilient distributed datasets model
Yang et al. | PICO: Pipeline inference framework for versatile CNNs on diverse mobile devices
WO2024045188A1 (en) | Loop transformation in tensor compilers of deep neural networks (DNNs)
Liu et al. | High-performance tensor learning primitives using GPU tensor cores
CN118378008B | Matrix decomposition parallelization optimization method and system for high-performance computing
Chung et al. | A case study using automatic performance tuning for large-scale scientific programs
Abdelfattah et al. | On the development of variable size batched computation for heterogeneous parallel architectures
Singh et al. | Distributed quadratic programming solver for kernel SVM using genetic algorithm
Kelkawi et al. | GPU-based cooperative coevolution for large-scale global optimization
CN116596035A | Neural network training parallel method
Yin et al. | Exact memory- and communication-aware scheduling of DNNs on pipelined edge TPUs
Boureima et al. | Distributed out-of-core NMF of dense and sparse data on CPU/GPU architectures with automatic model selection for exascale data
Chen et al. | Efficient multi-training framework of image deep learning on GPU cluster
CN109635191B | Similarity determination method and device, storage medium and computer equipment