Tang et al., 2021 - Google Patents
Low-memory and high-performance CNN inference on distributed systems at the edge
- Document ID
- 5852992927569293463
- Author
- Tang E
- Stefanov T
- Publication year
- 2021
- Publication venue
- Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion
Snippet
Nowadays, some applications need CNN inference on resource-constrained edge devices that may have very limited memory and computation capacity to fit a large CNN model. In such application scenarios, to deploy a large CNN model and perform inference on a single …
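The snippet describes the core problem: a CNN too large for any single edge device must be split across several cooperating devices. Purely as an illustration of that idea (this is not the algorithm from the paper, and the names `Layer`, `greedy_partition`, and all sizes are hypothetical), the following sketch greedily groups consecutive layers so that each group fits a fixed per-device memory budget:

```python
# Minimal illustrative sketch of layer-wise CNN partitioning across
# memory-constrained edge devices. NOT the method of Tang et al.;
# all names and sizes here are invented for illustration.

from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    mem_bytes: int  # memory footprint of weights + activations


def greedy_partition(layers, device_mem):
    """Assign consecutive layers to devices so that each device's
    total memory stays within its budget. Returns one layer group
    per device."""
    groups, current, used = [], [], 0
    for layer in layers:
        if layer.mem_bytes > device_mem:
            raise ValueError(f"{layer.name} alone exceeds device memory")
        if used + layer.mem_bytes > device_mem:
            groups.append(current)       # close the current device
            current, used = [], 0
        current.append(layer)
        used += layer.mem_bytes
    if current:
        groups.append(current)
    return groups


# Toy model: a VGG-like stack, sizes in MB for readability.
MB = 1 << 20
model = [Layer("conv1", 8 * MB), Layer("conv2", 16 * MB),
         Layer("conv3", 32 * MB), Layer("fc1", 48 * MB),
         Layer("fc2", 12 * MB)]

for i, group in enumerate(greedy_partition(model, device_mem=64 * MB)):
    print(f"device {i}: {[l.name for l in group]}")
```

A greedy consecutive split like this is only one point in the design space; the paper's classifications (below) suggest a search-based approach over partitionings rather than a single greedy pass.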
Classifications
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/5061—Partitioning or combining of resources
- G06F17/30286—Information retrieval in structured data stores
- G06F8/41—Compilation
- G06F17/50—Computer-aided design
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system (a generic illustration follows this list)
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06K9/6217—Design or setup of recognition systems and techniques; extraction of features in feature space; clustering techniques; blind source separation
- G06N3/02—Computer systems based on biological models using neural network models
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06Q10/00—Administration; Management
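The G06N3/126 and scheduling classifications above indicate a genetic-algorithm component for finding partitionings or schedules. As a generic illustration of that classified technique only (not the patent's actual algorithm; every constant, rate, and function name is hypothetical), the sketch below evolves CNN layer cut points to minimize the peak per-device memory:

```python
# Purely illustrative GA over CNN partition cut points, prompted by the
# G06N3/126 classification. NOT the algorithm of this patent; fitness,
# population size, and layer sizes are made up.

import random

LAYER_MEM = [8, 16, 32, 48, 12]   # per-layer memory (MB), hypothetical
N_DEVICES = 3                     # devices => N_DEVICES - 1 cut points


def decode(cuts):
    """Split the layer list at the (sorted) cut indices into device groups."""
    bounds = [0] + sorted(cuts) + [len(LAYER_MEM)]
    return [LAYER_MEM[a:b] for a, b in zip(bounds, bounds[1:])]


def fitness(cuts):
    """Lower is better: peak per-device memory, a load-balance proxy."""
    return max(sum(group) for group in decode(cuts))


def random_individual():
    """Distinct cut points drawn from the interior layer boundaries."""
    return random.sample(range(1, len(LAYER_MEM)), N_DEVICES - 1)


def mutate(cuts):
    """Move one cut point to a boundary not already used."""
    out = list(cuts)
    i = random.randrange(len(out))
    choices = [c for c in range(1, len(LAYER_MEM)) if c not in out]
    if choices:
        out[i] = random.choice(choices)
    return out


def evolve(pop_size=20, generations=50):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=fitness)


best = evolve()
print("cut points:", sorted(best), "-> groups:", decode(best),
      "peak MB:", fitness(best))
```

A real system would fold communication cost and per-device compute speed into the fitness function; this sketch optimizes memory balance only to keep the GA mechanics visible.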
Similar Documents
Publication | Title
---|---
Besta et al. | Parallel and distributed graph neural networks: An in-depth concurrency analysis
Tang et al. | Low-memory and high-performance CNN inference on distributed systems at the edge
Khaleghzadeh et al. | A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous HPC platforms
EP4428754A1 (en) | Neural network model processing method and device
US10642610B2 | Scalable cloud-based time series analysis
CN106953862A | Perception method and device for network security situation, and perception model training method and device
Christidis et al. | Enabling serverless deployment of large-scale AI workloads
Du et al. | Model parallelism optimization for distributed inference via decoupled CNN structure
Zhang et al. | Adaptive distributed convolutional neural network inference at the network edge with ADCNN
Yadav et al. | An opposition-based hybrid evolutionary approach for task scheduling in fog computing network
Sarathambekai et al. | Task scheduling in distributed systems using heap intelligent discrete particle swarm optimization
Deng et al. | A parallel version of differential evolution based on resilient distributed datasets model
Yang et al. | PICO: Pipeline inference framework for versatile CNNs on diverse mobile devices
WO2024045188A1 (en) | Loop transformation in tensor compilers of deep neural networks (DNNs)
Liu et al. | High-performance tensor learning primitives using GPU tensor cores
CN118378008B | Matrix decomposition parallelization optimization method and system for high-performance computing
Chung et al. | A case study using automatic performance tuning for large-scale scientific programs
Abdelfattah et al. | On the development of variable size batched computation for heterogeneous parallel architectures
Singh et al. | Distributed quadratic programming solver for kernel SVM using genetic algorithm
Kelkawi et al. | GPU-based cooperative coevolution for large-scale global optimization
CN116596035A | Neural network training parallel method
Yin et al. | Exact memory- and communication-aware scheduling of DNNs on pipelined edge TPUs
Boureima et al. | Distributed out-of-core NMF of dense and sparse data on CPU/GPU architectures with automatic model selection for exascale data
Chen et al. | Efficient multi-training framework of image deep learning on GPU cluster
CN109635191B | Similarity determination method and device, storage medium and computer equipment