
Tang et al., 2021 - Google Patents

Low-memory and high-performance CNN inference on distributed systems at the edge

Document ID
5852992927569293463
Authors
Tang E
Stefanov T
Publication year
2021
Publication venue
Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion

Snippet

Nowadays, some applications need CNN inference on resource-constrained edge devices that may have very limited memory and computation capacity to fit a large CNN model. In such application scenarios, to deploy a large CNN model and perform inference on a single …
Continue reading at dl.acm.org (PDF)
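
The snippet states the core problem the paper targets: a single memory-constrained edge device cannot hold a large CNN model, so the model has to be split and run across several devices. As a rough illustration only (not the authors' method, which the full text describes), the sketch below shows a greedy layer-wise partitioning of a sequential PyTorch CNN under a hypothetical per-device weight budget, with inference then run partition by partition as if each partition sat on a different edge device.

```python
# Minimal sketch (not the paper's algorithm): greedy layer-wise partitioning of a
# sequential CNN so that each partition's parameter memory stays under a
# hypothetical per-device budget. A real system would also have to account for
# activation memory and inter-device communication cost.
import torch
import torch.nn as nn


def param_bytes(module: nn.Module) -> int:
    """Parameter memory of a module in bytes."""
    return sum(p.numel() * p.element_size() for p in module.parameters())


def partition_by_memory(layers, device_budget_bytes):
    """Greedily group consecutive layers so each group fits the memory budget
    (a group keeps at least one layer even if that layer alone exceeds it)."""
    groups, current = [], []
    for layer in layers:
        current.append(layer)
        if param_bytes(nn.Sequential(*current)) > device_budget_bytes and len(current) > 1:
            current.pop()
            groups.append(nn.Sequential(*current))
            current = [layer]
    if current:
        groups.append(nn.Sequential(*current))
    return groups


# Toy CNN split into partitions of at most ~100 KB of weights each.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 10),
)
parts = partition_by_memory(list(model), device_budget_bytes=100_000)

# Inference runs partition by partition, one partition per edge device,
# forwarding the intermediate activation tensor between devices.
x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    for part in parts:
        x = part(x)
print(len(parts), x.shape)  # e.g. 3 partitions, torch.Size([1, 10])
```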

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for programme control, e.g. control unit
    • G06F9/06 - Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/48 - Programme initiating; Programme switching, e.g. by interrupt
    • G06F9/4806 - Task transfer initiation or dispatching
    • G06F9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for programme control, e.g. control unit
    • G06F9/06 - Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 - Partitioning or combining of resources
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 - Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 - Arrangements for software engineering
    • G06F8/40 - Transformations of program code
    • G06F8/41 - Compilation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 - Computer-aided design
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06N - COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computer systems based on biological models
    • G06N3/12 - Computer systems based on biological models using genetic models
    • G06N3/126 - Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06N - COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 - Subject matter not provided for in other groups of this subclass
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06K - RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 - Methods or arrangements for recognition using electronic means
    • G06K9/6217 - Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06N - COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computer systems based on biological models
    • G06N3/02 - Computer systems based on biological models using neural network models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06F - ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 - Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 - Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING; COUNTING
    • G06Q - DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management

Similar Documents

Besta et al. Parallel and distributed graph neural networks: An in-depth concurrency analysis
Tang et al. Low-memory and high-performance CNN inference on distributed systems at the edge
Khaleghzadeh et al. A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous HPC platforms
EP4428754A1 (en) Neural network model processing method and device
US10642610B2 (en) Scalable cloud-based time series analysis
CN106953862A (en) Perception method and device for network security situation and perception model training method and device
Christidis et al. Enabling serverless deployment of large-scale ai workloads
Du et al. Model parallelism optimization for distributed inference via decoupled CNN structure
Zhang et al. Adaptive distributed convolutional neural network inference at the network edge with ADCNN
Yadav et al. An opposition-based hybrid evolutionary approach for task scheduling in fog computing network
Sarathambekai et al. Task scheduling in distributed systems using heap intelligent discrete particle swarm optimization
Deng et al. A parallel version of differential evolution based on resilient distributed datasets model
Yang et al. Pico: Pipeline inference framework for versatile cnns on diverse mobile devices
WO2024045188A1 (en) Loop transformation in tensor compilers of deep neural networks (dnns)
Liu et al. High-performance tensor learning primitives using GPU tensor cores
CN118378008B (en) Matrix decomposition parallelization optimization method and system for high-performance computing
Chung et al. A case study using automatic performance tuning for large-scale scientific programs
Abdelfattah et al. On the development of variable size batched computation for heterogeneous parallel architectures
Singh et al. Distributed quadratic programming solver for kernel SVM using genetic algorithm
Kelkawi et al. GPU-based cooperative coevolution for large-scale global optimization
CN116596035A (en) Neural network training parallel method
Yin et al. Exact memory- and communication-aware scheduling of DNNs on pipelined edge TPUs
Boureima et al. Distributed Out-of-core NMF of Dense and Sparse Data on CPU/GPU Architectures with Automatic Model Selection for Exascale Data
Chen et al. Efficient multi-training framework of image deep learning on GPU cluster
CN109635191B (en) Similarity determination method and device, storage medium and computer equipment