Enhancing Distributed In-Situ CNN Inference in the Internet of Things
Du et al., 2022
- Document ID
- 2118718900651620460
- Author
- Du J
- Du Y
- Huang D
- Lu Y
- Liao X
- Publication year
- 2022
- Publication venue
- IEEE Internet of Things Journal
Snippet
Convolutional neural networks (CNNs) enable machines to view the world as humans do and are becoming increasingly prevalent in Internet of Things (IoT) applications. Instead of streaming the raw data to the cloud and executing CNN inference remotely, it would be very attractive …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Liu et al. | HierTrain: Fast hierarchical edge AI learning with hybrid parallelism in mobile-edge-cloud computing | |
US9053067B2 (en) | Distributed data scalable adaptive map-reduce framework | |
Du et al. | Model parallelism optimization for distributed inference via decoupled CNN structure | |
CN103699656A (en) | GPU-based mass-multimedia-data-oriented MapReduce platform | |
CN111738488A (en) | Task scheduling method and device | |
Du et al. | A distributed in-situ CNN inference system for IoT applications | |
Gadiyar et al. | Artificial intelligence software and hardware platforms | |
Guo et al. | Automated exploration and implementation of distributed CNN inference at the edge | |
Neshatpour et al. | Energy-efficient acceleration of MapReduce applications using FPGAs | |
Marszałkowski et al. | Time and energy performance of parallel systems with hierarchical memory | |
Ma et al. | FPGA-based AI smart NICs for scalable distributed AI training systems | |
Lin et al. | Latency-driven model placement for efficient edge intelligence service | |
Xu et al. | A meta reinforcement learning-based virtual machine placement algorithm in mobile edge computing | |
Liu | YOLOv2 acceleration using embedded GPU and FPGAs: pros, cons, and a hybrid method | |
Zhou et al. | Training and Serving System of Foundation Models: A Comprehensive Survey | |
Guan et al. | Using data compression for optimizing FPGA-based convolutional neural network accelerators | |
Shu et al. | Design of deep learning accelerated algorithm for online recognition of industrial products defects | |
Du et al. | Enhancing Distributed In-Situ CNN Inference in the Internet of Things | |
Tianyang et al. | A Survey: FPGA‐Based Dynamic Scheduling of Hardware Tasks | |
Zhang et al. | Enabling highly efficient capsule networks processing through software-hardware co-design | |
Tang et al. | Energy-efficient and high-throughput CNN inference on embedded CPUs-GPUs MPSoCs | |
You et al. | New paradigm of FPGA-based computational intelligence from surveying the implementation of DNN accelerators | |
Guo et al. | Hierarchical design space exploration for distributed CNN inference at the edge | |
Sadiq et al. | Distributed Systems for Machine Learning in Cloud Computing: A Review of Scalable and Efficient Training and Inference | |
US20230419166A1 (en) | Systems and methods for distributing layers of special mixture-of-experts machine learning models |