Zhuang, 2019 - Google Patents
Communication reduction techniques in numerical methods and deep neural networks
Zhuang, 2019
- Document ID
- 2124667952963184073
- Author
- Zhuang S
- Publication year
- 2019
Snippet
Inter-node communication has turned out to be one of the determining factors of performance on modern HPC systems. Furthermore, the situation only gets worse with the ever-increasing number of cores involved. Hence, this thesis explores the various possible …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
Publication | Title
--- | ---
You et al. | Fast deep neural network training on distributed systems and cloud TPUs
Terenin et al. | GPU-accelerated Gibbs sampling: a case study of the Horseshoe Probit model
Zhang et al. | Towards memory friendly long-short term memory networks (LSTMs) on mobile GPUs
Ren et al. | Performance analysis of deep learning workloads on leading-edge systems
Heinecke et al. | Multi-and many-core data mining with adaptive sparse grids
Messer et al. | MiniApps derived from production HPC applications using multiple programing models
Mahmoud et al. | DLBench: an experimental evaluation of deep learning frameworks
Charara et al. | Tile low-rank GEMM using batched operations on GPUs
Eichner et al. | Neural simulations on multi-core architectures
Abdelhafez et al. | Mirage: Machine learning-based modeling of identical replicas of the jetson agx embedded platform
Zhuang | Communication reduction techniques in numerical methods and deep neural networks
Hesse | Analysis and comparison of performance and power consumption of neural networks on cpu, gpu, tpu and fpga
Del Monte et al. | A scalable GPU-enabled framework for training deep neural networks
Linderman et al. | High-throughput Bayesian network learning using heterogeneous multicore computers
Götz et al. | Supporting software engineering practices in the development of data-intensive hpc applications with the juml framework
Fender | Parallel solutions for large-scale eigenvalue problems arising in graph analytics
Gao et al. | Benchmarking, Measuring, and Optimizing: Second BenchCouncil International Symposium, Bench 2019, Denver, CO, USA, November 14–16, 2019, Revised Selected Papers
Franquinet | Performance portability analysis of SYCL with a classical CG on CPU, GPU, and FPGA
Herten et al. | Performance comparison for neuroscience application benchmarks
Dikbayir | Kernel and launch time optimizations for deep learning frameworks
Cérin et al. | Where are the optimization potential of machine learning kernels
Solórzano | A practical evaluation of parallel and distributed deep learning frameworks
Amarasinghe | Parallelizing Complexity: Strategies for Accelerating AI Algorithms and Scientific Simulations in Heterogeneous Computing Environments
Casal et al. | Analysis of the Construction of Similarity Matrices on Multi-Core and Many-Core Platforms Using Different Similarity Metrics
Kumar | Accelerating Betweenness Centrality on GPU