Bernaschi et al., 2016 - Google Patents

A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units

Document ID: 10072827351681719385
Author: Bernaschi M; Bisson M; Fantozzi C; Janna C
Publication year: 2016
Publication venue: SIAM Journal on Scientific Computing

Snippet

Graphics Processing Units (GPUs) exhibit significantly higher peak performance than conventional CPUs. However, in general only highly parallel algorithms can exploit their potential. In this scenario, the iterative solution to sparse linear systems of equations could …

Full text: www.research.unipd.it (PDF)

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring

Similar Documents

Publication	Publication Date	Title
Li et al.	2013	GPU-accelerated preconditioned iterative linear solvers
Lutz et al.	2013	PARTANS: An autotuning framework for stencil computation on multi-GPU systems
Bell et al.	2008	Efficient sparse matrix-vector multiplication on CUDA
Wyant et al.	2012	Computing performance benchmarks among cpu, gpu, and fpga
Georgescu et al.	2013	GPU acceleration for FEM-based structural analysis
Agullo et al.	2016	Task‐based FMM for heterogeneous architectures
Bernaschi et al.	2016	A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units
Calore et al.	2016	Performance and portability of accelerated lattice Boltzmann applications with OpenACC
Fernandez et al.	2012	Alternate parallel processing approach for FEM
Economon et al.	2015	Towards high-performance optimizations of the unstructured open-source SU2 suite
Sun et al.	2011	An I/O bandwidth-sensitive sparse matrix-vector multiplication engine on FPGAs
Gao et al.	2017	A multi-GPU parallel optimization model for the preconditioned conjugate gradient algorithm
Ziane Khodja et al.	2014	Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters
Smith et al.	2005	Accelerating scientific applications with the SRC-6 reconfigurable computer: Methodologies and analysis
Magoulès et al.	2015	Auto-tuned Krylov methods on cluster of graphics processing unit
Rossinelli et al.	2011	Multicore/multi-gpu accelerated simulations of multiphase compressible flows using wavelet adapted grids
AlAhmadi et al.	2020	Performance characteristics for sparse matrix-vector multiplication on GPUs
Tian et al.	2022	swSuperLU: A highly scalable sparse direct solver on Sunway manycore architecture
Halbiniak et al.	2021	Exploration of OpenCL heterogeneous programming for porting solidification modeling to CPU‐GPU platforms
Davis et al.	2012	Paradigmatic shifts for exascale supercomputing
Zhang et al.	2024	Mixed-precision block incomplete sparse approximate preconditioner on Tensor core
Ohshima et al.	2019	Optimization of numerous small dense-matrix–vector multiplications in H-matrix arithmetic on GPU
Zhang et al.	2013	Implementing sparse matrix-vector multiplication with QCSR on GPU
Al-Mouhamed et al.	2017	SpMV and BiCG-Stab optimization for a class of hepta-diagonal-sparse matrices on GPU
Bylina et al.	2017	Explicit Fourth-Order Runge–Kutta Method on Intel Xeon Phi Coprocessor