Nothing Special   »   [go: up one dir, main page]

Bernaschi et al., 2016 - Google Patents

A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units

Bernaschi et al., 2016

View PDF
Document ID
10072827351681719385
Author
Bernaschi M
Bisson M
Fantozzi C
Janna C
Publication year
Publication venue
SIAM Journal on Scientific Computing

External Links

Snippet

Graphics Processing Units (GPUs) exhibit significantly higher peak performance than conventional CPUs. However, in general only highly parallel algorithms can exploit their potential. In this scenario, the iterative solution to sparse linear systems of equations could …
Continue reading at www.research.unipd.it (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/11Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/12Simultaneous equations, e.g. systems of linear equations
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • G06F8/45Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44Arrangements for executing specific programmes
    • G06F9/455Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5045Circuit design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/80Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring

Similar Documents

Publication Publication Date Title
Li et al. GPU-accelerated preconditioned iterative linear solvers
Lutz et al. PARTANS: An autotuning framework for stencil computation on multi-GPU systems
Bell et al. Efficient sparse matrix-vector multiplication on CUDA
Wyant et al. Computing performance benchmarks among cpu, gpu, and fpga
Georgescu et al. GPU acceleration for FEM-based structural analysis
Agullo et al. Task‐based FMM for heterogeneous architectures
Bernaschi et al. A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units
Calore et al. Performance and portability of accelerated lattice Boltzmann applications with OpenACC
Fernandez et al. Alternate parallel processing approach for FEM
Economon et al. Towards high-performance optimizations of the unstructured open-source SU2 suite
Sun et al. An I/O bandwidth-sensitive sparse matrix-vector multiplication engine on FPGAs
Gao et al. A multi-GPU parallel optimization model for the preconditioned conjugate gradient algorithm
Ziane Khodja et al. Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters
Smith et al. Accelerating scientific applications with the SRC-6 reconfigurable computer: Methodologies and analysis
Magoulès et al. Auto-tuned Krylov methods on cluster of graphics processing unit
Rossinelli et al. Multicore/multi-gpu accelerated simulations of multiphase compressible flows using wavelet adapted grids
AlAhmadi et al. Performance characteristics for sparse matrix-vector multiplication on GPUs
Tian et al. swSuperLU: A highly scalable sparse direct solver on Sunway manycore architecture
Halbiniak et al. Exploration of OpenCL heterogeneous programming for porting solidification modeling to CPU‐GPU platforms
Davis et al. Paradigmatic shifts for exascale supercomputing
Zhang et al. Mixed-precision block incomplete sparse approximate preconditioner on Tensor core
Ohshima et al. Optimization of numerous small dense-matrix–vector multiplications in H-matrix arithmetic on GPU
Zhang et al. Implementing sparse matrix-vector multiplication with QCSR on GPU
Al-Mouhamed et al. SpMV and BiCG-Stab optimization for a class of hepta-diagonal-sparse matrices on GPU
Bylina et al. Explicit Fourth-Order Runge–Kutta Method on Intel Xeon Phi Coprocessor