New Numerical Algorithm for Deflation of Infinite and Zero Eigenvalues and Full Solution of Quadratic Eigenvalue Problems
This article presents a new method for computing all eigenvalues and eigenvectors of the quadratic matrix pencil Q(λ) = λ²M + λC + K. It is an upgrade of the quadeig algorithm by Hammarling et al., which attempts to reveal and remove by deflation a certain ...
PHIST: A Pipelined, Hybrid-Parallel Iterative Solver Toolkit
- Jonas Thies,
- Melven Röhrig-Zöllner,
- Nigel Overmars,
- Achim Basermann,
- Dominik Ernst,
- Georg Hager,
- Gerhard Wellein
The increasing complexity of hardware and software environments in high-performance computing poses big challenges on the development of sustainable and hardware-efficient numerical software. This article addresses these challenges in the context of ...
Parallel Tree Algorithms for AMR and Non-Standard Data Access
We introduce several parallel algorithms operating on a distributed forest of adaptive quadtrees/octrees. They are targeted at large-scale applications relying on data layouts that are more complex than required for standard finite elements, such as hp-...
Variable Step-Size Control Based on Two-Steps for Radau IIA Methods
Two-step embedded methods of order s based on s-stage Radau IIA formulas are considered for the variable step-size integration of stiff differential equations. These embedded methods are aimed at local error control and are computed through a linear ...
Yet Another Tensor Toolbox for Discontinuous Galerkin Methods and Other Applications
The numerical solution of partial differential equations is at the heart of many grand challenges in supercomputing. Solvers based on high-order discontinuous Galerkin (DG) discretisation have been shown to scale on large supercomputers with excellent ...
A Shift Selection Strategy for Parallel Shift-invert Spectrum Slicing in Symmetric Self-consistent Eigenvalue Computation
The central importance of large-scale eigenvalue problems in scientific computation necessitates the development of massively parallel algorithms for their solution. Recent advances in dense numerical linear algebra have enabled the routine treatment of ...
A Feature-complete SPIKE Dense Banded Solver
This article presents a parallel, effective, and feature-complete recursive SPIKE algorithm that achieves near feature-parity with the standard linear algebra package banded linear system solver. First, we present a flexible parallel implementation of ...
Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks
Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is used in many DNN software frameworks. However,...
Algorithm 1012: DELAUNAYSPARSE: Interpolation via a Sparse Subset of the Delaunay Triangulation in Medium to High Dimensions
DELAUNAYSPARSE contains both serial and parallel codes written in Fortran 2003 (with OpenMP) for performing medium- to high-dimensional interpolation via the Delaunay triangulation. To accommodate the exponential growth in the size of the Delaunay ...