SUMMA: Scalable Universal Matrix Multiplication Algorithm

April 1995

1995 Technical Report

Publisher:

University of Texas at Austin
Computer Science Dept. Taylor Hall 2.124 Austin, TX
United States

Published: 01 April 1995

Abstract

In this paper, we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less work space. MPI implementations are given, as are performance results on the Intel Paragon system.

Contributors

Robert A. van de Geijn
The University of Texas at Austin
- Publication years: 1987 - 2023
- Publication count: 114
- Citation count: 3,040
- Available for download: 42
- Downloads (cumulative): 37,509
- Downloads (12 months): 2,503
- Downloads (6 weeks): 387
- Average downloads per article: 893
- Average citations per article: 27
Jerrell Richard Watts
California Institute of Technology
- Publication years: 1993 - 1999
- Publication count: 12
- Citation count: 119
- Available for download: 1
- Downloads (cumulative): 233
- Downloads (12 months): 48
- Downloads (6 weeks): 9
- Average downloads per article: 233
- Average citations per article: 10

Recommendations

Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers

Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using ...
Scalable task-based algorithm for multiplication of block-rank-sparse matrices
IA³ '15: Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms

A task-based formulation of Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum ...
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
IPDPS '00: Proceedings of the 14th International Symposium on Parallel and Distributed Processing

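Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N^α/log N processors. Such a parallel computation is cost optimal and matches the performance of PRAM. Furthermore, our ...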
