A set of level 3 basic linear algebra subprograms

Authors: J. J. Dongarra, Jeremy Du Croz, Sven Hammarling, I. S. Duff

Editor: John A. Rice

ACM Transactions on Mathematical Software (TOMS), Volume 16, Issue 1

Pages 1 - 17

https://doi.org/10.1145/77626.79170

Published: 01 March 1990

Abstract

This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-matrix operations that should provide for efficient and portable implementations of algorithms for high-performance computers.
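
To make the kind of operation concrete, the sketch below computes a general matrix-matrix update of the form C := alpha*A*B + beta*C, the pattern associated with the Level 3 BLAS routine GEMM. It is a plain-Python illustration only, not the paper's Fortran interface: the function name `gemm`, its argument order, and the list-of-rows storage are assumptions made for this example, and the transpose options and leading-dimension arguments of the real routines are omitted.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c.

    Hypothetical helper for illustration: matrices are non-empty lists of
    rows, a is m x k, b is k x n, c is m x n. The input c is not modified.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("columns of a must match rows of b")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    # Start from beta * c, then accumulate alpha * a * b on top of it.
    result = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            scaled = alpha * a[i][p]
            for j in range(n):
                result[i][j] += scaled * b[p][j]
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm(2.0, a, b, 1.0, c))
```

An operation of this shape performs on the order of m*n*k floating-point operations on only about m*k + k*n + m*n matrix elements, which is why matrix-matrix kernels can be blocked to reuse data held in fast memory.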

Published In

ACM Transactions on Mathematical Software Volume 16, Issue 1

March 1990

109 pages

ISSN: 0098-3500

EISSN: 1557-7295

DOI: 10.1145/77626

Editor: John A. Rice

Publisher

Association for Computing Machinery

New York, NY, United States
