research-article

Towards a functional run-time for dense NLA domain

Authors:

Marcos VieraAuthors Info & Claims

FHPC '13: Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing

Pages 85 - 96

https://doi.org/10.1145/2502323.2502327

Published: 23 September 2013 Publication History

Abstract

We investigate the use of functional programming to develop a numerical linear algebra run-time; i.e. a framework where the solvers can be adapted easily to different contexts and task parallelism can be attained (semi-) automatically. We follow a bottom up strategy, where the first step is the design and implementation of a framework layer, composed by a functional version of BLAS (Basic Linear Algebra Subprograms) routines. The framework allows the manipulation of arbitrary representations for matrices and vectors and it is also possible to write and combine multiple implementations of BLAS operations based on different algorithms and parallelism strategies. Using this framework, we implement a functional version of Cholesky factorization, which serves as a proof of concept to evaluate the flexibility and performance of our approach.

References

[1]

SMP Superscalar (SMPSs) User's Manual, Version 2.4. Barcelona Supercomputing Center, Barcelona, 2011.

[2]

E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects. Journal of Physics: Conference Series, 180, 2009. 10.1088/1742-6596/180/1/012037.

[3]

E. Anderson, Z. Bai, J. Demmel, J. E. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. E. McKenney, S. Ostrouchov, and D. Sorensen. phLAPACK Users' Guide, Third Edition. SIAM, Philadelphia, 1999.

Digital Library

[4]

R. M. Badia, J. R. Herrero, J. Labarta, J. M. Pérez, E. S. Quintana-Ortí, and G. Quintana-Ortí. Parallelizing dense and banded linear algebra libraries using smpss. Concurrency and Computation: Practice and Experience, 21 (18): 2438--2456, 2009.

Digital Library

[5]

P. Bientinesi, B. Gunter, and R. A. v. d. Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw., 35 (1): 3:1--3:22, July 2008. ISSN 0098--3500. 10.1145/1377603.1377606. URL http://doi.acm.org/10.1145/1377603.1377606.

Digital Library

[6]

J. Bilmes, K. Asanović, C. whye Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology. In Proceedings of International Conference on Supercomputing, Vienna, Austria, July 1997.

Digital Library

[7]

L. S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, and R. C. Whaley. An updated set of basic linear algebra subprograms (blas). ACM Transactions on Mathematical Software, 28: 135--151, 2001.

Digital Library

[8]

M. M. Chakravarty, G. Keller, S. Lee, T. L. McDonell, and V. Grover. Accelerating haskell array codes with multicore gpus. In Proceedings of the sixth workshop on Declarative aspects of multicore programming, DAMP '11, pages 3--14. ACM, 2011. ISBN 978--1--4503-0486--3.

Digital Library

[9]

M. M. T. Chakravarty, R. Leshchinskiy, S. Peyton-Jones, G. Keller, and S. Marlow. Data parallel Haskell: a status report. In DAMP '07: Proceedings of the 2007 workshop on Declarative aspects of multicore programming, pages 10--18. ACM, 2007. ISBN 978-1-59593-690-5. 10.1145/1248648.1248652.

Digital Library

[10]

E. Chan, F. G. V. Zee, P. Bientinesi, E. S. Quintana-Ortí, G. Quintana-Ortí, and R. A. van de Geijn. Supermatrix: a multithreaded runtime scheduling system for algorithms-by-blocks. In S. Chatterjee and M. L. Scott, editors, PPOPP, pages 123--132. ACM, 2008. ISBN 978-1-59593-795-7.

Digital Library

[11]

J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 120--127. IEEE Comput. Soc. Press, 1992.

[12]

A. Foltzer, A. Kulkarni, R. Swords, S. Sasidharan, E. Jiang, and R. Newton. A meta-scheduler for the par-monad: composable scheduling for the heterogeneous cloud. SIGPLAN Not., 47 (9): 235--246, Sept. 2012. ISSN 0362-1340. 10.1145/2398856.2364562. URL http://doi.acm.org/10.1145/2398856.2364562.

Digital Library

[13]

A. V. Gerbessiotis. Algorithmic and Practical Considerations for Dense Matrix Computations on the BSP Model. PRG-TR 32, Oxford University Computing Laboratory, 1997. URL http://web.njit.edu/alexg/pubs/papers/PRG3297.ps.

[14]

G. Golub and C. V. Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 3rd edition, 1996.

Digital Library

[15]

R. Hinze. Fun with phantom types. In J. Gibbons and O. de Moor, editors, The Fun of Programming, Cornerstones of Computing, pages 245--262. Palgrave Macmillan, 2003.

[16]

G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in haskell. In Proceedings of the 15th Intl. Conf. on Funct. Progr., ICFP '10, pages 261--272. ACM, 2010. ISBN 978-1-60558-794-3.

Digital Library

[17]

O. Kiselyov, S. P. Jones, and C. chieh Shan. Fun with type functions. In A. W. Roscoe, C. B. Jones, and K. Wood, editors, Reflections on the work of C. A. R. Hoare. Springer, 2010.

[18]

S. Marlow, S. L. Peyton-Jones, and S. Singh. Runtime support for multicore Haskell. In ICFP 2009, Intl. Conf. on Functional Programming, pages 65--78. ACM, 2009. http://doi.acm.org/10.1145/1596550.1596563.

Digital Library

[19]

S. Marlow, P. Maier, H.-W. Loidl, M. K. Aswad, and P. W. Trinder. Seq no more: Better strategies for parallel Haskell. In Haskell Symposium 2010, Baltimore, MD, USA, Sept. 2010. ACM Press.

Digital Library

[20]

S. Marlow, R. Newton, and S. Peyton-Jones. A monad for deterministic parallelism. In Haskell '11: Proceedings of the Fourth Symposium on Haskell. ACM, 2011.

Digital Library

[21]

S. Peyton Jones. Harnessing the multicores: Nested data parallelism in haskell. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems, APLAS '08, pages 138--138, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-89329-5.

Digital Library

[22]

S. Peyton-Jones et al. The Haskell 98 language and libraries: The revised report. Journal of Functional Programming, 13 (1): 0-255, Jan 2003.

[23]

E. Quintana-Ortí;, G. Quintana-Ortí;, X. Sun, and R. van de Geijn. A note on parallel matrix inversion. SIAM Journal on Scientific Computing, 22 (5): 1762--1771, 2000.

Digital Library

[24]

C. Whaley, A. Petitet, and J. J. Dongarra. Automated Empirical Optimization of Software and the ATLAS Project. In PARALLEL COMPUTING, volume 27, 2000.

Cited By

Blanco MPerdomo PEzzatti PPardo AViera M(2015)Painless Parallelism on Heterogeneous Hardware Leveraging the Functional ParadigmProceedings of the 2015 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW)10.1109/SBAC-PADW.2015.24(73-78)Online publication date: 18-Oct-2015
https://dl.acm.org/doi/10.1109/SBAC-PADW.2015.24

Index Terms

Towards a functional run-time for dense NLA domain
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Functional languages
        Parallel programming languages

Recommendations

Compiler blockability of dense matrix factorizations

The goal of the LAPACK project is to provide efficient and portable software for dense numerical linear algebra computations. By recasting many of the fundamental dense matrix computations in terms of calls to an efficient implementation of the BLAS (...
Algorithm 865: Fortran 95 subroutines for Cholesky factorization in block hybrid format

We present subroutines for the Cholesky factorization of a positive-definite symmetric matrix and for solving corresponding sets of linear equations. They exploit cache memory by using the block hybrid format proposed by the authors in a companion ...
Exploiting zeros on the diagonal in the direct solution of indefinite sparse symmetric linear systems

We describe the design of a new code for the solution of sparse indefinite symmetric linear systems of equations. The principal difference between this new code and earlier work lies in the exploitation of the additional sparsity available when the ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

FHPC '13: Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing

September 2013

104 pages

ISBN:9781450323819

DOI:10.1145/2502323

General Chairs:
Clemens Grelck
University of Amsterdam, The Netherlands
,
Fritz Henglein
University of Copenhagen, Denmark
,
Program Chairs:
Umut Acar
Carnegie Mellon University, USA
,
Jost Berthold
University of Copenhagen, Denmark

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 September 2013

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

ICFP'13

Sponsor:

SIGPLAN

ICFP'13: ACM SIGPLAN International Conference on Functional Programming

September 23, 2013

Massachusetts, Boston, USA

Acceptance Rates

FHPC '13 Paper Acceptance Rate 8 of 14 submissions, 57%;

Overall Acceptance Rate 18 of 25 submissions, 72%

Upcoming Conference

ICFP '25

Sponsor:
sigplan

ACM SIGPLAN International Conference on Functional Programming

October 12 - 18, 2025

Singapore , Singapore

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
41
Total Downloads

Downloads (Last 12 months)1
Downloads (Last 6 weeks)1

Reflects downloads up to 29 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Blanco MPerdomo PEzzatti PPardo AViera M(2015)Painless Parallelism on Heterogeneous Hardware Leveraging the Functional ParadigmProceedings of the 2015 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW)10.1109/SBAC-PADW.2015.24(73-78)Online publication date: 18-Oct-2015
https://dl.acm.org/doi/10.1109/SBAC-PADW.2015.24

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents