Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2502323.2502327acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
research-article

Towards a functional run-time for dense NLA domain

Published: 23 September 2013 Publication History

Abstract

We investigate the use of functional programming to develop a numerical linear algebra run-time; i.e. a framework where the solvers can be adapted easily to different contexts and task parallelism can be attained (semi-) automatically. We follow a bottom up strategy, where the first step is the design and implementation of a framework layer, composed by a functional version of BLAS (Basic Linear Algebra Subprograms) routines. The framework allows the manipulation of arbitrary representations for matrices and vectors and it is also possible to write and combine multiple implementations of BLAS operations based on different algorithms and parallelism strategies. Using this framework, we implement a functional version of Cholesky factorization, which serves as a proof of concept to evaluate the flexibility and performance of our approach.

References

[1]
SMP Superscalar (SMPSs) User's Manual, Version 2.4. Barcelona Supercomputing Center, Barcelona, 2011.
[2]
E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects. Journal of Physics: Conference Series, 180, 2009. 10.1088/1742-6596/180/1/012037.
[3]
E. Anderson, Z. Bai, J. Demmel, J. E. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. E. McKenney, S. Ostrouchov, and D. Sorensen. phLAPACK Users' Guide, Third Edition. SIAM, Philadelphia, 1999.
[4]
R. M. Badia, J. R. Herrero, J. Labarta, J. M. Pérez, E. S. Quintana-Ortí, and G. Quintana-Ortí. Parallelizing dense and banded linear algebra libraries using smpss. Concurrency and Computation: Practice and Experience, 21 (18): 2438--2456, 2009.
[5]
P. Bientinesi, B. Gunter, and R. A. v. d. Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw., 35 (1): 3:1--3:22, July 2008. ISSN 0098--3500. 10.1145/1377603.1377606. URL http://doi.acm.org/10.1145/1377603.1377606.
[6]
J. Bilmes, K. Asanović, C. whye Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology. In Proceedings of International Conference on Supercomputing, Vienna, Austria, July 1997.
[7]
L. S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, and R. C. Whaley. An updated set of basic linear algebra subprograms (blas). ACM Transactions on Mathematical Software, 28: 135--151, 2001.
[8]
M. M. Chakravarty, G. Keller, S. Lee, T. L. McDonell, and V. Grover. Accelerating haskell array codes with multicore gpus. In Proceedings of the sixth workshop on Declarative aspects of multicore programming, DAMP '11, pages 3--14. ACM, 2011. ISBN 978--1--4503-0486--3.
[9]
M. M. T. Chakravarty, R. Leshchinskiy, S. Peyton-Jones, G. Keller, and S. Marlow. Data parallel Haskell: a status report. In DAMP '07: Proceedings of the 2007 workshop on Declarative aspects of multicore programming, pages 10--18. ACM, 2007. ISBN 978-1-59593-690-5. 10.1145/1248648.1248652.
[10]
E. Chan, F. G. V. Zee, P. Bientinesi, E. S. Quintana-Ortí, G. Quintana-Ortí, and R. A. van de Geijn. Supermatrix: a multithreaded runtime scheduling system for algorithms-by-blocks. In S. Chatterjee and M. L. Scott, editors, PPOPP, pages 123--132. ACM, 2008. ISBN 978-1-59593-795-7.
[11]
J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 120--127. IEEE Comput. Soc. Press, 1992.
[12]
A. Foltzer, A. Kulkarni, R. Swords, S. Sasidharan, E. Jiang, and R. Newton. A meta-scheduler for the par-monad: composable scheduling for the heterogeneous cloud. SIGPLAN Not., 47 (9): 235--246, Sept. 2012. ISSN 0362-1340. 10.1145/2398856.2364562. URL http://doi.acm.org/10.1145/2398856.2364562.
[13]
A. V. Gerbessiotis. Algorithmic and Practical Considerations for Dense Matrix Computations on the BSP Model. PRG-TR 32, Oxford University Computing Laboratory, 1997. URL http://web.njit.edu/alexg/pubs/papers/PRG3297.ps.
[14]
G. Golub and C. V. Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 3rd edition, 1996.
[15]
R. Hinze. Fun with phantom types. In J. Gibbons and O. de Moor, editors, The Fun of Programming, Cornerstones of Computing, pages 245--262. Palgrave Macmillan, 2003.
[16]
G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in haskell. In Proceedings of the 15th Intl. Conf. on Funct. Progr., ICFP '10, pages 261--272. ACM, 2010. ISBN 978-1-60558-794-3.
[17]
O. Kiselyov, S. P. Jones, and C. chieh Shan. Fun with type functions. In A. W. Roscoe, C. B. Jones, and K. Wood, editors, Reflections on the work of C. A. R. Hoare. Springer, 2010.
[18]
S. Marlow, S. L. Peyton-Jones, and S. Singh. Runtime support for multicore Haskell. In ICFP 2009, Intl. Conf. on Functional Programming, pages 65--78. ACM, 2009. http://doi.acm.org/10.1145/1596550.1596563.
[19]
S. Marlow, P. Maier, H.-W. Loidl, M. K. Aswad, and P. W. Trinder. Seq no more: Better strategies for parallel Haskell. In Haskell Symposium 2010, Baltimore, MD, USA, Sept. 2010. ACM Press.
[20]
S. Marlow, R. Newton, and S. Peyton-Jones. A monad for deterministic parallelism. In Haskell '11: Proceedings of the Fourth Symposium on Haskell. ACM, 2011.
[21]
S. Peyton Jones. Harnessing the multicores: Nested data parallelism in haskell. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems, APLAS '08, pages 138--138, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-89329-5.
[22]
S. Peyton-Jones et al. The Haskell 98 language and libraries: The revised report. Journal of Functional Programming, 13 (1): 0-255, Jan 2003.
[23]
E. Quintana-Ortí;, G. Quintana-Ortí;, X. Sun, and R. van de Geijn. A note on parallel matrix inversion. SIAM Journal on Scientific Computing, 22 (5): 1762--1771, 2000.
[24]
C. Whaley, A. Petitet, and J. J. Dongarra. Automated Empirical Optimization of Software and the ATLAS Project. In PARALLEL COMPUTING, volume 27, 2000.

Cited By

View all
  • (2015)Painless Parallelism on Heterogeneous Hardware Leveraging the Functional ParadigmProceedings of the 2015 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW)10.1109/SBAC-PADW.2015.24(73-78)Online publication date: 18-Oct-2015

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
FHPC '13: Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing
September 2013
104 pages
ISBN:9781450323819
DOI:10.1145/2502323
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 September 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. blas
  2. haskell
  3. nla
  4. parallelism

Qualifiers

  • Research-article

Conference

ICFP'13
Sponsor:

Acceptance Rates

FHPC '13 Paper Acceptance Rate 8 of 14 submissions, 57%;
Overall Acceptance Rate 18 of 25 submissions, 72%

Upcoming Conference

ICFP '25
ACM SIGPLAN International Conference on Functional Programming
October 12 - 18, 2025
Singapore , Singapore

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)1
Reflects downloads up to 29 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2015)Painless Parallelism on Heterogeneous Hardware Leveraging the Functional ParadigmProceedings of the 2015 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW)10.1109/SBAC-PADW.2015.24(73-78)Online publication date: 18-Oct-2015

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media