research-article

Data-driven performance modeling of linear solvers for sparse matrices

Authors:

Jae-Seung Yeom,

Jayaraman J. Thiagarajan,

Abhinav Bhatele,

Greg Bronevetsky,

Tzanio KolevAuthors Info & Claims

PMBS '16: Proceedings of the 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems

Pages 32 - 42

Published: 13 November 2016 Publication History

Abstract

Performance of scientific codes is increasingly dependent on the input problem, its data representation and the underlying hardware with the increase in code and architectural complexity. This makes the task of identifying the fastest algorithm for solving a problem more challenging. In this paper, we focus on modeling the performance of numerical libraries used to solve a sparse linear system. We use machine learning to develop data-driven models of performance of linear solver implementations. These models can be used by a novice user to identify the fastest preconditioner and solver for a given input matrix. We use a variety of features that represent the matrix structure, numerical properties of the matrix and the underlying mesh or input problem as input to the model. We model the performance of nine linear solvers and thirteen preconditioners available in Trilinos using 1240 sparse matrices obtained from two different sources. Our prediction models perform significantly better than a blind classifier and black-box SVM and k-NN classifiers.

References

[1]

M. A. Heroux, R. A. Bartlett, V. E. Howle, R. J. Hoekstra, J. J. Hu, T. G. Kolda, R. B. Lehoucq, K. R. Long, R. P. Pawlowski, E. T. Phipps, A. G. Salinger, H. K. Thornquist, R. S. Tuminaro, J. M. Willenbring, A. Williams, and K. S. Stanley, "An overview of the Trilinos project," ACM Trans. Math. Softw., vol. 31, no. 3, pp. 397--423, 2005.

Digital Library

[2]

"LAPACK - Linear Algebra PACKage," http://www.netlib.org/lapack.

[3]

"Portable, Extensible Toolkit for Scientific Computation (PETSc)," http://www.mcs.anl.gov/petsc.

[4]

"Scalable Library for Eigenvalue Problem Computations (SLEPc)," http://www.grycap.upv.es/slepc.

[5]

O. Levy and Y. Goldberg, "Neural word embedding as implicit matrix factorization," in Advances in Neural Information Processing Systems 27, 2014, pp. 2177--2185.

Digital Library

[6]

"MFEM: Modular finite element methods," mfem.org.

[7]

T. A. Davis and Y. Hu, "The university of florida sparse matrix collection," ACM Trans. Math. Softw., vol. 38, no. 1, pp. 1:1--1:25, Dec. 2011.

Digital Library

[8]

M. M. Mathis, D. J. Kerbyson, and A. Hoisie, "A performance model of non-deterministic particle transport on large-scale systems," Future Generation Comp. Syst., vol. 22, no. 3, pp. 324--335, 2006.

Digital Library

[9]

D. J. Kerbyson, H. J. Alme, A. Hoisie, F. Petrini, H. J. Wasserman, and M. Gittings, "Predictive performance and scalability modeling of a large-scale application," in Proceedings of the 2001 ACM/IEEE conference on Supercomputing, November 2001, p. 37.

Digital Library

[10]

D. J. Kerbyson, A. Hoisie, and H. J. Wasserman, "Use of predictive performance modeling during large-scale system installation," Parallel Processing Letters, vol. 15, no. 4, pp. 387--396, 2005.

[11]

N. R. Tallent and A. Hoisie, "Palm: easing the burden of analytical performance modeling," in 2014 International Conference on Super-computing, June 2014, pp. 221--230.

Digital Library

[12]

H. Gahvari, A. H. Baker, M. Schulz, U. M. Yang, K. E. Jordan, and W. Gropp, "Modeling the performance of an algebraic multigrid cycle on HPC platforms," in Proceedings of the 25th International Conference on Supercomputing, June 2011, pp. 172--181.

Digital Library

[13]

H. Gahvari, W. Gropp, K. E. Jordan, M. Schulz, and U. M. Yang, "Modeling the performance of an algebraic multigrid cycle using hybrid mpi/openmp," in 41st International Conference on Parallel Processing, ICPP 2012, September 2012, pp. 128--137.

Digital Library

[14]

H. Gahvari and W. Gropp, "An introductory exascale feasibility study for ffts and multigrid," in Parallel Distributed Processing (IPDPS), 2010 IEEE International Symposium on, April 2010, pp. 1--9.

[15]

R. D. Falgout and U. M. Yang, "hypre: A library of high performance preconditioners," in International Conference on Computational Science, April 2002, pp. 632--641.

Digital Library

[16]

A. Bhatele, P. Jetley, H. Gahvari, L. Wesolowski, W. D. Gropp, and L. Kale, "Architectural constraints to attain 1 Exaflop/s for three scientific application classes," in Proceedings of the IEEE International Parallel & Distributed Processing Symposium, ser. IPDPS '11. IEEE Computer Society, May 2011. {Online}. Available:

Digital Library

[17]

S. Bhowmick, V. Eijkhouta, Y. Freund, E. Fuentes, and D. Keye, "Application of machine learning to the selection of sparse linear solvers," Int. J. High Perf. Comput. Appl, 2006.

[18]

S. Bhowmick, B. Toth, and P. Raghavan, "Towards low-cost, high-accuracy classifiers for linear solver selection," in Computational Science-ICCS 2009, 2009, pp. 463--472.

Digital Library

[19]

R. Nair, S. Bernstein, E. R. Jessup, and B. Norris, "Generating customized sparse eigenvalue solutions with lighthouse," in Proceedings of the Ninth International Multi-Conference on Computing in the Global Information Technology, June 2014.

[20]

G. Belter, E. R. Jessup, I. Karlin, and J. G. Siek, "Automating the generation of composed linear algebra kernels," in Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ser. SC '09, Nov 2009.

Digital Library

[21]

B. Norris, S. Bernstein, R. Nair, and E. R. Jessup, "Lighthouse: A user-centered web service for linear algebra software," CoRR, vol. abs/1408.1363, 2014. {Online}. Available: http://arxiv.org/abs/1408.1363

[22]

C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273--297, September 1995.

Digital Library

[23]

N. Jain, A. Bhatele, M. P. Robson, T. Gamblin, and L. V. Kale, "Predicting application performance using supervised learning on communication features," in ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '13. IEEE Computer Society, Nov. 2013, LLNL-CONF-635857.

Digital Library

[24]

A. Bhatele, A. R. Titus, J. J. Thiagarajan, N. Jain, T. Gamblin, P.-T. Bremer, M. Schulz, and L. V. Kale, "Identifying the culprits behind network congestion," in Proceedings of the IEEE International Parallel & Distributed Processing Symposium, ser. IPDPS '15. IEEE Computer Society, May 2015, LLNL-CONF-663150.

Digital Library

[25]

J. H. Friedman, "Greedy function approximation: A gradient boosting machine," The Annals of Statistics, vol. 29, no. 5, pp. pp. 1189--1232, 2001.

[26]

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Advances in Neural Information Processing Systems 26, 2013, pp. 3111--3119.

Digital Library

Cited By

Tang XZhai JQian XHe BXue WChen W(2018)vSensorACM SIGPLAN Notices10.1145/3200691.317849753:1(124-136)Online publication date: 10-Feb-2018
https://dl.acm.org/doi/10.1145/3200691.3178497
Tang XZhai JQian XHe BXue WChen WKrall AGross T(2018)vSensorProceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3178487.3178497(124-136)Online publication date: 10-Feb-2018
https://dl.acm.org/doi/10.1145/3178487.3178497

Data-driven performance modeling of linear solvers for sparse matrices
1. Computing methodologies
  1. Symbolic and algebraic manipulation
    1. Symbolic and algebraic algorithms

Recommendations

Parallel iterative solvers for sparse linear systems in circuit simulation

For the solution of sparse linear systems from circuit simulation whose coefficient matrices include a few dense rows and columns, a parallel iterative algorithm with distributed Schur complement preconditioning is presented. The parallel efficiency of ...
Accurate Symmetric Indefinite Linear Equation Solvers

The Bunch-Kaufman factorization is widely accepted as the algorithm of choice for the direct solution of symmetric indefinite linear equations; it is the algorithm employed in both LINPACK and LAPACK. It has also been adapted to sparse symmetric ...
Parallel block-diagonal-bordered sparse linear solvers for power systems applications

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

PMBS '16: Proceedings of the 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems

November 2016

123 pages

ISBN:9781509052189

Sponsors

SIGHPC: ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing
IEEE-CS\DATC: IEEE Computer Society

In-Cooperation

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

IEEE Press

Publication History

Published: 13 November 2016

Check for updates

Qualifiers

Research-article

Conference

SC16

Sponsor:

SIGHPC
IEEE-CS\DATC

SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis

November 13 - 18, 2016

Utah, Salt Lake City

Acceptance Rates

Overall Acceptance Rate 9 of 22 submissions, 41%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 22 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

Tang XZhai JQian XHe BXue WChen W(2018)vSensorACM SIGPLAN Notices10.1145/3200691.317849753:1(124-136)Online publication date: 10-Feb-2018
https://dl.acm.org/doi/10.1145/3200691.3178497
Tang XZhai JQian XHe BXue WChen WKrall AGross T(2018)vSensorProceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3178487.3178497(124-136)Online publication date: 10-Feb-2018
https://dl.acm.org/doi/10.1145/3178487.3178497

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

Media

Figures

Other

Tables

View Table of Contents