Abstract
The Chip Multiprocessor (CMP) will be the basic building block for computer systems ranging from laptops to supercomputers. New software developments at all levels are needed to fully utilize these systems. In this work, we evaluate the performance of different high-performance sparse LU factorization and triangular solution algorithms on several representative multicore machines. We include both pthreads and MPI implementations in this study and find that the pthreads implementation consistently delivers good performance and that a left-looking algorithm is usually superior.
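To make the term "left-looking" concrete, the following is a minimal dense sketch, not the supernodal, pivoted, parallel code evaluated in the paper. It assumes a small square matrix stored as a list of lists and performs no pivoting, so it fails on a zero pivot. The function names (`left_looking_lu`, `forward_solve`, `backward_solve`) are hypothetical and chosen only for illustration. Each column j is first updated by all previously computed columns to its left, and only then scaled; the two triangular solves then recover the solution of Ax = b.

```python
def left_looking_lu(A):
    """Dense left-looking LU without pivoting (illustrative assumption).

    Returns (L, U) with L unit lower triangular and U upper triangular
    such that A = L * U.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Work on a copy of column j of A.
        col = [float(A[i][j]) for i in range(n)]
        # "Look left": apply every finished column k < j to column j.
        for k in range(j):
            # At this point col[k] already equals U[k][j].
            for i in range(k + 1, n):
                col[i] -= L[i][k] * col[k]
        # The top part of the updated column is column j of U.
        for i in range(j + 1):
            U[i][j] = col[i]
        if U[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        # The scaled bottom part is column j of L.
        for i in range(j + 1, n):
            L[i][j] = col[i] / U[j][j]
    return L, U


def forward_solve(L, b):
    """Solve L y = b for unit lower triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    return y


def backward_solve(U, y):
    """Solve U x = y for upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


if __name__ == "__main__":
    A = [[4.0, 3.0], [6.0, 3.0]]
    b = [10.0, 12.0]
    L, U = left_looking_lu(A)
    x = backward_solve(U, forward_solve(L, b))
    print(L, U, x)
```

In a sparse setting, the inner update loop would touch only the nonzero entries of each column, and a production solver would also add pivoting for numerical stability; both are omitted here for brevity.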
This research was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X.S. (2008). Evaluation of Sparse LU Factorization and Triangular Solution on Multicore Platforms. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_26
DOI: https://doi.org/10.1007/978-3-540-92859-1_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92858-4
Online ISBN: 978-3-540-92859-1
eBook Packages: Computer Science, Computer Science (R0)