Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/3014904.3014927acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
research-article

Block iterative methods and recycling for improved scalability of linear solvers

Published: 13 November 2016 Publication History

Abstract

Contemporary large-scale Partial Differential Equation (PDE) simulations usually require the solution of large and sparse linear systems. Moreover, it is often needed to solve these linear systems with different or multiple Right-Hand Sides (RHSs). In this paper, various strategies will be presented to extend the scalability of existing multigrid or domain decomposition linear solvers using appropriate recycling strategies or block methods---i.e., by treating multiple right-hand sides simultaneously.
The scalability of this work is assessed by performing simulations on up to 8,192 cores for solving linear systems arising from various physical phenomena modeled by Poisson's equation, the system of linear elasticity, or Maxwell's equation.
This work is shipped as part of on open-source software, readily available and usable in any C/C++, Python, or Fortran code. In particular, some simulations are performed on top of a well-established library, PETSc, and it is shown how our approaches can be used to decrease time to solution down by 30%.

References

[1]
H. Sundar, G. Biros, C. Burstedde, J. Rudi, O. Ghattas, and G. Stadler, "Parallel Geometric-Algebraic Multigrid on Unstructured Forests of Octrees," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC12. IEEE Computer Society Press, 2012.
[2]
J. Rudi, A. C. I. Malossi, T. Isaac, G. Stadler, M. Gurnis, P. W. Staar, Y. Ineichen, C. Bekas, A. Curioni, and O. Ghattas, "An Extreme-Scale Implicit Solver for Complex PDEs: Highly Heterogeneous Flow in Earth's Mantle," in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC15. ACM, 2015.
[3]
P. Jolivet, F. Hecht, F. Nataf, and C. Prud'homme, "Scalable Domain Decomposition Preconditioners for Heterogeneous Elliptic Problems," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC13. ACM, 2013.
[4]
A. Klawonn and O. Rheinbach, "Highly scalable parallel domain decomposition methods with an application to biomechanics," ZAMM-Journal of Applied Mathematics and Mechanics, vol. 90, no. 1, pp. 5--32, 2010.
[5]
U. M. Yang, Parallel Algebraic Multigrid Methods---High Performance Preconditioners. Springer, 2006.
[6]
V. Dolean, P. Jolivet, and F. Nataf, An Introduction to Domain Decomposition Methods: Algorithms, Theory, and Parallel Implementation. SIAM, 2015, vol. 144.
[7]
Y. Saad and M. H. Schultz, "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems," SIAM Journal on Scientific and Statistical Computing, vol. 7, no. 3, pp. 856--869, 1986.
[8]
M. R. Hestenes and E. Stiefel, "Methods of Conjugate Gradients for Solving Linear Systems," Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409--436, 1952.
[9]
P. Ghysels, T. J. Ashby, K. Meerbergen, and W. Vanroose, "Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines," SIAM Journal on Scientific Computing, vol. 35, no. 1, pp. C48--C71, 2013.
[10]
P. Sanan, S. M. Schnepp, and D. May, "Pipelined, Flexible Krylov Subspace Methods," SIAM Journal on Scientific Computing, 2016.
[11]
A. T. Chronopoulos, "s-Step Iterative Methods for (Non)Symmetric (In)Definite Linear Systems," SIAM Journal on Numerical Analysis, vol. 28, no. 6, pp. 1776--1789, 1991.
[12]
M. Hoemmen, "Communication-avoiding Krylov subspace methods," Ph.D. dissertation, University of California, Berkeley, 2010.
[13]
P. Soudais, "Iterative Solution of a 3-D Scattering Problem from Arbitrary Shaped Multidielectric and Multiconducting Bodies," IEEE Transactions on Antennas and Propagation, vol. 42, no. 7, pp. 954--959, 1994.
[14]
A. H. Baker, J. M. Dennis, and E. R. Jessup, "On Improving Linear Solver Performance: A Block Variant of GMRES," SIAM Journal on Scientific Computing, vol. 27, no. 5, pp. 1608--1626, 2006.
[15]
L. Giraud, S. Gratton, and E. Martin, "Incremental spectral preconditioners for sequences of linear systems," Applied Numerical Mathematics, vol. 57, no. 11, pp. 1164--1180, 2007.
[16]
P. Gosselet, C. Rey, and J. Pebrel, "Total and selective reuse of Krylov subspaces for the resolution of sequences of nonlinear structural problems," International Journal for Numerical Methods in Engineering, vol. 94, no. 1, pp. 60--83, 2013.
[17]
V. Simoncini and E. Gallopoulos, "Convergence properties of block GMRES and matrix polynomials," Linear Algebra and its Applications, vol. 247, pp. 97--119, 1996.
[18]
J. Langou, "Solving large linear systems with multiple right-hand sides," Ph.D. dissertation, INSA de Toulouse, 2003.
[19]
M. Robbé and M. Sadkane, "Exact and inexact breakdowns in the block GMRES method," Linear Algebra and its Applications, vol. 419, no. 1, pp. 265--285, 2006.
[20]
R. B. Morgan, "Restarted block-GMRES with deflation of eigenvalues," Applied Numerical Mathematics, vol. 54, no. 2, pp. 222--236, 2005.
[21]
E. Agullo, L. Giraud, and Y.-F. Jing, "Block GMRES method with inexact breakdowns and deflated restarting," SIAM Journal on Matrix Analysis and Applications, vol. 35, no. 4, pp. 1625--1651, 2014.
[22]
M. L. Parks, E. de Sturler, G. Mackey, D. D. Johnson, and S. Maiti, "Recycling Krylov Subspaces for Sequences of Linear Systems," SIAM Journal on Scientific Computing, vol. 28, no. 5, pp. 1651--1674, 2006.
[23]
S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, and H. Zhang, "PETSc web page," http://www.mcs.anl.gov/petsc, 2016.
[24]
M. F. Adams, H. H. Bayraktar, T. M. Keaveny, and P. Papadopoulos, "Ultrascalable Implicit Finite Element Analyses in Solid Mechanics With Over a Half a Billion Degrees of Freedom," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC04. IEEE Computer Society, 2004.
[25]
M. J. Gander, "Optimized Schwarz Methods," SIAM Journal on Numerical Analysis, vol. 44, no. 2, pp. 699--731, 2006.
[26]
C. Japhet and F. Nataf, "The best interface conditions for domain decomposition methods: absorbing boundary conditions," Absorbing Boundaries and Layers, Domain Decomposition Methods: Applications to Large Scale Computers, pp. 348--373, 2001.
[27]
J. E. Hicken and D. W. Zingg, "A Simplified and Flexible Variant of GCROT for Solving Nonsymmetric Linear Systems," SIAM Journal on Scientific Computing, vol. 32, no. 3, pp. 1672--1694, 2010.
[28]
L. M. Carvalho, S. Gratton, R. Lago, and X. Vasseur, "A Flexible Generalized Conjugate Residual Method With Inner Orthogonalization and Deflated Restarting," SIAM Journal on Matrix Analysis and Applications, vol. 32, no. 4, pp. 1212--1235, 2011.
[29]
E. de Sturler, "Nested Krylov methods based on GCR," Journal of Computational and Applied Mathematics, vol. 67, no. 1, pp. 15--41, 1996.
[30]
L. Giraud, S. Gratton, X. Pinel, and X. Vasseur, "Flexible GMRES with Deflated Restarting," SIAM Journal on Scientific Computing, vol. 32, no. 4, pp. 1858--1878, 2010.
[31]
R. B. Morgan, "GMRES with Deflated Restarting," SIAM Journal on Scientific Computing, vol. 24, no. 1, pp. 20--37, 2002.
[32]
D. P. O'Leary, "The Block Conjugate Gradient Algorithm and Related Methods," Linear Algebra and its Applications, vol. 29, pp. 293--322, 1980.
[33]
B. Vital, "Étude de quelques méthodes de résolution de problemes linéaires de grande taille sur multiprocesseur," Ph.D. dissertation, Université Rennes 1, 1990.
[34]
D. Darnell, R. B. Morgan, and W. Wilcox, "Deflated GMRES for systems with multiple shifts and multiple right-hand sides," Linear Algebra and its Applications, vol. 429, no. 10, pp. 2415--2434, 2008.
[35]
J. Meng, H.-B. Li, and Y.-F. Jing, "A new deflated block GCROT(m, k) method for the solution of linear systems with multiple right-hand sides," Journal of Computational and Applied Mathematics, 2016.
[36]
R. Yu, E. de Sturler, and D. D. Johnson, "A Block Iterative Solver for Complex Non-Hermitian Systems Applied to Large-Scale Electronic-Structure Calculations," University of Illinois at Urbana-Champaign, Department of Computer Science, Technical Report UIUCDCS-R-2002--2299, 2002.
[37]
H. Calandra, S. Gratton, J. Langou, X. Pinel, and X. Vasseur, "Flexible Variants of Block Restarted GMRES Methods with Application to Geophysics," SIAM Journal on Scientific Computing, vol. 34, no. 2, pp. A714--A736, 2012.
[38]
T. Sakurai, H. Tadano, and Y. Kuramashi, "Application of block Krylov subspace algorithms to the Wilson-Dirac equation with multiple right-hand sides in lattice QCD," Computer Physics Communications, vol. 181, no. 1, pp. 113--117, 2010.
[39]
R. Falgout and U. Yang, "hypre: A library of high performance preconditioners," Computational Science---ICCS 2002, pp. 632--641, 2002.
[40]
P. Bastian, F. Heimann, and S. Marnach, "Generic implementation of finite element methods in the distributed and unified numerics environment (DUNE)," Kybernetika, vol. 46, no. 2, pp. 294--315, 2010.
[41]
D. Lukarski and N. Trost, "PARALUTION web page," http://www.parallution.com, 2016.
[42]
A. H. Baker, E. R. Jessup, and T. Manteuffel, "A Technique for Accelerating the Convergence of Restarted GMRES," SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 4, pp. 962--984, 2005.
[43]
J. Erhel, K. Burrage, and B. Pohl, "Restarted GMRES preconditioned by deflation," Journal of computational and applied mathematics, vol. 69, no. 2, pp. 303--318, 1996.
[44]
D. N. Wakam and F. Pacull, "Memory efficient hybrid algebraic solvers for linear systems arising from compressible flows," Computers & Fluids, vol. 80, pp. 158--167, 2013.
[45]
M. A. Heroux, R. A. Bartlett, V. E. Howle, R. J. Hoekstra, J. J. Hu, T. G. Kolda, R. B. Lehoucq, K. R. Long, R. P. Pawlowski, E. T. Phipps et al., "An overview of the Trilinos project," ACM Transactions on Mathematical Software (TOMS), vol. 31, no. 3, pp. 397--423, 2005.
[46]
E. Bavier, M. Hoemmen, S. Rajamanickam, and H. Thornquist, "Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems," Scientific Programming, vol. 20, no. 3, pp. 241--255, 2012.
[47]
M. Mohiyuddin, M. Hoemmen, J. Demmel, and K. Yelick, "Minimizing Communication in Sparse Matrix Solvers," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC09. ACM, 2009.
[48]
A. Stathopoulos and K. Wu, "A Block Orthogonalization Procedure with Constant Synchronization Requirements," SIAM Journal on Scientific Computing, vol. 23, no. 6, pp. 2165--2182, 2002.
[49]
I. Yamazaki, S. Tomov, and J. Dongarra, "Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs," SIAM Journal on Scientific Computing, vol. 37, no. 3, pp. C307--C330, 2015.
[50]
D. L. Brown, R. Cortez, and M. L. Minion, "Accurate Projection Methods for the Incompressible Navier-Stokes Equations," Journal of Computational Physics, vol. 168, no. 2, pp. 464--499, 2001.
[51]
L. Feng, P. Benner, and J. G. Korvink, "Parametric model order reduction accelerated by subspace recycling," in Proceedings of the 48th IEEE Conference on Decision and Control. IEEE, 2009, pp. 4328--4333.
[52]
H. C. Elman, O. G. Ernst, and D. P. O'Leary, "A Multigrid Method Enhanced by Krylov Subspace Iteration for Discrete Helmholtz Equations," SIAM Journal on Scientific Computing, vol. 23, no. 4, pp. 1291--1315, 2001.
[53]
S. Cools, B. Reps, and W. Vanroose, "An Efficient Multigrid Calculation of the Far Field Map for Helmholtz and Schrödinger Equations," SIAM Journal on Scientific Computing, vol. 36, no. 3, pp. B367--B395, 2014.
[54]
A. Amritkar, E. de Sturler, K. Świrydowicz, D. Tafti, and K. Ahuja, "Recycling Krylov subspaces for CFD applications and a new hybrid recycling solver," Journal of Computational Physics, vol. 303, pp. 222--237, 2015.
[55]
S. Wang, E. de Sturler, and G. H. Paulino, "Large-scale topology optimization using preconditioned Krylov subspace methods with recycling," International Journal for Numerical Methods in Engineering, vol. 69, no. 12, pp. 2441--2468, 2007.
[56]
J.-C. Nédélec, "Mixed finite elements in R<sup>3</sup>," Numerische Mathematik, vol. 35, no. 3, pp. 315--341, 1980.
[57]
D. Boffi, P. Fernandes, L. Gastaldi, and I. Perugia, "Computational Models of Electromagnetic Resonators: Analysis of Edge Element Approximation," SIAM Journal on Numerical Analysis, vol. 36, no. 4, pp. 1264--1290, 1999.
[58]
F. Rapetti, "High-order edge elements on simplicial meshes," ESAIM: Mathematical Modelling and Numerical Analysis, vol. 41, no. 06, pp. 1001--1020, 2007.
[59]
S. Semenov, B. Seiser, E. Stoegmann, and E. Auff, "Electromagnetic Tomography for Brain Imaging: From Virtual to Human Brain," in 2014 IEEE Conference on Antenna Measurements & Applications. IEEE, 2014.
[60]
X.-C. Cai and M. Sarkis, "A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems," SIAM Journal on Scientific Computing, vol. 21, no. 2, pp. 792--797, 1999.
[61]
V. Dolean, M. J. Gander, and L. Gerardo-Giorda, "Optimized Schwarz Methods for Maxwell's Equations," SIAM Journal on Scientific Computing, vol. 31, no. 3, pp. 2193--2213, 2009.
[62]
F. Hecht, "New development in FreeFem++," Journal of Numerical Mathematics, vol. 20, no. 3--4, pp. 251--266, 2012.
[63]
P. Jolivet, V. Dolean, F. Hecht, F. Nataf, C. Prud'homme, and N. Spillane, "High-performance domain decomposition methods on massively parallel architectures with FreeFem++," Journal of Numerical Mathematics, vol. 20, no. 3--4, pp. 287--302, 2012.
[64]
C. Prudhomme, V. Chabannes, V. Doyeux, M. Ismail, A. Samake, and G. Pena, "Feel++: A Computational Framework for Galerkin Methods and Advanced Numerical Methods," in ESAIM: Proceedings, vol. 38. EDP Sciences, 2012, pp. 429--455.
[65]
M. H. Gutknecht, "Block Krylov space methods for linear systems with multiple right-hand sides: an introduction," in Modern Mathematical Models, Methods and Algorithms for Real World Systems, A. Siddiqi, I. Duff, and O. Christensen, Eds. New Delhi, India: Anamaya Publishers, 2006, pp. 420--447.
[66]
M. Bonazzoli and F. Rapetti, "High-order finite elements in numerical electromagnetism: degrees of freedom and generators in duality," Numerical Algorithms, 2016.
[67]
O. Schenk and K. Gärtner, "Solving unsymmetric sparse systems of linear equations with PARDISO," Future Generation Computer Systems, vol. 20, no. 3, pp. 475--487, 2004.
[68]
Intel, "MKL web page," https://software.intel.com/en-us/intel-mkl, 2016.
[69]
A. Suzuki and F.-X. Roux, "A dissection solver with kernel detection for symmetric finite element matrices on shared memory computers," International Journal for Numerical Methods in Engineering, vol. 100, no. 2, pp. 136--164, 2014.
[70]
K. Goto and R. Van De Geijn, "High-Performance Implementation of the Level-3 BLAS," ACM Transactions on Mathematical Software, vol. 35, no. 1, p. 4, 2008.
[71]
H. Anzt, S. Tomov, and J. Dongarra, "Accelerating the LOBPCG Method on GPUs Using a Blocked Sparse Matrix Vector Product," in Proceedings of the Symposium on High Performance Computing, ser. HPC '15. Society for Computer Simulation International, 2015, pp. 75--82.
[72]
T. V. Kolev and P. Vassilevski, "Parallel Auxiliary Space AMG for H(curl) Problems," Journal of Computational Mathematics, vol. 27, no. 5, pp. 604--623, 2009.
[73]
J. J. Hu, R. S. Tuminaro, P. B. Bochev, C. J. Garasi, and A. C. Robinson, "Toward an h-Independent Algebraic Multigrid Method for Maxwell's Equations," SIAM Journal on Scientific Computing, vol. 27, no. 5, pp. 1669--1688, 2006.
[74]
F. Pellegrini and J. Roman, "SCOTCH: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs," in High-Performance Computing and Networking. Springer, 1996, pp. 493--498.
[75]
H. Calandra, S. Gratton, R. Lago, X. Vasseur, and L. M. Carvalho, "A Modified Block Flexible GMRES Method with Deflation at Each Iteration for the Solution of Non-Hermitian Linear Systems with Multiple Right-Hand Sides," SIAM Journal on Scientific Computing, vol. 35, no. 5, pp. S345--S367, 2013.
[76]
P.-H. Tournier, I. Aliferis, M. Bonazzoli, M. de Buhan, M. Darbas, V. Dolean, F. Hecht, P. Jolivet, I. El Kanfoud, C. Migliaccio, F. Nataf, C. Pichot, and S. Semenov, "Microwave Tomographic Imaging of Cerebrovascular Accidents by Using High-Performance Computing," 2016.

Cited By

View all
  • (2021)Recycling Krylov Subspaces and Truncating Deflation Subspaces for Solving Sequence of Linear SystemsACM Transactions on Mathematical Software10.1145/343974647:2(1-30)Online publication date: 20-Apr-2021

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2016
1034 pages
ISBN:9781467388153
  • Conference Chair:
  • John West

Sponsors

In-Cooperation

Publisher

IEEE Press

Publication History

Published: 13 November 2016

Check for updates

Author Tags

  1. distributed algorithms
  2. iterative methods
  3. maxwell's equation

Qualifiers

  • Research-article

Conference

SC16
Sponsor:

Acceptance Rates

SC '16 Paper Acceptance Rate 81 of 442 submissions, 18%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)25
  • Downloads (Last 6 weeks)1
Reflects downloads up to 10 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Recycling Krylov Subspaces and Truncating Deflation Subspaces for Solving Sequence of Linear SystemsACM Transactions on Mathematical Software10.1145/343974647:2(1-30)Online publication date: 20-Apr-2021

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media