A mixed precision LOBPCG algorithm

Daniel Kressner¹,
Yuxin Ma² &
Meiyue Shao^3,4

467 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

The locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm is a popular approach for computing a few smallest eigenvalues and the corresponding eigenvectors of a large Hermitian positive definite matrix $A$. In this work, we propose a mixed precision variant of LOBPCG that uses a (sparse) Cholesky factorization of $A$ computed in lower precision as the preconditioner. To further enhance performance, a mixed precision orthogonalization strategy is proposed. To analyze the impact of reducing precision in the preconditioner on performance, we carry out a rounding error and convergence analysis of PINVIT, a simplified variant of LOBPCG. Our theoretical results predict and our numerical experiments confirm that the impact on convergence remains marginal. In practice, our mixed precision LOBPCG algorithm typically reduces the computation time by a factor of $1.4$–$2.0$ on both CPUs and GPUs.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU

Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs

Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

Data Availability

The results/data/figures in this manuscript have not been published elsewhere, nor are they under consideration (from you or one of your Contributing Authors) by another publisher. All of the material is owned by the authors and/or no permissions are required.

Notes

https://sparse.tamu.edu
For MPLOBPCG-schol, the number of LOBPCG iterations for constructing the initial guess is also counted.

References

Balcan, D., Gonçalves, B., Hu, H., Ramasco, J.J., Colizza, V., Vespignani, A.: Modeling the spatial spread of infectious diseases: the GLobal Epidemic and Mobility computational model. J. Comput. Sci. 1(3), 132–145 (2010). https://doi.org/10.1016/j.jocs.2010.07.002
Article Google Scholar
Knyazev, A.: Recent implementations, applications, and extensions of the locally optimal block preconditioned conjugate gradient method (LOBPCG). arXiv:1708.08354(2017)
Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Revised SIAM, Philadelphia, PA, USA (2011)
Book MATH Google Scholar
Neymeyr, K.: A geometric theory for preconditioned inverse iteration applied to a subspace. Math. Comp. 71(237), 197–216 (2002). https://doi.org/10.1090/S0025-5718-01-01357-6
Article MathSciNet MATH Google Scholar
Argentati, M., Knyazev, A., Neymeyr, K., Ovtchinnikov, E., Zhou, M.: Convergence theory for preconditioned eigenvalue solvers in a nutshell. Found. Comput. Math. 17, 713–727 (2017). https://doi.org/10.1007/s10208-015-9297-1
Article MathSciNet MATH Google Scholar
Knyazev, A.V.: Toward the optimal preconditioned eigensolver: Locally optimal block preconditioned conjugate gradient method. SIAM J. Sci. Comput. 23(2), 517–541 (2001). https://doi.org/10.1137/S1064827500366124
Article MathSciNet MATH Google Scholar
Abdelfattah, A., Anzt, H., Boman, E.G., Carson, E., Cojean, T., Dongarra, J., Fox, A., Gates, M., Higham, N.J., Li, X.S., Loe, J., Luszczek, P., Pranesh, S., Rajamanickam, S., Ribizel, T., Smith, B.F., Swirydowicz, K., Thomas, S., Tomov, S., Tsai, Y.M., Yang, U.M.: A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4), 344–369 (2021). https://doi.org/10.1177/10943420211003313
Article Google Scholar
Higham, N.J., Mary, T.: Mixed precision algorithms in numerical linear algebra. Acta Numer. 31, 347–414 (2022). https://doi.org/10.1017/S0962492922000022
Article MathSciNet MATH Google Scholar
Carson, E., Higham, N.J.: Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM J. Sci. Comput. 40(2), 817–847 (2018). https://doi.org/10.1137/17M1140819
Article MathSciNet MATH Google Scholar
Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decomposition. Japan J. Indust. Appl. Math. 35(3), 1007–1035 (2018). https://doi.org/10.1007/s13160-018-0310-3
Article MathSciNet MATH Google Scholar
Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues. Japan J. Indust. Appl. Math. 36(2), 435–459 (2019). https://doi.org/10.1007/s13160-019-00348-4
Article MathSciNet MATH Google Scholar
Ogita, T., Aishima, K.: Iterative refinement for singular value decomposition based on matrix multiplication. J. Comput. Appl. Math. 369, 112512 (2020). https://doi.org/10.1016/j.cam.2019.112512
Article MathSciNet MATH Google Scholar
Bujanović, Z., Kressner, D., Schröder, C.: Iterative refinement of Schur decompositions. Numer. Algorithms 92(1), 247–267 (2023). https://doi.org/10.1007/s11075-022-01327-6
Article MathSciNet MATH Google Scholar
Gao, W., Ma, Y., Shao, M.: A mixed precision Jacobi SVD algorithm. arXiv:2209.04626 (2022)
Dongarra, J.J.: Algorithm 589: SICEDR: A FORTRAN subroutine for improving the accuracy of computed matrix eigenvalues. ACM Trans. Math. Software 8(4), 371–375 (1982). https://doi.org/10.1145/356012.356016
Article MATH Google Scholar
Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore, MD, USA (2013)
Book MATH Google Scholar
Duersch, J.A., Shao, M., Yang, C., Gu, M.: A robust and efficient implementation of LOBPCG. SIAM J. Sci. Comput. 40(5), 655–676 (2018). https://doi.org/10.1137/17M1129830
Article MathSciNet MATH Google Scholar
Hetmaniuk, U., Lehoucq, R.: Basis selection in LOBPCG. J. Comput. Phys. 218(1), 324–332 (2006). https://doi.org/10.1016/j.jcp.2006.02.007
Article MathSciNet MATH Google Scholar
Yamazaki, I., Tomov, S., Dongarra, J.: Mixed-precision Cholesky QR factorization and its case studies on multicore CPU with multiple GPUs. SIAM J. Sci. Comput. 37(3), 307–330 (2015). https://doi.org/10.1137/14M0973773
Article MathSciNet MATH Google Scholar
Yamazaki, I., Tomov, S., Kurzak, J., Dongarra, J., Barlow, J.: Mixed-precision block Gram Schmidt orthogonalization. In: ScalA ’15: Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 2–128 (2015). https://doi.org/10.1145/2832080.2832082
Rohwedder, T., Schneider, R., Zeiser, A.: Perturbed preconditioned inverse iteration for operator eigenvalue problems with applications to adaptive wavelet discretization. Adv. Comput. Math. 34(1), 43–66 (2011)
Article MathSciNet MATH Google Scholar
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia, PA, USA (2002)
Book MATH Google Scholar
Chen, Y., Davis, T.A., Hager, W.W., Rajamanickam, S.: Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Trans. Math. Software 35(3), 22–12214 (2008). https://doi.org/10.1145/1391989.1391995
Article MathSciNet Google Scholar
Shao, M., Oryspayev, D., Yang, C., Maris, P., Cook, B.: Fault-tolerant LOBPCG for nuclear CI calculations. In: International Conference on High Performance Computing in Asia-Pacific Region (HPC ASIA 2023), February 27–March 2, 2023, Singapore, Singapore, pp. 88–95. ACM, New York, NY, USA (2023). https://doi.org/10.1145/3578178.3578240
Fadel, S., Ghoniemy, S., Abdallah, M., Sorra, H.A., Ashour, A., Ansary, A.: Investigating the effect of different kernel functions on the performance of SVM for recognizing Arabic characters. Int. J. Adv. Comput. Sci. Appl. 7(1) (2016). https://doi.org/10.14569/IJACSA.2016.070160

Download references

Acknowledgements

The authors thank Erin Carson for helpful discussions. Part of this work was performed when the second author was visiting EPF Lausanne in 2022.

Funding

Yuxin Ma is partially supported by the State Scholarship Fund of China Scholarship Council (CSC) under Grant No. 202106100093, National Key R &D Program of China under Grant No. 2021YFA1003305 and National Natural Science Foundation of China under Grant No. 71991471. Meiyue Shao is partially supported by the National Natural Science Foundation of China under grant No. 11971118.

Author information

Authors and Affiliations

Institute of Mathematics, EPFL, Lausanne, CH-1015, Switzerland
Daniel Kressner
School of Mathematical Sciences, Fudan University, Shanghai, 200433, China
Yuxin Ma
School of Data Science, Fudan University, Shanghai, 200433, China
Meiyue Shao
MOE Key Laboratory for Computational Physical Sciences, Fudan University, Shanghai, 200433, China
Meiyue Shao

Authors

Daniel Kressner
View author publications
You can also search for this author in PubMed Google Scholar
Yuxin Ma
View author publications
You can also search for this author in PubMed Google Scholar
Meiyue Shao
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

These authors contributed equally to this work.

Corresponding author

Correspondence to Yuxin Ma.

Ethics declarations

Ethics approval

Not applicable

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Daniel Kressner, Yuxin Ma and Meiyue Shao are contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Kressner, D., Ma, Y. & Shao, M. A mixed precision LOBPCG algorithm. Numer Algor 94, 1653–1671 (2023). https://doi.org/10.1007/s11075-023-01550-9

Download citation

Received: 24 February 2023
Accepted: 30 March 2023
Published: 11 May 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11075-023-01550-9

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU

Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs

Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

Data Availability

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval

Competing interests

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification (2010)

Subscribe and save

Buy Now

Navigation

A mixed precision LOBPCG algorithm

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU

Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs

Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

Data Availability

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval

Competing interests

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification (2010)

Subscribe and save

Buy Now

Search

Navigation