A Riemannian Optimization Approach to Clustering Problems

Published in: Journal of Scientific Computing

Abstract

This paper considers the optimization problem

$$\begin{aligned} \min _{X \in \mathcal {F}_v} f( {X}) + \lambda \Vert X\Vert _1, \end{aligned}$$

where f is smooth, \(\mathcal {F}_v = \{X \in \mathbb {R}^{n \times q}: X^T X = I_q, v \in \textrm{span}(X)\}\), and v is a given positive vector. Clustering models, including but not limited to those used by k-means, community detection, and normalized cut, can be reformulated as optimization problems of this form. It is proven that the domain \(\mathcal {F}_v\) forms a compact embedded submanifold of \(\mathbb {R}^{n \times q}\), and optimization-related tools are derived, including a family of computationally efficient retractions and an orthonormal basis of any normal space of \(\mathcal {F}_v\). A Riemannian proximal gradient method that allows an adaptive step size is proposed; it solves its subproblem inexactly while still guaranteeing global convergence. Numerical experiments on community detection in networks and on normalized cut for image segmentation demonstrate the performance of the proposed method.
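To make the feasible set concrete, the sketch below constructs a point of \(\mathcal {F}_v\) and verifies both constraints numerically. This is an illustration assuming NumPy, not code from the paper; the names `is_feasible` and the QR-based construction are the author's own choices here (any orthonormal basis whose span contains v gives a feasible point).

```python
import numpy as np

def is_feasible(X, v, tol=1e-8):
    """Check membership in F_v = {X : X^T X = I_q, v in span(X)}."""
    n, q = X.shape
    orth = np.linalg.norm(X.T @ X - np.eye(q)) < tol
    # Since X has orthonormal columns, v lies in span(X) iff X X^T v = v.
    in_span = np.linalg.norm(X @ (X.T @ v) - v) < tol
    return bool(orth and in_span)

# Build a feasible point: put v in the first column, then complete to an
# orthonormal basis with a (reduced) QR factorization.
rng = np.random.default_rng(0)
n, q = 6, 3
v = rng.random(n) + 0.1                       # a positive vector, as required
A = np.column_stack([v, rng.standard_normal((n, q - 1))])
X, _ = np.linalg.qr(A)                        # span(X) = span(A) contains v
print(is_feasible(X, v))                      # True
```

Note that a coordinate-axis frame such as the first q columns of the identity is orthonormal but generically fails the span condition, since a positive v has no zero entries.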


Data Availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Notes

  1. In some cases, the proximal mapping in [41] can be solved efficiently without resorting to the semi-smooth Newton algorithm, see [41, Section 5.2].

  2. Throughout this paper, the computational complexity is measured by flop counts. A flop is a floating point operation [28, Section 1.2.4].

  3. Note that the local defining function only requires \(\textrm{D}h(Y)\) to be full rank at \(Y = X\). Here, we prove a stronger result.

  4. The \(\lambda \) in I-AManPG increases by 0.01, 0.04, and 0.2.

  5. The implementation in [17] is for unweighted kernel k-means. We modified it for weighted kernel k-means.
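As background for the proximal mapping mentioned in Note 1: in the Euclidean case, the proximal mapping of \(\lambda \Vert \cdot \Vert _1\) is elementwise soft-thresholding and needs no iterative solver; it is the restriction of this subproblem to a tangent space (as in [41]) that may require the semi-smooth Newton algorithm. A minimal sketch of the Euclidean case, assuming NumPy (the name `prox_l1` is illustrative):

```python
import numpy as np

def prox_l1(Z, lam):
    # Euclidean proximal mapping of lam * ||.||_1:
    #   argmin_X 0.5 * ||X - Z||_F^2 + lam * ||X||_1,
    # which decouples into elementwise soft-thresholding.
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

Z = np.array([[1.5, -0.2],
              [-3.0, 0.4]])
print(prox_l1(Z, 0.5))   # large entries shrink by 0.5; small ones become 0
```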

References

  1. Absil, P.-A., Baker, C.G., Gallivan, K.A.: Trust-region methods on Riemannian manifolds. Found. Comput. Math. 7(3), 303–330 (2007)


  2. Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)


  3. Absil, P.-A., Malick, J.: Projection-like retractions on matrix manifolds. SIAM J. Optim. 22(1), 135–158 (2012)


  4. Bach, F.R., Jordan, M.I.: Learning spectral clustering. In: Proceedings of the 16th International Conference on Neural Information Processing Systems, NIPS'03, pp. 305–312. MIT Press, Cambridge, MA, USA (2003)

  5. Baker, C.G.: Riemannian manifold trust-region methods with applications to eigenproblems. PhD thesis, Florida State University, School of Computational Science (2008)

  6. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542


  7. Beck, A.: First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA (2017)

  8. Bento, G.C., Ferreira, O.P., Melo, J.G.: Iteration-complexity of gradient, subgradient and proximal point methods on Riemannian manifolds. J. Optim. Theory Appl. 173(2), 548–562 (2017)


  9. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10008 (2008)


  10. Boothby, W.M.: An Introduction to Differentiable Manifolds and Riemannian Geometry, 2nd edn. Academic Press, Cambridge (1986)


  11. Boumal, N.: An introduction to optimization on smooth manifolds (2022)

  12. Boumal, N., Absil, P.A., Cartis, C.: Global rates of convergence for nonconvex optimization on manifolds. IMA J. Numer. Anal. 39(1), 1–33 (2018)


  13. Bounova, G.: Matlab Tools for Network Analysis (2009). http://strategic.mit.edu/downloads.php?page=matlab_networks

  14. Boutsidis, C., Drineas, P., Mahoney, M.W.: Unsupervised feature selection for the \( k \)-means clustering problem. In: Advances in Neural Information Processing Systems, pp. 153–161 (2009)

  15. Carson, T., Mixon, D.G., Villar, S.: Manifold optimization for k-means clustering. In: 2017 International Conference on Sampling Theory and Applications (SampTA), pp. 73–77. IEEE (2017)

  16. Chan, P.K., Schlag, F.: Spectral k-way ratio-cut partitioning and clustering. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 13(9), 1088–1096 (1994)


  17. Chen, M.: Pattern recognition and machine learning toolbox (2021)

  18. Chen, S., Ma, S., So, A.M.-C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)


  19. Cherian, A., Sra, S.: Riemannian dictionary learning and sparse coding for positive definite matrices. IEEE Trans. Neural Netw. Learn. Syst. 28(12), 2859–2871 (2017)


  20. Clarke, F.H.: Optimization and Nonsmooth Analysis. Classics in Applied Mathematics. SIAM, Philadelphia (1990)

  21. Danon, L., Diaz-Guilera, A., Arenas, A.: The effect of size heterogeneity on community identification in complex networks. J. Stat. Mech. Theory Exp. 2006(11), P11010 (2006)


  22. Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005(09), P09008 (2005)


  23. Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the 10th Association for Computing Machinery(ACM) Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) International Conference on Knowledge Discovery and Data Mining (2004)

  24. Dhillon, I., Guan, Y., Kulis, B.: A unified view of kernel k-means, spectral clustering and graph cuts. Technical report, Department of Computer Sciences, University of Texas at Austin (2005)

  25. do Carmo, M.P.: Riemannian Geometry. Mathematics: Theory & Applications (1992)

  26. Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1998). https://doi.org/10.1137/S0895479895290954


  27. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)


  28. Golub, G.H., Van Loan, C.F.: Matrix computations, 3rd edn. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press (1996)

  29. Grohs, P., Hosseini, S.: \(\epsilon \)-subgradient algorithms for locally Lipschitz functions on Riemannian manifolds. Adv. Comput. Math. (2015). https://doi.org/10.1007/s10444-015-9426-z


  30. Grohs, P., Hosseini, S.: Nonsmooth trust region algorithms for locally Lipschitz functions on Riemannian manifolds. IMA J. Numer. Anal. (2015). https://doi.org/10.1093/imanum/drv043


  31. Harris, J.: Algebraic Geometry: A First Course, vol. 133. Springer Science & Business Media, Berlin (1992)


  32. Hartigan, J.A., Wong, M.A.: A K-means clustering algorithm. Appl. Stat. 28(1), 100–108 (1979)


  33. Hosseini, S., Huang, W., Yousefpour, R.: Line search algorithms for locally Lipschitz functions on Riemannian manifolds. SIAM J. Optim. 28(1), 596–619 (2018)


  34. Jiang, H., Jiang, B., Lin, L., Wen, Z., Yuan, Y.: Structured quasi-Newton methods for optimization with orthogonality constraints. SIAM J. Sci. Comput. 41(4), A2239–A2269 (2019)


  35. Huang, W.: Optimization algorithms on Riemannian manifolds with applications. PhD thesis, Florida State University, Department of Mathematics (2013)

  36. Huang, W., Absil, P.-A., Gallivan, K.A.: A Riemannian symmetric rank-one trust-region method. Math. Program. 150(2), 179–216 (2015)


  37. Huang, W., Absil, P.-A., Gallivan, K.A.: Intrinsic representation of tangent vectors and vector transport on matrix manifolds. Numer. Math. 136(2), 523–543 (2017)


  38. Huang, W., Gallivan, K.A.: A limited-memory Riemannian symmetric rank-one trust-region method with a restart strategy. J. Sci. Comput. 93(1), 1 (2022)


  39. Huang, W., Gallivan, K.A., Absil, P.-A.: A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM J. Optim. 25(3), 1660–1685 (2015)


  40. Huang, W., Gallivan, K.A., Srivastava, A., Absil, P.-A.: Riemannian optimization for elastic shape analysis. In: Proceedings of the 21st International Symposium on Mathematical Theory of Networks and Systems (MTNS 2014) (2014)

  41. Huang, W., Wei, K.: Riemannian proximal gradient methods. Mathematical Programming (2021). https://doi.org/10.1007/s10107-021-01632-3

  42. Huang, W., Absil, P.A., Gallivan, K.A.: A Riemannian BFGS method without differentiated retraction for nonconvex optimization problems. SIAM J. Optim. 28(1), 470–495 (2018)


  43. Huang, W., Hand, P.: Blind deconvolution by a steepest descent algorithm on a quotient manifold. SIAM J. Imaging Sci. 11(4), 2757–2785 (2018)


  44. Huang, W., Wei, K.: An Inexact Riemannian Proximal Gradient Method (2021)

  45. Huang, W., Wei, K.: An extension of fast iterative shrinkage-thresholding algorithm to Riemannian optimization for sparse principal component analysis. Numer. Linear Algebra Appl. 29, e2409 (2021)


  46. Jiang, B., Meng, X., Wen, Z., Chen, X.: An exact penalty approach for optimization with nonnegative orthogonality constraints (2019). CoRR, ArXiv:1907.12424

  47. Jolliffe, I.T., Trendafilov, N.T., Uddin, M.: A modified principal component technique based on the Lasso. J. Comput. Graph. Stat. 12(3), 531–547 (2003)


  48. Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)


  49. Kehagias, A.: Community Detection Toolbox (2021). https://www.mathworks.com/matlabcentral/fileexchange/45867-community-detection-toolbox

  50. Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)


  51. Li, X., Sun, D., Toh, K.-C.: A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. SIAM J. Optim. 28(1), 433–458 (2018)


  52. Macqueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability (1967)

  53. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press (2008)

  54. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69(6), 066133 (2004)


  55. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)


  56. Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103(23), 8577–8582 (2006)


  57. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)


  58. Newman, M.E.J., Leicht, E.A.: Mixture models and exploratory analysis in networks. Proc. Natl. Acad. Sci. 104(23), 9564–9569 (2007)


  59. Qian, Y., Pan, S., Xiao, L.: Exact penalty methods for minimizing a smooth function over the nonnegative orthogonal set (2021)

  60. Ring, W., Wirth, B.: Optimization methods on Riemannian manifolds and their application to shape space. SIAM J. Optim. 22(2), 596–627 (2012). https://doi.org/10.1137/11082885X


  61. Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)


  62. Samir, C., Huang, W.: Coordinate descent optimization for one-to-one correspondence and supervised classification of 3D shapes. Appl. Math. Comput. 388, 125539 (2021)


  63. Sato, H.: A Dai–Yuan-type Riemannian conjugate gradient method with the weak Wolfe conditions. Comput. Optim. Appl. 64(1), 101–118 (2016)


  64. Sato, H., Iwai, T.: A Riemannian optimization approach to the matrix singular value decomposition. SIAM J. Optim. 23(1), 188–212 (2013)


  65. Sato, H., Iwai, T.: A new, globally convergent Riemannian conjugate gradient method. Optimization 64(4), 1011–1031 (2015)


  66. Scherrer, A.: Matlab Version for Louvain’s Algorithm (2008). https://perso.uclouvain.be/vincent.blondel/research/louvain.html

  67. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)


  68. Shi, J., Cour, T., Yu, S.: Normalized Cut Segmentation Code (2004)

  69. Turaga, P., Veeraraghavan, A., Srivastava, A., Chellappa, R.: Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2273–86 (2011). https://doi.org/10.1109/TPAMI.2011.52


  70. Vandereycken, B.: Low-rank matrix completion by Riemannian optimization–extended version. SIAM J. Optim. 23(2), 1214–1236 (2013)


  71. Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)


  72. Wei, K., Cai, J.-F., Chan, T.F., Leung, S.: Guarantees of Riemannian optimization for low rank matrix completion (2016)

  73. Wei, M., Huang, W., Gallivan, K.A., Van Dooren, P.: Community detection by a Riemannian projected proximal gradient method. In: Proceedings of the 24th International Symposium on Mathematical Theory of Networks and Systems (2021)

  74. Wen, Z., Yin, W.: A feasible method for optimization with orthogonality constraints. Mathematical Programming (2012). https://doi.org/10.1007/s10107-012-0584-1

  75. Xiao, G., Bai, Z.-J., Ching, W.-K.: A columnwise update algorithm for sparse stochastic matrix factorization. SIAM J. Matrix Anal. Appl. 43(4), 1712–1735 (2022)


  76. Xiao, X., Li, Y., Wen, Z., Zhang, L.: A regularized semi-smooth Newton method with projection steps for composite convex programs. J. Sci. Comput. 76(1), 364–389 (2018)


  77. Yang, L., Cao, X., He, D., Wang, C., Wang, X., Zhang, W.: Modularity based community detection with deep learning. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), vol. 16, pp. 2252–2258 (2016)

  78. Ye, J., Zhao, Z., Wu, M.: Discriminative k-means for clustering. In: Lyu, H., Sha, N., Qin, S., Yan, M., Xie, Y., Wang, R. (eds.) Advances in Neural Information Processing Systems, vol. 20. Curran Associates, Inc., New York (2008)


  79. Ye, K., Lim, L.-H.: Schubert varieties and distances between subspaces of different dimensions. SIAM J. Matrix Anal. Appl. 37(3), 1176–1197 (2016)


  80. Yu, S.X., Shi, J.: Multiclass spectral clustering. In: Proceedings Ninth IEEE International Conference on Computer Vision, vol. 1, pp. 313–319 (2003)

  81. Zhang, H., Sra, S.: First-order methods for geodesically convex optimization. In: Conference on Learning Theory (2016)

  82. Zhu, X.: A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Comput. Optim. Appl. 67(1), 73–110 (2017)



Acknowledgements

Wen Huang was partially supported by the National Natural Science Foundation of China (Nos. 12001455 and 12371311), the Natural Science Foundation of Fujian Province (No. 2023J06004), the Fundamental Research Funds for the Central Universities (No. 20720240151), and the Xiaomi Young Talents Program. Kyle A. Gallivan was partially supported by National Science Foundation Grant CIBR1934157.

Funding

This work was supported by the National Natural Science Foundation of China (No. 12371311), the Natural Science Foundation of Fujian Province (No. 2023J06004), the Fundamental Research Funds for the Central Universities (No. 20720240151), and Xiaomi Young Talents Program.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wen Huang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, W., Wei, M., Gallivan, K.A. et al. A Riemannian Optimization Approach to Clustering Problems. J Sci Comput 103, 8 (2025). https://doi.org/10.1007/s10915-025-02806-3
