A Riemannian Optimization Approach to Clustering Problems

Published in: Journal of Scientific Computing

Abstract

This paper considers the optimization problem

$$\begin{aligned} \min _{X \in \mathcal {F}_v} f( {X}) + \lambda \Vert X\Vert _1, \end{aligned}$$

where f is smooth, \(\mathcal {F}_v = \{X \in \mathbb {R}^{n \times q}: X^T X = I_q, v \in \textrm{span}(X)\}\), and v is a given positive vector. Clustering models, including but not limited to those used by k-means, community detection, and normalized cut, can be reformulated as optimization problems of this form. It is proven that the domain \(\mathcal {F}_v\) forms a compact embedded submanifold of \(\mathbb {R}^{n \times q}\), and optimization-related tools are derived, including a family of computationally efficient retractions and an orthonormal basis of any normal space of \(\mathcal {F}_v\). A Riemannian proximal gradient method that allows an adaptive step size is proposed; it solves its subproblem inexactly while still guaranteeing global convergence. Numerical experiments on community detection in networks and on normalized cut for image segmentation demonstrate the performance of the proposed method.
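To make the feasible set concrete, the sketch below constructs a point of \(\mathcal {F}_v\) and verifies both constraints numerically. This is an illustration assuming NumPy, not code from the paper; the names `is_feasible` and the QR-based construction are the author's own choices here (any orthonormal basis whose span contains v gives a feasible point).

```python
import numpy as np

def is_feasible(X, v, tol=1e-8):
    """Check membership in F_v = {X : X^T X = I_q, v in span(X)}."""
    n, q = X.shape
    orth = np.linalg.norm(X.T @ X - np.eye(q)) < tol
    # Since X has orthonormal columns, v lies in span(X) iff X X^T v = v.
    in_span = np.linalg.norm(X @ (X.T @ v) - v) < tol
    return bool(orth and in_span)

# Build a feasible point: put v in the first column, then complete to an
# orthonormal basis with a (reduced) QR factorization.
rng = np.random.default_rng(0)
n, q = 6, 3
v = rng.random(n) + 0.1                       # a positive vector, as required
A = np.column_stack([v, rng.standard_normal((n, q - 1))])
X, _ = np.linalg.qr(A)                        # span(X) = span(A) contains v
print(is_feasible(X, v))                      # True
```

Note that a coordinate-axis frame such as the first q columns of the identity is orthonormal but generically fails the span condition, since a positive v has no zero entries.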


Data Availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Notes

  1. In some cases, the proximal mapping in [41] can be solved efficiently without resorting to the semi-smooth Newton algorithm, see [41, Section 5.2].

  2. Throughout this paper, the computational complexity is measured by flop counts. A flop is a floating point operation [28, Section 1.2.4].

  3. Note that the local defining function only requires \(\textrm{D}h(Y)\) to be full rank at \(Y = X\). Here, we prove a stronger result.

  4. The \(\lambda \) in I-AManPG increases by 0.01, 0.04, and 0.2.

  5. The implementation in [17] is for unweighted kernel k-means. We modified it for weighted kernel k-means.
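As background for the proximal mapping mentioned in Note 1: in the Euclidean case, the proximal mapping of \(\lambda \Vert \cdot \Vert _1\) is elementwise soft-thresholding and needs no iterative solver; it is the restriction of this subproblem to a tangent space (as in [41]) that may require the semi-smooth Newton algorithm. A minimal sketch of the Euclidean case, assuming NumPy (the name `prox_l1` is illustrative):

```python
import numpy as np

def prox_l1(Z, lam):
    # Euclidean proximal mapping of lam * ||.||_1:
    #   argmin_X 0.5 * ||X - Z||_F^2 + lam * ||X||_1,
    # which decouples into elementwise soft-thresholding.
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

Z = np.array([[1.5, -0.2],
              [-3.0, 0.4]])
print(prox_l1(Z, 0.5))   # large entries shrink by 0.5; small ones become 0
```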

References

  1. Absil, P.-A., Baker, C.G., Gallivan, K.A.: Trust-region methods on Riemannian manifolds. Found. Comput. Math. 7(3), 303–330 (2007)


  2. Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)


  3. Absil, P.-A., Malick, J.: Projection-like retractions on matrix manifolds. SIAM J. Optim. 22(1), 135–158 (2012)


  4. Bach, F.R., Jordan, M.I.: Learning spectral clustering. In: Proceedings of the 16th International Conference on Neural Information Processing Systems, NIPS'03, pp. 305–312. MIT Press, Cambridge, MA, USA (2003)

  5. Baker, C.G.: Riemannian manifold trust-region methods with applications to eigenproblems. PhD thesis, Florida State University, School of Computational Science (2008)

  6. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542


  7. Beck, A.: First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA (2017)

  8. Bento, G.C., Ferreira, O.P., Melo, J.G.: Iteration-complexity of gradient, subgradient and proximal point methods on Riemannian manifolds. J. Optim. Theory Appl. 173(2), 548–562 (2017)


  9. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10008 (2008)


  10. Boothby, W.M.: An Introduction to Differentiable Manifolds and Riemannian Geometry, 2nd edn. Academic Press, Cambridge (1986)


  11. Boumal, N.: An introduction to optimization on smooth manifolds (2022)

  12. Boumal, N., Absil, P.A., Cartis, C.: Global rates of convergence for nonconvex optimization on manifolds. IMA J. Numer. Anal. 39(1), 1–33 (2018)


  13. Bounova, G.: Matlab Tools for Network Analysis (2009). http://strategic.mit.edu/downloads.php?page=matlab_networks

  14. Boutsidis, C., Drineas, P., Mahoney, M.W.: Unsupervised feature selection for the \( k \)-means clustering problem. In: Advances in Neural Information Processing Systems, pp. 153–161 (2009)

  15. Carson, T., Mixon, D.G., Villar, S.: Manifold optimization for k-means clustering. In: 2017 International Conference on Sampling Theory and Applications (SampTA), pp. 73–77. IEEE (2017)

  16. Chan, P.K., Schlag, F.: Spectral k-way ratio-cut partitioning and clustering. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 13(9), 1088–1096 (1994)


  17. Chen, M.: Pattern recognition and machine learning toolbox (2021)

  18. Chen, S., Ma, S., So, A.M.-C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)


  19. Cherian, A., Sra, S.: Riemannian dictionary learning and sparse coding for positive definite matrices. IEEE Trans. Neural Netw. Learn. Syst. 28(12), 2859–2871 (2017)


  20. Clarke, F.H.: Optimization and Nonsmooth Analysis. Classics in Applied Mathematics. SIAM, Philadelphia (1990)

  21. Danon, L., Diaz-Guilera, A., Arenas, A.: The effect of size heterogeneity on community identification in complex networks. J. Stat. Mech. Theory Exp. 2006(11), P11010 (2006)


  22. Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005(09), P09008 (2005)


  23. Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the 10th Association for Computing Machinery(ACM) Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) International Conference on Knowledge Discovery and Data Mining (2004)

  24. Dhillon, I., Guan, Y., Kulis, B.: A unified view of kernel k-means, spectral clustering and graph cuts. Technical report, Department of Computer Sciences, University of Texas at Austin (2005)

  25. do Carmo, M.P.: Riemannian Geometry. Mathematics: Theory & Applications (1992)

  26. Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1998). https://doi.org/10.1137/S0895479895290954


  27. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)


  28. Golub, G.H., Van Loan, C.F.: Matrix computations, 3rd edn. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press (1996)

  29. Grohs, P., Hosseini, S.: \(\epsilon \)-subgradient algorithms for locally Lipschitz functions on Riemannian manifolds. Adv. Comput. Math. (2015). https://doi.org/10.1007/s10444-015-9426-z


  30. Grohs, P., Hosseini, S.: Nonsmooth trust region algorithms for locally Lipschitz functions on Riemannian manifolds. IMA J. Numer. Anal. (2015). https://doi.org/10.1093/imanum/drv043


  31. Harris, J.: Algebraic Geometry: A First Course, vol. 133. Springer Science & Business Media, Berlin (1992)


  32. Hartigan, J.A., Wong, M.A.: A K-means clustering algorithm. Appl. Stat. 28(1), 100–108 (1979)


  33. Hosseini, S., Huang, W., Yousefpour, R.: Line search algorithms for locally Lipschitz functions on Riemannian manifolds. SIAM J. Optim. 28(1), 596–619 (2018)


  34. Jiang, H., Jiang, B., Lin, L., Wen, Z., Yuan, Y.: Structured quasi-Newton methods for optimization with orthogonality constraints. SIAM J. Sci. Comput. 41(4), A2239–A2269 (2019)


  35. Huang, W.: Optimization algorithms on Riemannian manifolds with applications. PhD thesis, Florida State University, Department of Mathematics (2013)

  36. Huang, W., Absil, P.-A., Gallivan, K.A.: A Riemannian symmetric rank-one trust-region method. Math. Program. 150(2), 179–216 (2015)


  37. Huang, W., Absil, P.-A., Gallivan, K.A.: Intrinsic representation of tangent vectors and vector transport on matrix manifolds. Numer. Math. 136(2), 523–543 (2017)


  38. Huang, W., Gallivan, K.A.: A limited-memory Riemannian symmetric rank-one trust-region method with a restart strategy. J. Sci. Comput. 93(1), 1 (2022)


  39. Huang, W., Gallivan, K.A., Absil, P.-A.: A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM J. Optim. 25(3), 1660–1685 (2015)


  40. Huang, W., Gallivan, K.A., Srivastava, A., Absil, P.-A.: Riemannian optimization for elastic shape analysis. In: Proceedings of the 21st International Symposium on Mathematical Theory of Networks and Systems (MTNS 2014) (2014)

  41. Huang, W., Wei, K.: Riemannian proximal gradient methods. Mathematical Programming (2021). https://doi.org/10.1007/s10107-021-01632-3

  42. Huang, W., Absil, P.A., Gallivan, K.A.: A Riemannian BFGS method without differentiated retraction for nonconvex optimization problems. SIAM J. Optim. 28(1), 470–495 (2018)


  43. Huang, W., Hand, P.: Blind deconvolution by a steepest descent algorithm on a quotient manifold. SIAM J. Imaging Sci. 11(4), 2757–2785 (2018)


  44. Huang, W., Wei, K.: An Inexact Riemannian Proximal Gradient Method (2021)

  45. Huang, W., Wei, K.: An extension of fast iterative shrinkage-thresholding algorithm to Riemannian optimization for sparse principal component analysis. Numer. Linear Algebra Appl. 29, e2409 (2021)


  46. Jiang, B., Meng, X., Wen, Z., Chen, X.: An exact penalty approach for optimization with nonnegative orthogonality constraints (2019). CoRR, ArXiv:1907.12424

  47. Jolliffe, I.T., Trendafilov, N.T., Uddin, M.: A modified principal component technique based on the Lasso. J. Comput. Graph. Stat. 12(3), 531–547 (2003)


  48. Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)


  49. Kehagias, A.: Community Detection Toolbox (2021). https://www.mathworks.com/matlabcentral/fileexchange/45867-community-detection-toolbox

  50. Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)


  51. Li, X., Sun, D., Toh, K.-C.: A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. SIAM J. Optim. 28(1), 433–458 (2018)


  52. Macqueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability (1967)

  53. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press (2008)

  54. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69(6), 066133 (2004)


  55. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)


  56. Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103(23), 8577–8582 (2006)


  57. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)


  58. Newman, M.E.J., Leicht, E.A.: Mixture models and exploratory analysis in networks. Proc. Natl. Acad. Sci. 104(23), 9564–9569 (2007)


  59. Qian, Y., Pan, S., Xiao, L.: Exact penalty methods for minimizing a smooth function over the nonnegative orthogonal set (2021)

  60. Ring, W., Wirth, B.: Optimization methods on Riemannian manifolds and their application to shape space. SIAM J. Optim. 22(2), 596–627 (2012). https://doi.org/10.1137/11082885X


  61. Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)


  62. Samir, C., Huang, W.: Coordinate descent optimization for one-to-one correspondence and supervised classification of 3D shapes. Appl. Math. Comput. 388, 125539 (2021)


  63. Sato, H.: A Dai–Yuan-type Riemannian conjugate gradient method with the weak Wolfe conditions. Comput. Optim. Appl. 64(1), 101–118 (2016)


  64. Sato, H., Iwai, T.: A Riemannian optimization approach to the matrix singular value decomposition. SIAM J. Optim. 23(1), 188–212 (2013)


  65. Sato, H., Iwai, T.: A new, globally convergent Riemannian conjugate gradient method. Optimization 64(4), 1011–1031 (2015)


  66. Scherrer, A.: Matlab Version for Louvain’s Algorithm (2008). https://perso.uclouvain.be/vincent.blondel/research/louvain.html

  67. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)


  68. Shi, J., Cour, T., Yu, S.: Normalized Cut Segmentation Code (2004)

  69. Turaga, P., Veeraraghavan, A., Srivastava, A., Chellappa, R.: Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2273–86 (2011). https://doi.org/10.1109/TPAMI.2011.52


  70. Vandereycken, B.: Low-rank matrix completion by Riemannian optimization–extended version. SIAM J. Optim. 23(2), 1214–1236 (2013)


  71. Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)


  72. Wei, K., Cai, J.-F., Chan, T.F., Leung, S.: Guarantees of Riemannian optimization for low rank matrix completion (2016)

  73. Wei, M., Huang, W., Gallivan, K.A., Van Dooren, P.: Community detection by a Riemannian projected proximal gradient method. In: Proceedings of the 24th International Symposium on Mathematical Theory of Networks and Systems (2021)

  74. Wen, Z., Yin, W.: A feasible method for optimization with orthogonality constraints. Mathematical Programming (2012). https://doi.org/10.1007/s10107-012-0584-1

  75. Xiao, G., Bai, Z.-J., Ching, W.-K.: A columnwise update algorithm for sparse stochastic matrix factorization. SIAM J. Matrix Anal. Appl. 43(4), 1712–1735 (2022)


  76. Xiao, X., Li, Y., Wen, Z., Zhang, L.: A regularized semi-smooth Newton method with projection steps for composite convex programs. J. Sci. Comput. 76(1), 364–389 (2018)


  77. Yang, L., Cao, X., He, D., Wang, C., Wang, X., Zhang, W.: Modularity based community detection with deep learning. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), vol. 16, pp. 2252–2258 (2016)

  78. Ye, J., Zhao, Z., Wu, M.: Discriminative k-means for clustering. In: Lyu, H., Sha, N., Qin, S., Yan, M., Xie, Y., Wang, R. (eds.) Advances in Neural Information Processing Systems, vol. 20. Curran Associates, Inc., New York (2008)


  79. Ye, K., Lim, L.-H.: Schubert varieties and distances between subspaces of different dimensions. SIAM J. Matrix Anal. Appl. 37(3), 1176–1197 (2016)


  80. Yu, S.X., Shi, J.: Multiclass spectral clustering. In: Proceedings Ninth IEEE International Conference on Computer Vision, vol. 1, pp. 313–319 (2003)

  81. Zhang, H., Sra, S.: First-order methods for geodesically convex optimization. In: Conference on Learning Theory (2016)

  82. Zhu, X.: A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Comput. Optim. Appl. 67(1), 73–110 (2017)



Acknowledgements

Wen Huang was partially supported by the National Natural Science Foundation of China (Nos. 12001455 and 12371311), the Natural Science Foundation of Fujian Province (No. 2023J06004), the Fundamental Research Funds for the Central Universities (No. 20720240151), and the Xiaomi Young Talents Program. Kyle A. Gallivan was partially supported by National Science Foundation Grant CIBR1934157.

Funding

This work was supported by the National Natural Science Foundation of China (No. 12371311), the Natural Science Foundation of Fujian Province (No. 2023J06004), the Fundamental Research Funds for the Central Universities (No. 20720240151), and Xiaomi Young Talents Program.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wen Huang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, W., Wei, M., Gallivan, K.A. et al. A Riemannian Optimization Approach to Clustering Problems. J Sci Comput 103, 8 (2025). https://doi.org/10.1007/s10915-025-02806-3
