Research article | Open access | DOI: 10.1145/3519935.3520012

Clustering mixtures with almost optimal separation in polynomial time

Published: 10 June 2022

Abstract

We consider the problem of clustering mixtures of mean-separated Gaussians in high dimensions. We are given samples from a mixture of k identity-covariance Gaussians, such that the minimum distance between any pair of means is at least Δ, for some parameter Δ > 0, and the goal is to recover the ground-truth clustering of these samples. It is folklore that separation Δ = Θ(√(log k)) is both necessary and sufficient, at least information theoretically, to recover a good clustering (say, with constant or 1/poly(k) error). However, the estimators which achieve this guarantee are inefficient. We give the first algorithm which runs in polynomial time and almost matches this guarantee. More precisely, we give an algorithm which takes polynomially many samples and polynomial time, and which successfully recovers a good clustering so long as the separation is Δ = Ω(log^(1/2 + c) k), for any c > 0. Previously, polynomial-time algorithms were known for this problem only when the separation was polynomial in k, and all algorithms which could tolerate polylog(k) separation required quasipolynomial time. We also extend our result to mixtures of translations of a distribution which satisfies the Poincaré inequality, under additional mild assumptions.
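To make the setting concrete, the following sketch (a hypothetical Python illustration of the model, not the algorithm from the paper) draws samples from k identity-covariance Gaussians with pairwise mean separation at least Δ and clusters them with a naive pairwise-distance threshold. Same-component pairs have squared distance concentrated around 2d, while cross-component pairs concentrate around 2d + ‖μ_i − μ_j‖², so the naive test only works when Δ ≫ d^(1/4); the point of the present result is to get down to Δ = Ω(log^(1/2 + c) k), with no dependence on the dimension.

```python
# Hypothetical demo of the problem setup, NOT the paper's algorithm:
# single-linkage clustering with a pairwise-distance threshold, which needs
# a far larger separation (Delta >> d^(1/4)) than the paper's guarantee.
import numpy as np

rng = np.random.default_rng(0)
d, k, n_per, delta = 100, 5, 50, 25.0    # delta chosen large so the naive test succeeds

means = delta * np.eye(d)[:k]            # k means with pairwise distance delta * sqrt(2)
labels = np.repeat(np.arange(k), n_per)  # ground-truth component of each sample
samples = means[labels] + rng.standard_normal((k * n_per, d))

# Same-component pairs: ||x - y||^2 ~ 2d.  Cross-component: ~ 2d + ||mu_i - mu_j||^2.
sq_dists = ((samples[:, None, :] - samples[None, :, :]) ** 2).sum(-1)
threshold = 2 * d + delta**2 / 2

# Union-find over the graph of "close" pairs recovers the connected components.
parent = np.arange(len(samples))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]    # path halving
        i = parent[i]
    return i
for i, j in zip(*np.nonzero(sq_dists < threshold)):
    parent[find(i)] = find(j)

clusters = np.array([find(i) for i in range(len(samples))])
print("clusters found:", len(set(clusters.tolist())))  # expect k = 5
print("all clusters pure:",
      all(len(set(labels[clusters == c].tolist())) == 1 for c in set(clusters.tolist())))
```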
Our main technical tool, which we believe is of independent interest, is a novel way to implicitly represent and estimate high-degree moments of a distribution, which allows us to extract important information about these moments without ever writing down the full moment tensors explicitly.
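To illustrate the basic asymmetry such an approach exploits, here is a toy Python sketch (our own illustration, not the paper's estimator): the degree-t moment tensor M_t = E[x^⊗t] has d^t entries, yet its projection ⟨M_t, v^⊗t⟩ = E[⟨x, v⟩^t] along any fixed direction v can be estimated from n samples in O(nd) time, without ever materializing M_t.

```python
# Toy illustration (not the paper's actual estimator): estimate a projection
# of the degree-t moment tensor M_t = E[x tensor ... tensor x] (d^t entries)
# without ever forming it.
import numpy as np

def moment_projection(samples: np.ndarray, v: np.ndarray, t: int) -> float:
    """Empirical estimate of <M_t, v^{tensor t}> = E[<x, v>^t], in O(n d) time."""
    return float(np.mean((samples @ v) ** t))

rng = np.random.default_rng(1)
d, n, t = 50, 200_000, 4
mu = np.zeros(d); mu[0] = 3.0            # a single Gaussian N(mu, I)
X = mu + rng.standard_normal((n, d))
v = np.zeros(d); v[0] = 1.0              # unit direction aligned with mu

# Here <x, v> ~ N(3, 1), whose 4th moment is 3^4 + 6*3^2 + 3 = 138; the full
# tensor M_4 would have d^4 = 6,250,000 entries, none of which we compute.
print(moment_projection(X, v, t))        # ~ 138
```

The tool in the paper has to do much more than evaluate single projections (it aggregates information across many directions at high degree), but the cheapness of projections relative to the d^t cost of the explicit tensor is the starting intuition.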

Cited By

  • (2023) Polynomial time and private learning of unbounded Gaussian mixture models. Proceedings of the 40th International Conference on Machine Learning, 1018–1040. DOI: 10.5555/3618408.3618450. Online publication date: 23-Jul-2023.
  • (2022) A Fourier approach to mixture learning. Proceedings of the 36th International Conference on Neural Information Processing Systems, 20850–20861. DOI: 10.5555/3600270.3601786. Online publication date: 28-Nov-2022.

Published In

STOC 2022: Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing
June 2022, 1698 pages
ISBN: 9781450392648
DOI: 10.1145/3519935
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. Poincaré distribution
  2. clustering
  3. method of moments
  4. mixture models
  5. mixtures of Gaussians

Qualifiers

  • Research-article

Conference

STOC '22

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%

Article Metrics

  • Downloads (last 12 months): 122
  • Downloads (last 6 weeks): 34

Reflects downloads up to 12 Nov 2024
