Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1064092.1064114acmconferencesArticle/Chapter ViewAbstractPublication PagessocgConference Proceedingsconference-collections
Article

Smaller coresets for k-median and k-means clustering

Published: 06 June 2005 Publication History

Abstract

In this paper, we show that there exists a (k, ε)-coreset for k-median and k-means clustering of n points in Rd, which is of size independent of n. In particular, we construct a (k, ε)-coreset of size O(k2d) for k-median clustering, and of size O(k3d+1) for k-means clustering.

References

[1]
S. Arora. Polynomial time approximation schemes for euclidean tsp and ther geometric problems. J. Assoc. Comput. Mach.,45(5):753--782,Sep 1998.
[2]
S. Arora, P. Raghavan,and S. Rao. Approximation schemes for Euclidean k -median and related problems. In Proc. 30th Annu. ACM Sympos. Theory Comput., pages 106--113,1998.
[3]
M. Bǎdiu, S. Har-Peled,and P.Indyk. Approximate clustering via coresets. In Proc. 34th Annu. ACM Sympos. Theory Comput., pages 250--257, 2002.
[4]
M. Charikar, S. Guha, E. Tardos,and D. B. Shmoys. A constant-factor approximation algorithm for the k -median problem. In Proc. 31st Annu. ACM Sympos. Theory Comput., pages 1--10, 1999.
[5]
W.F.de la Vega,M. Karpinski,C. Kenyon, and Y. Rabani. Approximation schemes for clustering problems. In Proc. 35th Annu. ACM Sympos. Theory Comput., pages 50--58,2003.
[6]
R. G. D wney and M. R. Fellows. Fixed-parameter tractability and completeness i: Basic results. SIAM J. Comput.,24(4): 873--921,1995.
[7]
M. Effros and L. J. Schulman. Deterministic clustering with data nets.Technical Report TR04-085, Elec. Colloq.Comp.Complexity,2003. http://www.eccc.uni-trier.de/eccc-reports/2004/TR04-085/.
[8]
S. Guha,R. M. N. Mishra, and L. O'Callaghan. Clustering data streams.In Proc. 41th Annu. IEEE Sympos. Found. Comput. Sci., pages 359--366,2000.
[9]
S. Har-Peled and S. Mazumdar. Coresets for k -means and k -median clustering and their applications. In Proc. 36th Annu. ACM Sympos. Theory Comput., pages 291--300,2004.
[10]
M. Inaba,N. Katoh,and H.Imai. Applications of weighted voronoi diagrams and randomization to variance-based k -clustering. In Proc. 10th Annu. ACM Sympos. Comput. Geom.,pages 332--339, 1994.
[11]
T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatk,R. Silverman,and A. Y.Wu. A local search approximation algorithm for k -means clustering. In Proc. 18th Annu. ACM Sympos. Comput. Geom., pages 10--18,2002.
[12]
S. G. Kolliop ulos and S. Rao. A nearly linear-time approximation scheme for the euclidean ? -median problem. In Proc. 7thAnnu. European Sympos. Algorithms pages 378--389,1999.
[13]
A. Kumar,Y. Sabharwal, and S. Sen. Linear time algorithms for clustering problems in any dimension. manuscript,2004.
[14]
J. Matoušek. On approximate geometric k -clustering. Discrete Comput. Geom., 24: 61--84,2000.
[15]
J. Matoušsek.Lectures on Discrete Geometry Springer, 2002.
[16]
R. Mettu and C. G. Plaxton. Optimal time bounds for approximate clustering.In In Proceedings of the 18th Conference on Uncertainty in Artifical Intelligence pages 344--351,2002.

Cited By

View all
  • (2024)Improved Approximation Algorithms for Relational ClusteringProceedings of the ACM on Management of Data10.1145/36958312:5(1-27)Online publication date: 7-Nov-2024
  • (2024)Distillation vs. Sampling for Efficient Training of Learning to Rank ModelsProceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval10.1145/3664190.3672527(51-60)Online publication date: 2-Aug-2024
  • (2024)Coreset Learning-Based Sparse Black-Box Adversarial Attack for Video RecognitionIEEE Transactions on Information Forensics and Security10.1109/TIFS.2023.333355619(1547-1560)Online publication date: 2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SCG '05: Proceedings of the twenty-first annual symposium on Computational geometry
June 2005
398 pages
ISBN:1581139918
DOI:10.1145/1064092
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 June 2005

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. clustering
  2. coresets

Qualifiers

  • Article

Conference

SoCG05

Acceptance Rates

SCG '05 Paper Acceptance Rate 41 of 141 submissions, 29%;
Overall Acceptance Rate 625 of 1,685 submissions, 37%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)38
  • Downloads (Last 6 weeks)6
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Improved Approximation Algorithms for Relational ClusteringProceedings of the ACM on Management of Data10.1145/36958312:5(1-27)Online publication date: 7-Nov-2024
  • (2024)Distillation vs. Sampling for Efficient Training of Learning to Rank ModelsProceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval10.1145/3664190.3672527(51-60)Online publication date: 2-Aug-2024
  • (2024)Coreset Learning-Based Sparse Black-Box Adversarial Attack for Video RecognitionIEEE Transactions on Information Forensics and Security10.1109/TIFS.2023.333355619(1547-1560)Online publication date: 2024
  • (2024)MapReduce Algorithms for Robust Center-Based Clustering in Doubling MetricsJournal of Parallel and Distributed Computing10.1016/j.jpdc.2024.104966(104966)Online publication date: Aug-2024
  • (2023)Sequential subset matching for dataset distillationProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669072(67487-67504)Online publication date: 10-Dec-2023
  • (2023)Dataset Condensation via Expert Subspace ProjectionSensors10.3390/s2319814823:19(8148)Online publication date: 28-Sep-2023
  • (2023)Brief Announcement: Streaming Balanced ClusteringProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591318(311-314)Online publication date: 17-Jun-2023
  • (2023)Enhancing Big Data Conversion Validation with Alpha-Lightweight CoresetSN Computer Science10.1007/s42979-023-02206-04:6Online publication date: 14-Oct-2023
  • (2022)Alpha Lightweight Coreset for k-Means Clustering2022 16th International Conference on Ubiquitous Information Management and Communication (IMCOM)10.1109/IMCOM53663.2022.9721770(1-8)Online publication date: 3-Jan-2022
  • (2022)On the k-means/median cost functionInformation Processing Letters10.1016/j.ipl.2022.106252177(106252)Online publication date: Aug-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media