Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/513400.513440acmconferencesArticle/Chapter ViewAbstractPublication PagessocgConference Proceedingsconference-collections
Article

Projective clustering in high dimensions using core-sets

Published: 05 June 2002 Publication History

Abstract

(MATH) Let P be a set of n points in $\Red, and for any integer 0kd--1, let $\RDk(P) denote the minimum over all k-flats $\FLAT$ of maxP Dist(p,\FLAT). We present an algorithm that computes, for any 0 < ε < 1, a k-flat that is within a distance of (1 + $egr;) \RDk(P) from each point of P. The running time of the algorithm is dnO(k/ε5log(1/ε)). The crucial step in obtaining this algorithm is a structural result that says that there is a near-optimal flat that lies in an affine subspace spanned by a small subset of points in P. The size of this "core-set" depends on k and ε but is independent of the dimension.This approach also extends to the case where we want to find a k-flat that is close to a prescribed fraction of the entire point set, and to the case where we want to find j flats, each of dimension k, that are close to the point set. No efficient approximation schemes were known for these problems in high-dimensions, when k>1 or j>1.

References

[1]
D. Achlioptas. Database-friendly random projections. In Proc. 20th ACM Sympos. Principles Database Syst., pages 274--281, 2001.
[2]
N. Alon, S. Dar, M. Parnas, and D. Ron. Testing of clustering. In Proc. 41th Annu. IEEE Sympos. Found. Comput. Sci., 2000.
[3]
P.K. Agarwal and C.M. Procopiuc. Approximation algorithms for projective clustering. In Proc. 11th ACM-SIAM Sympos. Discrete Algorithms, pages 538--547, 2000.
[4]
I. Barany and Z. Furedi Computing the volume is difficult. In Discrete Comput. Geom. 2, 319--326, 1987.
[5]
M. Bern and D. Eppstein. Approximation algorithms for geometric problems. In D.S. Hochbaum, editor, Approximationg algorithms for NP-Hard problems. PWS Publishing Company, 1997.
[6]
G. Barequet and S. Har-Peled. Efficiently approximating the minimum-volume bounding box of a point set in three dimensions. J. Algorithms, 38:91--109, 2001.
[7]
M. Badoiu, S. Har-Peled, and P. Indyk. Approximate clustering via core-sets. To appear in STOC 2002.
[8]
A. Brieden and P. Gritzmann and V. Klee. Inapproximability of some geometric and quadratic optimization problems. In P. M. Pardalos, editor, Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems, pages 96--115, Kluwer, 2000.
[9]
R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification. Wiley-Interscience, New York, 2nd edition, 2001.
[10]
P. Gritzmann and V. Klee. Inner and outer j-radii of convex bodies in finite-dimensional normed spaces. Discrete Comput. Geom., 7:255--280, 1992.
[11]
P. Gritzmann and V. Klee. Computational complexity of inner and outer $j$-radii of polytopes in finite-dimensional normed spaces. Math. Program., 59:163--213, 1993.
[12]
P. Gritzmann and V. Klee. On the complexity of some basic problems in computational convexity: I. containment problems. Discrete Math., 136:129--174, 1994.
[13]
M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin Heidelberg, 2nd edition, 1988. 2nd edition 1994.
[14]
S. Har-Peled. Clustering motion. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 84--93, 2001.
[15]
S. Har-Peled and K.R. Varadarajan. Approximate shape fitting via linearization. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 66--73, 2001.
[16]
P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 10--31, 2001.
[17]
P. Indyk and M. Thorup. Approximate 1-median. manuscript, 2000.
[18]
W.B. Johnson and J. Lindenstrauss. Extensions of lipshitz mapping into hilbert space. Contemporary Mathematics, 26:189--206, 1984.
[19]
A. Magen. Dimensionality reductions that preserve volumes and distance to affine spaces, and its algorithmic applications. Manuscript, 2001.
[20]
N. Mishra, D. Oblinger, and L. Pitt. Sublinear time approximate clustering. In SODA 12, 2001.
[21]
N. Megiddo and A. Tamir. On the complexity of locating linear facilities in the plane. Oper. Res. Lett., 1:194--197, 1982.

Cited By

View all
  • (2020)Robust Coreset Construction for Distributed Machine LearningIEEE Journal on Selected Areas in Communications10.1109/JSAC.2020.300037338:10(2400-2417)Online publication date: Oct-2020
  • (2020)Real-time EEG classification via coresets for BCI applicationsEngineering Applications of Artificial Intelligence10.1016/j.engappai.2019.10345589(103455)Online publication date: Mar-2020
  • (2020)Locating Dimensional Facilities in a Continuous SpaceLocation Science10.1007/978-3-030-32177-2_7(143-184)Online publication date: 17-Mar-2020
  • Show More Cited By

Index Terms

  1. Projective clustering in high dimensions using core-sets

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SCG '02: Proceedings of the eighteenth annual symposium on Computational geometry
    June 2002
    330 pages
    ISBN:1581135041
    DOI:10.1145/513400
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 June 2002

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. computational convexity
    2. projective clustering

    Qualifiers

    • Article

    Conference

    SoCG02

    Acceptance Rates

    SCG '02 Paper Acceptance Rate 35 of 104 submissions, 34%;
    Overall Acceptance Rate 625 of 1,685 submissions, 37%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)26
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 26 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2020)Robust Coreset Construction for Distributed Machine LearningIEEE Journal on Selected Areas in Communications10.1109/JSAC.2020.300037338:10(2400-2417)Online publication date: Oct-2020
    • (2020)Real-time EEG classification via coresets for BCI applicationsEngineering Applications of Artificial Intelligence10.1016/j.engappai.2019.10345589(103455)Online publication date: Mar-2020
    • (2020)Locating Dimensional Facilities in a Continuous SpaceLocation Science10.1007/978-3-030-32177-2_7(143-184)Online publication date: 17-Mar-2020
    • (2019)Robust Coreset Construction for Distributed Machine Learning2019 IEEE Global Communications Conference (GLOBECOM)10.1109/GLOBECOM38437.2019.9013625(1-6)Online publication date: Dec-2019
    • (2019)Topology-Aware Job Scheduling for Machine Learning Cluster2019 IEEE Global Communications Conference (GLOBECOM)10.1109/GLOBECOM38437.2019.9013361(1-6)Online publication date: Dec-2019
    • (2019)Core-Sets: Updated SurveySampling Techniques for Supervised or Unsupervised Tasks10.1007/978-3-030-29349-9_2(23-44)Online publication date: 27-Oct-2019
    • (2019)Core‐sets: An updated surveyWIREs Data Mining and Knowledge Discovery10.1002/widm.133510:1Online publication date: Oct-2019
    • (2017)Stochastic k-center and j-flat-center problemsProceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3039686.3039694(110-129)Online publication date: 16-Jan-2017
    • (2016)Bypassing UGC from Some Optimal Geometric Inapproximability ResultsACM Transactions on Algorithms10.1145/273772912:1(1-25)Online publication date: 8-Feb-2016
    • (2015)Streaming Algorithms for Extent Problems in High DimensionsAlgorithmica10.1007/s00453-013-9846-472:1(83-98)Online publication date: 1-May-2015
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media