
DOI: 10.5555/3174304.3175477

Improved coresets for kernel density estimates

Published: 07 January 2018

Abstract

We study the construction of coresets for kernel density estimates. That is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate over a much smaller point set. For characteristic kernels (including Gaussian and Laplace kernels), our approximation preserves the L∞ error between kernel density estimates within error ϵ, with coreset size 4/ϵ², depending on no other aspects of the data, such as the dimension, the diameter of the point set, or the bandwidth of the kernel, which are common to other approximations. When the dimension is unrestricted, we show this bound is tight for these kernels as well as for a much broader class.
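To make the guarantee concrete, here is a minimal sketch (not from the paper) of what an ϵ-coreset for a KDE promises: the estimate built from the small set Q must track the estimate built from the full set P to within ϵ at every query point. The Gaussian bandwidth, the random-sample baseline for Q, and the helper names kde and linf_error are illustrative assumptions; the paper's actual construction is kernel herding, sketched after the next paragraph.

```python
import numpy as np

def kde(points, query, bandwidth=1.0):
    # Gaussian kernel density estimate of `points`, evaluated at each row of `query`.
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)

def linf_error(P, Q, query, bandwidth=1.0):
    # Largest gap between the full KDE (on P) and the coreset KDE (on Q)
    # over the probed query locations.
    return np.abs(kde(P, query, bandwidth) - kde(Q, query, bandwidth)).max()

rng = np.random.default_rng(0)
P = rng.normal(size=(5000, 2))                      # large point set
eps = 0.1
m = int(np.ceil(4 / eps ** 2))                      # coreset size 4/eps^2 from the abstract
Q = P[rng.choice(len(P), size=m, replace=False)]    # baseline coreset: uniform sample
queries = rng.normal(size=(300, 2))                 # probe locations for the error
print(f"coreset size {m}, observed L_inf gap {linf_error(P, Q, queries):.4f}")
```

A uniform sample is only a baseline here; the point of the paper is that a carefully chosen Q of size 4/ϵ² suffices regardless of dimension, diameter, or bandwidth.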
This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called kernel herding. This analysis unites a broad line of work that spans statistics, machine learning, and geometry.
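The herding rule itself is simple to state. Below is a hedged, generic sketch of kernel herding on a finite point set, following the greedy step described by Chen, Welling, and Smola [4]: repeatedly pick the point where the coreset's average kernel value lags the full data's average by the most. The Gaussian kernel and the restriction of candidates to P are illustrative assumptions, not details fixed by this abstract.

```python
import numpy as np

def gaussian_row(x, P, bandwidth=1.0):
    # K(x, p) for every p in P, with a Gaussian kernel.
    d2 = ((P - x) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def kernel_herding(P, m, bandwidth=1.0):
    # Greedily select m points of P whose KDE tracks the KDE of all of P.
    # Points may repeat, which corresponds to a coreset with multiplicity.
    n = len(P)
    # mu[i] = (1/n) * sum_j K(P[i], P[j]): the kernel mean map evaluated on P.
    mu = np.array([gaussian_row(p, P, bandwidth).mean() for p in P])
    running = np.zeros(n)   # running[i] = sum over chosen q of K(P[i], q)
    picked = []
    for t in range(m):
        # Frank-Wolfe / herding step: maximize the residual between the
        # full mean map and the average over the t points chosen so far.
        scores = mu - running / (t + 1)
        i = int(np.argmax(scores))
        picked.append(i)
        running += gaussian_row(P[i], P, bandwidth)
    return P[picked]
```

Paired with the linf_error helper above, Q = kernel_herding(P, m) can be compared directly against the random baseline; the paper's analysis of this Frank-Wolfe iteration is what yields the 4/ϵ² size bound.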
When the dimension d is constant, we demonstrate much tighter bounds on the size of the coreset, specifically for Gaussian kernels, showing that it is bounded by the size of the coreset for axis-aligned rectangles. Currently the best known constructive bound is [EQUATION], and non-constructively this can be improved by [EQUATION]. This improves the best constant-dimension bounds polynomially for d ≥ 3.

References

[1]
Nachman Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68(3):337--404, 1950.
[2]
Francis Bach, Simon Lacoste-Julien, and Guillaume Obozinski. On the equivalence between herding and conditional gradient algorithms. In Proceedings International Conference on Machine Learning, 2012.
[3]
Nikhil Bansal and Shashwat Garg. Algorithmic discrepancy beyond partial coloring. In Proceedings ACM Symposium on Theory of Computing, 2017.
[4]
Yutian Chen, Max Welling, and Alex Smola. Super-samples from kernel herding. In Conference on Uncertainty in Artificial Intelligence, 2010.
[5]
Ken Clarkson. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Transactions on Algorithms, 6(4), 2010.
[6]
Efrén Cruz Cortés and Clayton Scott. Sparse approximation of a kernel mean. IEEE Transactions on Signal Processing, accepted (arXiv:1503.00323), 2015.
[7]
Luc Devroye and László Györfi. Nonparametric Density Estimation: The L1 View. Wiley, 1984.
[8]
Joseph C Dunn. Convergence rates for conditional gradient sequences generated by implicit step length rules. SIAM Journal on Control and Optimization, 18(5):473--487, 1980.
[9]
Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh. Confidence sets for persistence diagrams. The Annals of Statistics, 42:2301--2339, 2014.
[10]
Marguerite Frank and Philip Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3:95--110, 1956.
[11]
Robert Freund and Paul Grigas. New analysis and results for the Frank-Wolfe method. Mathematical Programming, (to appear).
[12]
Bernd Gärtner and Martin Jaggi. Coresets for polytope distance. In SOCG, 2009.
[13]
Joan Glaunès. Transport par difféomorphismes de points, de mesures et de courants pour la comparaison de formes et l'anatomie numérique. PhD thesis, Université Paris 13, 2005.
[14]
Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293--306, 1985.
[15]
Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander Smola. A kernel two-sample test. JMLR, 13:723--773, 2012.
[16]
Zaid Harchaoui, Francis Bach, Olivier Cappé, and Éric Moulines. Kernel-based methods for hypothesis testing: A unified view. IEEE Signal Processing Magazine, 30:87--97, 2013.
[17]
Nick Harvey and Samira Samadi. Near-optimal herding. In COLT, pages 1165--1182, 2014.
[18]
Matthias Hein and Olivier Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In AISTATS, 2005.
[19]
Martin Jaggi. Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In ICML, 2013.
[20]
Simon Lacoste-Julien and Martin Jaggi. On the global linear convergence of Frank-Wolfe optimization variants. In NIPS, 2015.
[21]
Sarang Joshi, Raj Varma Kommaraju, Jeff M. Phillips, and Suresh Venkatasubramanian. Comparing distributions and shapes using the kernel distance. In SOCG, 2011.
[23]
Yi Li, Philip M. Long, and Aravind Srinivasan. Improved bounds on the sample complexity of learning. J. Comp. and Sys. Sci., 62:516--527, 2001.
[24]
Alfred Müller. Integral probability metrics and their generating classes of functions. Advances in Applied Probability, 29(2):429--443, 1997.
[25]
Aleksandar Nikolov. Tighter bounds for the discrepancy of boxes and polytopes. Technical report, arXiv:1701.05532, 2017.
[26]
Emanuel Parzen. On the estimation of a probability density function and the mode. Annals of Mathematical Statistics, 33:1065--1076, 1962.
[27]
Jeff M. Phillips. ϵ-samples for kernels. In SODA, 2013.
[28]
Jeff M. Phillips and Suresh Venkatasubramanian. A gentle introduction to the kernel distance. arXiv:1103.1625, March 2011.
[29]
Jeff M. Phillips, Bei Wang, and Yan Zheng. Geometric inference on kernel density estimates. In SOCG, 2015.
[30]
Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, 2002.
[31]
David W. Scott. Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, 1992.
[32]
Seth R. Flaxman, Yu-Xiang Wang, and Alexander J. Smola. Who supported Obama in 2012?: Ecological inference through distribution regression. In KDD, 2015.
[33]
Bernard W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, 1986.
[34]
Le Song, Xinhua Zhang, Alex Smola, Arthur Gretton, and Bernhard Schölkopf. Tailoring density estimation via reproducing kernel moment matching. In ICML, 2008.
[35]
Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Schölkopf, and Gert R. G. Lanckriet. Hilbert space embeddings and metrics on probability measures. JMLR, 11:1517--1561, 2010.
[36]
Grace Wahba. Support vector machines, reproducing kernel Hilbert spaces, and randomization. In Advances in Kernel Methods - Support Vector Learning, pages 69--88. 1999.
[37]
Yan Zheng and Jeff M. Phillips. L∞ error and bandwidth selection for kernel density estimates of large data. In KDD, 2015.

    Published In

    SODA '18: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms
    January 2018
    2859 pages
    ISBN:9781611975031
Program Chair: Artur Czumaj

    Publisher

    Society for Industrial and Applied Mathematics

    United States

    Qualifiers

    • Research-article

    Conference

SODA '18: Symposium on Discrete Algorithms
January 7 - 10, 2018
New Orleans, Louisiana

    Acceptance Rates

Overall Acceptance Rate: 411 of 1,322 submissions, 31%
