DOI: 10.1145/2591796.2591848

Efficient density estimation via piecewise polynomial approximation

Published: 31 May 2014

Abstract

We give a computationally efficient semi-agnostic algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let p be an arbitrary distribution over an interval I, and suppose that p is τ-close (in total variation distance) to an unknown probability distribution q that is defined by an unknown partition of I into t intervals and t unknown degree-d polynomials specifying q over each of the intervals. We give an algorithm that draws Õ(t(d + 1)/ε²) samples from p, runs in time poly(t, d + 1, 1/ε), and with high probability outputs a piecewise polynomial hypothesis distribution h that is (14τ + ε)-close to p in total variation distance. Our algorithm combines tools from real approximation theory, uniform convergence, linear programming, and dynamic programming. Its sample complexity is simultaneously near-optimal in all three parameters t, d and ε; we show that even for τ = 0, any algorithm that learns an unknown t-piecewise degree-d probability distribution over I to accuracy ε must use Ω(t(d + 1)/ε²) samples from the distribution, regardless of its running time.
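To make the fit-then-partition recipe concrete, here is a minimal Python sketch. It is an assumption-laden toy, not the paper's procedure: a least-squares fit to a histogram stands in for the linear-programming subroutine the abstract mentions, and all names and parameters below are hypothetical.

```python
# Toy sketch (NOT the paper's algorithm): least-squares polynomial fits on a
# histogram stand in for the paper's LP-based fitting; dynamic programming
# then selects the best partition into at most t pieces.
import numpy as np

def fit_piecewise_poly(samples, t, d, m=64):
    """Fit an at-most-t-piece, degree-d density sketch on [0, 1].

    samples: 1-D array of draws assumed to lie in [0, 1].
    Returns a list of (left_edge, right_edge, poly_coefficients).
    """
    hist, edges = np.histogram(samples, bins=m, range=(0.0, 1.0), density=True)
    centers = (edges[:-1] + edges[1:]) / 2.0
    width = 1.0 / m

    # seg_err[i, j]: empirical L1 error of the best single degree-d fit
    # to the histogram on bins [i, j); seg_coef caches the coefficients.
    seg_err = np.full((m + 1, m + 1), np.inf)
    seg_coef = {}
    for i in range(m):
        for j in range(i + 1, m + 1):
            deg = min(d, j - i - 1)  # never fit more parameters than points
            coef = np.polyfit(centers[i:j], hist[i:j], deg)
            resid = np.polyval(coef, centers[i:j]) - hist[i:j]
            seg_err[i, j] = np.abs(resid).sum() * width
            seg_coef[(i, j)] = coef

    # dp[k, j]: best total error covering bins [0, j) with exactly k pieces.
    dp = np.full((t + 1, m + 1), np.inf)
    back = np.zeros((t + 1, m + 1), dtype=int)
    dp[0, 0] = 0.0
    for k in range(1, t + 1):
        for j in range(1, m + 1):
            for i in range(j):
                cand = dp[k - 1, i] + seg_err[i, j]
                if cand < dp[k, j]:
                    dp[k, j], back[k, j] = cand, i

    # Take the best piece count <= t and walk the backpointers.
    k = int(np.argmin(dp[:, m]))
    cuts, j = [m], m
    for kk in range(k, 0, -1):
        j = back[kk, j]
        cuts.append(j)
    cuts = cuts[::-1]
    # Unlike the real algorithm, this sketch does not force the output to be
    # a nonnegative, normalized density.
    return [(edges[i], edges[j], seg_coef[(i, j)])
            for i, j in zip(cuts[:-1], cuts[1:])]

# Usage: recover a few-piece structure from a bimodal sample on [0, 1].
rng = np.random.default_rng(0)
samples = np.concatenate([rng.beta(2, 5, 5000), rng.beta(8, 2, 5000)])
for lo, hi, coef in fit_piecewise_poly(samples, t=4, d=2):
    print(f"[{lo:.2f}, {hi:.2f}): coefficients {np.round(coef, 2)}")
```

With m histogram bins the sketch evaluates O(m²) candidate intervals and O(t·m²) dynamic-programming transitions; the number of samples, not the domain size, controls the accuracy of each per-interval fit.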
We apply this general algorithm to obtain a wide range of results for many natural density estimation problems over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of t-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of k-monotone densities. Our general technique gives improved results, with provably optimal sample complexities (up to logarithmic factors) in all parameters in most cases, for all these problems via a single unified algorithm.
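As a worked instance of how the general theorem is applied (the piece count below is a standard approximation fact, stated here as background rather than quoted from this page): a univariate log-concave density is ε-close in total variation to a piecewise linear density with t = O(1/ε^{1/2}) pieces, so invoking the general algorithm with d = 1 yields a learner for log-concave distributions with sample complexity Õ(t(d + 1)/ε²) = Õ(1/ε^{5/2}).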

Supplementary Material

MP4 File (p604-sidebyside.mp4)




Published In

STOC '14: Proceedings of the forty-sixth annual ACM symposium on Theory of computing
May 2014
984 pages
ISBN: 9781450327107
DOI: 10.1145/2591796

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. computational learning theory
  2. learning distributions
  3. unsupervised learning

Qualifiers

  • Research-article

Conference

STOC '14: Symposium on Theory of Computing
May 31 - June 3, 2014
New York, New York

Acceptance Rates

STOC '14 Paper Acceptance Rate: 91 of 319 submissions (29%)
Overall Acceptance Rate: 1,469 of 4,586 submissions (32%)


