Research article · DOI: 10.1145/2213977.2214042

Learning Poisson Binomial Distributions

Published: 19 May 2012

Abstract

We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over {0,1,...,n} is the distribution of a sum of n independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 and are a natural n-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal.
We essentially settle the complexity of the learning problem for this basic class of distributions. As our main result we give a highly efficient algorithm which learns to ε-accuracy using Õ(1/ε³) samples, independent of n. The running time of the algorithm is quasilinear in the size of its input data, i.e., Õ(log(n)/ε³) bit operations (observe that each draw from the distribution is a log(n)-bit string). This is nearly optimal, since any algorithm must use Ω(1/ε²) samples. We also give positive and negative results for some extensions of this learning problem.
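To make the object of study concrete (this is an illustration, not the authors' learning algorithm, and all function names below are hypothetical): a PBD over {0,1,...,n} is the law of a sum of n independent Bernoulli variables with arbitrary success probabilities p_1,...,p_n. The sketch below samples from a PBD, computes its exact pmf by dynamic programming, and measures how close a naive empirical estimate gets in total variation distance — the same accuracy notion the ε in the abstract refers to.

```python
import random

def sample_pbd(ps, rng):
    """One draw from the PBD with success probabilities ps = (p_1, ..., p_n):
    the sum of n independent Bernoulli(p_i) indicators."""
    return sum(rng.random() < p for p in ps)

def exact_pmf(ps):
    """Exact PBD pmf over {0, ..., n} via dynamic programming, O(n^2) time."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)      # this Bernoulli came up 0
            nxt[k + 1] += q * p        # this Bernoulli came up 1
        pmf = nxt
    return pmf

def empirical_pmf(ps, m, rng):
    """Naive estimator: histogram of m i.i.d. samples."""
    counts = [0] * (len(ps) + 1)
    for _ in range(m):
        counts[sample_pbd(ps, rng)] += 1
    return [c / m for c in counts]

ps = [0.1, 0.5, 0.9, 0.3]                      # arbitrary, non-equal expectations
pmf = exact_pmf(ps)
emp = empirical_pmf(ps, 20000, random.Random(0))
tv = 0.5 * sum(abs(a - b) for a, b in zip(pmf, emp))  # total variation distance
```

Note the contrast with the paper's result: the naive histogram estimator needs a number of samples that grows with the support size, whereas the abstract's algorithm learns any PBD to ε-accuracy from Õ(1/ε³) samples, independent of n.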

Supplementary Material

JPG File (stoc_8b_1.jpg)
MP4 File (stoc_8b_1.mp4)




Reviews

James Van Speybroeck

Class discussions of the Poisson distribution usually take second place to discussions of binomial probability distributions. Cognizant of this, the authors have conducted research on learning an unknown Poisson binomial distribution and present the results in these proceedings. Their purpose is to develop a highly efficient algorithm for dealing with this basic class of distributions independent of the population size. After an introduction using the practical example of newspaper distribution effectiveness, the paper discusses several theorems and several applications of those theorems. These theorems and proofs are very sophisticated and not for beginners. There is a detailed appendix and reference section consisting of several pages. In their conclusion, the authors state that they "have essentially settled the sample and time complexity of learning an unknown Poisson binomial distribution." Online Computing Reviews Service




Published In

STOC '12: Proceedings of the forty-fourth annual ACM symposium on Theory of computing
May 2012
1310 pages
ISBN: 9781450312455
DOI: 10.1145/2213977


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. applied probability
  2. computational learning theory
  3. learning distributions

Qualifiers

  • Research-article

Conference

STOC '12: Symposium on Theory of Computing
May 19-22, 2012
New York, New York, USA

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%

