The Normal Approximation To The Binomial Distribution
Fall 2012
The binomial distribution is given by
$$
f(x) = C(n,x)\,p^x q^{n-x} , \qquad C(n,x) \equiv \frac{n!}{x!\,(n-x)!} .
$$
The function $f(x)$ represents the probability of exactly $x$ successes in $n$ Bernoulli trials
(cf. pp. 756-758 of Boas), where a given trial has two possible outcomes: a success
with probability $p$ and a failure with probability $q = 1 - p$. Each repeated trial is an
independent event.
The expectation value of the binomial distribution can be computed using the following trick. Consider the binomial expansion
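$$
(p+q)^n = \sum_{k=0}^{n} C(n,k)\,p^k q^{n-k} .
$$
Applying the operator $p\,d/dp$ to both sides, with $p$ and $q$ treated as independent variables, gives
$$
p\,\frac{d}{dp}(p+q)^n = \sum_{k=0}^{n} k\,C(n,k)\,p^k q^{n-k} .
$$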
Evaluating the left-hand side of the above equation then yields
$$
np\,(p+q)^{n-1} = \sum_{k=0}^{n} k\,C(n,k)\,p^k q^{n-k} .
$$
The above result is true for any $p$ and $q$. If we apply it to the case where $q = 1 - p$, then
we find
$$
np = \sum_{k=0}^{n} k\,f(k) = \overline{x} ,
$$
where we recognize $\sum_{k=0}^{n} k\,f(k)$ as the expectation value (or mean) of the binomial distribution. Hence, we conclude that
$$
\overline{x} = np .
$$
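As a quick numerical sanity check of the mean, the sum $\sum_k k\,f(k)$ can be evaluated directly. The sketch below is only an illustration: the function names and the values $n = 40$, $p = 0.3$ are assumptions chosen for the example and do not come from the notes.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability f(k) of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_mean(n: int, p: float) -> float:
    """Direct evaluation of sum_{k=0}^{n} k f(k)."""
    return sum(k * binomial_pmf(k, n, p) for k in range(n + 1))


# Illustrative values (assumed for this example only).
n, p = 40, 0.3
print(binomial_mean(n, p), n * p)  # the two numbers should agree closely
```

The direct sum and the closed form $np$ should agree to within floating-point rounding.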
By a similar trick, we may compute the variance of the binomial distribution. In this
case, we evaluate
$$
p^2\,\frac{d^2}{dp^2}(p+q)^n = \sum_{k=0}^{n} k(k-1)\,C(n,k)\,p^k q^{n-k} .
$$
Evaluating the left-hand side of the above equation then yields
$$
n(n-1)\,p^2 (p+q)^{n-2} = \sum_{k=0}^{n} k(k-1)\,C(n,k)\,p^k q^{n-k} .
$$
The above result is true for any $p$ and $q$. If we apply it to the case where $q = 1 - p$, then
we find
$$
n(n-1)\,p^2 = \sum_{k=0}^{n} k^2 f(k) - \sum_{k=0}^{n} k\,f(k) = \overline{x^2} - \overline{x} ,
$$
after recognizing $\sum_{k=0}^{n} k^2 f(k)$ as the average value of $x^2$ for the binomial distribution.
Since $\overline{x} = np$, we conclude that
$$
\overline{x^2} = n(n-1)\,p^2 + np .
$$
Hence, the variance is given by
$$
\mathrm{Var}(x) = \overline{x^2} - (\overline{x})^2 = n(n-1)\,p^2 + np - n^2 p^2 = np(1-p) .
$$
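The variance formula can be checked the same way, by summing $k^2 f(k)$ and $k f(k)$ directly and also by evaluating the intermediate quantity $\sum_k k(k-1) f(k) = n(n-1)p^2$. This is a minimal sketch; the helper names and the test values are assumptions made for the illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability f(k) of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def factorial_moment(n: int, p: float) -> float:
    """Direct evaluation of sum_k k (k - 1) f(k)."""
    return sum(k * (k - 1) * binomial_pmf(k, n, p) for k in range(n + 1))


def binomial_variance(n: int, p: float) -> float:
    """Var(x) = mean of x^2 minus the square of the mean, by direct summation."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    mean_sq = sum(k * k * binomial_pmf(k, n, p) for k in range(n + 1))
    return mean_sq - mean**2


# Illustrative values (assumed for this example only).
n, p = 40, 0.3
print(factorial_moment(n, p), n * (n - 1) * p**2)
print(binomial_variance(n, p), n * p * (1.0 - p))
```

Each printed pair should agree to within rounding error.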
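When $n$, $np$ and $nq$ are all large, the binomial distribution is well approximated by the normal distribution,
$$
f(x) \simeq \frac{1}{\sqrt{2\pi npq}}\; e^{-(x-np)^2/2npq} .
$$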
In these notes, we will prove this result and establish the size of the correction.
We start with the explicit form for the binomial distribution,
$$
f(x) = \frac{n!}{x!\,(n-x)!}\; p^x q^{n-x} ,
$$
where $q = 1 - p$. By assumption $n$, $np$ and $nq$ are large.[1] We are interested in approximating the binomial distribution by the normal distribution in the region where the
binomial distribution differs significantly from zero. This is the region in the vicinity of
the mean $np$. Thus, we assume that $x$ does not deviate too much from $np$. We shall
see that $\delta \equiv x - np$ should be of $O(\sqrt{n})$. This is not much of a restriction, since once
$x$ deviates from $np$ by many standard deviations, $f(x)$ becomes very small and can be
crudely approximated as being zero. Hence, in what follows we shall take $x$ and $n - x$ to
both be of $O(n)$ as $n$ is taken large.

[1] As long as $p$ is not too close to either 0 or 1, it follows that $np$ and $nq$ are both of $O(n)$ as $n$ is taken large.
Using Stirling's formula [cf. eqs. (11.1) and (11.5) on p. 552 of Boas],
$$
n! = n^n e^{-n} \sqrt{2\pi n}\,\left[1 + O\!\left(\frac{1}{n}\right)\right] ,
$$
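To illustrate the size of the $O(1/n)$ correction in Stirling's formula, one can compare $n!$ with $n^n e^{-n}\sqrt{2\pi n}$ numerically. The sketch below works with logarithms to avoid overflow; the function names are hypothetical and the sampled values of $n$ are arbitrary.

```python
from math import expm1, lgamma, log, pi


def log_stirling(n: int) -> float:
    """Logarithm of n^n e^{-n} sqrt(2 pi n), the leading Stirling term."""
    return n * log(n) - n + 0.5 * log(2.0 * pi * n)


def stirling_relative_error(n: int) -> float:
    """Return n! / (n^n e^{-n} sqrt(2 pi n)) - 1, computed via logarithms."""
    # lgamma(n + 1) equals ln(n!) and avoids overflow for large n.
    return expm1(lgamma(n + 1) - log_stirling(n))


for n in (10, 100, 1000):
    err = stirling_relative_error(n)
    # If the correction is O(1/n), the product n * err should stay bounded.
    print(n, err, n * err)
```

The product `n * err` settles near $1/12$, the known leading coefficient of the Stirling series, which is consistent with a relative correction of $O(1/n)$.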
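we have
$$
\begin{aligned}
f(x) &= p^x q^{n-x}\,\frac{n^n e^{-n}\sqrt{2\pi n}}{x^x e^{-x}\sqrt{2\pi x}\;(n-x)^{n-x} e^{-(n-x)}\sqrt{2\pi (n-x)}}\,\left[1 + O\!\left(\frac{1}{n}\right)\right] \\
&= \left(\frac{p}{x}\right)^{x} \left(\frac{q}{n-x}\right)^{n-x} n^n \sqrt{\frac{n}{2\pi x(n-x)}}\,\left[1 + O\!\left(\frac{1}{n}\right)\right] \\
&= \left(\frac{np}{x}\right)^{x} \left(\frac{nq}{n-x}\right)^{n-x} \sqrt{\frac{n}{2\pi x(n-x)}}\,\left[1 + O\!\left(\frac{1}{n}\right)\right] .
\end{aligned}
\tag{1}
$$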
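Eq. (1) can be tested numerically by comparing the exact binomial probability with the last line of eq. (1), dropping the $[1 + O(1/n)]$ factor. The following is a sketch under assumed parameter values ($p = 0.3$ with $x$ at the mean); the function names are illustrative only.

```python
from math import exp, lgamma, log, pi, sqrt


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact f(x), evaluated through log-gamma to avoid overflow."""
    q = 1.0 - p
    log_f = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
             + x * log(p) + (n - x) * log(q))
    return exp(log_f)


def stirling_form(x: int, n: int, p: float) -> float:
    """Last line of eq. (1) without the [1 + O(1/n)] factor (needs 0 < x < n)."""
    q = 1.0 - p
    log_product = x * log(n * p / x) + (n - x) * log(n * q / (n - x))
    return exp(log_product) * sqrt(n / (2.0 * pi * x * (n - x)))


# Illustrative values (assumed): x sits at the mean np.
for n, x in ((100, 30), (1000, 300)):
    ratio = binomial_pmf(x, n, 0.3) / stirling_form(x, n, 0.3)
    print(n, x, ratio - 1.0)  # deviation from 1 should shrink roughly like 1/n
```

Increasing $n$ tenfold should reduce the deviation of the ratio from 1 by about a factor of ten.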
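Writing $x = np + \delta$, so that $n - x = nq - \delta$, we have
$$
\ln\frac{np}{x} = \ln\frac{np}{np+\delta} = -\ln\left(1 + \frac{\delta}{np}\right) ,
$$
$$
\ln\frac{nq}{n-x} = \ln\frac{nq}{nq-\delta} = -\ln\left(1 - \frac{\delta}{nq}\right) .
$$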
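Then, using the expansion $\ln(1+x) = x - \tfrac{1}{2}x^2 + O(x^3)$, we have
$$
\begin{aligned}
\ln\left[\left(\frac{np}{x}\right)^{x} \left(\frac{nq}{n-x}\right)^{n-x}\right]
&= x \ln\frac{np}{x} + (n-x)\ln\frac{nq}{n-x} \\
&= -(\delta + np)\left[\frac{\delta}{np} - \frac{1}{2}\frac{\delta^2}{n^2p^2} + O\!\left(\frac{\delta^3}{n^3}\right)\right]
 - (nq - \delta)\left[-\frac{\delta}{nq} - \frac{1}{2}\frac{\delta^2}{n^2q^2} + O\!\left(\frac{\delta^3}{n^3}\right)\right] \\
&= -\delta\left[1 + \frac{1}{2}\frac{\delta}{np}\right] + \delta\left[1 - \frac{1}{2}\frac{\delta}{nq}\right] + O\!\left(\frac{\delta^3}{n^2}\right) \\
&= -\frac{\delta^2}{2npq} + O\!\left(\frac{\delta^3}{n^2}\right) .
\end{aligned}
$$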
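Exponentiating the above result, it follows that the product of the first two factors in
eq. (1) can be written as
$$
\left(\frac{np}{x}\right)^{x} \left(\frac{nq}{n-x}\right)^{n-x} = e^{-\delta^2/2npq}\,\left[1 + O\!\left(\frac{\delta^3}{n^2}\right)\right] .
\tag{2}
$$
Moreover, the square root factor in eq. (1) can be approximated by
$$
\sqrt{\frac{n}{2\pi x(n-x)}} = \sqrt{\frac{n}{2\pi (np+\delta)(nq-\delta)}} = \frac{1}{\sqrt{2\pi npq}}\,\left[1 + O\!\left(\frac{\delta}{n}\right)\right] .
\tag{3}
$$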
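The two approximations labeled (2) and (3) can be probed numerically with $\delta = x - np$. The sketch below compares the exact logarithm of the product in eq. (1) with $-\delta^2/2npq$, and the exact square root factor with $1/\sqrt{2\pi npq}$. The function names, the value $p = 0.3$, and the chosen offsets $\delta$ are assumptions made for illustration.

```python
from math import log, pi, sqrt


def log_product(x: int, n: int, p: float) -> float:
    """Exact ln[(np/x)^x (nq/(n-x))^(n-x)] for 0 < x < n."""
    q = 1.0 - p
    return x * log(n * p / x) + (n - x) * log(n * q / (n - x))


def gaussian_exponent(x: int, n: int, p: float) -> float:
    """Leading term -delta^2 / (2 n p q), with delta = x - n p."""
    delta = x - n * p
    return -delta**2 / (2.0 * n * p * (1.0 - p))


def sqrt_factor_ratio(x: int, n: int, p: float) -> float:
    """Exact square root factor of eq. (1) divided by 1/sqrt(2 pi n p q)."""
    q = 1.0 - p
    exact = sqrt(n / (2.0 * pi * x * (n - x)))
    approx = 1.0 / sqrt(2.0 * pi * n * p * q)
    return exact / approx


p = 0.3  # illustrative value (assumed)
for n, x in ((1000, 320), (4000, 1240)):
    delta = x - n * p
    log_diff = log_product(x, n, p) - gaussian_exponent(x, n, p)
    # Compare each residual with the scale predicted by the error estimate.
    print(n, delta, log_diff, delta**3 / n**2)
    print(n, delta, sqrt_factor_ratio(x, n, p) - 1.0, delta / n)
```

In each printed line the residual should be comparable in size to the quoted scale, $\delta^3/n^2$ or $\delta/n$ respectively.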
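At the beginning of this section, I argued that $x$ should differ from the mean $\mu = np$
by an amount $\delta = x - np$ of $O(\sqrt{n})$. The correction terms of $O(\delta^3/n^2)$ and $O(\delta/n)$ obtained above are then both of $O(1/\sqrt{n})$, so that for large $n$ the binomial distribution is approximated by the normal distribution
$$
\phi(x) = \frac{1}{\sqrt{2\pi npq}}\; e^{-(x-np)^2/2npq} ,
$$
whose mean and variance are
$$
\int_{-\infty}^{\infty} x\,\phi(x)\,dx = np
\qquad \text{and} \qquad
\int_{-\infty}^{\infty} x^2 \phi(x)\,dx - \left(\int_{-\infty}^{\infty} x\,\phi(x)\,dx\right)^{2} = npq ,
$$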
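A direct comparison of the binomial probabilities with the normal density gives a feel for how the approximation improves with $n$. The sketch below measures the largest relative deviation for integer $x$ within one standard deviation of the mean; the choice of that window, the value $p = 0.3$, and all function names are assumptions made for this illustration.

```python
from math import ceil, exp, floor, lgamma, log, pi, sqrt


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability, evaluated through log-gamma."""
    q = 1.0 - p
    log_f = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
             + x * log(p) + (n - x) * log(q))
    return exp(log_f)


def normal_pdf(x: float, n: int, p: float) -> float:
    """Normal density with mean n p and variance n p q."""
    var = n * p * (1.0 - p)
    return exp(-((x - n * p) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def max_relative_error(n: int, p: float) -> float:
    """Largest |binomial/normal - 1| for integer x within one sigma of the mean."""
    mean = n * p
    sigma = sqrt(n * p * (1.0 - p))
    lo, hi = ceil(mean - sigma), floor(mean + sigma)
    return max(abs(binomial_pmf(x, n, p) / normal_pdf(x, n, p) - 1.0)
               for x in range(lo, hi + 1))


p = 0.3  # illustrative value (assumed)
for n in (100, 10000):
    print(n, max_relative_error(n, p), 1.0 / sqrt(n))
```

Raising $n$ by a factor of 100 should shrink the maximum relative error by roughly a factor of 10, as expected for a correction that falls off like $1/\sqrt{n}$.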