Lecture 13
Department of Economics
ECON10005 Quantitative Methods 1
LECTURE 13
Hypothesis Testing 1
The story so far: Estimation theory for a population mean
Suppose we have a simple random sample $X_1, \ldots, X_n$ from a
population with mean $\mu = E(X_i)$ and variance $\sigma^2 = \mathrm{var}(X_i)$.
The sample mean is $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. $\bar{X}$ is a random variable.
− The distribution of $\bar{X}$ is called the sampling distribution.
− The mean of the sampling distribution of $\bar{X}$ is $E(\bar{X}) = \mu$, so $\bar{X}$ is unbiased.
− The variance of the sampling distribution of $\bar{X}$ is $\mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}$.
− The Central Limit Theorem shows that
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \overset{a}{\sim} N(0, 1).$$
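The mean and variance results above can be checked by simulation. The following is a minimal sketch, assuming a hypothetical Uniform(0, 1) population (so $\mu = 1/2$ and $\sigma^2 = 1/12$); the function name, sample size and number of replications are illustrative choices, not part of the lecture.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from Uniform(0, 1) and return each sample mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        means.append(sum(sample) / n)
    return means

# Hypothetical population: Uniform(0, 1), with mu = 1/2 and sigma^2 = 1/12.
n, reps = 30, 5000
means = simulate_sample_means(n, reps)

mean_of_means = statistics.fmean(means)     # theory: mu = 0.5
var_of_means = statistics.variance(means)   # theory: sigma^2 / n = 1/360

print(f"mean of sample means:     {mean_of_means:.4f} (theory 0.5000)")
print(f"variance of sample means: {var_of_means:.6f} (theory {1 / 12 / n:.6f})")
```

The population here is deliberately non-normal: the sample means should still centre on $\mu$ with variance close to $\sigma^2 / n$.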
Variance Estimation
The aim is to use $\bar{X}$ to draw inferences about $\mu$. The CLT result
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \overset{a}{\sim} N(0, 1)$$
provides a good foundation for inference, except that $\sigma$ is unknown.
The sample variance
$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
can be shown to be an unbiased estimator of $\sigma^2$, i.e. $E(s^2) = \sigma^2$.
You will prove this in tutorial 7, next week.
It can be shown that the CLT continues to work when $\sigma$ is replaced by $s$:
$$\frac{\bar{X} - \mu}{s / \sqrt{n}} \overset{a}{\sim} N(0, 1)$$
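Unbiasedness of $s^2$ can be illustrated (not proved) by simulation. This is a sketch under assumed settings: a hypothetical normal population with $\sigma = 2$, samples of size 5, and 20000 replications. It compares the $n - 1$ divisor with the naive divisor $n$.

```python
import random

def variance_with_divisor(xs, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / divisor

rng = random.Random(0)
sigma, n, reps = 2.0, 5, 20000   # hypothetical settings; true sigma^2 = 4

total_unbiased = 0.0   # running sum of s^2 (divisor n - 1)
total_naive = 0.0      # running sum of the divisor-n version
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    total_unbiased += variance_with_divisor(sample, n - 1)
    total_naive += variance_with_divisor(sample, n)

avg_s2 = total_unbiased / reps      # should be close to sigma^2 = 4
avg_naive = total_naive / reps      # should be close to (n - 1)/n * sigma^2 = 3.2

print(f"average s^2 (divisor n - 1): {avg_s2:.3f}")
print(f"average with divisor n:      {avg_naive:.3f}")
```

On average the divisor-$n$ version underestimates $\sigma^2$, which is why $s^2$ uses $n - 1$.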
The t distribution
However, when $\sigma$ is estimated by $s$, a better approximation may
be provided by the t distribution.
− The t distribution is very similar to the normal distribution.
− It depends on the degrees of freedom, which here is $n - 1$.
− The t distribution has a (slightly) larger variance than the standard normal:
$\mathrm{var}(t_{n-1}) = \frac{n - 1}{n - 3}$ for $n > 3$.
− As $n$ increases, the t and normal distributions become increasingly
hard to distinguish.
− If $X_i$ is normally distributed then $\frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$ exactly.
(Although $X_i$ can never be exactly normally distributed!)
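To see how quickly the t variance approaches the standard normal variance of 1, the formula $(n - 1)/(n - 3)$ can be tabulated for a few sample sizes. This is a small sketch; the function name and the chosen values of $n$ are illustrative.

```python
def t_variance(n):
    """Variance of the t distribution with n - 1 degrees of freedom (requires n > 3)."""
    if n <= 3:
        raise ValueError("the variance formula requires n > 3")
    return (n - 1) / (n - 3)

# n = 11 and n = 21 correspond to 10 and 20 degrees of freedom.
for n in (5, 11, 21, 90, 1000):
    print(f"n = {n:4d}, d.f. = {n - 1:3d}: var(t) = {t_variance(n):.4f}")
```

The variance is always above 1 and shrinks towards 1 as $n$ grows.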
[Figure: t distribution with 10 degrees of freedom]
[Figure: t distribution with 20 degrees of freedom]
Application: Children with Cochlear Implants
Profoundly deaf children can be given Cochlear Implants (CI) to
provide some hearing.
In the general population, IQ scores have mean 100 and s.d. 15.
From the sample of IQ scores for the 90 CI children, we calculate
$\bar{X} = 105.22$, $s = 12.54$.
Does this evidence suggest that CI children have the same average IQ
as the general population?
This is formalised as a hypothesis test.
Steps in Conducting a Hypothesis Test
Hypothesis Testing for a Mean
Let $\mu$ denote the (unknown) mean of the population of interest,
e.g. average IQ in the population of children with Cochlear Implants.
The specification of $H_A$ depends on the wording of the question:
− Does average IQ for CI children differ from the general population?
− Is average IQ for CI children below that of the general population?
The specification of $H_A$ cannot depend on the data!
Hypothesis Testing for a Mean
We specify a null hypothesis ($H_0$) that the population mean is
equal to some value of interest,
e.g. $H_0 : \mu = 100$.
We then specify one of three possible alternative hypotheses ($H_A$):
− $H_A : \mu \neq 100$ (two-tail test)
− $H_A : \mu > 100$ (upper-tail test)
− $H_A : \mu < 100$ (lower-tail test)
The test statistic
Next calculate the t-statistic:
$$t = \frac{\bar{X} - 100}{s / \sqrt{n}}$$
− $\bar{X}$ is calculated from the sample, e.g. $\bar{X} = 105.22$
− 100 is the value specified by $H_0$
− $s$ is calculated from the sample, e.g. $s = 12.54$
− $s / \sqrt{n}$ is called the “standard error” (s.e.) of $\bar{X}$,
e.g. $\mathrm{s.e.}(\bar{X}) = s / \sqrt{n} = 12.54 / \sqrt{90} = 1.32$
− The t-statistic is
$$t = \frac{105.22 - 100}{1.32} = 3.95$$
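The calculation above can be reproduced in a few lines. This is a minimal sketch using the sample values from the CI example; the function name `t_statistic` and its argument names are illustrative.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """Return (t, se): the t-statistic for H0: mu = mu_0 and the standard error of the mean."""
    se = s / math.sqrt(n)          # standard error of the sample mean
    return (x_bar - mu_0) / se, se

# Values from the CI example: sample mean 105.22, s = 12.54, n = 90, H0: mu = 100.
t, se = t_statistic(x_bar=105.22, mu_0=100.0, s=12.54, n=90)
print(f"s.e. = {se:.2f}, t = {t:.2f}")   # s.e. = 1.32, t = 3.95
```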
Recall the fundamental (approximate) distributional result:
$$\frac{\bar{X} - \mu}{s / \sqrt{n}} \overset{a}{\sim} t_{n-1}$$
(t distribution with $n - 1$ d.f.)
We have calculated the t-statistic
$$t = \frac{\bar{X} - 100}{s / \sqrt{n}} = 3.95$$
If $\mu = 100$ then t should “look like” it was drawn from $t_{n-1}$.
If $\mu \neq 100$ then t may “look unlike” it was drawn from $t_{n-1}$.
How do we decide between these?
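One way to build intuition for “look like” versus “look unlike” is to simulate t-statistics with $H_0$ true and with $H_0$ false. The sketch below assumes a hypothetical normal population with s.d. 15 and, for the false case, a hypothetical true mean of 105; neither assumption comes from the CI data.

```python
import math
import random

def t_stat(sample, mu_0):
    """t-statistic for H0: mu = mu_0, computed from one sample."""
    n = len(sample)
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    return (x_bar - mu_0) / math.sqrt(s2 / n)

def simulate_t_stats(mu, sigma, n, reps, mu_0, seed):
    """Repeatedly sample from N(mu, sigma^2) and compute the t-statistic against mu_0."""
    rng = random.Random(seed)
    return [t_stat([rng.gauss(mu, sigma) for _ in range(n)], mu_0)
            for _ in range(reps)]

# Hypothetical populations: H0 true (mu = 100) versus H0 false (mu = 105).
null_true = simulate_t_stats(mu=100, sigma=15, n=90, reps=2000, mu_0=100, seed=0)
null_false = simulate_t_stats(mu=105, sigma=15, n=90, reps=2000, mu_0=100, seed=1)

# Share of simulated t-statistics at least as extreme as the observed 3.95.
share_true = sum(t > 3.95 for t in null_true) / len(null_true)
share_false = sum(t > 3.95 for t in null_false) / len(null_false)
print(f"share of t > 3.95 when mu = 100: {share_true:.4f}")
print(f"share of t > 3.95 when mu = 105: {share_false:.4f}")
```

When $\mu = 100$, values as large as 3.95 should almost never occur; when the true mean is above 100, they are far more common.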
[Figure: $t_{n-1}$ distribution with n = 90, showing t = 3.95]
Decision Rules
If the calculated t-statistic is “far enough” into the tail of the
t distribution, we decide the evidence in the sample is
against H0 being true.
Decision rule (using p-value approach) - upper tail test
[Figure: example with t = 0.5]
Most common decision rule:
reject $H_0$ if $p = P(t_{n-1} > t) < 0.05$.
− The significance level (here 0.05) is commonly denoted $\alpha$.
− More later on what this level implies...
Excel example, with n = 90 (d.f. = n − 1 = 89):
[Figure: Excel example, annotated $P(t_{n-1} \le t) = 0.95$ (d.f. = $n - 1$)]
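The upper tail p-value can also be computed without a spreadsheet. The sketch below is one possible approach, not the lecture's method: it integrates the t density numerically with Simpson's rule, using only the Python standard library. The function names and the number of integration steps are illustrative assumptions.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for T ~ t_df, using Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps                      # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

# CI example: t = 3.95 with d.f. = n - 1 = 89.
p = upper_tail_p(3.95, df=89)
print(f"p-value for t = 3.95: {p:.6f}, reject H0 at the 5% level: {p < 0.05}")

# For contrast, a t-statistic of 0.5 is not far into the upper tail.
p_small_t = upper_tail_p(0.5, df=89)
print(f"p-value for t = 0.50: {p_small_t:.4f}, reject H0 at the 5% level: {p_small_t < 0.05}")
```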
Upper tail test decision rule
Equivalently:
reject $H_0$ if $t > t_{\alpha, n-1}$
where $t_{\alpha, n-1}$ is the critical value satisfying $P(t_{n-1} > t_{\alpha, n-1}) = \alpha$.
$t_{\alpha, n-1}$ in Excel: = T.INV(1 − α, n − 1)
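The critical value can be found numerically by searching for the point whose upper tail probability equals $\alpha$. The sketch below uses bisection on a numerically integrated t density; it is an assumed stand-in for what T.INV(1 − α, n − 1) returns in Excel, and all function names and tolerances are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for T ~ t_df and t >= 0, via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def critical_value(alpha, df, lo=0.0, hi=20.0, tol=1e-8):
    """Bisection for t_alpha with P(T > t_alpha) = alpha (assumes 0 < alpha < 0.5)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_tail_p(mid, df) > alpha:
            lo = mid    # tail area still too large: move right
        else:
            hi = mid    # tail area too small: move left
    return (lo + hi) / 2

# CI example: alpha = 0.05, d.f. = 89, observed t = 3.95.
t_crit = critical_value(0.05, df=89)
print(f"critical value: {t_crit:.3f}, reject H0: {3.95 > t_crit}")
```

Comparing t with the critical value gives the same decision as comparing the p-value with $\alpha$, because the upper tail probability decreases as t increases.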
Things to know
The t distribution
Hypotheses - null and alternatives
(Upper tail, lower tail and two-tailed specifications)
Decision rules for an upper tail test using
- p-value
- t-statistic and critical value