CHAPTER 11

CHI-SQUARE TESTS

Prem Mann, Introductory Statistics, 7/E


Copyright © 2010 John Wiley & Sons. All rights reserved
THE CHI-SQUARE DISTRIBUTION
Definition
The chi-square distribution has only one
parameter, called the degrees of freedom (df). The
shape of a chi-square distribution curve is skewed
to the right for small df and becomes symmetric
for large df. The entire chi-square distribution
curve lies to the right of the vertical axis. The
chi-square distribution assumes nonnegative values
only, and these are denoted by the symbol χ² (read
as “chi-square”).

Figure 11.1 Three chi-square distribution curves.

Example 11-1

Find the value of χ² for 7 degrees of freedom and an area of .10 in the
right tail of the chi-square distribution curve.

Table 11.1 χ² for df = 7 and .10 Area in the Right Tail

Figure 11.2

Example 11-2

Find the value of χ² for 12 degrees of freedom and an area of .05 in the
left tail of the chi-square distribution curve.

Example 11-2: Solution

Area in the right tail = 1 – Area in the left tail = 1 – .05 = .95
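
The table lookups in Examples 11-1 and 11-2 can also be checked numerically. The following is only a sketch using the Python standard library: the helper names `chi2_cdf` and `chi2_value` are illustrative and not from the text, the left-tail area is computed with a series for the incomplete gamma function, and the value is found by bisection. It assumes moderate df and is not meant to replace the printed table.

```python
import math


def chi2_cdf(x, df):
    """Area to the left of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_value(right_tail_area, df):
    """Chi-square value that leaves the given area in the right tail (bisection)."""
    target = 1.0 - right_tail_area  # equivalent area in the left tail
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:
        hi *= 2.0  # widen the bracket until it contains the answer
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Example 11-1: df = 7, right-tail area .10
print(round(chi2_value(0.10, 7), 3))
# Example 11-2: df = 12, left-tail area .05, i.e. right-tail area .95
print(round(chi2_value(0.95, 12), 3))
```

Passing the right-tail area mirrors the way the chi-square table is organized, which is why Example 11-2 first converts the left-tail area of .05 to a right-tail area of .95.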

Table 11.2 χ² for df = 12 and .95 Area in the Right Tail

Figure 11.3

A GOODNESS-OF-FIT TEST
Definition
An experiment with the following characteristics is
called a multinomial experiment.
1. It consists of n identical trials (repetitions).
2. Each trial results in one of k possible outcomes
(or categories), where k > 2.
3. The trials are independent.
4. The probabilities of the various outcomes remain
constant for each trial.

Definition
The frequencies obtained from the performance of
an experiment are called the observed
frequencies and are denoted by O. The expected
frequencies, denoted by E, are the frequencies
that we expect to obtain if the null hypothesis is
true. The expected frequency for a category is
obtained as
E = np
where n is the sample size and p is the probability
that an element belongs to that category if the null
hypothesis is true.


Degrees of Freedom for a Goodness-of-Fit Test
In a goodness-of-fit test, the degrees of freedom are
df = k – 1
where k denotes the number of possible outcomes (or categories) for the
experiment.

Test Statistic for a Goodness-of-Fit Test

The test statistic for a goodness-of-fit test is χ², and its value is
calculated as
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where
O = observed frequency for a category
E = expected frequency for a category = np
Remember that a chi-square goodness-of-fit test is always right-tailed.
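
As a rough illustration of the formula above, the sketch below computes E = np for each category and then sums (O – E)²/E. The function name `goodness_of_fit_statistic` and the observed counts are hypothetical; they are not taken from the examples in this chapter.

```python
def goodness_of_fit_statistic(observed, probabilities):
    """Return (chi-square statistic, df, expected frequencies) for a goodness-of-fit test."""
    n = sum(observed)                              # sample size
    expected = [n * p for p in probabilities]      # E = np under the null hypothesis
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                         # df = k - 1
    return statistic, df, expected


# Hypothetical counts for five categories that H0 says are equally likely.
observed = [18, 22, 20, 25, 15]
statistic, df, expected = goodness_of_fit_statistic(observed, [0.20] * 5)
print(round(statistic, 3), df)
```

Because the test is always right-tailed, the null hypothesis is rejected only when the statistic exceeds the critical value read from the chi-square table for the chosen α and df.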

Example 11-3
A bank has an ATM installed inside the bank, and
it is available to its customers only from 7 AM to 6
PM Monday through Friday. The manager of the
bank wanted to investigate if the percentage of
transactions made on this ATM is the same for
each of the 5 days (Monday through Friday) of the
week. She randomly selected one week and
counted the number of transactions made on this
ATM on each of the 5 days during this week. The
information she obtained is given in the following
table, where the number of users represents the
number of transactions on this ATM on these
days. For convenience, we will refer to these
transactions as “people” or “users.”
At the 1% level of significance, can we reject the null hypothesis that
the number of people who use this ATM each of the 5 days of the week is
the same? Assume that this week is typical of all weeks in regard to the
use of this ATM.

Example 11-3: Solution

Step 1:
• H0: p1 = p2 = p3 = p4 = p5 = .20
• H1: At least two of the five proportions are not equal to .20

Step 2:
• There are 5 categories: the 5 days on which the ATM is used
• This is a multinomial experiment
• We use the chi-square distribution to make this test.

Step 3:
• Area in the right tail = α = .01
• k = number of categories = 5
• df = k – 1 = 5 – 1 = 4
• The critical value of χ² = 13.277

Figure 11.4

Table 11.3 Calculating the Value of the Test Statistic

Step 4:
• All the required calculations to find the value of the test statistic χ²
  are shown in Table 11.3.
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 23.184$$

Step 5:
• The value of the test statistic χ² = 23.184 is larger than the critical
  value of χ² = 13.277
• It falls in the rejection region
• Hence, we reject the null hypothesis
• We state that the number of persons who use this ATM is not the same for
  the 5 days of the week.

Example 11-4
In a July 23, 2009, Harris Interactive Poll, 1015 advertisers were asked
about their opinions of Twitter. The percentage distribution of their
responses is shown in the following table.

Assume that these percentages hold true for the 2009 population of
advertisers. Recently 800 randomly selected advertisers were asked the
same question. The following table lists the number of advertisers in
this sample who gave each response.

Test at the 2.5% level of significance whether the current distribution
of opinions is different from that for 2009.

Example 11-4: Solution

Step 1:
• H0: The current percentage distribution of opinions is the same as for
  2009.
• H1: The current percentage distribution of opinions is different from
  that for 2009.

Step 2:
• There are 4 categories: the 4 types of opinion
• This is a multinomial experiment
• We use the chi-square distribution to make this test.

Step 3:
• Area in the right tail = α = .025
• k = number of categories = 4
• df = k – 1 = 4 – 1 = 3
• The critical value of χ² = 9.348

Figure 11.5

Table 11.4 Calculating the Value of the Test Statistic

Step 4:
• All the required calculations to find the value of the test statistic χ²
  are shown in Table 11.4.
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 6.420$$

Step 5:
• The value of the test statistic χ² = 6.420 is smaller than the critical
  value of χ² = 9.348
• It falls in the nonrejection region
• Hence, we fail to reject the null hypothesis
• We state that the current percentage distribution of opinions appears to
  be the same as for 2009.

CONTINGENCY TABLES

A TEST OF INDEPENDENCE OR HOMOGENEITY

• A Test of Independence
• A Test of Homogeneity

A Test of Independence

Definition
A test of independence involves a test of the null hypothesis that two
attributes of a population are not related. The degrees of freedom for a
test of independence are
df = (R – 1)(C – 1)
where R and C are the number of rows and the number of columns,
respectively, in the given contingency table.

Test Statistic for a Test of Independence
The value of the test statistic χ² for a test of independence is
calculated as
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O and E are the observed and expected frequencies, respectively,
for a cell.

Example 11-5

Violence and lack of discipline have become major problems in schools in
the United States. A random sample of 300 adults was selected, and these
adults were asked if they favor giving more freedom to schoolteachers to
punish students for violence and lack of discipline. The two-way
classification of the responses of these adults is represented in the
following table.

Calculate the expected frequencies for this table, assuming that the two
attributes, gender and opinions on the issue, are independent.

Example 11-5: Solution
Table 11.6 Observed Frequencies

Expected Frequencies for a Test of Independence
The expected frequency E for a cell is calculated as
$$E = \frac{(\text{Row total})(\text{Column total})}{\text{Sample size}}$$

Example 11-5: Solution
Table 11.7 Observed and Expected Frequencies

Example 11-6
Reconsider the two-way classification table given in
Example 11-5. In that example, a random sample
of 300 adults was selected, and they were asked if
they favor giving more freedom to schoolteachers
to punish students for violence and lack of
discipline. Based on the results of the survey, a
two-way classification table was prepared and
presented in Example 11-5. Does the sample
provide sufficient information to conclude that the
two attributes, gender and opinions of adults, are
dependent? Use a 1% significance level.

Example 11-6: Solution

Step 1:
• H0: Gender and opinions of adults are independent
• H1: Gender and opinions of adults are dependent

Step 2: We use the chi-square distribution to make a test of
independence for a contingency table.

Step 3:
• α = .01
• df = (R – 1)(C – 1) = (2 – 1)(3 – 1) = 2
• The critical value of χ² = 9.210

Figure 11.6

Table 11.8 Observed and Expected Frequencies


Step 4:
$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(93 - 105.00)^2}{105.00} + \frac{(70 - 59.50)^2}{59.50} + \frac{(12 - 10.50)^2}{10.50} \\
&\quad + \frac{(87 - 75.00)^2}{75.00} + \frac{(32 - 42.50)^2}{42.50} + \frac{(6 - 7.50)^2}{7.50} \\
&= 1.371 + 1.853 + .214 + 1.920 + 2.594 + .300 = 8.252
\end{aligned}
$$

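
The Step 4 arithmetic for Example 11-6 can be reproduced with a short sketch. The observed counts below are the ones that appear in the Step 4 calculation; the function name `independence_test_statistic` is illustrative, and the row and column order is assumed from the order of the terms in that calculation.

```python
def independence_test_statistic(table):
    """Return (chi-square statistic, df, expected frequencies) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E = (row total)(column total) / sample size for every cell
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)    # df = (R - 1)(C - 1)
    return statistic, df, expected


# Observed frequencies from Step 4 of Example 11-6 (2 rows, 3 columns).
observed = [[93, 70, 12],
            [87, 32, 6]]
statistic, df, expected = independence_test_statistic(observed)
print(expected)              # expected frequencies, row by row
print(round(statistic, 2), df)
```

The statistic computed this way can differ from 8.252 in the third decimal place, because the slide adds terms that were already rounded to three decimals.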

Step 5:
• The value of the test statistic χ² = 8.252
• It is less than the critical value of χ² = 9.210
• It falls in the nonrejection region
• Hence, we fail to reject the null hypothesis
• We state that there is not enough evidence from the sample to conclude
  that the two characteristics, gender and opinions of adults, are
  dependent for this issue.

Example 11-7
A researcher wanted to study the relationship between gender and owning
cell phones. She took a sample of 2000 adults and obtained the
information given in the following table.

At the 5% level of significance, can you conclude that gender and owning
cell phones are related for all adults?

Example 11-7: Solution

Step 1:
• H0: Gender and owning a cell phone are not related
• H1: Gender and owning a cell phone are related

Step 2:
• We are performing a test of independence
• We use the chi-square distribution

Step 3:
• α = .05
• df = (R – 1)(C – 1) = (2 – 1)(2 – 1) = 1
• The critical value of χ² = 3.841

Figure 11.7

Table 11.9 Observed and Expected Frequencies


Step 4:
$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(640 - 588.60)^2}{588.60} + \frac{(450 - 501.40)^2}{501.40} + \frac{(440 - 491.40)^2}{491.40} + \frac{(470 - 418.60)^2}{418.60} \\
&= 4.489 + 5.269 + 5.376 + 6.311 = 21.445
\end{aligned}
$$


Step 5:
• The value of the test statistic χ² = 21.445
• It is larger than the critical value of χ² = 3.841
• It falls in the rejection region
• Hence, we reject the null hypothesis
• We conclude that gender and owning a cell phone are related for all
  adults.

A Test of Homogeneity

Definition
A test of homogeneity involves testing the
null hypothesis that the proportions of
elements with certain characteristics in two
or more different populations are the same
against the alternative hypothesis that these
proportions are not the same.

Example 11-8

Consider the data on income distributions for households in California
and Wisconsin given in Table 11.10. Using the 2.5% significance level,
test the null hypothesis that the distribution of households with regard
to income levels is similar (homogeneous) for the two states.

Table 11.10 Observed Frequencies

Example 11-8: Solution

Step 1:
• H0: The proportions of households that belong to different income
  groups are the same in both states
• H1: The proportions of households that belong to different income
  groups are not the same in both states

Step 2: We use the chi-square distribution to make a homogeneity test.

Step 3:
• α = .025
• df = (R – 1)(C – 1) = (3 – 1)(2 – 1) = 2
• The critical value of χ² = 7.378

Figure 11.8

Table 11.11 Observed and Expected Frequencies


Step 4:
$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(70 - 65)^2}{65} + \frac{(34 - 39)^2}{39} + \frac{(80 - 75)^2}{75} + \frac{(40 - 45)^2}{45} + \frac{(100 - 110)^2}{110} + \frac{(76 - 66)^2}{66} \\
&= .385 + .641 + .333 + .556 + .909 + 1.515 = 4.339
\end{aligned}
$$

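
A test of homogeneity uses the same statistic as a test of independence, so the Step 4 sum for Example 11-8 can be checked cell by cell. The sketch below takes the observed and expected pairs exactly as they appear in Step 4; the variable names are illustrative.

```python
# (observed, expected) pairs in the order used in Step 4 of Example 11-8.
cells = [(70, 65), (34, 39), (80, 75), (40, 45), (100, 110), (76, 66)]

# Contribution of each cell to the chi-square statistic, rounded as on the slide.
contributions = [round((o - e) ** 2 / e, 3) for o, e in cells]
statistic = sum((o - e) ** 2 / e for o, e in cells)

print(contributions)
print(round(statistic, 3))
```

Listing the per-cell contributions makes it easy to see which cells drive the statistic and to spot a slip in any single term.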

Step 5:
• The value of the test statistic χ² = 4.339
• It is less than the critical value of χ² = 7.378
• It falls in the nonrejection region
• Hence, we fail to reject the null hypothesis
• We state that the distribution of households with regard to income
  appears to be similar (homogeneous) in California and Wisconsin.

INFERENCES ABOUT THE POPULATION VARIANCE

• Estimation of the Population Variance
• Hypothesis Tests About the Population Variance

Sampling Distribution of (n – 1)s² / σ²
If the population from which the sample is selected is (approximately)
normally distributed, then
$$\frac{(n - 1)s^2}{\sigma^2}$$
has a chi-square distribution with n – 1 degrees of freedom.

Estimation of the Population Variance
Confidence interval for the population variance σ²
Assuming that the population from which the sample is selected is
(approximately) normally distributed, we obtain the (1 – α)100%
confidence interval for the population variance σ² as
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \quad \text{to} \quad \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$
where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are obtained from the
chi-square distribution for α/2 and 1 – α/2 areas in the right tail of
the chi-square distribution curve, respectively, and for n – 1 degrees of
freedom. The confidence interval for the population standard deviation
can be obtained by simply taking the positive square roots of the two
limits of the confidence interval for the population variance.

Example 11-9

One type of cookie manufactured by Haddad Food Company is Cocoa Cookies.
The machine that fills packages of these cookies is set up in such a way
that the average net weight of these packages is 32 ounces with a
variance of .015 square ounce. From time to time the quality control
inspector at the company selects a sample of a few such packages,
calculates the variance of the net weights of these packages, and
constructs a 95% confidence interval for the population variance. If
either one or both of the two limits of this confidence interval is not
in the interval .008 to .030, the machine is stopped and adjusted. A
recently taken random sample of 25 packages from the production line gave
a sample variance of .029 square ounce. Based on this sample information,
do you think the machine needs an adjustment? Assume that the net weights
of cookies in all packages are normally distributed.

Example 11-9: Solution

Step 1:
• n = 25 and s² = .029

Step 2:
• α = 1 – .95 = .05
• α/2 = .05/2 = .025
• 1 – α/2 = 1 – .025 = .975
• df = n – 1 = 25 – 1 = 24
• χ² for 24 df and .025 area in the right tail = 39.364
• χ² for 24 df and .975 area in the right tail = 12.401

Figure 11.9


Step 3:
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \quad \text{to} \quad \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$
$$\frac{(25 - 1)(.029)}{39.364} \quad \text{to} \quad \frac{(25 - 1)(.029)}{12.401}$$
.0177 to .0561
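
The interval in Step 3 can be reproduced with a few lines of arithmetic. This is only a sketch: the function name `variance_confidence_interval` is illustrative, the two chi-square values are the table values quoted in Step 2, and the acceptance range .008 to .030 is the one stated in Example 11-9.

```python
import math


def variance_confidence_interval(n, sample_variance, chi2_upper, chi2_lower):
    """Confidence limits for the population variance.

    chi2_upper is the table value for alpha/2 area in the right tail,
    chi2_lower is the table value for 1 - alpha/2 area in the right tail.
    """
    numerator = (n - 1) * sample_variance
    return numerator / chi2_upper, numerator / chi2_lower


low, high = variance_confidence_interval(25, 0.029, 39.364, 12.401)
print(round(low, 4), round(high, 4))       # limits for the variance

# Limits for the standard deviation: positive square roots of the two limits.
sd_low, sd_high = math.sqrt(low), math.sqrt(high)

# Decision rule from Example 11-9: both limits must lie in .008 to .030.
needs_adjustment = not (0.008 <= low <= 0.030 and 0.008 <= high <= 0.030)
print(needs_adjustment)
```

Note that the larger chi-square value goes in the denominator of the lower limit, which is why the arguments are passed in that order.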


Thus, with 95% confidence, we can state that the variance for all
packages of Cocoa Cookies lies between .0177 and .0561 square ounce. The
lower limit (.0177) lies within the interval .008 to .030, but the upper
limit (.0561) does not, so the machine needs to be stopped and adjusted.

Hypothesis Tests About the Population Variance
Test Statistic for a Test of Hypothesis About σ²
The value of the test statistic χ² is calculated as
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$
where s² is the sample variance, σ² is the hypothesized value of the
population variance, and n – 1 represents the degrees of freedom. The
population from which the sample is selected is assumed to be
(approximately) normally distributed.

Example 11-10
One type of cookie manufactured by Haddad Food
Company is Cocoa Cookies. The machine that fills
packages of these cookies is set up in such a way
that the average net weight of these packages is
32 ounces with a variance of .015 square ounce.
From time to time the quality control inspector at
the company selects a sample of a few such
packages, calculates the variance of the net
weights of these packages, and makes a test of
hypothesis about the population variance. She
always uses α = .01. The acceptable value of the
population variance is .015 square ounce or less.

If the conclusion from the test of
hypothesis is that the population variance
is not within the acceptable limit, the
machine is stopped and adjusted. A
recently taken random sample of 25
packages from the production line gave a
sample variance of .029 square ounce.
Based on this sample information, do you
think the machine needs an adjustment?
Assume that the net weights of cookies in
all packages are normally distributed.
Example 11-10: Solution

Step 1:
• H0: σ² ≤ .015 (The population variance is within the acceptable limit)
• H1: σ² > .015 (The population variance exceeds the acceptable limit)

Step 2: We use the chi-square distribution to test a hypothesis about σ².

Step 3:
• α = .01
• df = n – 1 = 25 – 1 = 24
• The critical value of χ² = 42.980

Figure 11.10


Step 4:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(25 - 1)(.029)}{.015} = 46.400$$
where σ² = .015 is the value from H0.


Step 5:
• The value of the test statistic χ² = 46.400
• It is greater than the critical value of χ² = 42.980
• It falls in the rejection region
• Hence, we reject the null hypothesis H0
• We conclude that the population variance is not within the acceptable
  limit. The machine should be stopped and adjusted.

Example 11-11
The variance of scores on a standardized
mathematics test for all high school seniors was
150 in 2009. A sample of scores for 20 high
school seniors who took this test this year gave a
variance of 170. Test at the 5% significance level
if the variance of current scores of all high school
seniors on this test is different from 150. Assume
that the scores of all high school seniors on this
test are (approximately) normally distributed.

Example 11-11: Solution

Step 1:
• H0: σ² = 150 (The population variance is not different from 150)
• H1: σ² ≠ 150 (The population variance is different from 150)

Step 2: We use the chi-square distribution to test a hypothesis about σ².

Step 3:
• α = .05
• Area in each tail = .025
• df = n – 1 = 20 – 1 = 19
• The critical values of χ² are 32.852 and 8.907

Figure 11.11


Step 4:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(20 - 1)(170)}{150} = 21.533$$
where σ² = 150 is the value from H0.
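
The test statistics in Examples 11-10 and 11-11 follow the same pattern, so both can be checked with one small helper. This is a sketch only: the function name `variance_test` and its parameters are illustrative, and the critical values are the table values quoted in Step 3 of each example.

```python
def variance_test(n, sample_variance, hypothesized_variance,
                  lower_critical=None, upper_critical=None):
    """Return (chi-square statistic, reject H0?) for a test about a population variance.

    Supply only upper_critical for a right-tailed test, only lower_critical
    for a left-tailed test, or both for a two-tailed test.
    """
    statistic = (n - 1) * sample_variance / hypothesized_variance
    reject = False
    if lower_critical is not None and statistic < lower_critical:
        reject = True
    if upper_critical is not None and statistic > upper_critical:
        reject = True
    return statistic, reject


# Example 11-10: right-tailed test, critical value 42.980
stat_10, reject_10 = variance_test(25, 0.029, 0.015, upper_critical=42.980)
print(round(stat_10, 3), reject_10)

# Example 11-11: two-tailed test, critical values 8.907 and 32.852
stat_11, reject_11 = variance_test(20, 170, 150,
                                   lower_critical=8.907, upper_critical=32.852)
print(round(stat_11, 3), reject_11)
```

Keeping the critical values as explicit arguments mirrors the table-based procedure in the examples, where the tail areas determine which values are looked up.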


Step 5:
• The value of the test statistic χ² = 21.533
• It is between the two critical values of χ²
• It falls in the nonrejection region
• Consequently, we fail to reject H0.
• We conclude that the population variance of the current scores of high
  school seniors on this standardized mathematics test does not appear to
  be different from 150.

CHAPTER 12

ANALYSIS OF VARIANCE

THE F DISTRIBUTION

Definition
1. The F distribution is continuous and
skewed to the right.
2. The F distribution has two numbers of
degrees of freedom: df for the numerator
and df for the denominator.
3. The units of an F distribution, denoted F,
are nonnegative.

Figure 12.1 Three F distribution curves.

Example 12-1

Find the F value for 8 degrees of freedom for the numerator, 14 degrees
of freedom for the denominator, and .05 area in the right tail of the F
distribution curve.

Table 12.1 Obtaining the F Value From Table VII

Figure 12.2 The critical value of F for 8 df for the numerator, 14 df
for the denominator, and .05 area in the right tail.

ONE-WAY ANALYSIS OF VARIANCE

• Calculating the Value of the Test Statistic
• One-Way ANOVA Test

Definition
ANOVA is a procedure used to test the null
hypothesis that the means of three or more
populations are equal.

Assumptions of One-Way ANOVA

The following assumptions must hold true to use one-way ANOVA.
1. The populations from which the samples are drawn are (approximately)
   normally distributed.
2. The populations from which the samples are drawn have the same
   variance (or standard deviation).
3. The samples drawn from different populations are random and
   independent.

Calculating the Value of the Test Statistic

Test Statistic F for a One-Way ANOVA Test
The value of the test statistic F for an ANOVA test is calculated as
$$F = \frac{\text{Variance between samples}}{\text{Variance within samples}} \quad \text{or} \quad F = \frac{\text{MSB}}{\text{MSW}}$$
The calculation of MSB and MSW is explained in Example 12-2.

Example 12-2

Fifteen fourth-grade students were randomly assigned to three groups to
experiment with three different methods of teaching arithmetic. At the
end of the semester, the same test was given to all 15 students. The
table gives the scores of students in the three groups.

Calculate the value of the test statistic F. Assume that all the required
assumptions mentioned in Section 12.2 hold true.

Example 12-2: Solution
Let
• x = the score of a student
• k = the number of different samples (or treatments)
• n_i = the size of sample i
• T_i = the sum of the values in sample i
• n = the number of values in all samples = n_1 + n_2 + n_3 + . . .
• Σx = the sum of the values in all samples = T_1 + T_2 + T_3 + . . .
• Σx² = the sum of the squares of the values in all samples

To calculate MSB and MSW, we first
compute the between-samples sum of
squares, denoted by SSB and the within-
samples sum of squares, denoted by SSW.
The sum of SSB and SSW is called the
total sum of squares and is denoted by
SST; that is,
SST = SSB + SSW
The values of SSB and SSW are calculated
using the following formulas.
Between- and Within-Samples Sums of Squares

The between-samples sum of squares, denoted by SSB, is calculated as
$$\text{SSB} = \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right) - \frac{(\sum x)^2}{n}$$

The within-samples sum of squares, denoted by SSW, is calculated as
$$\text{SSW} = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right)$$

Table 12.2

Example 12-2: Solution

Σx = T1 + T2 + T3 = 324 + 369 + 388 = 1081
n = n1 + n2 + n3 = 5 + 5 + 5 = 15
Σx² = (48)² + (73)² + (51)² + (65)² +
(87)² + (55)² + (85)² + (70)² +
(69)² + (90)² + (84)² + (68)² +
(95)² + (74)² + (67)²
= 80,709

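Example 12-2: Solution

$$\mathrm{SSB} = \left( \frac{(324)^2}{5} + \frac{(369)^2}{5} + \frac{(388)^2}{5} \right) - \frac{(1081)^2}{15} = 432.1333$$

$$\mathrm{SSW} = 80{,}709 - \left( \frac{(324)^2}{5} + \frac{(369)^2}{5} + \frac{(388)^2}{5} \right) = 2372.8000$$

$$\mathrm{SST} = 432.1333 + 2372.8000 = 2804.9333$$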
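The shortcut formulas for SSB, SSW, and SST can be checked with a few lines of Python. This is only a sketch: the function name `sums_of_squares` is hypothetical, and the three lists assume that the fifteen scores in the Σx² expansion are listed sample by sample, five per teaching method (their sums do match T1 = 324, T2 = 369, and T3 = 388).

```python
def sums_of_squares(samples):
    """Return (SSB, SSW, SST) for a list of samples using the shortcut formulas."""
    n = sum(len(s) for s in samples)                 # n: number of values in all samples
    sum_x = sum(sum(s) for s in samples)             # Σx: sum of all values
    sum_x2 = sum(x * x for s in samples for x in s)  # Σx²: sum of squared values
    # T1²/n1 + T2²/n2 + ... : the term shared by the SSB and SSW formulas
    between_term = sum(sum(s) ** 2 / len(s) for s in samples)
    ssb = between_term - sum_x ** 2 / n
    ssw = sum_x2 - between_term
    return ssb, ssw, ssb + ssw


# Assumed grouping of the Example 12-2 scores (five per method, in listed order)
method_1 = [48, 73, 51, 65, 87]
method_2 = [55, 85, 70, 69, 90]
method_3 = [84, 68, 95, 74, 67]

ssb, ssw, sst = sums_of_squares([method_1, method_2, method_3])
print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
```

Under that grouping assumption, the printed values should agree with the SSB, SSW, and SST computed by hand in this example.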

Calculating the Values of MSB and MSW

MSB and MSW are calculated as

$$\mathrm{MSB} = \frac{\mathrm{SSB}}{k - 1} \quad \text{and} \quad \mathrm{MSW} = \frac{\mathrm{SSW}}{n - k}$$

where k – 1 and n – k are, respectively,
the df for the numerator and the df for the
denominator for the F distribution.
Remember, k is the number of different
samples.
Example 12-2: Solution

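$$\mathrm{MSB} = \frac{\mathrm{SSB}}{k - 1} = \frac{432.1333}{3 - 1} = 216.0667$$

$$\mathrm{MSW} = \frac{\mathrm{SSW}}{n - k} = \frac{2372.8000}{15 - 3} = 197.7333$$

$$F = \frac{\mathrm{MSB}}{\mathrm{MSW}} = \frac{216.0667}{197.7333} = 1.09$$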
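The step from the sums of squares to the test statistic is a pair of divisions followed by a ratio. A minimal sketch, where the helper name `f_statistic` is an assumption made for illustration and the inputs are the SSB and SSW values of Example 12-2:

```python
def f_statistic(ssb, ssw, k, n):
    """Return (MSB, MSW, F) from the sums of squares, k samples, and n total values."""
    msb = ssb / (k - 1)   # between-samples mean square, df = k - 1
    msw = ssw / (n - k)   # within-samples mean square, df = n - k
    return msb, msw, msb / msw


msb, msw, f_value = f_statistic(ssb=432.1333, ssw=2372.8000, k=3, n=15)
print(f"MSB = {msb:.4f}, MSW = {msw:.4f}, F = {f_value:.2f}")
```

Because MSW is the denominator, a large F arises only when the variation between samples is large relative to the variation within them.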

Table 12.3 ANOVA Table

Table 12.4 ANOVA Table for Example 12-2

Example 12-3

Reconsider Example 12-2 about the scores
of 15 fourth-grade students who were
randomly assigned to three groups in order
to experiment with three different methods
of teaching arithmetic. At the 1%
significance level, can we reject the null
hypothesis that the mean arithmetic score
of all fourth-grade students taught by each
of these three methods is the same?
Assume that all the assumptions required
to apply the one-way ANOVA procedure
hold true.
Example 12-3: Solution
• Step 1:
  ▪ H0: μ1 = μ2 = μ3 (The mean scores of the three groups are all equal)
  ▪ H1: Not all three means are equal

• Step 2: Because we are comparing the means for three normally distributed populations, we use the F distribution to make this test.

Example 12-3: Solution
• Step 3:
  ▪ α = .01
  ▪ A one-way ANOVA test is always right-tailed
  ▪ Area in the right tail is .01
  ▪ df for the numerator = k – 1 = 3 – 1 = 2
  ▪ df for the denominator = n – k = 15 – 3 = 12
  ▪ The required value of F is 6.93

Figure 12.3 Critical value of F for df = (2, 12) and α = .01.
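The critical value 6.93 is read from the F distribution table. As a cross-check that is not part of the table-based procedure, the special case of 2 numerator df has a closed form: the right-tail area beyond f is (1 + 2f/df2)^(–df2/2), so the critical value can be solved for directly. The function name `f_critical_df1_2` is hypothetical, and the formula applies only when the numerator df equals 2.

```python
def f_critical_df1_2(alpha, df2):
    """Critical value with right-tail area alpha for an F distribution with (2, df2) df.

    Solves (1 + 2*f/df2) ** (-df2/2) = alpha for f; valid only for numerator df = 2.
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_crit = f_critical_df1_2(alpha=0.01, df2=12)
print(f"critical value of F = {f_crit:.2f}")
```

For other numerator df no such simple expression exists, and a table or statistical software is needed.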

Example 12-3: Solution

• Steps 4 & 5:
  ▪ The value of the test statistic F = 1.09
  ▪ It is less than the critical value of F = 6.93
  ▪ It falls in the nonrejection region
  ▪ Hence, we fail to reject the null hypothesis
  ▪ We conclude that the sample data do not provide enough evidence that the means of the three populations differ.
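The same decision can be reached with a p-value instead of a critical value. This approach is not used in the example itself; the sketch below relies on the closed-form right-tail area for an F distribution with 2 numerator df, and the function name `f_right_tail_df1_2` is an assumption made for illustration.

```python
def f_right_tail_df1_2(f_value, df2):
    """Right-tail area beyond f_value for an F distribution with (2, df2) df."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


alpha = 0.01
p_value = f_right_tail_df1_2(f_value=1.09, df2=12)
reject_h0 = p_value < alpha   # reject only if the tail area is smaller than alpha

print(f"p-value = {p_value:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

A p-value far above α = .01 corresponds to a test statistic that lies well inside the nonrejection region.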

Example 12-4
From time to time, unknown to its employees, the
research department at Post Bank observes various
employees for their work productivity. Recently
this department wanted to check whether the four
tellers at a branch of this bank serve, on average,
the same number of customers per hour. The
research manager observed each of the four tellers
for a certain number of hours. The following table
gives the number of customers served by the four
tellers during each of the observed hours.

Example 12-4

Example 12-4

At the 5% significance level, test the null
hypothesis that the mean number of
customers served per hour by each of
these four tellers is the same. Assume that
all the assumptions required to apply the
one-way ANOVA procedure hold true.

Example 12-4: Solution

• Step 1:
  ▪ H0: μ1 = μ2 = μ3 = μ4 (The mean number of customers served per hour by each of the four tellers is the same)
  ▪ H1: Not all four population means are equal

Example 12-4: Solution

• Step 2:
  ▪ Because we are testing for the equality of four means for four normally distributed populations, we use the F distribution to make the test.

Example 12-4: Solution

• Step 3:
  ▪ α = .05
  ▪ A one-way ANOVA test is always right-tailed
  ▪ Area in the right tail is .05
  ▪ df for the numerator = k – 1 = 4 – 1 = 3
  ▪ df for the denominator = n – k = 22 – 4 = 18
  ▪ The required value of F is 3.16
Figure 12.4 Critical value of F for df = (3, 18) and α = .05.

Table 12.5

Example 12-4: Solution
• Step 4:
  ▪ Σx = T1 + T2 + T3 + T4 = 108 + 87 + 93 + 110 = 398
  ▪ n = n1 + n2 + n3 + n4 = 5 + 6 + 6 + 5 = 22
  ▪ Σx² = (19)² + (21)² + (26)² + (24)² + (18)² +
    (14)² + (16)² + (14)² + (13)² + (17)² +
    (13)² + (11)² + (14)² + (21)² + (13)² +
    (16)² + (18)² + (24)² + (19)² + (21)² +
    (26)² + (20)²
    = 7614

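Example 12-4: Solution

$$\mathrm{SSB} = \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \frac{T_4^2}{n_4} \right) - \frac{\left( \sum x \right)^2}{n}$$

$$= \left( \frac{(108)^2}{5} + \frac{(87)^2}{6} + \frac{(93)^2}{6} + \frac{(110)^2}{5} \right) - \frac{(398)^2}{22} = 255.6182$$

$$\mathrm{SSW} = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \frac{T_4^2}{n_4} \right)$$

$$= 7614 - \left( \frac{(108)^2}{5} + \frac{(87)^2}{6} + \frac{(93)^2}{6} + \frac{(110)^2}{5} \right) = 158.2000$$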

Example 12-4: Solution

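$$\mathrm{MSB} = \frac{\mathrm{SSB}}{k - 1} = \frac{255.6182}{4 - 1} = 85.2061$$

$$\mathrm{MSW} = \frac{\mathrm{SSW}}{n - k} = \frac{158.2000}{22 - 4} = 8.7889$$

$$F = \frac{\mathrm{MSB}}{\mathrm{MSW}} = \frac{85.2061}{8.7889} = 9.69$$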
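The whole Step 4 calculation, including the unequal sample sizes, can be reproduced in one pass. This is a sketch under stated assumptions: the function `one_way_anova` and the teller labels A to D are hypothetical, and the four lists assume that the 22 values in the Σx² expansion are listed teller by teller with sizes 5, 6, 6, and 5 (their sums do match T1 = 108, T2 = 87, T3 = 93, and T4 = 110).

```python
def one_way_anova(samples):
    """Return the one-way ANOVA quantities for a list of samples of any sizes."""
    k = len(samples)                                  # number of samples (treatments)
    n = sum(len(s) for s in samples)                  # number of values in all samples
    totals = [sum(s) for s in samples]                # T1, T2, ...
    sum_x2 = sum(x * x for s in samples for x in s)   # Σx²
    # T1²/n1 + T2²/n2 + ... : shared by the SSB and SSW formulas
    between_term = sum(t * t / len(s) for t, s in zip(totals, samples))
    ssb = between_term - sum(totals) ** 2 / n
    ssw = sum_x2 - between_term
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw, "F": msb / msw}


# Assumed grouping of the Example 12-4 observations (labels are illustrative)
tellers = {
    "A": [19, 21, 26, 24, 18],
    "B": [14, 16, 14, 13, 17, 13],
    "C": [11, 14, 21, 13, 16, 18],
    "D": [24, 19, 21, 26, 20],
}

result = one_way_anova(list(tellers.values()))
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

Each sample is divided by its own size n_i, so unequal sample sizes need no special handling.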

Table 12.6 ANOVA Table for Example 12-4

Example 12-4: Solution

• Step 5:
  ▪ The value for the test statistic F = 9.69
  ▪ It is greater than the critical value of F = 3.16
  ▪ It falls in the rejection region
  ▪ Consequently, we reject the null hypothesis
  ▪ We conclude that the mean number of customers served per hour by each of the four tellers is not the same.
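The critical value 3.16 comes from the F distribution table. As a rough cross-check that stands in for the table lookup, the sketch below computes the right-tail area of an F distribution with a finite sum that holds when the denominator df is even, then finds the critical value by bisection. Both function names are hypothetical, and the even-df restriction is a limitation of this sketch rather than of the ANOVA procedure.

```python
def f_right_tail(f_value, df1, df2):
    """P(F > f_value) for an F distribution with (df1, df2) df; df2 must be even."""
    if df2 % 2 != 0:
        raise ValueError("this sketch only handles an even denominator df")
    a = df1 / 2
    y = df1 * f_value / (df1 * f_value + df2)
    term = 1.0    # coefficient times (1 - y)**j, starting at j = 0
    total = 0.0
    for j in range(df2 // 2):
        total += term
        term *= (a + j) / (j + 1) * (1 - y)
    return 1 - y ** a * total   # 1 minus the cumulative probability


def f_critical(alpha, df1, df2):
    """Find f with right-tail area alpha by bisection on f_right_tail."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f_right_tail(mid, df1, df2) > alpha:
            lo = mid    # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


f_stat = 9.69
f_crit = f_critical(alpha=0.05, df1=3, df2=18)
print(f"critical value of F = {f_crit:.2f}")
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```

In practice a statistics library would usually be used for this lookup; the point here is only that the observed F of 9.69 lies far to the right of the boundary of the rejection region.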