Chap012 Anova (ANALYSIS OF VARIANCE)

Analysis of Variance
Chapter 12
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.
GOALS
1. List the characteristics of the F distribution.

2. Conduct a test of hypothesis to determine whether the
variances of two populations are equal.
3. Discuss the general idea of analysis of variance.
4. Organize data into a one-way and a two-way ANOVA table.
5. Conduct a test of hypothesis among three or more treatment
means.
6. Develop confidence intervals for the difference in treatment
means.
7. Conduct a test of hypothesis among treatment means using a
blocking variable.
8. Conduct a two-way ANOVA with interaction.
The F Distribution
• It was named to honor Sir Ronald Fisher, one of the founders of modern-day statistics.
• It is
– used to test whether two samples are from populations having equal variances
– applied when we want to compare several population means simultaneously. The simultaneous comparison of several population means is called analysis of variance (ANOVA).
• In both of these situations, the populations must follow a normal distribution, and the data must be at least interval-scale.
Characteristics of F-Distribution
1. There is a “family” of F distributions. A particular member of the family is determined by two parameters: the degrees of freedom in the numerator and the degrees of freedom in the denominator.
2. The F distribution is continuous.
3. F cannot be negative.
4. The F distribution is positively skewed.
5. It is asymptotic. As F → ∞, the curve approaches the X-axis but never touches it.
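The characteristics listed above can be checked informally by simulation. The following is a minimal sketch, not part of the original slides: it assumes an F value can be generated as the ratio of two sample variances drawn from normal populations with equal variance, and the function name, seed, and trial count are arbitrary choices made for illustration.

```python
import random
import statistics


def simulate_f_ratios(df_num, df_den, trials=5000, seed=12):
    """Simulate F values as ratios of two sample variances.

    Both samples come from the same standard normal population, so each
    ratio follows an F distribution with (df_num, df_den) degrees of freedom.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample_1 = [rng.gauss(0, 1) for _ in range(df_num + 1)]
        sample_2 = [rng.gauss(0, 1) for _ in range(df_den + 1)]
        ratios.append(statistics.variance(sample_1) / statistics.variance(sample_2))
    return ratios


# One member of the "family": 6 numerator and 7 denominator degrees of freedom.
ratios = simulate_f_ratios(df_num=6, df_den=7)

# F cannot be negative: a ratio of two variances is never below zero.
print(min(ratios) >= 0)

# Positive skew: the long right tail pulls the mean above the median.
print(statistics.mean(ratios) > statistics.median(ratios))
```

Changing `df_num` and `df_den` selects a different member of the family, which is why the two degrees of freedom are treated as the parameters of the distribution.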
Comparing Two Population Variances
The F distribution is used to test the hypothesis that the variance of one
normal population equals the variance of another normal population.
Examples:
• Two Barth shearing machines are set to produce steel bars of the same length. The bars, therefore, should have the same mean length. We want to ensure that in addition to having the same mean length they also have similar variation.
• The mean rate of return on two types of common stock may be the same, but there may be more variation in the rate of return in one than the other. A sample of 10 technology and 10 utility stocks shows the same mean rate of return, but there is likely more variation in the technology stocks.
• A study by the marketing department for a large newspaper found that men and women spent about the same amount of time per day reading the paper. However, the same report indicated there was nearly twice as much variation in time spent per day among the men as among the women.
Test for Equal Variances
Test for Equal Variances - Example
Lammers Limos offers limousine service from the city hall in Toledo, Ohio, to Metro Airport in Detroit. Sean Lammers, president of the company, is considering two routes. One is via U.S. 25 and the other via I-75. He wants to study the time it takes to drive to the airport using each route and then compare the results. He collected the following sample data, which is reported in minutes.
Using the .10 significance level, is there a difference in the variation in the driving times for the two routes?
Step 1: The hypotheses are:
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
Step 2: The significance level is .10.
Step 3: The test statistic follows the F distribution.
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha/2,\,v_1,\,v_2}$
$F > F_{.10/2,\,7-1,\,8-1}$
$F > F_{.05,\,6,\,7}$
$F > 3.87$
Step 5: Compute the value of F and make a decision
The decision is to reject the null hypothesis, because the computed F value (4.23) is larger than the critical value (3.87).
We conclude that there is a difference in the variation of the travel times along the two routes.
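The five steps of this test can be sketched in a few lines of Python. The slide's data table did not survive in the text, so the driving times below are assumed values, chosen only to match the sample sizes (7 and 8) and the computed F of 4.23 quoted in the text. The critical value 3.87 is taken from the text, not computed.

```python
import statistics

# Assumed driving times in minutes (illustrative stand-ins for the lost table).
us_25 = [52, 67, 56, 45, 70, 54, 64]        # n = 7
i_75 = [59, 60, 61, 51, 56, 63, 57, 65]     # n = 8

var_us_25 = statistics.variance(us_25)      # sample variance, divisor n - 1
var_i_75 = statistics.variance(i_75)

# Put the larger sample variance in the numerator so only the upper tail
# of the F distribution is needed for the two-tailed test.
if var_us_25 >= var_i_75:
    f_stat = var_us_25 / var_i_75
    df_num, df_den = len(us_25) - 1, len(i_75) - 1
else:
    f_stat = var_i_75 / var_us_25
    df_num, df_den = len(i_75) - 1, len(us_25) - 1

critical_value = 3.87                       # F(.05; 6, 7) as quoted in the text
reject_h0 = f_stat > critical_value

print(round(f_stat, 2), df_num, df_den, reject_h0)
```

With these assumed times the ratio of sample variances rounds to 4.23 on 6 and 7 degrees of freedom, which exceeds 3.87, so the sketch reaches the same decision as the slide.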
Test for Equal Variances – Excel Example
Comparing Means of Two or More Populations
The F distribution is also used for testing whether two or more sample means came from the same or equal populations.
Assumptions:
– The sampled populations follow the normal
distribution.
– The populations have equal standard deviations.
– The samples are randomly selected and are
independent.
Comparing Means of Two or More Populations
• The null hypothesis is that the population means are the same. The alternative hypothesis is that at least one of the means is different.
• The test statistic follows the F distribution.
• The decision rule is to reject the null hypothesis if F (computed) is greater than F (table) with numerator and denominator degrees of freedom.
• Hypothesis setup and decision rule:
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: The means are not all equal
Reject $H_0$ if $F > F_{\alpha,\,k-1,\,n-k}$
Analysis of Variance – F statistic
• If there are k populations being sampled, the numerator degrees of freedom is k – 1.
• If there are a total of n observations, the denominator degrees of freedom is n – k.
• The test statistic is computed by:
$$F = \frac{SST/(k-1)}{SSE/(n-k)}$$
Comparing Means of Two or More Populations – Illustrative Example
Joyce Kuhlman manages a regional
financial center. She wishes to
compare the productivity, as
measured by the number of
customers served, among three
employees. Four days are randomly
selected and the number of
customers served by each employee
is recorded. The results are:
Comparing Means of Two or More Populations – Example
Recently a group of four major carriers joined in hiring Brunner Marketing Research, Inc., to survey recent passengers regarding their level of satisfaction with a recent flight. The survey included questions on ticketing, boarding, in-flight service, baggage handling, pilot communication, and so forth.
Twenty-five questions offered a range of possible answers: excellent, good, fair, or poor. A response of excellent was given a score of 4, good a 3, fair a 2, and poor a 1. These responses were then totaled, so the total score was an indication of the satisfaction with the flight. Brunner Marketing Research, Inc., randomly selected and surveyed passengers from the four airlines.
Is there a difference in the mean satisfaction level among the four airlines? Use the .01 significance level.
Step 1: State the null and alternate hypotheses.
H0: µE = µA = µT = µO
H1: The means are not all equal
Step 2: State the level of significance.
The .01 significance level is stated in the problem.
Step 3: Find the appropriate test statistic.

Because we are comparing means of more than
two groups, use the F statistic
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha,\,k-1,\,n-k}$
$F > F_{.01,\,4-1,\,22-4}$
$F > F_{.01,\,3,\,18}$
$F > 5.09$
Step 5: Compute the value of F and make a decision
Computing SS Total and SSE
Computing SST
The computed value of F is 8.99, which is greater than the critical value of 5.09,
so the null hypothesis is rejected.
Conclusion: The population means are not all equal. The mean scores are not
the same for the four airlines; at this point we can only conclude there is a
difference in the treatment means. We cannot determine which treatment groups
differ or how many treatment groups differ.
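The computation of SST, SSE, and F can be sketched as follows. Here SST denotes the treatment sum of squares, as in the F formula of this chapter. The satisfaction scores are assumed values, since the slide's data table is not in the text; they were chosen to match the 22 observations, the four groups, and the computed F of 8.99. The group keys E, T, A, and O are hypothetical labels echoing the subscripts in the hypotheses.

```python
def one_way_anova(groups):
    """Return (SST, SSE, F) for a list of samples, one list per treatment."""
    all_values = [x for group in groups for x in group]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n

    # Treatment variation: distance of each group mean from the grand mean,
    # weighted by the group size.
    sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

    # Error variation: distance of each observation from its own group mean.
    sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    f_stat = (sst / (k - 1)) / (sse / (n - k))
    return sst, sse, f_stat


# Assumed satisfaction scores (illustrative stand-ins for the lost table).
scores = {
    "E": [94, 90, 85, 80],
    "T": [75, 68, 77, 83, 88],
    "A": [70, 73, 76, 78, 80, 68, 65],
    "O": [68, 70, 72, 65, 74, 65],
}

sst, sse, f_stat = one_way_anova(list(scores.values()))
critical_value = 5.09                       # F(.01; 3, 18) as quoted in the text
print(round(sst, 2), round(sse, 2), round(f_stat, 2), f_stat > critical_value)
```

The function only reports whether the means differ overall. As the conclusion above notes, it says nothing about which pairs of treatments differ.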
Confidence Interval for the Difference Between Two Means
When we reject the null hypothesis that the means are equal, we may want to know which treatment means differ. One of the simplest procedures is through the use of confidence intervals.
$$(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Confidence Interval for the Difference Between Two Means – Example
From the previous example, develop a 95% confidence interval for the difference in the mean rating for Eastern and Ozark. Can we conclude that there is a difference between the two airlines’ ratings?
The 95 percent confidence interval ranges from 10.46 up to 26.04. Both endpoints are positive; hence, we can conclude these treatment means differ significantly. That is, passengers on Eastern rated service significantly differently from those on Ozark.
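The interval can be reproduced with a short calculation. This is a sketch under stated assumptions: the two sample means (87.25 and 69.0), the sample sizes (4 and 6), and the MSE of 33.0 are assumed values that do not appear in the surviving text, chosen to be consistent with the reported interval. The t value 2.101 is the two-tailed 95% value for 18 degrees of freedom (n – k = 22 – 4).

```python
import math


def mean_difference_interval(mean_1, mean_2, n_1, n_2, mse, t_value):
    """Confidence interval for the difference between two treatment means."""
    margin = t_value * math.sqrt(mse * (1 / n_1 + 1 / n_2))
    difference = mean_1 - mean_2
    return difference - margin, difference + margin


# Assumed summary statistics for the two airlines being compared.
low, high = mean_difference_interval(
    mean_1=87.25, mean_2=69.0,   # assumed sample means
    n_1=4, n_2=6,                # assumed sample sizes
    mse=33.0,                    # assumed mean square error, SSE / (n - k)
    t_value=2.101,               # t for 95% confidence, 18 degrees of freedom
)
print(round(low, 2), round(high, 2))   # → 10.46 26.04

# The means differ significantly when the interval excludes zero.
print(low > 0 or high < 0)
```

Using the pooled MSE from the ANOVA table, not the variances of only the two samples being compared, is what ties this interval to the equal-standard-deviation assumption of the analysis.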
Excel
Two-Way Analysis of Variance
• For the two-factor ANOVA we test whether there is a significant difference in the treatment effect and whether there is a difference in the blocking effect. Let Br be the block totals (r for rows).
• Let SSB represent the sum of squares for the blocks, where k is the number of treatments, $\bar{x}_b$ is a block mean, and $\bar{x}_G$ is the grand mean:
$$SSB = k\sum(\bar{x}_b - \bar{x}_G)^2$$
Two-Way Analysis of Variance - Example
WARTA, the Warren Area Regional Transit Authority, is expanding bus service from the suburb of Starbrick into the central business district of Warren. There are four routes being considered from Starbrick to downtown Warren: (1) via U.S. 6, (2) via the West End, (3) via the Hickory Street Bridge, and (4) via Route 59.
WARTA conducted several tests to determine whether there was a difference in the mean travel times along the four routes. Because there will be many different drivers, the test was set up so each driver drove along each of the four routes. The next slide shows the travel time, in minutes, for each driver-route combination. At the .05 significance level, is there a difference in the mean travel time along the four routes? If we remove the effect of the drivers, is there a difference in the mean travel time?
Sample Data
Step 1: State the null and alternate hypotheses.

H0: µu = µw = µh = µr
H1: Not all treatment means are the same
Step 2: State the level of significance.

The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic.
Because we are comparing means of more than two groups,
use the F statistic
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha,\,v_1,\,v_2}$
$F > F_{.05,\,k-1,\,n-k}$
$F > F_{.05,\,4-1,\,20-4}$
$F > F_{.05,\,3,\,16}$
$F > 3.24$
$$SSB = k\sum(\bar{x}_b - \bar{x}_G)^2$$
Two-Way Analysis of Variance – Excel Example
Using Excel to perform the calculations, we conclude:
(1) The mean time is not the same for all drivers.
(2) The mean times for the routes are not all the same.
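The same two-way calculation can be sketched without a spreadsheet. The travel times below are assumed values, because the sample-data slide is an image that is not in the text; they are laid out as five drivers (blocks, rows) by four routes (treatments, columns) to match n = 20 and k = 4. The function name and dictionary keys are hypothetical.

```python
def blocked_anova(times):
    """Two-way ANOVA without replication.

    times[b][t] holds the single observation for block b (row) and
    treatment t (column).
    """
    n_blocks, n_treatments = len(times), len(times[0])
    grand_mean = sum(map(sum, times)) / (n_blocks * n_treatments)

    treatment_means = [sum(row[t] for row in times) / n_blocks
                       for t in range(n_treatments)]
    block_means = [sum(row) / n_treatments for row in times]

    sst = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
    ss_total = sum((x - grand_mean) ** 2 for row in times for x in row)
    sse = ss_total - sst - ssb          # what is left after treatments and blocks

    mse = sse / ((n_treatments - 1) * (n_blocks - 1))
    return {
        "SST": sst,
        "SSB": ssb,
        "SSE": sse,
        "F_treatment": (sst / (n_treatments - 1)) / mse,
        "F_block": (ssb / (n_blocks - 1)) / mse,
    }


# Assumed travel times in minutes: rows are drivers 1-5, columns are the
# four routes in the order U.S. 6, West End, Hickory Street Bridge, Route 59.
travel_times = [
    [18, 17, 21, 22],
    [16, 23, 23, 22],
    [21, 21, 26, 22],
    [23, 22, 29, 25],
    [25, 24, 28, 28],
]

result = blocked_anova(travel_times)
for name, value in result.items():
    print(name, round(value, 2))
```

Removing the block sum of squares from the error term is the point of the design: variation due to drivers no longer inflates SSE, so the test on routes becomes more sensitive. The treatment F has 3 and 12 degrees of freedom and the block F has 4 and 12.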
Two-way ANOVA with Interaction
• In the previous section, we studied the separate or independent effects of two variables, routes into the city and drivers, on mean travel time.
• There is another effect that may influence travel time. This is called an interaction effect between route and driver on travel time. For example, is it possible that one of the drivers is especially good driving one or more of the routes?
• The combined effect of driver and route may also explain differences in mean travel time.
• To measure interaction effects it is necessary to have at least two observations in each cell.
Interaction Effect
• When we use a two-way ANOVA to study interaction, we call the two variables factors instead of blocks.
• Interaction occurs if the combination of two factors has some effect on the variable under study, in addition to each factor alone.
• The variable being studied is referred to as the response variable.
• One way to study interaction is by plotting factor means in a graph called an interaction plot.
Graphical Observation of Mean Times
Our graphical observations show us that interaction effects are possible. The next step is to conduct statistical tests of hypothesis to further investigate the possible interaction effects. In summary, our study of travel times has several questions:
• Is there really an interaction between routes and drivers?
• Are the travel times for the drivers the same?
• Are the travel times for the routes the same?
Of the three questions, we are most interested in the test for interactions. To put it another way, does a particular route/driver combination result in significantly faster (or slower) driving times? Also, the results of the hypothesis test for interaction affect the way we analyze the route and driver questions.
Example – ANOVA with Replication
Suppose the WARTA blocking experiment discussed earlier is repeated by measuring two more travel times for each driver and route combination with the data shown in the Excel worksheet.
Three Tests in ANOVA with Replication
The ANOVA now has three sets of hypotheses to test:
1. H0: There is no interaction between drivers and routes.
H1: There is interaction between drivers and routes.
2. H0: The driver means are the same.
H1: The driver means are not the same.
3. H0: The route means are the same.
H1: The route means are not the same.
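The three F statistics behind these hypotheses can be sketched as follows. The worksheet data is not in the text, so the example uses a small hypothetical data set of two drivers, two routes, and two replicates per cell, which is the minimum replication needed to estimate interaction. The function name and dictionary keys are also hypothetical.

```python
def anova_with_replication(cells):
    """Two-way ANOVA with interaction.

    cells[i][j] is the list of replicate observations for level i of
    factor A (rows) and level j of factor B (columns); every cell must
    hold the same number of replicates, at least two.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])

    def mean(values):
        return sum(values) / len(values)

    grand = mean([x for row in cells for cell in row for x in cell])
    a_means = [mean([x for cell in row for x in cell]) for row in cells]
    b_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * r * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * r * sum((m - grand) ** 2 for m in b_means)
    # Interaction: what the cell mean shows beyond the two main effects.
    ss_ab = r * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])

    mse = sse / (a * b * (r - 1))
    return {
        "F_interaction": (ss_ab / ((a - 1) * (b - 1))) / mse,
        "F_factor_a": (ss_a / (a - 1)) / mse,
        "F_factor_b": (ss_b / (b - 1)) / mse,
    }


# Hypothetical travel times: rows are drivers, columns are routes,
# and each cell holds two replicate trips.
times = [
    [[20, 22], [26, 28]],
    [[24, 26], [24, 26]],
]

f_values = anova_with_replication(times)
print(f_values)
```

In this hypothetical data the first driver is much faster on the first route while the second driver shows no route difference, which is the kind of pattern the interaction F is designed to detect. Each F would be compared with a critical value on its own degrees of freedom, with a·b·(r – 1) in the denominator.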
ANOVA Table
Excel Output
Driver Route
One-way ANOVA for Each Driver
H0: Route travel times are equal.