Statistics For Managers Using Microsoft® Excel 5th Edition: Analysis of Variance
Statistics For Managers Using Microsoft® Excel 5th Edition: Analysis of Variance
Statistics For Managers Using Microsoft® Excel 5th Edition: Analysis of Variance
Chapter 11
Analysis of Variance
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-1
Learning Objectives
After completing this chapter, you should be able to:
Recognize situations in which to use analysis of variance
(ANOVA)
Understand different analysis of variance designs
Evaluate assumptions of the model
Perform a single-factor ANOVA and interpret the results
Conduct and interpret a Tukey-Kramer post-analysis to
determine which means are different
Analyze two-factor analysis of variance tests
Conduct and interpret a Tukey-Kramer post-analysis procedure
to determine which factors are different
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-2
General ANOVA Analysis
Investigator controls one or more independent variables
Called factors or treatment variables
One factor contains three or more levels or groups or
categories/classifications
Other factors contains two or more levels or groups or
categories/classifications
Experimental design: the plan used to test the hypothesis
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-3
One-Factor ANOVA
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-4
One-Factor Analysis of Variance
Assumptions
Populations are normally distributed
(test with Box plot or Normal Probability Plot)
Populations have equal variances
(use Levene’s Test for Homogeneity of Variance)
Samples are randomly and independently drawn
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-5
Why Analysis of Variance?
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-6
Hypotheses: One-Factor ANOVA
H0 : μ1 μ2 μ3 μc
All population means are equal
i.e., no treatment effect (no variation in means among groups)
H1 : Not all of the population means are the same
At least one population mean is different
i.e., there is a treatment (groups) effect
Does not mean that all population means are different (at
least one of the means is different from the others)
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-7
Hypotheses: One-Factor
ANOVA
H0 : μ1 μ2 μ3 μc
H1 : Not all μ j are the same
μ1 μ 2 μ 3
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-8
Hypotheses: One-Factor
ANOVA
H 0 : μ1 μ 2 μ 3 μ c At least one mean is different:
The Null Hypothesis is NOT true
H1 : Not all μj are the same (Treatment Effect is present)
or
μ1 μ2 μ3 μ1 μ2 μ3
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-9
One-Factor ANOVA Table
Source of df SS MS P-value F-Ratio
Variation (Variance)
Between c-1 SSA MSA P(X=F) MSA
F
Groups MSW
Within n-c SSW MSW
Groups
Total n-1 SST =
SSA+SSW
c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-10
One-Factor ANOVA
Test Statistic
H0: μ1= μ2 = … = μc
H1: At least two population means are different
Test statistic
MSA
F
MSW
MSA is mean squares among variances
MSW is mean squares within variances
Degrees of freedom
df1 = c – 1 (c = number of groups)
df2 = n – c (n = sum of all sample sizes)
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-11
One-Factor ANOVA
Test Statistic
The F statistic is the ratio of the among variance to the
within variance
The ratio must always be positive
df1 = c -1 will typically be small
df2 = n - c will typically be large
Decision Rule:
Reject H0 if F > FU, = .05
otherwise do not reject H0
0 Do not Reject H0
reject H0
FU
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-12
One-Factor ANOVA
F Test Example
You want to see if three Club 1 Club 2 Club 3
different golf clubs yield 254 234 200
different distances. You 263 218 222
randomly select five 241 235 197
measurements from trials on an 237 227 206
automated driving machine for 251 216 204
each club. At the .05
significance level, is there a
difference in mean driving
distance?
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-13
One-Way ANOVA
Example
Distance
270
Club 1 Club 2 Club 3 260 •
254 234 200 •• X 1
250
263 218 222
241 235 197 240 •
• ••
237 227 206 230
• X2
251 216 204 220 •• •
X
210
X3
x1 249.2 x 2 226.0 x 3 205.8 ••
200 ••
x 227.0
190 1 2 3
Club
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-14
ANOVA -- Single Factor:
Excel Output
EXCEL: Tools | Data Analysis | ANOVA: Single Factor
SUMMARY
Groups Count Sum Average Variance
Club 1 5 1246 249.2 108.2
Club 2 5 1130 226 77.5
Club 3 5 1029 205.8 94.2
ANOVA
Source of
SS df MS F P-value F crit
Variation
Between
4716.4 2 2358.2 25.275 4.99E-05 3.89
Groups
Within
1119.6 12 93.3
Groups
Total 5836.0 14
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-15
One-Factor ANOVA Example
Solution
H0: μ1 = μ2 = μ3
H1: μi not all equal
= .05
p-value: 4.99E-05
Decision:
Reject H0 at = 0.05 There is evidence that
Conclusion: at least one μi differs
from the rest
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-16
The Tukey-Kramer Procedure
Tells which population means are significantly different
e.g.: μ1 = μ2 ≠ μ3
Done after rejection of equal means in ANOVA
Allows pair-wise comparisons
Compare absolute mean differences with critical
range
μ1= μ2 μ3 x
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-17
Tukey-Kramer Critical Range
MSW 1 1
Critical Range QU
2 n j n j'
where:
QU = Value from Studentized Range Distribution with c
and n - c degrees of freedom for the desired level
of (see appendix E.9 table)
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-18
The Tukey-Kramer Procedure:
Example
1. PhStat computes the absolute
Club 1 Club 2 Club 3 mean differences:
254 234 200
x1 x 2 249.2 226.0 23.2
263 218 222
241 235 197 x1 x 3 249.2 205.8 43.4
237 227 206
251 216 204 x 2 x 3 226.0 205.8 20.2
QU 3.77
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-19
The Tukey-Kramer Procedure:
Example (continued)
3. PhStat computes the Critical Range:
MSW 1 1 93.3 1 1
Critical Range QU 3.77 16.285
2 n j n j' 2 5 5
4. Compare:
5. All of the absolute mean differences x1 x 2 23.2
are greater than critical range.
Therefore there is a significant x1 x 3 43.4
difference between each pair of
means at 5% level of significance. x 2 x 3 20.2
PhStat does all the calculations for you
but you must input the Q value
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-20
Tukey-Kramer in PHStat
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-21
ANOVA Assumptions
Levene’s Test
Tests the assumption that the variances of each
group are equal.
First, define the null and alternative hypotheses:
H0: σ21 = σ22 = …=σ2c
H1: Not all σ2j are equal
Second, compute the absolute value of the difference
between each value and the median of each group.
Third, perform a one-way ANOVA on these
absolute differences.
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-22
Two-Factor ANOVA
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-23
Two-Factor ANOVA
Assumptions
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-24
Two-Factor ANOVA
Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n/ = number of replications for each cell
n = total number of observations in all cells
(n = rcn/)
Xijk = value of the kth observation of level i
of factor A and level j of factor B
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-25
Two-Factor ANOVA Summary Table
With Replication
Source of Degrees of Sum of Mean F
p-value
Variation Freedom Squares Squares Statistic
Sample
MSA = MSA/
Factor A r–1 SSA f (FA)
SSA/(r – 1) MSE
(Row)
Columns MSB = MSB/
c–1 SSB f (FB)
Factor B SSB/(c – 1) MSE
MSAB/
Interaction MSAB = MSE f (FA&B)
(r – 1)(c – 1) SSAB
(AB) SSAB/ [(r – 1)(c – 1)]
Within MSE =
rc n – 1)’
SSE
Error SSE/[rc n’ – 1)]
Total rc n’ – 1 SST
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-26
Two-Factor ANOVA
With Replication
As production manager, Box Machine1 Machine2 Machine3
you want to see if 3 filling 1 25.40 23.40 20.00
machines have different 26.40 24.40 21.00
mean filling times when 2 26.31 21.80 22.20
used with 5 types of boxes. 25.90 23.00 22.00
At the .05 level, is there a 3 24.10 23.50 19.75
difference in machines, in 24.40 22.40 19.00
boxes? Is there an
4 23.74 22.75 20.60
interaction?
25.40 23.40 20.00
5 25.10 21.60 20.40
26.20 22.90 21.90
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-28
Summary Table
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-32