Formula Stat
We combine all of this variation into a single statistic, called the F statistic because it uses the
F-distribution. We do this by dividing the variation between samples by the variation within each
sample. This calculation is typically handled by software; however, there is some value in
seeing one such calculation worked out.
It will be easy to get lost in what follows. Here is the list of steps that we will follow in the
example below:
1. Calculate the sample means for each of our samples as well as the mean for all of the sample data.
2. Calculate the sum of squares of error. Within each sample, we square the deviation of each
data value from the sample mean. The sum of all of the squared deviations is the sum of squares of
error, abbreviated SSE.
3. Calculate the sum of squares of treatment. We square the deviation of each sample mean from the
overall mean and multiply each squared deviation by the number of data values in that sample. The
sum of these products is the sum of squares of treatment, abbreviated SST.
4. Calculate the degrees of freedom. The overall number of degrees of freedom is one less than the
total number of data points in all of our samples, or n - 1. The number of degrees of freedom of
treatment is one less than the number of samples used, or m - 1. The number of degrees of freedom
of error is the total number of data points minus the number of samples, or n - m.
5. Calculate the mean square of error. This is denoted MSE = SSE/(n - m).
6. Calculate the mean square of treatment. This is denoted MST = SST/(m - 1).
7. Calculate the F statistic. This is the ratio of the two mean squares that we calculated, so
F = MST/MSE.
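The steps above can be sketched as a short Python function. This is a minimal illustration rather than a reference implementation: the function name one_way_anova and the keys of the returned dictionary are our own choices, and the sum of squares of treatment uses the standard definition in which each squared deviation is weighted by the size of its sample.

```python
from statistics import mean


def one_way_anova(samples):
    """Compute the quantities from the steps above for a list of samples.

    `samples` is a list of lists of numbers, one inner list per sample.
    The function name and the dictionary keys are illustrative choices.
    """
    m = len(samples)                    # number of samples
    n = sum(len(s) for s in samples)    # total number of data points

    # Step 1: sample means and the mean of all of the data.
    sample_means = [mean(s) for s in samples]
    overall_mean = mean([x for s in samples for x in s])

    # Step 2: sum of squares of error (variation within the samples).
    sse = sum((x - mu) ** 2
              for s, mu in zip(samples, sample_means)
              for x in s)

    # Step 3: sum of squares of treatment (variation between the samples),
    # each squared deviation weighted by the size of its sample.
    sst = sum(len(s) * (mu - overall_mean) ** 2
              for s, mu in zip(samples, sample_means))

    # Step 4: degrees of freedom.
    df_treatment = m - 1
    df_error = n - m

    # Steps 5 and 6: mean squares.
    mse = sse / df_error
    mst = sst / df_treatment

    # Step 7: the F statistic is the ratio of the two mean squares.
    return {
        "SSE": sse,
        "SST": sst,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "MSE": mse,
        "MST": mst,
        "F": mst / mse,
    }
```

Calling one_way_anova with a list of samples returns every intermediate value, so each step can be compared against a hand calculation. The sketch does not guard against an SSE of zero, which would make the final division fail.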
Software does all of this quite easily, but it is good to know what is happening behind the scenes.
In what follows we work out an example of ANOVA following the steps as listed above.
Sample from population #1: 12, 9, 12. This has a sample mean of 11.
Sample from population #2: 7, 10, 13. This has a sample mean of 10.
Sample from population #3: 5, 8, 11. This has a sample mean of 8.
Sample from population #4: 5, 8, 8. This has a sample mean of 7.
The mean of all of the data is 9.
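As a quick check of these means, the following sketch recomputes them with Python's standard statistics module. The variable names are our own. Note that the overall mean is taken over all 12 data values; it matches the average of the four sample means here only because every sample has the same size.

```python
from statistics import mean

samples = [
    [12, 9, 12],   # sample from population #1
    [7, 10, 13],   # sample from population #2
    [5, 8, 11],    # sample from population #3
    [5, 8, 8],     # sample from population #4
]

# One mean per sample: 11, 10, 8 and 7 for this data.
sample_means = [mean(s) for s in samples]

# Mean of all 12 data values pooled together: 9 for this data.
all_data = [x for s in samples for x in s]
overall_mean = mean(all_data)

print(sample_means)
print(overall_mean)
```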
For the sample from population #1: (12 – 11)^2 + (9 – 11)^2 + (12 – 11)^2 = 6
For the sample from population #2: (7 – 10)^2 + (10 – 10)^2 + (13 – 10)^2 = 18
For the sample from population #3: (5 – 8)^2 + (8 – 8)^2 + (11 – 8)^2 = 18
For the sample from population #4: (5 – 7)^2 + (8 – 7)^2 + (8 – 7)^2 = 6
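We then add all of these sums of squared deviations and obtain SSE = 6 + 18 + 18 + 6 = 48.
For the sum of squares of treatment, each sample has three data values, so we obtain
SST = 3(11 – 9)^2 + 3(10 – 9)^2 + 3(8 – 9)^2 + 3(7 – 9)^2 = 12 + 3 + 3 + 12 = 30.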
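The arithmetic for the sums of squares can be checked with a few lines of Python. This is only a sketch with variable names of our own choosing. It recomputes the within-sample totals that add up to SSE, and it also evaluates the sum of squares of treatment from the list of steps, weighting each squared deviation by the size of its sample (three for every sample here).

```python
from statistics import mean

samples = [[12, 9, 12], [7, 10, 13], [5, 8, 11], [5, 8, 8]]
overall_mean = mean([x for s in samples for x in s])

# Squared deviations from each sample's own mean, totalled per sample.
within = [sum((x - mean(s)) ** 2 for x in s) for s in samples]
sse = sum(within)

# Sample size times the squared deviation of each sample mean
# from the overall mean, one term per sample.
between = [len(s) * (mean(s) - overall_mean) ** 2 for s in samples]
sst = sum(between)

print(within, sse)
print(between, sst)
```

For this data the per-sample totals are 6, 18, 18 and 6, giving SSE = 48, and the treatment terms are 12, 3, 3 and 12, giving SST = 30.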
Degrees of Freedom
Before proceeding to the next step, we need the degrees of freedom. There are 12 data values and
four samples. Thus the number of degrees of freedom of treatment is 4 – 1 = 3. The number of
degrees of freedom of error is 12 – 4 = 8.
Mean Squares
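We now divide each sum of squares by the appropriate number of degrees of freedom in order to
obtain the mean squares. The sum of squares of treatment for this data is 30, so the mean square
of treatment is MST = 30/3 = 10. The sum of squares of error is 48, so the mean square of error is
MSE = 48/8 = 6. The F statistic is the ratio of these two mean squares: F = 10/6 = 5/3, or
approximately 1.667.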
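The last few steps are simple divisions, sketched below with exact fractions so that no rounding creeps in. The variable names are ours, and the two sums of squares are entered directly as the values that this data produces.

```python
from fractions import Fraction

n, m = 12, 4           # 12 data values in 4 samples
sse = 48               # sum of squares of error: 6 + 18 + 18 + 6
sst = 30               # sum of squares of treatment: 12 + 3 + 3 + 12

df_treatment = m - 1   # 4 - 1 = 3
df_error = n - m       # 12 - 4 = 8

mst = Fraction(sst, df_treatment)   # mean square of treatment, 30/3
mse = Fraction(sse, df_error)       # mean square of error, 48/8
f_stat = mst / mse                  # ratio of the two mean squares

print(mst, mse, f_stat, float(f_stat))
```

Keeping the ratio as a fraction shows that F is exactly 5/3 for this data; converting to a float gives the decimal value that would be compared against an F table with 3 and 8 degrees of freedom.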
Tables of values or software can be used to determine how likely it is to obtain a value of the
F statistic as extreme as this value by chance alone.