
Measures of Central Tendency


Chapter 2

Measures of Central Tendency
Learning objectives
By the end of the lesson, students should be able to:

• know what the median is, and be able to calculate it
• know what the mean is, and be able to calculate it efficiently
• know what the mode and the modal class are, and be able to find them
• be able to choose the appropriate measure to use in a given situation.
Measures of Central Tendency

The median: This is a simple measure of location (or measure of central tendency). It is the ‘middle’ value, with equal numbers of values above and below it.
To find the median of a data set of n values, arrange the values in order of increasing size.
If n is odd, the median is the $\frac{1}{2}(n+1)$th value. If n is even, the median is halfway between the $\frac{1}{2}n$th value and the following value.

Section 2.1 data: 49, 56, 55, 68, 61, 57, 61, 52, 63
Rearranging in increasing order:
49, 52, 55, 56, 57, 61, 61, 63, 68
n = 9, so the median is the 5th value: median = 57
• For a data set with an even number of values, the median is the average of the two middle items. For 47, 49, 59, 62, 65, 68, median = $\frac{59 + 62}{2} = 60.5$
• A stem-and-leaf diagram is a convenient way of sorting a large data set into order of increasing size.

15 | 7 9                          (2)
16 | 0 0 4 4 4 4 5 8 8 8 9 9 9    (13)
17 | 3 3 4 5 9                    (5)
Key: 17 | 3 = 173 cm
Fig. 2.2: Heights of students

There are 20 values, so the median is the average of the 10th and 11th values.

∴ Median = $\frac{168 + 168}{2}$ = 168 cm
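The odd/even rule for the median can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation; the function name `median` and the variable names are chosen here for convenience, and the height list is read off the stem-and-leaf diagram in Fig. 2.2.

```python
def median(values):
    """Return the median using the 'middle value' rule for odd and even n."""
    ordered = sorted(values)              # arrange in order of increasing size
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n + 1)/2-th value, which is index n // 2 when counting from 0
        return ordered[mid]
    # n even: halfway between the (n/2)-th value and the following value
    return (ordered[mid - 1] + ordered[mid]) / 2


nine_cds = [49, 56, 55, 68, 61, 57, 61, 52, 63]
six_cds = [47, 49, 59, 62, 65, 68]
heights = [157, 159,
           160, 160, 164, 164, 164, 164, 165, 168, 168, 168, 169, 169, 169,
           173, 173, 174, 175, 179]

print(median(nine_cds))   # 57
print(median(six_cds))    # 60.5
print(median(heights))    # 168.0
```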
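Finding the median from a frequency table

Ungrouped data

x        f        Cumulative frequency
0        4        4
1        11       15
2        7        22
3        2        24
Total    24

Table 2.3

There are 24 scores, so the median is the average of the 12th and 13th scores (items). The cumulative frequencies show that both of these are 1.

∴ Median = $\frac{1 + 1}{2}$ = 1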
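Locating the middle items through the cumulative frequencies can be sketched as follows. The function name `median_from_frequency_table` and its helper are hypothetical, introduced only for this illustration; the data are the value and frequency columns of Table 2.3.

```python
def median_from_frequency_table(values, frequencies):
    """Median of ungrouped data given as parallel value/frequency columns."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    table = sorted(zip(values, frequencies))

    def kth_item(k):
        """Return the k-th smallest item (counting from 1) via cumulative frequency."""
        cumulative = 0
        for x, f in table:
            cumulative += f
            if k <= cumulative:
                return x

    if n % 2 == 1:
        return kth_item((n + 1) // 2)
    # n even: average of the (n/2)-th item and the following item
    return (kth_item(n // 2) + kth_item(n // 2 + 1)) / 2


result = median_from_frequency_table([0, 1, 2, 3], [4, 11, 7, 2])
print(result)   # 1.0
```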
Grouped data: Large data sets for continuous variables are nearly always grouped, and the individual values are lost. The median therefore cannot be calculated exactly; you have to estimate it from a cumulative frequency graph.

Table 2.4 Playing times of 95 CDs.


A cumulative frequency curve is obtained by plotting the cumulative frequency against the upper class boundary of each class. The points are joined by a smooth curve as shown below.

Fig. 2.5 Cumulative frequency graph for data in Table 2.4.

Median: read off the value of the variable corresponding to a cumulative frequency of $\frac{1}{2}n$, i.e. $\frac{1}{2} \times 95 = 47.5$. This gives a playing time of 60 minutes.
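Reading the median off a graph can be approximated numerically. The sketch below replaces the smooth curve with straight-line (linear) interpolation between the plotted points, which is a different but closely related technique, so its answer may differ slightly from a reading taken off a hand-drawn curve. The class boundaries and frequencies used are invented for illustration and are not the values of Table 2.4; the function name is also hypothetical.

```python
def estimate_median_grouped(upper_boundaries, frequencies, lowest_boundary):
    """Estimate the median by linear interpolation on cumulative frequencies."""
    n = sum(frequencies)
    target = n / 2                      # read off at cumulative frequency n/2
    cumulative = 0
    lower = lowest_boundary
    for upper, f in zip(upper_boundaries, frequencies):
        if f > 0 and cumulative + f >= target:
            # fraction of the way through the class that contains the median
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
        lower = upper
    raise ValueError("no observations")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with made-up frequencies
estimate = estimate_median_grouped([10, 20, 30, 40], [2, 6, 8, 4], 0)
print(estimate)   # 22.5
```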
Exercise A

1.

2.

Using your stem-and-leaf diagrams above, obtain the median of the two data
sets.
3.
The mean is the most commonly used average in statistics. It makes use of the actual values of all the observations. It is used when the total quantity is of interest. The mean can give a misleading result if exceptionally large or small values (i.e. outliers) occur in the data set.

Example 2.6: Find the mean of the data in Section 2.1.

Solution

$\bar{x} = \frac{\sum x}{n} = \frac{522}{9} = 58$ minutes
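As a quick check of Example 2.6, the mean is the total of the values divided by the number of values. The variable names below are chosen for this sketch only.

```python
# Playing times of the nine CDs from Section 2.1
playing_times = [49, 56, 55, 68, 61, 57, 61, 52, 63]

total = sum(playing_times)            # sum of all the values
mean = total / len(playing_times)     # divide by the number of values

print(total, mean)   # 522 58.0
```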
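Uses of summation (∑) notation

Example 2.7
If x takes the values 1, 3, 4 and 5, evaluate:
(a) $\sum x$ (b) $\sum x^2$ (c) $\bar{x}$
(d) $\sum (x - \bar{x})$ (e) $\sum (x - \bar{x})^2$
Solution

(a) $\sum x = 1 + 3 + 4 + 5 = 13$

(b) $\sum x^2 = 1^2 + 3^2 + 4^2 + 5^2 = 1 + 9 + 16 + 25 = 51$

(c) $\bar{x} = \frac{\sum x}{n} = \frac{13}{4} = 3.25$

(d) $\sum (x - \bar{x}) = (1 - 3.25) + (3 - 3.25) + (4 - 3.25) + (5 - 3.25) = -2.25 - 0.25 + 0.75 + 1.75 = 0$

Note: The sum $\sum (x - \bar{x})$ is always equal to zero.

(e) $\sum (x - \bar{x})^2 = (1 - 3.25)^2 + (3 - 3.25)^2 + (4 - 3.25)^2 + (5 - 3.25)^2$
$= (-2.25)^2 + (-0.25)^2 + 0.75^2 + 1.75^2$
$= 5.0625 + 0.0625 + 0.5625 + 3.0625$
$= 8.75$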
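Each ∑ expression corresponds directly to a `sum(...)` over the data. The sketch below assumes x takes the values 1, 3, 4 and 5, which are the values that appear in the solution to Example 2.7; the variable names are chosen for this illustration.

```python
x = [1, 3, 4, 5]
n = len(x)

sum_x = sum(x)                                 # sum of x
sum_x_sq = sum(v ** 2 for v in x)              # sum of x squared
x_bar = sum_x / n                              # the mean
sum_dev = sum(v - x_bar for v in x)            # sum of deviations from the mean
sum_dev_sq = sum((v - x_bar) ** 2 for v in x)  # sum of squared deviations

print(sum_x, sum_x_sq, x_bar, sum_dev, sum_dev_sq)   # 13 51 3.25 0.0 8.75
```

The zero in the fourth position illustrates the note that the deviations from the mean always sum to zero.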
Calculating the mean from a frequency table

Example 2.8: Calculate the mean of the data in Table 2.3.

x        f        xf
0        4        0
1        11       11
2        7        14
3        2        6
Total    24       31
Table 2.9

$\bar{x} = \frac{\sum xf}{\sum f} = \frac{31}{24} \approx 1.29$

Note: The mean of a set of data values need not be a whole number.
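The xf column method can be sketched as below, using the totals of Table 2.9 as a check. The function name `mean_from_frequency_table` is hypothetical.

```python
def mean_from_frequency_table(values, frequencies):
    """Mean of ungrouped data: (sum of x*f) divided by (sum of f)."""
    total_f = sum(frequencies)
    if total_f == 0:
        raise ValueError("mean of an empty data set is undefined")
    total_xf = sum(x * f for x, f in zip(values, frequencies))
    return total_xf / total_f


table_mean = mean_from_frequency_table([0, 1, 2, 3], [4, 11, 7, 2])
print(round(table_mean, 2))   # 1.29
```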


Grouped data
In grouped data, the mid-class value (i.e. the x column) is calculated as the average of the two class boundaries of the interval.

Example 2.10: Calculate the mean of the data in Table 2.4.

Table 2.11
Remark: This value is only an estimate of the mean playing time for the discs, because individual values have been replaced by mid-class values. Some information has been lost by grouping the data.
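The mid-class method can be sketched as follows. The class boundaries and frequencies here are invented for illustration and are not the values of Table 2.4 or Table 2.11; the function name is also an assumption of this sketch.

```python
def estimate_mean_grouped(class_boundaries, frequencies):
    """Estimate the mean of grouped data using mid-class values."""
    total_f = sum(frequencies)
    if total_f == 0:
        raise ValueError("mean of an empty data set is undefined")
    total_xf = 0
    for (lower, upper), f in zip(class_boundaries, frequencies):
        mid_class = (lower + upper) / 2   # average of the class boundaries
        total_xf += mid_class * f
    return total_xf / total_f


# Hypothetical classes with made-up frequencies
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
grouped_mean = estimate_mean_grouped(classes, [2, 6, 8, 4])
print(grouped_mean)   # 22.0
```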

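Simplifying mean calculation

Calculating the mean can involve quite large numbers, which is tedious. The work can be simplified by subtracting a specified constant from every value to reduce the values to manageable sizes. For example, to find the mean of 907, 908, 898, 902 and 897, you could calculate directly:

$\bar{x} = \frac{907 + 908 + 898 + 902 + 897}{5} = \frac{4512}{5} = 902.4$

Alternatively, you could first make these numbers smaller by subtracting 900 from each of them, giving 7, 8, -2, 2, and -3. Label these numbers y.

Then $\bar{y} = \frac{7 + 8 - 2 + 2 - 3}{5} = \frac{12}{5} = 2.4$, and the mean of the original values is

$\bar{x} = \bar{y} + 900 = 2.4 + 900 = 902.4$
Note: subtracting 900 changes the mean but not the spread; the amount of spread is the same for both data sets.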
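The subtract-then-add-back shortcut can be checked against the direct calculation. In this sketch the name `assumed` for the subtracted constant is chosen for illustration.

```python
values = [907, 908, 898, 902, 897]
assumed = 900                                   # constant subtracted from every value

coded = [v - assumed for v in values]           # [7, 8, -2, 2, -3]
coded_mean = sum(coded) / len(coded)            # mean of the smaller numbers
mean_via_coding = assumed + coded_mean          # add the constant back
mean_direct = sum(values) / len(values)         # direct calculation for comparison

print(coded, round(mean_via_coding, 1), round(mean_direct, 1))
```

Both routes give 902.4 once rounded to one decimal place; the rounding only guards against floating-point noise.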

Coded mean:
Example 2.12 : The heights, x cm of a sample of 80 female students are
summarised by the equation .
Find the mean height of a female student.
Solution

∴ The mean height of a female student is 163 cm.


Example 2.13 :
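The heights, x cm, of a group of young children are summarised by $\sum (x - 100) = 72$ and $\sum (x - 100)^2 = 499.2$.
The mean height is 104.8 cm.
(i) Find the number of children in the group.
(ii) Find $\sum (x - 104.8)^2$. S12qp63 Q2
Solution
(i) $\frac{\sum (x - 100)}{n} = \bar{x} - 100$, so $\frac{72}{n} = 4.8$ and $n = 15$.
(ii) Find $\sum (x - 104.8)^2$.
Let y = x – 100
Variance of y $= \frac{\sum y^2}{n} - \bar{y}^2 = \frac{499.2}{15} - 4.8^2 = 10.24$
$\sum x = n\bar{x} = 15 \times 104.8 = 1572$
$\sum x^2 = n(\text{variance} + \bar{x}^2) = 15(10.24 + 104.8^2) = 164\,899.2$

$\sum (x - \bar{x})^2 = \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2$
$= 164\,899.2 - 2(104.8)(1572) + 15(104.8)^2$
$= 153.6$
Alternatively,
variance of x = variance of y = 10.24
Also,
variance of x $= \frac{\sum (x - \bar{x})^2}{n}$
so $\sum (x - 104.8)^2 = n \times$ variance
$= 15 \times 10.24$
$= 153.6$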
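The arithmetic in Example 2.13 can be cross-checked from the totals that survive in the solution: the mean 104.8, the total 1572 and the sum of squares 164 899.2. The sketch assumes that 1572 is $\sum x$, that 164 899.2 is $\sum x^2$, and that the quantity being expanded is $\sum (x - \bar{x})^2 = \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2$; the variable names are chosen for illustration.

```python
mean = 104.8
sum_x = 1572            # assumed to be the sum of the heights
sum_x_sq = 164899.2     # assumed to be the sum of the squared heights

n = round(sum_x / mean)                               # number of children
sum_sq_dev = sum_x_sq - 2 * mean * sum_x + n * mean ** 2
variance = sum_sq_dev / n

print(n, round(sum_sq_dev, 1), round(variance, 2))   # 15 153.6 10.24
```

The recovered variance agrees with the 10.24 quoted in the solution, which supports the reading of the totals assumed above.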
Exercise B
1.

2.

3.
The mode and the modal class
A third measure of central tendency is the mode, sometimes called the modal value. It is the value with the highest frequency in a data set. It can be picked out readily from a frequency table if the data have not been grouped.
In Table 2.3, the mode is 1.
The mode can only be estimated for grouped data. Alternatively, you can give the modal class, which is the class with the highest frequency density.
For example, the modal class for the playing times in Table 2.4 is 60–64 minutes.
For the nine CDs in Section 2.1, with playing times 49, 52, 55, 56, 57, 61, 61, 63, 68, the mode is 61.
It is not uncommon for all the values to occur once, so that there is no mode.
For example, the next six CDs had playing times 47, 49, 59, 62, 65, 68, so there is no modal value.
Combining the two data sets gives
47, 49, 49, 52, 55, 56, 57, 59, 61, 61, 62, 63, 65, 68, 68
There are three values which have a frequency of 2, giving three
modes: 49, 61, and 68.
In this case, the mode fails to provide a single measure of central tendency to represent the data set.
Thus, the mode is not a very useful measure of central tendency for
small data sets.
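The three cases discussed here (one mode, no mode, several modes) can be reproduced with a short sketch built on `collections.Counter` from the Python standard library. The function name `modes` and the convention of returning an empty list when every value occurs once are choices made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return every value with the highest frequency, or [] if all occur once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


nine_cds = [49, 52, 55, 56, 57, 61, 61, 63, 68]
next_six_cds = [47, 49, 59, 62, 65, 68]

print(modes(nine_cds))                  # [61]
print(modes(next_six_cds))              # []
print(modes(nine_cds + next_six_cds))   # [49, 61, 68]
```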
Remark: In contrast to the mean and the median, the mode can be found for qualitative data. For example, for the data file ‘cereals’ in Table 1.1 on page 3, the mode for the variable ‘type’ is C (i.e. cold).
• Qualitative data take non-numerical values, while quantitative data take numerical values.
Comparison of the mean, median and mode
Why are there different ways of calculating the average of a data set?
The reason is that an average describes a large amount of information
with a single value, and there is no completely satisfactory way of doing
this.
Each average conveys different information and each has its advantages
and disadvantages.
Example 2.14
The monthly salaries of the 13 employees in a small firm are listed below in ascending order.
$1000 $1000 $1000 $1000 $1100 $1200 $1250 $1400 $1600 $1600 $1700 $2900 $4200
Choose and calculate an appropriate measure of central tendency (mean, median or mode) to summarise these salaries. Explain briefly why the other measures are not suitable.
Solution
Median = ½(n + 1)th value = 7th value = $1250
Mean = 20 950 ÷ 13 = $1612, correct to the nearest dollar
Mode = value with the highest frequency = $1000

Thus, the median gives the best measure of central tendency.

The mean is not suitable because 10 out of 13 employees earn less than $1612.
The mode is also not suitable because it is the lowest salary: 9 out of 13 employees earn more than this.
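The comparison in Example 2.14 can be checked with Python's standard `statistics` module. This is only a sketch of the arithmetic; the variable names are chosen for illustration.

```python
import statistics

salaries = [1000, 1000, 1000, 1000, 1100, 1200, 1250,
            1400, 1600, 1600, 1700, 2900, 4200]

mean_salary = sum(salaries) / len(salaries)       # 20950 / 13
median_salary = statistics.median(salaries)       # 7th of the 13 ordered values
mode_salary = statistics.mode(salaries)           # most frequent value

# How many employees fall below the mean and above the mode
below_mean = sum(1 for s in salaries if s < mean_salary)
above_mode = sum(1 for s in salaries if s > mode_salary)

print(round(mean_salary), median_salary, mode_salary, below_mean, above_mode)
# 1612 1250 1000 10 9
```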
Example 2.15
The times in minutes for seven students to become proficient at a new computer game were measured as follows: 15 10 48 10 19 14 16
(i) State which of the mean, median or mode you consider would be most
appropriate to use as a measure of central tendency to represent the
data in this case.
(ii) For each of the two measures of central tendency you did not choose in
part (i), give a reason why you consider it inappropriate. S10qp62 Q1
Solution
(i) Rearranging the times in ascending order: 10 10 14 15 16 19 48

Mean = $\frac{132}{7} \approx 18.9$ minutes
Median = 15, Mode = 10

∴ The median would be the most appropriate measure of central tendency.

(ii) The mode is not suitable as it is the lowest value in the data. The mean is not an appropriate measure because it is affected by the large value 48.
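To see how strongly the single large value pulls the mean, the sketch below recomputes the mean and median of the data in Example 2.15 with the value 48 removed. Dropping the value is done here purely to illustrate the effect and is not part of the exam solution.

```python
import statistics

times = [15, 10, 48, 10, 19, 14, 16]
without_48 = [t for t in times if t != 48]     # illustration only

mean_all = sum(times) / len(times)             # 132 / 7
mean_trimmed = sum(without_48) / len(without_48)
median_all = statistics.median(times)
median_trimmed = statistics.median(without_48)

print(round(mean_all, 1), mean_trimmed)   # 18.9 14.0
print(median_all, median_trimmed)         # 15 14.5
```

The mean falls by almost 5 minutes when 48 is removed, while the median moves by only half a minute.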
Exercise C
Oral Exercise / Class Discussion
Miscellaneous exercise
1.
2.
3.

End of Lesson
