FHMM1034 Topic 1 B Descriptive Statistics Student
Topic 1 (Part 2)
Descriptive Statistics
Contents (Part 1)
Variance: σ² (population), s² (sample)
Standard deviation: σ (population), s (sample)
Proportion: p (population), p̂ (sample)
\[ \bar{x} \text{ or } \mu = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{\sum fx}{\sum f} \]
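As a rough illustration of the frequency-distribution mean above, the sketch below computes ∑fx / ∑f in Python. The function name `frequency_mean` and the children/families data are hypothetical and are not taken from the slides.

```python
def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_f = sum(freqs)
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total_f


# Hypothetical data: number of children (x) and number of families (f).
children = [0, 1, 2, 3, 4]
families = [3, 5, 8, 3, 1]

mean_children = frequency_mean(children, families)
print(mean_children)
```

Here ∑f = 20 and ∑fx = 34, so the mean is 34 / 20.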
Note:
1. If n is odd, the median is the value of the middle
term in the ranked data.
2. If n is even, the median is the average of
the two middle terms.
3. The median is not influenced by extreme values or
outliers.
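The three notes can be sketched in Python as follows. The helper name `median` and the sample lists are illustrative assumptions, not data from the slides.

```python
def median(data):
    """Median of ungrouped data using the odd/even rule."""
    ranked = sorted(data)          # rank the data first
    n = len(ranked)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:                 # n odd: the middle term
        return ranked[mid]
    # n even: average of the two middle terms
    return (ranked[mid - 1] + ranked[mid]) / 2


odd_sample = [7, 3, 9, 5, 1]        # hypothetical values
even_sample = [7, 3, 9, 5]          # hypothetical values
with_outlier = [7, 3, 900, 5, 1]    # same as odd_sample, with 9 replaced by 900

print(median(odd_sample), median(even_sample), median(with_outlier))
```

Replacing 9 with the extreme value 900 leaves the median unchanged, which is the point of note 3.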
Jan 2021, FHMM1034 Mathematics III Page 15
Example 4
term, n = ∑f
2
The median lies at the 50th percentile of the ranked data,
so 50% of the data are less than the median and
50% are more than the median.
In other words, 50% of the data are at most the median
value, and 50% are at least the median value.
Ans: 2 children
1.8
Measures of Central Tendency
for Grouped Data
Mean for
Grouped Frequency Distribution
Mean for population data:
\[ \mu = \frac{\sum fm}{N}, \quad N = \sum f \]
Mean for sample data:
\[ \bar{x} = \frac{\sum fm}{n}, \quad n = \sum f \]
where m is the midpoint and f is the frequency of a class.
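A minimal sketch of the grouped mean, assuming the weight classes (60 – 62, ..., 72 – 74) and frequencies (3, 4, 5, 6, 2) that appear in the example tables of this topic. The function name `grouped_mean` is hypothetical, and the result is a computed illustration rather than an answer quoted from the slides.

```python
def grouped_mean(class_limits, freqs):
    """Grouped mean: sum(f * m) / sum(f), with m the class midpoint."""
    midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * m for f, m in zip(freqs, midpoints)) / n


# Weight classes (nearest kg) and frequencies from the example tables.
weight_classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
weight_freqs = [3, 4, 5, 6, 2]

weight_mean = grouped_mean(weight_classes, weight_freqs)
print(weight_mean)
```

The midpoints are 61, 64, 67, 70 and 73, giving ∑fm = 1340 and n = 20.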
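On the cumulative scale, the total frequency $\sum f$ corresponds to a cumulative relative frequency of 1 and a cumulative percentage of 100%; the median position $\frac{1}{2}\sum f$ corresponds to 0.5 and 50%.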
Median for
Grouped Frequency Distribution
Method 3: Using formula (linear interpolation)
\[ M = L_M + \left[ \frac{\frac{1}{2}\sum f - f_{M-1}}{f_M} \right] c \]
L_M = lower boundary of median class
c = width of median class
f_M = frequency of median class
f_{M−1} = cumulative frequency before median class
∑f = total frequency
Note : This formula can be used for both grouped data
of equal and unequal widths.
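The interpolation formula can be sketched as below. The function name `grouped_median` is hypothetical; the data are the weight classes from the example tables, with class boundaries assumed to be 59.5, 62.5, ..., 74.5 because weights are recorded to the nearest kg. The printed value is a computed illustration, not a quoted answer.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data by linear interpolation.

    boundaries: (lower, upper) class boundaries in ascending order.
    Class widths may be equal or unequal.
    """
    half = sum(freqs) / 2          # (1/2) * sum(f)
    cum_before = 0                 # cumulative frequency before the class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cum_before + f >= half:
            # This is the median class: interpolate inside it.
            return lower + (half - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("empty frequency distribution")


weight_boundaries = [(59.5, 62.5), (62.5, 65.5), (65.5, 68.5),
                     (68.5, 71.5), (71.5, 74.5)]
weight_freqs = [3, 4, 5, 6, 2]

weight_median = grouped_median(weight_boundaries, weight_freqs)
print(round(weight_median, 2))
```

With ∑f = 20, the median class is 65.5 to 68.5 (cumulative frequency 7 before it), so M = 65.5 + (10 − 7)/5 × 3.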
Example 10
Weight (nearest kg)   Frequency   Weight   Cumulative frequency
60 – 62               3
63 – 65               4
66 – 68               5
69 – 71               6
72 – 74               2
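[Figure: cumulative relative frequency curve. Vertical axis: Cumulative Relative Frequency (0.0 to 1.0); horizontal axis: Weight (kg). Points are plotted at the class boundaries 59.5, 62.5, 65.5, 68.5, 71.5 and 74.5, and 67.5 is marked on the weight axis.]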
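[Figure: cumulative percentage curve. Vertical axis: Cumulative Percentage (0% to 100%); horizontal axis: Weight (kg). Points are plotted at the class boundaries 59.5, 62.5, 65.5, 68.5, 71.5 and 74.5, and 67.5 is marked on the weight axis.]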
Weight (nearest kg)   Frequency   Boundary   Class width   Cumulative Frequency
60 – 62               3
63 – 65               4
66 – 68               5
69 – 71               6
72 – 74               2
Boundary   Frequency   Cumulative frequency
0 – 10     2
10 – 20    2
20 – 30    2
30 – 40    10
40 – 50    24
50 – 60    22
…
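[Figure: histogram construction for the mode, showing the heights f_b and f_a, the point S, the distance m_o − L_m, the mode m_o, the lower boundary L_m and the class width C.]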
Mode for
Grouped Frequency distribution
\[ \triangle PQR \sim \triangle PST \;\Rightarrow\; \frac{PU}{PV} = \frac{f_m - f_b}{f_m - f_a} \;\Rightarrow\; \frac{m_o - L_m}{C - (m_o - L_m)} = \frac{f_m - f_b}{f_m - f_a} \]
\[
\begin{aligned}
(m_o - L_m)(f_m - f_a) &= (f_m - f_b)\left[ C - (m_o - L_m) \right] \\
(m_o - L_m)(f_m - f_a) &= C(f_m - f_b) - (f_m - f_b)(m_o - L_m) \\
(m_o - L_m)(f_m - f_a) + (f_m - f_b)(m_o - L_m) &= C(f_m - f_b) \\
(m_o - L_m)\left[ (f_m - f_b) + (f_m - f_a) \right] &= C(f_m - f_b) \\
m_o &= L_m + \left[ \frac{f_m - f_b}{(f_m - f_b) + (f_m - f_a)} \right] C
\end{aligned}
\]
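A small sketch of the derived mode formula for equal class widths. The function name `grouped_mode` is hypothetical; the inputs L_m = 20, C = 10, f_m = 15, f_b = 13 and f_a = 12 are the values shown on the histogram figure in this topic.

```python
def grouped_mode(lower_boundary, width, f_modal, f_before, f_after):
    """Mode of grouped data with equal class widths.

    m_o = L_m + (f_m - f_b) / ((f_m - f_b) + (f_m - f_a)) * C
    """
    d_before = f_modal - f_before      # f_m - f_b
    d_after = f_modal - f_after        # f_m - f_a
    if d_before < 0 or d_after < 0:
        raise ValueError("the modal class must have the highest frequency")
    if d_before + d_after == 0:
        raise ValueError("mode is not unique: neighbours equal the modal class")
    return lower_boundary + d_before / (d_before + d_after) * width


mode_value = grouped_mode(lower_boundary=20, width=10,
                          f_modal=15, f_before=13, f_after=12)
print(mode_value)
```

The differences are 2 and 3, so the mode sits 2/(2 + 3) of the way across the class, 4 units above the lower boundary.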
Mode for
Grouped Frequency distribution
For the mode of grouped data with unequal class widths, the
frequency has to be replaced by the frequency density.
Lm = lower class boundary of the modal class (based on frequency
density)
ρm = frequency density of the modal class
ρb = frequency density of the class immediately before the modal class
ρa = frequency density of the class immediately after the modal class
C = the class width of the modal class
\[ \text{Mode, } m_o = L_m + \left[ \frac{\rho_m - \rho_b}{(\rho_m - \rho_b) + (\rho_m - \rho_a)} \right] C \]
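The frequency-density version can be sketched as follows. Everything here is illustrative: the function name `grouped_mode_unequal` and the class table are hypothetical, and treating a missing neighbouring class as having density 0 is an assumption of this sketch, not something stated in the slides.

```python
def grouped_mode_unequal(boundaries, freqs):
    """Mode of grouped data with unequal widths, using frequency density."""
    # Frequency density = frequency / class width.
    densities = [f / (hi - lo) for (lo, hi), f in zip(boundaries, freqs)]
    k = max(range(len(densities)), key=densities.__getitem__)  # modal class
    rho_m = densities[k]
    # Assumption: a missing neighbour is treated as density 0.
    rho_b = densities[k - 1] if k > 0 else 0.0
    rho_a = densities[k + 1] if k < len(densities) - 1 else 0.0
    denom = (rho_m - rho_b) + (rho_m - rho_a)
    if denom == 0:
        raise ValueError("mode is not unique")
    lo, hi = boundaries[k]
    return lo + (rho_m - rho_b) / denom * (hi - lo)


# Hypothetical classes with unequal widths.
bounds = [(0, 10), (10, 15), (15, 20), (20, 40)]
counts = [5, 10, 15, 20]

mode_unequal = grouped_mode_unequal(bounds, counts)
print(round(mode_unequal, 2))
```

The last class has the largest frequency (20), but the class 15 to 20 has the largest frequency density (3 per unit), so it is the modal class.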
Mode for
Grouped Frequency distribution
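[Figure: histogram with the modal class from 20 to 30, where f_m = 15, f_b = 13 and f_a = 12, so f_m − f_b = 2 and f_m − f_a = 3. The points U, P, V, Q and S are marked, and the position of the mode (?) is to be found.]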
Mode for
Grouped Frequency distribution
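[Figure: the same histogram with the points R, T, U, P, V, Q and S marked; P lies between U and V in the ratio (f_m − f_b) : (f_m − f_a) = 2 : 3, with L_m = 20 and C = 10.]
\[ \left[ \frac{f_m - f_b}{(f_m - f_b) + (f_m - f_a)} \right] C = \frac{2}{2 + 3} \times 10 = 4, \quad \text{so } m_o = L_m + 4 = 20 + 4 = 24 \]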
Weight (nearest kg)   60 − 62   63 − 65   66 − 68   69 − 71   72 − 74
Frequency             3         4         5         6         2
Weight (nearest kg)   Class boundaries   Class width   Frequency, f   Marks
60 – 62                                                3
63 – 65                                                4
66 – 68                                                5
69 – 71                                                6
72 – 74                                                2
Number of accidents   0 – 4   5 – 9   10 – 14   15 – 19   20 – 24   25 – 29   30 – 34
Number of weeks       4       6       11        15        8         5         3
Company 2: 18, 27, 33, 52, 70 (Mean = 40)
Measure of Dispersion
Disadvantages of range:
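Population:
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N}, \quad \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
Sample:
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \quad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]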
\[ \sum (x - \mu) = 0 \quad \text{and} \quad \sum (x - \bar{x}) = 0 \]
For this reason we square the deviations to calculate the variance and standard deviation.
x     x − mean
82    82 – 84 = –2
95    95 – 84 = +11
67    67 – 84 = –17
92    92 – 84 = +8
Deviation from the Mean
[Figure: mid-term scores plotted about the mean of 84, with deviations +11, +8, −2 and −17. Vertical axis: Mid Term Score, 60 to 100.]
Deviation from the Mean
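POPULATION:
\[ \sum (x - \mu) = 0, \quad \sum (x - \mu)^2 = 478, \quad \sigma^2 = \frac{\sum (x - \mu)^2}{N} = 119.5 \]
SAMPLE:
\[ \sum (x - \bar{x}) = 0, \quad \sum (x - \bar{x})^2 = 478, \quad s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = 159.33 \]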
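The figures above can be checked with a short Python sketch using the four mid-term scores 82, 95, 67 and 92 from the deviation table. The variable names are illustrative only.

```python
import math

scores = [82, 95, 67, 92]                 # mid-term scores from the table
n = len(scores)
mean = sum(scores) / n                    # 336 / 4

deviations = [x - mean for x in scores]   # -2, +11, -17, +8
sum_dev = sum(deviations)                 # deviations always sum to zero
sum_sq_dev = sum(d * d for d in deviations)

pop_variance = sum_sq_dev / n             # divide by N for a population
sample_variance = sum_sq_dev / (n - 1)    # divide by n - 1 for a sample

print(mean, sum_dev, sum_sq_dev)
print(pop_variance, round(sample_variance, 2))
print(round(math.sqrt(pop_variance), 2), round(math.sqrt(sample_variance), 2))
```

The same sum of squared deviations, 478, gives 119.5 when divided by N = 4 and about 159.33 when divided by n − 1 = 3.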
Example 15
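Population:
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}, \quad \sigma = \sqrt{\sigma^2} \]
\[ \sigma^2 = \frac{\sum x^2}{N} - \mu^2 \]
Sample:
\[ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}, \quad s = \sqrt{s^2} \]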
Calculation-friendly Formulae for Variance &
Standard Deviation for Ungrouped Data
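\[ \text{Population variance, } \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
\[
\begin{aligned}
\sum (x - \mu)^2 &= \sum \left( x^2 - 2\mu x + \mu^2 \right) \\
&= \sum x^2 - 2\mu \sum x + \sum \mu^2 \\
&= \sum x^2 - 2\mu \sum x + N\mu^2 \\
&= \sum x^2 - \frac{2\left(\sum x\right)}{N} \sum x + N \left( \frac{\sum x}{N} \right)^2 \\
&= \sum x^2 - \frac{2\left(\sum x\right)^2}{N} + \frac{\left(\sum x\right)^2}{N} \\
&= \sum x^2 - \frac{\left(\sum x\right)^2}{N}
\end{aligned}
\]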
Calculation-friendly Formulae for Variance &
Standard Deviation for Ungrouped Data
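\[
\begin{aligned}
\sigma^2 &= \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} \\
&= \frac{\sum x^2}{N} - \frac{(\sum x)^2}{N^2} = \frac{\sum x^2}{N} - \left( \frac{\sum x}{N} \right)^2 = \frac{\sum x^2}{N} - \mu^2
\end{aligned}
\]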
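As a hedged numerical check of this identity, the sketch below evaluates the definitional and the calculation-friendly population variance on the two age lists given in this topic (the list with ∑x = 280 and the Company 2 list with ∑x = 200). The function names are hypothetical, and the printed values are computed here rather than quoted from the slides.

```python
import math


def variance_definition(xs):
    """Population variance from the definition: sum((x - mu)^2) / N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def variance_shortcut(xs):
    """Population variance from the shortcut: sum(x^2) / N - mu^2."""
    mu = sum(xs) / len(xs)
    return sum(x * x for x in xs) / len(xs) - mu ** 2


ages_first = [35, 36, 38, 39, 40, 45, 47]   # age list with sum 280
ages_company2 = [18, 27, 33, 52, 70]        # Company 2, sum 200

for ages in (ages_first, ages_company2):
    v_def = variance_definition(ages)
    v_short = variance_shortcut(ages)
    print(round(v_def, 4), round(v_short, 4), round(math.sqrt(v_def), 4))
```

Both lists have a mean of 40, yet the second is far more spread out, which the variance makes visible.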
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
Age, x   x − μ   (x − μ)²
35
36
38
39
40
45
47
∑x = 280
Company 2:
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
Age, x   x − μ   (x − μ)²
18
27
33
52
70
∑x = 200
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} = \frac{\sum x^2}{N} - \mu^2 \]
Age, x   x²
35
36
38
39
40
45
47
∑x = 280
Company 2:
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} = \frac{\sum x^2}{N} - \mu^2 \]
Age, x   x²
18
27
33
52
70
∑x = 200
Range = (mid-point of the largest class) − (mid-point of the smallest class)
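\[ \sigma^2 = \frac{\sum f (m - \mu)^2}{N} \]
\[ s^2 = \frac{\sum f (m - \bar{x})^2}{n - 1} \]
Population variance:
\[ \sigma^2 = \frac{\sum fm^2 - \frac{(\sum fm)^2}{N}}{N} \]
Sample variance:
\[ s^2 = \frac{\sum fm^2 - \frac{(\sum fm)^2}{n}}{n - 1} \]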
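A minimal sketch of the grouped sample variance, reusing the weight classes from the example tables (midpoints 61, 64, 67, 70, 73 with frequencies 3, 4, 5, 6, 2). Treating those 20 weights as a sample is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """s^2 = (sum(f*m^2) - (sum(f*m))^2 / n) / (n - 1)."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need at least two observations")
    sum_fm = sum(f * m for f, m in zip(freqs, midpoints))
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))
    return (sum_fm2 - sum_fm ** 2 / n) / (n - 1)


weight_midpoints = [61, 64, 67, 70, 73]
weight_freqs = [3, 4, 5, 6, 2]

s2 = grouped_sample_variance(weight_midpoints, weight_freqs)
print(round(s2, 4), round(math.sqrt(s2), 4))
```

Here ∑fm = 1340 and ∑fm² = 90050, so the numerator is 90050 − 1340²/20 = 270 and s² = 270/19.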
Example 17
The following data give the frequency distribution of the
number of orders received each day during a sample
period of 50 days at the office of a mail-order company.
∑f = 50
0 ≤ x < 10     4
10 ≤ x < 20    9
20 ≤ x < 30    6
30 ≤ x < 40    4
40 ≤ x < 50    2
∑f = 25
(mean at centre)
Steps:
1. Arrange the data in ascending or descending order.
2. Locate the median, i.e. the second quartile, Q2.
3. For the observations below the median, locate the
middle value, i.e. the first quartile, Q1.
4. For the observations above the median, locate the
middle value, i.e. the third quartile, Q3.
IQR = Q3 − Q1
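The steps above can be sketched in Python as follows. The data list is hypothetical, and the sketch assumes that when n is odd the median itself is left out of both halves, which is one reading of "observations below/above the median"; other quartile conventions give slightly different values.

```python
def _median(ranked):
    """Median of an already ranked, non-empty list."""
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves steps."""
    ranked = sorted(data)                 # step 1: arrange in order
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ranked[:half]                 # observations below the median
    upper = ranked[half + (n % 2):]       # observations above the median
    return _median(lower), _median(ranked), _median(upper)


# Hypothetical data set of 11 observations.
orders = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]

q1, q2, q3 = quartiles(orders)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

Ranked, the data are 5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53, so the median is the 6th value and each half has five values.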
Weight (nearest kg)   Weight boundaries (kg)   Frequency, f   Cumulative Frequency, F
60 – 62                                        3
63 – 65                                        4
66 – 68                                        5
69 – 71                                        6
72 – 74                                        2
∑f = 20
Example 22 Solution
[Figure: horizontal boxplot on a scale from 0 to 60, marking the smallest value, 1st quartile, median, 3rd quartile and largest value.]
[Figure: boxplots displayed vertically on a scale from 0 to 50, marking the smallest value, first quartile Q1, median Q2 and third quartile Q3.]
Boxplot
[Figure: boxplot with fences. The box marks Q1, Q2 and Q3; each whisker ends at the last value inside the lower or upper fence; values beyond a fence are plotted as outliers (*).]
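A sketch of how the fences in such a boxplot might be computed. The slides as extracted do not state the fence rule, so the common convention of Q1 − 1.5 × IQR and Q3 + 1.5 × IQR is assumed here; the function names and the data are hypothetical.

```python
def boxplot_fences(q1, q3, k=1.5):
    """Lower and upper fences, assuming the k * IQR convention (k = 1.5)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def whiskers_and_outliers(data, q1, q3, k=1.5):
    """Whisker ends (last values inside the fences) and the outliers."""
    lower, upper = boxplot_fences(q1, q3, k)
    inside = [x for x in data if lower <= x <= upper]
    outliers = [x for x in data if x < lower or x > upper]
    return (min(inside), max(inside)), outliers


# Hypothetical ranked data with Q1 = 12 and Q3 = 36 (median-of-halves).
values = [5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 95]

fences = boxplot_fences(12, 36)
whisker_ends, outlier_values = whiskers_and_outliers(values, 12, 36)
print(fences, whisker_ends, outlier_values)
```

With IQR = 24 the fences are at −24 and 72, so 95 is flagged as an outlier and the upper whisker stops at 42.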