Chapter 4
4.1 Correlation
- Scatter plots
Scatter plots are graphs that depict clusters of dots that represent all of the
pairs of data in an experiment. For example, a plot of weight vs. height will
show a positive correlation: as height increases, weight also increases.
Scatter plots are constructed by plotting two variables along the horizontal
(x) and vertical (y) axes. Below are examples of scatter plots showing a
positive correlation, negative correlation, and no or little correlation. Note
that the more closely the cluster of dots represents a straight line, the
stronger the correlation.
Faculty of medicine and Pharmacy / Al-mergib University - Academic year 2022/2023
Biostatistics for Premedical students - Dr. A. A. Aziz - Dr. M.O. Alshrani
Positive Correlations
Negative Correlation
• A negative correlation results when an increase in one variable is
associated with a decrease in the other.
• An r value of -1 indicates a perfect negative linear association
(perfect negative correlation).
Example of a positive correlation: when it is hotter outside, the total ice
cream sales of companies tend to be higher, since more people buy ice cream
in hot weather.
The amount of coffee that individuals consume and their IQ level have a
correlation of zero.
The shoe size of individuals and the number of movies they watch per year
have a correlation of zero.
Example1:
Find the correlation coefficient (r) between X and Y for the following data:
X 22 35 16 40 10 25 20 45 20 12
Y 5 4 7 3 8 5 6 2 4 7
What is the type of correlation?
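Solution
X       Y      XY      X²      Y²
22      5      110     484     25
35      4      140     1225    16
16      7      112     256     49
40      3      120     1600    9
10      8      80      100     64
25      5      125     625     25
20      6      120     400     36
45      2      90      2025    4
20      4      80      400     16
12      7      84      144     49
Total:
245     51     1061    7259    293

\[
r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}
  = \frac{(10)(1061) - (245)(51)}{\sqrt{(10)(7259) - (245)^2}\,\sqrt{(10)(293) - (51)^2}} = -0.93
\]
This is a strong negative correlation.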
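The hand calculation above can be checked with a few lines of Python. This is only a minimal sketch: the function name `pearson_r` is chosen here for illustration and is not part of the notes. It applies the same raw-sums formula as the solution table.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from the column totals n, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Data of Example1
x = [22, 35, 16, 40, 10, 25, 20, 45, 20, 12]
y = [5, 4, 7, 3, 8, 5, 6, 2, 4, 7]
r = pearson_r(x, y)
print(round(r, 2))  # -0.93
```

The function assumes both lists have the same length and that neither variable is constant (otherwise the denominator is zero).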
Example2:
Find the correlation coefficient (r) between X and Y for the following data:
X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
What is the type of correlation?
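Solution
X       Y      XY      X²      Y²
1       1      1       1       1
3       2      6       9       4
4       4      16      16      16
6       4      24      36      16
8       5      40      64      25
9       7      63      81      49
11      8      88      121     64
14      9      126     196     81
Total:
56      40     364     524     256

\[
r = \frac{(8)(364) - (56)(40)}{\sqrt{(8)(524) - (56)^2}\,\sqrt{(8)(256) - (40)^2}}
  = \frac{672}{\sqrt{1056}\,\sqrt{448}} \approx 0.977
\]
This is a strong positive correlation.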
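As a cross-check of Example2, the same coefficient can be computed from deviations about the means, which is algebraically equivalent to the raw-sums formula. The sketch below is illustrative only, and the function name `pearson_r_centered` is an assumption of this example. Evaluated in floating point it gives about 0.977, so small differences in the third decimal place can come from rounding in a hand calculation.

```python
from math import sqrt

def pearson_r_centered(x, y):
    """Pearson correlation from deviations about the sample means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Data of Example2
x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]
r = pearson_r_centered(x, y)
print(round(r, 3))  # 0.977
```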
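Solution steps:
Rank the values of X and of Y, giving the ranks \(R_x\) and \(R_y\), and
compute the difference in ranks for each pair:
\[ D = R_x - R_y \]
The (Spearman) rank correlation coefficient is then
\[ r = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n(n^2 - 1)} \]
where n is the number of pairs.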
Example1:
X 5 7 9 10 8 6
Y 1 3 2 6 5 4
Solution
Rank each variable from largest (rank 1) to smallest (rank 6):
X:   10   9   8   7   6   5
Rx:  1    2   3   4   5   6
Y:   6    5   4   3   2   1
Ry:  1    2   3   4   5   6

X     Y     Rx    Ry    D              D²
5     1     6     6     6 - 6 = 0      0
7     3     4     4     4 - 4 = 0      0
9     2     2     5     2 - 5 = -3     9
10    6     1     1     1 - 1 = 0      0
8     5     3     2     3 - 2 = 1      1
6     4     5     3     5 - 3 = 2      4
Total                                  14

\[
r = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n(n^2 - 1)} = 1 - \frac{6(14)}{6(6^2 - 1)} = 1 - 0.40 = 0.60
\]
This relationship is positive and moderate.
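The ranking and the formula can be reproduced in Python. This is a minimal sketch that assumes no tied values, as in this example. The helper names `ranks_descending` and `spearman_r` are chosen here for illustration.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes there are no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_r(x, y):
    """Rank correlation r = 1 - 6 * sum(D^2) / (n * (n^2 - 1))."""
    rx, ry = ranks_descending(x), ranks_descending(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Data of Example1 (rank correlation)
x = [5, 7, 9, 10, 8, 6]
y = [1, 3, 2, 6, 5, 4]
print(ranks_descending(x))         # [6, 4, 2, 1, 3, 5]
print(ranks_descending(y))         # [6, 4, 5, 1, 2, 3]
print(round(spearman_r(x, y), 2))  # 0.6
```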
Example2:
Two doctors each reported on the condition of the same 10 patients:
Doctor1 (X)  Good  Excellent  Bad  Bad    Good       V.Good     Excellent  V.Bad  Bad  Good
Doctor2 (Y)  Good  V.Good     Bad  V.Bad  Excellent  Excellent  Excellent  Bad    Bad  V.Good
What is the relationship between the two reports?
Solution
Order the grades of each doctor from best (position 10) to worst (position 1).
Tied grades share the mean of their positions.

Position  10         9          8       7     6     5     4    3    2    1
X         Excellent  Excellent  V.Good  Good  Good  Good  Bad  Bad  Bad  V.Bad
Rx        9.5        9.5        8       6     6     6     3    3    3    1

where (9 + 10)/2 = 9.5, (7 + 6 + 5)/3 = 6 and (4 + 3 + 2)/3 = 3.

Position  10         9          8          7       6       5     4    3    2    1
Y         Excellent  Excellent  Excellent  V.Good  V.Good  Good  Bad  Bad  Bad  V.Bad
Ry        9          9          9          6.5     6.5     5     3    3    3    1

where (10 + 9 + 8)/3 = 9, (7 + 6)/2 = 6.5 and (4 + 3 + 2)/3 = 3.
X          Y          Rx     Ry     D       D²
Good       Good       6      5      1       1
Excellent  V.Good     9.5    6.5    3       9
Bad        Bad        3      3      0       0
Bad        V.Bad      3      1      2       4
Good       Excellent  6      9      -3      9
V.Good     Excellent  8      9      -1      1
Excellent  Excellent  9.5    9      0.5     0.25
V.Bad      Bad        1      3      -2      4
Bad        Bad        3      3      0       0
Good       V.Good     6      6.5    -0.5    0.25

\[ \sum D^2 = 28.5 \]
\[
r = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n(n^2 - 1)} = 1 - \frac{6(28.5)}{10(10^2 - 1)} = 0.83
\]
This relationship is positive and strong.
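The tied ranks in this example can be reproduced programmatically. The sketch below is illustrative only. The numeric coding in `GRADE_ORDER` and the helper name `average_ranks` are assumptions made here, not part of the notes. As in the solution, the worst grade receives rank 1 and tied grades share the mean of their positions.

```python
# Hypothetical numeric coding of the ordinal grades (worst = 0, best = 4).
GRADE_ORDER = {"V.Bad": 0, "Bad": 1, "Good": 2, "V.Good": 3, "Excellent": 4}

def average_ranks(scores):
    """Rank 1 = lowest score; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

doctor1 = ["Good", "Excellent", "Bad", "Bad", "Good",
           "V.Good", "Excellent", "V.Bad", "Bad", "Good"]
doctor2 = ["Good", "V.Good", "Bad", "V.Bad", "Excellent",
           "Excellent", "Excellent", "Bad", "Bad", "V.Good"]

rx = average_ranks([GRADE_ORDER[g] for g in doctor1])
ry = average_ranks([GRADE_ORDER[g] for g in doctor2])
n = len(rx)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
r = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r, 2))  # 28.5 0.83
```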
Example3:
RX 2 4.5 2 6 4.5 8 2 8 8
RY 8.5 4.5 6 4.5 8.5 2.5 7 2.5 1
Find the relationship between X and Y.
Solution
Rx Ry D D2
2 8.5 -6.5 42.25
4.5 4.5 0 0
2 6 -4 16
6 4.5 1.5 2.25
4.5 8.5 -4.0 16
8 2.5 5.5 30.25
2 7 -5 25
8 2.5 5.5 30.25
8 1 7 49
\[ \sum D^2 = 211 \]
\[
r = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n(n^2 - 1)} = 1 - \frac{6(211)}{9(9^2 - 1)} = -0.76
\]
This is a strong negative relationship.
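The simple linear regression model of Y on X is
\[ Y_i = \beta_0 + \beta_1 X_i + e_i \]
where Y is the dependent variable, X is the independent variable,
\(\beta_0\) and \(\beta_1\) are the regression parameters, and \(e_i\) is the
random error.
- Estimation of \(\beta_0\) and \(\beta_1\)
The estimated regression line is
\[ \hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X \]
with
\[
\hat{\beta}_1 = \frac{n\sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}
\]
\[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \]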
Where
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \qquad \bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} \]
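The regression line of X on Y is estimated in the same way:
\[ \hat{X} = \hat{\alpha}_0 + \hat{\alpha}_1 Y \]
\[
\hat{\alpha}_1 = \frac{n\sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}
\]
\[ \hat{\alpha}_0 = \bar{X} - \hat{\alpha}_1 \bar{Y} \]
where
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \qquad \bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} \]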
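Important Note:
\[ \rho = \pm\sqrt{\hat{\beta}_1 \times \hat{\alpha}_1} \]
where \(\rho\) takes the sign shared by \(\hat{\beta}_1\) and \(\hat{\alpha}_1\).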
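The note follows from multiplying the two slope estimates. A short derivation is sketched below. The shorthand \(S_{xy}\), \(S_{xx}\), \(S_{yy}\) is introduced here only for brevity and is not the notation of these notes. It also assumes the raw-sums form of the correlation coefficient used in the worked examples of Section 4.1.

```latex
\begin{aligned}
S_{xy} &= n\sum XY - \sum X \sum Y, \qquad
S_{xx} = n\sum X^2 - \Bigl(\sum X\Bigr)^2, \qquad
S_{yy} = n\sum Y^2 - \Bigl(\sum Y\Bigr)^2 \\[4pt]
\hat{\beta}_1 \hat{\alpha}_1
  &= \frac{S_{xy}}{S_{xx}} \cdot \frac{S_{xy}}{S_{yy}}
   = \left( \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}} \right)^{2}
   = \rho^{2}
\end{aligned}
```

Because \(S_{xx}\) and \(S_{yy}\) are positive, both slopes and the correlation coefficient all carry the sign of \(S_{xy}\).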
The parameters of the regression line are \(\beta_0\) and \(\beta_1\). The
first parameter, \(\beta_0\), is the intercept of the line; it is the value
of Y when X = 0. The second parameter, \(\beta_1\), is the slope of the
regression line; it is the average change in the value of Y when the value
of X increases by one unit.
Example1:
Find the regression equation of Y on X for the following data:
Y 19 20 21 19 23 18 22
X 8 7 8 7 9 6 8
Then find the value of Y when X = 10.
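Solution
X       Y       XY      X²
8       19      152     64
7       20      140     49
8       21      168     64
7       19      133     49
9       23      207     81
6       18      108     36
8       22      176     64
Total:
53      142     1084    407

\[ \hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X \]
\[
\hat{\beta}_1 = \frac{7(1084) - (53)(142)}{7(407) - (53)^2} = \frac{62}{40} = 1.55
\]
\[ \bar{X} = \frac{53}{7} = 7.571, \qquad \bar{Y} = \frac{142}{7} = 20.286 \]
\[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 20.286 - 1.55(7.571) = 8.55 \]
Then
\[ \hat{Y} = 8.55 + 1.55X \]
If X = 10, then
\[ \hat{Y} = 8.55 + 1.55(10) = 24.05 \]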
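The estimates for this example can be recomputed directly from the column totals. The following is a minimal sketch, with the function name `fit_line` chosen here for illustration. It uses the raw-sums formula for the slope together with \(\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}\), then evaluates the fitted line at X = 10.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line Y-hat = b0 + b1 * X."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n  # intercept from the two sample means
    return b0, b1

# Data of Example1 (regression)
x = [8, 7, 8, 7, 9, 6, 8]
y = [19, 20, 21, 19, 23, 18, 22]
b0, b1 = fit_line(x, y)
y_at_10 = b0 + b1 * 10
print(round(b1, 2), round(b0, 2), round(y_at_10, 2))  # 1.55 8.55 24.05
```

A quick sanity check on any fitted line is that it passes through the point of means: substituting X = 53/7 must return Y = 142/7.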
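Example2:
If the two regression equations are
\[ \hat{Y} = 12 - 1.5X \]
\[ \hat{X} = 7 - 0.42Y \]
find the correlation coefficient (\(\rho\)) and describe the relationship
between X and Y.
Solution
\[ \hat{\beta}_1 = -1.5, \qquad \hat{\alpha}_1 = -0.42 \]
Therefore
\[ \rho = -\sqrt{(-1.5)(-0.42)} = -\sqrt{0.63} = -0.79 \]
The relationship between X and Y is negative and strong.
Note:
The sign of the correlation coefficient is the same as the sign of the
regression coefficients.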
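The rule in the note can be written as a small helper. This is only a sketch: the function name `rho_from_slopes` is an assumption of this example. It takes the square root of the product of the two slopes and attaches their common sign.

```python
from math import copysign, sqrt

def rho_from_slopes(beta1, alpha1):
    """Correlation coefficient from the slopes of Y on X (beta1) and X on Y (alpha1)."""
    if beta1 * alpha1 < 0:
        # Both slopes share the sign of the covariance, so opposite signs are inconsistent.
        raise ValueError("the two regression slopes must have the same sign")
    return copysign(sqrt(beta1 * alpha1), beta1)

# Slopes of Example2: Y-hat = 12 - 1.5 X and X-hat = 7 - 0.42 Y
rho = rho_from_slopes(-1.5, -0.42)
print(round(rho, 2))  # -0.79
```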
Example3:
X 1 2 3 4 5
Y 2 3 5 7 8
1- Find the regression equation of Y on X.
2- Prove that \(\sum_{i=1}^{n} (Y_i - \hat{Y}_i) = 0\).
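Solution
X       Y       XY      X²
1       2       2       1
2       3       6       4
3       5       15      9
4       7       28      16
5       8       40      25
Total:
15      25      91      55

\[ \hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X \]
\[
\hat{\beta}_1 = \frac{n\sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}
= \frac{5(91) - (15)(25)}{5(55) - (15)^2} = \frac{80}{50} = 1.6
\]
\[
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{15}{5} = 3, \qquad
\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} = \frac{25}{5} = 5
\]
\[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 5 - 1.6(3) = 0.2 \]
Then
\[ \hat{Y} = 0.2 + 1.6X \]
The fitted values and residuals are: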
X       Y       Ŷ       Y - Ŷ
1       2       1.8     0.2
2       3       3.4     -0.4
3       5       5       0
4       7       6.6     0.4
5       8       8.2     -0.2
Total                   0

As we see,
\[ \sum_{i=1}^{n} (Y_i - \hat{Y}_i) = 0 \]
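The table confirms the result numerically for this data set. The same identity holds for any least-squares line, which can be shown in general using only \(\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}\) and the definitions of the sample means. A sketch of the argument:

```latex
\begin{aligned}
\sum_{i=1}^{n} \bigl(Y_i - \hat{Y}_i\bigr)
  &= \sum_{i=1}^{n} \bigl(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\bigr) \\
  &= n\bar{Y} - n\hat{\beta}_0 - \hat{\beta}_1\, n\bar{X} \\
  &= n\bigl(\bar{Y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{X}\bigr) \\
  &= 0 \qquad \text{since } \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}.
\end{aligned}
```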
Exercises
1) Consider the following data on weight of women in kg (Y) and height
in cm (X). The sample size is 20.
- Find the correlation between X and Y and interpret your result.
X Y
148.1 46.4
158.1 53.2
158.1 52.8
151.4 42.0
152.9 50.8
159.1 43.0
151.0 51.9
158.2 59.2
148.2 55.1
147.3 38.9
145.6 49.7
155.1 49.9
155.2 43.1
149.7 42.2
147.0 52.7
152.2 49.8
149.1 50.7
145.2 44.8
145.9 49.2
149.7 47.7
X 2 1 3 2 5
Y 4 3 5 3 6