Unit 7 Correlation Analysis
CORRELATION ANALYSIS
Mathematics Department
XAVIER UNIVERSITY-ATENEO DE CAGAYAN
Correlation Analysis
Correlation analysis is a statistical method that attempts to measure the
strength of the linear relationship between two quantitative variables by
means of a single value called the correlation coefficient.
r = sample correlation coefficient
ρ ("rho") = population correlation coefficient
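$$r = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$$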
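The computational formula for r can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `pearson_r` and the two small data sets are made up for the sketch, and the function assumes neither variable is constant (otherwise the denominator is zero).

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation coefficient r via the computational (sums) formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)             # sum of x_i squared
    sum_y2 = sum(v * v for v in y)             # sum of y_i squared
    sum_xy = sum(a * b for a, b in zip(x, y))  # sum of x_i * y_i
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative (made-up) data lying exactly on a straight line
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfect increasing line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfect decreasing line
```

The two calls give the extreme values of r: +1 for a perfect increasing linear relationship and -1 for a perfect decreasing one.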
Scatter Plot
A scatter plot is a graphical way of presenting the linear
relationship between two quantitative variables X and Y.
coefficient of determination: R² = r² × 100%
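$$\sum_{i=1}^{n} x_i = 166, \quad \sum_{i=1}^{n} y_i = 45, \quad \sum_{i=1}^{n} x_i^2 = 3{,}238, \quad \sum_{i=1}^{n} y_i^2 = 221, \quad \sum_{i=1}^{n} x_i y_i = 835$$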
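$$r = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$$

$$r = \frac{10(835) - (166)(45)}{\sqrt{\left[10(3{,}238) - (166)^2\right]\left[10(221) - (45)^2\right]}} = \frac{880}{\sqrt{(4{,}824)(185)}} \approx 0.9315$$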
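As a check on the arithmetic, the sums and r can be recomputed directly from the ten (sales calls, copiers sold) pairs of the copier example in this unit. The variable names are chosen for this sketch only.

```python
from math import sqrt

# Data from the copier example: sales calls (x) and copiers sold (y)
calls = [9, 25, 15, 20, 7, 10, 17, 20, 13, 30]
copiers = [3, 6, 4, 6, 3, 4, 4, 5, 3, 7]

n = len(calls)
sum_x = sum(calls)
sum_y = sum(copiers)
sum_x2 = sum(x * x for x in calls)
sum_y2 = sum(y * y for y in copiers)
sum_xy = sum(x * y for x, y in zip(calls, copiers))

# Computational formula for the sample correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(round(r, 4))
```

The printed sums should match the five totals listed for the example, and r rounds to about 0.9315, a strong positive linear relationship.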
This means that 86.77% of the total variation in the number of copier
machines sold in a month is explained by its linear relationship with the
number of sales calls made in a month. Only 13.23% (computed from
100% minus 86.77%) of the sample variability in the number of copier
machines sold in a month is due to factors other than what is accounted for
by its linear relationship with the number of sales calls made in a month.
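The 86.77% and 13.23% figures follow from squaring r. A short sketch, assuming the sums from the copier example (n = 10) and variable names chosen here for illustration:

```python
# Sums from the copier example (n = 10 sales representatives)
n = 10
sum_x, sum_y = 166, 45
sum_x2, sum_y2, sum_xy = 3238, 221, 835

numerator = n * sum_xy - sum_x * sum_y   # 880
ss_x = n * sum_x2 - sum_x ** 2           # 4824
ss_y = n * sum_y2 - sum_y ** 2           # 185

# r squared, computed without taking the square root first
r_squared = numerator ** 2 / (ss_x * ss_y)

explained = 100 * r_squared      # % of variation explained by the linear relationship
unexplained = 100 - explained    # % due to other factors

print(f"{explained:.2f}% explained, {unexplained:.2f}% unexplained")
```

Working with r² directly avoids rounding r before squaring, which would otherwise shift the last digit of the percentage.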
Test for the significance of the linear relationship
Null hypothesis
H0: There is no significant linear relationship between X and Y.
H0: ρ = 0
Alternative hypothesis
H1: There is a significant linear relationship between X and Y.
H1: ρ ≠ 0 (two-tailed test)
H1: ρ > 0 or H1: ρ < 0 (one-tailed test)
Note: If the test of significance for the correlation coefficient yields a significant result, then regression analysis can be performed.
Example: The sales manager of a certain company wants to determine whether
there is a linear relationship between the number of sales calls and the
number of copier machines sold in a month. The manager selected a random
sample of 10 sales representatives and determined the number of sales calls
each sales representative made last month and the number of copier
machines sold.

No. of sales calls    No. of copiers sold
         9                     3
        25                     6
        15                     4
        20                     6
         7                     3
        10                     4
        17                     4
        20                     5
        13                     3
        30                     7
4. Rejection Regions: Since the test is two-tailed, the rejection regions are
given by
$$t < -t_{\alpha/2} \quad \text{or} \quad t > t_{\alpha/2}$$
$$t < -t_{0.025} \quad \text{or} \quad t > t_{0.025}$$
6. Statistical Decision
Since t = 7.243 is in the critical region, H0 is rejected.
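The decision can be reproduced numerically. The sketch below assumes the usual test statistic for a correlation coefficient, t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom, and takes the critical value 2.306 from a standard t table (upper-tail area 0.025, 8 degrees of freedom). Neither appears explicitly in the steps shown here, so treat both as assumptions of this sketch.

```python
from math import sqrt

# Sums from the copier example (n = 10 sales representatives)
n = 10
sum_x, sum_y = 166, 45
sum_x2, sum_y2, sum_xy = 3238, 221, 835

r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

df = n - 2                                # degrees of freedom
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)  # assumed test statistic for H0: rho = 0

t_crit = 2.306  # assumed t-table value: upper-tail area 0.025, df = 8

# Two-tailed rule: reject H0 if t < -t_crit or t > t_crit
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 2), reject_h0)
```

Computed at full precision, the statistic is about 7.24. It differs from 7.243 only in the third decimal place, which is consistent with r having been rounded to 0.9315 before substitution. Either way it falls far inside the rejection region.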
Optional
SPSS: Analyze > Correlate > Bivariate
Optional
SPSS Output
H0: ρ = 0 versus H1: ρ ≠ 0