Course Unit 8 - Summary of Basic Statistical Tests III-2
LABORATORY SCIENCE:
BIOSTATISTICS AND EPIDEMIOLOGY
COURSE MODULE: 3    COURSE UNIT: 8    WEEK: 3
Cognitive:
1. Distinguish and explain efficiently the difference between parametric and nonparametric tests
2. Calculate and interpret correctly tests of difference such as the Kruskal–Wallis and Friedman
tests
3. Calculate and interpret correctly tests of correlation such as Pearson r, tests of
independence such as Chi-square, and tests of reliability (validity) such as Cronbach alpha.
4. List correctly the assumptions of McNemar's test
5. Conduct nonparametric tests correctly and explain their applications
Affective:
1. Listen attentively during online class discussions
2. Respond tactfully and respectfully during exchange of ideas and forum discussions.
3. Courteously accept comments and feedback of classmates on one’s opinions and ideas.
Psychomotor:
1. Participate actively during online class discussions
2. Perform correctly and individually the assigned unit tasks and assessment tasks
https://www.york.ac.uk/depts/maths/tables/friedman.pdf
https://www.dataanalytics.org.uk/critical-values-for-the-kruskal-wallis-test/
https://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm
Stewart, Anthony (2016). Basic Statistics and Epidemiology (A Practical Guide) 4th Ed. CRC Press,
Taylor & Francis Group. p65 to 75, p83 to 88.
Statistical inference tests are frequently classified according to whether they are parametric or
nonparametric. A parametric inference test is one that depends considerably on population
characteristics, or parameters, for its use. The z test, t test, and F test are examples of parametric
tests. The z test, for instance, requires that we specify the mean and standard deviation of the
null-hypothesis population, and it also requires that the population scores be normally
distributed when n is small.
Although all inference tests depend on population characteristics to some extent, the requirements of
nonparametric tests are minimal. For instance, the sign test is a nonparametric test. To use the
sign test, it is not necessary to know the mean, variance, or shape of the population scores. Because
nonparametric tests depend little on knowing population distributions, they are often referred to as
distribution-free tests.
Part I. Kruskal–Wallis Test
The Kruskal–Wallis test is a nonparametric test that is used with an independent groups design
employing k samples.
It is used as a substitute for the parametric one-way ANOVA.
The Kruskal–Wallis test does not assume population normality or homogeneity of variance, as does
parametric ANOVA, and requires only ordinal scaling of the dependent variable.
Steps involved in testing
1. Define Null and Alternative Hypotheses
2. State alpha (α) or level of significance
3. Test Statistics
4. Determine the degree of freedom (df)
5. Critical value
6. Calculation
7. Conclusion
Example:
In an experiment to determine which of three different missile systems is preferable,
the propellant burning rate was measured. The data after coding are given below:
1. Formulation of hypothesis
Ho: U1 = U2 = U3
Ha: at least one of U1, U2, U3 differs from the others
2. Level of significance
α = 0.05
3. Test statistics
Kruskal–Wallis Test
4. Degree of freedom (k – 1)
df: 3 - 1 = 2
5. Calculation
Pooled scores and their ranks:
SCORE   RANK
16.7    1
17.3    2.5
17.3    2.5
17.6    4
17.8    5
18.1    6
18.4    7
18.8    8
18.9    9.5
18.9    9.5
19.1    11
19.3    12
19.7    13
19.8    14.5
19.8    14.5
20.2    16
22.8    17
23.2    18
24.0    19
Propellant Burning Rates (ranks by missile system):
Missile System 1   Missile System 2   Missile System 3
19                 18                 7
1                  14.5               11
17                 6                  2.5
14.5               4                  2.5
9.5                16                 13
                   5                  9.5
                                      8
                                      12
R1 = 61.0          R2 = 63.5          R3 = 65.5
n1 = 5             n2 = 6             n3 = 8
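The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original handout: the helper name `average_ranks` is hypothetical, and it assumes the usual convention that tied scores share the average of the rank positions they occupy, which is what the SCORE/RANK table shows for 17.3, 18.9, and 19.8.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # average of positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


# Pooled coded burning rates, taken from the SCORE column of the table.
scores = [16.7, 17.3, 17.3, 17.6, 17.8, 18.1, 18.4, 18.8, 18.9, 18.9,
          19.1, 19.3, 19.7, 19.8, 19.8, 20.2, 22.8, 23.2, 24.0]
ranks = average_ranks(scores)
print(list(zip(scores, ranks)))
```

A quick check on any ranking is that the ranks add up to n(n + 1)/2, which is 190 for n = 19.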
COMPUTATION:
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$
$$H = \frac{12}{19(19+1)} \left( \frac{61^2}{5} + \frac{63.5^2}{6} + \frac{65.5^2}{8} \right) - 3(19+1)$$
H = 1.66
6. Critical Value
Reject null hypothesis (Ho) if H is greater than the critical value at 5 % level of significance
H tab = 5.991
NOTE: The critical value is determined using the level of significance and degree of
freedom provided in the table that will be viewed by downloading the link given in ‘required
reading’ section.
7. Conclusion
Since the computed value of H = 1.66 is less than the critical value of 5.991 at the 5% level
of significance, the null hypothesis is not rejected. There is insufficient evidence that the
propellant burning rates differ among the three missile systems.
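The Kruskal–Wallis computation above can be checked with a short Python sketch. It applies the handout's formula to the rank sums and sample sizes of the missile example; the function name `kruskal_wallis_h` is an illustrative choice, and no correction for ties is applied because the worked example does not use one.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1), without tie correction."""
    n = sum(sizes)
    weighted = sum(r * r / n_i for r, n_i in zip(rank_sums, sizes))
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


rank_sums = [61.0, 63.5, 65.5]  # R1, R2, R3 from the example
sizes = [5, 6, 8]               # n1, n2, n3 from the example
h = kruskal_wallis_h(rank_sums, sizes)

critical_value = 5.991          # tabled value for df = 2, alpha = 0.05 (from the text)
reject_h0 = h > critical_value
print(round(h, 2), reject_h0)
```

Because H is below the tabled value, `reject_h0` is false, which agrees with the conclusion of the example.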
Part II. Friedman Test
Assumptions
Assumption # 1 – One group that is measured on three or more different occasions
Assumption # 2 – The group is a random sample from the population
Assumption # 3 – Dependent variables are measured at ordinal or interval/ratio level
Assumption # 4 – Samples do not need to be normally distributed
FORMULA:
$$Q = \frac{12}{mk(k+1)} \sum_{i=1}^{k} R_i^2 - 3m(k+1)$$
Example:
The manager of a nationally known real estate agency has just completed a training session
on appraisals for three newly hired agents. To evaluate the effectiveness of his training, the
manager wishes to determine whether there is any difference in the appraised values placed on
houses by these three different individuals. A sample of 12 houses is selected by the manager,
and each agent is assigned a task of placing an appraised value (in thousands of dollars) on the 12
houses.
The results are summarized as follows:
At the 0.05 level of significance, use the Friedman rank test to determine whether there is evidence of
a difference in the median appraised value for the three agents. What can you conclude?
Solution:
3. Formulation of hypothesis
Ho: M1 = M2 = M3
Ha: not all of M1, M2, M3 are equal
4. Level of significance
α = 0.05
5. Test statistics
Friedman Test
6. Calculation
HOUSE   AGENT 1 (rank)   AGENT 2 (rank)   AGENT 3 (rank)
1       181.0 (1)        182.0 (2)        183.5 (3)
2       179.9 (1)        180.0 (2)        182.4 (3)
3       163.0 (2)        161.5 (1)        164.1 (3)
4       218.0 (3)        215.0 (1)        217.3 (2)
5       213.0 (1)        216.5 (2)        218.4 (3)
6       175.0 (1.5)      175.0 (1.5)      216.1 (3)
7       217.9 (1)        219.5 (2)        220.1 (3)
8       151.0 (2)        150.0 (1)        152.4 (3)
9       164.9 (1)        165.5 (2)        166.1 (3)
10      192.5 (1)        195.0 (2)        197.0 (3)
11      225.0 (2)        222.7 (1)        226.4 (3)
12      177.5 (1)        178.0 (2)        179.7 (3)
∑R      R1 = 17.5        R2 = 19.5        R3 = 35
WHERE: k = 3, m = 12
$$Q = \frac{12}{mk(k+1)} \sum_{i=1}^{k} R_i^2 - 3m(k+1)$$
$$Q = \frac{12}{(12)(3)(3+1)} \left( 17.5^2 + 19.5^2 + 35^2 \right) - (3)(12)(3+1)$$
Q = 15.29
7. Critical Value
Reject null hypothesis (Ho) if Q is greater than the critical value at 5 % level of significance
Q tab = 6.5
NOTE: The critical value is determined using the level of significance and degree of
freedom provided in the table that will be viewed by downloading the link given in ‘required
reading’ section.
8. Conclusion
Since the computed value of Q = 15.29 is greater than the critical value of 6.5 at the 5% level
of significance, the null hypothesis is rejected. Hence we conclude that the appraisals
given by the three agents are not the same.
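The Friedman calculation can be reproduced from the appraisal table with the following Python sketch. It is an illustration under stated assumptions: the helper `row_ranks` is hypothetical, ties within a house share the average rank (as for house 6), and the Agent 3 value for house 1 is read as 183.5 with rank 3.

```python
def row_ranks(row):
    """Rank values within one block; ties share the average rank."""
    return [1 + sum(v < x for v in row) + (sum(v == x for v in row) - 1) / 2
            for x in row]


# Appraised values (thousands of dollars): one row per house, one column per agent.
appraisals = [
    (181.0, 182.0, 183.5), (179.9, 180.0, 182.4), (163.0, 161.5, 164.1),
    (218.0, 215.0, 217.3), (213.0, 216.5, 218.4), (175.0, 175.0, 216.1),
    (217.9, 219.5, 220.1), (151.0, 150.0, 152.4), (164.9, 165.5, 166.1),
    (192.5, 195.0, 197.0), (225.0, 222.7, 226.4), (177.5, 178.0, 179.7),
]

m = len(appraisals)     # number of blocks (houses)
k = len(appraisals[0])  # number of treatments (agents)

ranked = [row_ranks(row) for row in appraisals]
rank_sums = [sum(col) for col in zip(*ranked)]

# Q = 12 / (m k (k + 1)) * sum(R_i^2) - 3 m (k + 1)
q = 12.0 / (m * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * m * (k + 1)

critical_value = 6.5    # tabled value quoted in the text for alpha = 0.05
reject_h0 = q > critical_value
print(rank_sums, round(q, 2), reject_h0)
```

The rank sums come out as 17.5, 19.5, and 35, matching the ∑R row of the table, so Q agrees with the hand computation.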
Part III. McNemar's Test
The McNemar test is a test on a 2x2 classification table when you want to test the difference
between paired proportions, e.g. in studies in which patients serve as their own control, or in
studies with "before and after" design.
In the McNemar test dialog box, two discrete dichotomous variables with the classification data
must be identified. Classification data may either be numeric or alphanumeric (string) values. If
required, you can convert a continuous variable into a dichotomous variable using the create
groups tools. The variables together cannot contain more than 2 different classification values.
The test is applied to a 2 × 2 contingency table, which tabulates the outcomes of two tests on a
sample of n subjects, as follows.
The null hypothesis of marginal homogeneity states that the two marginal probabilities for each
outcome are the same, i.e. pa + pb = pa + pc and pc + pd = pb + pd.
Thus the null and alternative hypotheses are: Ho: pb = pc Ha: pb ≠ pc
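The step from marginal homogeneity to a test on the two discordant cells can be written out as a short derivation. This is a sketch that assumes the usual labelling of the 2 × 2 table, in which b and c are the counts of subjects whose two outcomes disagree; the table itself is not reproduced in this handout. Given the total number of discordant pairs, b follows a binomial distribution with probability 1/2 under Ho, and squaring its normal approximation gives the McNemar statistic.

```latex
\begin{align*}
p_a + p_b = p_a + p_c \;&\Longrightarrow\; p_b = p_c \\
p_c + p_d = p_b + p_d \;&\Longrightarrow\; p_c = p_b \\
b \mid (b + c) \sim \mathrm{Binomial}\!\left(b + c, \tfrac{1}{2}\right) \;&\Longrightarrow\;
z = \frac{b - \tfrac{b + c}{2}}{\sqrt{\tfrac{b + c}{4}}} = \frac{b - c}{\sqrt{b + c}} \\
z^2 = \frac{(b - c)^2}{b + c} \;&\approx\; \chi^2_{1}
\end{align*}
```

This is why only b and c enter the test statistic, and why the statistic is compared with a chi-square critical value on 1 degree of freedom.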
FORMULA:
$$\chi^2 = \frac{(b - c)^2}{b + c}$$
Example:
Compare whether someone experiences joint pain before and after some treatment. We
want to test whether the treatment worked to change people from Yes to No. The 215
people who said “No” at both time points and the 380 people who said “Yes” at both are
actually irrelevant to this comparison. We are interested only in the people who
changed answers.
2. Level of significance
α = 0.05
3. Degree of freedom = 1
4. Test statistics
McNemar's Test
5. Critical Value
Reject null hypothesis (Ho) if χ² is greater than the critical value at 5 % level of significance
χ² tab = 3.841
NOTE: The critical value is determined using the level of significance and degree of
freedom provided in the table that will be viewed by downloading the link given in ‘required
reading’ section.
6. Calculation
WHERE:
b = 75
c = 785
$$\chi^2 = \frac{(b - c)^2}{b + c}$$
$$\chi^2 = \frac{(75 - 785)^2}{75 + 785}$$
χ² = 586.16
7. Conclusion
Since the computed value of χ² = 586.16 at 5% level of significance is greater than the
critical value of 3.841, therefore the null hypothesis is rejected.
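The McNemar arithmetic is small enough to verify in a few lines of Python. The sketch below uses the discordant counts from the example and the uncorrected formula given in this unit; the function name `mcnemar_chi2` is illustrative, and no continuity correction is applied because the handout does not use one.

```python
def mcnemar_chi2(b, c):
    """Uncorrected McNemar statistic computed from the two discordant counts."""
    return (b - c) ** 2 / (b + c)


b, c = 75, 785          # discordant counts from the example
chi2 = mcnemar_chi2(b, c)

critical_value = 3.841  # chi-square critical value, df = 1, alpha = 0.05 (from the text)
reject_h0 = chi2 > critical_value
print(round(chi2, 2), reject_h0)
```

The concordant counts (215 and 380 in the example) never appear in the function, which reflects the point made in the example that they are irrelevant to this comparison.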
Part IV. Chi-square
Also known as the Chi-square test of independence (χ², from the Greek letter chi, “χ”). This is a statistical
hypothesis test in which the sampling distribution of the test statistic is a chi-square (χ²) distribution
when the null hypothesis is true, or any test in which this is asymptotically true. It is used to
investigate whether distributions of categorical variables differ from one another. In chi-square, the
observations must be independent and the same observation can appear in only one cell. This test
statistic assumes a non-directional hypothesis and tests the hypothesis that two variables are
related only by chance.
Assumptions of Chi-square
1. Subjects are randomly selected
2. Categories are mutually exclusive
df = (r - 1) (c – 1)
Step 3. Select the critical region based on the critical value for Chi-square test based on the set
level of significance.
RULING:
If χ² computed < χ² critical, do not reject H0; if χ² computed ≥ χ² critical, reject H0
EXAMPLE:
Let us look at an example using some real data, as shown in Table 20.2. A study asks whether
Asians with diabetes receive worse treatment in primary care than non-Asians with diabetes. This
is important, since Asians are more likely to develop diabetes than non-Asians. A number of
variables are studied, including whether patients with diabetes have received a HbA1c test within
the previous year (we mentioned HbA1c in Chapter 7), as this is a valuable indicator of how
successfully diabetes is being controlled. Having the test performed regularly is important, and is
therefore a valid indicator of healthcare quality in diabetes. We can calculate that 64.6% (128/198)
of Asians received the check, compared with 74.7% (430/576) of non-Asians. As such we know that
a lower proportion of Asian patients was checked, but is there a significant association between
ethnicity and receiving the check? Our null hypothesis is that there is no association between
ethnicity and receiving a HbA1c check.
The frequencies for Asian/non-Asian patients with diabetes are assembled in a 2 × 2 table and
tabulated against the frequencies in each group of patients who have/have not received the HbA1c
test, as shown in Table 20.2.
To calculate χ2, use the following steps:
Step 1. Set up the hypothesis for Chi-square test.
H0: There is no significant association between ethnicity and receipt of an HbA1c check.
HA: There is a significant association between ethnicity and receipt of an HbA1c check.
df = (r - 1) (c – 1)
df = (2 - 1) (2 – 1) = 1 x 1 = 1
df = 1
Step 3. Select the critical region based on the critical value for Chi-square test based on the set
level of significance (α = 0.05).
df = 1, α = 0.05, χ² critical = 3.841
Cell A: [(a + b) × (a + c)/total] = (198 × 558) / 774 = 110 484 / 774 = 142.74
Cell B: [(a + b) × (b + d)/total] = (198 × 216) / 774 = 42 768 / 774 = 55.26
Cell C: [(a + c) × (c + d)/total] = (558 × 576) / 774 = 321 408 / 774 = 415.26
Cell D: [(b + d) × (c + d)/total] = (216 × 576) / 774 = 124 416 / 774 = 160.74.
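$$\chi^2 = \sum \frac{(O - E)^2}{E} = 7.32$$
where O is the observed frequency and E is the expected frequency of each cell.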
Step 6. Conclude according to statistical decision for hypothesis testing:
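RULING: If χ² computed < χ² critical, do not reject H0; if χ² computed ≥ χ² critical, reject H0
Since χ² computed (7.32) is greater than χ² critical (3.841), H0 is rejected: there is a
significant association between ethnicity and receipt of an HbA1c check.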
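The expected frequencies and the chi-square statistic for this example can be reproduced with a short Python sketch. The observed counts are assumptions recovered from the figures quoted in the text (128 of 198 Asian patients and 430 of 576 non-Asian patients received the check), since the 2 × 2 table itself is not reproduced here. No continuity correction is applied.

```python
# Observed counts: rows = Asian, non-Asian; columns = checked, not checked.
# 70 = 198 - 128 and 146 = 576 - 430 are derived from the totals quoted in the text.
observed = [[128, 70],
            [430, 146]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell = (row total x column total) / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841  # chi-square critical value, df = 1, alpha = 0.05
reject_h0 = chi2 >= critical_value
print([[round(e, 2) for e in row] for row in expected], round(chi2, 2), df, reject_h0)
```

Computed without rounding the expected frequencies, the statistic comes out at about 7.33, slightly above the 7.32 quoted in the example; the difference is only rounding and the decision is the same.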
Part V. Pearson r
In statistics, correlation assesses the strength of association between variables (usually interval
or ratio). The most widely used method for this is the Pearson Product-Moment Correlation Test,
which is more commonly and simply known as Pearson’s r (the corresponding population coefficient
is denoted by the Greek letter rho, “ρ”). It is a parametric statistical test used to measure the
degree of relationship (correlation coefficient) between two sets of data. The value of the
correlation coefficient ranges from -1 (perfect negative correlation) to +1 (perfect positive
correlation), with 0 indicating that there is no correlation between the two data sets. The table
below lists the correlation coefficients and the corresponding degree or strength of correlation:
Correlation Coefficient (Pearson’s r)   Strength of relationship
0.00                                    No correlation, no relationship
±0.01 to ±0.20                          Very low correlation, almost negligible relationship
±0.21 to ±0.40                          Slight correlation, definite but small relationship
±0.41 to ±0.70                          Moderate correlation, substantial relationship
±0.71 to ±0.90                          High correlation, marked relationship
±0.91 to ±0.99                          Very high correlation, very dependable relationship
±1.00                                   Perfect correlation, perfect relationship
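The table can be turned into a small lookup, sketched below in Python. The function name `correlation_strength` is hypothetical, and one assumption is needed: the table leaves small gaps between bands (for example between 0.20 and 0.21), so each upper bound is treated here as inclusive.

```python
def correlation_strength(r):
    """Return the verbal label from the table for a given Pearson's r."""
    size = abs(r)  # the sign gives direction only; strength depends on magnitude
    if size > 1:
        raise ValueError("r must lie between -1 and +1")
    if size == 0:
        return "No correlation"
    if size <= 0.20:
        return "Very low correlation"
    if size <= 0.40:
        return "Slight correlation"
    if size <= 0.70:
        return "Moderate correlation"
    if size <= 0.90:
        return "High correlation"
    if size < 1:
        return "Very high correlation"
    return "Perfect correlation"


print(correlation_strength(-0.891))  # High correlation
```

The label describes strength only; whether the correlation is statistically significant still has to be decided with the t test.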
The computational formula for r is:
$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n(\Sigma x^2) - (\Sigma x)^2][n(\Sigma y^2) - (\Sigma y)^2]}}$$
Step 5. Calculate the t-value and determine the statistical decision for hypothesis testing
$$t = r\sqrt{\frac{n - 2}{1 - r^2}}$$
RULING:
If |t computed| < t critical, do not reject H0; if |t computed| ≥ t critical, reject H0
NOTE: The test for correlation coefficient is two-tailed; thus, the rejection region is divided into two
equal parts, and the absolute value of t computed is compared with t critical.
EXAMPLE:
A rheumatologist measures and records the bone mineral density (BMD) in a group of women. She
has a hypothesis that BMD decreases with age, and decides to use correlation and linear
regression to explore this. The data collected by our consultant rheumatologist are shown in Table
18.1.
df = 10 – 2
df = 8
t-critical = ±2.306
$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n(\Sigma x^2) - (\Sigma x)^2][n(\Sigma y^2) - (\Sigma y)^2]}}$$
$$r = \frac{10(484.184) - (607)(8.198)}{\sqrt{[10(37{,}907) - (607)^2][10(6.934656) - (8.198)^2]}}$$
$$r = \frac{4{,}841.84 - 4{,}976.186}{\sqrt{[379{,}070 - 368{,}449][69.34656 - 67.207204]}}$$
$$r = \frac{-134.346}{\sqrt{[10{,}621][2.139356]}}$$
$$r = \frac{-134.346}{\sqrt{22{,}722.10008}}$$
$$r = \frac{-134.346}{150.7385}$$
r = −0.89125 or −0.891
Step 5. Decision rule. In order to make a decision on the significance of the relationship, we need
to determine the value of t.
$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = -0.891\sqrt{\frac{10 - 2}{1 - (-0.891)^2}} = -0.891\sqrt{\frac{8}{1 - 0.793881}} = -0.891\sqrt{\frac{8}{0.206119}} = -0.891\sqrt{38.81} = -5.55$$
RULING: If |t computed| < t critical, do not reject H0; if |t computed| ≥ t critical, reject H0
The t-computed -5.55 falls outside the t critical ±2.306 at a significance level of 0.05, therefore, we
reject the null hypothesis.
The correlation coefficient (Pearson r) of -0.891 indicates that there is a significant high negative
correlation between age and BMD of the patients.
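The whole Pearson computation can be checked from the summary sums used in the worked example. The Python sketch below is illustrative only: the variable names are assumptions, the raw data of Table 18.1 are not reproduced here, and Σy² is taken as 6.934656 so that 10 × Σy² matches the 69.34656 used in the calculation.

```python
import math

# Summary sums from the worked example (x = age, y = bone mineral density).
n = 10
sum_x, sum_y = 607.0, 8.198
sum_xy = 484.184
sum_x2, sum_y2 = 37907.0, 6.934656

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

# t statistic for testing whether the correlation differs from zero, df = n - 2.
t = r * math.sqrt((n - 2) / (1 - r ** 2))

t_critical = 2.306  # two-tailed critical value for df = 8, alpha = 0.05 (from the text)
reject_h0 = abs(t) >= t_critical
print(round(r, 3), round(t, 2), reject_h0)
```

Because this sketch carries r at full precision instead of rounding it to -0.891 first, t comes out marginally larger in magnitude than the hand-computed -5.55; the decision to reject H0 is unchanged.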
McNemar Test - is a test on a 2x2 classification table when you want to test the difference
between paired proportions, e.g. in studies in which patients serve as their own control, or in
studies with "before and after" design.
Chi-square Test – nonparametric statistical hypothesis test that allows the researcher to determine
whether frequencies that have been obtained in research differ from those that would have been
expected.
Pearson Product-Moment Correlation Test – parametric statistical test that is used to determine
the degree of relationship (if any) between two data sets, this is also known as the Pearson’s r.
Stephanie. (2014, April 9). Friedman’s Test / Two Way Analysis of Variance by Ranks.
Retrieved June 17, 2019, from Statistics How To website:
https://www.statisticshowto.datasciencecentral.com/friedmans-test/
Stewart, Anthony (2016). Basic Statistics and Epidemiology (A Practical Guide) 4th Ed. CRC Press,
Taylor & Francis Group. p65 to 75, p83 to 88.
STUDY QUESTIONS
1. You want to find out how test anxiety affects actual test scores. The independent variable
“test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety. The
dependent variable is the exam score, rated from 0 to 100%.
2. A water company sought evidence of whether the measures taken to clean up a river were effective.
The Biological Oxygen Demand at 12 sites on the river was compared before clean up, 1
month later and a year after clean up.
4. ABC Brewery manufactures and distributes three types of beer: low calorie beer, regular
beer and dark beer. In an analysis of the market segments of the three beers, the firm’s
market research group has raised the question of whether or not preferences of the beers
differ between male and female beer drinkers. If beer preference is independent of the
drinker’s gender, then one advertising campaign will be initiated for all ABC beers.
However, if the beer preference depends on gender, the firm will tailor its promotions toward
different target markets. A survey for this study resulted as follows:
5. Marilyn, an avid fan of karaoke sing-alongs and frustrated singer, has observed that even
though her singing voice quality is not that good, she still manages to get a score of 90 and
above whenever she sings in their family owned karaoke. She has a suspicion that it happens
only because of the loudness of the voice, not the quality. As she happened to be a graduate
of BS MedTech, she is abreast with knowledge on scientific investigations and data analyses
that she decided to conduct an experiment regarding this matter. For an entire month, she
has tallied her average daily score in karaoke, together with the average recorded decibels
using a sound level meter. She randomly selected 15 days from a 30-day experiment to
proceed to data processing at a significance level of 0.05. The table below is her tally.
Day             1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
Karaoke score   95   94   99   90   85   88   80   73   90   85   90   96   72   100  94
Decibel         120  110  122  90   80   97   86   72   113  115  99   102  88   130  125