Hypothesis Testing

Hypothesis: A hypothesis is an assumption (or claim) about a
population parameter.

• The hypothesis (or assumption or claim) may concern a
population parameter such as the population mean,
population standard deviation, population proportion, etc.
  
• The procedure of testing the validity of the hypothesis is
known as Hypothesis Testing.
• Null Hypothesis: A hypothesis which states “nullity”, or
“equality”, or “no difference”. Denoted by H0.

• Alternative Hypothesis: A hypothesis which is stated against
the null hypothesis, with the opposite claim. Denoted by Ha.

• Generally, a hypothesis that the researcher wants to “prove”
is stated as the alternative hypothesis.
• For example, an educator claims that the average IQ of Master's
degree holders is 100.
Null hypothesis: H0: μ = 100 (Not different from 100)

Alternative hypothesis: Ha: μ ≠ 100 (Two tailed)
OR
Ha: μ < 100 (Lower tailed)
OR
Ha: μ > 100 (Upper tailed)
Hypothesis testing consists of the following steps:
1. Formulating the hypotheses (null and alternative hypothesis).
2. Collecting a sample.
3. Calculating a sample statistic.
4. Finding the critical value(s) from the table.
5. Making a decision (reject or fail to reject H0).
• Types of error

• Type – I error: The error of rejecting H0 when H0 is true.

• Type – II error: The error of not rejecting H0 when H0 is false.

                   H0 is true                H0 is false
Reject H0          Type – I error            Correct decision
                   P(Type-I error) = α       (no error)
Do not reject H0   Correct decision          Type – II error
                   (no error)                P(Type-II error) = β
• Level of significance: The probability of making a Type – I error,
denoted by α, i.e. P(Type-I error) = α.

• Generally we set α = 0.01, 0.05 or 0.10 (i.e. 1% level, 5% level or 10% level).

• p-value (or observed significance level): The p-value is the
probability, calculated assuming H0 is true, of obtaining a test
statistic at least as extreme as the one observed. Reject H0 if the
p-value is less than the level of significance.
• Parametric Test: A parametric test is a statistical test which makes
assumptions about the distribution of the population. The test
statistic is valid only under these assumptions.
• Non-parametric Test (or distribution-free test): A non-parametric
test requires no assumption about the form of the population
distribution.
(a) Chi-square Test

• This test is one of the simplest and most widely used
non-parametric tests in statistical work.

• Null hypothesis H0: There is no association between
the given two attributes. (OR the attributes are independent)

• Alternative hypothesis Ha: There is an association between
the given two attributes. (OR the attributes are dependent)
• Test statistic to test the above hypothesis:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where O = Observed frequency
E = Expected frequency

• Calculation of the expected frequency (E):

$$E = \frac{RT \times CT}{N}$$

Where
RT = Row Total, CT = Column Total, and N = Grand Total
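• Yates’ correction for continuity: In a 2 × 2 contingency table, when
any cell frequency is less than 5, the corrected formula of the
test statistic is

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

• Critical value: The critical value is $\chi^2_{\alpha,\,(r-1)(c-1)}$,
where r is the number of rows, c is the number of columns, and
α is the level of significance.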
• Example: Use the data given in the table and test whether the
infection is independent of the vaccination. (Take α = 0.05)

                     Infected
                     Yes     No      Total
Vaccinated   Yes     10      280     300
             No      250     450     700
             Total   260     730     1000

Solution:
H0: Infection is independent of the vaccination.
Ha: Infection is not independent of the vaccination.
• The frequencies given in the table are observed frequencies.
• Now, we find the expected frequency of each cell using the
formula E = (RT × CT) / N.
• The table of expected frequencies:

                     Infected
                     Yes     No      Total
Vaccinated   Yes     78      219     300
             No      182     511     700
             Total   260     730     1000

• Calculation of the test statistic:

O       E       O − E   (O − E)²   (O − E)²/E
10      78      −68     4624       59.28
280     219     61      3721       16.99
250     182     68      4624       25.41
450     511     −61     3721       7.28
                        Total      108.96
Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E} = 108.96$
Level of significance: α = 0.05
Degrees of freedom = (No. of rows − 1)(No. of columns − 1)
= (2 − 1)(2 − 1)
= 1
So, the critical value is $\chi^2_{0.05,\,1} = 3.841$.
• Decision: Since the calculated value of the test statistic (108.96) is greater than the critical
value (3.841), we reject the null hypothesis at the 5% level of significance.

• Conclusion: Infection depends on vaccination.
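
The arithmetic of this example can be checked with a short sketch. It is only an illustration: the function names are hypothetical, the observed and expected frequencies are copied from the tables above, and the critical value 3.841 is the standard chi-square table value for α = 0.05 with 1 degree of freedom.

```python
def expected_frequency(row_total, col_total, grand_total):
    """E = (RT x CT) / N for one cell of a contingency table."""
    return row_total * col_total / grand_total


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed and expected frequencies as listed in the worked example
# (cell order: vaccinated/infected, vaccinated/not infected,
#  not vaccinated/infected, not vaccinated/not infected).
observed = [10, 280, 250, 450]
expected = [78, 219, 182, 511]

chi2 = chi_square_statistic(observed, expected)
critical_value = 3.841  # table value for alpha = 0.05, df = 1
reject_h0 = chi2 > critical_value

print(round(chi2, 2), reject_h0)
```

The sketch takes the expected frequencies as given rather than recomputing them, so that it mirrors the calculation table term by term.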


(b) Mann-Whitney U test
• Also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon
rank-sum test, or Wilcoxon–Mann–Whitney test.
• The Mann-Whitney U test is used to compare differences
between two independent groups when the dependent
variable is either ordinal or continuous, but not normally
distributed.
• The Mann-Whitney U test is often considered the
nonparametric alternative to the independent t-test although
this is not always the case.
Assumptions:
• The dependent variable should be measured at the ordinal or
continuous level.
• The independent variable should consist of two categorical
independent groups.
• The observations are independent.
• The distributions of the two groups need not be normal, but they
should have the same shape.
Example-1:
• A researcher decided to investigate whether an exercise or weight loss
intervention was more effective in lowering cholesterol levels. For this, the
researcher recruited a random sample of inactive males who were classified as
overweight. This sample was then randomly split into two groups: Group 1
underwent a calorie-controlled diet (i.e., the 'diet' group) and Group 2 undertook
an exercise-training programme (i.e., the 'exercise' group). In order to determine
which treatment programme was more effective, cholesterol concentrations were
compared between the two groups at the end of the treatment programmes.
• After collecting the data, the researcher finds that the distribution of
cholesterol is not bell-shaped in either group, i.e. it is not normally distributed.
So, the independent t-test is not applicable because the normality assumption is
not satisfied.
• So, in this case the researcher uses the Mann-Whitney U test.
Example-2:
Consider a Phase II clinical trial designed to investigate the effectiveness of a new
drug to reduce symptoms of asthma in children. A total of n=10 participants are
randomized to receive either the new drug or a placebo. Participants are asked to
record the number of episodes of shortness of breath over a 1 week period
following receipt of the assigned treatment. The data are shown below.

Placebo     7    5    6    4    12
New Drug    3    6    4    2    1

Is there a difference in the number of episodes of shortness of breath over a 1
week period in participants receiving the new drug as compared to those receiving
the placebo? (By inspection, it appears that participants receiving the placebo have
more episodes of shortness of breath, but is this statistically significant?)
Solution:

• Null hypothesis H0: There is no difference in the number of
episodes of shortness of breath between the two groups.

• Alternative hypothesis Ha: There is a difference in the
number of episodes of shortness of breath between the two
groups.
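Ranking all 10 observations together in ascending order (tied values share the average rank):
Sum of ranks in Placebo group: R1 = 37.
Sum of ranks in New drug group: R2 = 18.

• The test statistic for the Mann-Whitney U test, denoted by U, is the smaller of
U1 and U2, where

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 25 + 15 - 37 = 3$$
$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 25 + 15 - 18 = 22$$

So, the test statistic is U = 3.

• The critical value at the 0.05 level of significance is 2.
• Rejection criterion: If the value of the test statistic is less than or equal to the critical value, then
reject H0.
• Decision: Since 3 > 2, we do not reject H0.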
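
The ranking and the U statistic for this example can be reproduced with a minimal sketch. The helper names are hypothetical, tied values are assumed to share their average rank, and U is computed with the usual formula U = n1·n2 + n(n+1)/2 − R for each group.

```python
def midranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; store their average
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]


def mann_whitney_u(sample1, sample2):
    """Return (R1, R2, U), where U is the smaller of U1 and U2."""
    n1, n2 = len(sample1), len(sample2)
    ranks = midranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, min(u1, u2)


placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

r1, r2, u = mann_whitney_u(placebo, new_drug)
print(r1, r2, u)
```

The critical value still has to be read from a Mann-Whitney table; the sketch only covers the rank sums and U.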
Wilcoxon Signed Rank Test
• The Wilcoxon signed-rank test is a non-parametric statistical
hypothesis test used to check the difference between two
dependent samples.

• It is used to compare two sets of scores that come from the
same participants.

• It can be used as an alternative to the paired Student’s t-test
when the population cannot be assumed to be normally
distributed.
Assumptions:
There are three assumptions that are required for a Wilcoxon
signed-rank test to give a valid result. The first two
assumptions relate to the study design and the types of
variables measured. The third assumption reflects the
nature of the data.
(1) The dependent variable should be measured at
the ordinal or continuous level. Examples of ordinal
variables include Likert items (e.g., a 7-point item from
"strongly agree" through to "strongly disagree"). Examples
of continuous variables (i.e., interval or ratio variables) include
IQ score, marks obtained in an exam, weight measured in
kilograms, etc.
(2) The independent variable should consist of two
categorical, "related groups" or "matched pairs". "Related
groups" indicates that the same subjects are present in both
groups.
(3) The distribution of the differences between the two related
groups (i.e., the distribution of differences between the scores
of both groups of the independent variable; for example, the
reaction time in a room with "blue lighting" and a room with
"red lighting") needs to be symmetrical in shape. (If this
assumption fails, you can transform the data to achieve a
symmetrically shaped distribution of differences, although this is
not advisable, or you can perform the sign test instead of the
Wilcoxon signed-rank test.)
Example – 1: A pain researcher is interested in finding
methods to reduce lower back pain in individuals
without having to use drugs. The researcher thinks that
having acupuncture in the lower back might reduce back
pain.
To investigate this, the researcher recruits 25 participants
to their study. At the beginning of the study, the
researcher asks the participants to rate their back pain
on a scale of 1 to 10, with 10 indicating the greatest level
of pain.
After 4 weeks of twice weekly acupuncture, the
participants are asked again to indicate their level of
back pain on a scale of 1 to 10, with 10 indicating the
greatest level of pain.
The researcher wishes to understand whether the
participants' pain levels changed after they had
undergone the acupuncture.
The researcher formulates the hypotheses as follows:

Null hypothesis H0: There is no change in pain levels before and
after the acupuncture treatment.
Alternative hypothesis Ha: There is a change in pain levels
before and after the acupuncture treatment.

• So in this case the researcher can use the Wilcoxon signed-rank test to test this
hypothesis, provided the assumptions are satisfied.
Example – 2: A sports psychologist believes that listening to
music affects the length of athletes’ workout sessions. The
length of time (in minutes) of 10 athletes’ workout sessions,
while listening to music and while not listening to music, are
shown in the table.
Solution:
• Hypothesis:
H0: There is no difference in the length of the athletes’ workout sessions.
Ha: There is a difference in the length of the athletes’ workout sessions.
• Test statistic:

• The sum of the negative ranks:
-1 + (-2) + (-4.5) = -7.5
• The sum of the positive ranks:
3 + 4.5 + 6 + 7.5 + 7.5 + 9 + 10 = 47.5

So, the test statistic is
W = minimum of the absolute values of these two sums
= Minimum(|-7.5|, |47.5|)
= 7.5
• Critical value:
For sample size n = 10 and α = 0.05 (two tailed test), the
critical value is 8.

• Decision: Since the calculated value 7.5 is less than the critical
value 8, we reject H0 at the 5% level of significance.
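
The signed-rank calculation can be sketched as follows. The function names are hypothetical. Because the table of workout times is not reproduced here, the `differences` list is NOT the study data: it is an invented set of paired differences chosen only so that its signed ranks match the ranks listed above.

```python
def midranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # average of ranks i+1..j
        i = j
    return [rank_of[v] for v in values]


def wilcoxon_signed_rank(differences):
    """Return (W+, W-, W) with W = min(W+, W-); zero differences are dropped."""
    nonzero = [d for d in differences if d != 0]
    ranks = midranks([abs(d) for d in nonzero])
    w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical paired differences (illustration only, not the original data).
differences = [-1, -2, 3, -4, 4, 5, 6, 6, 8, 9]

w_plus, w_minus, w = wilcoxon_signed_rank(differences)
print(w_plus, w_minus, w)
```

With these illustrative differences the sketch gives rank sums of 47.5 and 7.5, so W = 7.5 as in the solution.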
Kruskal-Wallis Test
• The Kruskal-Wallis test can be used to determine
whether three or more samples were selected from
populations having the same distribution.
• This test is the non-parametric alternative to the One
Way ANOVA.
• Sometimes also called the “one-way ANOVA on ranks”
• This test is a rank based non-parametric test when the
scale of measurement is ordinal or continuous.
• The null and alternative hypotheses for the Kruskal-Wallis
test are as follows.

Null hypothesis:
H0: There is no difference in the distribution of the
populations (OR the population medians are the same).
Alternative hypothesis:
Ha: There is a difference in the distribution of the
populations (OR the population medians are not the
same).
• Assumptions:

• There is one independent variable with two or more
levels (independent groups).
• The scale of measurement is ordinal, interval or ratio.
• The observations should be independent. That means
there should be no relation between the members within
each group or between groups.
• All groups should have distributions of the same shape.
Example-1:
• You want to find out how test anxiety affects actual test
scores.
• The independent variable “test anxiety” has three levels:
no anxiety, low-medium anxiety and high anxiety.
• The dependent variable is the exam score, rated from 0
to 100%.
Example-2:
• You want to find out how socioeconomic status affects
attitude towards sales tax increases.
• Your independent variable is “socioeconomic status”
with three levels: working class, middle class and
wealthy.
• The dependent variable is measured on a 5-point Likert
scale from strongly agree to strongly disagree.
Example-3:
Four groups of students were randomly assigned to be taught
with four different techniques, and their achievement test
scores were recorded. Are the distributions of test scores the
same, or do they differ in location?
Group 1    Group 2    Group 3    Group 4
65         75         59         94
87         69         78         89
73         83         67         80
79         81         62         88
Solution:
H0: The four groups have the same distribution.
Ha: The four groups do not have the same distribution.
We assign ranks from 1 to 16 in ascending order. The ranks are given in ( ).

            Group 1    Group 2    Group 3    Group 4
            65 (3)     75 (7)     59 (1)     94 (16)
            87 (13)    69 (5)     78 (8)     89 (15)
            73 (6)     83 (12)    67 (4)     80 (10)
            79 (9)     81 (11)    62 (2)     88 (14)
Rank sum    31         35         15         55
• The calculated value of the test statistic is
$H = \frac{12}{N(N+1)} \sum \frac{R_i^2}{n_i} - 3(N+1) = 8.96$,
where N = 16 is the total number of observations, R_i is the rank
sum of group i, and n_i = 4 is the size of group i.
• The table value (critical value) from the chi-square
distribution with df = 4 − 1 = 3 and α = 0.05 is 7.81.

• Rejection region: Reject H0 when H is greater than 7.81.

• Decision: Since the value of H is greater than the critical
value, we reject H0.

• Conclusion: The four groups do not have the same
distribution.
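
The rank sums and the value of H in this example can be reproduced with a short sketch. It assumes the standard Kruskal-Wallis formula H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1) without a tie correction (the data here contain no ties). The function names are hypothetical.

```python
def midranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # average of ranks i+1..j
        i = j
    return [rank_of[v] for v in values]


def kruskal_wallis_h(groups):
    """Return (rank sums per group, H statistic); no tie correction applied."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    rank_sums = []
    start = 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    weighted = sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h


# Achievement test scores for the four teaching techniques (columns of the table).
groups = [
    [65, 87, 73, 79],
    [75, 69, 83, 81],
    [59, 78, 67, 62],
    [94, 89, 80, 88],
]

rank_sums, h = kruskal_wallis_h(groups)
print(rank_sums, round(h, 2))
```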
Friedman Test
• The Friedman test is the non-parametric alternative to
the one-way ANOVA with repeated measures.
• It is used to test for differences between groups when
the dependent variable being measured is ordinal.
• It can also be used for continuous data that has
violated the assumptions necessary to run the one-way
ANOVA with repeated measures.
Assumptions:
• One group is measured on three or more different
occasions.
• The group is a random sample from the population.
• The dependent variable should be measured at the
ordinal or continuous level.
• The samples do not need to be normally distributed.
• The null and alternative hypotheses for the Friedman test
are as follows.

Null hypothesis:
H0: All medians are equal.

Alternative hypothesis:
Ha: Not all medians are equal.
Test statistic:

$$F_R = \frac{12}{nK(K+1)} \sum_{i=1}^{K} R_i^2 - 3n(K+1)$$

Where:
K = number of columns (often called “treatments”)
n = number of rows (often called “blocks”)
Ri = sum of the ranks in column i.
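
A minimal sketch of the Friedman calculation is given below, assuming the usual form of the statistic, 12/(nK(K+1)) · Σ R_i² − 3n(K+1), with ranks assigned within each row (block). The function names are hypothetical and the ratings in `blocks` are invented purely for illustration; they are not taken from any example in this document.

```python
def midranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # average of ranks i+1..j
        i = j
    return [rank_of[v] for v in values]


def friedman_statistic(blocks):
    """blocks: n rows, each holding K treatment measurements for one subject."""
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        # rank the K measurements within this block only
        for j, r in enumerate(midranks(row)):
            rank_sums[j] += r
    stat = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    return rank_sums, stat


# Hypothetical data: 4 subjects (rows) rated under 3 conditions (columns).
blocks = [
    [9, 7, 6],
    [8, 6, 7],
    [7, 5, 4],
    [9, 8, 6],
]

rank_sums, stat = friedman_statistic(blocks)
print(rank_sums, stat)
```

The resulting value would then be compared with a chi-square critical value with K − 1 degrees of freedom, or with an exact Friedman table for small samples.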
Example-1:
• A researcher wants to examine whether music has an
effect on the perceived psychological effort required
to perform an exercise session.
• The dependent variable is "perceived effort to
perform exercise" and the independent variable is
"music type", which consists of three groups: "no
music", "classical music" and "dance music".
• To test whether music has an effect on the perceived
psychological effort required to perform an exercise
session, the researcher recruited 12 runners who each
ran three times on a treadmill for 30 minutes.
• For consistency, the treadmill speed was the same for all
three runs. In a random order, each subject ran: (a)
listening to no music at all; (b) listening to classical
music; and (c) listening to dance music. 
• At the end of each run, subjects were asked to record
how hard the running session felt on a scale of 1 to 10,
with 1 being easy and 10 extremely hard.
• A Friedman test was then carried out to see if there were
differences in perceived effort based on music type. 
Example-2:
• The Government of India wants to know whether the river
clean-up project is effective or not.
• Water quality can be measured by the Water Quality Index
(WQI).
• The WQI at 10 sites on the river was measured before the clean-up, 6
months later and a year after the clean-up.
• To test whether the water quality improved or not, we
can apply the Friedman test.
t-test (Student’s t-test)
• The t-test is used in the following cases.
(1) When we want to test whether the population mean and the
assumed mean are equal or not. In this case we have only one
sample to test the hypothesis and the test is known as a one
sample t-test.
(2) When we want to test the equality of (or difference between) the
means of two populations. In this case we have two samples to
test the hypothesis and the test is known as a two-sample t-test.
We first study the one sample t-test and then the two sample
t-test.
(1) One sample t-test.
• As mentioned earlier, in the one sample t-test we test the
population mean (μ). The null and alternative
hypotheses are
H0: μ = μ0
Ha: μ ≠ μ0 (or μ < μ0, or μ > μ0)
• When the alternative hypothesis has the not equal (≠) sign,
the test is known as a two-tailed (or two-sided) test.
• When the alternative hypothesis has the less than (<) sign or
the greater than (>) sign, the test is known as a one-sided test.
• In particular, we call it a lower tailed (or left-tailed) test when
the less than (<) sign is present in the alternative hypothesis, and
an upper tailed (or right-tailed) test when the greater than (>) sign
is present.
Assumptions of one-sample t-test.
• The dependent variable should be measured at the interval or ratio
level (i.e. continuous).
• The data are independent. That is, the sample elements are
independent.
• There should be no significant outliers.
• The dependent variable should be approximately normally
distributed.
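Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Where x̄ is the sample mean, μ0 is the assumed
mean (or hypothesized mean), s is the sample
standard deviation, and n is the sample size.
Degrees of freedom: For the one sample t-test the
degrees of freedom are df = n − 1.

Critical value: The critical value depends on the level
of significance, the degrees of freedom and the type of
test.
Critical value for lower tailed test: $-t_{\alpha,\,n-1}$
Critical value for upper tailed test: $t_{\alpha,\,n-1}$
Critical values for two tailed test: $\pm t_{\alpha/2,\,n-1}$

Rejection region: $t < -t_{\alpha,\,n-1}$ (lower tailed), $t > t_{\alpha,\,n-1}$ (upper tailed), $|t| > t_{\alpha/2,\,n-1}$ (two tailed).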
Rejection Rule:
• Critical value approach: Reject the null hypothesis when
the calculated value of the test statistic falls in the critical
region.
• P-value approach: Reject the null hypothesis when the
p-value is less than the level of significance.
Example-1:
• A research study measured the pulse rates of 57 college men
and found a mean pulse rate of 70.4211 beats per minute
with a standard deviation of 9.9480 beats per minute.
• Researchers want to know if the mean pulse rate for all
college men is different from the current standard of 72
beats per minute.
Example-2:
• In the population of Americans who drink coffee, the average daily consumption
is 3 cups per day.
• A university wants to know if their students tend to drink more coffee than the
national average.
• They ask a random sample of 50 students how many cups of coffee they drink
each day and found the sample mean 3.8 and the sample standard deviation 1.5.
• Do they have evidence that their students drink more than the national average?
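
The test statistic for both examples can be computed directly from the summary figures given above. This is a minimal sketch using the formula t = (x̄ − μ0) / (s / √n); the function name is hypothetical, and the critical values still have to be looked up in a t table for the stated degrees of freedom.

```python
import math


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Return (t statistic, degrees of freedom) for a one sample t-test."""
    t = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
    return t, n - 1


# Example-1: pulse rates of 57 college men, H0: mu = 72 (two tailed test).
t1, df1 = one_sample_t(70.4211, 72, 9.9480, 57)

# Example-2: coffee consumption of 50 students, H0: mu = 3 (upper tailed test).
t2, df2 = one_sample_t(3.8, 3, 1.5, 50)

print(round(t1, 3), df1)
print(round(t2, 3), df2)
```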
(2) Two sample t-test.
• The two sample t-test is used to test the equality of the means of two
populations.
• There are two different scenarios when we have two samples.

• The samples may be dependent (e.g. readings are measured before and
after on the same subjects). In this case we perform a paired t-test.
• The samples may be independent. In this case we perform an
independent t-test.
Paired t-test (Dependent t-test):
• The paired t-test compares the means between two related
groups on the same continuous, dependent variable.
• For example, you could use a dependent t-test to
understand whether there was a difference in smokers’
daily cigarette consumption before and after a 6 week
hypnotherapy programme.
Hypothesis:
H0: μd = 0
Ha: μd ≠ 0 (or μd < 0, or μd > 0)

Where μd = population mean of the differences.

• Test statistic:

$$t = \frac{\bar{d}}{S_d / \sqrt{n}}$$

with n − 1 degrees of freedom, where
• d̄ = Sample mean of the differences.

• Sd = Sample standard deviation of the differences.

• n = Sample size (number of pairs).
Rejection Rule:
• Critical value approach: Reject the null hypothesis when
the calculated value of the test statistic falls in the critical
region.
• P-value approach: Reject the null hypothesis when the
p-value is less than the level of significance.
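
As a rough illustration of the paired t-test statistic, t = d̄ / (Sd / √n), the sketch below works on before/after measurements. The function name and the numbers are hypothetical and are used only to show the mechanics; they do not come from any study described here.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    d = [a - b for b, a in zip(before, after)]  # difference for each pair
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (divides by n - 1)
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical before/after scores for 4 subjects (illustration only).
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]

t, df = paired_t(before, after)
print(round(t, 2), df)
```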
Example-1:
• A group of Sports Science students (n = 20) are selected from
the population to investigate whether a 12-week plyometric-
training programme improves their standing long jump
performance.
• In order to test whether this training improves performance,
the students are tested for their long jump performance before
they undertake a plyometric-training programme and then
again at the end of the programme.
• Here we have the before and after data. So, we can perform
paired t-test.
Example-2:
• The sales data of 15 stores before and after an
advertisement campaign were collected.
• We can test whether the advertisement campaign
significantly increased sales using the paired t-test.
Independent t-test

• The independent t-test compares the difference
between the means of two unrelated groups on the same
continuous, dependent variable.
Assumptions of independent t-test.
• The dependent variable should be measured at the interval or
ratio level (i.e. continuous).
• The independent variable should consist of two categorical
independent groups.
• The data are independent. That is, the sample elements are
independent.
• There should be no significant outliers.
• The dependent variable should be approximately normally
distributed for each group.
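Test statistic:
Case-1: When equal variances assumed

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

• Where the pooled variance is

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

OR, equivalently,

$$S_p^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2 + \sum (x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$$

• In this case the degrees of freedom will be df = n1 + n2 − 2.
Case-2: When equal variances not assumed:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

In this case the degrees of freedom will be

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$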

Rejection Rule:
• Critical value approach: Reject the null hypothesis when
the calculated value of the test statistic falls in the critical
region.
• P-value approach: Reject the null hypothesis when the
p-value is less than the level of significance.
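
The two cases of the independent t-test can be compared side by side with a small sketch. It assumes the standard pooled-variance statistic for Case-1 and the Welch statistic with Welch-Satterthwaite degrees of freedom for Case-2. The function names and the summary statistics are hypothetical, chosen only for illustration.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Case-1: equal variances assumed. Returns (t, df)."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Case-2: equal variances not assumed. Returns (t, df)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two independent groups (illustration only).
t_pooled, df_pooled = pooled_t(20, 4, 10, 17, 3, 12)
t_welch, df_welch = welch_t(20, 4, 10, 17, 3, 12)

print(round(t_pooled, 3), df_pooled)
print(round(t_welch, 3), round(df_welch, 2))
```

The Welch degrees of freedom are generally not an integer; when a printed t table is used, they are commonly rounded down to the nearest whole number.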