Testing Two Independent Samples (With Minitab Procedures)
[Figure: Population 1 (N1) and Population 2 (N2): are they equal? From samples of sizes n1 and n2, derive a test statistic to determine whether the two populations are equal.]
What is Meant by Two Independent Samples?
A sample is selected from all men, and a sample is selected from all women.
Sample of men: x̄1 = 6.2
Sample of women: x̄2 = 6.5
Is the difference between the sample means statistically significant?
Procedure for Testing the Hypothesis
Sample Problem 13
Step 1: Assumptions and test requirements.
Independent random sampling.
The samples must be independent of each other.
Level of measurement is interval-ratio.
Number of children can be treated as interval-ratio.
Population variances are equal.
As long as the two samples are approximately the same size, we can
make this assumption.
t-Test for Independent Samples
Sampling distribution is normal in shape.
With small samples (n < 30), the previous assumption (equal population
variances) has to be added in order to meet this assumption.
Step 2: Define the null hypothesis.
Null hypothesis, H0: µ1 = µ2
Alternative hypothesis, H1: µ1 < µ2
Step 3: Sampling distribution and critical region.
Sampling distribution: Student’s t distribution
Significance level: α = 0.05 (one-tailed)
Degrees of freedom: n1 + n2 - 2 = 42 + 37 - 2 = 77
Critical t: t (critical) = -1.671 (the t-table entry for df = 60, the nearest tabulated value below 77)
Step 4: Compute the test-statistic.
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 42 2.370 0.630 0.097
Sample 2 37 2.780 0.950 0.16
Estimation for Difference
Difference    95% Upper Bound for Difference
-0.410        -0.103
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ < 0
T-Value DF P-Value
-2.23 61 0.015
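The Minitab output above states that equal variances are not assumed, which corresponds to the unpooled (Welch) form of the two-sample t-test. As a minimal sketch, assuming that is the formula behind the output, the T-Value and DF can be recomputed from the descriptive statistics alone. The function name `welch_t` is illustrative and not part of Minitab.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample t-test that does not assume equal variances."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics for Sample 1 and Sample 2 from the output above
t, df = welch_t(2.370, 0.630, 42, 2.780, 0.950, 37)
print(f"t = {t:.2f}, df = {math.floor(df)}")  # t = -2.23, df = 61
```

Note that the unpooled degrees of freedom (61) differ from the pooled value n1 + n2 - 2 = 77 used in Step 3; the decision is the same under either value.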
Step 5: Decision and interpretation of results.
t (obtained) = -2.23
This is beyond t (critical) = -1.671.
The obtained test statistic falls in the critical region, so we reject
H0; p = 0.015 < α = 0.05.
The difference between the number of children in center-city
families and in suburban families is statistically significant.
The difference between the sample means is so large that we
can conclude (at α = 0.05) that a difference exists between the
populations represented by the samples.
Sample Problem 14
The data below show the distances of the home runs hit in record-setting
seasons by Mark McGwire and Barry Bonds. Assume that we have simple
random samples from large populations and use a 0.05 significance level
(α = 0.05) to test the claim that the distances come from populations
with different means.
Statistics McGwire Bonds
Mean 418.5 403.7
Standard Deviation 45.5 30.6
Sample Size 70 73
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 70 418.5 45.5 5.4
Sample 2 73 403.7 30.6 3.6
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0
T-Value DF P-Value
2.27 120 0.025
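As a check on the T-Value, the statistic can be worked out by hand from the summary table, assuming the unpooled formula that matches "Equal variances are not assumed":

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
  = \frac{418.5 - 403.7}{\sqrt{\dfrac{45.5^2}{70} + \dfrac{30.6^2}{73}}}
  = \frac{14.8}{\sqrt{29.575 + 12.827}}
  \approx \frac{14.8}{6.512}
  \approx 2.27
```

Following the same decision rule as in Sample Problem 13, p = 0.025 < α = 0.05, so H0 would be rejected in favor of the claim that the mean distances differ.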
Sample Problem 15
A researcher wishes to test the claim that, on average, more
juveniles than adults are classified as missing persons. Records for
the last five years are shown below. At α = 0.05, is there enough
evidence to support the claim?
Method
μ₁: mean of Juveniles
µ₂: mean of Adults
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Juveniles 5 63356 2808 1256
Adults 5 35387 2631 1177
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ > 0
T-Value DF P-Value
16.25 7 0.000
Sample Problem 16
From the sample data below, use a 0.05 significance level (α = 0.05) to
test the claim that the mean amount of tar in filtered king-size cigarettes
is less than the mean amount of tar in non-filtered king-size cigarettes.
All measurements are in milligrams, and the data are from the Federal
Trade Commission.
Filtered:     16 15 16 14 16 1 16 18 10 14 12 11 14 13 13 13 16 16 8 16 11
Non-filtered: 23 23 24 26 25 26 21 24
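When raw data are available, as here, the same unpooled t statistic can be computed directly from the observations. The sketch below is one way to do it with Python's standard library; it assumes equal variances are not assumed, as in the Minitab analyses above, and the helper name `welch_from_data` is illustrative.

```python
import math
from statistics import mean, variance

filtered = [16, 15, 16, 14, 16, 1, 16, 18, 10, 14, 12,
            11, 14, 13, 13, 13, 16, 16, 8, 16, 11]
non_filtered = [23, 23, 24, 26, 25, 26, 21, 24]

def welch_from_data(x, y):
    """Return (t, df) for the unpooled two-sample t-test on raw data."""
    vx = variance(x) / len(x)  # sample variance divided by n
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

t, df = welch_from_data(filtered, non_filtered)
print(f"t = {t:.3f}, df = {math.floor(df)}")
```

Under these assumptions the statistic comes out at roughly -10.6 with about 25 degrees of freedom, far below the left-tailed critical value of about -1.708 for 25 df at α = 0.05.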
Direct and Broker-Purchased Mutual Funds
Millions of investors buy mutual funds, choosing from thousands of
possibilities. Some funds can be purchased directly from banks or
other financial institutions, while others must be purchased through
brokers, who charge a fee for this service. This raises the question:
Can investors do better by buying mutual funds directly than by
purchasing mutual funds through brokers? To help answer this
question, a group of researchers randomly sampled mutual funds that
can be acquired directly and mutual funds that are bought through
brokers, and recorded the net annual returns.
Can we conclude at α = 0.05 that directly purchased mutual funds
outperform mutual funds bought through brokers?
(In each row, the first line gives the observed counts and the second line the expected counts.)
       New          Old          No
       Procedure    Procedure    Preference    All
1      100          80           20            200
       75           100          25
2      50           120          30            200
       75           100          25
All    150          200          50            400
Chi-Square Test
Chi-Square DF P-Value
Pearson 26.667 2 0.000
Likelihood Ratio 27.058 2 0.000
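The Pearson chi-square value in this output can be reproduced from the observed counts alone. The following is a minimal sketch rather than Minitab's implementation; the function name `chi_square_independence` is illustrative, and the closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math

# Observed counts: rows are groups 1 and 2; columns are
# New Procedure, Old Procedure, No Preference.
observed = [[100, 80, 20],
            [50, 120, 30]]

def chi_square_independence(table):
    """Return (chi2, df) for Pearson's chi-square test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2, df = chi_square_independence(observed)
# For exactly 2 degrees of freedom, the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(f"chi2 = {chi2:.3f}, df = {df}")  # chi2 = 26.667, df = 2
```

The expected counts computed inside the loop (75, 100, 25 for each row) match the second line of each row in the table.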
χ²-Test for Independence
Sample Problem 17
The data below summarize results from tests of the accuracy of
polygraphs. Use a significance level of α = 0.05 to test the claim that
whether the subject lies is independent of the polygraph indication. What
do the results suggest about the effectiveness of polygraphs?
Category                           Polygraph Indicated Truth    Polygraph Indicated Lie
Subject actually told the truth    65                           15
Subject actually told a lie        3                            17
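A worked sketch of the Pearson statistic for this table, using the standard expected-count formula (row totals 80 and 20, column totals 68 and 32, n = 100). The figures below are computed here and are not taken from a Minitab output:

```latex
\begin{aligned}
E_{ij} &= \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{n},
\qquad
\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \\
E_{11} &= \frac{80 \cdot 68}{100} = 54.4, \quad
E_{12} = \frac{80 \cdot 32}{100} = 25.6, \quad
E_{21} = \frac{20 \cdot 68}{100} = 13.6, \quad
E_{22} = \frac{20 \cdot 32}{100} = 6.4 \\
\chi^2 &= \frac{10.6^2}{54.4} + \frac{10.6^2}{25.6} + \frac{10.6^2}{13.6} + \frac{10.6^2}{6.4}
\approx 32.27,
\qquad df = (2-1)(2-1) = 1
\end{aligned}
```

With df = 1 the critical value at α = 0.05 is 3.841, so a statistic of this size would lead to rejecting independence.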
Sample Problem 18
A researcher wishes to determine whether there is a relationship
between the gender of an individual and the amount of alcohol
consumed. A sample of 68 people is selected and the following data
are obtained. At α = 0.05, can the researcher conclude that alcohol
consumption is related to gender?
Gender    Alcohol Consumption            Total
          Low     Moderate    High
Male      10      9           8          27
Female    13      16          12         41
Total     23      25          20         68
Sample Problem 19
A study was conducted to determine whether there is a relationship
between jogging and blood pressure. A random sample of 210
subjects was selected, and the subjects were classified as shown in
the table below. At α = 0.05, test the claim that jogging and blood
pressure are not related.
Chi-Square Test
Chi-Square DF P-Value
Pearson 6.789 2 0.034
Likelihood Ratio 6.955 2 0.031
Cramer’s Coefficient
C is a value that estimates the degree of relationship between
two variables (measured on at least a nominal scale) when the
measurements on these two variables are summarized in an
r x c contingency (frequency) table. The formula for the
C-coefficient is:
$$ C = \sqrt{\frac{\chi^2}{nL}} $$
where:
n = total number of subjects/units in the samples
L = minimum of r and c (the smaller of the two values)
Ex: if r = 3 and c = 4, then L = 3
For Sample Problem 19 (χ² = 6.789, n = 210, L = 2):
$$ C = \sqrt{\frac{\chi^2}{nL}} = \sqrt{\frac{6.789}{210(2)}} = \sqrt{\frac{6.789}{420}} = \sqrt{0.016164} \approx 0.1271 $$
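The coefficient as defined on the slide, with L = min(r, c), can be evaluated in a few lines. The sketch below also shows, for comparison, the widely used Cramér's V, which divides by n(min(r, c) - 1) instead; that second function is an addition and not part of the slide's definition. The table shape (r = 2, c = 3) is an assumption inferred from df = 2 in the Sample Problem 19 output.

```python
import math

def c_coefficient(chi2, n, r, c):
    """Coefficient as defined on the slide: sqrt(chi2 / (n * L)), L = min(r, c)."""
    return math.sqrt(chi2 / (n * min(r, c)))

def cramers_v(chi2, n, r, c):
    """Common textbook definition of Cramer's V: divides by n * (min(r, c) - 1)."""
    return math.sqrt(chi2 / (n * (min(r, c) - 1)))

# Sample Problem 19: chi2 = 6.789, n = 210; r = 2, c = 3 assumed from df = 2
slide_value = c_coefficient(6.789, 210, 2, 3)
v_value = cramers_v(6.789, 210, 2, 3)
print(round(slide_value, 4), round(v_value, 4))
```

Either way the value is small, which suggests a weak association between the two variables.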
Wilcoxon Rank-Sum Test for Two Independent
Samples (or Mann-Whitney U Test)
The Wilcoxon rank-sum test is a non-parametric test that uses
ranks of sample data from two independent populations.
It is used to test the null hypothesis that the two independent
samples come from populations with the same distribution. That
is, the two populations are identical. The alternative
hypothesis is the claim that the two population distributions are
different in some way.
Two samples are independent if the sample values selected from
one population are not related or somehow matched or paired
with the sample values from the other population.
It is used when the variables are measured on at least an ordinal scale.
The key idea underlying the Wilcoxon rank-sum test is this: If two
samples are drawn from identical populations and the individual
values are all ranked as one combined collection of values, then
the high and low ranks should fall evenly between the two
samples.
If the low ranks are found predominantly in one sample and the
high ranks are found predominantly in the other sample, we
suspect that the two populations are not identical.
Suppose we have samples from two populations, X and Y. The null
hypothesis is that X and Y have the same distribution (X and Y do
not differ). The alternative hypothesis H1 against which we test H0 is
that X is stochastically larger than Y, a directional hypothesis.
H1 is supported if the probability that a score from X is larger than a
score from Y is greater than one-half. That is, if X is one
observation from population X and Y is one observation from
population Y, then H1 is that:
P[X > Y] > ½
If the evidence supports H1, this implies that the “bulk” of the
elements of population X are larger than the bulk of the elements
of population Y. The corresponding null hypothesis is H0: P[X > Y] = ½.
Our hypothesis might instead be that Y is stochastically larger
than X.
In that case, the alternative hypothesis H1 would be that
P[X > Y] < ½, and the null hypothesis is still H0: P[X > Y] = ½.
Confirmation of H1 would imply that the bulk of Y is
larger than the bulk of X.
For a two-tailed test, i.e., for a prediction of differences that does
not state the direction of the differences, H1 would be that
P[X > Y] ≠ ½.
Another way of stating the directional alternative hypothesis is that
the median of X is greater than the median of Y, that is, H1: θX > θY.
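The quantity P[X > Y] in these hypotheses can be estimated from two samples as the share of all (x, y) pairs in which x exceeds y. The sketch below uses hypothetical scores purely for illustration; counting ties as one half is a common convention and an assumption here, and the function name `prob_x_greater` is illustrative.

```python
from itertools import product

def prob_x_greater(xs, ys):
    """Share of (x, y) pairs with x > y; a tied pair counts as one half."""
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x, y in product(xs, ys))
    return wins / (len(xs) * len(ys))

# Hypothetical scores, for illustration only (not from the sample problems)
xs = [12, 15, 18]
ys = [10, 14, 16, 11]
estimate = prob_x_greater(xs, ys)
print(estimate)  # 9 of the 12 pairs have x > y, so 0.75
```

An estimate well above ½ points toward X being stochastically larger than Y; the rank-sum test decides whether such a departure from ½ is larger than chance would produce.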
Step 1: State the hypothesis.
H0: The populations do not differ.
Let X = the responses or scores from the population from which the
smaller sample is drawn, and Y = the responses or scores from the
population from which the larger sample is drawn.
H0 can also be stated as:
H0: The chance that X is larger than Y is the same as the
chance that X is smaller than Y.
Ha: Most of the Xs are larger than the Ys
(X is stochastically larger than Y),
that is, P[X > Y] > ½.
Ha: Most of the Xs are smaller than the Ys
(X is stochastically smaller than Y),
that is, P[X > Y] < ½.
Ha: P[X > Y] ≠ ½
(either most of the Xs are larger than the Ys or
most of the Xs are smaller than the Ys)
Step 2: Find the critical value. At α = __, use the Wilcoxon rank-sum table.
Step 3: Compute the test value.
Use the test statistic:
Wx = sum of the ranks of the smaller sample
Let m = sample size of smaller sample (Xs)
n = sample size of larger sample (Ys)
(m + n) = total sample size
Rank the responses or scores for both samples (Xs & Ys)
together in one order; rank 1 is assigned to the lowest
response and rank (m+n) to the highest response.
Example: m = 4 Xs and n = 6 Ys. Rank all observations 1
to 10.
Step 4: Decision rule: H0: P[X > Y] = ½
For Ha: P[X > Y] > ½, reject H0 if Wx is large, that is, if P[Wx ≥ observed Wx] ≤ α.
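For small samples, the probability P[Wx ≥ observed Wx] in the decision rule can be found by enumeration: under H0 every set of m ranks out of 1 to m + n is equally likely to belong to the Xs. The sketch below assumes there are no ties; the function name `exact_upper_p` is illustrative, and the tiny case m = 2, n = 2 is used only because it can be checked by hand.

```python
from itertools import combinations

def exact_upper_p(m, n, observed_w):
    """P[Wx >= observed_w] under H0, assuming no ties among the m + n ranks."""
    ranks = range(1, m + n + 1)
    # Every choice of m ranks for the Xs is equally likely under H0.
    sums = [sum(subset) for subset in combinations(ranks, m)]
    return sum(1 for s in sums if s >= observed_w) / len(sums)

# With m = 2 Xs and n = 2 Ys, the six possible rank sums are 3, 4, 5, 5, 6, 7.
p = exact_upper_p(2, 2, 7)
print(p)  # only one of the six subsets reaches 7
```

For the m = 4, n = 6 example above, the same function would enumerate the 210 possible rank assignments. Published rank-sum tables are built from this kind of enumeration.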
Sample Problem 20
Suppose an extension worker would like to compare the level of
adoption of a certain technology by farmers from two barangays. With
the following results, make a generalization at α = 0.05.
Barangay Rating
Barangay 1 10 50 45 30 40 60
Barangay 2 20 75 70 55 65
Step 1: State the hypothesis.
H0: The level of adoption of farmers in the two barangays is
the same.
Ha: There is a difference between the levels of adoption of
farmers in the two barangays.
Step 2: Find the critical value. At α = 0.05, use a two-tailed Wilcoxon
rank-sum test.
Step 3: Compute the test value. Rank all ratings 1 to 11.
m = 5 (smaller group; barangay 2)
n = 6 (larger group, barangay 1)
x = response of farmers in barangay 2 or smaller sample
Test
Null hypothesis H₀: η₁ - η₂ = 0
Alternative hypothesis H₁: η₁ - η₂ ≠ 0
W-Value P-Value
27.00 0.121
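The ranking step and the reported values can be retraced with the ratings from the problem. In the sketch below, the p-value is obtained from a normal approximation with a continuity correction; that is an assumption about how the reported figure was produced, not something stated in the output.

```python
import math

barangay1 = [10, 50, 45, 30, 40, 60]
barangay2 = [20, 75, 70, 55, 65]

# Rank all 11 ratings together (there are no ties in these data).
combined = sorted(barangay1 + barangay2)
rank = {value: i + 1 for i, value in enumerate(combined)}

w1 = sum(rank[v] for v in barangay1)  # rank sum of Barangay 1
w2 = sum(rank[v] for v in barangay2)  # rank sum of Barangay 2 (the smaller sample)

n1, n2 = len(barangay1), len(barangay2)
mean_w1 = n1 * (n1 + n2 + 1) / 2
sd_w1 = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
# Continuity correction: shrink the distance from the mean by 0.5.
z = (abs(w1 - mean_w1) - 0.5) / sd_w1
p_two_sided = math.erfc(z / math.sqrt(2))
print(w1, w2, round(p_two_sided, 3))  # 27 39 0.121
```

The W-Value of 27 in the output is the rank sum of Barangay 1; the statistic Wx as defined above (the smaller sample, Barangay 2) is 39. Both lead to the same two-tailed p-value, and since 0.121 > 0.05 the null hypothesis is not rejected.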
Sample Problem 21
The data below shows the Flesch Reading Ease scores for randomly
selected pages from each of two books: Harry Potter and the
Sorcerer’s Stone by J. K. Rowling and War and Peace by Leo
Tolstoy. Use the two sets of independent sample data with α = 0.05
significance level to test the claim that reading scores for pages from
the two books have the same distribution.
Rowling 85.3 84.3 79.5 82.5 80.2 84.6 79.2 70.9 78.6 86.2 74.0 83.7 71.4
Tolstoy 69.4 64.2 71.4 71.6 68.5 51.9 72.2 74.4 52.8 58.4 65.4 71.6
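These data contain ties (71.4 appears in both samples and 71.6 appears twice), so each tied value is usually given the average of the ranks it occupies. The following sketch computes the rank sums under that midrank convention, which is an assumption here since the procedure above does not say how ties are handled; the helper name `midranks` is illustrative.

```python
rowling = [85.3, 84.3, 79.5, 82.5, 80.2, 84.6, 79.2,
           70.9, 78.6, 86.2, 74.0, 83.7, 71.4]
tolstoy = [69.4, 64.2, 71.4, 71.6, 68.5, 51.9,
           72.2, 74.4, 52.8, 58.4, 65.4, 71.6]

def midranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    positions = {}
    for i, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(i)
    return {v: sum(p) / len(p) for v, p in positions.items()}

ranks = midranks(rowling + tolstoy)
w_rowling = sum(ranks[v] for v in rowling)
w_tolstoy = sum(ranks[v] for v in tolstoy)
print(w_rowling, w_tolstoy)  # 236.5 88.5
```

The two rank sums add up to 25(26)/2 = 325, a quick check that every observation was ranked exactly once.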
Sample Problem 22
Two independent samples of army and marine recruits are selected
and the time in minutes it takes each recruit to complete an obstacle
course is recorded as shown in the table below. At α = 0.05, is there
a difference in the times it takes the recruits to complete the course?
Army 15 18 16 17 13 22 24 17 19 21 26 6
Marines 14 9 16 19 10 12 11 8 15 18 25
Class Activity 8: Application of appropriate
statistical analysis to test the hypothesis
Many students have had the unpleasant experience of panicking on
a test because the first question was exceptionally difficult. The
arrangement of test items was studied for its effect on anxiety. The
following scores are measures of “debilitating test anxiety,” which
most of us call panic or blanking out. Based on the data gathered, is
there sufficient evidence to support the claim that the two
populations of scores have the same mean? Is there sufficient
evidence to support the claim that the arrangement of the test items
has an effect on the score? Perform the test at the α = 0.05 level of
significance.
Questions Arranged from Easy to Difficult | Questions Arranged from Difficult to Easy
24.64 39.29 16.32 32.83 28.02 | 33.62 34.02 26.63 30.26
33.31 20.60 21.13 26.69 28.90 | 35.91 26.68 29.49 35.32
26.43 24.23 7.10 32.86 21.06 | 27.24 32.34 29.34 33.53
28.89 28.71 31.73 30.02 21.96 | 27.62 42.91 30.20 32.54
25.49 38.81 27.85 30.29 30.72 |