
STAT609 SP23 LCN Unit3


Unit 3

STATISTICAL INFERENCE

Chapter 9
Hypothesis Testing

© 2017 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
9-1 Introduction

 In hypothesis testing, an analyst collects sample data
and checks whether the data provide enough evidence
to support a theory, or hypothesis.
 The hypothesis that an analyst is attempting to prove is
called the alternative hypothesis.
❑ It is also frequently called the research hypothesis.
 The opposite of the alternative hypothesis is called the
null hypothesis.
❑ It usually represents the current thinking or status quo.
❑ That is, it is usually the accepted theory that the analyst is
trying to disprove.
 The burden of proof is on the alternative hypothesis.
9-2 Concepts in Hypothesis Testing

 There are a number of concepts behind hypothesis
testing, all of which lead to the key concept of
statistical significance.
 Example 9.1 provides context for the discussion of
these concepts.

Example 9.1: A New Pizza Style at Pepperoni
Pizza Restaurant

 The manager of Pepperoni Pizza Restaurant has recently
begun experimenting with a new method of baking pizzas.
 He would like to base the decision whether to switch from
the old method to the new method on customer reactions, so
he performs an experiment.
 For 100 randomly selected customers who order a
pepperoni pizza for home delivery, he includes both an old-
style and a free new-style pizza.
 He asks the customers to rate the difference between the
pizzas on a -10 to +10 scale, where -10 means that they
strongly favor the old style, +10 means they strongly favor
the new style, and 0 means they are indifferent between the
two styles.
 How might he proceed by using hypothesis testing?
9-2a Null and Alternative Hypotheses

 The manager would like to prove that the new method
provides better-tasting pizza, so this becomes the
alternative hypothesis.
❑ The opposite, that the old-style pizzas are at least as good as the
new-style pizzas, becomes the null hypothesis.
 He judges which of these is true on the basis of the mean
rating over the entire customer population, labeled μ.
❑ If it turns out that μ≤ 0, the null hypothesis is true.
❑ If μ> 0, the alternative hypothesis is true.
 Usually, the null hypothesis is labeled H0, and the alternative
hypothesis is labeled Ha.
❑ In our example, they can be specified as H0:μ≤ 0 and Ha:μ> 0.
❑ The null and alternative hypotheses divide all possibilities into two
nonoverlapping sets, exactly one of which must be true.

9-2b One-Tailed versus Two-Tailed Tests

 A one-tailed alternative is one that is supported only
by evidence in a single direction.
 A two-tailed alternative is one that is supported by
evidence in either of two directions.
 Once hypotheses are set up, it is easy to detect whether
the test is one-tailed or two-tailed.
❑ One-tailed alternatives are phrased in terms of “<” or “>”.
❑ Two-tailed alternatives are phrased in terms of “≠”.

 The pizza manager’s alternative hypothesis is one-tailed
because he is trying to prove that the new-style
pizza is better than the old-style pizza.

9-2c Types of Errors

 Regardless of whether the manager decides to accept or reject the
null hypothesis, it might be the wrong decision.
❑ He might incorrectly reject the null hypothesis when it is true, or he might
incorrectly accept the null hypothesis when it is false; these are
respectively called type I and type II errors.
❑ A type I error occurs when you incorrectly reject a null hypothesis that is
true.
❑ A type II error occurs when you incorrectly accept a null hypothesis that
is false.
 The traditional hypothesis-testing procedure favors caution in terms
of rejecting the null hypothesis. Given this rather conservative way of
thinking, you are inclined to accept the null hypothesis unless the
sample evidence provides strong support for the alternative
hypothesis.

9-2d Significance Level and Rejection Region

 To decide how strong the evidence in favor of the alternative hypothesis
must be to reject the null hypothesis, one approach is to prescribe the
probability of a type I error that you are willing to tolerate.
❑ This type I error probability is usually denoted by α and is most commonly set
equal to 0.05.
❑ The value of α is called the significance level of the test.
 The rejection region is the set of sample data that leads to the rejection of
the null hypothesis.
❑ The significance level, α, determines the size of the rejection region.
❑ Sample results in the rejection region are called statistically significant at the α
level.
 It is important to understand the effect of varying α:
❑ If α is small, such as 0.01, the probability of a type I error is small, and a lot of
sample evidence in favor of the alternative hypothesis is required before the
null hypothesis can be rejected.
❑ When α is larger, such as 0.10, the rejection region is larger, and it is easier to
reject the null hypothesis.
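The effect of varying α on the rejection region can be sketched numerically. The snippet below is a minimal illustration only: it assumes a right-tailed test whose standardized statistic is approximately standard normal (the t tests later in this unit use t critical values instead), and the function name `right_tail_cutoff` is made up for this example.

```python
from statistics import NormalDist

def right_tail_cutoff(alpha):
    """Cutoff c with P(Z > c) = alpha for a standard normal Z (assumed test statistic)."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.01, 0.05, 0.10):
    # H0 is rejected when the standardized statistic falls above the cutoff.
    print(f"alpha = {alpha:.2f}: reject H0 if z > {right_tail_cutoff(alpha):.3f}")
```

A larger α gives a smaller cutoff, so the rejection region grows and the null hypothesis is easier to reject.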

9-2e Significance from p-values

 A second approach is to avoid the use of a significance
level and instead simply report how significant the
sample evidence is.
❑ This approach is currently more popular.
❑ It is done by means of a p-value.
◼ The p-value is the probability of seeing a random sample at least
as extreme as the observed sample, given that the null hypothesis is
true.
◼ The smaller the p-value, the more evidence there is in favor of the
alternative hypothesis.
❑ Sample evidence is statistically significant at the
α level only if the p-value is less than α.
◼ The advantage of the p-value approach is that you don’t have to
choose a significance value α ahead of time, and p-values are
included in virtually all statistical software output.
9-2f Type II Errors and Power

 A type II error occurs when the alternative hypothesis is
true but there isn’t enough evidence in the sample to
reject the null hypothesis.
❑ This type of error is traditionally considered less important
than a type I error, but it can lead to serious consequences
in real situations.
 The power of a test is 1 minus the probability of a type
II error.
❑ It is the probability of rejecting the null hypothesis when the
alternative hypothesis is true.
❑ There are several ways to achieve high power, the most
obvious of which is to increase sample size.
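The link between sample size and power can be illustrated with a short calculation. This is a sketch under idealized assumptions that are not part of the slides: a right-tailed z test, a normal population, and a known standard deviation. The function name and all numbers are hypothetical.

```python
from statistics import NormalDist

def power_right_tailed_z(true_mean, mu0, sigma, n, alpha=0.05):
    """Power of a right-tailed z test of H0: mu <= mu0 when the mean is really true_mean.

    Assumes a normal population with known sigma (an idealization).
    """
    z = NormalDist()
    cutoff = z.inv_cdf(1 - alpha)                   # reject H0 when the z statistic exceeds this
    shift = (true_mean - mu0) / (sigma / n ** 0.5)  # mean of the z statistic under true_mean
    return 1 - z.cdf(cutoff - shift)

# Hypothetical setting: true mean rating 1, null value 0, sigma 5.
for n in (25, 100, 400):
    print(n, round(power_right_tailed_z(1.0, 0.0, 5.0, n), 3))
```

With the effect size held fixed, the printed power rises with n. When the true mean equals the null value, the function returns α, the type I error probability.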

9-2g Hypothesis Tests and Confidence Intervals

 The results of hypothesis tests are often
accompanied by confidence intervals.
❑ This provides two complementary ways to interpret the
data.
❑ There is also a more formal connection between the
two, at least for two-tailed tests.
◼ When using a confidence interval to perform a two-tailed
hypothesis test, reject the null hypothesis if and only if the
confidence interval for the parameter does not contain the
hypothesized value.
◼ When the confidence interval contains the hypothesized
value, do not reject the null hypothesis.
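The equivalence between a two-tailed test and a confidence interval can be checked on made-up numbers. The sketch below assumes a large-sample normal approximation (z instead of t) to stay self-contained. The same logic applies with t critical values. The function name and the sample summary are hypothetical.

```python
from statistics import NormalDist

def two_tailed_z_test(xbar, s, n, mu0, alpha=0.05):
    """Return (reject_h0, confidence_interval) using a large-sample normal approximation."""
    z = NormalDist()
    se = s / n ** 0.5                        # standard error of the sample mean
    z_crit = z.inv_cdf(1 - alpha / 2)        # two-tailed critical value
    z_stat = (xbar - mu0) / se
    p_value = 2 * (1 - z.cdf(abs(z_stat)))
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    return p_value < alpha, ci

# Hypothetical sample summary: mean 52.0, standard deviation 10.0, n = 100.
for mu0 in (49.0, 50.5, 54.5):
    reject, (lo, hi) = two_tailed_z_test(52.0, 10.0, 100, mu0)
    inside = lo <= mu0 <= hi
    print(f"mu0 = {mu0}: CI = ({lo:.2f}, {hi:.2f}), contains mu0: {inside}, reject H0: {reject}")
```

In each case the null hypothesis is rejected exactly when the hypothesized value falls outside the interval.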

9-2h Practical versus Statistical Significance

 Statistically significant results are those that produce
sufficiently small p-values.
❑ In other words, statistically significant results are those that
provide strong evidence in support of the alternative
hypothesis.
 Such results are not necessarily significant in terms of
importance. They might be significant only in the
statistical sense.
 With large sample sizes, even a very small effect can be statistically
significant without being practically significant.
 By contrast, with small samples, results may not be
statistically significant even if they would be of practical
significance.
9-3 Hypothesis Tests for a Population Mean

 As with confidence intervals, the key to the analysis is
the sampling distribution of the sample mean.
 If you subtract the true mean from the sample mean
and divide the difference by the standard error, the
result has a t distribution with n – 1 degrees of
freedom.
❑ In a hypothesis-testing context, the true mean to use is the
null hypothesis value, specifically, the borderline value
between the null and alternative hypotheses.
◼ This value is usually labeled μ0.
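 To run the test, referred to as the t test for a
population mean, you calculate the test statistic
$t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$, where $\bar{X}$ is the sample mean, $s$ is the
sample standard deviation, and $n$ is the sample size.
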
9-3 Hypothesis Tests for a Population Mean

P-Value:
▪ The p-value is also called the observed level of significance.
▪ It is the smallest α level at which 𝐻0 can be rejected.
▪ If p-value < α, reject 𝐻0
▪ If p-value ≥ α, do not reject 𝐻0

Ha                  P-value
$\mu > \mu_0$       $P(T > t)$
$\mu < \mu_0$       $P(T < t)$
$\mu \neq \mu_0$    $2P(T > |t|)$

Example 9.1: Undergraduate Study Habits
(slide 1 of 2)

A recent study asserts that, over the past five decades, the number
of hours that the average college student studies each week has
been steadily dropping (The Boston Globe, July 4, 2010). In 1961,
students invested 24 hours per week in their academic pursuits,
whereas today’s students study an average of 14 hours per week.
The dean randomly selects 35 students and asks their average study
time per week (in hours). From their responses, she calculates a
sample mean of 16.3714 hours and a sample standard deviation
of 7.2155 hours. The dean would also like to test if the mean study
time of students at her university differs from today’s national
average of 14 hours per week. At the 5% significance level, what is
the conclusion to this test?

Example 9.1: Undergraduate Study Habits
(slide 2 of 2)

 Step 1: Specify the null and alternative hypotheses.
❑ $H_0: \mu = 14$ hours
❑ $H_a: \mu \neq 14$ hours
 Step 2: Specify the significance level: $\alpha = 0.05$.
 Step 3: Calculate the test statistic and the p-value.
❑ $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{16.3714 - 14}{7.2155/\sqrt{35}} = 1.94$
❑ For a two-tailed test, the p-value is $2P(T \ge 1.94)$ with $df = n - 1 = 34$.
◼ From the t table, $0.05 < 2P(T \ge 1.94) < 0.10$; the exact value is 0.0602.

 Step 4: State the conclusion and interpret the results.
❑ Since 0.0602 > 0.05, do not reject the null hypothesis.
❑ At the 5% significance level, we cannot conclude that the mean study time of
students at this large university in California differs from 14 hours per week.
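The four steps above, together with the p-value rules for the three forms of alternative, can be sketched in code. This is an illustration rather than the SPSS procedure: the helper names are invented, and the t tail probability is approximated by numerically integrating the t density with Simpson's rule so that only the Python standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=20000):
    """P(T > t), found by integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3  # by symmetry, P(T > |t|) = 0.5 - P(0 < T < |t|)
    return tail if t >= 0 else 1 - tail

def t_test_p_value(t, df, alternative):
    """p-value for an alternative of type 'greater', 'less', or 'two-sided'."""
    if alternative == "greater":
        return t_upper_tail(t, df)
    if alternative == "less":
        return 1 - t_upper_tail(t, df)
    if alternative == "two-sided":
        return 2 * t_upper_tail(abs(t), df)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Study-habits example: sample mean 16.3714, s = 7.2155, n = 35, mu0 = 14.
xbar, s, n, mu0 = 16.3714, 7.2155, 35, 14
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_value = t_test_p_value(t_stat, n - 1, "two-sided")
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Under these assumptions the printed values should agree with the slide (t of about 1.94 and a two-tailed p-value of about 0.06), which leads to the same decision not to reject the null hypothesis at the 5% level.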

Example 9.2: A New Pizza Style at Pepperoni
Pizza Restaurant (slide 1 of 5)

 Objective: To use a one-sample t test to see whether
consumers prefer the new-style pizza to the old style.
Use 5% significance level (𝛼 = 0.05).
 The file name: Pizza Ratings
 Solution: The ratings for the 40 randomly selected
customers are shown on the next slide.
 Calculate the test statistic using the borderline null
hypothesis value 𝜇0 = 0, and report how much
probability is beyond it in the right tail of the
appropriate t distribution.
❑ The right tail is appropriate because the alternative is one-
tailed of the “greater than” variety.

Example 9.2: A New Pizza Style at Pepperoni Pizza
Restaurant (slide 3 of 5)

SPSS Steps:
From the menus choose:
• Analyze → Compare Means → One-Sample T Test...
• Click Rating and move it into the Test Variable(s) field.
• Click in the Test Value box and enter the value to compare against. In
this example, enter the hypothesized value of 0 as the Test Value.


SPSS Output:

Example 9.2: A New Pizza Style at Pepperoni
Pizza Restaurant (slide 4 of 5)

 t = 2.816, p-value = 0.004.
 Since 0.004 < 0.05, we reject the null hypothesis.
 The t-value indicates the sample mean is slightly more
than 2.8 standard errors to the right of the null value,
which provides a lot of evidence in favor of the
alternative.
 The manager of Pepperoni Pizza Restaurant can
conclude that the alternative hypothesis is true—and
presumably switch to the new-style pizza.
 If the alternative is still one-tailed but of the “less than”
variety, the analysis remains virtually unchanged.

Example 9.2: A New Pizza Style at Pepperoni
Pizza Restaurant (slide 5 of 5)

 A two-tailed test would be relevant if the pizza
manager were deciding which of two new-style
pizzas to switch to.
❑ Calculate the same t-statistic, but because of the two-
tailed nature of the test, the previous p-value is
doubled.
❑ The p-value is still less than .05, and the null hypothesis
can be rejected. The manager can conclude the mean
ratings of the two new pizzas are not the same.

9-4 Hypothesis Tests for Other Parameters

 Just as we developed confidence intervals for a
variety of parameters, we can develop hypothesis
tests for other parameters.
 In each case, the sample data are used to calculate
a test statistic that has a well-known sampling
distribution.
 Then a corresponding p-value measures the support
for the alternative hypothesis.

9-4a Hypothesis Tests for a Population Proportion

 To test a population proportion p, recall that the sample proportion $\hat{p}$
has a sampling distribution that is approximately normal when the
sample size is reasonably large.
❑ Specifically, the distribution of the standardized value
$\dfrac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$
is approximately normal with mean 0 and standard deviation 1.
 This leads to the following z test for a population proportion.
❑ Let $p_0$ be the borderline value of p between the null and alternative
hypotheses.
❑ Then $p_0$ is substituted for p to obtain the test statistic:
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$

9-4a Hypothesis Tests for a Population Proportion

P-Value:
▪ If p-value < α, reject 𝐻0
▪ If p-value ≥ α, do not reject 𝐻0

Ha               P-value
$p > p_0$        $P(Z > z)$
$p < p_0$        $P(Z < z)$
$p \neq p_0$     $2P(Z > |z|)$

Example 9.3: Customer Complaints at Walpole
Appliance Company (slide 1 of 2)

 Objective: To use a test for a proportion to see
whether a new process of responding to complaint
letters results in an acceptably low proportion of
unsatisfied customers.
 Solution: The manager’s goal is to reduce the
proportion of unsatisfied customers after 30 days
from 0.15 to 0.075 or less.
 With the new process in place, the manager has
tracked 400 letter writers and found that 23 of
them are “unsatisfied” after 30 days.
 Use 10% significance level (𝛼 = 0.10).
Example 9.3: Customer Complaints at Walpole
Appliance Company (slide 2 of 2)

The hypotheses are $H_0: p = 0.075$ and $H_a: p < 0.075$.
Sample proportion unsatisfied: $\hat{p} = \dfrac{23}{400} = 0.0575$
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \dfrac{0.0575 - 0.075}{\sqrt{0.075(1 - 0.075)/400}} = -1.33$
The p-value is $P(Z < -1.33) = 0.092$.
Because the p-value of 0.092 is less than $\alpha = 0.10$, we reject the
null hypothesis. Therefore, at the 10% significance level, there is evidence
that the manager has achieved her goal: the proportion of
unsatisfied customers after 30 days is less than 0.075.
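The Walpole calculation can be reproduced with a few lines of code. The function below is a sketch of the z test for a proportion, and its name and interface are invented for this illustration. It relies on the normal approximation to the sampling distribution of the sample proportion.

```python
from statistics import NormalDist

def proportion_z_test(count, n, p0, alternative):
    """z statistic and p-value for a test of a population proportion against p0."""
    p_hat = count / n
    z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
    cdf = NormalDist().cdf
    if alternative == "less":
        p_value = cdf(z)
    elif alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value

# Walpole example: 23 unsatisfied customers out of 400, p0 = 0.075, left-tailed test.
z, p_value = proportion_z_test(23, 400, 0.075, "less")
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.10 else "do not reject H0")
```

Note that the same p-value of about 0.092 would not lead to rejection at the 5% level, which shows how the conclusion depends on the chosen α.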

9-4b Hypothesis Tests for Differences between
Population Means

 The comparison problem, where the difference between two
population means is tested, is one of the most important problems
analyzed with statistical methods.
❑ The form of the analysis depends on whether the two samples are
independent or paired.
 If the samples are paired, then the test is referred to as the t test for
difference between means from paired samples.
• A common case of dependent sampling is matched pairs.
❑ Samples are paired or matched in some way.
❑ Comparison is between “apples” and “apples.”
 “Before” and “after” studies.
❑ A measurement, then an intervention, then another measurement.

9-4b Hypothesis Tests for Differences between
Population Means

 Example: measuring the weight of clients before and after a diet plan
 A pairing of observations.
❑ Not the same individual who gets sampled twice

❑ Example: 20 adjacent plots of land using a nonorganic fertilizer
on one half of the plot and an organic fertilizer on the other
 Let 𝐷0 be a hypothesized difference for 𝜇𝐷 .
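 The competing hypotheses will be one of the following:
$H_0: \mu_D = D_0$ (or $\mu_D \le D_0$) vs. $H_a: \mu_D > D_0$
$H_0: \mu_D = D_0$ (or $\mu_D \ge D_0$) vs. $H_a: \mu_D < D_0$
$H_0: \mu_D = D_0$ vs. $H_a: \mu_D \neq D_0$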
 Test statistic: $t = \dfrac{\bar{D} - D_0}{s_D/\sqrt{n}}$
 This test is equivalent to computing the differences between the paired
items and applying a one-sample t test to them.
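The reduction of the paired test to a one-sample test on the differences can be sketched as follows. The data are hypothetical before-and-after weights invented for this illustration, and the function name is not from the slides.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second, d0=0.0):
    """t statistic for paired samples: a one-sample t statistic on the differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = (mean(diffs) - d0) / (stdev(diffs) / sqrt(n))  # stdev is the sample standard deviation
    return t, n - 1  # statistic and degrees of freedom

# Hypothetical weights (kg) of six clients before and after a diet plan.
before = [82.0, 91.5, 77.3, 88.1, 95.0, 79.4]
after = [80.1, 90.0, 77.9, 85.2, 92.3, 78.8]
t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Only the differences enter the calculation, so variation between clients that is common to both measurements drops out of the comparison.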

Example 9.4: Measuring the Effects of Soft-Drink
Cans (slide 1 of 4)

 Objective: To use paired-sample t tests for differences
between means to see whether consumers rate the
attractiveness, and their likelihood to purchase, higher
for a new-style can than for the traditional-style can.
The file name: Soft-Drink Cans
 Solution: Randomly selected customers are asked to
rate each of the following on a scale of 1 to 7:
❑ Attractiveness of the traditional-style can (AO)
❑ Attractiveness of the new-style can (AN)
❑ Likelihood of purchasing a product with the traditional-style
can (WBO)
❑ Likelihood of purchasing a product with the new-style can
(WBN)

Example 9.4: Measuring the Effects of Soft-Drink
Cans (slide 2 of 4)

Let
$\mu_1$: mean attractiveness rating of the traditional-style can (AO)
$\mu_2$: mean attractiveness rating of the new-style can (AN)
Test whether consumers rate the attractiveness of the new design higher than
the attractiveness of the current design.
Let $\mu_D$ = the mean difference between the two can styles ($\mu_D = \mu_1 - \mu_2$).
Hypotheses: $H_0: \mu_D = 0$, $H_a: \mu_D < 0$.
Use the 5% significance level ($\alpha = 0.05$).
SPSS Steps:
From the menus choose:
• Analyze → Compare Means → Paired-Samples T Test
• Select AO as Variable 1 and AN as Variable 2

Example 9.4: Measuring the Effects of Soft-Drink
Cans (slide 3 of 4)

$t = \dfrac{\bar{D} - D_0}{s_D/\sqrt{n}} = \dfrac{-0.539 - 0}{1.351/\sqrt{180}} = -5.351$
p-value = 0.00000013232 < 0.001, so reject $H_0$.
We conclude that there is overwhelming
evidence that consumers, on average,
rate the attractiveness of the new
design higher than the attractiveness of
the current design.

Example 9.4: Measuring the Effects of Soft-Drink
Cans (slide 4 of 4)

SPSS Output:

A 95% confidence interval for the mean difference extends from -0.738
to -0.340. Note that this 95% confidence interval does not include the
hypothesized value 0, so we reject H0. This is consistent with the fact that
the two-tailed p-value is less than 0.05. (Recall the relationship between
confidence intervals and two-tailed hypothesis tests from Section 9-2g.)

9-4b Hypothesis Tests for Differences between
Population Means

 If the samples are independent, the test is referred to as the t test for
difference between means from independent samples.
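One- and Two-Sided Tests
$H_0: \mu_1 - \mu_2 = D_0$ (or $\mu_1 - \mu_2 \le D_0$) vs. $H_a: \mu_1 - \mu_2 > D_0$
$H_0: \mu_1 - \mu_2 = D_0$ (or $\mu_1 - \mu_2 \ge D_0$) vs. $H_a: \mu_1 - \mu_2 < D_0$
$H_0: \mu_1 - \mu_2 = D_0$ vs. $H_a: \mu_1 - \mu_2 \neq D_0$
❑ Test statistic for the independent-samples test of the difference between means when
equal variances are assumed ($\sigma_1^2 = \sigma_2^2$):
$t = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
❑ The t-value follows a t distribution with degrees of freedom (df) equal to $n_1 + n_2 - 2$,
where the pooled standard deviation $s_p$ is given by:
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$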

9-4b Hypothesis Tests for Differences between
Population Means

 P-value

Ha                            P-value
$\mu_1 - \mu_2 > D_0$         $P(T > t)$
$\mu_1 - \mu_2 < D_0$         $P(T < t)$
$\mu_1 - \mu_2 \neq D_0$      $2P(T > |t|)$

Example 9.5: Designing a promotional Web
page to increase online sales (slide 1 of 4)

A marketing team designed a promotional Web page to increase
online sales. Visitors to www.name-of-this-company.com were
randomly directed to the old page or the new page. During this
A/B test, 300 visitors to the site were randomly assigned. The
131 visitors who were directed to the new page spent $\bar{X}_{new} = \$328$
on average ($s_{new} = \$161$); the remaining 169 visitors, who were directed
to the old page, spent $\bar{X}_{old} = \$253$ on average ($s_{old} = \$155$).
(Assume that these samples are large enough to satisfy the sample size
condition and that the population variances are equal.)
Does the new page generate statistically significantly higher sales
than the old page? State the null hypothesis and whether it’s
rejected.

Example 9.5: Designing a promotional Web
page to increase online sales (slide 2 of 4)

The hypotheses are:
$H_0: \mu_{new} - \mu_{old} = 0$ (i.e., $H_0: \mu_{new} = \mu_{old}$)
$H_a: \mu_{new} - \mu_{old} > 0$ (i.e., $H_a: \mu_{new} > \mu_{old}$)
The pooled standard deviation, $s_p$:
$s_p = \sqrt{\dfrac{(131 - 1)161^2 + (169 - 1)155^2}{131 + 169 - 2}} = 157.646$
The t-statistic:
$t = \dfrac{328 - 253}{157.646\sqrt{\dfrac{1}{131} + \dfrac{1}{169}}} = 4.09$
p-value = $P(T > 4.09) = 0.000056$ (by SPSS). The p-value is much less
than 0.05, so we reject $H_0$.
Conclusion:
We conclude that the new page generates statistically significantly higher sales
than the old page.
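The pooled standard deviation and the t statistic for this example can be checked from the summary statistics alone. The sketch below uses an invented function name and computes only the statistic and its degrees of freedom. The p-value quoted on the slide comes from SPSS and is not recomputed here.

```python
from math import sqrt

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled standard deviation, t statistic, and df for two independent samples.

    Assumes equal population variances, as the example states.
    """
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    t = ((mean1 - mean2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t, n1 + n2 - 2

# New page: mean 328, s = 161, n = 131. Old page: mean 253, s = 155, n = 169.
sp, t, df = pooled_t_statistic(328, 161, 131, 253, 155, 169)
print(f"sp = {sp:.3f}, t = {t:.2f}, df = {df}")
```

The results should agree with the hand calculation above (a pooled standard deviation near 157.646 and a t statistic near 4.09, with 298 degrees of freedom).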
Example 9.5: Designing a promotional Web
page to increase online sales (slide 3 of 4)

SPSS Steps:
From the menus choose:
• Analyze → Compare Means → Summary Independent-Samples T Test.
• Complete the dialog box as shown.

Example 9.5: Designing a promotional Web
page to increase online sales (slide 4 of 4)

 SPSS Output:

Example 9.6: Productivity Due to Exercise at
Informatrix (slide 1 of 3)

 Objective: To use a two-sample t test for the difference between
means to see whether regular exercise increases worker productivity.
 Solution: Informatrix Software Company installed exercise
equipment on site a year ago and wants to know if it has had an
effect on productivity.
 The company gathered data on a sample of 80 randomly chosen
employees: 23 used the exercise facility regularly, 6 exercised
regularly elsewhere, and 51 admitted to being nonexercisers.
 The 51 nonexercisers were compared to the 29 exercisers based on
the employees’ productivity over the year, as rated by their
supervisors on a scale of 1 to 25, 25 being the best.
 The file name: Exercise & Productivity

Example 9.6: Productivity Due to Exercise at
Informatrix (slide 2 of 3)

SPSS Steps:
From the menus choose:
• Analyze → Compare Means → Independent-Samples T Test.
• Select Rating and move it into the Test Variable(s) field.
• Select Exerciser and move it into the Grouping Variable field.
• Click the Define Groups button. In the window displayed,
enter Yes and No for Groups 1 and 2, respectively.
 SPSS Output:

Example 9.6: Productivity Due to Exercise at
Informatrix (slide 3 of 3)

The hypotheses are:
$H_0: \mu_1 - \mu_2 \le 0$ vs. $H_a: \mu_1 - \mu_2 > 0$
where $\mu_1$ and $\mu_2$ are the mean ratings for the exerciser and non-exerciser
populations.
First check the equality of variances by testing
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \neq \sigma_2^2$.
Because the p-value = 0.129 > 0.05, we do not reject this $H_0$. Therefore, we run the
two-sample t test assuming equal variances.
The independent samples T test shows that the test statistic is 2.387, and the p-
value for a one-tailed test equals 0.009711 which is slightly less than 0.01.
We reject 𝐻0 and conclude that the exercisers perform better, in terms of mean
ratings, than non-exercisers.
A 95% confidence interval for this mean difference is all positive; it extends from
0.452 to 4.988.
9-5 Tests for Normality

 Many statistical procedures are based on the assumption that population
data are normally distributed.
 The tests that allow you to test this assumption are called tests for
normality.
 One of the most powerful tests is the Lilliefors test.
❑ This test is based on the cumulative distribution function (cdf), which
shows the probability of being less than or equal to any particular
value.
❑ Specifically, the Lilliefors test compares two cdfs: the cdf from a normal
distribution and the cdf corresponding to the given data.
◼ This latter cdf, called the empirical cdf, shows the fraction of
observations less than or equal to any particular value.
◼ If the maximum vertical distance between the two cdfs is sufficiently
large, the null hypothesis of normality can be rejected.
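The test statistic described above, the maximum vertical distance between the empirical cdf and a fitted normal cdf, can be sketched as follows. This is an illustration with hypothetical data and an invented function name. It computes only the distance: turning it into a p-value requires the Lilliefors critical values or statistical software, which is not attempted here.

```python
from statistics import NormalDist, mean, stdev

def lilliefors_statistic(data):
    """Maximum vertical distance between the empirical cdf and the fitted normal cdf."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # normal with the sample mean and std. deviation
    d = 0.0
    for i, x in enumerate(xs):
        f = fitted.cdf(x)
        # The empirical cdf jumps from i/n to (i + 1)/n at x, so check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical strip widths in centimeters.
widths = [9.98, 10.03, 9.95, 10.01, 10.07, 9.99, 10.02, 9.96, 10.00, 10.04]
print(f"D = {lilliefors_statistic(widths):.4f}")
```

A large value of D indicates that the data depart from the fitted normal distribution. A small value gives no reason to reject normality.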
Example 9.6: Distribution of Metal Strip Widths
in Manufacturing (slide 1 of 5)

◼ The hypotheses are
◼ H0: “The population is normal”
◼ Ha: “The population is not normal”
◼ If p-value > α, we do not reject H0; a normal population is a reasonable assumption.

 Objective: To use the Lilliefors test to see whether a normal
distribution of the metal strip widths is reasonable. The file name:
Testing Normality
 Solution: A company manufactures strips of metal that are supposed
to have width of 10 centimeters.
 For purposes of quality control, the manager plans to run some
statistical tests on these strips.

Example 9.6: Distribution of Metal Strip Widths
in Manufacturing (slide 2 of 5)
SPSS Steps:
From the menus choose:
• Analyze → Descriptive Statistics → Explore.
• Select Width and move it into the Dependent List field.
• Click Plots and choose Histogram and Normality plots with tests.
 The p-value of the Lilliefors test is 0.200, which is greater than
𝛼 = 0.05, so we fail to reject the null hypothesis that the
width is normally distributed.
 The histogram below confirms that the normal fit to the data
appears to be quite good.

Example 9.6: Distribution of Metal Strip Widths
in Manufacturing (slide 3 of 5)
SPSS Output:

Example 9.6: Distribution of Metal Strip Widths
in Manufacturing (slide 4 of 5)

 A popular, but informal, test of normality is the quantile-quantile
(Q-Q) plot.
❑ It is basically a scatterplot of the standardized values from the
data set versus the values that would be expected if the data
were perfectly normally distributed.
SPSS Steps:
From the menus choose:
• Analyze → Descriptive Statistics → Q-Q Plots.
• Select Width and move it into the Variables field.
• Select Normal as the test distribution.
 The Q-Q plot for the Width data appears in the next slide.

Example 9.6: Distribution of Metal Strip Widths
in Manufacturing (slide 5 of 5)

 Although the points in this Q-Q plot do not all lie exactly on a
45° line, they are about as close to doing so as can be
expected from real data. Therefore, there is no reason to
question the normal hypothesis for these data—the same
conclusion as from the Lilliefors test.
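The points of a normal Q-Q plot can be computed directly, which makes clear what the plot compares. The sketch below uses hypothetical widths and an invented function name. The plotting position (i − 0.5)/n is one common convention and is an assumption here, so the expected quantiles may differ slightly from those SPSS draws.

```python
from statistics import NormalDist, mean, stdev

def qq_points(data):
    """Pairs (expected normal quantile, standardized observed value) for a normal Q-Q plot."""
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    z = NormalDist()
    # Plotting position (i - 0.5) / n is an assumed convention, not taken from the slides.
    return [(z.inv_cdf((i - 0.5) / n), (x - m) / s) for i, x in enumerate(xs, start=1)]

# Hypothetical strip widths in centimeters.
widths = [9.98, 10.03, 9.95, 10.01, 10.07, 9.99, 10.02, 9.96, 10.00, 10.04]
for expected, observed in qq_points(widths):
    print(f"{expected:7.3f} {observed:7.3f}")
```

If the data are close to normal, the two columns are close to each other, which corresponds to points lying near the 45° line.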
