Lecture 9

- Descriptive statistics: 2 methods: visualization + numerical methods
- Probability and probability distributions: Binomial, Poisson, Uniform, Normal.
- Inferential statistics: 2 methods: estimation + hypothesis testing
  2 parameters of interest:
  + mean of a normal distribution, µ
  + proportion, p (Binomial)
CHAPTER 8: Estimation – Confidence Interval.
Why: the population variable (e.g. score of EBBA students) is continuous.
Score: X ~ N(µ, σ²): we want to estimate the parameter µ, the population mean.
To estimate it, we take a sample of n observations.
We can estimate µ by the sample mean, x_bar: a point estimator (descriptive statistic).
Because the point estimator does not give a level of confidence, we use a
Confidence Interval for µ: find an interval (a, b) such that:
P(a < µ < b) = 1-α
(1-α) is the confidence level, often 90%, 95% or 99%.
The interval (a, b) is called the Confidence Interval of µ with confidence level 1-α.
Construction of the formula:
Case 1: Estimate the confidence interval for the population mean, µ, when
we know the population variance, σ²:
From the sample, we find the sample mean, x_bar.
The (1-α)100% CI for µ is:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where z_{α/2}*(σ/sqrt(n)) = ME: Margin of Error.
z_{α/2}: from the Z table: z_{0.025} = 1.96; z_{0.05} = 1.645; z_{0.005} = 2.58
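The meaning of the confidence level can be checked by simulation. The following is a minimal sketch, assuming a hypothetical population N(70, 10²) and samples of size 25; the function name `coverage` and all numbers are illustrative and not part of the lecture. It builds the interval x_bar ± z_{α/2}·σ/sqrt(n) many times and counts how often it contains the true µ.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, conf, trials=4000, seed=1):
    """Share of simulated samples whose z-interval contains the true mean mu."""
    rng = random.Random(seed)                      # fixed seed: repeatable run
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    me = z * sigma / n ** 0.5                      # margin of error (sigma known)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - me < mu < x_bar + me:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 70, sigma = 10 (assumed values)
rate = coverage(mu=70, sigma=10, n=25, conf=0.95)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to the chosen confidence level, 0.95.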

How the CI changes if we change 1-α or the sample size n:
- Increase 1-α (n fixed): CI is wider: more confident, less precise
- Increase n (1-α fixed): CI is narrower: more precise
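Both effects can be seen by computing the margin of error for a few settings. A small sketch, assuming σ = 6 purely for illustration; `z_margin_of_error` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_margin_of_error(sigma, n, conf):
    """ME = z_{alpha/2} * sigma / sqrt(n) for a known population sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sigma / n ** 0.5

# Higher confidence level, same n: the interval gets wider
for conf in (0.90, 0.95, 0.99):
    print(f"conf={conf:.2f}, n=64:  ME={z_margin_of_error(6, 64, conf):.3f}")

# Larger n, same confidence level: the interval gets narrower
for n in (16, 64, 256):
    print(f"conf=0.95, n={n}: ME={z_margin_of_error(6, n, 0.95):.3f}")
```

Because n enters through sqrt(n), the sample size must be multiplied by 4 to cut the margin of error in half.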
Example:
Data were collected on the amount spent by 64 customers for lunch at a major Houston
restaurant. These data are contained in the file named Houston. Based upon past studies
the population standard deviation is known with σ = $6.
a. At 99% confidence, what is the margin of error?
b. Develop a 99% confidence interval estimate of the mean amount spent for lunch.
Variable: amount spent for lunch (by customers of a restaurant in Houston)
Assume this amount has a normal distribution N(µ, 6²)
Margin of Error: ME = z_{α/2}*(σ/sqrt(n)):
1-α = 0.99, z_{α/2} = z_{0.005} = 2.58: ME = 2.58*(6/sqrt(64)) = 2.58*(6/8) = 1.94
To find the confidence interval, we calculate the sample mean from the data:
x_bar = 21.52 → the 99% CI for the population mean is: 21.52 +/- 1.94
→ (19.58, 23.46)
Interpretation: With 99% confidence, the mean amount spent for lunch
is between $19.58 and $23.46.
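The same calculation as a short sketch. It takes the sample mean 21.52 from the notes rather than reading the Houston file, and `z_interval` is a hypothetical helper name. Python's unrounded z_{0.005} ≈ 2.576 is used instead of the table value 2.58, so the second decimal differs slightly from the hand calculation.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf):
    """Return (margin of error, (lower, upper)) when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    me = z * sigma / n ** 0.5
    return me, (x_bar - me, x_bar + me)

# Values from the example: x_bar = 21.52, sigma = 6, n = 64, 99% confidence
me, (low, high) = z_interval(21.52, 6, 64, 0.99)
print(f"ME = {me:.2f}, CI = ({low:.2f}, {high:.2f})")
```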
Sample size determination: given a target margin of error ME, σ and 1-α, the sample size required is:
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{ME^2}$$
Example: How many students should be taken into the sample if you want to estimate
the 95% CI for their Math score with ME = 5 points? Assume σ = 3.5 points.
Solution: n = 1.96² * 3.5² / 5² = 1.88 → round up: 2 students needed.
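A sketch of the sample size rule, always rounding up to the next whole observation. `sample_size_for_mean` is a hypothetical helper name; the second call (ME = 1 point) is an extra illustration, not part of the exercise.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, me, conf):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= me."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / me) ** 2)

n = sample_size_for_mean(sigma=3.5, me=5, conf=0.95)
print(n)   # the example from the notes

# A tighter margin of error requires many more observations
n_tight = sample_size_for_mean(sigma=3.5, me=1, conf=0.95)
print(n_tight)
```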
Case 2: Estimate the confidence interval for the population mean, µ, when
we do not know the population variance, σ²
When σ² is unknown, we replace σ by the sample standard deviation, s.
Now Z → Student's t with n-1 degrees of freedom: z_{α/2} → t_{n-1, α/2}
The CI is:
$$\bar{x} \pm t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$$
Note that if the sample size n is large (n > 30): t_{n-1, α/2} ≈ z_{α/2}
Example: Sales personnel for Skillings Distributors submit weekly reports listing the customer
contacts made during the week. A sample of 65 weekly reports showed a sample mean of
19.5 customer contacts per week. The sample standard deviation was 5.2. Provide 90% and
95% confidence intervals for the population mean number of weekly customer contacts for
the sales personnel.
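Solution: variable X: customer contacts, X ~ N(µ, σ²), σ² unknown
CI for µ:
$$\bar{x} \pm t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$$
x_bar = 19.5, s = 5.2, n = 65
90% CI: t_{n-1, α/2} = t_{64, 0.05} ≈ 1.67 (close to z_{0.05} = 1.645)
19.5 +/- 1.67*(5.2/sqrt(65)) → (18.42, 20.58)
95% CI: t_{64, 0.025} ≈ 2.00
19.5 +/- 2.00*(5.2/sqrt(65)) → (18.21, 20.79)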
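A sketch of the t interval. The critical value is passed in from the t table, as in the notes, because the Python standard library has no t quantile function. `t_interval` is a hypothetical helper name.

```python
def t_interval(x_bar, s, n, t_crit):
    """CI for the mean with sigma unknown; t_crit = t_{n-1, alpha/2} from the table."""
    me = t_crit * s / n ** 0.5
    return x_bar - me, x_bar + me

# 90% CI from the example: t_{64, 0.05} is about 1.67 (table value)
low, high = t_interval(x_bar=19.5, s=5.2, n=65, t_crit=1.67)
print(f"90% CI: ({low:.2f}, {high:.2f})")
```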
Case 3: Estimation for a population proportion, p
The variable of interest is qualitative (Gender, Preference…).
Take a sample of n observations, of which m elements have the qualitative
value whose proportion we want to estimate.
The formula is:
$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
in which pbar is the sample proportion, pbar = m/n
Example: Estimate the proportion of students in NEU holding an IELTS certificate.
We take a sample of 150 students in NEU, of whom 65 hold
the certificate. Find the 95% CI for this proportion.
Solution: The 95% CI is:
pbar = 65/150, z_{α/2} = z_{0.025} = 1.96, n = 150
CI: 65/150 +/- 1.96*sqrt((65/150*(1-65/150))/150)
ME = 0.0793 → 35.4% to 51.3%
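The IELTS example as a sketch; `proportion_interval` is a hypothetical helper name.

```python
from statistics import NormalDist

def proportion_interval(m, n, conf):
    """CI for a proportion: p_bar +/- z_{alpha/2} * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = m / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    me = z * (p_bar * (1 - p_bar) / n) ** 0.5
    return p_bar, me, (p_bar - me, p_bar + me)

# 65 certificate holders among 150 sampled students, 95% confidence
p_bar, me, (low, high) = proportion_interval(65, 150, 0.95)
print(f"p_bar = {p_bar:.3f}, ME = {me:.4f}, CI = ({low:.3f}, {high:.3f})")
```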
Example: To estimate the proportion of female attendees at a conference, we take a
random sample of 200 attendees and find that 75 of them are female.
1. Produce the 98% confidence interval for the population proportion of females.
2. If we want to produce the interval with a 2% margin of error, what is the sample
size required (confidence level is still 98%)?
Solution: 1. The confidence interval for p (female proportion) is:
$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
pbar = 75/200 = 0.375, 1-α = 0.98 → z_{α/2} = z_{0.01} = 2.33
→ CI: 0.375 +/- 2.33*sqrt(0.375*(1-0.375)/200)
→ (0.295; 0.455)
Margin of error: ME = z_{α/2}*sqrt(pbar*(1-pbar)/n) = 2.33*sqrt(0.375*(1-0.375)/200) = 0.08
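2. The formula for the sample size when estimating p is:
$$n = \frac{(z_{\alpha/2})^2\,\bar{p}(1-\bar{p})}{ME^2}$$
n = 2.33^2 * 0.375*(1-0.375)/(0.02^2) = 3180.996 → n = 3181

Note: If we do not have a value for p (pbar), use p(1-p) = 0.25, its maximum; the largest sample size that could be needed is:
$$n = \frac{(z_{\alpha/2})^2 \times 0.25}{ME^2}$$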
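A sketch of both sample size rules. To match the hand calculation, z is passed in as the table value 2.33 rather than computed; `sample_size_for_proportion` is a hypothetical helper name, and its default `p_bar=0.5` gives the worst case p(1-p) = 0.25.

```python
import math

def sample_size_for_proportion(z, me, p_bar=0.5):
    """n = z^2 * p_bar * (1 - p_bar) / ME^2, rounded up to a whole observation."""
    return math.ceil(z ** 2 * p_bar * (1 - p_bar) / me ** 2)

# With the sample estimate p_bar = 0.375 from the conference example
n_with_estimate = sample_size_for_proportion(z=2.33, me=0.02, p_bar=0.375)

# Without any estimate of p: worst case p_bar = 0.5
n_worst_case = sample_size_for_proportion(z=2.33, me=0.02)

print(n_with_estimate, n_worst_case)
```

The worst-case size is always at least as large as the size obtained with a sample estimate of p.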

Hypothesis testing:
We state a hypothesis about a parameter, then test this hypothesis.
Test for the population mean, µ, of a normal distribution (σ² unknown)
Step 1: Specify the hypotheses: Ho: Null hypothesis (=, ≥, ≤)
H1: Alternative hypothesis (≠, <, >)
If we want to compare µ with µo, there are three possible types of hypotheses:
Two tail test: Ho: µ = µo; H1: µ ≠ µo
Left tail test: Ho: µ ≥ µo; H1: µ < µo
Right tail test: Ho: µ ≤ µo; H1: µ > µo
Example: - Test if the mean score in maths of students is at least 75:
Ho: µ ≥ 75; H1: µ < 75
- Test if the mean score in maths of students is more than 75:
Ho: µ ≤ 75; H1: µ > 75
Step 2: Calculate the test statistic: use the t test (the test statistic follows a t distribution with n-1 degrees of freedom)
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Step 3: Rejection rule: when to reject Ho:
Given the significance level α (the probability of rejecting Ho when Ho is true):
Two tail test: Ho: µ = µo; H1: µ ≠ µo: Reject Ho if t > t_{n-1, α/2} or t < -t_{n-1, α/2}
Left tail test: Ho: µ ≥ µo; H1: µ < µo: Reject Ho if t < -t_{n-1, α}
Right tail test: Ho: µ ≤ µo; H1: µ > µo: Reject Ho if t > t_{n-1, α}
Step 4: Draw the conclusion
Example: To test the claim that the average GPA of NEU students is more
than 3.3, we take a random sample of 30 students. The mean and standard deviation
of this sample are 3.6 and 0.4. Do the test at the 5% significance level.
Solution:
Step 1: Hypotheses: Ho: µ ≤ 3.3; H1: µ > 3.3 (Right tail)
Step 2: Calculate the test statistic (t-value):
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
t = (3.6-3.3)/(0.4/sqrt(30)) = 4.108
Step 3: Reject Ho if t > t_{n-1, α} = t_{29, 0.05} = 1.699
Since 4.108 > 1.699, we reject Ho.
Step 4: At the 5% significance level, the data support the claim that the average GPA of NEU students is more than 3.3.
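The four steps can be sketched in code. The critical value 1.699 is taken from the t table, as in the notes; `t_statistic` and `reject_h0` are hypothetical helper names.

```python
def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / n ** 0.5)

def reject_h0(t, t_crit, tail):
    """Apply the rejection rule.

    t_crit is the positive table value: t_{n-1, alpha/2} for a two tail
    test, t_{n-1, alpha} for a left or right tail test.
    """
    if tail == "two":
        return abs(t) > t_crit
    if tail == "left":
        return t < -t_crit
    if tail == "right":
        return t > t_crit
    raise ValueError("tail must be 'two', 'left' or 'right'")

# GPA example: Ho: mu <= 3.3, H1: mu > 3.3, n = 30, alpha = 0.05
t = t_statistic(x_bar=3.6, mu0=3.3, s=0.4, n=30)
decision = reject_h0(t, t_crit=1.699, tail="right")
print(f"t = {t:.3f}, reject Ho: {decision}")
```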
Example: A shareholders’ group, in lodging a protest, claimed that the mean tenure for a
chief executive officer (CEO) was at least nine years. A survey of companies reported in The Wall
Street Journal found a sample mean tenure of 7.27 years for CEOs with a standard
deviation of s = 6.38 years.
a. Formulate hypotheses that can be used to challenge the validity of the claim made by
the shareholders’ group.
b. Assume 85 companies were included in the sample. What is the p-value for your
hypothesis test?
c. At α = .01, what is your conclusion?
Solution:
Hypotheses: Ho: µ ≥ 9; H1: µ < 9 (Left tail)
Calculate the t statistic: t = (7.27 - 9)/(6.38/sqrt(85)) = -2.50
Reject Ho if t < -t_{84, 0.01} ≈ -2.37
Since -2.50 < -2.37, reject Ho: the data indicate that the mean CEO tenure is less than 9 years.
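Rule of the test using the p-value:
- If p-value ≤ α: reject Ho
- If p-value > α: do not reject Ho
In the above example: p-value = P(T < -2.50), where T follows a t distribution with 84 degrees of freedom. Because -2.50 lies beyond the critical value -2.37, this p-value is below α = 0.01, which again leads to rejecting Ho.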
Example: Ex 29 on page 415:
Hypotheses: Ho: µ = 90000; H1: µ ≠ 90000
Calculate t from the data: sample mean x_bar = 85272;
sample standard deviation, s = 11039; n = 25
t = (85272-90000)/(11039/sqrt(25)) = -2.14
Reject Ho if t > t_{24, 0.025} = 2.064 or t < -t_{24, 0.025} = -2.064
Since -2.14 < -2.064, reject Ho → the mean salary in Ohio differs from the national level.
Find the p-value = P(T < -2.14) + P(T > 2.14) = 2*P(T > 2.14) = 0.0427
α = 0.05 → p-value < α → reject Ho
Note: p-value in a two tail test = 2 * p-value in a one tail test
One-Sample Test (Test Value = 90000)

         t        df   Sig. (2-tailed)   Mean Difference   95% Confidence Interval of the Difference
                                                           Lower        Upper
Salary   -2.141   24   .043              -4728.000         -9284.77     -171.23
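The p-values in the CEO tenure and salary examples can be reproduced numerically. The sketch below is an illustration only, not the method used in the notes (those rely on tables and software output): it integrates the Student t density with Simpson's rule, using only the standard library. `t_pdf` and `t_upper_tail` are hypothetical helper names.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|), by Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3   # the density is symmetric around 0

# CEO tenure (left tail): p-value = P(T < t) = P(T > |t|), df = 84
t_ceo = (7.27 - 9) / (6.38 / 85 ** 0.5)
p_ceo = t_upper_tail(t_ceo, df=84)

# Salary example (two tail): p-value = 2 * P(T > |t|), df = 24
t_salary = (85272 - 90000) / (11039 / 25 ** 0.5)
p_salary = 2 * t_upper_tail(t_salary, df=24)

print(f"CEO: t = {t_ceo:.2f}, p-value = {p_ceo:.4f}")
print(f"Salary: t = {t_salary:.3f}, p-value = {p_salary:.4f}")
```

The two tail result should agree with the Sig. (2-tailed) column of the table above up to rounding.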
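Test for the population proportion, p
Two tail: Ho: p = p0; H1: p ≠ p0
Right tail: Ho: p ≤ p0; H1: p > p0
Left tail: Ho: p ≥ p0; H1: p < p0
Test statistic:
$$z = \frac{\bar{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
in which pbar is the sample proportion and p0 is the hypothesized value.
Rejection rule:
Two tail: Reject Ho if z > z_{α/2} or z < -z_{α/2}
Right tail: Reject Ho if z > z_{α}
Left tail: Reject Ho if z < -z_{α}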
Example: Test whether female students are the majority in NEU.
We take a random sample of 300 students, of whom 175 are female.
Conclude at the 5% significance level. What is the p-value?

Solution: Hypotheses:
Ho: p ≤ 0.5; H1: p > 0.5
pbar = 175/300 = 0.583
z = (0.583 - 0.5)/sqrt(0.5*0.5/300) = 2.88 (2.89 if pbar is not rounded)
Reject Ho if z > z_{0.05} = 1.645 → reject Ho → female students are the majority.
→ p-value = P(Z > 2.88) = 1-P(Z < 2.88) = 1-0.998 = 0.002 << 0.05 → reject Ho
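The same right tail test as a sketch; `proportion_z_test` is a hypothetical helper name. Because pbar is not rounded here, z comes out as 2.89 rather than 2.88; the p-value and the conclusion are unchanged.

```python
from statistics import NormalDist

def proportion_z_test(m, n, p0):
    """Right tail test of Ho: p <= p0 against H1: p > p0.

    Returns (z, p_value) with p_value = P(Z > z).
    """
    p_bar = m / n
    z = (p_bar - p0) / (p0 * (1 - p0) / n) ** 0.5
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# 175 female students among 300, p0 = 0.5, alpha = 0.05
z, p_value = proportion_z_test(175, 300, 0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject Ho: {p_value <= 0.05}")
```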

Chapter 10: Tests to compare two population parameters
Compare two population means:

2 variables: X1 ~ N(µ1, σ1²); X2 ~ N(µ2, σ2²)
Assumptions:
- Data are normal
- The two populations are independent
- The two populations have the same variance, σ1² = σ2²
Compare µ1 and µ2:

Three types of hypotheses:
Two tail: Ho: µ1 = µ2; H1: µ1 ≠ µ2
Right tail: Ho: µ1 ≤ µ2; H1: µ1 > µ2
Left tail: Ho: µ1 ≥ µ2; H1: µ1 < µ2
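Two sample t test:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$
in which:
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
and
$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$$
Rejection rule:
Two tail: Reject Ho if t > t_{n1+n2-2, α/2} or t < -t_{n1+n2-2, α/2}
Right tail: Reject Ho if t > t_{n1+n2-2, α}
Left tail: Reject Ho if t < -t_{n1+n2-2, α}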
Example:
Specific Motors of Detroit has developed a new automobile known as the
M car. 12 M cars and 8 J cars (from Japan) were road tested to compare
miles-per-gallon (mpg) performance. The sample statistics are:
sample mean (M) = 29.8 mpg, s1 = 2.56 mpg; sample mean (J) = 27.3 mpg,
s2 = 1.81 mpg.
Test with α = 0.05
Solution:
Hypotheses: Ho: µ1 = µ2; H1: µ1 ≠ µ2, where µ1 is the mean for the M car and µ2 is the mean for the J car.
Pooled variance: s_p² = (11*2.56^2 + 7*1.81^2)/(12+8-2) = 5.279
se(xbar1-xbar2) = sqrt(5.279*(1/12+1/8)) = 1.049
t = (29.8-27.3)/1.049 = 2.38
Reject Ho if t > t_{18, 0.025} = 2.101 or t < -2.101
So, we reject Ho as t = 2.38 > 2.101
Conclusion: The mean mpg performance of the two types of cars is different.
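The pooled two sample calculation as a sketch, using the summary statistics of the example. `pooled_t` is a hypothetical helper name, and the critical value 2.101 is the table value quoted in the notes.

```python
def pooled_t(x1, s1, n1, x2, s2, n2):
    """Two sample t test assuming equal variances.

    Returns (pooled variance, standard error of x1 - x2, t statistic);
    the t statistic has n1 + n2 - 2 degrees of freedom.
    """
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    return sp2, se, (x1 - x2) / se

# M cars: mean 29.8, s = 2.56, n = 12; J cars: mean 27.3, s = 1.81, n = 8
sp2, se, t = pooled_t(29.8, 2.56, 12, 27.3, 1.81, 8)
print(f"sp2 = {sp2:.3f}, se = {se:.3f}, t = {t:.2f}")

# Two tail rule with the table value t_{18, 0.025} = 2.101
print("reject Ho:", abs(t) > 2.101)
```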
Example: Use the data HomePrices: compare the average prices of houses at
two points in time, 2006 and 2009: did the home price increase?
Solution:
Hypotheses: Ho: µ1 ≤ µ2; H1: µ1 > µ2 (increases)
F test for equal variances: Ho: σ1² = σ2²
F = 1.531, p-value = 0.22 > 0.05 → do not reject Ho → equal variances
Test for equal means under the assumption of equal variances:
Ho: µ1 ≤ µ2; H1: µ1 > µ2
The two tail p-value is very small; the test is one tail, so its p-value is half of that and even smaller → reject Ho
→ The two means are different, in the direction stated by H1.
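Compare two population proportions, p1 and p2:
Hypotheses: Two tail: Ho: p1 = p2; H1: p1 ≠ p2
Test statistic: use the Z distribution
$$z = \frac{\bar{p}_1 - \bar{p}_2}{s_{\bar{p}_1 - \bar{p}_2}}$$
in which:
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{\bar{p}_1(1-\bar{p}_1)}{n_1} + \frac{\bar{p}_2(1-\bar{p}_2)}{n_2}}$$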
Example: To compare the proportion of students who hold the IELTS certificate
in two universities A and B, we take samples of 250 and 300 students respectively.
Of those students, 120 and 160 are IELTS holders.
Solution: Ho: p1 = p2; H1: p1 ≠ p2
Calculate the z value:
pbar1 = 120/250 = 0.48; pbar2 = 160/300 = 0.53
s_{p1-p2} = sqrt(0.48*(1-0.48)/250 + 0.53*(1-0.53)/300) = 0.043
z = (0.48-0.53)/0.043 = -1.16 (about -1.25 if pbar2 = 0.5333 is not rounded)
Reject Ho if z > 1.96 or z < -1.96 (α = 0.05)
So we do not reject Ho → there is no evidence that the proportions of the two universities differ.
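The comparison of the two universities as a sketch; `two_proportion_z` is a hypothetical helper name. It uses the unpooled standard error, as in the calculation above (many textbooks instead pool the two samples under Ho; that variant is not shown). Since the proportions are not rounded, z is about -1.25 rather than -1.16, and the conclusion is the same.

```python
def two_proportion_z(m1, n1, m2, n2):
    """z statistic for Ho: p1 = p2, with the unpooled standard error."""
    p1, p2 = m1 / n1, m2 / n2
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    return (p1 - p2) / se

# University A: 120 of 250; University B: 160 of 300
z = two_proportion_z(120, 250, 160, 300)
print(f"z = {z:.2f}, reject Ho at alpha = 0.05: {abs(z) > 1.96}")
```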
