
Chapter 7 2022


Probability and Statistics

Chapter 7
Fundamental Sampling
Distributions

Dr. Yehya Mesalam 1


Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean is the probability
distribution of the population of the sample means obtainable
from all possible samples of size n from a population of size N

[Figure: histogram of the population; horizontal axis 17 to 25, vertical axis 1 to 6]


Distribution of Sample Means from Samples of Size n = 2

Population distribution:

x      f(x)    x·f(x)   x²·f(x)
18     0.25    4.5      81
20     0.25    5        100
22     0.25    5.5      121
24     0.25    6        144
sum    1       21       446

$\mu = E(X) = \sum x\, f(x) = 21$

$\sigma^2 = V(X) = E(X^2) - [E(X)]^2 = 446 - 21^2 = 5$

$\sigma = \sqrt{5} = 2.236068 \approx 2.24$
Distribution of Sample Means from Samples of Size n = 2
Sample #   Scores    Mean (x̄)
1          18, 18    18
2          18, 20    19
3          18, 22    20
4          18, 24    21
5          20, 18    19
6          20, 20    20
7          20, 22    21
8          20, 24    22
9          22, 18    20
10         22, 20    21
11         22, 22    22
12         22, 24    23
13         24, 18    21
14         24, 20    22
15         24, 22    23
16         24, 24    24
Distribution of Sample Means from Samples of Size n = 2

x̄      f     f(x̄)     x̄·f(x̄)   x̄²·f(x̄)
18     1     0.0625   1.125     20.25
19     2     0.125    2.375     45.125
20     3     0.1875   3.75      75
21     4     0.25     5.25      110.25
22     3     0.1875   4.125     90.75
23     2     0.125    2.875     66.125
24     1     0.0625   1.5       36
sum    16    1        21        443.5


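Distribution of Sample Means from Samples of Size n = 2

$\mu_{\bar{X}} = E(\bar{X}) = \sum \bar{x}\, f(\bar{x}) = 21$

$\mu_{\bar{X}} = \mu = 21$

$\sigma_{\bar{X}}^2 = V(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2 = 443.5 - 21^2 = 2.5$

$\sigma_{\bar{X}} = \sqrt{2.5} = 1.581139$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236068}{\sqrt{2}} = 1.581139$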
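The enumeration above can be checked with a short script. This is a minimal sketch that assumes sampling with replacement and treats ordered pairs as distinct samples, as the 16-row table does. The helper names `sampling_distribution` and `mean_and_sd` are made up for illustration.

```python
from itertools import product
from math import sqrt


def sampling_distribution(population, n):
    """Means of all ordered samples of size n drawn with replacement."""
    return [sum(sample) / n for sample in product(population, repeat=n)]


def mean_and_sd(values):
    """Mean and standard deviation, dividing by the number of values."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, sqrt(var)


population = [18, 20, 22, 24]
mu, sigma = mean_and_sd(population)

means = sampling_distribution(population, n=2)
mu_xbar, sigma_xbar = mean_and_sd(means)

print(len(means))                       # 16 possible samples
print(mu, round(sigma, 6))              # 21.0 2.236068
print(mu_xbar, round(sigma_xbar, 6))    # 21.0 1.581139
print(round(sigma / sqrt(2), 6))        # 1.581139, i.e. sigma / sqrt(n)
```

Changing `population` to `[2, 4, 6, 8]` should give a mean of 5 with the same two standard deviations, since that population has the same spread.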


Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean is the probability
distribution of the population of the sample means obtainable
from all possible samples of size n from a population of size N

[Figure: histogram of the population; horizontal axis 1 to 9, vertical axis 1 to 6]


Distribution of Sample Means from Samples of Size n = 2

Population distribution:

x      f(x)    x·f(x)   x²·f(x)
2      0.25    0.5      1
4      0.25    1        4
6      0.25    1.5      9
8      0.25    2        16
sum    1       5        30

$\mu = E(X) = \sum x\, f(x) = 5$

$\sigma^2 = V(X) = E(X^2) - [E(X)]^2 = 30 - 25 = 5$

$\sigma = \sqrt{5} = 2.236068 \approx 2.24$
Distribution of Sample Means from Samples of Size n = 2
Sample #   Scores   Mean (x̄)
1          2, 2     2
2          2, 4     3
3          2, 6     4
4          2, 8     5
5          4, 2     3
6          4, 4     4
7          4, 6     5
8          4, 8     6
9          6, 2     4
10         6, 4     5
11         6, 6     6
12         6, 8     7
13         8, 2     5
14         8, 4     6
15         8, 6     7
16         8, 8     8
Distribution of Sample Means from Samples of Size n = 2

x̄      f     f(x̄)     x̄·f(x̄)   x̄²·f(x̄)
2      1     0.0625   0.125     0.25
3      2     0.125    0.375     1.125
4      3     0.1875   0.75      3
5      4     0.25     1.25      6.25
6      3     0.1875   1.125     6.75
7      2     0.125    0.875     6.125
8      1     0.0625   0.5       4
sum    16    1        5         27.5


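Distribution of Sample Means from Samples of Size n = 2

$\mu_{\bar{X}} = E(\bar{X}) = \sum \bar{x}\, f(\bar{x}) = 5$

$\mu_{\bar{X}} = \mu = 5$

$\sigma_{\bar{X}}^2 = V(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2 = 27.5 - 25 = 2.5$

$\sigma_{\bar{X}} = \sqrt{2.5} = 1.581139$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236068}{\sqrt{2}} = 1.581139$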


Distribution of Sample Means from Samples of Size n = 2

[Figure: histogram of the sample means; horizontal axis: sample mean, 1 to 9; vertical axis 1 to 6]

We can use the distribution of sample means to answer probability questions about sample means.
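Distribution of Individuals in Population: $\mu = 5$, $\sigma = 2.24$

[Figure: histogram of the population; horizontal axis 1 to 9, vertical axis 1 to 6]

Distribution of Sample Means: $\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.58$

[Figure: histogram of the sample means; horizontal axis: sample mean, 1 to 9; vertical axis 1 to 6]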


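Distribution of Individuals in Population: $\mu = 5$, $\sigma = 2.24$

[Figure: histogram of the population; horizontal axis 1 to 9, vertical axis 1 to 6]

Distribution of Sample Means: $\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.58$

[Figure: histogram of the sample means; horizontal axis: sample mean, 1 to 9; vertical axis 1 to 6]

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.24}{\sqrt{2}} = 1.58$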


Sampling Distribution (n = 3)

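[Figure: histogram of the sample means for n = 3; horizontal axis: sample mean, 1 to 9; vertical axis 2 to 24]

$\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.29$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.24}{\sqrt{3}} = 1.29$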
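The n = 3 values can be reproduced by listing every possible sample. This is a sketch under the same assumptions as the n = 2 tables (population 2, 4, 6, 8; sampling with replacement; ordered samples counted separately), so it enumerates 4³ = 64 samples.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

population = [2, 4, 6, 8]
n = 3

# Mean of every ordered sample of size 3 drawn with replacement.
sample_means = [sum(sample) / n for sample in product(population, repeat=n)]

# Frequency table of the distinct sample means.
frequency = Counter(sample_means)
for value in sorted(frequency):
    print(f"{value:6.3f}  {frequency[value]:3d}")

print(len(sample_means))                # 64 samples
print(round(mean(sample_means), 6))     # mean of the sample means, about 5
print(round(pstdev(sample_means), 6))   # 1.290994, i.e. sqrt(5) / sqrt(3)
```

`pstdev` divides by the number of values, which is the right choice here because the 64 sample means are the whole sampling distribution rather than a sample from it.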
Distribution of Sample Means

[Figure: histogram of the sample means; horizontal axis: sample mean, 1 to 9; vertical axis 1 to 6]

Things to Notice
1. The sample means tend to pile up around the population mean.
2. The distribution of sample means is approximately normal in shape, even though the population distribution was not.
3. The distribution of sample means has less variability than does the population distribution.

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Central Limit Theorem
For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n …
1. will have a mean of μ
2. will have a standard deviation of $\sigma / \sqrt{n}$
3. will approach a normal distribution as n approaches infinity

The mean of the sampling distribution:
$\mu_{\bar{X}} = \mu$

The standard deviation of the sampling distribution (“standard error of the mean”):
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
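A small simulation can illustrate the three statements. This sketch is not part of the slides: the exponential population (mean 1, standard deviation 1), the sample size of 30, the 5000 repetitions and the fixed seed are all assumptions chosen only for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)            # fixed seed so the run is repeatable

n = 30                    # size of each sample (assumed)
num_samples = 5000        # number of repeated samples (assumed)
mu, sigma = 1.0, 1.0      # mean and SD of an exponential(1) population

# Draw repeated samples from a clearly non-normal population
# and keep the mean of each one.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

standard_error = sigma / sqrt(n)
print(round(mean(sample_means), 3))     # close to mu = 1
print(round(pstdev(sample_means), 3))   # close to sigma / sqrt(n)
print(round(standard_error, 3))

# If the sample means are roughly normal, about 95% of them
# should lie within 1.96 standard errors of mu.
inside = sum(abs(m - mu) <= 1.96 * standard_error for m in sample_means)
print(inside / num_samples)
```

The printed values vary with the seed, so no exact outputs are listed. The point is only that the simulated mean and spread land near μ and σ/√n.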
Clarifying Formulas
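                      Population                  Sample                           Distribution of Sample Means
Mean                  $\mu = \frac{\sum X}{N}$    $\bar{X} = \frac{\sum X}{n}$     $\mu_{\bar{X}} = \mu$
Standard deviation    $\sigma = \sqrt{\frac{SS}{N}}$    $s = \sqrt{\frac{SS}{n-1}}$    $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

(SS is the sum of squared deviations from the mean.)

Notice:
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$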


Confidence Level, (1 − α)
• Suppose confidence level = 95%
• Also written (1 − α) = 0.95
• A relative frequency interpretation:
– From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
– No probability involved in a specific interval
Confidence Interval for μ
– Population variance σ² is known: use Z

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

• Confidence interval estimate:

$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

(where $z_{\alpha/2}$ is the standard normal value that leaves a probability of α/2 in each tail)


Finding the Reliability Factor, $z_{\alpha/2}$
• Consider a 95% confidence interval:

$1 - \alpha = 0.95$, so $\frac{\alpha}{2} = 0.025$ in each tail

[Figure: standard normal curve with central area 1 − α = 0.95 and area α/2 = 0.025 in each tail. Z units: z = −1.96, 0, z = 1.96. X units: lower confidence limit, point estimate, upper confidence limit.]

• Find $z_{0.025} = 1.96$ from the standard normal distribution table


Common Levels of Confidence
• Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient, 1 − α   z_{α/2} value
80%                0.80                            1.28
90%                0.90                            1.645
95%                0.95                            1.96
98%                0.98                            2.33
99%                0.99                            2.58
99.8%              0.998                           3.09
99.9%              0.999                           3.29
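The z values in the table can be recomputed instead of read from a printed table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library. The function name `z_critical` is made up for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # inv_cdf(p) returns the point with area p to its left.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}   z = {z_critical(level):.3f}")
```

For example, `z_critical(0.95)` is about 1.960 and `z_critical(0.90)` is about 1.645.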


Example
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms. Determine a 95% confidence interval for the true mean resistance of the population.
• Solution:

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( 0.35 / \sqrt{11} \right) = 2.20 \pm 0.2068$

$1.9932 < \mu < 2.4068$

We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.
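The circuit example can be reproduced in a few lines. This is a sketch for the known-σ case only. The function name `z_interval` and its parameters are illustrative, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = z_interval(xbar=2.20, sigma=0.35, n=11)
print(round(lower, 4), round(upper, 4))   # 1.9932 2.4068
```

Passing `confidence=0.99` widens the interval, because the critical value rises from about 1.96 to about 2.58.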


Confidence Interval for μ
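• If the population standard deviation σ is unknown and n > 30, use Z with s in place of σ:

$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}}$

• If n ≤ 30, use the t distribution:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

where $t_{\alpha/2,\,n-1}$ is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail.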
Choice of Sample Size
• The sample size needed to estimate μ with (1 − α) confidence and an error of at most E is

$n = \left[ \frac{z_{\alpha/2}\, \sigma}{E} \right]^2$

• where E is the error

$E = |\bar{x} - \mu|$


Example
• Assuming the population standard deviation σ = 3, how large should a sample be to estimate the population mean with a margin of error not exceeding 0.5?

$n = \left[ \frac{z_{\alpha/2}\, \sigma}{E} \right]^2$


Solution
• where α = 0.05
• Then from the table
$z_{\alpha/2} = z_{0.025} = 1.96$
• Error = E = 0.5
• Then
$n = \left[ \frac{z_{\alpha/2}\, \sigma}{E} \right]^2 = \left[ \frac{1.96 \times 3}{0.5} \right]^2 = 138.3$

• Rounding up, we need a sample of size at least 139
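The calculation above, including the rounding up, can be sketched as follows. The function name `required_sample_size` is illustrative. The 95% default matches the α = 0.05 used in the solution.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would exceed the allowed error.
    return ceil((z * sigma / margin) ** 2)


n = required_sample_size(sigma=3, margin=0.5)
print(n)   # 139
```

Halving the allowed error roughly quadruples the required sample size, because n grows with 1/E².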
Student’s t Distribution
• Consider a random sample of n observations
– with mean $\bar{x}$ and standard deviation s
– from a normally distributed population with mean μ

• Then the variable

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

follows the Student’s t distribution with (n − 1) degrees of freedom

d.f. = n − 1


Student’s t Distribution
Note: t → Z as n increases

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.

[Figure: density curves of the standard normal (t with df = ∞), t (df = 13), and t (df = 5), all centered at 0; horizontal axis: t]
Example
A random sample of n = 25 has $\bar{x} = 50$ and s = 8. Form a 95% confidence interval for μ.
• Solution
d.f. = n − 1 = 24, so $t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.0639$
The confidence interval is

$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

$50 - (2.0639) \frac{8}{\sqrt{25}} < \mu < 50 + (2.0639) \frac{8}{\sqrt{25}}$

$46.698 < \mu < 53.302$
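The same arithmetic can be sketched in code. The Python standard library has no t quantile function, so this sketch takes the critical value 2.0639 from the table, as the solution does. The name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the tabulated t value for alpha/2 and n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = t_interval(xbar=50, s=8, n=25, t_crit=2.0639)
print(round(lower, 3), round(upper, 3))   # 46.698 53.302
```

Using 1.96 in place of 2.0639 would give a slightly narrower interval, which is why the t value is preferred for small samples with unknown σ.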
Confidence Intervals for the Population
Variance

The random variable

$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$

follows a chi-square distribution with (n − 1) degrees of freedom,

where the chi-square value $\chi^2_{\alpha,\,n-1}$ denotes the number for which

$P(\chi^2_{n-1} > \chi^2_{\alpha,\,n-1}) = \alpha$


Confidence Intervals for the Population
Variance

The 100(1 − α)% confidence interval for the population variance is

$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$


Example

You are testing the speed of a batch of computer processors. You collect the following data (in MHz):

Sample size      17
Sample mean      3004
Sample std dev   74

Assume the population is normal.

Determine the 95% confidence interval for σ²


Solution
• n = 17, so the chi-square distribution has (n − 1) = 16 degrees of freedom
• α = 0.05, so use the chi-square values with area 0.025 in each tail:

$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,16} = 28.85$

$\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{0.975,\,16} = 6.91$

[Figure: chi-square density with 16 degrees of freedom; probability α/2 = 0.025 lies below $\chi^2_{16} = 6.91$ and probability α/2 = 0.025 lies above $\chi^2_{16} = 28.85$]
Solution

• The 95% confidence interval is

$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$

$\frac{(17-1)(74)^2}{28.85} < \sigma^2 < \frac{(17-1)(74)^2}{6.91}$

$3037 < \sigma^2 < 12683$

(The limits are computed with the unrounded table values 28.845 and 6.908.)

Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.
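The interval for σ² and its conversion to σ can be sketched as below. The chi-square critical values are passed in from the table (28.845 and 6.908 for 16 degrees of freedom), because the Python standard library has no chi-square quantile function. The name `variance_interval` is illustrative.

```python
from math import sqrt


def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence interval for the population variance of a normal population.

    chi2_upper: table value with area alpha/2 to its right (the larger value)
    chi2_lower: table value with area alpha/2 to its left (the smaller value)
    """
    sum_of_squares = (n - 1) * s ** 2
    # Dividing by the larger chi-square value gives the lower limit.
    return sum_of_squares / chi2_upper, sum_of_squares / chi2_lower


var_low, var_high = variance_interval(s=74, n=17, chi2_upper=28.845, chi2_lower=6.908)
print(round(var_low), round(var_high))                    # 3037 12683
print(round(sqrt(var_low), 1), round(sqrt(var_high), 1))  # 55.1 112.6
```

The interval is not symmetric about s² = 5476, because the chi-square distribution is skewed to the right.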
Example
The lapping process which is used to grind certain silicon wafers to the proper thickness is acceptable only if the population standard deviation of the thickness of dice cut from the wafers is at most 0.50 mil. The thicknesses of 17 dice cut from such wafers have a standard deviation of 0.78 mil. Find 95% confidence limits on σ.


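Solution

$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$

$\frac{16 \times 0.78^2}{28.845} < \sigma^2 < \frac{16 \times 0.78^2}{6.908}$

$0.3374 < \sigma^2 < 1.4091$

Taking square roots gives the limits on σ:

$0.581 < \sigma < 1.187$

The whole interval lies above the 0.50 mil requirement.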


Confidence Interval between (Two Means)
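σ₁² and σ₂² known: use Z

$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

The confidence interval for μ₁ − μ₂ is:

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

σ₁² and σ₂² unknown and n₁ + n₂ > 30: use Z
The confidence interval for μ₁ − μ₂ is:

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$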
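Confidence Interval between (Two Means)
σ₁² and σ₂² unknown and n₁ + n₂ ≤ 30: use t

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

The confidence interval for μ₁ − μ₂ is:

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

where ν = n₁ + n₂ − 2 and

$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

is the pooled standard deviation (its square, $s_p^2$, is the pooled variance).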
Example
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):

                  CPU1    CPU2
Number tested     16      13
Sample mean       3004    2538
Sample std dev    74      56

Assume both populations are normal with equal variances, and use 95% confidence.
Solution
The pooled variance is:

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(16 - 1)74^2 + (13 - 1)56^2}{16 + 13 - 2} = 4436.00$

$s_p = \sqrt{4436.00} = 66.603$

The t value for a 95% confidence interval is:

$t_{\alpha/2,\,\nu} = t_{0.025,\,27} = 2.052$


Solution

• The 95% confidence interval is

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

$(3004 - 2538) - (2.052)(66.603) \sqrt{\frac{1}{16} + \frac{1}{13}} < \mu_1 - \mu_2 < (3004 - 2538) + (2.052)(66.603) \sqrt{\frac{1}{16} + \frac{1}{13}}$

$414.97 < \mu_1 - \mu_2 < 517.03$

We are 95% confident that the mean difference in CPU speed is between 414.97 and 517.03 MHz.
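The pooled-variance calculation is easy to get wrong by hand, so a direct recomputation from the summary statistics in the example (n₁ = 16, n₂ = 13, s₁ = 74, s₂ = 56) is sketched here. The t value 2.052 for 27 degrees of freedom is taken from the table. The function name `pooled_t_interval` is illustrative.

```python
from math import sqrt


def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Pooled-variance interval for mu1 - mu2, assuming equal variances.

    t_crit is the tabulated t value for alpha/2 and n1 + n2 - 2 d.f.
    Returns (pooled variance, lower limit, upper limit).
    """
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    margin = t_crit * sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return pooled_var, diff - margin, diff + margin


pooled_var, lower, upper = pooled_t_interval(
    x1=3004, s1=74, n1=16, x2=2538, s2=56, n2=13, t_crit=2.052
)
print(round(pooled_var, 2), round(sqrt(pooled_var), 3))   # 4436.0 66.603
print(round(lower, 2), round(upper, 2))                   # 414.97 517.03
```

The point estimate 3004 − 2538 = 466 sits at the center of the interval.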
Example
As part of an industrial training program, some trainees are instructed by Method 1, which is straight teaching-machine instruction, and some are instructed by Method 2, which also involves the personal attention of an instructor. Random samples are taken from large groups of trainees instructed by each of these two methods; the population standard deviations of the scores are 6.06 and 5.58, respectively. The scores obtained in an appropriate achievement test are

Method 1   71 75 65 69 73 66 69 75 74 87 68
Method 2   72 77 84 78 69 70 77 81 65 77 75

• Use the 0.05 level of significance to find (1 − α)100% confidence limits on μ₁ − μ₂
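Solution

                 ΣX     ΣX²      Mean   Variance   S.D.
A (Method 1)     792    57392    72     36.8       6.0663
B (Method 2)     825    62183    75     30.8       5.549775

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

$(72 - 75) - 1.96 \sqrt{\frac{6.06^2}{11} + \frac{5.58^2}{11}} < \mu_1 - \mu_2 < (72 - 75) + 1.96 \sqrt{\frac{6.06^2}{11} + \frac{5.58^2}{11}}$

$-7.8682 < \mu_1 - \mu_2 < 1.8682$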
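A short check of the known-σ interval, computing the sample means from the raw scores and using the stated population standard deviations 6.06 and 5.58. The function name `two_sample_z_interval` is illustrative. It obtains the critical value from `statistics.NormalDist` rather than from the rounded table value 1.96, so the last digit can differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist, mean

method_1 = [71, 75, 65, 69, 73, 66, 69, 75, 74, 87, 68]
method_2 = [72, 77, 84, 78, 69, 70, 77, 81, 65, 77, 75]


def two_sample_z_interval(sample_1, sigma_1, sample_2, sigma_2, confidence=0.95):
    """Interval for mu1 - mu2 when both population SDs are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean(sample_1) - mean(sample_2)
    margin = z * sqrt(sigma_1 ** 2 / len(sample_1) + sigma_2 ** 2 / len(sample_2))
    return diff - margin, diff + margin


lower, upper = two_sample_z_interval(method_1, 6.06, method_2, 5.58)
print(mean(method_1), mean(method_2))     # 72 75
print(round(lower, 3), round(upper, 3))   # -7.868 1.868
```

The interval contains 0, so at the 0.05 level these data do not show a difference between the two methods.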


Example
As part of an industrial training program, some trainees are instructed by Method 1, which is straight teaching-machine instruction, and some are instructed by Method 2, which also involves the personal attention of an instructor. Random samples are taken from large groups of trainees instructed by each of these two methods, and the scores which they obtained in an appropriate achievement test are

Method 1   71 75 65 69 73 66 69 75 74 87 68
Method 2   72 77 84 78 69 70 77 81 65 77 75

• Use the 0.05 level of significance to find (1 − α)100% confidence limits on μ₁ − μ₂
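Solution

                 ΣX     ΣX²      Mean   Variance   S.D.
A (Method 1)     792    57392    72     36.8       6.0663
B (Method 2)     825    62183    75     30.8       5.549775

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$

$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

$s_p = \sqrt{\frac{10 \times 6.0663^2 + 10 \times 5.5497^2}{11 + 11 - 2}} = 5.813$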


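Solution

With ν = 11 + 11 − 2 = 20 degrees of freedom, $t_{0.025,\,20} = 2.086$:

$(72 - 75) - 2.086 \times 5.813 \sqrt{\frac{1}{11} + \frac{1}{11}} < \mu_1 - \mu_2 < (72 - 75) + 2.086 \times 5.813 \sqrt{\frac{1}{11} + \frac{1}{11}}$

$-8.1705 < \mu_1 - \mu_2 < 2.1705$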
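The summary table and the pooled t interval can both be recomputed from the raw scores. This sketch uses `statistics.variance`, which divides by n − 1, and takes the critical value 2.086 (20 degrees of freedom) from the t table. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

method_1 = [71, 75, 65, 69, 73, 66, 69, 75, 74, 87, 68]
method_2 = [72, 77, 84, 78, 69, 70, 77, 81, 65, 77, 75]

n1, n2 = len(method_1), len(method_2)
x1, x2 = mean(method_1), mean(method_2)
var_1, var_2 = variance(method_1), variance(method_2)   # sample variances

# Pooled standard deviation, assuming equal population variances.
s_p = sqrt(((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2))

t_crit = 2.086   # tabulated t value for alpha/2 = 0.025 and 20 d.f.
margin = t_crit * s_p * sqrt(1 / n1 + 1 / n2)
lower, upper = (x1 - x2) - margin, (x1 - x2) + margin

print(sum(method_1), sum(v * v for v in method_1))   # 792 57392
print(sum(method_2), sum(v * v for v in method_2))   # 825 62183
print(round(var_1, 1), round(var_2, 1))              # 36.8 30.8
print(round(lower, 2), round(upper, 2))              # -8.17 2.17
```

The limits agree with the hand calculation to two decimals. Later digits differ slightly because the slide rounds s_p before multiplying.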

