S&P Lecture Notes 4 (Chapter 7)
Lecture
Sampling and Sampling Distribution of Sample Mean
and Sample Proportion
Sep. 20, 2021
7. Sampling and Sampling Distributions
Sampling is simply the process of learning about the
Examples 7.2:
Statistic:
➢ Characteristic or measure obtained from a sample.
Sampling:
➢ The process or method of sample selection from the population.
Sampling unit:
➢ The ultimate unit to be sampled or elements of the population to be sampled.
Examples 7.3:
If somebody studies the socio-economic status of households, the household is the
sampling unit.
If one studies performance of freshman students in some college, the student is the
sampling unit.
Sampling frame:
Sample size:
There are two types of errors (Sampling error and non-sampling error)
a) Sampling Error:
The error that arises because only a sample, rather than the whole population, is used
to estimate a population parameter. It is the discrepancy between the population value
and the sample value.
Sampling error is the difference between an estimate and the true value of the
parameter being evaluated.
Greater speed
Greater accuracy
Save time
➢ Universality
➢ Qualitativeness
➢ Detailedness
➢ Non-representativeness
7.3. Different types of Sampling (Probability vs Non-probability Sampling Techniques)
There are two types of sampling techniques.
These are
➢ Cluster sampling
➢ Systematic sampling
❖ Advantages of Probability Sampling
All elements in the population have the same pre-assigned nonzero
probability of being included in the sample.
Simple random sampling can be done either using the lottery method or
table of random numbers.
I. Lottery Method:
It is a very popular method of taking a random sample.
All items of the universe are numbered or named on separate slips of paper of identical size and
shape.
These slips are then folded and mixed up in a container or drum.
A blindfold selection is then made of the number of slips required to constitute the desired sample
size.
The selection of items thus depends entirely on chance.
For instance,
If we want to take a sample of 10 persons out of a population of 100, the procedure is to write the
names of the 100 persons on separate slips of paper, fold these slips, mix them thoroughly and then
make a blindfold selection of 10 slips.
It is very popular in lottery draws where a decision about prizes is to be made.
However, when using the lottery method, it is essential that the slips are of identical size, shape
and color; otherwise, personal prejudice and bias can easily affect the results.
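The lottery procedure above can be imitated in a few lines of Python. This is only a minimal sketch: the function name `lottery_sample`, the `person_001` style labels and the fixed seed are illustrative assumptions, and shuffling a list stands in for mixing the slips in a drum.

```python
import random


def lottery_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement (hypothetical helper)."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    slips = list(population)       # one "slip" per population element
    rng.shuffle(slips)             # mix the slips thoroughly
    return slips[:sample_size]     # blind draw of the first few slips


# 100 persons, as in the example above; the labels are made up.
persons = [f"person_{i:03d}" for i in range(1, 101)]
sample = lottery_sample(persons, 10, seed=42)
print(sample)
```

Because every ordering of the slips is equally likely after shuffling, each person has the same chance of landing among the first 10.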
II. Table of Random Numbers
A table of random numbers is a table of the digits 0, 1, 2, …, 9.
For convenience, the numbers are put in blocks.
In using these tables to select a simple random sample, the steps are:
Step 1: Number each element of the population. For example, for a population of size 500
we assign the numbers 001 to 500.
Step 3: Use only the required number of digits. Proceed in this fashion until the
required number of sample units has been selected.
Note: If sampling is without replacement, reject any number that comes up more
than once.
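As a rough illustration of selecting from a table of random digits, the sketch below reads a string of digits in consecutive groups as wide as the largest label (three digits for a population of 500). Reading consecutive groups from the start of the table is an assumption made here, as are the function name `sample_from_digit_table` and the pseudo-random "table". Numbers outside 001 to 500 and repeated numbers are rejected, which corresponds to sampling without replacement.

```python
import random


def sample_from_digit_table(digits, population_size, sample_size):
    """Pick distinct labels 1..population_size from a string of random digits."""
    width = len(str(population_size))          # digits per label, e.g. 3 for 500
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        number = int(digits[start:start + width])
        # Reject 000, labels beyond the population, and repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# A made-up "table": 3000 pseudo-random digits standing in for a printed table.
rng = random.Random(7)
table = "".join(str(rng.randrange(10)) for _ in range(3000))
selected = sample_from_digit_table(table, population_size=500, sample_size=10)
print(selected)

# Small hand-checkable case: 999 and 000 are out of range, the second 123 repeats.
small = sample_from_digit_table("123999123045000500", 500, 3)
print(small)
```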
2. Stratified Random Sampling:
The population is divided into non-overlapping but exhaustive groups called
strata.
Elements within the same stratum should be more or less homogeneous, while elements in
different strata should differ.
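A minimal sketch of drawing a stratified sample is given below. It assumes that a simple random sample is taken inside each stratum and that the sample is split across strata in proportion to stratum size; both choices, the function name `stratified_sample` and the "urban"/"rural" strata are illustrative assumptions rather than part of these notes.

```python
import random


def stratified_sample(strata, total_sample_size, seed=None):
    """Take a simple random sample inside every stratum (hypothetical helper)."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation (an assumption), with at least one unit per stratum.
        k = max(1, round(total_sample_size * len(units) / population_size))
        sample[name] = rng.sample(units, k)
    return sample


# Two made-up strata of 60 and 40 units.
strata = {
    "urban": [f"U{i}" for i in range(60)],
    "rural": [f"R{i}" for i in range(40)],
}
result = stratified_sample(strata, total_sample_size=10, seed=1)
print(result)
```

With these sizes the allocation is 6 urban and 4 rural units, so every stratum is represented in the sample.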
3. Cluster Sampling:
A simple random sample of groups, or clusters, of elements is chosen, and all the sampling
units in the selected clusters are surveyed.
Clusters are formed in a way that elements within a cluster are heterogeneous, i.e.
observations in each cluster should be more or less dissimilar.
Cluster sampling is useful when it is difficult or costly to generate a simple random sample.
For example, to estimate the average annual household income in a large city, we can use
cluster sampling. Simple random sampling would require a complete list of households in the
city from which to sample, and stratified random sampling would again require that list.
A less expensive way is to let each block within the city represent a cluster. A
sample of clusters could then be randomly selected, and every household within these
clusters could be interviewed to find the average annual household income.
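The city-block example can be sketched as follows. The block names and income figures are invented for illustration, the function name `cluster_sample` is hypothetical, and the plain average over all interviewed households is just one simple way to form the estimate.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and survey every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick clusters, not units
    surveyed = [unit for name in chosen for unit in clusters[name]]
    return chosen, surveyed


# Hypothetical city: block -> annual household incomes (in thousands).
city = {
    "block_A": [30, 42, 55],
    "block_B": [61, 48],
    "block_C": [39, 45, 52, 70],
    "block_D": [33, 58],
}
blocks, incomes = cluster_sample(city, 2, seed=3)
average_income = sum(incomes) / len(incomes)
print(blocks, average_income)
```

Only a list of blocks is needed to draw the sample; a full list of households is never required.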
4. Systematic Sampling:
A complete list of all elements within the population (sampling frame) is
required.
Then the technique is to take the kth item from the sampling frame.
Let
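A minimal sketch of systematic selection from a sampling frame follows. It assumes the common convention that the interval is k = N // n (frame size divided by sample size) and that the starting position is chosen at random among the first k items; these conventions and the function name `systematic_sample` are assumptions, since the notes only state that every kth item is taken.

```python
import random


def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th element of the frame after a random start."""
    rng = random.Random(seed)
    k = len(frame) // sample_size      # sampling interval (assumed k = N // n)
    start = rng.randrange(k)           # random start among the first k items
    return [frame[start + i * k] for i in range(sample_size)]


frame = list(range(1, 101))            # a made-up frame of 100 numbered units
sample = systematic_sample(frame, 10, seed=5)
print(sample)
```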
B) Non Random Sampling or Non-probability Sampling.
➢ Judgment sampling
➢ Convenience sampling
➢ Quota Sampling.
1. Judgment Sampling
In judgment sampling, the person taking the sample has direct or indirect
control over which items are selected for the sample.
2. Convenience Sampling
The decision maker selects a sample from the population in a manner that
is relatively easy and convenient.
3. Quota Sampling
❖ Sampling Distribution
\[ \mu = \frac{2 + 6 + 4 + 8}{4} = 5 \]
And the population standard deviation is:
\[ \sigma = \sqrt{\frac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = \sqrt{5} \approx 2.236 \]
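\[ \mu_{\bar{x}} = \frac{\sum_{i=1}^{n} \bar{x}_i f_i}{\sum_{i=1}^{n} f_i} = \frac{2 \cdot 1 + 3 \cdot 2 + \cdots + 8 \cdot 1}{16} = \frac{80}{16} = 5 = \text{population mean } (\mu) \]
\[ \Rightarrow \mu_{\bar{x}} = \mu \]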
- The standard deviation of sample means, denoted by $\sigma_{\bar{x}}$, is
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236}{\sqrt{2}} \approx 1.581 \]
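These two results can be checked by brute force. The sketch below lists all 16 samples of size 2 drawn with replacement from the population 2, 6, 4, 8 and compares the mean and standard deviation of the sample means with the population values; the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 6, 4, 8]
n = 2                                   # sample size

mu = mean(population)                   # population mean
sigma = pstdev(population)              # population standard deviation

# Every ordered sample of size n drawn with replacement: 4**2 = 16 samples.
sample_means = [mean(s) for s in product(population, repeat=n)]

mu_xbar = mean(sample_means)            # mean of the sample means
sigma_xbar = pstdev(sample_means)       # standard deviation of the sample means

print(len(sample_means), mu, mu_xbar)
print(round(sigma, 3), round(sigma_xbar, 3), round(sigma / sqrt(n), 3))
```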
Example 7.4
In summary, if all possible samples of size n are taken with replacement from the
same population, the mean of the sample means, denoted by $\mu_{\bar{x}}$, equals the
population mean $\mu$; and the standard deviation of the sample means, denoted
by $\sigma_{\bar{x}}$, equals $\sigma/\sqrt{n}$. The standard deviation of the sample means is called the
standard error of the mean.
- A third property of the sampling distribution of sample means pertains to the
shape of the distribution and is explained by the central limit theorem.
The Central Limit Theorem
As the sample size n increases without limit, the shape of the distribution of the
sample means taken with replacement from a population with mean $\mu$ and
standard deviation $\sigma$ will approach a normal distribution. As previously shown,
this distribution will have a mean $\mu$ and a standard deviation $\sigma/\sqrt{n}$.
- If the sample size is sufficiently large, the central limit theorem can be used
to answer questions about sample means in the same manner that a normal
distribution can be used to answer questions about individual values. The only
difference here is that a new formula must be used for the z values. It is:
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \]
- If a large number of samples of a given size are selected from a normally
distributed population, or if a large number of samples of a given size that is
greater than or equal to 30 are selected from a population that is not normally
distributed, and the sample means are computed, then the distribution of
sample means will look like the normal distribution.
- It’s important to remember two things when you use the central
limit theorem:
1. When the original variable is normally distributed, the
distribution of the sample means will be normally distributed, for
any sample size n.
2. When the distribution of the original variable is not normal, a
sample size of 30 or more is needed to use a normal distribution
to approximate the distribution of the sample means. The larger
the sample, the better the approximation will be.
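The second point can be illustrated with a small simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (chosen here only because it is clearly not normal), draws 5000 samples of size 30, and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma/\sqrt{n}$. The seed and the number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)                  # fixed seed so the run is repeatable
n = 30                                  # sample size
num_samples = 5000                      # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1 (assumed).
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

observed_mean = mean(sample_means)
observed_sd = pstdev(sample_means)
theoretical_sd = 1.0 / sqrt(n)          # sigma / sqrt(n)

print(round(observed_mean, 3), round(observed_sd, 3), round(theoretical_sd, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual values are strongly skewed.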
- The following examples show you how the standard normal
distribution can be used to answer questions about sample
means.
Example 7.5
A. C. Nielsen, a research group, reported that children between the ages of
2 and 5 watch an average of 25 hours of television per week. Assume the
variable is normally distributed and the standard deviation is 3 hours. If 20
children between the ages of 2 and 5 are randomly selected, find the
probability that the mean of the number of hours they watch television
will be greater than 26.3 hours.
Solution
Since the variable is approximately normally distributed, the distribution of
sample means will be approximately normal, with a mean of 25. The
standard deviation of the sample means is
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} \approx 0.671 \]
Step 1: Draw a normal curve and shade the desired area. The distribution of
the means is shown in the Figure below, with the appropriate area shaded.
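Step 2: Convert the sample mean to a z value.
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{26.3 - 25}{3/\sqrt{20}} = \frac{1.3}{0.671} = 1.94 \]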
Step 3: Find the corresponding area for the z value. The area to the
right of 1.94 is 1.000 - 0.9738 = 0.0262 or 2.62%
Step 4: Conclusion
One can conclude that the probability of obtaining a sample mean
larger than 26.3 hours is 2.62% [that is, $P(\bar{x} > 26.3) = 0.0262$].
Specifically, the probability that the 20 selected children between the
ages of 2 and 5 watch a mean of more than 26.3 hours of television per week is
2.62%.
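The steps of Example 7.5 can also be reproduced in code. This is a minimal sketch using Python's standard `statistics.NormalDist` in place of the z table; the helper name `prob_sample_mean_greater` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_greater(x_bar, mu, sigma, n):
    """Return (z, P(sample mean > x_bar)) for a normal sampling distribution."""
    standard_error = sigma / sqrt(n)            # sigma / sqrt(n)
    z = (x_bar - mu) / standard_error           # z value for the sample mean
    return z, 1.0 - NormalDist().cdf(z)         # area to the right of z


# Example 7.5: mu = 25, sigma = 3, n = 20, sample mean of interest 26.3.
z, p = prob_sample_mean_greater(26.3, 25, 3, 20)
print(round(z, 2), round(p, 4))
```

Because the code does not round z to two decimals before looking up the area, the probability may differ slightly in the fourth decimal place from the table-based value of 0.0262.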
Solution
Let 𝑥 be the amount of uric acid in normal adult males
𝜇 = 5.7, 𝜎 = 1 and n = 9
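\[ \Rightarrow \bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \sim N\left(5.7, \frac{1}{9}\right) \]
\[ \Rightarrow Z = \frac{\bar{x} - \mu}{sd} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]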
Then,
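i. $P(\bar{x} > 6) = ?$
\[ P(\bar{x} > 6) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} > \frac{6 - 5.7}{1/\sqrt{9}}\right) = P\left(z > \frac{0.3}{0.3333}\right) = P(z > 0.9) = 1 - 0.8159 = 0.1841 \]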
𝟎. 𝟎𝟔𝟔𝟖
End
Of
Chapter 7