Probability Distributions.
Parajuli
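Random Variable
Tossing two coins gives four equally likely outcomes: TT, HT, TH, HH. Let X be the number of heads.
Probability Distribution
X Value   Outcomes   Probability
0         TT         1/4 = 0.25
1         HT, TH     2/4 = 0.50
2         HH         1/4 = 0.25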
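The two-coin distribution can be reproduced by enumerating the sample space. The following is a minimal sketch, assuming the coins are fair and that X counts the number of heads; the names `outcomes` and `distribution` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the sample space for two tosses of a coin (assumed fair).
outcomes = ["".join(toss) for toss in product("HT", repeat=2)]

# X = number of heads in each outcome.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Every outcome is equally likely, so P(X = x) = (outcomes with x heads) / 4.
distribution = {
    x: Fraction(count, len(outcomes))
    for x, count in sorted(head_counts.items())
}

for x, p in distribution.items():
    print(x, p, float(p))
```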
Continuous Random Variables
• A continuous random variable can take infinitely many possible
values, usually within a given range. Typically, these are
measurements such as weight, height, the temperature of a
solution, or the time needed to finish a task.
• For example, the life of an individual in a community is a
continuous random variable. Let’s say that the maximum lifespan
of an individual in a community is 110 years.
• A person can then die immediately after birth (where life =
0 years), on reaching the age of 110 years, or at any age within
this range. Therefore, the variable ‘Age’ can take any value
between 0 and 110.
• If a discrete random variable X takes k different
values, and the probability that X = xi is
P(X = xi) = pi, then the probabilities must satisfy the
following:
• 0 ≤ pi ≤ 1 (for each i)
• p1 + p2 + p3 + … + pk = 1
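The two conditions can be checked mechanically for any list of probabilities. This is only a sketch: the function name `is_valid_pmf` and the tolerance are assumptions made for illustration, and the first test list is the two-coin distribution from the Random Variable slide.

```python
import math

def is_valid_pmf(probabilities, tolerance=1e-9):
    """Return True if every p_i lies in [0, 1] and the p_i sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tolerance)
    return in_range and sums_to_one

print(is_valid_pmf([0.25, 0.50, 0.25]))  # two-coin distribution
print(is_valid_pmf([0.30, 0.50, 0.25]))  # probabilities sum to 1.05
```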
Mathematical Expectation:
The mathematical expectation, also called the expected value, of a random
variable is the weighted arithmetic mean of the variable; the weights are
the respective probabilities of the values that the random variable can
assume.
Thus, if X is a discrete random variable that can take the values x1, x2,
…, xn with respective probabilities p1, p2, …, pn, where
$\sum p_i = p_1 + p_2 + p_3 + \dots + p_n = 1$, then the
mathematical expectation of X, denoted by E(X), is defined as
$E(X) = p_1 x_1 + p_2 x_2 + \dots + p_n x_n = \sum p_i x_i$
More precisely, if X is a random variable with probability distribution {x, p(x)},
then $E(X) = \sum x \cdot p(x)$
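As an illustration of the definition, the sketch below computes E(X) for the two-coin distribution (X = 0, 1, 2 with probabilities 0.25, 0.50, 0.25). The helper name `expected_value` is hypothetical.

```python
def expected_value(distribution):
    """E(X) = sum of x * p(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in distribution.items())

two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
print(expected_value(two_coins))  # 0(0.25) + 1(0.50) + 2(0.25) = 1.0
```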
Discrete Probability Distributions and Continuous Probability Distributions
(i) If $p = q = \frac{1}{2}$, the binomial distribution is symmetrical.
(ii) If $p > \frac{1}{2}$, the binomial distribution is negatively skewed.
(iii) If $p < \frac{1}{2}$, it is positively skewed.
Fitting of Binomial Distribution:
If a random experiment consisting of n trials is repeated N times, satisfying
the conditions of the binomial distribution, then the expected frequency of r successes is
given by $f(r) = N \cdot P(r) = N \cdot {}^{n}C_{r} \, p^{r} q^{n-r}$
Putting r = 0, 1, 2, …, n, we get the expected or theoretical
frequencies of the binomial distribution.
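Expected frequencies with N = 150, n = 4, $p = \frac{1}{3}$ and $q = \frac{2}{3}$:
$E = N \cdot P(r) = N \cdot {}^{n}C_{r} \, p^{r} q^{n-r} = 150 \cdot {}^{4}C_{r} \left(\frac{1}{3}\right)^{r} \left(\frac{2}{3}\right)^{4-r}$
r    E
0    $150 \cdot {}^{4}C_{0} \left(\frac{1}{3}\right)^{0} \left(\frac{2}{3}\right)^{4-0} = 29.63 \approx 30$
1    $150 \cdot {}^{4}C_{1} \left(\frac{1}{3}\right)^{1} \left(\frac{2}{3}\right)^{4-1} = 59.26 \approx 59$
2    $150 \cdot {}^{4}C_{2} \left(\frac{1}{3}\right)^{2} \left(\frac{2}{3}\right)^{4-2} = 44.44 \approx 44$
3    $150 \cdot {}^{4}C_{3} \left(\frac{1}{3}\right)^{3} \left(\frac{2}{3}\right)^{4-3} = 14.81 \approx 15$
4    $150 \cdot {}^{4}C_{4} \left(\frac{1}{3}\right)^{4} \left(\frac{2}{3}\right)^{4-4} = 1.85 \approx 2$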
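The expected frequencies in the table can be recomputed directly from the fitting formula. The sketch below assumes the values read off the table (N = 150, n = 4, p = 1/3); the function name is illustrative.

```python
from math import comb

def binomial_expected_frequencies(N, n, p):
    """Expected frequency N * C(n, r) * p**r * q**(n - r) for r = 0..n."""
    q = 1 - p
    return [N * comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

frequencies = binomial_expected_frequencies(N=150, n=4, p=1 / 3)
for r, f in enumerate(frequencies):
    # Show the unrounded and the rounded expected frequency for each r.
    print(r, round(f, 2), round(f))
```

The unrounded frequencies add up to N, which is a quick sanity check on any fitted distribution.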
The Poisson probability function is
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
where:
x = number of events (number of successes) in an area of
opportunity = 0, 1, 2, 3, …
λ = mean of the distribution, or average number of
occurrences
e = base of the natural logarithm system (2.71828...)
Mean, μ = λ
Variance, σ² = λ
Standard deviation, σ = √λ
Example: The quality control manager of Marlyin’s Cookies is
inspecting a batch of chocolate-chip cookies that has just been baked. If
the production process is in control, the average number of chip
parts per cookie is 6. What is the probability that in any
particular cookie being inspected:
i. Fewer than five chip parts will be found?
ii. Exactly five chip parts will be found?
iii. Five or more chip parts will be found?
iv. Four or five chip parts will be found?
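Solution:
Here λ = 6, and we have $P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-6} 6^{x}}{x!}$
(i) P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
$= \frac{e^{-6} 6^{0}}{0!} + \frac{e^{-6} 6^{1}}{1!} + \frac{e^{-6} 6^{2}}{2!} + \frac{e^{-6} 6^{3}}{3!} + \frac{e^{-6} 6^{4}}{4!}$
$= e^{-6} \left[ \frac{6^{0}}{0!} + \frac{6^{1}}{1!} + \frac{6^{2}}{2!} + \frac{6^{3}}{3!} + \frac{6^{4}}{4!} \right]$
= 0.002479 [1 + 6 + 18 + 36 + 54] = 0.002479 × 115 = 0.2851
(ii) $P(X = 5) = \frac{e^{-6} 6^{5}}{5!}$ = 0.002479 × 64.8 = 0.1606
(iii) P(X ≥ 5) = 1 – P(X < 5) = 1 – 0.2851 = 0.7149 ≈ 0.715
(iv) P(X = 4 or 5) = P(X = 4) + P(X = 5) = 0.1339 + 0.1606 = 0.2945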
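The four probabilities in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; `poisson_pmf` and the variable names are illustrative, not part of the original solution.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e**(-lam) * lam**x / x! for a Poisson variable."""
    return exp(-lam) * lam**x / factorial(x)

lam = 6  # average number of chip parts per cookie

p_fewer_than_5 = sum(poisson_pmf(x, lam) for x in range(5))  # (i)
p_exactly_5 = poisson_pmf(5, lam)                            # (ii)
p_5_or_more = 1 - p_fewer_than_5                             # (iii)
p_4_or_5 = poisson_pmf(4, lam) + poisson_pmf(5, lam)         # (iv)

for label, p in [("(i)", p_fewer_than_5), ("(ii)", p_exactly_5),
                 ("(iii)", p_5_or_more), ("(iv)", p_4_or_5)]:
    print(label, round(p, 4))
```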
Fitting of Poisson Distribution:
• If N is the total frequency, then the expected or
theoretical frequencies of the Poisson distribution are given by
$f(x) = N \cdot p(x) = N \cdot P(X = x) = N \cdot \frac{e^{-\lambda} \lambda^{x}}{x!}$
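Expected frequencies with N = 100 and λ = 0.95:
x    Expected frequency
0    $100 \times \frac{e^{-0.95} (0.95)^{0}}{0!} \approx 39$
1    $100 \times \frac{e^{-0.95} (0.95)^{1}}{1!} \approx 37$
2    $100 \times \frac{e^{-0.95} (0.95)^{2}}{2!} \approx 17$
3    $100 \times \frac{e^{-0.95} (0.95)^{3}}{3!} \approx 6$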
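The fitted frequencies can be regenerated from the formula. This sketch assumes N = 100 and λ = 0.95, as read from the expressions in the table; the observed data that produced this mean is not shown in the slides and is not reconstructed here.

```python
from math import exp, factorial

def poisson_expected_frequencies(N, lam, max_x):
    """Expected frequency N * e**(-lam) * lam**x / x! for x = 0..max_x."""
    return [N * exp(-lam) * lam**x / factorial(x) for x in range(max_x + 1)]

expected = poisson_expected_frequencies(N=100, lam=0.95, max_x=3)
for x, f in enumerate(expected):
    # Show the unrounded and the rounded expected frequency for each x.
    print(x, round(f, 2), round(f))
```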
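Standard Normal Distribution
[Figure: normal curve with an X scale (μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ) and the matching Z scale (−3, −2, −1, 0, 1, 2, 3). The area within μ ± σ is 68.26%, within μ ± 2σ is 95.44%, and within μ ± 3σ is 99.74%.]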
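The three areas quoted for the standard normal curve can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8+). This is a sketch only; the exact values differ from the table-based percentages in the last digit because the tables are rounded to four decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
areas = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.2%}")
```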
Area Under Normal Curve
[Figure: area under the normal curve between X = a and X = b.]
• When the value of Z is known
The Cumulative Standardized Normal table in the textbook
(Appendix table ) gives the probability less than a desired value
of Z (i.e., from negative infinity to Z).
The row shows the value of Z to the first decimal point, and the column gives
the value of Z to the second decimal point. The value within the table
gives the probability from Z = −∞ up to the desired Z value.
Z     0.00     0.01     0.02     …
0.0
0.1
.
.
2.0   .9772
P(Z < 2.00) = 0.9772
If the table gives the area from 0 to Z instead, the entry at Z = 2.0 is .4772, and
P(Z < 2.00) = 0.5 + 0.4772 = 0.9772
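Both table conventions can be reproduced in code. The sketch below uses `statistics.NormalDist`, whose `cdf` method returns the cumulative area from negative infinity to z; subtracting 0.5 gives the area from 0 to z.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

cumulative = Z.cdf(2.00)      # area from -infinity up to Z = 2.00
from_zero = cumulative - 0.5  # area from Z = 0 up to Z = 2.00

print(round(cumulative, 4), round(from_zero, 4))
```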
• When the scores (X values) are known/given
To find P(a < X < b) when X is distributed normally:
• Draw the normal curve for the problem in terms of X
• Translate X-values to Z-values
• Use the Standardized Normal Table
Suppose X is normal with mean 8.0 and standard deviation 5.0.
(a) Find P(8 < X < 8.6)
We have $Z = \frac{X - \mu}{\sigma}$
When X = 8, $Z = \frac{8 - 8}{5} = 0$
When X = 8.6, $Z = \frac{8.6 - 8}{5} = 0.12$
P(8 < X < 8.6) = P(0 < Z < 0.12) = 0.0478
OR = P(Z < 0.12) – P(Z ≤ 0) = 0.5478 – 0.5000 = 0.0478
(b) P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12) = 1.0 - 0.5478 = 0.4522
OR = 0.5 – P(0 < Z < 0.12) = 0.5 – 0.0478 = 0.4522
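Parts (a) and (b) can be verified without a table. This is a sketch assuming Python 3.8+ for `statistics.NormalDist`; the library standardizes internally, so the X values are used directly.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)

# (a) P(8 < X < 8.6) as a difference of cumulative probabilities.
p_between = X.cdf(8.6) - X.cdf(8.0)

# (b) P(X > 8.6) as the complement of the cumulative probability.
p_above = 1 - X.cdf(8.6)

print(round(p_between, 4), round(p_above, 4))
```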
Example: The mean inside diameter of 500 washers produced by a machine is
5.02 mm and the standard deviation is 0.05 mm. The purpose for which these
washers are intended allows a tolerance in the diameter of 4.96 to 5.08
mm; otherwise the washers are considered defective. Determine the percentage
of defective washers produced by the machine, assuming the diameters are
normally distributed.
Solution:
N = 500, μ = 5.02 mm, σ = 0.05 mm
We have $Z = \frac{X - \mu}{\sigma}$
When X = 4.96, $Z = \frac{X - \mu}{\sigma} = \frac{4.96 - 5.02}{0.05} = -1.2$
Again, when X = 5.08, $Z = \frac{X - \mu}{\sigma} = \frac{5.08 - 5.02}{0.05} = 1.2$
Now, P(4.96 < X < 5.08) = P(−1.2 < Z < 1.2)
= P(−1.2 < Z < 0) + P(0 < Z < 1.2) = 2 · P(0 < Z < 1.2) [by symmetry]
= 0.3849 + 0.3849 = 0.7698
The probability of a non-defective washer = 0.7698
The probability of a defective washer = 1 – 0.7698 = 0.2302
The percentage of defective washers = 0.2302 × 100 = 23.02%
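The washer calculation can be cross-checked as follows. This is a minimal sketch under the example's assumption of normally distributed diameters; the result agrees with the table-based answer to about two decimal places of a percent.

```python
from statistics import NormalDist

diameter = NormalDist(mu=5.02, sigma=0.05)  # diameters in mm

# Probability that a washer falls inside the tolerance 4.96 to 5.08 mm.
p_within = diameter.cdf(5.08) - diameter.cdf(4.96)

# Anything outside the tolerance counts as defective.
p_defective = 1 - p_within

print(f"defective: {p_defective:.2%}")
```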
Example: A banker claimed that the life of a regular savings account opened in his
bank averages 18 months with a standard deviation of 6.45 months. What is the
probability that:
i. There will still be money in a depositor's savings account between 20 and 22
months?
ii. The account will be closed (no money in the deposit) after 2 years?
Solution: With the usual notation, μ = 18 months, σ = 6.45 months
(i) The probability that there will still be money in the savings account between 20 and
22 months is P(20 < X < 22). We have $Z = \frac{X - \mu}{\sigma}$
When X = 20, $Z_1 = \frac{20 - 18}{6.45} = 0.31$. Again, when X = 22, $Z_2 = \frac{22 - 18}{6.45} = 0.62$
P(20 < X < 22) = P(0.31 < Z < 0.62) = P(0 < Z < 0.62) – P(0 < Z < 0.31)
= 0.2324 – 0.1217 = 0.1107
(ii) The probability that the account will be closed (no money in the deposit) after two
years (24 months) is P(X > 24). When X = 24, $Z = \frac{24 - 18}{6.45} = 0.93$
P(X > 24) = P(Z > 0.93) = P(0 < Z < ∞) – P(0 < Z < 0.93) = 0.5 – 0.3238 = 0.1762
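Both parts of the savings-account example can be checked numerically. The sketch below does not round Z to two decimals the way a table lookup does, so its results may differ slightly from the table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

account_life = NormalDist(mu=18, sigma=6.45)  # life of an account, in months

# (i) P(20 < X < 22)
p_20_to_22 = account_life.cdf(22) - account_life.cdf(20)

# (ii) P(X > 24), i.e. the account lasts beyond two years
p_over_24 = 1 - account_life.cdf(24)

print(round(p_20_to_22, 4), round(p_over_24, 4))
```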
Finding the scores (values) when the area (probability) is known
(i) X = μ + Zσ = 50 + (−1.13)(10) = 38.7
Similarly,
(ii) X = μ + Zσ = 50 + (1.75)(10) = 67.5
Problem: The marks of 500 candidates in an examination are
normally distributed with a mean of 45 and standard deviation of 20
marks.
a. Given that the pass mark is 40, estimate the number of
candidates who passed the examination.
b. If 5% of the candidates obtained a distinction by scoring x marks
or more, estimate the value of x.
c. If 400 candidates are to be passed, what should be the lowest mark
for passing?
Solution:
Let the random variable X denote the marks obtained by the
candidates. Then X follows a normal distribution with mean μ and
standard deviation σ. Given, N = 500, μ = 45, and σ = 20
(a) We have $Z = \frac{X - \mu}{\sigma}$. When X = 40, $Z = \frac{40 - 45}{20} = -0.25$
If the pass mark is 40, the required probability is the area under the normal curve above X = 40,
which can be written as
P(40 ≤ X < ∞) = P(−0.25 ≤ Z < ∞) = P(−0.25 ≤ Z ≤ 0) + P(0 ≤ Z < ∞)
= 0.0987 + 0.5 = 0.5987
The expected number of candidates who passed the examination
= 500 × 0.5987 = 299.35 ≈ 299
(b) Let x1 denote the lowest mark among the top 5% of candidates, who scored distinction
marks. Then P(X ≥ x1) = 0.05
Let $z_1 = \frac{x_1 - \mu}{\sigma} = \frac{x_1 - 45}{20}$ ………………..(i)
Then P(Z ≥ z1) = 0.05
P(0 ≤ Z < ∞) – P(0 ≤ Z ≤ z1) = 0.05
0.5 – P(0 ≤ Z ≤ z1) = 0.05
P(0 ≤ Z ≤ z1) = 0.45
[Figure: normal curve with area 0.45 between 0 and z1 and area 0.05 to the right of z1; x1 = ?]
From the normal table, the value of z corresponding to the probability 0.45 is 1.645,
i.e. z1 = 1.645
Hence from (i), x1 = μ + z1σ = 45 + 1.645 × 20 = 45 + 32.9 = 77.9
Hence, the lowest score of the top 5% is 77.9 marks.
(c) The percentage of candidates to be passed = $\frac{400}{500} \times 100\% = 80\%$,
i.e. 20% of the candidates are to be failed. Let x2 denote the highest score of the
lowest 20% of the candidates.
Then, P(X ≤ x2) = 20% = 0.20
Let $z_2 = \frac{x_2 - \mu}{\sigma} = \frac{x_2 - 45}{20}$ ………………..(ii)
Then P(Z ≤ z2) = 0.20, so z2 is negative.
P(−∞ < Z ≤ 0) – P(z2 ≤ Z ≤ 0) = 0.20
0.5 – P(z2 ≤ Z ≤ 0) = 0.20
P(z2 ≤ Z ≤ 0) = 0.30
[Figure: normal curve with area 0.20 to the left of z2 and area 0.30 between z2 and 0; x2 = ?]
From the normal table, the area closest to 0.30 is 0.2995, at z = 0.84. Since z2 lies below the mean,
z2 = −0.84
From (ii), x2 = μ + z2σ = 45 + (−0.84) × 20 = 45 − 16.8 = 28.2
Hence, if 400 candidates are to be passed out of 500, the lowest mark for passing
= 28.2 marks.
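All three parts of this problem can be cross-checked in code. This is a sketch assuming Python 3.8+; `inv_cdf` is the inverse of the cumulative distribution, so it plays the role of reading the normal table backwards. Because it uses unrounded z-values, the results should land close to, but not exactly on, the table-based answers.

```python
from statistics import NormalDist

marks = NormalDist(mu=45, sigma=20)
candidates = 500

# (a) Expected number scoring at least the pass mark of 40.
expected_passes = candidates * (1 - marks.cdf(40))

# (b) Mark exceeded by only the top 5% of candidates.
distinction_mark = marks.inv_cdf(1 - 0.05)

# (c) Pass mark that lets 400 of the 500 candidates through,
#     i.e. the mark below which 20% of candidates fall.
pass_mark_for_400 = marks.inv_cdf(1 - 400 / candidates)

print(round(expected_passes, 2), round(distinction_mark, 1), round(pass_mark_for_400, 1))
```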
Example: In a certain examination, 20% of the students who appeared in the
Statistics paper got less than 30 marks and 97% of the students got less than 62
marks. Assuming the distribution to be normal, find the mean and standard
deviation of the distribution.
Solution: Let X be the random variable of marks obtained by the students, which is
normally distributed.
According to the question, we have
P(X < 30) = 0.20 ………..(i)
P(X < 62) = 0.97 ……….(ii)
The standard normal variate is $Z = \frac{X - \mu}{\sigma}$
When X = 30, $Z = \frac{30 - \mu}{\sigma} = -z_1$ …………(iii)
When X = 62, $Z = \frac{62 - \mu}{\sigma} = z_2$ …………(iv)
Now, P(X < 30) = 0.20 or P(Z < −z1) = 0.20
Or, P(−∞ < Z < 0) – P(−z1 < Z < 0) = 0.20
Or, 0.5 – P(0 < Z < z1) = 0.20
Or, P(0 < Z < z1) = 0.30
From the normal table, the area closest to 0.30 is 0.2995, at Z = 0.84,
so z1 = 0.84 and −z1 = −0.84
From equation (iii) we get $\frac{30 - \mu}{\sigma} = -0.84$ or, 30 − μ = −0.84σ ………….(v)
Again, P(X < 62) = 0.97 or P(Z < z2) = 0.97
Or, P(−∞ < Z < 0) + P(0 < Z < z2) = 0.97
Or, 0.5 + P(0 < Z < z2) = 0.97
Or, P(0 < Z < z2) = 0.47
From the normal table, the area closest to 0.47 is 0.4699, at Z = 1.88, so z2 = 1.88
From equation (iv) we get $\frac{62 - \mu}{\sigma} = 1.88$ or, 62 − μ = 1.88σ ………………..(vi)
Subtracting (v) from (vi), we have
(62 − μ) − (30 − μ) = 1.88σ − (−0.84σ)
32 = 2.72σ or, σ = 32/2.72 = 11.76
Substituting the value of σ in equation (v), we get 30 − μ = −0.84 × 11.76
or μ = 30 + 9.88 = 39.88
Hence mean (μ) = 39.88 and standard deviation (σ) = 11.76
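The same pair of simultaneous equations can be solved in code. This sketch obtains the two z-values from `NormalDist.inv_cdf` instead of the table, so σ and μ differ slightly from the hand calculation in the second decimal place; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# z-values cutting off 20% and 97% of the area to their left.
z_low = standard_normal.inv_cdf(0.20)   # about -0.84
z_high = standard_normal.inv_cdf(0.97)  # about 1.88

# 30 = mu + z_low * sigma and 62 = mu + z_high * sigma.
# Subtracting the first equation from the second eliminates mu.
sigma = (62 - 30) / (z_high - z_low)
mu = 30 - z_low * sigma

print(round(mu, 2), round(sigma, 2))
```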