DOM105 Session 3
Session 3
Reference: SfM Ch. 5
Probability distribution of a discrete variable
A discrete variable takes only distinct, separated values. These values need
not be integers, but they do not form a continuum.
The probability distribution of a discrete variable is a mutually exclusive
list of all possible numerical outcomes, along with the probability of each
outcome occurring.
E.g.: the number of absentees in an office:
No. of absentees (x)    Probability P(x)
0                       0.15
1                       0.35
2                       0.20
3                       0.15
4                       0.10
5                       0.05
Expected value of a discrete variable
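The expected value of a discrete variable is the weighted average of all
possible outcomes, the weights being the probabilities of those outcomes.
Mean: $\mu = E(X) = \sum_i x_i P(x_i)$
Variance: $\sigma^2 = \sum_i (x_i - \mu)^2 P(x_i)$
Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\sum_i (x_i - \mu)^2 P(x_i)}$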
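As a check on the absentee example, the short Python sketch below computes the mean as the probability-weighted sum of outcomes, the variance as the probability-weighted squared deviation from the mean, and the standard deviation as the square root of the variance. The variable names are illustrative and not part of the original notes.

```python
import math

# Absentee distribution from the example: outcome x -> probability P(x)
dist = {0: 0.15, 1: 0.35, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}

# A valid probability distribution must sum to 1
total_probability = sum(dist.values())

# Mean: probability-weighted average of the outcomes
mean = sum(x * p for x, p in dist.items())

# Variance: probability-weighted squared deviation from the mean
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(f"sum of probabilities = {total_probability:.2f}")
print(f"mean = {mean:.4f}")
print(f"variance = {variance:.4f}")
print(f"std dev = {std_dev:.4f}")
```

For this table the mean works out to 1.85 absentees and the variance to 1.9275.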
The Uniform Distribution
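Also called the rectangular distribution. The variable is equally likely to
fall anywhere in its range.
$P(X \le x) = (x - a)/(b - a)$ for $a \le x \le b$, where a is the minimum and b the maximum value of X.
Mean: $(a + b)/2$
Variance: $(b - a)^2/12$
Standard deviation: $(b - a)/\sqrt{12}$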
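The sketch below illustrates the cumulative formula for a variable that is uniform on [a, b], together with the standard results for a continuous uniform variable: mean (a+b)/2 and variance (b-a)^2/12. The endpoints a = 10 and b = 20 and the function name are assumptions made for illustration. They do not come from the notes.

```python
import math

def uniform_cdf(x, a, b):
    """P(X <= x) for a continuous uniform variable on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Illustrative endpoints (assumed, not from the notes)
a, b = 10.0, 20.0

mean = (a + b) / 2
variance = (b - a) ** 2 / 12
std_dev = math.sqrt(variance)

print(f"P(X <= 15) = {uniform_cdf(15, a, b):.2f}")
print(f"mean = {mean:.2f}, variance = {variance:.4f}, std dev = {std_dev:.4f}")
```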
The Binomial Distribution
A discrete probability distribution generated by a Bernoulli process,
which has the following properties:
- It is a series of trials; each trial has only two outcomes, with
  probabilities p and 1-p.
- The value of p stays fixed over the course of the process.
- The trials are statistically independent.
If there are n trials, the probability of obtaining exactly r successes
(r<=n) is given by the binomial formula (let q = 1-p):
$P(r) = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$
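A minimal Python sketch of the binomial formula, P(r) = [n!/(r!(n-r)!)] p^r q^(n-r), is given below. The values n = 10 and p = 0.5 and the function name are assumptions chosen for illustration. The final check confirms that the probabilities over r = 0..n add up to 1.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p
    # comb(n, r) is the number of ways to choose which r trials succeed
    return comb(n, r) * p**r * q**(n - r)

# Illustrative values (assumed, not from the notes)
n, p = 10, 0.5

prob_3 = binomial_pmf(3, n, p)
print(f"P(r = 3) = {prob_3:.4f}")

# Probabilities over every possible r should sum to 1
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))
print(f"sum over r = 0..n: {total:.4f}")
```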
Central Tendency of the Binomial Distribution
Mean: np
Variance: npq
Standard deviation: $\sqrt{npq}$
If a binomial process has a large number of trials (n>20) and a small
probability of success (p<0.05), the Poisson distribution is a good
approximation: substitute the binomial mean, λ = np, into the Poisson
formula $P(X = x) = \lambda^x e^{-\lambda}/x!$.
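To see how close the approximation is, the sketch below compares exact binomial probabilities with Poisson probabilities computed from λ = np, using the Poisson formula P(X = x) = λ^x e^(-λ)/x!. The values n = 100 and p = 0.02 are assumptions chosen to satisfy the rule of thumb above. They do not come from the notes.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """Exact binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def poisson_pmf(x, lam):
    """Poisson probability of x occurrences with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

# Illustrative values (assumed): n > 20 and p < 0.05
n, p = 100, 0.02
lam = n * p  # substitute the binomial mean np for lambda

for r in range(6):
    exact = binomial_pmf(r, n, p)
    approx = poisson_pmf(r, lam)
    print(f"r={r}: binomial={exact:.4f}  poisson={approx:.4f}")

# Largest absolute difference between the two over all r
max_gap = max(abs(binomial_pmf(r, n, p) - poisson_pmf(r, lam))
              for r in range(n + 1))
print(f"largest gap = {max_gap:.4f}")
```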
A store receives 5 customers an hour on average. If hourly customer
arrivals follow a Poisson distribution, what is the probability of receiving
10 or more customers in an hour?
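The notes pose this question without a worked answer. One way to evaluate it, sketched below in Python, is to use the complement: P(X >= 10) = 1 - P(X <= 9), with λ = 5 and the Poisson formula P(X = x) = λ^x e^(-λ)/x!. The function and variable names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability of exactly x arrivals with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

lam = 5  # average customers per hour, from the problem statement

# P(X <= 9): add up the probabilities of 0, 1, ..., 9 customers
p_at_most_9 = sum(poisson_pmf(x, lam) for x in range(10))

# Complement gives the probability of 10 or more customers
p_10_or_more = 1 - p_at_most_9

print(f"P(X <= 9)  = {p_at_most_9:.4f}")
print(f"P(X >= 10) = {p_10_or_more:.4f}")
```

Under these assumptions the probability comes out at roughly 0.032, or about a 3% chance.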
The Normal Distribution
It is a continuous probability distribution, also called the Gaussian
distribution.
It can be used to approximate discrete distributions when the sample is
sufficiently large. In particular, it can approximate the binomial
distribution if both np and nq exceed 5.
It is symmetrical and bell-shaped, and its interquartile range runs from
0.67 standard deviations below the mean to 0.67 standard deviations above it.
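The ±0.67 figure for the interquartile range can be checked with Python's standard library. The sketch below asks the standard normal distribution (mean 0, standard deviation 1) for its 25th and 75th percentiles. The variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist()

# Quartiles expressed in standard deviations from the mean
q1 = standard_normal.inv_cdf(0.25)
q3 = standard_normal.inv_cdf(0.75)

print(f"Q1 = {q1:.4f} standard deviations")
print(f"Q3 = {q3:.4f} standard deviations")
```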
Normal Distribution from Z-score
For a normal distribution, we can find the probability of the variable being
below a certain value by using the Z-table.
Calculate $Z = (X - \mu)/\sigma$; the corresponding value in the table is the
probability of the variable being less than or equal to X.
To find the probability of the variable being between X1 and X2:
P(X1<X<X2) = P(Z(X2)) - P(Z(X1))
P(X<X1) = P(Z(X1))
P(X>X1) = 1 - P(Z(X1))
Example: for a normal curve with mean 200 and stdev 50, what is P(X<=168)?
Z(168) = (168-200)/50 = -0.64, so P(Z(168)) = 0.2611.
What is P(X>300)? Z(300) = 2 and P(Z(300)) = 0.9772, so P(X>300) = 1-0.9772 = 0.0228.
What is P(168<=X<=240)? P(X<=240) - P(X<=168) = 0.7881-0.2611 = 0.5270.
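The three table lookups above can be reproduced with `statistics.NormalDist` from the Python standard library, whose `cdf` method returns P(X <= x). This is a sketch for checking the arithmetic, and the variable names are illustrative.

```python
from statistics import NormalDist

# Normal curve from the example: mean 200, standard deviation 50
dist = NormalDist(mu=200, sigma=50)

# Z-score for X = 168, computed as (X - mean) / stdev
z_168 = (168 - dist.mean) / dist.stdev

# P(X <= 168): cumulative probability up to 168
p_below_168 = dist.cdf(168)

# P(X > 300): complement of the cumulative probability
p_above_300 = 1 - dist.cdf(300)

# P(168 <= X <= 240): difference of two cumulative probabilities
p_between = dist.cdf(240) - dist.cdf(168)

print(f"Z(168) = {z_168:.2f}")
print(f"P(X <= 168) = {p_below_168:.4f}")
print(f"P(X > 300) = {p_above_300:.4f}")
print(f"P(168 <= X <= 240) = {p_between:.4f}")
```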
The annual household income of 300 surveyed families has a mean of 16 lakhs
with a stdev of 90,000 (0.9 lakh). How many families have an income between
10 and 15 lakhs?
n = 300, Mean = 16 lakhs, Stdev = 0.9 lakh
No. of families between 10 and 15 lakhs = n*P(10<=x<=15) = n*[P(x<=15) -
P(x<=10)]
P(x<=15) = P(Z(15)) = P((15-16)/0.9) = P(-1.11) = 0.1335
P(x<=10) = P(Z(10)) = P((10-16)/0.9) = P(-6.67) ≈ 0
No. of families = 300*(0.1335 - 0) = 40.05, or about 40.
How many families do you expect to have an income above 17.5 lakhs?
P(x>17.5) = 1 - P(x<=17.5)
P(x<=17.5) = P(Z(17.5)) = P((17.5-16)/0.9) = P(1.67) = 0.9525
No. of families = 300*(1 - 0.9525) = 14.25, or about 14.
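The household income example can be checked the same way. The sketch below works in lakhs (mean 16, standard deviation 0.9) and uses `statistics.NormalDist` instead of the Z-table. Because it does not round Z to two decimals, the intermediate figures differ slightly from the table-based ones, but the rounded family counts agree. The variable names are illustrative.

```python
from statistics import NormalDist

n_families = 300
# Incomes in lakhs: mean 16, standard deviation 0.9 (i.e. 90,000)
income = NormalDist(mu=16, sigma=0.9)

# Expected number of families with income between 10 and 15 lakhs
p_between = income.cdf(15) - income.cdf(10)
n_between = n_families * p_between

# Expected number of families with income above 17.5 lakhs
p_above = 1 - income.cdf(17.5)
n_above = n_families * p_above

print(f"between 10 and 15 lakhs: {n_between:.2f} families")
print(f"above 17.5 lakhs: {n_above:.2f} families")
```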
Excel commands for Normal Distribution
NORM.DIST(x, Mean, Stdev, 1): Returns the probability of a random variable
being less than or equal to x, for a given mean and stdev. The final
argument 1 (TRUE) requests the cumulative probability.
NORM.INV(p, Mean, Stdev): Finds the value x for which P(X<=x) = p, for a
given mean and stdev.
NORM.S.DIST(z, 1): Returns the cumulative probability for a given Z-score
(can act as a substitute for the Z-table).
NORM.S.INV(p): Returns the Z-score for a given cumulative probability (can
act as a substitute for the Z-table).
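For readers working outside Excel, the sketch below shows rough Python counterparts built on `statistics.NormalDist`. The wrapper names (`norm_dist`, `norm_inv`, `norm_s_dist`, `norm_s_inv`) are hypothetical helpers chosen to mirror the Excel names. They are not functions of any library, and they cover only the cumulative case.

```python
from statistics import NormalDist

def norm_dist(x, mean, stdev):
    """Cumulative P(X <= x), like NORM.DIST(x, mean, stdev, 1)."""
    return NormalDist(mean, stdev).cdf(x)

def norm_inv(p, mean, stdev):
    """Value x with P(X <= x) = p, like NORM.INV(p, mean, stdev)."""
    return NormalDist(mean, stdev).inv_cdf(p)

def norm_s_dist(z):
    """Cumulative probability for a Z-score, like NORM.S.DIST(z, 1)."""
    return NormalDist().cdf(z)

def norm_s_inv(p):
    """Z-score for a cumulative probability, like NORM.S.INV(p)."""
    return NormalDist().inv_cdf(p)

p = norm_dist(168, 200, 50)
print(f"norm_dist(168, 200, 50) = {p:.4f}")
print(f"norm_inv(p, 200, 50)    = {norm_inv(p, 200, 50):.2f}")
print(f"norm_s_dist(2)          = {norm_s_dist(2):.4f}")
print(f"norm_s_inv(0.5)         = {norm_s_inv(0.5):.4f}")
```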