
DOM105 Session 3


DOM105 2019

Session 3
Reference: SfM Ch. 5
Probability distribution of a discrete variable
A discrete variable is a variable that takes only distinct, separate values. These
values need not be integers, but they do not form a continuum.
The probability distribution of a discrete variable is a mutually exclusive list of
all possible numerical outcomes, along with the probability of each outcome occurring.
E.g., the number of possible absentees in an office:
No. of absentees (x)    Probability P(x)
0                       0.15
1                       0.35
2                       0.20
3                       0.15
4                       0.10
5                       0.05
Expected value of a discrete variable
The expected value of a discrete variable is the weighted average of all the
outcomes, the weights being the probabilities of those outcomes.

$\mu = E(X) = \sum_{i} x_i \, P(x_i)$

In the previous example,

$\mu = 0.15(0) + 0.35(1) + 0.2(2) + 0.15(3) + 0.1(4) + 0.05(5) = 1.85$
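The arithmetic above can be checked with a short Python sketch. The variable names (`outcomes`, `probs`, `expected_value`) are illustrative choices, not part of the slides.

```python
import math

# Absentee distribution from the table: outcomes and their probabilities.
outcomes = [0, 1, 2, 3, 4, 5]
probs = [0.15, 0.35, 0.20, 0.15, 0.10, 0.05]

# A valid probability distribution must sum to 1.
assert math.isclose(sum(probs), 1.0)

# Expected value: each outcome weighted by its probability.
expected_value = sum(x * p for x, p in zip(outcomes, probs))
print(round(expected_value, 2))  # 1.85
```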
Variance and standard deviation of a discrete variable
The variance of a discrete variable is the sum of the squared differences
between each outcome and the expected value, each multiplied by the probability
of that outcome.

$\sigma^2 = \sum_{i} (x_i - \mu)^2 \, P(x_i)$

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i} (x_i - \mu)^2 \, P(x_i)}$
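A minimal Python sketch applying the variance definition to the absentee table from the earlier example. The variable names are illustrative, and the numerical result is computed here rather than taken from the slides.

```python
import math

# Absentee distribution from the earlier example.
outcomes = [0, 1, 2, 3, 4, 5]
probs = [0.15, 0.35, 0.20, 0.15, 0.10, 0.05]

# Expected value (weighted average of the outcomes).
mu = sum(x * p for x, p in zip(outcomes, probs))

# Variance: probability-weighted squared deviations from the mean.
variance = sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 4), round(std_dev, 4))  # 1.9275 1.3883
```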
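The Uniform Distribution
Also called the rectangular distribution. Every value in its range has the same
chance of occurrence.
$P(X \le x) = (x - a)/(b - a)$ for $a \le x \le b$, where $b$ is max(x) and $a$ is min(x)
Mean: $\mu = (a + b)/2$
Variance: $\sigma^2 = (b - a)^2/12$
Standard Deviation: $\sigma = \sqrt{(b - a)^2/12}$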
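The sketch below evaluates the cumulative probability (x-a)/(b-a) together with the standard mean, variance, and standard deviation of a continuous uniform distribution: (a+b)/2, (b-a)^2/12, and its square root. The range a=2, b=10 and the function names are hypothetical, chosen only for illustration.

```python
import math

def uniform_cdf(x, a, b):
    """P(X <= x) for a continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_stats(a, b):
    """Return (mean, variance, standard deviation) for uniform on [a, b]."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, variance, math.sqrt(variance)

a, b = 2.0, 10.0  # hypothetical range, not from the slides
mean, variance, std_dev = uniform_stats(a, b)

print(uniform_cdf(4.0, a, b))  # 0.25
print(mean, round(variance, 4), round(std_dev, 4))
```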
The Binomial Distribution
A discrete probability distribution generated by a Bernoulli process,
which has the following properties:
It is a series of trials, and each trial has only two outcomes, with probabilities
p and 1-p
The value of p stays fixed over the course of the process
The trials are statistically independent
If there are n trials, the probability of obtaining exactly r successes (r<=n) is
given by the binomial formula (let q = 1-p):

$P(r) = \binom{n}{r} p^r q^{n-r} = \frac{n!}{r!\,(n-r)!} \, p^r q^{n-r}$
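A minimal Python sketch of the binomial formula, computing nCr * p^r * q^(n-r) with the standard-library `math.comb`. The function name `binomial_pmf` and the example values n=10, p=0.3 are hypothetical.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Hypothetical example: 10 trials, each with success probability 0.3.
n, p = 10, 0.3
print(round(binomial_pmf(3, n, p), 4))  # 0.2668

# The probabilities over r = 0..n should sum to 1.
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))
```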
Central Tendency of the Binomial Distribution
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard deviation: $\sigma = \sqrt{npq}$

Final note: To apply the binomial distribution, we must first ensure that the
process meets the conditions for a Bernoulli process.
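The mean np and variance npq can be checked numerically by computing them directly from the binomial probabilities. This is a sketch under assumed values (n=10, p=0.3 are hypothetical); it verifies the formulas for one case rather than proving them.

```python
from math import comb, isclose, sqrt

n, p = 10, 0.3  # hypothetical values, not from the slides
q = 1 - p

# Binomial probabilities for r = 0..n.
pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

# Mean and variance computed from the definition for a discrete variable.
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))

# They agree with the closed forms np and npq.
assert isclose(mean, n * p)
assert isclose(variance, n * p * q)

print(round(mean, 4), round(variance, 4), round(sqrt(variance), 4))  # 3.0 2.1 1.4491
```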
Hypergeometric Distribution
Whereas the binomial distribution applies when the sample data are selected with
replacement from a finite pool (or without replacement from an infinite pool), the
hypergeometric distribution applies when the samples are taken from a
finite pool without replacement.
If n samples are taken from a population of size N, and A members of the
population are of interest, then the probability of exactly x successes in the
n samples is:

$P(x) = \frac{\binom{A}{x} \binom{N-A}{n-x}}{\binom{N}{n}}$

Mean: $\mu = \frac{nA}{N}$
Std. dev.: $\sigma = \sqrt{\frac{nA(N-A)}{N^2} \cdot \frac{N-n}{N-1}}$


Out of 20 people, 12 have a Master's degree. If you select 5 at random without
replacement, what is the chance that exactly 3 of them have a Master's?
(12C3*8C2)/20C5 = (220*28)/15504 ≈ 0.397

What is the probability of there being at least one Master's?
P(x=1)+P(x=2)+…+P(x=5) = 1 - P(x=0) = 1 - (12C0*8C5)/20C5 = 1 - 56/15504 ≈ 0.996
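The two answers in this example can be evaluated with a short Python sketch. The function name `hypergeom_pmf` is an illustrative choice; the counts (N=20, A=12, n=5) come from the example.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(exactly x successes) when drawing n items without replacement
    from a population of N that contains A items of interest."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

N, A, n = 20, 12, 5  # 20 people, 12 with a Master's, 5 selected

p_exactly_3 = hypergeom_pmf(3, N, A, n)
p_at_least_1 = 1 - hypergeom_pmf(0, N, A, n)

print(round(p_exactly_3, 4))   # 0.3973
print(round(p_at_least_1, 4))  # 0.9964
```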
Poisson Distribution
Characteristics of the Poisson Process:
The process is applied to a discrete random variable that takes integer
values
The average value of the random variable over the given time period is
already known or can be calculated given past data
At any one second, the probability of a positive outcome is very small and
has a fixed value.
At any one second, the probability of two or more positive outcomes is so
small that we can assign it a value of zero.
The probability of a positive outcome in any given second is not only
fixed, but also independent of the actual time and of the result in any other
second.
The Poisson Formula
Let λ be the mean number of occurrences in the interval of time under
study.
e is the base of the natural logarithm system, approx. 2.71828
The Poisson probability of exactly x incidents occurring is:

$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$

If a binomial process has a large number of trials (n>20) and a small
probability of success (p<0.05), we can approximate it with the Poisson formula
by substituting the binomial mean np for λ.
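To see how close the approximation is, the sketch below compares binomial and Poisson probabilities side by side. The values n=100 and p=0.02 are hypothetical, chosen only to satisfy n>20 and p<0.05; the function names are likewise illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """Exact binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(x, lam):
    """Poisson probability of x occurrences with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 100, 0.02  # hypothetical values with n > 20 and p < 0.05
lam = n * p       # substitute the binomial mean np for lambda

# Print x, the binomial probability, and the Poisson approximation.
for x in range(5):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))

# Largest pointwise gap over the first few outcomes.
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
```

For these assumed values the two columns differ by less than 0.005 at every x shown.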
A store receives 5 customers an hour on average. If the hourly customer
arrivals follow a Poisson distribution, what is the probability of receiving 10
or more customers in an hour?
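One way to work this out is through the complement, P(X>=10) = 1 - P(X<=9), summing the Poisson probabilities P(x) = λ^x e^(-λ)/x! for x = 0 to 9 with λ = 5. The Python sketch below does this; the function and variable names are illustrative, and the result is computed here rather than quoted from the slides.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

lam = 5  # average customers per hour, from the question

# P(X >= 10) = 1 - P(X <= 9)
p_at_most_9 = sum(poisson_pmf(x, lam) for x in range(10))
p_10_or_more = 1 - p_at_most_9

print(round(p_10_or_more, 4))  # 0.0318
```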
The normal distribution
It is a continuous probability distribution, also called the Gaussian
distribution.
It can be used to approximate discrete distributions when the sample is
sufficiently large. It can approximate the binomial distribution if np and nq
are both greater than 5.
It is symmetrical and bell-shaped in appearance, and its interquartile range runs
from -0.67 standard deviations to +0.67 standard deviations about the mean.
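The ±0.67 figure for the interquartile range can be checked with Python's standard-library `statistics.NormalDist`, by asking for the 25th and 75th percentiles of the standard normal distribution. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# First and third quartiles, expressed in standard deviations from the mean.
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)

print(round(q1, 2), round(q3, 2))  # -0.67 0.67
```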
Normal Distribution from Z-score
For a normal distribution, we can find the probability of the variable being below a
certain value by using the Z-table.
Calculate $Z = (X - \mu)/\sigma$; the corresponding value in the table is the
probability of the variable being less than or equal to X.
To find the probability of the variable being between X1 and X2: P(X1<X<X2) =
P(Z(X2)) - P(Z(X1)).
P(X<X1) = P(Z(X1))
P(X>X1) = 1 - P(Z(X1))
For a normal curve with mean 200 and stdev 50, what is P(X<=168)? Z(168) = (168-200)/50 = -0.64,
P(Z(168)) = 0.2611
What is P(X>300)? Z(300) = 2, P(Z(300)) = 0.9772, P(X>300) = 1-0.9772 = 0.0228
What is P(168<=X<=240)? Z(240) = 0.8, so P(X<=240)-P(X<=168) = 0.7881-0.2611 = 0.527
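These three Z-table lookups can be reproduced with Python's standard-library `statistics.NormalDist`, whose `cdf` method returns P(X<=x) directly. A minimal sketch with illustrative variable names:

```python
from statistics import NormalDist

# Normal curve with mean 200 and standard deviation 50.
dist = NormalDist(mu=200, sigma=50)

p_below_168 = dist.cdf(168)                 # P(X <= 168)
p_above_300 = 1 - dist.cdf(300)             # P(X > 300)
p_between = dist.cdf(240) - dist.cdf(168)   # P(168 <= X <= 240)

print(round(p_below_168, 4))  # 0.2611
print(round(p_above_300, 4))  # 0.0228
print(round(p_between, 4))    # 0.5271
```

The last value differs from the table-based 0.527 only in the fourth decimal place, because the Z-table entries are rounded.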
The annual household income of 300 surveyed families has a mean of 16 lakhs
with stdev 90,000. How many families have an income between 10 and 15
lakhs?
Working in lakhs: n=300, Mean=16, Stdev=0.90
No. of families between 10 and 15 = n*P(10<=x<=15) = n*[P(x<=15)-P(x<=10)]
P(x<=15) = P(Z(15)) = P((15-16)/0.9) = P(-1.11) = 0.1335
P(x<=10) = P(Z(10)) = P((10-16)/0.9) = P(-6.67) ≈ 0
No. of families = 300*(0.1335 - 0) = 40.05, or 40 approx.
How many families do you expect to have income above 17.5 lakhs?
P(x>17.5) = 1 - P(x<=17.5)
P(x<=17.5) = P(Z(17.5)) = P((17.5-16)/0.9) = P(1.67) = 0.9525
No. of families = 300*(1 - 0.9525) = 14.25, or 14 approx.
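The same household-income calculation can be sketched in Python with `statistics.NormalDist`, working in lakhs so that the standard deviation of 90,000 becomes 0.9. The variable names are illustrative. Because the code uses unrounded Z-scores, the intermediate probabilities differ slightly from the Z-table values, but the rounded family counts agree.

```python
from statistics import NormalDist

n_families = 300
income = NormalDist(mu=16.0, sigma=0.9)  # units: lakhs

# Probability of an income between 10 and 15 lakhs.
p_10_to_15 = income.cdf(15) - income.cdf(10)

# Probability of an income above 17.5 lakhs.
p_above_17_5 = 1 - income.cdf(17.5)

print(round(n_families * p_10_to_15))    # 40
print(round(n_families * p_above_17_5))  # 14
```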
Excel commands for Normal Distribution
NORM.DIST(x, Mean, Stdev, 1): Returns the probability of a random variable
being less than or equal to x, for a given mean and stdev.
NORM.INV(p, Mean, Stdev): Finds the value x for which P(X<=x)=p, for a
given mean and stdev.
NORM.S.DIST(z, 1): Returns the cumulative probability for a given
Z-score (can act as a substitute for the Z-table)
NORM.S.INV(p): Returns the Z-score for a given cumulative probability (can act
as a substitute for the Z-table)
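For readers working outside Excel, the four commands have rough counterparts in Python's standard-library `statistics.NormalDist`. The wrapper names below (`norm_dist`, `norm_inv`, `norm_s_dist`, `norm_s_inv`) are hypothetical helpers written for this sketch, not Excel or Python built-ins, and they cover only the cumulative form (last argument 1).

```python
from statistics import NormalDist

def norm_dist(x, mean, stdev):
    """Analogue of NORM.DIST(x, mean, stdev, 1): cumulative P(X <= x)."""
    return NormalDist(mean, stdev).cdf(x)

def norm_inv(p, mean, stdev):
    """Analogue of NORM.INV(p, mean, stdev): the x with P(X <= x) = p."""
    return NormalDist(mean, stdev).inv_cdf(p)

def norm_s_dist(z):
    """Analogue of NORM.S.DIST(z, 1): cumulative probability for a Z-score."""
    return NormalDist().cdf(z)

def norm_s_inv(p):
    """Analogue of NORM.S.INV(p): Z-score for a cumulative probability."""
    return NormalDist().inv_cdf(p)

print(round(norm_dist(168, 200, 50), 4))  # 0.2611
print(round(norm_s_dist(2), 4))           # 0.9772
print(round(norm_inv(0.5, 200, 50), 1))   # 200.0
```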
