Theoretical Distributions

Order is heaven's first law.

Theoretical Distributions of a Random Variable may be:

1. A theoretical listing of outcomes and probabilities, which can be obtained from a mathematical model representing some phenomenon of interest.
2. An empirical listing of outcomes and their observed relative frequencies.
3. A subjective listing of outcomes associated with their subjective or contrived probabilities, representing the degree of conviction of the decision maker as to the likelihood of the possible outcomes.

(Keywords - Outcomes, probabilities)

Apart from observed frequency distributions which are obtained by grouping data, it is also possible to deduce mathematically what distributions of certain populations should be. Such distributions as are expected on the basis of previous experience or theoretical considerations are known as Theoretical Distributions.

(Keywords - observed frequency distributions, previous experience)

Theoretical Distributions of a Random Variable Example

If a coin is tossed n times, we expect that as n increases we shall get close to 50% heads and 50% tails.
On the basis of this expectation we can test whether the coin is biased or not.

If a coin is tossed 100 times, we may get 40 heads and 60 tails.

This is our observation, whereas our expectation is 50% heads and 50% tails. The question is whether this discrepancy is due to sampling fluctuations or to the fact that the coin is biased.

The fact that the probabilities for heads and tails are equal does not mean that we must always get 50% heads and 50% tails. It means that if the experiment is carried out a large number of times, we will on average get close to 50% heads and 50% tails.
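One way to judge whether 40 heads in 100 tosses could be a sampling fluctuation is to compute how likely such a result is for a fair coin. The sketch below is illustrative only: the function names are hypothetical, the coin is assumed fair (p = 0.5), and the tosses are assumed independent.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_most(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or fewer heads in n tosses."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Chance that a fair coin gives 40 or fewer heads in 100 tosses.
p_low = prob_at_most(40, 100)
print(f"P(heads <= 40 | fair coin, n = 100) = {p_low:.4f}")
```

A small probability here suggests that sampling fluctuation alone is an unlikely explanation. How small is small enough is a choice made by the person running the test.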

Theoretical Distributions of a Random Variable

A probability distribution for a discrete random variable is a mutually exclusive listing of all possible numerical outcomes for that random variable such that a particular probability of occurrence is associated with each outcome.

For example, for the roll of a fair die:

Face of Outcome    1     2     3     4     5     6
Probability       1/6   1/6   1/6   1/6   1/6   1/6

Since all possible outcomes are included, this listing is complete (or collectively exhaustive) and thus the probabilities must sum to 1.

(e.g. the probability of getting 4 is 1/6; the probability of getting an even number is 3/6; the probability of getting a number greater than 6 is 0.)
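The listing for a die can be written out and checked with exact fractions. This is a minimal sketch assuming a fair six-sided die; the names `die` and `prob` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# Probability distribution of a six-sided die (assumed fair).
die = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event) -> Fraction:
    """Sum the probabilities of all faces for which event(face) is true."""
    return sum((p for face, p in die.items() if event(face)), Fraction(0))

print("total probability:", sum(die.values()))
print("P(4)             :", prob(lambda f: f == 4))
print("P(even)          :", prob(lambda f: f % 2 == 0))
print("P(number > 6)    :", prob(lambda f: f > 6))
```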

Please note:

When a random experiment is performed, the totality of outcomes of the experiment forms a set which is called the Sample Space (S) of the experiment.

(Keywords - random experiment, sample space)

Let the random experiment be the tossing of a coin 2 times.

Here S = {(T,T), (T,H), (H,T), (H,H)}. Let X be the number of heads obtained in the two trials:

Outcome    Number of heads (X)
(T,T)      0
(T,H)      1
(H,T)      1
(H,H)      2

In terms of X, the sample space can be written as S = {0, 1, 2}.


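For a fair coin, each of the four outcomes has probability 1/4, so:

P(X=0) = P[(T,T)] = 1/4
P(X=1) = P[(T,H), (H,T)] = 2/4 = 1/2
P(X=2) = P[(H,H)] = 1/4

Hence the probabilities sum to 1/4 + 1/2 + 1/4 = 1.

Such a function P(X) is called the probability function of the random variable X.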

The probability distribution is the set of probabilities that this function assigns to the different values of the random variable X. (Keywords - probability function, probability distribution)
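The probability function for the two-toss experiment can be built by enumerating the sample space. The sketch below assumes a fair coin, so that every outcome is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses; each outcome is assumed equally likely (fair coin).
sample_space = list(product("TH", repeat=2))

# X = number of heads in an outcome; count how many outcomes give each value.
counts = Counter(outcome.count("H") for outcome in sample_space)

# Probability function P(X = x) as exact fractions.
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X={x}) = {p}")
print("sum of probabilities =", sum(pmf.values()))
```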

A Random variable can be Discrete or Continuous.

A Random Variable is said to be Discrete if the set of values it takes over the sample space is finite (or countably infinite); its probability function P(X) is called the Probability Mass Function and its distribution is called a Discrete Probability Distribution.
A Random Variable is said to be Continuous if it can assume any (real) value in an interval; its probability function P(X) is called the Probability Density Function and its distribution is called a Continuous Probability Distribution.

Among the theoretical or expected frequency distributions, the following six are the most popular:

1. Binomial Distribution
2. Multinomial Distribution
3. Negative Binomial Distribution
4. Poisson Distribution
5. Hypergeometric Distribution
6. Normal Distribution

Normal Distribution

The Normal Distribution, also called the Normal Probability Distribution, happens to be the most useful theoretical distribution for continuous variables. The Normal Distribution is the cornerstone of modern statistics, and the Normal Model has become the most important probability model in statistical analysis.

The normal distribution is an approximation to the Binomial Distribution. Whether or not p is equal to q, the Binomial Distribution tends to the form of a continuous curve when n becomes large, at least for the material part of the range.

The correspondence between the Binomial and Normal curves is close even for comparatively low values of n, provided that p and q are fairly near equality. The normal frequency curve is represented in several forms.
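The closeness of the two curves can be checked numerically by comparing each binomial probability with the normal ordinate at the same point. This is a rough sketch: the helper `max_gap` and the choices of n and p are illustrative assumptions, not values from the text.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    """Ordinate of the normal curve at x."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def max_gap(n, p):
    """Largest absolute difference between the binomial probabilities and the
    normal curve with the same mean (n*p) and standard deviation sqrt(n*p*q)."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, sd))
               for k in range(n + 1))

# p = q with small n, p far from q with small n, p far from q with large n.
for n, p in [(20, 0.5), (20, 0.1), (200, 0.1)]:
    print(f"n={n:3d}, p={p}: max gap = {max_gap(n, p):.4f}")
```

The gap should be small when p is near q even for modest n, and it should shrink as n grows when p and q are unequal.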

The Normal Distribution, which is also called the Normal Curve, is the mathematical and theoretical distribution which describes the expected distribution of sample means and many other chance occurrences. The normal curve is bell shaped, and about 99.7% of its values lie within 3 standard deviations of its mean.

In this example, the standard deviation for IQ equals 15. We can identify the proportion of the curve by measuring a score's distance from the mean (100) in standard deviations; for example, scores of 55 and 70 lie three and two standard deviations below the mean.

Fig: Normal Distribution

The following is the basic form of the curve with mean μ and standard deviation σ:

The Normal Distribution

$$P(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$

where
X = values of the continuous random variable
μ = mean of the normal random variable
σ = standard deviation of the normal random variable
e = mathematical constant approximated by 2.7183
π = mathematical constant approximated by 3.1416 (√(2π) ≈ 2.5066)

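The Equation of the Normal Curve

$$y = \frac{N}{\sigma\sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma^2}}$$

where x is the deviation of X from the mean. The quantity N/(σ√(2π)) is equal to the maximum ordinate (y₀) of the normal curve corresponding to a distribution of stated total frequency N and stated standard deviation σ.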
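The two forms of the curve can be evaluated directly. The sketch below is one possible way to code them; the function names, the IQ-like mean and standard deviation, and the total frequency of 1000 are assumptions made for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Ordinate of the normal probability curve at x."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def normal_frequency_curve(x: float, mean: float, sd: float, total: float) -> float:
    """Ordinate of the normal frequency curve with total frequency `total` (N)."""
    return total * normal_pdf(x, mean, sd)

# Illustrative values only: IQ-like scale and a hypothetical N of 1000 observations.
mean, sd, total = 100.0, 15.0, 1000.0
y0 = normal_frequency_curve(mean, mean, sd, total)  # ordinate at the mean
print(f"ordinate at the mean  = {y0:.3f}")
print(f"N / (sd * sqrt(2*pi)) = {total / (sd * sqrt(2 * pi)):.3f}")
```

The two printed values agree because the exponential factor equals 1 at the mean, which is where the ordinate is largest.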

1. The Normal Distribution can have different shapes depending on the values of μ and σ, but there is one and only one normal distribution for any given pair of values of μ and σ.
2. The Normal Distribution is a limiting case of the Binomial Distribution when (a) n → ∞ and (b) neither p nor q is very small.
3. The Normal Distribution is a limiting case of the Poisson Distribution when its mean m is large.
4. The mean of a normally distributed population lies at the centre of its normal curve.
5. The two tails of the normal probability distribution extend infinitely and never touch the horizontal axis (which implies a positive probability of finding values of the random variable within any range from minus infinity to plus infinity).

Remarks & Observations:



1. The Normal Distribution has the remarkable property stated in the Central Limit Theorem (CLT).

According to this theorem, as the sample size n increases, the distribution of the mean X̄ of a random sample taken from practically any population approaches a normal distribution (with mean μ and standard deviation σ/√n). Thus, if samples of large size n are drawn from a population that is not normally distributed, the successive sample means will nevertheless form a distribution that is approximately normal. Hence, as the size of the sample is increased, the sample means will tend to be normally distributed. The CLT applies to the distribution of most other statistics, such as the Median and Standard Deviation (but not the range). The CLT gives the Normal Distribution its central place in the theory of sampling, since many important problems can be solved by this single pattern of sampling variability.



2. As n becomes large, the Normal Distribution serves as a good approximation of many discrete distributions.
3. The Normal Distribution has numerous mathematical properties, which make it popular and comparatively easy to manipulate.
4. The Normal Distribution is used extensively in statistical quality control in industry, in the setting up of control limits.
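The Central Limit Theorem mentioned in the first remark can be illustrated by simulation. The sketch below draws samples from an exponential population, chosen here only because it is clearly non-normal with mean 1 and standard deviation 1; the sample size, the number of samples, and the seed are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def sample_means(n: int, trials: int, rng: random.Random) -> list:
    """Means of `trials` samples of size n from an exponential population
    (a deliberately non-normal population with mean 1 and standard deviation 1)."""
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 50, 2000     # illustrative sizes, not taken from the text
means = sample_means(n, trials, rng)

print(f"mean of sample means = {mean(means):.3f}  (population mean = 1)")
print(f"sd of sample means   = {pstdev(means):.3f}  (sigma/sqrt(n) = {1 / sqrt(n):.3f})")
```

The spread of the sample means should come out close to σ/√n, even though the population itself is strongly skewed.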


1. The normal distribution is bell shaped and symmetrical in its appearance. If the curve were folded along its vertical axis, the two halves would coincide.
2. The number of cases below the mean in a normal distribution is equal to the number of cases above the mean, which makes the mean and median coincide.
3. The height of the curve for a positive deviation of 3 units is the same as the height of the curve for a negative deviation of 3 units.
4. The height of the normal curve is at its maximum at the mean, hence the mean and mode of the normal distribution coincide. Thus for a normal distribution the mean, median and mode are all equal.
5. There is one maximum point of the normal curve, which occurs at the mean. The height of the curve declines as we go in either direction from the mean. The curve approaches nearer and nearer to the base but never touches it, i.e. the curve is asymptotic to the base on either side; hence its range is unlimited or infinite in both directions.
6. Since there is only one maximum point, the normal curve is unimodal, i.e. it has only one mode.
7. The points of inflexion, i.e. the points where the change in curvature occurs, are X̄ ± σ.
8. In the Binomial and Poisson distributions the variable is discrete, whereas in the Normal Distribution the variable is continuous.
9. The first and third quartiles are equidistant from the median.
10. The mean deviation is 4/5th, or more precisely 0.7979, of the standard deviation.
11. The area under the normal curve is distributed as follows:
    Mean ± 1σ covers 68.27% of the area (34.134% on either side of the mean)
    Mean ± 2σ covers 95.45% of the area
    Mean ± 3σ covers 99.73% of the area

Distance from the Mean Ordinate (in units of σ)    Percentage of Total Area
0.5                                                19.146
1.0                                                34.134
1.5                                                43.319
1.96                                               47.500
2.0                                                47.725
2.5                                                49.379
2.5758                                             49.500
3.0                                                49.865

Thus the two ordinates at a distance of 1.96σ from the mean on either side enclose 47.5 + 47.5 = 95% of the total area. The two ordinates at a distance of 2.5758σ from the mean on either side enclose 49.5 + 49.5 = 99% of the total area. The area enclosed between the ordinates at a distance of 3σ from the mean on either side is 49.865 + 49.865 = 99.73% of the total area. Hypotheses are usually tested either at the 5% level or at the 1% level (i.e. taking into account 95% or 99% of the total area of the normal curve).
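The tabulated areas can be reproduced from the error function, since the area between the mean and a point z standard deviations away equals erf(z/√2)/2. The sketch below uses only the Python standard library; the function name `area_mean_to` is hypothetical.

```python
from math import erf, sqrt

def area_mean_to(z: float) -> float:
    """Area under the standard normal curve between the mean (z = 0) and z."""
    return 0.5 * erf(z / sqrt(2))

for z in (0.5, 1.0, 1.5, 1.96, 2.0, 2.5, 2.5758, 3.0):
    one_side = 100 * area_mean_to(z)
    print(f"z = {z:<6}  one side: {one_side:6.3f}%   both sides: {2 * one_side:6.3f}%")
```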





1. The causal forces must be numerous and of approximately equal weight.
2. These forces must be the same over the universe from which the observations are drawn (although their incidence will vary from event to event). This is the condition of homogeneity.
3. The forces affecting events must be independent of one another.
4. The operation of the causal forces must be such that deviations above the population mean are balanced as to magnitude and number by deviations below the mean. This is the condition of symmetry.


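The mean of the Normal Distribution is X̄.
The standard deviation of the Normal Distribution is σ.
The central moments are μ₂ = σ², μ₃ = 0 and μ₄ = 3σ⁴.

β₁, the moment coefficient of skewness:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0$$

β₂, the moment coefficient of kurtosis:

$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\sigma^4}{\sigma^4} = 3$$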
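These moment values can be checked by numerical integration of the normal curve. The sketch below uses a plain trapezoidal rule over the range mean ± 10 standard deviations; the helper name, the step count, and the mean and standard deviation are all illustrative assumptions.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, sd):
    """Ordinate of the normal probability curve at x."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def central_moment(k: int, mean: float, sd: float, steps: int = 20000) -> float:
    """k-th central moment by the trapezoidal rule over mean +/- 10 sd."""
    lo, hi = mean - 10 * sd, mean + 10 * sd
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the end points
        total += weight * (x - mean) ** k * normal_pdf(x, mean, sd)
    return total * h

mean, sd = 100.0, 15.0  # illustrative values
mu2, mu3, mu4 = (central_moment(k, mean, sd) for k in (2, 3, 4))
beta1 = mu3**2 / mu2**3
beta2 = mu4 / mu2**2
print(f"mu2 = {mu2:.3f}, mu3 = {mu3:.3f}, mu4 = {mu4:.1f}")
print(f"beta1 = {beta1:.6f}, beta2 = {beta2:.6f}")
```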

Area under the Normal Curve

The equation of the normal curve gives the ordinate of the curve corresponding to any given value of x:

$$y = \frac{N}{\sigma\sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma^2}}$$

However, researchers are usually more interested in areas under the normal curve than in its ordinates.
The area under the curve gives us the proportion of cases falling between two numbers, or the probability of getting a value between the two numbers.

As a researcher, it is important to understand the meaning of the normal curve in its standard form. The equation of the normal curve depends on X̄ and σ, and for different values of X̄ and σ we obtain different curves (please remember that we calculate the areas through the use of the z table). Since different tables would be required for different values, the data are standardized so that a single table suffices. We can then determine normal curve areas regardless of X̄ and σ by tabulating the area under the normal curve having X̄ = 0 and σ = 1.

Such a normal curve with zero mean and unit standard deviation is known as the Standard Normal Curve.

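The standard normal probability curve is given by the equation

$$P(Z) = \frac{N}{\sqrt{2\pi}} \, e^{-\frac{Z^2}{2}}$$

where N is the total frequency (N = 1 when the ordinate is read as a probability density).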

A normal curve with mean X̄ and standard deviation σ can be converted into a standard normal distribution by changing the scale and origin (as discussed above). In the original scale (the x-scale) the mean and the standard deviation are X̄ and σ; in the new scale (the z-scale) they are 0 and 1. The formula that enables us to change from the x-scale to the z-scale and vice versa is:

$$z = \frac{X - \bar{X}}{\sigma} \quad \text{or} \quad z = \frac{x}{\sigma}, \quad \text{where } x = X - \bar{X}$$

This transformation from X to z is called the z-transformation, and it has the effect of expressing X in units of standard deviation.
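The change of scale can be written as a pair of small functions. This is a minimal sketch with hypothetical names `to_z` and `from_z`; the mean of 100 and standard deviation of 15 are the IQ figures used earlier in these notes.

```python
def to_z(x: float, mean: float, sd: float) -> float:
    """Change from the x-scale to the z-scale: z = (X - mean) / sd."""
    return (x - mean) / sd

def from_z(z: float, mean: float, sd: float) -> float:
    """Change back from the z-scale to the x-scale: X = mean + z * sd."""
    return mean + z * sd

mean, sd = 100.0, 15.0  # the IQ figures used earlier in these notes
for score in (55, 70, 100, 130):
    print(f"X = {score:3d}  ->  z = {to_z(score, mean, sd):+.1f}")
```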


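x-values    X̄ - 3σ    X̄ - 2σ    X̄ - σ    X̄    X̄ + σ    X̄ + 2σ    X̄ + 3σ
z-values      -3        -2        -1      0     +1       +2        +3

Area between z = -1 and z = +1: 68.27%
Area between z = -2 and z = +2: 95.45%
Area between z = -3 and z = +3: 99.73%

Fig: The Standardized Normal Distribution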



Given a value of X, the corresponding value of z tells us how far away, and in what direction, X is from its mean in terms of its standard deviation.

Fig: Standardized Normal Distribution




With the help of the standardized normal distribution, researchers can find the probability of any portion of the area under the standardized normal curve. All we have to do is transform or convert the data from other observed normal distributions to the standardized normal curve. In other words, the standardized normal distribution is extremely valuable because we can translate or transform any normal variable X into the standardized value Z.

Computing the standardized value Z of any measurement expressed in original units is simple: subtract the mean from the value to be transformed and divide by the standard deviation (all expressed in original units). Here the population standard deviation σ is used in the formula:

$$Z = \frac{X - \mu}{\sigma}$$

Standard Value = [(Value to be transformed) - (Mean)] / (Standard Deviation)

where X = the normal random variable and μ = the hypothesized or expected value of the mean.

(source: William Zikmund)

Linear Transformation of any Normal Variable into a Standardized Normal Variable

$$Z = \frac{X - \mu}{\sigma}$$

Sometimes the scale is stretched; sometimes it is shrunk.

1. Find the area under the normal curve for z = 1.54.
Ans: From the table, the entry corresponding to z = 1.54 is 0.4382. This measures the area under the curve between z = 0 and z = 1.54.
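The table lookup can be cross-checked with the `NormalDist` class from Python's standard `statistics` module (available from Python 3.8). The helper name `table_entry` is hypothetical, and rounding to four places is an assumption made to match the layout of a printed z table.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def table_entry(z: float) -> float:
    """Area between z = 0 and z, rounded to 4 places like a printed z table."""
    return round(standard_normal.cdf(z) - standard_normal.cdf(0.0), 4)

print(table_entry(1.54))
```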








