Theoretical Distributions
If a coin is tossed n times, we expect that as n increases the results will get close to 50% heads and 50% tails.
On the basis of this expectation we can test whether the coin is biased or not.
The fact that the probabilities of heads and tails are both 1/2 does not mean that we must always get exactly 50% heads and 50% tails. It means that if the experiment is carried out a large number of times, we will on average get close to 50% heads and 50% tails.
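The long-run behaviour described above can be illustrated with a small simulation. This is only a sketch: the function name `heads_fraction`, the fixed seed and the chosen numbers of tosses are assumptions made for the illustration, not part of the original material.

```python
import random

def heads_fraction(n_tosses, seed=0):
    """Simulate n_tosses tosses of a fair coin and return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

# The fraction of heads wanders for small n and settles near 0.5 as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, heads_fraction(n))
```

For small n the printed fraction can be well away from 0.5; only for large n should it be expected to stay close to it.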
Face of Outcome    1     2     3     4     5     6
Probability        1/6   1/6   1/6   1/6   1/6   1/6
Since all possible outcomes are included, this listing is complete (or collectively exhaustive), and thus the probabilities must sum up to 1.
(e.g. the probability of getting 4 is 1/6; the probability of getting an even number is 3/6; the probability of getting a number greater than 6 is 0.)
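The probabilities quoted in this example can be checked with exact fractions. The sketch below assumes a fair six-sided die; the names `die` and `prob` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Add up the probabilities of all faces for which event(face) is true."""
    return sum((p for face, p in die.items() if event(face)), Fraction(0))

total = sum(die.values())             # should be 1 for a complete listing
p_four = prob(lambda f: f == 4)       # getting a 4
p_even = prob(lambda f: f % 2 == 0)   # getting an even number (3/6, reduced)
p_gt_six = prob(lambda f: f > 6)      # impossible event

print(total, p_four, p_even, p_gt_six)
```

Note that `Fraction` reduces 3/6 to 1/2 automatically, so the even-number probability is displayed in lowest terms.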
When a random experiment is performed, the totality of outcomes of the experiment forms a set which is called the Sample Space (S) of the experiment.
Here S = {(T,T), (T,H), (H,T), (H,H)}, the sample space for two tosses of a coin. The number of heads X obtained in the two trials is then:
(T,T) 0
(T,H) 1
(H,T) 1
(H,H) 2
In terms of the number of heads, the sample space can be written as S = {0, 1, 2}.
P(X=0) = P[(T,T)] = 1/4
P(X=1) = P[(T,H), (H,T)] = 1/2
P(X=2) = P[(H,H)] = 1/4
Hence the total of P(X) = 1/4 + 1/2 + 1/4 = 1. Such a function P(X) is called the probability function of the random variable X.
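The probability function of the number of heads can be obtained by enumerating the sample space directly. This is a minimal sketch that assumes a fair coin tossed twice, with all four outcomes equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All ordered outcomes of two tosses: (T,T), (T,H), (H,T), (H,H).
sample_space = list(product("TH", repeat=2))

# X = number of heads in each outcome; count how many outcomes give each value.
heads_counts = Counter(outcome.count("H") for outcome in sample_space)

# Equally likely outcomes, so P(X = x) = (outcomes with x heads) / (all outcomes).
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(heads_counts.items())}

for x, p in pmf.items():
    print(f"P(X={x}) = {p}")
print("total =", sum(pmf.values()))
```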
The probability distribution is the set of probabilities that this function assigns to the different values of the random variable X. (Keywords: probability function, probability distribution)
A Random Variable is said to be Discrete if the set of values it takes over the sample space is finite or countably infinite; its probability function P(X) is called the Probability Mass Function and its distribution is called a Discrete Probability Distribution.
A Random Variable is said to be Continuous if it can assume any (real) value in an interval; its probability function P(X) is called the Probability Density Function and its distribution is called a Continuous Probability Distribution.
Normal Distribution
The Normal Distribution also called the Normal Probability Distribution, happens to be the most useful theoretical distribution for continuous variables. Normal Distribution is the cornerstone of modern statistics. The Normal Model has become the most important probability model in statistical analysis.
The correspondence between the Binomial and Normal curves is close even for comparatively low values of n, provided that p and q are fairly near equality. The normal frequency curve is represented in several forms.
The Normal Distribution is also called the Normal Curve.
[Figure: normal curve of IQ scores; horizontal axis IQ, marked at 55, 70, 85, 100, 115, 130, 145]
In this example the Standard Deviation of IQ equals 15. We can identify the proportion of the area under the curve by measuring a score's distance from the mean (100) in standard deviations.
The following is the basic form of the equation of the curve with mean $\mu$ and standard deviation $\sigma$:
$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$
where
X = values of the continuous random variable
$\mu$ = mean of the normal random variable
$\sigma$ = standard deviation of the normal random variable
e = mathematical constant approximated by 2.7183
$\pi$ = mathematical constant approximated by 3.1416 ($\sqrt{2\pi}$ = 2.5066)
The quantity $\frac{N}{\sigma\sqrt{2\pi}}$ is equal to the maximum ordinate ($y_0$) of the normal curve corresponding to the distribution of stated total frequency N and stated standard deviation $\sigma$.
1. The Normal Distribution can have different shapes depending on the values of $\mu$ and $\sigma$, but there is one and only one normal distribution for any given pair of values of $\mu$ and $\sigma$.
2. The Normal Distribution is a limiting case of the Binomial Distribution when a) $n \to \infty$ and b) neither p nor q is very small.
3. The Normal Distribution is a limiting case of the Poisson Distribution when its mean m is large.
4. The mean of a normally distributed population lies at the centre of its normal curve.
5. The two tails of the normal probability distribution extend infinitely and never touch the horizontal axis (which implies a positive probability of finding values of the random variable within any range from minus infinity to plus infinity).
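The closeness of the Binomial and Normal curves when p and q are near equality can be illustrated numerically. The sketch below compares binomial probabilities with the normal density having the same mean np and standard deviation sqrt(npq); the choice n = 20, p = 0.5 and the helper names are assumptions made for the example.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Ordinate of the normal curve with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

n, p = 20, 0.5                   # illustrative values with p = q
mu = n * p                       # binomial mean
sigma = sqrt(n * p * (1 - p))    # binomial standard deviation

# Largest gap between the two curves over all possible counts k.
max_gap = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1)
)

for k in (6, 8, 10, 12, 14):
    print(k, round(binomial_pmf(k, n, p), 4), round(normal_pdf(k, mu, sigma), 4))
print("largest gap:", round(max_gap, 4))
```

Repeating the comparison with a small p (for example 0.05) at the same n should show a visibly poorer match, which is the reason for condition b) in property 2.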
According to the Central Limit Theorem (CLT), as the sample size n increases, the distribution of the mean $\bar{X}$ of a random sample taken from practically any population approaches a normal distribution (with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$). Thus, if samples of large size n are drawn from a population that is not normally distributed, the successive sample means will nevertheless form a distribution that is approximately normal. Hence, as the size of the sample is increased, the sample means will tend to be normally distributed. The CLT applies to the distribution of most other statistics, such as the Median and Standard Deviation (but not the range). The CLT gives the Normal Distribution its central place in the theory of sampling, since many important problems can be solved by this single pattern of sampling variability.
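The behaviour of sample means described by the theorem can be explored with a simulation. This sketch assumes an exponential population with mean 1 and standard deviation 1 (clearly not normal); the sample size, number of samples, seed and function name are all illustrative choices.

```python
import random
import statistics

def sample_means(n, num_samples=2000, seed=1):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

n = 50
means = sample_means(n)
center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near sigma / sqrt(n)

print("mean of sample means:", round(center, 3))
print("sd of sample means:  ", round(spread, 3))
print("sigma / sqrt(n):     ", round(1 / n**0.5, 3))
```

A histogram of `means` would be expected to look roughly bell-shaped even though the population itself is strongly skewed.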
9. The first and third quartiles are equidistant from the Median.
AREA RELATIONSHIP
Distance from the Mean Ordinate (in units of $\sigma$)    Percentage of Total Area
0.5       19.146
1.0       34.134
1.5       43.319
1.96      47.500
2.0       47.725
2.5       49.379
2.5758    49.500
3.0       49.865
Thus the two ordinates at a distance of 1.96$\sigma$ from the mean on either side enclose 47.5 + 47.5 = 95% of the total area. The two ordinates at a distance of 2.5758$\sigma$ from the mean on either side enclose 49.5 + 49.5 = 99% of the total area. The area enclosed between the ordinates at a distance of 3$\sigma$ from the mean on either side is 49.865 + 49.865 = 99.73% of the total area. Hypotheses are usually tested either at the 5% level or at the 1% level (i.e. taking into account 95% and 99% of the total area of the normal curve).
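The tabulated percentages can be reproduced from the error function, using the standard identity that the area between the mean and an ordinate k standard deviations away is 0.5 * erf(k / sqrt(2)). The function name below is a hypothetical helper for this sketch.

```python
from math import erf, sqrt

def percent_area_mean_to(k):
    """Percentage of the total area lying between the mean and an
    ordinate k standard deviations away from it (one side only)."""
    return 100 * 0.5 * erf(k / sqrt(2))

distances = [0.5, 1.0, 1.5, 1.96, 2.0, 2.5, 2.5758, 3.0]

# Columns: distance in sigma units, one-sided percentage, two-sided percentage.
for k in distances:
    one_side = percent_area_mean_to(k)
    print(f"{k:<7} {one_side:7.3f} {2 * one_side:7.3f}")
```

The two-sided column corresponds to the 95%, 99% and 99.73% figures quoted for 1.96, 2.5758 and 3 standard deviations.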
4. The operations of the causal forces must be such that deviations above the population mean are balanced, in magnitude and number, by deviations below the mean.
The equation of the normal curve gives the ordinate of the curve corresponding to any given value of x:
$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}$$
where x is the deviation of the value from the mean.
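As a rough check of the ordinate formula, the sketch below evaluates y = N / (sigma * sqrt(2 * pi)) * exp(-x^2 / (2 * sigma^2)), where x is the deviation from the mean. The function name and the example values (N = 1000, sigma = 15) are assumptions chosen only for illustration.

```python
from math import exp, pi, sqrt

def ordinate(x, sigma, total_frequency=1.0):
    """Height of the normal curve at deviation x from the mean, for a
    distribution with standard deviation sigma and total frequency N."""
    return total_frequency / (sigma * sqrt(2 * pi)) * exp(-(x**2) / (2 * sigma**2))

# Hypothetical example: N = 1000 observations with sigma = 15.
y0 = ordinate(0, sigma=15, total_frequency=1000)    # maximum ordinate, at the mean
y1 = ordinate(15, sigma=15, total_frequency=1000)   # one standard deviation away

print(round(y0, 3), round(y1, 3))
```

The ordinate is largest at x = 0 and is the same at +x and -x, which reflects the symmetry of the curve about the mean.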
However, researchers are usually more interested in the areas under the normal curve than in its ordinates.
The area under the curve between two numbers gives the proportion of cases falling between them, or the probability of getting a value between them.
As a researcher it is important to understand the meaning of the normal curve in its standard form. The equation of the normal curve depends on $\bar{X}$ and $\sigma$, and for different values of $\bar{X}$ and $\sigma$ we obtain different curves (please remember that we calculate the areas through the use of the z table). Since different values would require different tables, the data are standardized so that a single table can be used. We can then determine normal curve areas regardless of $\bar{X}$ and $\sigma$ by tabulating the area under the normal curve having $\bar{X} = 0$ and $\sigma = 1$.
Such a Normal curve with 0 mean and unit Standard Deviation is known as the Standard normal curve.
The equation of the standard normal curve is:
$$y = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$
A normal curve with mean $\bar{X}$ and standard deviation $\sigma$ can be converted into a standard normal distribution by changing the scale and origin (as discussed above). In the original scale (the x-scale) the mean and the standard deviation are $\bar{X}$ and $\sigma$; in the new scale (the z-scale) they are 0 and 1. The formula that enables us to change from the x-scale to the z-scale and vice versa is: $z = \frac{X - \bar{X}}{\sigma}$ or $z = \frac{x}{\sigma}$, where $x = X - \bar{X}$.
This transformation from X to z is called the z-transformation and has the effect of expressing X in units of standard deviation.
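[Figure: standard normal curve f(z); the horizontal axis shows the x-values $\bar{X}-3\sigma$, $\bar{X}-2\sigma$, $\bar{X}-\sigma$, $\bar{X}+\sigma$, $\bar{X}+2\sigma$, $\bar{X}+3\sigma$ and the corresponding z-values -3, -2, -1, 1, 2, 3]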
Given a value of X, the corresponding value of z tells us how far away, and in what direction, X is from its mean in terms of its standard deviation $\sigma$.
With the help of the standardized normal distribution, researchers can find the probability of any portion of the area under the standardized normal curve. All we have to do is transform the data from other observed normal distributions to the standardized normal curve. In other words, the standardized normal distribution is extremely valuable because we can transform any normal variable X into the standardized value Z.
Computing the standardized value, Z, of any measurement expressed in original units is simple: subtract the mean from the value to be transformed and divide by the standard deviation (all expressed in original units). Here the population standard deviation, $\sigma$, is used in the formula:
$$Z = \frac{X - \mu}{\sigma}$$
(here X = normal random variable)
Standard Value = (Value to be transformed - Mean) / Standard Deviation
where $\mu$ = hypothesized or expected value of the mean.
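The standardization rule, subtract the mean and divide by the standard deviation, can be written as a pair of small functions. This is a sketch; the names `z_score` and `from_z` are hypothetical, and the IQ figures (mean 100, standard deviation 15) are reused from the earlier IQ example purely as illustration.

```python
def z_score(value, mean, sd):
    """Standardized value: distance of value from the mean in units of sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd

def from_z(z, mean, sd):
    """Inverse transformation: convert a z-value back to original units."""
    return mean + z * sd

# IQ example: mean 100, standard deviation 15.
print(z_score(130, 100, 15))   # an IQ of 130 lies two standard deviations above the mean
print(z_score(85, 100, 15))    # an IQ of 85 lies one standard deviation below the mean
print(from_z(1.96, 100, 15))   # the IQ score lying 1.96 standard deviations above the mean
```

A positive z indicates a value above the mean and a negative z a value below it.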
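[Figure: transformation of the X scale to the Z scale, $Z = \frac{X - \mu}{\sigma}$; surviving caption fragment: "Sometimes it is shrunk"]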
Illustrations
1. Find the area under the normal curve for z = 1.54.
Ans: From the table, the entry corresponding to z = 1.54 is 0.4382; this measures the shaded area in the figure between z = 0 and z = 1.54.
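The table lookup in this illustration can be cross-checked with Python's standard library. The sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `area_from_mean` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z."""
    return abs(standard_normal.cdf(z) - 0.5)

area = area_from_mean(1.54)
print(round(area, 4))
```

The cumulative probability up to z includes the whole left half of the curve, so 0.5 is subtracted to leave only the area between the mean and z, which is what the table reports.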