
Statistics Handout CH 1&2


Lecture notes on: Statistics for Economists

CHAPTER ONE

1. OVERVIEW OF BASIC PROBABILITY THEORY


1.1. Sample Space, Sample Points, Events & Event Space
Sample Space: A sample space S associated with a random experiment is simply the set of all
possible outcomes of a random experiment. Each element of S is called a sample point. A sample
space with a finite number of points is called a finite sample space.

If a sample space has as many points as the set of counting numbers, it is called countably infinite. If it has as many points as an interval of the set of real numbers, it is referred to as non-countably infinite. A finite or countably infinite sample space is called discrete, while a non-countably infinite sample space is called non-discrete or continuous.

Sample Space (alternative definition): all the possible outcomes of an experiment, i.e., the entire collection of possible outcomes. Example: choosing a card from a deck. There are 52 cards in a deck (not including Jokers), so the sample space is all 52 possible cards: {Ace of Hearts, 2 of Hearts, etc...}

The Sample Space is made up of Sample Points:


Sample Point: just one of the possible outcomes. Example: Deck of Cards
 the 5 of Clubs is a sample point
 the King of Hearts is a sample point
"King" is not a sample point. As there are 4 Kings that is 4 different sample points.
Events: Any subset of the sample space S is called an event. If an event, say A, consists of a
single outcome, it is called a simple or elementary event. Otherwise, it is called a composite event.
If the event A = S, then it is called a certain or sure event.

It is important to note that S corresponds to the universal set. Similarly, we can speak of an
impossible event. The analogue to an impossible event in the set theoretic context is the empty
set.

It is important to note that all the set operations can be applied to events. Therefore, if A and B
are two events in S, then A∪B stands for the event “A or B or both A and B occurred”, A∩B is
the event “both A and B occurred”, A′ (the complement of A) is the event “not A”, and A−B is the event “A but not B”.


If A∩B = ∅, the events A and B cannot occur simultaneously. In this case we say A
and B are mutually exclusive events.
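These set operations can be tried out directly with Python's built-in sets. A minimal sketch; the sample space and event definitions below are illustrative choices, not from the text:

```python
# Sample space for a single die roll
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}  # event "even number"
B = {4, 5, 6}  # event "greater than 3"

print(A | B)  # A union B: "A or B or both occurred"
print(A & B)  # A intersection B: "both A and B occurred"
print(S - A)  # complement of A: "not A"
print(A - B)  # A minus B: "A but not B"

C = {1, 3, 5}          # event "odd number"
print(A & C == set())  # True: A and C are mutually exclusive
```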

Event: one or more outcomes of an experiment


Example Events:
 Getting a Tail when tossing a coin is an event
 Rolling a "5" is an event.

An event can include one or more possible outcomes:


 Choosing a "King" from a deck of cards (any of the 4 Kings) is an event
 Rolling an "even number" (2, 4 or 6) is also an event

Experiment or Trial: an action where the result is uncertain. Tossing a coin, throwing dice,
seeing what pizza people choose are all examples of experiments.

Terms to note in the definition of classical probability, each discussed below, are random, n, mutually
exclusive, and equally likely.

1.2 Definitions /Concept/ of Probability

There is always a certain degree of uncertainty as to whether an event associated with a random
experiment will occur or not. The chances that an event will occur range between 0 and 100
percent. In statistical considerations, we use the term probability instead of the term chance. Also
it is convenient to assign numbers between 0 and 1, inclusive, instead of percentages. If we are
certain that an event will occur, then we assign a probability 1 to it. On the other hand, if the
event cannot occur, the probability 0 is assigned to it. Any other events that are likely to occur
will be assigned probabilities between 0 and 1.

There are two approaches to the computation of the probability of an event that is associated with
a random experiment.
1/ Classical or A Priori Approach: If an event can occur in m different ways out of a total of n
possible ways, all of which are equally likely, then the probability of the event is m/n.


Example of classical probability: the probability of drawing a King from a well-shuffled deck of 52 cards is m/n = 4/52 = 1/13, since 4 of the 52 equally likely outcomes are favourable.

2/ Frequency or A Posteriori Approach: If there are n repetitions of a random experiment, where
n is large, and an event A is observed to occur in m of these, then the probability of occurrence
of A is m/n.
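A short simulation illustrates the frequency approach. A minimal sketch; the event "rolling a 4" and the number of repetitions are illustrative choices:

```python
import random

n = 100_000  # number of repetitions of the random experiment
m = sum(random.randint(1, 6) == 4 for _ in range(n))  # times event A ("a 4") occurs

print(m / n)  # relative frequency m/n; settles near the classical value 1/6 = 0.1667
```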

Note that both the classical and the frequency approaches have their own deficiencies. In the
former case, the phrase “equally likely” is vague; it may not always be easy to decide which occurrences are equally likely. The latter suffers a shortcoming in that it is not clear how large n must be before the relative frequency m/n settles down to a stable probability.

Axiom: a basic assumption in the definition of classical probability is that n is a finite number;
that is, there is only a finite number of possible outcomes. If there are an infinite number of
possible outcomes, the probability of an outcome is not defined in the classical sense.

Mutually exclusive: The random experiment results in the occurrence of only one of the n
outcomes. E.g., if a coin is tossed, the result is a head or a tail, but not both. That is, the outcomes
are defined so as to be mutually exclusive.

Equally likely: Each outcome of the random experiment has an equal chance of occurring.

A random experiment is a process leading to at least two possible outcomes with uncertainty as
to which will occur.

Probability: is the chance that something will happen - how likely it is that some event will
happen.
Sometimes you can measure a probability with a number like "10% chance of rain", or you can
use words such as impossible, unlikely, even chance, likely, and certain.


Probability: How likely something is to happen. Many events can't be predicted with total
certainty. The best we can say is how likely they are to happen, using the idea of probability.

 Tossing a coin. When a coin is tossed, there are two possible outcomes: heads (H) or tails (T)
 We say that the probability of the coin landing H is ½.
 And the probability of the coin landing T is ½.

Throwing Dice:

When a single die is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6. The probability of
any one of them is 1/6.
In general:

Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: the chances of rolling a "4" with a die.

Number of ways it can happen: 1 (there is only 1 face with a "4" on it)

Total number of outcomes: 6 (there are 6 faces altogether)

So the probability = 1/6

Example: there are 5 marbles in a bag: 4 are blue, and 1 is red. What is the probability that a blue
marble gets picked?

Number of ways it can happen: 4 (there are 4 blues)

Total number of outcomes: 5 (there are 5 marbles in total)

So the probability = 4/5 = 0.8
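Both worked examples reduce to a one-line computation. A minimal sketch using Python's Fraction type so the probabilities stay exact:

```python
from fractions import Fraction

p_four = Fraction(1, 6)  # rolling a "4": 1 favourable face out of 6
p_blue = Fraction(4, 5)  # picking a blue marble: 4 favourable out of 5

print(p_four, float(p_four))  # 1/6, about 0.1667
print(p_blue, float(p_blue))  # 4/5, exactly 0.8
```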

1.3 Axioms /Rules/ of Probability


Suppose S is a sample space. If S is discrete, then all of its subsets correspond to events, and
conversely. If, however, S is continuous, only special subsets of S (the measurable ones) are regarded as events. To
each event A belonging to a class C of events, we associate a real number P(A) defined on C. P

is called a probability function, and P(A) is the probability of the event A, if the following
axioms are satisfied.

Axiom 1: For every event A in C, 0 ≤ P(A) ≤ 1.
Axiom 2: For the sure event S, P(S) = 1.

Axiom 3: If A1, A2, A3, ... are mutually exclusive events in C, then
P(A1 ∪ A2 ∪ A3 ∪ ...) = P(A1) + P(A2) + P(A3) + ...

Some Theorems on Probability:

Theorem 1: If A ⊂ B, then P(A) ≤ P(B) and P(B − A) = P(B) − P(A).

Theorem 2: For every event A, 0 ≤ P(A) ≤ 1.

Theorem 3: P(∅) = 0.

Theorem 4: If A′ is the complement of A, then P(A′) = 1 − P(A).

Theorem 5: If A1, A2, A3, ..., An are mutually exclusive events, then
P(A1 ∪ A2 ∪ A3 ∪ ... ∪ An) = P(A1) + P(A2) + P(A3) + ... + P(An)

Theorem 6: If A and B are any two events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

If A1, A2, A3 are any three events, then
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A2 ∩ A3) − P(A1 ∩ A3) + P(A1 ∩ A2 ∩ A3)
Extension of this result to the union of more than three events is also possible.
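Theorem 6 can be verified by brute-force enumeration over a small sample space. A minimal sketch using two dice; the events A and B are illustrative choices:

```python
from itertools import product
from fractions import Fraction

S = list(product(range(1, 7), repeat=2))  # sample space of two dice: 36 points

A = {s for s in S if sum(s) == 7}  # event: the dice sum to 7
B = {s for s in S if s[0] == 6}    # event: the first die shows 6

def P(E):
    """Classical probability: favourable points over total points."""
    return Fraction(len(E), len(S))

# Theorem 6: P(A union B) = P(A) + P(B) - P(A intersection B)
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A | B))  # 11/36
```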



1.4 Counting Procedures


A) Counting rule

If n distinct items are to be arranged in order, the number of possible arrangements is n! = n × (n − 1) × ... × 2 × 1.

Example: if you have six books to arrange on a shelf, the number of possible arrangements is 6! = 720.


B) Permutations

Modifying the above example, if you have six books, but there is room for only four books on
the shelf, in how many ways can you arrange these books on the shelf?

Solution: the number of ordered arrangements of four books selected from six books is equal to:
6P4 = 6!/(6 − 4)! = 6 × 5 × 4 × 3 = 360

In many situations, you are not interested in the order of the outcomes but only in the number of
ways that x items can be selected from n items, irrespective of order. Each possible selection is
called a combination.


C) Combinations

The number of combinations of x items selected from n items is nCx = n!/(x!(n − x)!). If you compare this rule to the rule for permutations, you see that it differs only in the inclusion of the term
x! in the denominator. When permutations are used, all of the arrangements of the x objects
are distinguishable. With combinations, the x! possible arrangements of the same x objects are irrelevant.

Modifying the above example, if the order of the books on the shelf is irrelevant, in how many
ways can you select four of the six books for the shelf?

Solution: the number of combinations of four books selected from six books is equal to:
6C4 = 6!/(4!(6 − 4)!) = 15
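Python's standard library computes all three counting quantities directly. A minimal sketch reproducing the book-shelf examples (math.perm and math.comb require Python 3.8 or later):

```python
import math

print(math.factorial(6))  # 720: ordered arrangements of all six books
print(math.perm(6, 4))    # 360: ordered arrangements of 4 books chosen from 6
print(math.comb(6, 4))    # 15:  selections of 4 books from 6, order irrelevant
```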

1.5 Probabilities Under conditions of Statistical Independence


When two events happen, the outcome of the first event may or may not have an effect on the
outcome of the second event. That is, the events may be either dependent or independent. In this
section, we examine events that are statistically independent.
Definition: statistical independence is the case in which the occurrence of one event has no effect
on the probability of the occurrence of any other event.
There are 3 types of probabilities under statistical independence:


1. Marginal Probability
2. Joint Probability
3. Conditional Probability
1) Marginal Probabilities Under Statistical Independence.
A marginal or unconditional probability is the simple probability of the occurrence of an event.

Example: In a fair coin toss, P(H) = 0.5; that is, the probability of heads equals 0.5, and the
probability of tails equals 0.5. This is true for every toss, no matter how many tosses have been
made or what their outcomes have been. Every toss stands alone and is in no way connected
with any other toss. Thus, the outcome of each toss of a fair coin is an event that is statistically
independent of the outcomes of every other toss of the coin.

2) Joint Probabilities Under Statistical Independence


The probability of two or more independent events occurring together or in succession is the
product of their marginal probabilities. Mathematically, this is stated as:
P(A and B) = P(A ∩ B) = P(A) · P(B)
Where: P(A ∩ B) = the probability of events A and B occurring together or in succession; this
is known as the joint probability.

P(A) = the marginal probability of event A occurring

P(B) = the marginal probability of event B occurring

For more than two independent events: P(A ∩ B ∩ C) = P(A) · P(B) · P(C)

Example: In terms of the fair coin example, the probability of heads on two successive tosses is
the probability of heads on the first toss (which we shall call H1) times the probability of heads
on the second toss (H2). We have shown that the events are statistically independent, because the
probability of heads on any toss is 0.5, and P(H1 ∩ H2) = 0.5 × 0.5 = 0.25. Thus the probability of
heads on two successive tosses is 0.25.

Exercises: 1. What is the probability of getting tails, heads, and tails in that order on three
successive tosses of a fair coin?

Solution: P(T1 ∩ H2 ∩ T3) = P(T1) · P(H2) · P(T3) = 0.5 × 0.5 × 0.5 = 0.125


You can also check this using a tree diagram.


2. What is the probability of at least one tail on three tosses?
Solution: "At least one tail" means a minimum of one tail, i.e., one, two, or three tails. There is only one
outcome in which no tails occur, namely H1H2H3. Therefore, we can simply subtract to get the answer:
P(at least one tail in 3 tosses) = 1 − P(all heads)
= 1 − P(H1 ∩ H2 ∩ H3)

= 1 − 0.125 = 0.875
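Both exercises can be checked by enumerating the eight equally likely outcomes of three tosses. A minimal sketch:

```python
from itertools import product
from fractions import Fraction

outcomes = list(product("HT", repeat=3))  # all 2^3 = 8 outcomes of three tosses

p_tht = Fraction(outcomes.count(("T", "H", "T")), len(outcomes))
p_at_least_one_tail = Fraction(sum("T" in o for o in outcomes), len(outcomes))

print(p_tht)                # 1/8 = 0.125
print(p_at_least_one_tail)  # 7/8 = 0.875
```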

3) Conditional Probabilities Under Statistical Independence


Thus far, we have considered two types of probabilities: marginal (or unconditional) probability
and joint probability. Symbolically, marginal probability is P(A) and joint probability is P(A ∩ B).
Besides these two, there is another type of probability, known as conditional probability.

Conditional probability is the probability that a second event (let’s say B) will occur if a first
event (let’s say A) has already happened.
Symbolically: P(B/A), read as the probability of B given that event A has occurred.
- For statistically independent events, the conditional probability of event B given that
event A has occurred is simply the probability of event B:
P (B/A) = P (B)
- Thus, statistical independence can be defined symbolically as the condition in which
P (B/A) = P (B).
- Examples: What is the probability that the second toss of a fair coin will result in heads,
given that heads resulted on the first toss?
- Solution: In this case the two events are independent.
- Symbolically: the question is written as: P (H2/H1)
- Using the rule for conditional probability under statistical independence, P(H2/H1) = P(H2)
- Therefore, P(H2/H1) = 0.5
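A quick simulation supports the same conclusion: conditioning on heads having occurred on the first toss leaves the probability of heads on the second toss unchanged. A minimal sketch; the sample size is an arbitrary choice:

```python
import random

n = 100_000
tosses = [(random.random() < 0.5, random.random() < 0.5) for _ in range(n)]  # True = heads

# Keep only the pairs in which the first toss was heads, then look at the second toss
seconds_given_h1 = [h2 for h1, h2 in tosses if h1]
print(sum(seconds_given_h1) / len(seconds_given_h1))  # about 0.5: P(H2/H1) = P(H2)
```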
1.6 BAYES’ THEOREM
In our discussion of conditional probability, we indicated that revising probabilities when new
information is obtained is an important phase of probability analysis. Often, we begin our
analysis with initial or prior probability estimates for specific events of interest. Then, from
sources such as a sample, a special report, or some other means, we obtain some additional
information about the events. Given this new information, we update the prior probability values

by calculating revised probabilities, referred to as posterior probabilities. Bayes’ theorem provides
a means for making these probability calculations. The steps in this probability revision process
are shown in the figure below.

Prior Probabilities → New Information → Application of Bayes’ Theorem → Posterior Probabilities

Figure 2.1: Revising prior probabilities and estimating posterior probabilities

Example: An Application of Bayes’ Theorem


Consider a manufacturing firm that receives shipments of parts from two different suppliers. Let
A1 denote the event that a part is from supplier 1 and A2 denote the event that a part is from
supplier 2. Currently, 65% of the parts purchased by the company are from supplier 1 and the
remaining 35% are from supplier 2. Hence, if a part is selected at random, we would assign the
prior probabilities P (A1) = .65 and P (A2) = .35.

The quality of the purchased parts varies with the source of supply. Historical data suggest that
the quality ratings of the two suppliers are as shown in the table below.

Table 2.1: Historical Quality Levels of Two Suppliers

             Percentage Good Parts    Percentage Bad Parts
Supplier 1            98                        2
Supplier 2            95                        5
If we let G denote the event that a part is good, and B denote the event that a part is bad, the
information in table 2.1 provides the following conditional probability values.
P (G/A1) = 0.98 P (B/A1) = 0.02
P (G/A2) = 0.95 P (B/A2) = 0.05
Based on the above information, we can compute the joint probabilities of a part being good and
coming from supplier 1, good and from supplier 2, bad and from supplier 1, and bad and from
supplier 2:
P(A1 ∩ G) = P(A1) P(G/A1) = .65 × .98 = .6370
P(A1 ∩ B) = P(A1) P(B/A1) = .65 × .02 = .0130
P(A2 ∩ G) = P(A2) P(G/A2) = .35 × .95 = .3325


P(A2 ∩ B) = P(A2) P(B/A2) = .35 × .05 = .0175


Suppose now that the parts from the two suppliers are used in the firm’s manufacturing process
and that a machine breaks down because it attempts to process a bad part. Given the information
that the part is bad, what is the probability that it came from supplier 1, and what is the
probability that it came from supplier 2?

 With the prior probabilities and the joint probabilities, Bayes’ theorem can be used to
answer these questions.
Letting B denote the event that the part is bad, we are looking for the posterior
probabilities P(A1/B) and P(A2/B). From the law of conditional probability and
marginal probability, we know that:
 P(A1/B) = P(A1 ∩ B) / P(B)
 P(A1 ∩ B) = P(A1) P(B/A1) and P(A2 ∩ B) = P(A2) P(B/A2)
 P(B) = P(A1 ∩ B) + P(A2 ∩ B)
 P(B) = P(A1) P(B/A1) + P(A2) P(B/A2)
Substituting the above equations, we obtain Bayes’ theorem for the case of two events:

P(A1/B) = P(A1 ∩ B) / [P(A1 ∩ B) + P(A2 ∩ B)] = P(A1) P(B/A1) / [P(A1) P(B/A1) + P(A2) P(B/A2)]

P(A2/B) = P(A2 ∩ B) / [P(A1 ∩ B) + P(A2 ∩ B)] = P(A2) P(B/A2) / [P(A1) P(B/A1) + P(A2) P(B/A2)]


Using the above formula:

P(A1/B) = (0.65 × 0.02) / [(0.65 × 0.02) + (0.35 × 0.05)] = 0.0130/0.0305 = 0.4262

P(A2/B) = (0.35 × 0.05) / [(0.65 × 0.02) + (0.35 × 0.05)] = 0.0175/0.0305 = 0.5738

Note that in this application we started with a probability of .65 that a part selected at random
was from supplier 1. However, given information that the part is bad, the probability that the
part is from supplier 1 drops to .4262. In fact, if the part is bad, there is a better than 50-50
chance that the part came from supplier 2; that is, P (A2/B) = .5738.

Bayes’ theorem is applicable when the events for which we want to compute posterior
probabilities are mutually exclusive and their union is the entire sample space. Bayes’ theorem
can be extended to the case where there are n mutually exclusive events A1, A2,…, An whose
union is the entire sample space. In such a case, Bayes’ theorem for computing the posterior
probability P(Ai/B) can be written symbolically as:

P(Ai/B) = P(Ai) P(B/Ai) / [P(A1) P(B/A1) + P(A2) P(B/A2) + ... + P(An) P(B/An)]

Bayes’ theorem calculations can also be carried out using a tabular approach or a tree diagram.
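The supplier example also reduces to a few lines of code. A minimal sketch of the two-event form of Bayes’ theorem, with the numbers taken from Table 2.1:

```python
# Prior probabilities and conditional probabilities of a bad part, by supplier
prior = {"supplier 1": 0.65, "supplier 2": 0.35}
p_bad_given = {"supplier 1": 0.02, "supplier 2": 0.05}

# Law of total probability: P(B) = sum of P(Ai) * P(B/Ai)
p_bad = sum(prior[s] * p_bad_given[s] for s in prior)  # 0.0305

# Posterior probability of each supplier, given a bad part
for s in prior:
    posterior = prior[s] * p_bad_given[s] / p_bad
    print(s, round(posterior, 4))  # supplier 1: 0.4262, supplier 2: 0.5738
```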

Self-Test

1. Suppose two dice are rolled. What is the sample space? Identify the event, “dice sum to
seven.”
2. List the outcomes in the sample space for tossing a coin three times (use H for heads and
T for tails).
3. Using the sample space in #2, find the probabilities below as reduced fractions: a) Of
getting exactly one tail b) Of getting no heads c) Of getting all heads or all tails
4. During a sale at a men’s store, 16 white sweaters, 3 red sweaters, 9 blue sweaters, and 7
yellow sweaters were purchased. If a customer is selected at random, find the probability
that he bought: (as fractions) a) A blue sweater b) A yellow or white sweater
c) A sweater that was not white
5. When two dice are rolled, find the probability of getting: (as reduced fractions)
a) A sum of 5 or 6 b) A sum greater than 9 c) A sum less than 4 or greater than 9
d) A sum that is divisible by 4 e) A sum of 14 f) A sum less than 13


CHAPTER TWO
RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS
2.1 The Concept & Definition of a Random Variable
A random variable is a numerical description of the outcome of an experiment. Random
variables must have numerical values.
In effect, a random variable associates a numerical value with each possible experimental
outcome. The particular numerical value of the random variable depends on the outcome of the
experiment. A random variable can be classified as being either discrete or continuous depending
on the numerical values it assumes.
Discrete Random Variables
A random variable that may assume either a finite number of values or an infinite sequence of
values such as 0, 1, 2, . . . is referred to as a discrete random variable. For example, consider
the experiment of an accountant taking the certified public accountant (CPA) examination. The
examination has four parts. We can define a random variable as x = the number of parts of the
CPA examination passed. It is a discrete random variable because it may assume the finite
number of values 0, 1, 2, 3, or 4.
As another example of a discrete random variable, consider the experiment of cars arriving at a
tollbooth. The random variable of interest is x = the number of cars arriving during a one-day
period. The possible values for x come from the sequence of integers 0, 1, 2, and so on. Hence, x
is a discrete random variable assuming one of the values in this infinite sequence.
Although the outcomes of many experiments can naturally be described by numerical values,
others cannot. For example, a survey question might ask an individual to recall the message in a
recent television commercial. This experiment would have two possible outcomes:
The individual cannot recall the message and the individual can recall the message. We can still
describe these experimental outcomes numerically by defining the discrete random variable x as
follows: let x = 0 if the individual cannot recall the message and x = 1 if the individual can recall
the message. The numerical values for this random variable are arbitrary (we could use 5 and
10), but they are acceptable in terms of the definition of a random variable—namely, x is a
random variable because it provides a numerical description of the outcome of the experiment.
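The indicator coding just described is easy to express as a mapping from experimental outcomes to numbers. A minimal sketch; the dictionary encoding is an illustrative choice:

```python
# Random variable x: assigns a numerical value to each experimental outcome
x = {"cannot recall": 0, "can recall": 1}

outcome = "can recall"
print(x[outcome])  # 1: the numerical description of this outcome
```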
Table 2.1 provides some additional examples of discrete random variables. Note that in each
example the discrete random variable assumes a finite number of values or an infinite sequence


of values such as 0, 1, 2, . . . . These types of discrete random variables are discussed in detail in
this chapter.
Table 2.1: Examples of discrete random variables

Continuous Random Variables


A random variable that may assume any numerical value in an interval or collection of intervals
is called a continuous random variable. Experimental outcomes based on measurement scales
such as time, weight, distance, and temperature can be described by continuous random
variables. For example, consider an experiment of monitoring incoming telephone calls to the
claims office of a major insurance company. Suppose the random variable of interest is x = the
time between consecutive incoming calls in minutes. This random variable may assume any
value in the interval x ≥ 0. Actually, an infinite number of values are possible for x, including
values such as 1.26 minutes, 2.751 minutes, 4.3333 minutes, and so on. As another example,
consider a 90-mile section of interstate highway I-75 north of Atlanta, Georgia. For an
emergency ambulance service located in Atlanta, we might define the random variable as x =
number of miles to the location of the next traffic accident along this section of I-75. In this case,
x would be a continuous random variable assuming any value in the interval 0 ≤ x ≤ 90.

Table 2.2: Examples of continuous random variables


Note: One way to determine whether a random variable is discrete or continuous is to think of
the values of the random variable as points on a line segment. Choose two points representing
values of the random variable. If the entire line segment between the two points also represents
possible values for the random variable, then the random variable is continuous.

2.2 Discrete Random Variables and their probability Distributions


The probability distribution for a random variable describes how probabilities are distributed
over the values of the random variable. For a discrete random variable x, the probability
distribution is defined by a probability function, denoted by f (x). The probability function
provides the probability for each value of the random variable.
For example, suppose that for the experiment of rolling a die we define the random variable x to
be the number of dots on the upward face. For this experiment, n = 6 values are possible for the
random variable: x = 1, 2, 3, 4, 5, 6. Thus, the probability function for this discrete uniform
random variable is f(x) = 1/6 for each value of x.

A discrete probability distribution given by a formula is the discrete uniform probability
distribution. Its probability function is defined by the equation:

Discrete uniform probability function: f(x) = 1/n

where n = the number of values the random variable may assume
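A minimal sketch of this probability function applied to the die example; the function name uniform_pmf is an illustrative choice:

```python
def uniform_pmf(n):
    """Discrete uniform probability function: f(x) = 1/n for x = 1, ..., n."""
    return {x: 1 / n for x in range(1, n + 1)}

f = uniform_pmf(6)      # fair die
print(f[4])             # 0.1666...: each face has probability 1/6
print(sum(f.values())) # 1.0: the probabilities sum to one, as the axioms require
```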
2.3 Expected Value and Variance of discrete random variables
i) Expected Value
The expected value, or mean, of a random variable is a measure of the central location of the
random variable. The formula for the expected value of a discrete random variable x follows.
The expected value is a weighted average of the values of the random variable, where the weights
are the probabilities.
Expected value of a discrete random variable: E(x) = μ = Σ x f(x)
Rules of Expected Values
1. If k is a constant, then E(k) = k.
2. If a and b are constants, then E(aX + b) = aE(X) + b (expected value of a linear function).
3. The mathematical expectation of the sum of two or more random variables is equal to the sum
of the expectations of the individual random variables, i.e., E(X + Y + Z) = E(X) + E(Y) + E(Z).
4. If X and Y are independent random variables, then E(XY) = E(X) · E(Y); but E(XY) ≠ E(X) · E(Y) for dependent random variables.
5. The expected value of the ratio of two random variables is not equal to the ratio of their
expected values, i.e., E(X/Y) ≠ E(X)/E(Y).
Example: A real-estate agent sells 0, 1, or 2 houses each working week with respective
probabilities 0.5, 0.3, and 0.2. Compute the expected value of the number of houses sold
per week.

Solution: E(X) = Σ x f(x) = (0 × 0.5) + (1 × 0.3) + (2 × 0.2) = 0.7
ii) Variance and Standard Deviation of a Random Variable
The variance measures how the individual values of a random variable are spread, dispersed, or
distributed around its mean or expected value.
The variance of a random variable X, denoted by σ²(x), is the expected value of the squared
deviations of the random variable from its expected value. The variance of a discrete random
variable is given by:
σ²(x) = Σ (x − μ)² f(x)

Note: the standard deviation is simply the square root of the variance.


Properties of Variance
1. The variance of a constant is zero.
2. If X and Y are two independent random variables, then Var(X + Y) = Var(X) + Var(Y).
3. If b is a constant, then Var(X + b) = Var(X).
4. If a is a constant, then Var(aX) = a² Var(X).
5. If X and Y are independent random variables and a and b are constants, then Var(aX + bY) = a² Var(X) + b² Var(Y).
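Applying the expected-value and variance formulas to the real-estate example above; a minimal sketch:

```python
import math

f = {0: 0.5, 1: 0.3, 2: 0.2}  # houses sold per week -> probability

mu = sum(x * p for x, p in f.items())               # E(X) = sum of x f(x) = 0.7
var = sum((x - mu) ** 2 * p for x, p in f.items())  # variance = sum of (x - mu)^2 f(x) = 0.61
sd = math.sqrt(var)                                 # standard deviation, about 0.781

print(mu, var, round(sd, 4))
```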


2.4 Continuous Probability Distribution: (Probability Density Function – PDF)
