Probability Theory
Descriptive Statistics
Population & Sample
Descriptive measures quartiles
Percentiles & box plots
Lecture 1
● Statistics
● Descriptive Statistics
● Statistical Inference
● Population vs Sample
● Frequency Distributions
● Cumulative Distributions
● Sample Mean
● Sample Median
● Deviations from Mean
● Variance
● Standard Deviation
● Quartiles
● Percentiles
● Box Plots
Statistics
What is statistics?
Statistics is the study and manipulation of data, including ways to gather, review, analyze,
and draw conclusions from data.
Answers provided by statistical analysis can provide the basis for making better decisions
and choices of actions. Statistical reasoning and methods can help you become efficient at
obtaining information and making useful conclusions.
Descriptive Statistics
One must decide carefully how far to go in generalizing from a given set of data.
Population Sample
All the students in the class form the population. The students who regularly attend
class form a sample.
Frequency Distributions
A frequency distribution is a table that divides a set of data into a suitable number
of classes (categories), showing also the number of items belonging to each class.
Instead of knowing the exact value of each item, we only know that it belongs to a
certain class.
Example
Data
245 333 296 304 276 336 289 234 253 292 366 323 309 284 310 338 297 314 305 330 266 391 315 305 290 300
292 311 272 312 315 355 346 337 303 265 278 276 373 271 308 276 364 390 298 290 308 221 274 343
Class limits   Frequency
(205,245]      3
(245,285]      11
(285,325]      23
(325,365]      9
(365,405]      4
Total          50
Note that the class limits are given to as many decimal places as the original data. Had the original data been given to one
decimal place, we would have used the class limits 205.1–245.0, 245.1–285.0, …, 365.1–405.0.
Class Mark and Class Interval
Class Mark: The class marks of a frequency distribution are obtained by averaging
successive class boundaries.
Class Interval: If the classes of a distribution are all of equal length, then
subtracting the lower limit of a class from its upper limit gives the class interval.
Class Interval: 40
Cumulative Distribution (less than or equal to variant)
(205,245] 3
(245,285] 14
(285,325] 37
(325,365] 46
(365,405] 50
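The class counts and the cumulative distribution can be recomputed from the raw data. The following is a minimal sketch, assuming half-open classes (lo, hi] with the boundaries shown above; the names `frequency_table`, `edges`, and `marks` are illustrative and not part of the lecture.

```python
# The 50 observations from the example above.
data = [
    245, 333, 296, 304, 276, 336, 289, 234, 253, 292, 366, 323, 309,
    284, 310, 338, 297, 314, 305, 330, 266, 391, 315, 305, 290, 300,
    292, 311, 272, 312, 315, 355, 346, 337, 303, 265, 278, 276, 373,
    271, 308, 276, 364, 390, 298, 290, 308, 221, 274, 343,
]

# Class boundaries for (205,245], (245,285], ..., (365,405].
edges = [205, 245, 285, 325, 365, 405]


def frequency_table(values, edges):
    """Count how many values fall in each class (lo, hi]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] < v <= edges[i + 1]:
                counts[i] += 1
                break
    return counts


freq = frequency_table(data, edges)

# Cumulative ("less than or equal to") counts are running totals.
cum = []
running = 0
for f in freq:
    running += f
    cum.append(running)

# Class marks are the averages of successive class boundaries.
marks = [(edges[i] + edges[i + 1]) / 2 for i in range(len(freq))]

for i, f in enumerate(freq):
    print(f"({edges[i]},{edges[i + 1]}]  mark={marks[i]}  freq={f}  cum={cum[i]}")
```

The running totals reproduce the cumulative counts 3, 14, 37, 46, 50 listed above.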
Descriptive Measures: Sample Mean
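For n measurements (data points) x1, x2, ..., xn, the sample mean is
x̄ = (x1 + x2 + ... + xn) / n
The sample median, the middle value of the ordered data (or the average of the two
middle values when n is even), is used instead if it is desired to eliminate the
effect of extreme (very large or very small) values.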
Question
A sample of six university students responded to the question “How much time, in
minutes, did you spend on the social network site yesterday?”
100 45 60 130 30 35
Mean: 66.67
Median: 52.5
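As a quick check of the two values above, here is a small sketch using Python's standard `statistics` module on the six responses; the variable names are illustrative.

```python
import statistics

# The six responses (minutes) from the question above.
minutes = [100, 45, 60, 130, 30, 35]

mean = statistics.mean(minutes)      # sum of the values divided by n
median = statistics.median(minutes)  # average of the two middle ordered values (n is even)

print(round(mean, 2), median)
```

The median is pulled less than the mean by the large values 100 and 130, which is why it is preferred when extreme values are present.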
Descriptive Measures: Deviations from Mean
Data: 1 2 3 4 5 Mean 3
Data: -7 -3 3 10 12 Mean 3
We observe that the dispersion of a set of data is small if the values are closely
bunched about their mean, and that it is large if the values are scattered widely
about their mean.
Because the deviations sum to zero, we need to remove their signs. Absolute
value and square are two natural choices.
Squaring the deviations leads to the sample variance, s² = Σ(xi − x̄)² / (n − 1).
The reason for dividing by n − 1 instead of n is that there are only n − 1 independent deviations xi − x̄.
Because their sum is always zero, the value of any particular one is always equal to the negative
of the sum of the other n − 1 deviations.
If many of the deviations are large in magnitude, either positive or negative, their squares will be
large and s² will be large. When all the deviations are small, s² will be small.
Example
The delay times (handling, setting, and positioning the tools) for cutting 6 parts on
an engine lathe are 0.6, 1.2, 0.9, 1.0, 0.6, and 0.8 minutes. Calculate s².
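One way to work this example is sketched below, assuming the usual sample variance with divisor n − 1 (sum of squared deviations from the mean divided by n − 1). The variable names are illustrative.

```python
import math
import statistics

# Delay times in minutes from the example above.
delays = [0.6, 1.2, 0.9, 1.0, 0.6, 0.8]

n = len(delays)
xbar = sum(delays) / n

# Deviations from the mean; they always sum to zero (up to rounding).
deviations = [x - xbar for x in delays]

# Sample variance: squared deviations summed and divided by n - 1.
s2 = sum(d * d for d in deviations) / (n - 1)

# Standard deviation: back in the original units (minutes).
s = math.sqrt(s2)

print(round(xbar, 2), round(s2, 3), round(s, 3))
```

The library call `statistics.variance(delays)` also uses the n − 1 divisor and should agree with `s2`.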
Descriptive Measures: Standard Deviation
Notice that the units of s² are not those of the original observations.
In the previous question the data are delay times in minutes, but s² has the unit
(minute)².
The standard deviation is by far the most generally useful measure of variation. Its
advantage over the variance is that it is expressed in the same units as the
observations.
Descriptive Measures: Quartiles
In addition to the median, which divides a set of data into halves, we can consider
other division points.
When an ordered data set is divided into quarters, the resulting division points are
called sample quartiles.
The first quartile, Q1, is a value that has one-fourth, or 25%, of the observations
below its value. The first quartile is also the sample 25th percentile P0.25.
Descriptive Measures: Percentile
The sample 100p-th percentile is a value such that at least 100p% of the
observations are at or below this value, and at least 100(1 − p)% are at or above
this value.
Question
Given the data
136 143 147 151 158 160 161 163 165 167 173 174 181 181 185 188 190 205
n = 18
To check that 158 is the first quartile (p = 0.25):
Number of observations at or below 158 = 5 (at least 18 × 0.25 = 4.5 required by the definition)
Number of observations at or above 158 = 14 (at least 18 × 0.75 = 13.5 required by the definition)
Question
Given the data
136 143 147 151 158 160 161 163 165 167 173 174 181 181 185 188 190 205
Obtain the quartiles and the 10th percentile.
n = 18
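Second quartile: 18 × 0.5 = 9 is an integer, so we average the 9th and 10th ordered values:
Q2 = (165 + 167)/2 = 166
Third quartile: 18 × 0.75 = 13.5 is not an integer, so we round up and take the 14th ordered value: Q3 = 181
10th percentile: 18 × 0.10 = 1.8 is not an integer, so we round up and take the 2nd ordered value: P0.10 = 143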
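The rule used in these answers can be sketched as a small function: compute np; if it is an integer k, average the k-th and (k+1)-th ordered values, otherwise round up and take that ordered value. The function name `sample_percentile` is illustrative. Library routines such as `statistics.quantiles` use different interpolation conventions and may give slightly different values.

```python
import math


def sample_percentile(data, p):
    """Sample 100p-th percentile using the np rule described above (0 < p < 1)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    xs = sorted(data)
    n = len(xs)
    k = n * p
    if math.isclose(k, round(k)):
        # np is an integer: average the k-th and (k+1)-th ordered values.
        k = int(round(k))
        return (xs[k - 1] + xs[k]) / 2
    # np is not an integer: round up and take that ordered value.
    return xs[math.ceil(k) - 1]


data = [136, 143, 147, 151, 158, 160, 161, 163, 165,
        167, 173, 174, 181, 181, 185, 188, 190, 205]

q1 = sample_percentile(data, 0.25)
q2 = sample_percentile(data, 0.50)
q3 = sample_percentile(data, 0.75)
p10 = sample_percentile(data, 0.10)

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(q1, q2, q3, p10, iqr)
```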
Descriptive Measures: Range & Interquartile Range
The minimum and maximum observations also convey information concerning the
amount of variability present in a set of data. Together, they describe the interval
containing all of the observed values.
The amount of variation in the middle half of the data is described by the
interquartile range.
● Experiment
● Sample Space
● Events
● Set Theory
● Disjoint Events
● Permutations & Combinations
● Questions
Experiment
Examples:
Simple Event → HH
An event is just a set, so relationships and results from elementary set theory can
be used to study events.
Mutually Exclusive or Disjoint Events
A permutation is an ordered arrangement of items (the order of the data matters), whereas a combination
is an unordered selection of items (the order of the data does not matter).
Question
The computers of six faculty members in a certain department are to be replaced. Two of the
faculty members have selected laptop machines and the other four have chosen desktop
machines.
Suppose that only two of the setups can be done on a particular day, and the two computers to be
set up are randomly selected from the six (implying 15 equally likely outcomes; if the computers
are numbered 1, 2, . . . , 6, then one outcome consists of computers 1 and 2, another consists of
computers 1 and 3, and so on).
a. What is the probability that both selected setups are for laptop computers? C(2,2)/15 = 1/15
b. What is the probability that both selected setups are desktop machines? C(4,2)/15 = 6/15 = 0.4
c. What is the probability that at least one selected setup is for a desktop computer?
(15 − 1)/15 = 14/15
d. What is the probability that at least one computer of each type is chosen for setup? (2 × 4)/15 = 8/15
Propositions
A homeowner doing some remodeling requires the services of both a plumbing contractor
and an electrical contractor. If there are 12 plumbing contractors and 9 electrical contractors
available in the area, in how many ways can the contractors be chosen? 12 × 9 = 108
Question
A production facility employs 20 workers on the day shift, 15 workers on the swing
shift, and 10 workers on the graveyard shift. A quality control consultant is to
select 6 of these workers for in-depth interviews.
Suppose the selection is made in such a way that any particular group of 6
workers has the same chance of being selected as does any other group (drawing
6 slips without replacement from among 45).
a. How many selections result in all 6 workers coming from the day shift? What is the
probability that all 6 selected workers will be from the day shift?
Number of selections: C(20,6) = 38,760
Probability: C(20,6)/C(45,6) = 38,760/8,145,060 ≈ 0.0048
Question
A production facility employs 20 workers on the day shift, 15 workers on the swing
shift, and 10 workers on the graveyard shift. A quality control consultant is to
select 6 of these workers for in-depth interviews.
Suppose the selection is made in such a way that any particular group of 6
workers has the same chance of being selected as does any other group (drawing
6 slips without replacement from among 45).
b. What is the probability that all 6 selected workers will be from the same shift?
Question
A production facility employs 20 workers on the day shift, 15 workers on the swing shift,
and 10 workers on the graveyard shift. A quality control consultant is to select 6 of these
workers for in-depth interviews.
Suppose the selection is made in such a way that any particular group of 6 workers has the
same chance of being selected as does any other group (drawing 6 slips without
replacement from among 45).
c. What is the probability that at least two different shifts will be represented among the
selected workers?
1 − (C(20,6) + C(15,6) + C(10,6))/C(45,6) = 1 − 43,975/8,145,060 ≈ 0.9946
Question
A production facility employs 20 workers on the day shift, 15 workers on the swing
shift, and 10 workers on the graveyard shift. A quality control consultant is to
select 6 of these workers for in-depth interviews. Suppose the selection is made in
such a way that any particular group of 6 workers has the same chance of being
selected as does any other group (drawing 6 slips without replacement from
among 45).
d. What is the probability that at least one of the shifts will be unrepresented in the
sample of workers?
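Hint: Let Ai be the event that shift i is unrepresented in the sample. All three shifts
cannot be unrepresented at once, so P(A1 ∩ A2 ∩ A3) = 0.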
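Parts a to d of the shift-worker question can be checked numerically. The sketch below uses `math.comb` for the counts and inclusion-exclusion for part d, with the triple intersection equal to zero. It is one possible way to organize the calculation, and the variable names are illustrative.

```python
from math import comb

day, swing, grave = 20, 15, 10
k = 6  # workers selected

# All equally likely selections of 6 workers from 45.
total = comb(day + swing + grave, k)

# a. All 6 from the day shift.
p_all_day = comb(day, k) / total

# b. All 6 from the same shift (three disjoint cases).
p_same_shift = (comb(day, k) + comb(swing, k) + comb(grave, k)) / total

# c. At least two different shifts represented: complement of b.
p_two_or_more = 1 - p_same_shift

# d. At least one shift unrepresented, by inclusion-exclusion.
#    A shift is unrepresented when all 6 come from the other two shifts;
#    two shifts are unrepresented when all 6 come from the remaining one.
singles = comb(swing + grave, k) + comb(day + grave, k) + comb(day + swing, k)
pairs = comb(grave, k) + comb(swing, k) + comb(day, k)
p_some_unrepresented = (singles - pairs) / total

print(total, p_all_day, p_same_shift, p_two_or_more, p_some_unrepresented)
```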
Question
An academic department with five faculty members— Anderson, Box, Cox, Cramer, and Fisher—must
select two of its members to serve on a personnel review committee. Because the work will be time-
consuming, no one is anxious to serve, so it is decided that the representative will be selected by putting
the names on identical pieces of paper and then randomly selecting two.
a. What is the probability that both Anderson and Box will be selected? 0.1
b. What is the probability that at least one of the two members whose name begins with C is selected?
0.7
c. If the five faculty members have taught for 3, 6, 7, 10, and 14 years, respectively, at the university, what
is the probability that the two chosen representatives have a total of at least 15 years’ teaching experience
there?
0.6
Question
The three most popular options on a certain type of new car are a built-in GPS (A), a
sunroof (B), and an automatic transmission (C). If 40% of all purchasers request A,
55% request B, 70% request C, 63% request A or B, 77% request A or C, 80% request
B or C, and 85% request A or B or C, determine the probabilities of the following
events.
a. The next purchaser will request at least one of the three options. 0.85
b. The next purchaser will select none of the three options. 0.15
c. The next purchaser will request only a built in GPS and not either of the other two
options.
d. The next purchaser will select exactly one of these three options.
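Parts c and d have no answer recorded above. One way to obtain them, sketched here, is to recover the pairwise and triple intersections from the given union probabilities via the addition rule and then isolate the "only" events. The variable names are illustrative.

```python
# Given probabilities.
pA, pB, pC = 0.40, 0.55, 0.70
pA_or_B, pA_or_C, pB_or_C = 0.63, 0.77, 0.80
pA_or_B_or_C = 0.85

# Addition rule: P(A and B) = P(A) + P(B) - P(A or B), and similarly for the others.
pAB = pA + pB - pA_or_B
pAC = pA + pC - pA_or_C
pBC = pB + pC - pB_or_C

# Three-event addition rule solved for the triple intersection.
pABC = pA_or_B_or_C - (pA + pB + pC) + (pAB + pAC + pBC)

# a, b: at least one option, and none.
p_at_least_one = pA_or_B_or_C
p_none = 1 - pA_or_B_or_C

# c: only A (A but neither B nor C).
only_A = pA - pAB - pAC + pABC

# d: exactly one option = only A + only B + only C.
only_B = pB - pAB - pBC + pABC
only_C = pC - pAC - pBC + pABC
exactly_one = only_A + only_B + only_C

print(round(only_A, 2), round(exactly_one, 2))
```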
● Conditional Probability
● Bayes Theorem
● Independent Events
● Questions
Conditional Probability
P(A ∩ B) = P(A) · P(B|A) = P(B) · P(A|B)
Question
A chain of video stores sells three different brands of DVD players. Of its DVD
player sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20%
are brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is
known that 25% of brand 1’s DVD players require warranty repair work, whereas
the corresponding percentages for brands 2 and 3 are 20% and 10%,
respectively.
1) What is the probability that a randomly selected purchaser has bought a brand
1 DVD player that will need repair while under warranty?
0.125
Question
A chain of video stores sells three different brands of DVD players. Of its DVD
player sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20%
are brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is
known that 25% of brand 1’s DVD players require warranty repair work, whereas
the corresponding percentages for brands 2 and 3 are 20% and 10%,
respectively.
2. What is the probability that a randomly selected purchaser has a DVD player
that will need repair while under warranty?
0.205
Question
A chain of video stores sells three different brands of DVD players. Of its DVD
player sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20%
are brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is
known that 25% of brand 1’s DVD players require warranty repair work, whereas
the corresponding percentages for brands 2 and 3 are 20% and 10%,
respectively.
3. If a customer returns to the store with a DVD player that needs warranty repair
work, what is the probability that it is a brand 1 DVD player? A brand 2 DVD
player? A brand 3 DVD player?
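The three parts of the DVD player question follow the multiplication rule, the law of total probability, and Bayes' theorem in turn. A minimal sketch, with illustrative dictionary names:

```python
# Market share of each brand, P(brand).
prior = {1: 0.50, 2: 0.30, 3: 0.20}

# P(needs warranty repair | brand).
repair_given_brand = {1: 0.25, 2: 0.20, 3: 0.10}

# Part 1: multiplication rule, P(brand and repair) = P(brand) * P(repair | brand).
joint = {b: prior[b] * repair_given_brand[b] for b in prior}

# Part 2: law of total probability, sum the joint probabilities over brands.
p_repair = sum(joint.values())

# Part 3: Bayes' theorem, P(brand | repair) = P(brand and repair) / P(repair).
posterior = {b: joint[b] / p_repair for b in prior}

print(joint[1], p_repair)
for b in sorted(posterior):
    print(b, round(posterior[b], 4))
```

The posterior probabilities sum to 1, and brand 1 accounts for most warranty repairs because it combines the largest market share with the highest repair rate.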
Bayes Theorem
Question
Only 1 in 1000 adults is afflicted with a rare disease for which a diagnostic test has
been developed. The test is such that when an individual actually has the disease,
a positive result will occur 99% of the time, whereas an individual without the
disease will show a positive test result only 2% of the time. If a randomly selected
individual is tested and the result is positive, what is the probability that the
individual has the disease?
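A sketch of the Bayes' theorem calculation for this question follows; the variable names are illustrative.

```python
# Given quantities.
p_disease = 0.001            # 1 in 1000 adults has the disease
p_pos_given_disease = 0.99   # positive result when the disease is present
p_pos_given_healthy = 0.02   # positive result when the disease is absent

# Law of total probability: overall chance of a positive test.
p_positive = (p_disease * p_pos_given_disease
              + (1 - p_disease) * p_pos_given_healthy)

# Bayes' theorem: P(disease | positive).
p_disease_given_pos = p_disease * p_pos_given_disease / p_positive

print(round(p_positive, 5), round(p_disease_given_pos, 4))
```

Even with a positive result the probability of disease stays below 5%, because healthy individuals vastly outnumber afflicted ones and produce most of the positive tests.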
Independent Events
Question
Each day, Monday through Friday, a batch of components sent by a first supplier
arrives at a certain inspection facility. Two days a week, a batch also arrives from
a second supplier. Eighty percent of all supplier 1’s batches pass inspection, and
90% of supplier 2’s do likewise. What is the probability that, on a randomly
selected day, two batches pass inspection?
Question
Two pumps connected in parallel fail independently of one another on any given
day. The probability that only the older pump will fail is .10, and the probability that
only the newer pump will fail is .05. What is the probability that the pumping
system will fail on any given day (which happens if both pumps fail)?
Question
Individual A has a circle of five close friends (B, C, D, E, and F). A has heard a
certain rumor from outside the circle and has invited the five friends to a party to
circulate the rumor. To begin, A selects one of the five at random and tells the
rumor to the chosen individual. That individual then selects at random one of the
four remaining individuals and repeats the rumor. Continuing, a new individual is
selected from those not already having heard the rumor by the individual who has
just heard it, until everyone has been told.
Question
1. What is the probability that the rumor is repeated in the order B, C, D, E, and
F?
2. What is the probability that F is the third person at the party to be told the
rumor?
Pattern: ? ? F ? ? (F is fixed in the third position)
Favorable orderings = 4 × 3 × 1 × 2 × 1 = 24
Total possibilities = 5 × 4 × 3 × 2 × 1 = 120
So the answer is 24/120 = 0.2
Question
3. What is the probability that F is the last person to hear the rumor?
Pattern: ? ? ? ? F (F is fixed in the last position)
Favorable orderings = 4 × 3 × 2 × 1 × 1 = 24
Total possibilities = 5 × 4 × 3 × 2 × 1 = 120
So the answer is 24/120 = 0.2
Question
4. If at each stage the person who currently “has” the rumor does not know who
has already heard it and selects the next recipient at random from all five possible
individuals, what is the probability that F has still not heard the rumor after it has
been told ten times at the party?
(4 × 4 × ... × 4)/(5 × 5 × ... × 5), with ten factors in the numerator and in the denominator
= 4^10/5^10 = (0.8)^10
≈ 0.1074
● Random Variable
● Bernoulli Random Variable
● Probability Distribution
● Parameter
● Cumulative Distribution
● Expectation
Random Variable
When a student calls a university help desk for technical support, he/she will either
immediately be able to speak to someone (S, for success) or will be placed on
hold (F, for failure).
With Sample Space = {S,F}, define an rv X by
X(S) = 1 and X(F) =0
The rv X indicates whether (1) or not (0) the student can immediately speak to
someone.
Bernoulli Random Variable
Any random variable whose only possible values are 0 and 1 is called a Bernoulli
random variable.
Types of Random Variables
p(x) = P(X=x)
The values of X along with their probabilities collectively specify the pmf.
Example
Six lots of components are ready to be shipped by a certain supplier. The number
of defective components in each lot is as follows:
Let X be the number of defectives in the selected lot. The three possible X values
are 0, 1, and 2.
Example
Consider whether the next person buying a computer at a certain electronics store
buys a laptop or a desktop model. Let
Question
Consider a group of five potential blood donors—a, b, c, d, and e—of whom only a
and b have type O+ blood. Five blood samples, one from each individual, will be
typed in random order until an O+ individual is identified. Let the rv Y = number of
typings necessary to identify an individual with O+ blood.
Note: Once a donor is selected he cannot be selected again.
Find pmf of Y
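The pmf of Y can be found by brute force: list every equally likely typing order of the five donors and record the position of the first O+ individual. This is a sketch of that enumeration, not necessarily the method intended in the lecture; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

donors = "abcde"
o_positive = {"a", "b"}  # only a and b have type O+ blood

# Count, over all 5! equally likely orders, the position of the first O+ donor.
counts = Counter()
for order in permutations(donors):
    y = next(i for i, d in enumerate(order, start=1) if d in o_positive)
    counts[y] += 1

total = sum(counts.values())
pmf = {y: Fraction(c, total) for y, c in sorted(counts.items())}

for y, prob in pmf.items():
    print(y, prob)
```

Y can be at most 4, since with three non-O+ donors the fourth typing must identify an O+ individual.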
Question
Parameter of Probability Distribution
Suppose p(x) depends on a quantity that can be assigned any one of a number of
possible values, with each different value determining a different probability
distribution. Such a quantity is called a parameter of the distribution. The collection
of all probability distributions for different values of the parameter is called a family
of probability distributions.
Bernoulli distribution (Each different number α between 0 and 1 determines a
different member of the Bernoulli family of distributions.)
Question
Starting at a fixed time, we observe the gender of each student entering the
class until a boy (B) enters. Let p = P(B), assume that successive arrivals of
students are independent, and define the rv X by X = number of
students observed. Find the pmf of X.
Cumulative Distribution Function
Question
A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of
memory. The accompanying table gives the distribution of Y = the amount of
memory in a purchased drive:
Question
A consumer organization that evaluates new automobiles reports the number of
major defects in each car examined. Let X denote the number of major defects in
a randomly selected car of a certain type. The cdf of X is as follows:
Question
Just after birth, each newborn child is rated on a scale called the Apgar scale. The
possible ratings are 0, 1, . . . , 10, with the child’s rating determined by color,
muscle tone, respiratory effort, heartbeat, and reflex irritability (the best possible
score is 10). Let X be the Apgar score of a randomly selected child born at a
certain hospital during the next year, and suppose that the pmf of X is
Question
Let X, the number of interviews a student has prior to getting a job, have pmf
Question
A computer store has purchased three computers of a certain type at $500 apiece.
It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any
computers still unsold after a specified period at $200 apiece. Let X denote the
number of computers sold, and suppose that p(0) = 0.1, p(1) = 0.2, p(2) = 0.3 and
p(3) = 0.4. Let h(X) denote the associated profit, h(X) = 800X − 900. Calculate the
expected profit.
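The expected profit can be computed either directly as the sum of h(x)·p(x) or through linearity, E[h(X)] = 800·E(X) − 900. A short sketch showing both routes, with illustrative names:

```python
# pmf of X, the number of computers sold.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}


def h(x):
    """Profit when x computers are sold: 1000x + 200(3 - x) - 1500 = 800x - 900."""
    return 800 * x - 900


# Direct route: weight each profit by its probability.
expected_profit = sum(h(x) * p for x, p in pmf.items())

# Linearity route: E[800X - 900] = 800 E[X] - 900.
expected_sold = sum(x * p for x, p in pmf.items())
expected_profit_linear = 800 * expected_sold - 900

print(round(expected_sold, 2), round(expected_profit, 2))
```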
● Variance
● Bernoulli Random Variable
● Binomial Experiment
● Binomial Random Variable
● Binomial Tables
● Questions
Variance
Expectation and Variance
Var(X) = E[(X − μ)²]
= E[X² + μ² − 2Xμ]
= E[X²] + E[μ²] − 2E[Xμ]
= E[X²] + μ² − 2μE[X]
= E[X²] + μ² − 2μ·μ
= E[X²] − μ²
Variance
Question
Let X = the outcome when a fair die is rolled once. If before the die is rolled you
are offered either (1/3.5) dollars or h(X) = 1/X dollars, would you accept the
guaranteed amount or would you gamble?
E(h(X)) = ⅙(1 + ½ + ⅓ + ¼ + ⅕ + ⅙) = 2.45/6 ≈ 0.408
E(h(X)) ≈ 1/2.449, while the guaranteed amount is 1/3.5 ≈ 0.286.
Since E(h(X)) is greater than 1/3.5, the gamble has the higher expected payoff.
Binomial Experiment
Suppose that 20% of all copies of a particular textbook fail a certain binding
strength test. Let X denote the number among 15 randomly selected copies that
fail the test. Then X has a binomial distribution with n=15 and p =0.2.
1. Calculate the probability that at most 8 fail the test.
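Where binomial tables are not at hand, the cumulative probability B(x; n, p) = P(X ≤ x) can be computed directly by summing the pmf. A minimal sketch follows; the function names `binom_pmf` and `binom_cdf` are illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x), the sum of the pmf from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Textbook binding example: n = 15 copies, p = 0.2 fail the test.
p_at_most_8 = binom_cdf(8, 15, 0.2)

print(round(p_at_most_8, 3))
```

The same function gives the tabulated values used in the questions that follow, for example `binom_cdf(5, 10, 0.6)`.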
Questions
= 1- B(5;10, 0.6)
= 1-0.367
= 0.633
Question
Mean = 10 × 0.6 = 6
Standard deviation = √(10 × 0.6 × 0.4) ≈ 1.55
P(6 − 1.55 ≤ X ≤ 6 + 1.55) = P(4.45 ≤ X ≤ 7.55)
= P(5 ≤ X ≤ 7) = B(7; 10, 0.6) − B(4; 10, 0.6) = 0.833 − 0.166 = 0.667
Question
Let X denote the number of creatures of a particular type captured in a trap during
a given time period. Suppose that X has a Poisson distribution with µ = 4.5 , so on
average traps will contain 4.5 creatures.
a. Find probability that the trap contains exactly 5 creatures.
b. Find the probability that the trap contains at least 5 creatures.
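Both Poisson probabilities can be evaluated from the pmf p(x; µ) = e^(−µ) µ^x / x!. A minimal sketch with µ = 4.5; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)


mu = 4.5

# Exactly 5 creatures.
p_exactly_5 = poisson_pmf(5, mu)

# At least 5 creatures: complement of "at most 4".
p_at_least_5 = 1 - sum(poisson_pmf(k, mu) for k in range(5))

print(round(p_exactly_5, 4), round(p_at_least_5, 4))
```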
Question
Pk(t) denotes the probability that k events will be observed during any particular
time interval of length t, with Pk(t) = e^(−αt) (αt)^k / k!. The occurrence of events
over time as described is called a Poisson process; the parameter α specifies the
rate for the process.
Question
Calculate mean and variance for the uniform distribution (in terms of a and b).
Question
The time X (min) for a lab assistant to prepare the equipment for a certain
experiment is believed to have a uniform distribution with A = 25 and B = 35.
Cumulative Distribution Functions
Question
η(p) is that value on the measurement axis such that 100p% of the area under the
graph of f(x) lies to the left of η(p) and 100(1-p)% lies to the right.
Question
P(a ≤ X ≤ b) =
Standard Normal Distribution
Question
Important: by symmetry of the standard normal curve,
P(Z ≤ −a) = P(Z ≥ a)
P(−3.4 ≤ Z ≤ 1.25)
Suppose that 25% of all students at a large public university receive financial aid.
Let X be the number of students in a random sample of size 50 who receive
financial aid. X follows a binomial distribution. Calculate the probability that at most
10 students receive aid.
p = 0.25, np = 50(0.25) = 12.5 ≥ 10, nq = 50(0.75) = 37.5 ≥ 10
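Since both np and nq are at least 10, the normal approximation to the binomial applies. The sketch below builds the standard normal cdf Φ from `math.erf`, evaluates the interval probability from the earlier slide, and then approximates P(X ≤ 10). The continuity correction of +0.5 is an assumption here: it is the usual textbook convention but is not shown in the notes. The function name `phi` is illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf, P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Standard normal interval probability from the earlier slide.
p_interval = phi(1.25) - phi(-3.4)

# Normal approximation to the binomial with n = 50, p = 0.25.
n, p = 50, 0.25
mu = n * p                      # 12.5
sigma = sqrt(n * p * (1 - p))   # about 3.06

# Assumed continuity correction: P(X <= 10) is approximated by P(Y <= 10.5).
p_at_most_10 = phi((10 + 0.5 - mu) / sigma)

print(round(p_interval, 4), round(p_at_most_10, 4))
```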
● Exponential Distribution
● Mean & Variance Derivation
● Cumulative Distribution
● Memoryless Property
● Questions
Exponential & Gamma Distribution
Let X = the time between two successive arrivals at the drive-up window of a local
bank. If X has an exponential distribution with λ = 1, compute the following:
a. The expected time between two successive arrivals
b. The standard deviation of the time between successive arrivals
c. P(X ≤ 4)
d. P(2 ≤ X ≤ 5)
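For an exponential distribution with rate λ, the mean and standard deviation are both 1/λ and the cdf is F(x) = 1 − e^(−λx) for x ≥ 0. A short sketch of parts a to d under those standard results; the function name `expon_cdf` is illustrative.

```python
from math import exp

lam = 1.0  # rate parameter from the question


def expon_cdf(x, lam):
    """F(x; lam) = 1 - e^(-lam * x) for x >= 0, and 0 otherwise."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0


# a, b: the mean and the standard deviation both equal 1 / lam.
mean = 1 / lam
sd = 1 / lam

# c: P(X <= 4)
p_c = expon_cdf(4, lam)

# d: P(2 <= X <= 5) = F(5) - F(2)
p_d = expon_cdf(5, lam) - expon_cdf(2, lam)

print(mean, sd, round(p_c, 4), round(p_d, 4))
```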
● Gamma Function
● Gamma Distribution
● Standard Gamma Distribution
● Exponential Distribution
● Gamma Density Curves
● Mean & Variance
● Non standard gamma to standard gamma function
● Questions
Gamma Function
Question
Evaluate each of the following expressions, leaving the final answer in exact
simplified form.
Gamma Distribution
Standard Gamma Distribution
Exponential Distribution
Gamma Density Curves
Standard Gamma Density Curves
Properties
Question
Suppose the time spent by a randomly selected student who uses a terminal
connected to a local time-sharing computer facility has a gamma distribution with
mean 20 min and variance 80 min².
a. What are the values of α and β?
b. What is the probability that a student uses the terminal for at most 24 min?
c. What is the probability that a student spends between 20 and 40 min using the
terminal?
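A sketch of one way to work this question, using the gamma relations mean = αβ and variance = αβ², and the fact that P(X ≤ x) equals the standard gamma cdf evaluated at x/β. The closed form used for the standard gamma cdf below holds only for positive integer α, which happens to be the case here. The function name `std_gamma_cdf_int` is illustrative.

```python
from math import exp, factorial

mean, variance = 20.0, 80.0

# a. Solve mean = alpha * beta and variance = alpha * beta^2.
beta = variance / mean   # 4
alpha = mean / beta      # 5


def std_gamma_cdf_int(x, alpha):
    """Standard gamma cdf F(x; alpha) for a positive integer alpha.

    Uses F(x; alpha) = 1 - e^(-x) * sum_{k < alpha} x^k / k!.
    """
    return 1 - exp(-x) * sum(x**k / factorial(k) for k in range(alpha))


a = int(alpha)

# b. P(X <= 24) = F(24 / beta; alpha)
p_b = std_gamma_cdf_int(24 / beta, a)

# c. P(20 <= X <= 40) = F(40 / beta; alpha) - F(20 / beta; alpha)
p_c = std_gamma_cdf_int(40 / beta, a) - std_gamma_cdf_int(20 / beta, a)

print(alpha, beta, round(p_b, 3), round(p_c, 3))
```

For non-integer α the incomplete gamma function would be needed instead, for example from tables or a numerical library.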
A service station has both self-service and full-service islands. On each island,
there is a single regular unleaded pump with two hoses. Let X denote the number
of hoses being used on the self-service island at a particular time, and let Y
denote the number of hoses on the full-service island in use at that time. The joint
pmf of X and Y appears in the accompanying tabulation.
Question
Answers
a. 0.20
b. 0.42
c. At least one hose is in use at both the self-service and the full-service islands:
P(X ≠ 0 and Y ≠ 0) = 0.7
d. pX(0) = 0.16, pX(1) = 0.34, pX(2) = 0.50, 0 otherwise
pY(0) = 0.24, pY(1) = 0.38, pY(2) = 0.38, 0 otherwise
P(X ≤ 1) = pX(0) + pX(1) = 0.5
Two continuous random variables
Marginal Probability Density Function
Independent random variables
For the given pdf
For a strong positive relationship, that is, when Y tends to increase as X increases,
Cov(X, Y) will be large and positive.
For a strong negative relationship, that is, when Y tends to decrease as X increases,
Cov(X, Y) will be large and negative.
If X and Y are not strongly related, the covariance will be near 0.
Covariance
Question
Cov(X, Y) = E[(X − μX)(Y − μY)]
= E[XY + μXμY − XμY − YμX]
= E(XY) + E(μXμY) − E(XμY) − E(YμX)
= E(XY) + μXμY − μY E(X) − μX E(Y)
= E(XY) + μXμY − μYμX − μXμY
= E(XY) − μXμY
Correlation Coefficient
● Random Sample
● Sample mean
● Central Limit Theorem
● Hypothesis Testing
● Test Procedure
● Type of errors
● Level of Significance
● p value
● Lower Tail Test
● Questions
Random Sample
Sample mean
Central Limit Theorem
Let X1, X2, . . . , Xn be a random sample from a distribution with mean μ and
variance σ².
Then if n is sufficiently large, X̄ has approximately a normal distribution with
mean μ and variance σ²/n.
The larger the value of n, the better the approximation.
Rule of thumb: If n > 30, the Central Limit Theorem can be used.
If X1, X2, . . . , Xn are themselves normally distributed with mean μ and variance σ²,
then for any n, X̄ has a normal distribution with mean μ and variance σ²/n.
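The statement about the mean and variance of X̄ can be illustrated with a small simulation. The population here is an assumption chosen for illustration: an exponential distribution with rate 1 (so μ = 1 and σ² = 1), which is clearly non-normal. The sample size n = 40 and the number of repetitions are also arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 40          # sample size (above the n > 30 rule of thumb)
reps = 5000     # number of simulated samples

mu, sigma2 = 1.0, 1.0  # mean and variance of the assumed exponential(1) population

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

# The theorem predicts values close to mu and sigma2 / n.
print(round(mean_of_means, 3), "vs", mu)
print(round(var_of_means, 4), "vs", sigma2 / n)
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying population is strongly skewed.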
Question
H0 is the default assumption that nothing has changed. So if μ becomes less than
200, then it is a change which will be part of Ha
H0 : μ ≥ 200
Ha : μ < 200
H0 cannot be rejected when there is no evidence that the proposed manufacturing
method reduces costs.
H0 can be rejected when there is evidence that the proposed manufacturing method
reduces costs.
Level of Significance
● The level of significance is the probability of making a type I error when the null
hypothesis is true as an equality.
● Type I error: Reject null hypothesis when it is actually true.
● The greek symbol α (alpha) is used to denote the level of significance, and
common choices for α are 0.05 and 0.01.
● In practice, the level of significance is already specified before testing.
● In simple terms, level of significance will define the rejection region of the
graph.
● By selecting α, the person responsible for the test controls the probability of making a type I error.
● Applications of hypothesis testing that only control for the type I error are called
significance tests.
● Because of the uncertainty associated with making a type II error when
conducting significance tests, statisticians usually recommend that we use the
statement “do not reject H0” instead of “accept H0.”
Tests for Population Mean when σ known: Z test
● Hypothesis Testing
● Lower tail Z test
● Upper tail Z test
● Two tail Z test
● t test
Tests for Population Mean when σ known: Z test
A p-value is a probability that provides a measure of the evidence against the null
hypothesis provided by the sample. Smaller p-values indicate more evidence
against H0.
The value of the test statistic is used to compute the p-value.
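For a lower tail Z test such as H0: μ ≥ 200 versus Ha: μ < 200, the test statistic is z = (x̄ − μ0)/(σ/√n) and the p-value is P(Z ≤ z). The sketch below uses hypothetical numbers (x̄ = 194, σ = 20, n = 40) that are not from the lecture; only μ0 = 200 comes from the hypotheses stated earlier. The function names are illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def lower_tail_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Lower tail Z test with known sigma; returns (z, p_value, reject_H0)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = phi(z)  # area in the lower tail, to the left of z
    return z, p_value, p_value <= alpha


# Hypothetical sample summary (assumed for illustration only).
z, p_value, reject = lower_tail_z_test(xbar=194.0, mu0=200.0, sigma=20.0, n=40)

print(round(z, 3), round(p_value, 4), reject)
```

With these assumed numbers the p-value falls below α = 0.05, so H0 would be rejected; with a p-value above α the conclusion would be "do not reject H0".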
Question
Rules for hypothesis testing
Tests for Population Mean when σ unknown: t test