
Statistical Inference: (Analytic Statistics) Lec 10


• Samples are drawn from much larger
populations.
• Data are collected about the sample so that we
can find out something about the population.
• We use samples to estimate quantities such as
disease prevalence, mean blood pressure, mean
exposure to a carcinogen, etc.
• We also want to know by how much these
estimates might vary from sample to sample
(precision).
Statistical inference
Statistical inference is the procedure by which
we reach a conclusion about a population on
the basis of the information
contained in a sample that has been
drawn from that population.
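[Diagram: a sample is drawn from the population; a statistic computed from the sample is used to make an inference about the population parameter.]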
Hypothesis Testing:
One Sample Cases
Significant Differences
• Hypothesis testing is designed to detect significant
differences: differences that did not occur by random
chance.
• In the “one sample” case, we compare a random
sample (from a large group) to a population.
• We compare a sample statistic to a population
parameter to see if there is a significant difference.
• Hypothesis testing and estimation are used to
reach conclusions about a population by
examining a sample of that population.
• Hypothesis testing is widely used in medicine,
dentistry, health care, biology and other fields as a
means to draw conclusions about the nature of
populations.

Hypothesis testing
• The purpose of hypothesis testing is to provide information that
helps in making decisions.
• The administrative decision usually depends on a test
between two hypotheses.
• Decisions are based on the outcome.
• Hypothesis testing for single sample means
(z test and t test)

• Hypothesis testing is a procedure to support one
of two proposed hypotheses.
1. H0 is the null hypothesis, or the hypothesis of no
difference.
2. H1 (also known as HA) is the alternative hypothesis.
It is what we will believe is true if we reject the
null hypothesis.

Testing Hypotheses:
Using The Five Step Model
1. Make assumptions and meet test
requirements (Z test or t test).
2. State the null and alternative hypotheses.
3. Select the sampling distribution and
establish the critical region.
4. Compute the test statistic.
5. Make a decision and interpret results.
Single Group Z and T-Tests

• The basic goal of these simple tests is to show
that the distribution of the data under
examination was not produced by chance and
that there is some systematic pattern in it.
• The main point is to test whether the mean of a
sample is reflective of the population.
Statistical test on the difference
between a sample mean and a
known population mean

The Z-Test (of a sample mean
against a population mean)
Hypothesis Testing
One Sample Z Test
Does your sample data reflect the population from which
it is drawn?
Is the mean of the sample reflective of the population?

The formula for the value of Z (z-test) is:

Z = (x̄ − μ) / (σ / √n)

x̄ = mean of sample, μ = mean of population
σ = standard deviation of population
n = no. of observations
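
As a minimal sketch, the Z formula can be written as a small Python function. The name z_statistic, its parameter names, and the input values are assumptions made for illustration only; they do not come from the lecture.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Return Z = (sample mean - population mean) / (sigma / sqrt(n))."""
    standard_error = pop_sd / math.sqrt(n)  # SE of the sample mean
    return (sample_mean - pop_mean) / standard_error

# Illustrative values only (not from the lecture):
# sample mean 52, population mean 50, sigma 10, n = 100
z = z_statistic(52, 50, 10, 100)
print(f"Z = {z:.2f}")
```

Dividing by the standard error rather than by σ itself is what makes Z sensitive to sample size: the same difference in means gives a larger Z when n is larger.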
Z- test: Question one
• Mean level of prothrombin in the normal
population (µ) is known to be 20 mg/100 ml of
plasma
• Mean level of prothrombin in a sample of
patients (x̄) having vitamin K deficiency is
18.5 mg/100 ml
• Sample size (n) = 49
• Standard deviation (S) = 3.5 mg/100 ml
• Is this difference between the above two
means real or due to chance?
Z- test: Question one solution

The null hypothesis (H0) is a hypothesis which the
researcher tries to disprove, reject or nullify. The 'null'
often refers to the common view of something, while
the alternative hypothesis is what the researcher really
thinks is the cause of a phenomenon.
Why do we use a large sample?
When we perform a statistical test we are trying to
judge the validity of the null hypothesis. We are doing
so with an incomplete view of the population. Our
sample is our window into the population. The larger
the sample size, the bigger our window. However,
without a full view of the population there is always the
chance that our sample will lead us to the wrong
conclusion.
The Null and Alternative Hypotheses
1. Null hypothesis (H0)
• The difference is caused by random chance.
• The H0 always states there is “no significant difference.” In this
case, we mean that there is no significant difference between the
population mean and the sample mean.
2. Alternative hypothesis (H1)
• “The difference is real”.
• The H1 always contradicts the H0.

- One (and only one) of these explanations must be true. Which one?
Z- test: Question one
Solution (cont.)
If the null hypothesis is true, the difference between x̄ and µ should be
less than 2 SE in absolute value, i.e.

|x̄ − µ| < 2 SE
or
Z = |x̄ − µ| / SE < 2, where SE = σ/√n

So if the sample came from the same or a similar population, Z
should be less than 2 and the null hypothesis (Ho) would
not be rejected, i.e. the difference is due to chance.
Z- test: Question one
Solution (cont.)
If Z is equal to or more than 2, the assumption of no
difference (Ho) is not correct and should be rejected,
i.e. the difference is real and not due to chance.
Solution
Z = (x̄ − µ) / (S/√n) = (18.5 − 20) / (3.5/√49) = −1.5 / 0.5 = −3, so |Z| = 3
3- Conclusion: This is more than 2, so we reject the
Ho; the difference is real and not due to chance. The
difference is statistically significant at P < 0.01.
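
The arithmetic for Question one can be checked with a short Python sketch. The variable names are assumptions chosen for illustration, and the two-sided P value is computed from the standard normal distribution in Python's statistics module rather than read from a table.

```python
import math
from statistics import NormalDist

# Values from Question one (prothrombin, mg/100 ml)
x_bar, mu, s, n = 18.5, 20.0, 3.5, 49

se = s / math.sqrt(n)          # standard error of the mean
z = (x_bar - mu) / se          # test statistic

# Two-sided P value: chance of |Z| at least this large if Ho is true
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = abs(z) >= 2        # the rule of thumb used in this lecture
print(f"SE = {se:.2f}, Z = {z:.2f}, two-sided P = {p_two_sided:.4f}")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```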
Example: In the population, the average IQ is 100
with a standard deviation of 15. A team of scientists
wants to test a new medication to see if it has either
a positive or negative effect on intelligence,
or no effect at all. A sample of 30 participants who
have taken the medication has a mean of 140.
Did the medication affect intelligence?
Steps for One-Sample z-Test

1. Null hypothesis (H0): x̄ = µ
2. Alternative hypothesis (H1): x̄ ≠ µ
3. Calculate Test Statistic
4. State Results
5. State Conclusion
Define Null and Alternative Hypotheses
1- Null hypothesis (H0): x̄ = µ

2- Alternative hypothesis (H1): x̄ ≠ µ

3- Calculate test statistic:

Z = (x̄ − µ) / (σ/√n) = (140 − 100) / (15/√30) = 40 / 2.74 ≈ 14.60
4- Result: Reject the null hypothesis.
5- Conclusion: z > 2
Medication significantly affected intelligence,
z = 14.60, p < 0.05.
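
A minimal sketch of the IQ example, following the five steps in order. The variable names and the cut-off of 2 (this lecture's rule of thumb for a two-sided test at P = 0.05) are the only assumptions; the numbers are those given in the example.

```python
import math

# Step 1: assumptions - sigma is known, so a z-test is used
pop_mean, pop_sd = 100, 15       # population IQ: mean and SD
sample_mean, n = 140, 30         # sample that took the medication

# Step 2: Ho says the sample mean equals the population mean; H1 says it does not
# Step 3: critical region - reject Ho when |z| is 2 or more
cutoff = 2

# Step 4: compute the test statistic
se = pop_sd / math.sqrt(n)
z = (sample_mean - pop_mean) / se

# Step 5: decide and interpret
reject_h0 = abs(z) >= cutoff
print(f"SE = {se:.2f}, z = {z:.2f}, reject Ho: {reject_h0}")
```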
Z- Table
P    0.05   0.02   0.01   0.001
Z    1.96   2.33   2.58   3.29

Z below the tabulated value: not significant (N.S)
Z at or above the tabulated value: significant (reject Ho)
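
The critical Z value for any two-sided P level can be computed from the standard normal distribution instead of being looked up. The sketch below uses Python's statistics.NormalDist; the names p_levels and critical_z are assumptions for illustration, and the computed values may differ slightly from rounded table entries.

```python
from statistics import NormalDist

p_levels = [0.05, 0.02, 0.01, 0.001]   # two-sided significance levels

# In a two-sided test, P/2 lies in each tail of the standard normal curve,
# so the cut-off is the point with 1 - P/2 of the area below it.
critical_z = {p: NormalDist().inv_cdf(1 - p / 2) for p in p_levels}

for p, z in critical_z.items():
    print(f"P = {p:<5}  Z = {z:.2f}")
```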

The smaller the P value, the more statistical
evidence exists against Ho and in support of the
alternative hypothesis (H1).
• The P value is the probability of obtaining a
difference at least as large as the one observed
if only chance were operating (i.e. if Ho were true).
• The significance level (α) is the probability of
rejecting Ho, i.e. concluding there is a statistically
significant difference, while in fact none exists
(type I or α-error).
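
The type I error can be illustrated by simulation. In the sketch below Ho is true by construction, yet a two-sided test with a cut-off of 1.96 still rejects it in roughly 5% of samples. The population values, the number of trials, and the random seed are assumptions chosen only for illustration.

```python
import math
import random

random.seed(0)                   # fixed seed so the run is repeatable
mu, sigma, n = 100, 15, 30       # illustrative population and sample size
trials = 5000
critical_z = 1.96                # two-sided cut-off for alpha = 0.05

false_rejections = 0
for _ in range(trials):
    # Draw a sample from a population where Ho is true (mean really is mu)
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    if abs(z) >= critical_z:
        false_rejections += 1    # Ho rejected although it is true

type_1_rate = false_rejections / trials
print(f"Share of samples rejecting a true Ho: {type_1_rate:.3f}")
```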
Z- test:
Summary

• Z-test (or the normal test) examines the
difference between a sample mean and a
known population mean.
• It is used when the sample size is large (n ≥
30) and/or when the standard deviation of
the population (σ) is available.
T-Test
(Student t-test)
Student’s T-Test
• Problem: We may not know the
variance of some populations, which means
we cannot do a Z-test. In this case, we use a
t-test, Student’s t to be specific, for use with
a single group or sample of data.
• Again, this is when we are not looking at
different groups but at a sample of data as a
whole. We will next examine differences in
groups.
The person who invents a test
often names it. William Sealy Gosset, who
invented Student’s t test, was
prevented by his employer from
publishing under his own name, so he
published it under the pseudonym “Student”.

William Sealy Gosset
z Statistic Versus t Statistic
z Statistic
• When you know the mean and standard deviation of a population.
• Calculate the standard error of the sample mean.
t Statistic
• When you do not know the standard deviation of the population.
• Calculate the estimate of the standard error of the sample mean.
You can think of the t statistic as an "estimated z-score."
• The estimation comes from the fact that we are using the sample
variance to estimate the unknown population variance.

• The value of degrees of freedom, df = n - 1, determines how well
the distribution of t approximates a normal distribution and how
well the t statistic represents a z-score.
t Distributions
• t distributions are used when we know the mean of the
population but not the SD of the population from
which our sample is drawn.
• t distributions are useful when we have small samples.
• The t distribution is flatter and has fatter tails than the
normal distribution.
• As sample size approaches 30, t looks like the z (normal)
distribution.
• Same three assumptions:
– Dependent variable is scale
– Random selection
– Normal distribution
Distribution of the t-statistic: the shape of the t-distribution
depends on the number of degrees of freedom (df) associated
with the t test.
[Figure: t-distributions compared with the standard normal (Z) curve]
One sample t-test
• One sample t-test is also used to
examine the difference between a
sample mean (x̄) and a known
population mean (µ)
• When?
When the sample size is small (n <
30) and σ is unknown, we use s
instead,
and the Z-test is replaced by the
t-test
One sample t-test

• T-test needs the t-distribution table in
order to know the level of significance
(the p-value)
• To use the t-distribution table, we need to
calculate the “degrees of freedom”, which
equals n − 1
Estimating Population from a
Sample
• Main difference between t Tests and z score:
– use the standard deviation of the sample to estimate the
standard deviation of the population.
• How? Subtract 1 from sample size! (called degrees of
freedom)
Standard deviation of a population: σ = √( Σ(X − µ)² / N )
Standard deviation of a sample: SD = s = √( Σ(X − M)² / (N − 1) )
The sample standard deviation estimates the population standard deviation.
• Use degrees of freedom (df) in the t distribution chart
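
The effect of dividing by N − 1 instead of N can be seen in a short sketch. The data values are illustrative only (not from the lecture) and the variable names are assumptions; the hand-written formulas are checked against Python's statistics.pstdev and statistics.stdev.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]        # illustrative values, not from the lecture

n = len(data)
m = sum(data) / n                       # sample mean M
ss = sum((x - m) ** 2 for x in data)    # sum of squared deviations

sd_divide_by_n = math.sqrt(ss / n)      # population formula: divides by N
sd_sample = math.sqrt(ss / (n - 1))     # sample formula: divides by N - 1 (df)

# The standard library implements the same two formulas
assert math.isclose(sd_divide_by_n, statistics.pstdev(data))
assert math.isclose(sd_sample, statistics.stdev(data))

print(f"divide by N: {sd_divide_by_n:.3f}")
print(f"divide by N - 1: {sd_sample:.3f} (df = {n - 1})")
```

Dividing by N − 1 always gives the larger value, which compensates for a sample tending to understate the spread of its population.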
One sample t-test
Question one
In eight patients with pneumonia, treated with
penicillin G, the numbers of days required to
bring body temperature down to normal were:

1, 4, 5, 7, 3, 2, 5, 6
Can we say that the mean number of days
required to bring the temperature down to
normal, for patients with pneumonia treated
with penicillin G is 2 days?
One sample t-test: Solution of question one
n = 8, x̄ = 33/8 = 4.125, S = 2.03, µ = 2

1- Ho: x̄ = µ = 2, the difference is due to chance, and not real

2- Alternative hypothesis (H1): x̄ ≠ µ

3- Calculate test statistic:

t = (x̄ − µ) / (S/√n) = (4.125 − 2) / (2.03/√8) = 2.125 / 0.718 ≈ 2.96
One sample t-test: Solution of
question one (cont)
4- Conclusion:
Degrees of freedom = n − 1 = 8 − 1 = 7
From the t-table, a t-value of 2.365 is required to reject the
Ho at P = 0.05 (df = 7)
The t-value obtained here is more than this value, so
we reject the null hypothesis
The mean number of days required is significantly more than 2 days.
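
The whole solution can be reproduced with a short Python sketch. The variable names are assumptions for illustration, and the critical value 2.365 is taken from the solution above rather than computed, since the standard library has no t distribution.

```python
import math
import statistics

days = [1, 4, 5, 7, 3, 2, 5, 6]    # data from Question one
mu = 2                              # hypothesised population mean (days)

n = len(days)
x_bar = statistics.mean(days)       # sample mean
s = statistics.stdev(days)          # sample SD (divides by n - 1)
se = s / math.sqrt(n)               # estimated standard error
t = (x_bar - mu) / se               # test statistic
df = n - 1                          # degrees of freedom

t_critical = 2.365                  # from the t-table: df = 7, P = 0.05
reject_h0 = abs(t) > t_critical

print(f"mean = {x_bar:.3f}, S = {s:.2f}, SE = {se:.3f}")
print(f"t = {t:.3f}, df = {df}, reject Ho: {reject_h0}")
```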
t Distribution
Table
