Econometrics Unit 3 Tedy Best
Chapter 3
The Simple Linear Regression Model
• Learning Objectives
• After completing this topic, students will be able to:
• Predict values of the dependent variable for given values of the explanatory variable;
• Evaluate and mitigate the effects of departures from the classical statistical assumptions on linear regression estimates.
• Keywords: simple linear regression model, regression parameters, regression line, residuals, principle of least squares, least squares estimates, least squares line, fitted values, predicted values, coefficient of determination, least squares estimators, hypotheses on regression parameters, confidence intervals for regression.
3.1. Simple Linear Regression
• Our objective is to study the relationship between
two variables X and Y.
• One way is by means of regression.
• Regression analysis is the process of estimating a
functional relationship between X and Y. A
regression equation is often used to predict a value
of Y for a given value of X.
• Another way to study the relationship between two variables is correlation, which measures the direction and the strength of the linear relationship.
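The two approaches can be sketched in a few lines of code. The data below are hypothetical and the variable names are illustrative only:

```python
import numpy as np

# Hypothetical sample: values of X and Y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Regression: estimate the functional relationship Y = b0 + b1*X
# (np.polyfit returns coefficients from highest degree down)
b1, b0 = np.polyfit(x, y, 1)

# Correlation: direction and strength of the linear relationship
r = np.corrcoef(x, y)[0, 1]

print(f"regression line: Y = {b0:.3f} + {b1:.3f} X")
print(f"correlation coefficient r = {r:.3f}")
```

The regression line can then be used to predict Y for a given X, while r summarizes how tightly the points cluster around a line.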
Examples
• Multiple Linear Regression:
Yi = β0 + β1X1i + β2X2i + εi
• Polynomial Linear Regression:
Yi = β0 + β1Xi + β2Xi² + εi
• Linear Regression:
log10(Yi) = β0 + β1Xi + β2 exp(Xi) + εi
• Nonlinear Regression:
Yi = β0 / (1 + β1 exp(β2Xi)) + εi
"Linear" or "nonlinear" refers to linearity in the parameters.
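All but the last model are linear in the parameters, so each can be estimated by ordinary least squares with a suitable design matrix. A minimal sketch for the polynomial case, using simulated data and hypothetical coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.5, 5.0, 50)

# Simulate a polynomial model that is linear in its parameters:
# Y = 1 + 2*X + 0.5*X^2 + error   (coefficients chosen for illustration)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.1, size=x.size)

# Because the model is linear in beta0, beta1, beta2, OLS applies:
# build the design matrix [1, X, X^2] and solve by least squares.
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated coefficients:", beta)
```

The nonlinear model in the last bullet cannot be written this way, so it requires an iterative nonlinear estimator instead.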
Cont…
Now let us see simple linear regression under this chapter.
Let Yi = α0 + α1Xi + ui … (1)
The simple linear regression model
• We consider modelling the relationship between the dependent variable and one independent variable.
• The linear model
• Consider a simple linear regression model
Yi = β0 + β1Xi + εi
Cont…
The variables involved in a regression model fall into three groups: observable variables, unobservable variables and unknown parameters.
1. Observable Variables –
These are variables whose values are collected from the field through questionnaires, interviews and other data collection mechanisms.
yi = the ith value of the dependent variable.
xi = the ith value of the independent variable.
Cont…
2. Unobservable variables –
ui = the disturbance (error) term, whose values are not observed.
Cont….
3. Unknown Parameters (or regression coefficients) –
α0 (the intercept) and α1 (the slope), whose values are estimated from the data.
Cont….
Why is the disturbance term ε included?
Cont….
Contributors to ε
– Measurement errors
– Exclusion of important variables
– Simultaneity
SIMPLE REGRESSION MODEL
[Figure: true regression line Y = β1 + β2X, with points Q1–Q4 on the line at X1, X2, X3, X4]
If the relationship were an exact one, the observations would lie on a straight line and we would have no trouble obtaining accurate estimates of β1 and β2.
• The independent variables are viewed as controlled by the experimenter, and the variance of y is
Var(y) = σ²
• Sometimes X can also be a random variable.
• In such a case, instead of the sample mean and sample variance of y, we consider the conditional mean of y given X = x (that is, the expected value of y at a given level of x), and the conditional variance
Var(y | x) = σ²
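A quick simulation can illustrate the conditional mean and variance. The parameter values below are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0          # assumed error standard deviation
b0, b1 = 1.0, 0.5    # hypothetical regression parameters

# Draw many responses at a single fixed level x = 3:
x = 3.0
y = b0 + b1 * x + rng.normal(0.0, sigma, size=200_000)

# The conditional mean is b0 + b1*x and the conditional variance is sigma^2.
print("E(y|x=3) ≈", y.mean())   # close to b0 + b1*3 = 2.5
print("Var(y|x=3) ≈", y.var())  # close to sigma^2 = 4
```

Holding x fixed, the only source of variation in y is the disturbance, which is why the conditional variance equals σ².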
• Regression is estimation or prediction of the average value of
a dependent variable on the basis of the fixed values of other
variables.
[Figure: actual observations P1–P4 scattered around the line Y = β1 + β2X; Q1–Q4 are the corresponding points on the line at X1, X2, X3, X4]
In practice, most economic relationships are not exact and the actual values of Y are different from those corresponding to the straight line.
SIMPLE REGRESSION MODEL
[Figure: the first observation P1 decomposed into the point Q1 on the line Y = β1 + β2X and the disturbance u1; u = disturbance term]
Each value of Y thus has a non-random component, β1 + β2X, and a random component, u. The first observation has been decomposed into these two components.
Simple Linear Regression Model (continued) DCOVA
Yi = β0 + β1Xi + εi
[Figure: observed value of Y for Xi; predicted value of Y for Xi; random error εi for this Xi value; slope = β1; intercept = β0]
Copyright © 2016 Pearson Education, Ltd. Chapter 12, Slide 30
Simple Linear Regression Equation (Prediction Line) DCOVA
Ŷi = b0 + b1Xi
where Ŷi is the estimated (or predicted) Y value for observation i, b0 is the estimate of the regression intercept, b1 is the estimate of the regression slope, and Xi is the value of X for observation i.
• Various methods of estimation can be used to determine
the estimates of the parameters.
3.2. Ordinary Least Square Method (OLS) and Classical
Assumptions
• There are two major ways of estimating regression
functions.
• These are the ordinary least square method and maximum
likelihood (MLH) method.
• Both methods are broadly similar in how they are applied to estimation.
• The ordinary least square method is the easiest and the
most commonly used method as opposed to the maximum
likelihood (MLH) method which is limited by its
assumptions.
Cont…
• For instance, the MLH method is valid only for large samples, whereas the OLS method can also be applied to smaller samples.
• Owing to this merit, our discussion mainly focuses on ordinary least squares (OLS).
• The ordinary least squares (OLS) method of estimating the parameters of the regression function is about finding the values of the parameters (α0, α1) of the simple linear regression function for which the sum of squared errors is smallest.
Cont…
Estimation
The sample regression line gives the fitted values Ŷi; the residual for observation i is
êi = Yi − Ŷi
Ordinary Least Square (OLS)
OLS is the technique used to estimate the line that minimizes the sum of squared errors.
Cont…
The ordinary least squares method chooses estimates of the parameters (αi) by minimizing the sum of squared differences between the actual yi's and the estimated ŷi's.
i, Classical Assumptions of OLS
1. The error terms or disturbance terms Ui are not correlated: the value the error term assumes in one period does not depend on the value it assumed in any other period.
2. The error terms have zero mean:
E(Ui) = ΣUi / n = 0
Multiplying both sides by the sample size n, we obtain ΣUi = 0.
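In a fitted OLS model the residuals sum to (numerically) zero, mirroring the population assumption E(Ui) = 0 and the sample identity Σûi = 0. A small sketch with hypothetical data:

```python
import numpy as np

# Hypothetical sample
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([3.1, 5.0, 6.8, 9.2, 10.9])

# OLS estimates of intercept and slope
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean())**2).sum()
b0 = y.mean() - b1 * x.mean()

# The fitted residuals sum to (numerically) zero because the
# normal equations force sum(u_hat_i) = 0 when an intercept is included.
residuals = y - (b0 + b1 * x)
print("sum of residuals:", residuals.sum())
```

Note that this is an exact algebraic property of the fit, not a test of the population assumption itself.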
Classical Assumptions of OLS
• If the mean of the error term is not zero, the regression function shifts upward (if the mean of the error or residual term is positive) or downward (if it is negative).
3. The disturbance terms have constant variance in each period.
Classical Assumptions of OLS
7. Normality assumption
• We can now use the above assumptions to derive the following basic
concepts.
A. The dependent variable Yi is normally distributed
Con’t…
Normal distribution – for any fixed explanatory value x, the response y has a normal distribution.
Generally, the observations are normally distributed if their graph shows a bell-shaped (normal) curve, with zero-mean errors.
• A violation of this assumption occurs when there are outliers in the data set, and leads to wider confidence intervals and incorrect hypothesis tests.
Con’t…
Homoskedasticity (or constant variance) – the variance of the dependent variable is the same across independent observations, for any values of the explanatory variables.
There exists a constant variance for the given regression model.
Mathematically, it means the variance of the response variable does not vary across observations.
It is denoted Var(y) = σ², where y is the dependent or response variable.
Con’t…
B,
Existence – for any fixed value of the independent variable x, the dependent variable y is a random variable with a certain probability distribution, having finite mean and variance. A violation of this assumption may indicate that there is no relationship between the variables involved.
Continuity – the dependent variable is a continuous random variable, whereas the values of the independent variable are fixed; they can be continuous or discrete.
Caution must be taken: if the dependent variable is not continuous, then other types of regression models such as probit, logit, tobit, etc. should be used accordingly.
Cont…
ii, Deriving the ordinary least squares estimates
Let Yi = α0 + α1Xi + ui … (3)
Con’t…
Summing equation (3) over all observations:
ΣYi = Σ(α̂0 + α̂1Xi + ûi) = nα̂0 + α̂1ΣXi + Σûi
Dividing by n:
ΣYi/n = α̂0 + α̂1(ΣXi/n) + Σûi/n
Con’t…
The second normal equation
Now returning to equation (1) and multiplying both sides by Xi gives us
XiYi = α̂0Xi + α̂1Xi² + ûiXi
Summing over all observations:
ΣXiYi = α̂0ΣXi + α̂1ΣXi² + ΣûiXi
Cont…
If we divide it by n, we obtain
Σ(XiYi)/n = α̂0X̄ + α̂1(ΣXi²)/n + Σ(ûiXi)/n
Since Σûi = 0, the first normal equation gives
Ȳ = α̂0 + α̂1X̄
Solving the two normal equations:
Con’t…
Then the two formulas are given as:
α̂0 = Ȳ − α̂1X̄
α̂1 = (ΣXY − nX̄Ȳ) / (ΣX² − nX̄²)
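The two formulas can be applied directly. The sample below is hypothetical, and the result is cross-checked against numpy's own least-squares fit:

```python
import numpy as np

# Hypothetical sample
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.5, 5.5, 8.0, 10.0])
n = x.size

# alpha1_hat = (sum(XY) - n*Xbar*Ybar) / (sum(X^2) - n*Xbar^2)
a1 = ((x * y).sum() - n * x.mean() * y.mean()) / ((x**2).sum() - n * x.mean()**2)
# alpha0_hat = Ybar - alpha1_hat * Xbar
a0 = y.mean() - a1 * x.mean()

print(f"alpha0_hat = {a0:.3f}, alpha1_hat = {a1:.3f}")

# Cross-check against numpy's least-squares fit of a degree-1 polynomial
slope, intercept = np.polyfit(x, y, 1)
```

Both routes solve the same normal equations, so the estimates agree up to floating-point error.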
Con’t…
The OLS method selects the estimates α̂0, α̂1, α̂2, … that minimize the sum of squared residuals.
• Example 2.4: Given the following sample data of three pairs of
‘Y’ (dependent variable) and ‘X’ (independent variable), find
a simple linear regression function; Y = f(X).
Simple Linear Regression Example DCOVA
ANOVA
             df   SS           MS           F         Significance F
Regression    1   18934.9348   18934.9348   11.0848   0.01039
Residual      8   13665.5652   1708.1957
Total         9   32600.5000

Ŷ = 98.25 + 0.1098(2000) = 317.85
The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850
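The figures in this example can be reproduced directly from the ANOVA sums of squares and the fitted line (prices in $1000s, size in square feet):

```python
# Sums of squares from the ANOVA table above
ss_regression = 18934.9348
ss_residual = 13665.5652
ss_total = 32600.5000

# Share of price variation explained by square footage
r_squared = ss_regression / ss_total
print(f"R^2 = {r_squared:.4f}")

# Prediction from the fitted line for a 2000-square-foot house
b0, b1 = 98.25, 0.1098
size = 2000
price = b0 + b1 * size
print(f"predicted price: {price:.2f} ($1000s) = ${price * 1000:,.0f}")
```

Since the regression sum of squares is about 58% of the total, the model explains a little over half of the variation in price.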
Mean and Variance of Parameter Estimates
3.3. Evaluation of the estimates
Con’t…
Economic a priori criteria: These criteria are
determined by economic theory and refer to the
size and sign of the parameters of economic
relationships.
Statistical criteria (first-order tests): These are
determined by statistical theory and aim at the
evaluation of the statistical reliability of the
estimates of the parameters of the model.
Correlation coefficient test, standard error test, t-test, F-test, and R²-test are some of the most commonly used statistical tests.
Con’t…
Econometric criteria (second-order tests):
Con’t….
They help us establish whether the estimates
have the desirable properties of unbiasedness,
consistency etc.
Measurement of the explanatory power of the regression model
TSS = Σ(Yi − Ȳ)²   (total sum of squares)
ESS = Σ(Ŷi − Ȳ)²   (explained sum of squares)
RSS = Σ(Yi − Ŷi)²   (residual sum of squares)
R² = ESS/TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = 1 − RSS/TSS = 1 − Σ(Yi − Ŷi)² / Σ(Yi − Ȳ)²
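These sums of squares are straightforward to compute. The data below are hypothetical; the check at the end confirms the decomposition TSS = ESS + RSS, which holds for any OLS fit with an intercept:

```python
import numpy as np

# Hypothetical sample
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.5, 5.5, 8.0, 10.0])

# Fit the regression line and compute fitted values
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

tss = ((y - y.mean())**2).sum()      # total sum of squares
ess = ((y_hat - y.mean())**2).sum()  # explained sum of squares
rss = ((y - y_hat)**2).sum()         # residual sum of squares

r2 = ess / tss
print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, RSS = {rss:.3f}")
print(f"R^2 = {r2:.4f}  (equals 1 - RSS/TSS = {1 - rss/tss:.4f})")
```

Because TSS = ESS + RSS, the two expressions for R² in the formula above are equivalent.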
Hypothesis Testing of OLS Estimates
1. The Standard Error Test
• This test first establishes the two hypotheses that are
going to be tested which are commonly known as the null
and alternative hypotheses.
The two hypotheses are given as follows:
• H0: βi=0
• H1: βi≠0
• The standard error test is outlined as follows:
1. Compute the standard deviations of the parameter
estimates
• This is because the standard deviation is the positive square root of the variance.
Hypothesis Testing of OLS Estimates
2. Compare the standard errors of the estimates with the numerical values of the estimates and make a decision.
S_b1 = S_YX / √SSX = S_YX / √Σ(Xi − X̄)²
where:
S_b1 = estimate of the standard error of the slope
S_YX = √(SSE / (n − 2)) = standard error of the estimate
H0: β1 = 0    H1: β1 ≠ 0
Test statistic: tSTAT = 3.329, d.f. = 10 − 2 = 8, α/2 = .025 in each tail
Decision: Reject H0
Confidence interval for the slope: b1 ± tα/2 · S_b1
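The full standard-error and t-test calculation above can be sketched as follows. The data are simulated with assumed parameter values, and the critical value 2.306 is the two-tailed t value for α/2 = .025 with 8 degrees of freedom:

```python
import numpy as np

# Simulated sample of n = 10 observations (true slope 0.8 is an assumption)
rng = np.random.default_rng(1)
x = np.arange(1.0, 11.0)
y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, size=10)
n = x.size

# OLS fit
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean())**2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# Standard error of the estimate: S_YX = sqrt(SSE / (n - 2))
sse = ((y - y_hat)**2).sum()
s_yx = np.sqrt(sse / (n - 2))

# Standard error of the slope: S_b1 = S_YX / sqrt(sum((Xi - Xbar)^2))
s_b1 = s_yx / np.sqrt(((x - x.mean())**2).sum())

# Test H0: beta1 = 0 against H1: beta1 != 0
t_stat = b1 / s_b1
t_crit = 2.306   # two-tailed critical value, alpha/2 = .025, d.f. = 8
print(f"t_stat = {t_stat:.3f}; reject H0: {abs(t_stat) > t_crit}")
```

Rejecting H0 means the slope is statistically different from zero, so X has explanatory power for Y at the 5% level.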