Chapter 3 Econometrics
Multiple Regression
A statistical model that utilizes two or more quantitative and
qualitative explanatory variables (x1,..., xp) to predict a
quantitative dependent variable Y.
Caution: the model should include at least two quantitative
explanatory variables.
Multiple regression simultaneously considers the influence of
multiple explanatory variables on a response variable Y:
Y = β0 + β1x1 + β2x2 + ... + βpxp + ε
Simple vs. Multiple
Simple regression:
• β represents the unit change in Y per unit change in X.
• Does not take into account any variable other than the
single independent variable.
Multiple regression:
• βi represents the unit change in Y per unit change in Xi.
• Takes into account the effect of the other independent variables.
Multiple Regression Models
• Linear models: linear, dummy variable, interaction
• Non-linear models: polynomial, square root, log, reciprocal,
exponential
Building the Multiple Linear Regression Model
• The coefficients of the multiple regression model are
estimated using sample data with k independent
variables:
Y-hat = b0 + b1X1 + b2X2 + ... + bkXk
where Y-hat is the estimated (or predicted) value of Y, b0 is the
estimated intercept, and b1, ..., bk are the estimated slope coefficients.
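As a rough illustration of how such coefficients can be obtained from sample data, the sketch below fits a model with k = 2 independent variables by least squares using NumPy. The data are hypothetical and generated without noise from Y = 2 + 3*X1 - 1*X2, so the estimates should match those values up to floating-point error; none of the numbers come from the chapter.

```python
import numpy as np

# Hypothetical sample: n = 6 observations, k = 2 independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

# Y is built without noise from Y = 2 + 3*X1 - 1*X2 (an assumption made
# only so that the fitted coefficients are easy to check).
Y = 2.0 + 3.0 * X1 - 1.0 * X2

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares estimates b = (b0, b1, b2).
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Predicted values: Y-hat = b0 + b1*X1 + b2*X2.
Y_hat = X @ b

print(np.round(b, 3))
```

With real data the fit is not exact, and the residuals Y - Y-hat are what the standard errors and the coefficient of determination are built from.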
Estimation of parameters and standard errors
The coefficient of determination and test of model adequacy
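Test of the Significance of Individual Variables
• Use t-tests of the individual variable slopes.
• Each test shows whether there is a linear relationship between the
variable Xi and Y.
Hypotheses:
• H0: βi = 0 (no linear relationship)
• H1: βi ≠ 0 (a linear relationship exists between Xi and Y)
• Test statistic: t* = (bi - 0) / S_bi, with n - k - 1 degrees of freedom,
where S_bi is the standard error of bi.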
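The test can be sketched in a few lines of Python. All inputs here are hypothetical (slope, standard error, n and k are made up for illustration); the critical value 2.120 is the two-sided 5% value of the t distribution with 16 degrees of freedom, taken from a standard t table.

```python
def t_statistic(b, se, hypothesized=0.0):
    """Return t* = (b - hypothesized) / S_b for one slope coefficient."""
    return (b - hypothesized) / se

# Hypothetical estimates for one slope (not taken from the chapter).
b_i = 1.5    # estimated slope
s_bi = 0.5   # standard error of the slope

# Hypothetical sample size and number of independent variables.
n, k = 20, 3
df = n - k - 1          # degrees of freedom for the t-test

t_star = t_statistic(b_i, s_bi)

# Two-sided critical value at the 5% level for df = 16 (from a t table).
t_crit = 2.120

# Reject H0: beta_i = 0 when |t*| exceeds the critical value.
reject = abs(t_star) > t_crit

print(t_star, df, reject)
```

Statistical software usually reports the p-value directly, in which case H0 is rejected when the p-value is below the chosen level of significance.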
Given the assumptions and data on Y and a set of IVs (X1, ...,
XK), the following steps are suggested for conducting
multiple linear regression:
1. Select variables that you believe are linearly related to the
dependent variable.
2. Use statistical software to generate the coefficients and the
statistics used to assess the model.
3. Diagnose violations of the required conditions/assumptions.
If there are problems, attempt to remedy them.
4. Assess the model’s fit.
5. Test and interpret the coefficients.
6. Use the model to predict a value of the DV.
Regression Output Interpretation
Example
In a study of consumer demand (Qd), multiple regression
analysis is done to examine the relationship between quantity
demanded and four potential predictors.
The four independent variables are price, income, tax, and the price
of related goods.
The output for this example is interpreted as follows:
The multiple correlation coefficient is 0.971.
R is the correlation between the observed values of Y and the
values of Y predicted by the model.
[Regression output table; only part of the header survives:
Source | SS | df | MS, Number of obs = 16]
The R2 is 0.943 (the square of R = 0.971).
This means that the IVs explain 94.3% of the variation in the DV.
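A quick arithmetic check, using only the multiple correlation coefficient reported for this example, shows how R and R2 relate:

```python
# R is the multiple correlation coefficient reported in the example output.
R = 0.971

# The coefficient of determination is the square of R.
R_squared = R ** 2

print(round(R_squared, 3))   # 0.943
print(f"{R_squared:.1%}")    # 94.3%
```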
Finally, the above table will help us to determine whether quantity
demanded and explanatory variables are significantly related,
and the direction and strength of their relationship.
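The prediction equation is written as:
Qd-hat = b0 + b1(price) + b2(income) + b3(tax) + b4(price of related goods)
where the estimates b0, ..., b4 are read from the coefficient column of the output.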
The Constant is the predicted value of quantity demanded when all of the
independent variables have a value of zero.
The b coefficient associated with price (-0.162) is negative,
indicating an inverse relationship in which higher price of the
product is associated with lower quantity demanded.
For the independent variable price, the probability of the t
statistic (0.000) for the b coefficient is less than the level of
significance of 0.05.
We reject the null hypothesis that the slope associated with
price is equal to zero and conclude that there is a statistically
significant relationship between price and quantity demanded.
A one-unit increase (decrease) in the price of the product is associated
with a 0.162-unit decrease (increase) in quantity demanded, ceteris paribus.
The income variable is found to be positively but
insignificantly (even at the 10% level of significance) related to
quantity demanded. There is therefore no statistically significant
evidence of a relationship between income and quantity demanded of this good.
The tax coefficient is statistically significant (at the 10% level of
significance) and carries a positive sign.
The slope of tax is 6.27. Read literally, this means that for every one-unit
increase (decrease) in the tax on a commodity, quantity demanded
will increase (decrease) by 6.27 units, ceteris paribus. Of
course, this is not a valid conclusion, since a higher tax would be
expected to reduce quantity demanded.
Dummy Independent Variables
Describing Qualitative Information
• In regression analysis the dependent variable can be
influenced by variables that are essentially qualitative in
nature, such as sex, race, color, religion, nationality, geographical
region, political upheavals, and party affiliation.
• One way we could “quantify” such attributes is by
constructing artificial variables that take on values of 1 or 0,
1 indicating the presence (or possession) of that attribute and 0
indicating the absence of that attribute.
• Variables that assume such 0 and 1 values are called dummy,
indicator, binary, categorical, or dichotomous variables.
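Constructing such a variable is a simple recoding step. The sketch below uses a small hypothetical list of observations and treats "male" as the attribute of interest:

```python
# Hypothetical observations of a qualitative attribute (sex).
sex = ["male", "female", "female", "male"]

# D = 1 if the attribute ("male") is present, 0 if it is absent.
D = [1 if s == "male" else 0 for s in sex]

print(D)  # [1, 0, 0, 1]
```

The resulting 0/1 column can then enter the regression like any other explanatory variable.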
Example 1: Yi = α + βDi + ui
where Yi = annual salary of a college professor
Di = 1 if male college professor
   = 0 otherwise
The “less than high school education” category is treated as the base
category.
Therefore, the intercept of the model reflects the intercept for this
category.
• The mean health care expenditure functions for the three
levels of education, namely, less than high school, high
school, and college, are:
E(Yi | D2 = 0, D3 = 0, Xi) = α1 + βXi
E(Yi | D2 = 1, D3 = 0, Xi) = (α1 + α2) + βXi
E(Yi | D2 = 0, D3 = 1, Xi) = (α1 + α3) + βXi
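To see how the two dummies shift only the intercept, the sketch below evaluates the three mean functions, assuming the underlying model is Y = α1 + α2*D2 + α3*D3 + β*X + u. The coefficient values and the value of X are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mean_expenditure(x, d2, d3, a1, a2, a3, beta):
    """E(Y | D2, D3, X) = a1 + a2*D2 + a3*D3 + beta*X."""
    return a1 + a2 * d2 + a3 * d3 + beta * x

# Hypothetical coefficient values (not estimates from the chapter).
a1, a2, a3, beta = 100.0, 20.0, 50.0, 0.5
x = 1000.0  # hypothetical value of X

# Base category: less than high school (D2 = 0, D3 = 0).
less_hs = mean_expenditure(x, 0, 0, a1, a2, a3, beta)
# High school (D2 = 1, D3 = 0): intercept shifts by a2.
high_school = mean_expenditure(x, 1, 0, a1, a2, a3, beta)
# College (D2 = 0, D3 = 1): intercept shifts by a3.
college = mean_expenditure(x, 0, 1, a1, a2, a3, beta)

print(less_hs, high_school, college)  # 600.0 620.0 650.0
```

The differences from the base category (20 and 50 here) equal the dummy coefficients, while the slope on X is the same for all three groups.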
Log-Level
Level-Log:
It arises less often in practice.
Y = β0 + β1 log(x) + u
Y-hat = 110 + 12 log(x): a 1% increase in x changes Y-hat by
approximately 12/100 = 0.12 units.
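The 0.12 figure is an approximation. Assuming log denotes the natural logarithm, the short check below compares the rule of thumb (slope divided by 100) with the exact change in Y-hat when x rises by 1%:

```python
import math

b1 = 12.0  # slope on log(x) in Y-hat = 110 + 12 log(x)

# Rule of thumb: a 1% increase in x changes Y-hat by b1 / 100 units.
approx_change = b1 / 100

# Exact change: b1 * [log(1.01 * x) - log(x)] = b1 * log(1.01),
# which does not depend on the starting value of x.
exact_change = b1 * math.log(1.01)

print(approx_change)           # 0.12
print(round(exact_change, 4))  # 0.1194
```

The approximation is close for small percentage changes and becomes less accurate as the percentage change in x grows.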