Propensity Score Matching (PSM)
Peeyush Taori
Roberto Vincenzi
Introduction
Consider the model
Y=a+bD+cX+e
We observe
PSM
Estimation based on propensity scores can deal with such situations
Attempts to reduce selection bias due to confounding variables
Create treatment and control groups that are similar in terms of observed covariates
Compare the difference in outcomes between the two groups to estimate the effect of treatment
Matching is (relatively) easy when there are few covariates
With many covariates, exact matching runs into a dimensionality issue
Assumptions
Two key assumptions:
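A1: Conditional Independence
Only observable covariates affect both the outcome variable and the treatment variable.
The propensity score theorem (PST) states that, when A1 holds, the potential outcomes Y1 and Y0 are conditionally independent of treatment if we condition on the propensity score of the individual.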
A2: Common Support
For each value of X, the probability of being treated and the probability of being untreated are both positive.
Ensures an overlap between treated and non-treated individuals in terms of characteristics.
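A simple way to inspect common support is a min-max rule: keep only units whose propensity score falls in the range covered by both groups. The sketch below applies that rule to the propensity scores from the clinic example later in this deck. The function name `common_support` and the choice of the min-max rule are illustrative assumptions; the slides do not prescribe a specific check.

```python
def common_support(treated_scores, control_scores):
    """Return (low, high): the propensity-score range covered by both groups."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high


# Propensity scores from the clinic example (observations 1-4 treated, 5-9 control).
treated = [0.416, 0.735, 0.928, 0.735]
control = [0.752, 0.395, 0.001, 0.026, 0.008]

low, high = common_support(treated, control)

# Units outside [low, high] have no comparable counterpart in the other group.
on_support_treated = [p for p in treated if low <= p <= high]
on_support_control = [p for p in control if low <= p <= high]

print("common support:", (low, high))
print("treated on support:", on_support_treated)
print("control on support:", on_support_control)
```

Under this rule, the treated unit with score .928 and the controls with very low scores fall outside the overlap region, which is one way to see where comparisons rest on thin evidence.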
PSM Implementation
Run a logistic regression of the treatment variable on the confounding variables, and obtain propensity scores.
Match observations in the treatment and control groups using a matching algorithm.
Check that propensity scores and covariates are balanced between the two groups.
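The first step above can be sketched without any statistics package. The block below fits a logistic regression by Newton's method in plain Python and returns fitted propensity scores. It uses the covariates from the clinic example in this deck; the treatment indicator is inferred from the matching shown in that example (observations 1-4 treated), and the helper names `solve` and `fit_logit` are hypothetical. In practice one would use a package such as those listed under Software Packages.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def predict(beta, xi):
    """Logistic probability for one design row xi (intercept included)."""
    return 1.0 / (1.0 + math.exp(-sum(bj * xj for bj, xj in zip(beta, xi))))


def fit_logit(rows, treat, iters=25):
    """Maximum-likelihood logit fit via Newton's method; returns coefficients."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(design[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = [predict(beta, xi) for xi in design]
        # Gradient of the log-likelihood: X'(t - p)
        grad = [sum((t - pi) * xi[j] for t, pi, xi in zip(treat, p, design))
                for j in range(k)]
        # Negative Hessian: X' W X with W = diag(p * (1 - p))
        hess = [[sum(pi * (1 - pi) * xi[r] * xi[c] for pi, xi in zip(p, design))
                 for c in range(k)] for r in range(k)]
        step = solve(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
    return beta


# (povrate, pcdocs) for observations 1-9 of the clinic example.
covariates = [(.5, .01), (.6, .02), (.7, .01), (.6, .02), (.6, .01),
              (.5, .02), (.1, .04), (.3, .05), (.2, .04)]
# Assumption: treatment status inferred from the example's matching step.
treatment = [1, 1, 1, 1, 0, 0, 0, 0, 0]

beta = fit_logit(covariates, treatment)
scores = [predict(beta, [1.0] + list(r)) for r in covariates]
print([round(s, 3) for s in scores])
```

The fitted scores should land close to the Pscore column of the example, since those values are consistent with a logit fit on the same two covariates.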
Example
Example dataset of clinics in village (treatment), infant mortality (outcome), and covariates.
Obs  Imrate  Povrate  Pcdocs
1    10      .5       .01
2    15      .6       .02
3    22      .7       .01
4    19      .6       .02
5    25      .6       .01
6    19      .5       .02
7    -       .1       .04
8    -       .3       .05
9    -       .2       .04
Step 1
Run a logit regression of T on povrate and pcdocs
Obs  Imrate  T  Povrate  Pcdocs  Pscore
1    10      1  .5       .01     .416
2    15      1  .6       .02     .735
3    22      1  .7       .01     .928
4    19      1  .6       .02     .735
5    25      0  .6       .01     .752
6    19      0  .5       .02     .395
7    -       0  .1       .04     .001
8    -       0  .3       .05     .026
9    -       0  .2       .04     .008
Step 2
Match observations in treatment to control, based on propensity score
Match 1 with 6
Match 2, 3, 4 with 5
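The matches above follow from picking, for each treated observation, the control with the closest propensity score, allowing a control to be reused. A minimal sketch, assuming observations 1-4 are the treated group and 5-9 the controls (as the matching above implies); `nearest_control` is a hypothetical helper name:

```python
# Propensity scores by observation number, from Step 1.
pscore = {1: 0.416, 2: 0.735, 3: 0.928, 4: 0.735,
          5: 0.752, 6: 0.395, 7: 0.001, 8: 0.026, 9: 0.008}
treated_ids = [1, 2, 3, 4]
control_ids = [5, 6, 7, 8, 9]


def nearest_control(t):
    """Control observation whose propensity score is closest to that of t."""
    return min(control_ids, key=lambda c: abs(pscore[c] - pscore[t]))


# Matching with replacement: the same control may serve several treated units.
matches = {t: nearest_control(t) for t in treated_ids}
print(matches)
```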
Step 3
Compare average outcome between the two groups
(10+15+22+19)/4 - (19+25+25+25)/4 = 16.5 - 23.5 = -7
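The same arithmetic can be written as an average of matched-pair differences, which makes explicit that control 5 is counted three times. A small sketch using the outcomes and matches from this example:

```python
from statistics import mean

# Infant mortality rate by observation number (only matched units are needed).
imrate = {1: 10, 2: 15, 3: 22, 4: 19, 5: 25, 6: 19}
# Treated observation -> matched control, from Step 2.
matches = {1: 6, 2: 5, 3: 5, 4: 5}

# Average of (treated outcome - matched control outcome) over treated units.
effect = mean(imrate[t] - imrate[c] for t, c in matches.items())
print(effect)
```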
Matching Algorithms
Factors to consider
With or without replacement
Closeness of match
1:1 or N:1 match
Weighting of outcome variable
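The first factor can change the matched sample considerably. The sketch below runs a greedy 1:1 nearest-neighbour match on the propensity scores from the clinic example, once with and once without replacement. The function `greedy_match` and the processing order (treated units in observation order) are illustrative assumptions; greedy matching without replacement depends on that order.

```python
def greedy_match(treated, controls, score, replace):
    """Greedy 1:1 nearest-neighbour match on propensity score."""
    available = list(controls)
    matches = {}
    for t in treated:
        if not available:
            break  # no controls left to match
        best = min(available, key=lambda c: abs(score[c] - score[t]))
        matches[t] = best
        if not replace:
            available.remove(best)  # each control can be used only once
    return matches


pscore = {1: 0.416, 2: 0.735, 3: 0.928, 4: 0.735,
          5: 0.752, 6: 0.395, 7: 0.001, 8: 0.026, 9: 0.008}
treated_ids = [1, 2, 3, 4]
control_ids = [5, 6, 7, 8, 9]

with_replacement = greedy_match(treated_ids, control_ids, pscore, replace=True)
without_replacement = greedy_match(treated_ids, control_ids, pscore, replace=False)
print(with_replacement)
print(without_replacement)
```

Without replacement, observations 3 and 4 are forced onto controls with propensity scores near zero, which illustrates the trade-off between reusing good controls and accepting poor matches.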
Mahalanobis Distance
Mahalanobis distance on the covariates is computed for each treated-control pair
Observations are matched to the partner at the smallest distance
Nearest Neighbour
Objective is to minimize the absolute difference between propensity scores.
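For Mahalanobis matching, the distance is computed on the covariates themselves, scaled by their covariance matrix. The sketch below handles the two-covariate case in plain Python, using povrate and pcdocs from the clinic example. The helper names are hypothetical, and estimating the covariance from all nine observations is one possible convention, assumed here for illustration.

```python
import math
from statistics import mean


def covariance_matrix(points):
    """Sample covariance matrix of a list of 2-D points."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    n = len(points) - 1
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return [[sxx, sxy], [sxy, syy]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mahalanobis(u, v, cov_inv):
    """Mahalanobis distance between 2-D points u and v."""
    dx, dy = u[0] - v[0], u[1] - v[1]
    q = (dx * (cov_inv[0][0] * dx + cov_inv[0][1] * dy)
         + dy * (cov_inv[1][0] * dx + cov_inv[1][1] * dy))
    return math.sqrt(max(q, 0.0))


# (povrate, pcdocs) by observation number.
x = {1: (.5, .01), 2: (.6, .02), 3: (.7, .01), 4: (.6, .02), 5: (.6, .01),
     6: (.5, .02), 7: (.1, .04), 8: (.3, .05), 9: (.2, .04)}
treated_ids = [1, 2, 3, 4]
control_ids = [5, 6, 7, 8, 9]

cov_inv = invert_2x2(covariance_matrix(list(x.values())))

# Match each treated unit to the control at the smallest Mahalanobis distance.
matches = {t: min(control_ids, key=lambda c: mahalanobis(x[t], x[c], cov_inv))
           for t in treated_ids}
print(matches)
```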
Caliper Matching
Similar to nearest neighbour matching
Uses a caliper: a maximum allowed difference in propensity scores between matched observations
Many-to-one matches are possible
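A caliper can be sketched as nearest-neighbour matching with a cut-off: a treated unit stays unmatched when even its closest control is too far away. The example uses the propensity scores from the clinic example; the caliper width of 0.05 and the name `caliper_match` are assumptions chosen for illustration.

```python
def caliper_match(treated, controls, score, caliper):
    """Nearest-neighbour match, rejected if the score gap exceeds the caliper."""
    matches = {}
    for t in treated:
        best = min(controls, key=lambda c: abs(score[c] - score[t]))
        within = abs(score[best] - score[t]) <= caliper
        matches[t] = best if within else None  # None marks an unmatched unit
    return matches


pscore = {1: 0.416, 2: 0.735, 3: 0.928, 4: 0.735,
          5: 0.752, 6: 0.395, 7: 0.001, 8: 0.026, 9: 0.008}
treated_ids = [1, 2, 3, 4]
control_ids = [5, 6, 7, 8, 9]

matches = caliper_match(treated_ids, control_ids, pscore, caliper=0.05)
print(matches)
```

With this caliper, observation 3 (score .928) is dropped because its nearest control (score .752) is more than 0.05 away.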
Stratified Matching
Create strata based on propensity scores
Calculate ATE by comparing treated and control outcomes within each stratum and combining the differences across strata
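A minimal sketch of stratification, assuming equal-width strata on the propensity score and weights proportional to stratum size. The data below are hypothetical and exist only to make the arithmetic visible; `stratified_effect` is an illustrative name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical units: (propensity score, treated indicator, outcome).
units = [(0.10, 1, 12), (0.20, 1, 10), (0.30, 0, 14), (0.40, 0, 16),
         (0.55, 1, 18), (0.60, 1, 20), (0.65, 0, 26), (0.70, 1, 22),
         (0.80, 0, 25), (0.90, 0, 27)]


def stratified_effect(units, n_strata):
    """Size-weighted average of within-stratum treated-minus-control differences."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for ps, d, y in units:
        s = min(int(ps * n_strata), n_strata - 1)  # equal-width strata on [0, 1]
        strata[s][d].append(y)
    total, n_used = 0.0, 0
    for groups in strata.values():
        if groups[0] and groups[1]:  # skip strata lacking one of the groups
            n = len(groups[0]) + len(groups[1])
            total += n * (mean(groups[1]) - mean(groups[0]))
            n_used += n
    return total / n_used


effect = stratified_effect(units, n_strata=2)
print(effect)
```

Here the lower stratum gives a difference of -4 (4 units) and the upper stratum -6 (6 units), so the weighted estimate is -5.2.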
Matching vs Regression
Both attempt to solve the same
problem. Why not control for
covariates in regression?
Matching advantages
Does not depend on a specific functional form
Easier to assess how well matching is performing
Removes observations that are not comparable
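One common way to assess how matching is performing is the standardized mean difference (SMD) of each covariate before and after matching. The sketch below computes it for povrate in the clinic example, using the matched controls from that example (6, 5, 5, 5). The pooled-standard-deviation formula used here is one convention among several, and `smd` is a hypothetical helper name.

```python
from statistics import mean, variance


def smd(treated_vals, control_vals):
    """Standardized mean difference using the pooled sample standard deviation."""
    pooled_sd = ((variance(treated_vals) + variance(control_vals)) / 2) ** 0.5
    return (mean(treated_vals) - mean(control_vals)) / pooled_sd


# povrate for treated observations 1-4.
pov_treated = [0.5, 0.6, 0.7, 0.6]
# povrate for all controls (observations 5-9), before matching.
pov_controls_all = [0.6, 0.5, 0.1, 0.3, 0.2]
# povrate for the matched controls (observations 6, 5, 5, 5), after matching.
pov_controls_matched = [0.5, 0.6, 0.6, 0.6]

before = smd(pov_treated, pov_controls_all)
after = smd(pov_treated, pov_controls_matched)
print(round(before, 2), round(after, 2))
```

A smaller absolute SMD after matching indicates that the matched groups are closer on that covariate than the raw groups were.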
Regression advantages
Estimate effect of a continuous treatment
variable
Assess effect of all covariates
Interaction of treatment with covariates
Allows extrapolation of results
PSM Limitations
Unobserved confounders
Unobserved variables that affect both treatment and outcome
A good amount of overlap is needed between groups in terms of observable characteristics
Problematic in situations where units with a high propensity score get treatment and units with a low score do not
Software Packages
Stata
psmatch2
teffects psmatch (Stata 13)
pscore
R
MatchIt