Potential Outcomes Framework
Stanislao Maldonado1
University of California, Berkeley
January, 2010
1. Notation
This section is based on Holland (1986), Angrist and Pischke (2009), and Morgan and Winship (2007). The model was
originally proposed by Neyman (1923) and further developed by Rubin (1974). We introduce
here the basic terminology:
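Note that the observed outcome for each individual can be written as follows:
$$Y_i = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0)$$
Or simply:
$$Y_i = \begin{cases} Y_i(1) & \text{if } D_i = 1 \\ Y_i(0) & \text{if } D_i = 0 \end{cases}$$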
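The rule above, which maps the two potential outcomes and the treatment indicator into the observed outcome, can be sketched in a few lines of Python. The three units and their values are hypothetical and serve only as an illustration; in real data only one of the two potential outcomes is ever recorded for each unit.

```python
def observed_outcome(y1: float, y0: float, d: int) -> float:
    """Return Y = D * Y(1) + (1 - D) * Y(0) for a single unit."""
    return d * y1 + (1 - d) * y0


# Hypothetical units, each given as (Y(1), Y(0), D).
units = [(12.0, 10.0, 1), (9.0, 9.5, 0), (15.0, 11.0, 1)]

# Treated units reveal Y(1); control units reveal Y(0).
observed = [observed_outcome(y1, y0, d) for y1, y0, d in units]
print(observed)  # [12.0, 9.5, 15.0]
```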
1 Ph.D. student, Department of Agricultural and Resource Economics. E-mail: smaldonadoz@berkeley.edu
Applied Econometrics for Economic Research CIES/INEI
Table 1. The Fundamental Problem of Causal Inference

Group           | Y(1)            | Y(0)
Treatment (D=1) | Observable as Y | Counterfactual
Control (D=0)   | Counterfactual  | Observable as Y
We are required to think in terms of “counterfactuals”, i.e., what would have happened to a
treated individual had he or she not received the treatment, and vice versa.
Holland (1986) suggests two types of solutions: a) the scientific solution and b) the statistical
solution.
The statistical solution is based on estimating the average effect of the treatment in a population instead of
the effect for each individual. This parameter is the average treatment effect (ATE):
$$ATE = E[Y_i(1) - Y_i(0)]$$
This average effect is still not estimable without further assumptions about the relationship
between the potential outcomes $Y_i(1)$ and $Y_i(0)$ and the treatment $D_i$.
More interesting for economists is the average treatment effect on the treated (ATT):
$$ATT = E[Y_i(1) - Y_i(0) \mid D_i = 1]$$
As in the previous case, we cannot estimate this parameter without further assumptions.
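To make the distinction between the two parameters concrete, the sketch below computes ATE and ATT for four hypothetical units whose potential outcomes are both assumed to be known. This is purely illustrative: it is exactly the information that real data never provide, which is why neither parameter can be estimated without further assumptions.

```python
from statistics import mean

# Hypothetical units, each given as (Y(1), Y(0), D). Knowing both potential
# outcomes is an assumption made only for illustration.
units = [
    (12.0, 10.0, 1),
    (14.0, 10.0, 1),
    (9.0, 8.0, 0),
    (8.0, 8.0, 0),
]

# Individual treatment effects Y(1) - Y(0).
effects = [y1 - y0 for y1, y0, _ in units]

# ATE averages over everyone; ATT averages over the treated only.
ate = mean(effects)
att = mean(y1 - y0 for y1, y0, d in units if d == 1)

print(ate)  # 1.75
print(att)  # 3.0
```

In this toy population the treated units gain more from treatment than the untreated ones, so ATT exceeds ATE.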
We can also define a parameter called the average treatment effect for the untreated (ATU):
$$ATU = E[Y_i(1) - Y_i(0) \mid D_i = 0]$$
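A simple way to estimate ATT is by using the mean difference in outcomes (MDO), or naïve
estimator, which decomposes into the ATT plus a selection bias term:
$$E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{ATT}} + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}$$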
ATT can be consistently estimated using the naïve estimator when there is no selection bias.
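The relationship between the naïve estimator, ATT and selection bias can be checked numerically. The sketch below uses six hypothetical units with both potential outcomes assumed known (an assumption made only so that the selection bias term can be computed directly) and confirms that the mean difference in observed outcomes equals ATT plus the difference in untreated outcomes between the two groups.

```python
from statistics import mean

# Hypothetical units, each given as (Y(1), Y(0), D).
units = [
    (20.0, 15.0, 1),
    (22.0, 15.0, 1),
    (24.0, 18.0, 1),
    (12.0, 10.0, 0),
    (13.0, 12.0, 0),
    (11.0, 11.0, 0),
]

# Observed outcomes: Y(1) for the treated, Y(0) for the controls.
y_treated = [y1 for y1, _, d in units if d == 1]
y_control = [y0 for _, y0, d in units if d == 0]
naive = mean(y_treated) - mean(y_control)

# ATT: average effect among the treated.
att = mean(y1 - y0 for y1, y0, d in units if d == 1)

# Selection bias: gap in untreated outcomes between treated and controls.
y0_treated = [y0 for _, y0, d in units if d == 1]
selection_bias = mean(y0_treated) - mean(y_control)

print(naive, att, selection_bias)  # 11.0 6.0 5.0
```

Here the treated units would have had higher outcomes even without treatment, so the naïve estimator overstates ATT by the selection bias of 5.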
• Open bias:
• Hidden bias
The stable unit treatment value assumption (SUTVA) implies that the potential outcomes of individuals are unaffected by
changes in the treatment exposure of other individuals (Morgan and Winship 2007, section
2.4).
One way to understand SUTVA: no general equilibrium effects due to the treatment.
Critical issue: understanding causality in this framework depends on the ability to define
the potential outcomes correctly.
Poorly defined treatments are those that cannot, even in principle, be manipulated.
Example:
Key idea of this course: how to bring our research strategy as close as possible to a situation that
resembles an experiment in which the treatment is randomly assigned.
Angrist and Pischke (2009): random assignment is the most credible and influential research
design because it solves the “selection problem”.
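A small simulation can illustrate why random assignment solves the selection problem. The data-generating process below is entirely hypothetical: untreated outcomes are drawn from a normal distribution, the treatment effect is a constant 2 for everyone, and under self-selection individuals with a high untreated outcome take the treatment. It is a sketch under these stated assumptions, not a general result about any particular dataset.

```python
import random


def mean(values):
    return sum(values) / len(values)


def naive_estimator(y, d):
    """Difference in mean observed outcomes between treated and controls."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return mean(treated) - mean(control)


rng = random.Random(0)
n = 50_000
TRUE_EFFECT = 2.0  # assumed constant effect, so ATE = ATT = 2

y0 = [rng.gauss(10.0, 2.0) for _ in range(n)]
y1 = [baseline + TRUE_EFFECT for baseline in y0]

# Regime 1: self-selection on the untreated outcome.
d_self = [1 if baseline > 10.0 else 0 for baseline in y0]
# Regime 2: random assignment, independent of the potential outcomes.
d_rand = [1 if rng.random() < 0.5 else 0 for _ in range(n)]


def observe(d):
    """Observed outcome under a given assignment vector."""
    return [di * a + (1 - di) * b for a, b, di in zip(y1, y0, d)]


naive_self = naive_estimator(observe(d_self), d_self)
naive_rand = naive_estimator(observe(d_rand), d_rand)

print(round(naive_self, 2), round(naive_rand, 2))
```

Under self-selection the naïve estimator is far above the true effect of 2, while under random assignment it is close to 2 up to sampling noise.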
And also:
Comments:
• Generally, none of these conditions hold with observational data due to the existence of
selection.
• There is an important case in which these conditions are met. That is the case of a
randomized experiment.
• In an experimental design, the treatment $D_i$ is randomly assigned. Because of that, the
treatment $D_i$ is independent of (or orthogonal to) the potential outcomes $Y_i(1)$ and $Y_i(0)$.
Therefore,
As we will see later, one way to do that is by arguing that the treatment is ignorable after
conditioning on a set of covariates. This is known as selection on observables.
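The idea of selection on observables can be sketched with a simulation. Everything in the example is an assumption made for illustration: a single binary covariate `x` drives both the probability of treatment and the untreated outcome, the treatment effect is a constant 2, and treatment is as good as random within each value of `x`. The estimator shown (exact stratification on `x`, weighting each stratum by its sample share) is one simple way to condition on a covariate and is not necessarily the method developed later in the course.

```python
import random


def mean(values):
    return sum(values) / len(values)


def diff_in_means(sample):
    """Treated-minus-control mean outcome for rows of the form (x, d, y)."""
    treated = [y for _, d, y in sample if d == 1]
    control = [y for _, d, y in sample if d == 0]
    return mean(treated) - mean(control)


rng = random.Random(42)
n = 40_000
TRUE_EFFECT = 2.0  # assumed constant treatment effect

rows = []
for _ in range(n):
    x = 1 if rng.random() < 0.5 else 0
    # Treatment is more likely when x = 1, but random given x.
    p_treat = 0.8 if x == 1 else 0.2
    d = 1 if rng.random() < p_treat else 0
    # The covariate also shifts the untreated outcome, creating selection.
    y0 = 10.0 + 5.0 * x + rng.gauss(0.0, 1.0)
    y = y0 + TRUE_EFFECT if d == 1 else y0
    rows.append((x, d, y))

# Naive comparison ignores x and is contaminated by selection.
naive = diff_in_means(rows)

# Stratified comparison: difference in means within each x, weighted by
# the share of the sample in that stratum.
stratified = 0.0
for x_value in (0, 1):
    stratum = [row for row in rows if row[0] == x_value]
    stratified += (len(stratum) / n) * diff_in_means(stratum)

print(round(naive, 2), round(stratified, 2))
```

The naïve difference is well above 2 because treated units are disproportionately those with `x = 1`, while the stratified estimate is close to the assumed effect.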
Let’s consider the estimation of treatment effects with a random sample from a population.
Thus, we can re-write the naïve estimator of treatment effect in the following way:
$$\Delta_{NAIVE} = E_N[y_i \mid d_i = 1] - E_N[y_i \mid d_i = 0] \tag{17}$$
Assume that an autonomous fixed treatment selection regime prevails and π is the proportion
of the population of interest that takes the treatment.
In observational studies, there is no guarantee that the naïve estimator is going to converge to
any of the parameters defined earlier.
Comments:
• ATE is a function of five unknowns: the proportion of the population self-selected into the
treatment, and four potential outcomes.
• Without additional assumptions, we can consistently estimate three of these five unknowns
from a random sample of the population.
• In particular, we have that the following sample means converge in probability to the true
population parameters:
$$E_N[d_i] \xrightarrow{p} \pi$$
$$E_N[y_i \mid d_i = 1] \xrightarrow{p} E[Y_i(1) \mid D_i = 1]$$
$$E_N[y_i \mid d_i = 0] \xrightarrow{p} E[Y_i(0) \mid D_i = 0]$$
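The three convergence results above can be illustrated with a simulated random sample. The population values used below (a treated share of 0.3 and conditional means of 12 and 8) are hypothetical. Note that the simulation never needs to draw the two counterfactual outcomes, since they are never observed and therefore play no role in the three sample means.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(7)
n = 50_000

# Hypothetical population parameters.
PI = 0.3                 # share of the population taking the treatment
MEAN_Y1_TREATED = 12.0   # E[Y(1) | D = 1]
MEAN_Y0_CONTROL = 8.0    # E[Y(0) | D = 0]

d, y = [], []
for _ in range(n):
    di = 1 if rng.random() < PI else 0
    # Only the potential outcome matching the treatment status is observed.
    if di == 1:
        yi = rng.gauss(MEAN_Y1_TREATED, 1.0)
    else:
        yi = rng.gauss(MEAN_Y0_CONTROL, 1.0)
    d.append(di)
    y.append(yi)

# The three sample means that are estimable from (y, d) alone.
pi_hat = mean(d)
mean_y_treated = mean([yi for yi, di in zip(y, d) if di == 1])
mean_y_control = mean([yi for yi, di in zip(y, d) if di == 0])

print(round(pi_hat, 3), round(mean_y_treated, 2), round(mean_y_control, 2))
```

With a sample this large, the three sample means land close to 0.3, 12 and 8. The remaining two unknowns, $E[Y_i(1) \mid D_i = 0]$ and $E[Y_i(0) \mid D_i = 1]$, cannot be recovered from the sample no matter how large it is.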
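Now, let’s discuss the bias of the naïve estimator as an estimator of ATE. After a bit of algebra,
it can be shown that the naïve estimator converges to:
$$E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = ATE + \{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]\} + (1 - \pi)\{E[\Delta_i \mid D_i = 1] - E[\Delta_i \mid D_i = 0]\}$$
where $\Delta_i = Y_i(1) - Y_i(0)$ is the individual treatment effect.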
This expression suggests that the naïve estimator includes the ATE plus two terms:
• $E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]$, which is known as the “baseline bias”; and
• $(1 - \pi)\{E[\Delta_i \mid D_i = 1] - E[\Delta_i \mid D_i = 0]\}$, which is known as the “differential treatment
effect bias”.
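The decomposition into ATE, baseline bias and differential treatment effect bias can be verified with simple arithmetic. The five population values below are hypothetical; in practice the two counterfactual means are unknown, which is precisely why the two bias terms cannot be computed from data. The sketch also uses the identity that ATE is the π-weighted average of the effect among the treated and the effect among the untreated.

```python
import math

# Hypothetical values for the five unknowns.
pi = 0.3             # share of the population that is treated
y1_treated = 12.0    # E[Y(1) | D = 1], estimable
y0_treated = 10.0    # E[Y(0) | D = 1], counterfactual
y1_control = 9.0     # E[Y(1) | D = 0], counterfactual
y0_control = 8.0     # E[Y(0) | D = 0], estimable

# Average effects among the treated and the untreated.
effect_treated = y1_treated - y0_treated
effect_control = y1_control - y0_control
ate = pi * effect_treated + (1 - pi) * effect_control

# Probability limit of the naive estimator.
naive_limit = y1_treated - y0_control

# The two bias terms.
baseline_bias = y0_treated - y0_control
differential_bias = (1 - pi) * (effect_treated - effect_control)

# The naive limit should equal ATE plus the two bias terms.
reconstructed = ate + baseline_bias + differential_bias
print(math.isclose(naive_limit, reconstructed))  # True
```

With these numbers the naïve estimator converges to 4, while ATE is 1.3; the gap is split between a baseline bias of 2 and a differential treatment effect bias of 0.7.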
It should be clear that in order to get an unbiased and consistent estimate of ATE from a
random sample of a population, we have to rely on assumptions about the counterfactuals.
By assuming A.1 and A.2, we can compute the remaining unknowns in equation (13). In such a
situation ATE=ATT=ATU. Consider the following cases:
8. Final comments
In most cases faced by social scientists, both assumptions are hard to believe when only
non-experimental data are available.
We need to find some source of exogenous variation in the data in order to be able to estimate a
causal relationship.