The most common way that parametric analyses are used to compute
quantities of interest (without matching) is by (statistically)
holding constant some explanatory variables, changing others, and
computing predicted or expected values and taking the difference or
ratio, all by using the parametric functional form. In the case of
causal inference, this would mean looking at the effect on the
expected value of the outcome variable when changing the treatment
indicator $T$ from 0 to 1,
while holding constant the pretreatment control variables $X$
at their
means or medians. This, and indeed any other appropriate analysis
procedure, would be a perfectly reasonable way to proceed with
analysis after matching. If it is the chosen way to proceed, then
either treated or control units may be deleted during the matching
stage, since the same parametric structure is assumed to apply to all
observations.
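
The calculation described above can be sketched in symbols. The notation is an assumption made for illustration: $Y$ is the outcome, $T$ the treatment indicator, $X$ the pretreatment covariates, and $\bar{X}$ their means (medians could be substituted).

```latex
\[
\widehat{\Delta}
  = \mathrm{E}(Y \mid T = 1,\, X = \bar{X})
  - \mathrm{E}(Y \mid T = 0,\, X = \bar{X})
\]
```

If the parametric model happened to be a linear regression, $\mathrm{E}(Y \mid T, X) = \alpha + \tau T + X\beta$, this difference would reduce to the coefficient $\tau$ regardless of where $X$ is held. For a nonlinear model such as logit, the difference generally depends on the chosen value of $X$.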

In other instances, researchers wish to reduce the assumptions
inherent in their statistical model and so want to allow for the
possibility that the treatment effect varies over observations. In
this situation, one popular quantity of interest is the
average treatment effect on the treated (ATT). For example,
for the treated group, the potential outcomes under control, $Y_i(0)$,
are missing, whereas the outcomes under treatment, $Y_i(1)$, are
observed, and the goal of the analysis is to impute the missing
outcomes, $Y_i(0)$, for observations with treatment indicator
$T_i = 1$. We do this via
simulation using a parametric statistical model such as regression,
logit, or others (as described below). Once those potential outcomes
are imputed from the model, the estimate of individual $i$'s treatment
effect is $Y_i(1) - \widehat{Y}_i(0)$, where $\widehat{Y}_i(0)$ is a
predicted value of the dependent variable for unit $i$ under the
counterfactual condition where $T_i = 0$. The in-sample average
treatment effect for the treated individuals can then be obtained by
averaging this difference over all observations $i$
where in fact $T_i = 1$. Most MATCHIT algorithms retain all treated units, and
choose some subset of, or repeated units from, the control group, so
that estimating the ATT is straightforward. If one chooses options
that allow matching with replacement, or any solution that has
different numbers of controls (or treated units) within each subclass or stratum (such
as full matching),
then the parametric analysis following matching must accommodate these
procedures, such as by using fixed effects or weights, as appropriate.
(Similar procedures can also be used to estimate various other
quantities of interest such as the average treatment effect by
computing it for all observations, but then one must be aware that the
quantity of interest may change during the matching procedure
as some control units may be dropped.)
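
The in-sample ATT estimate described above can be written compactly. The symbols are assumed for illustration: $T_i$ is the treatment indicator, $Y_i(1)$ the observed outcome of a treated unit, $\widehat{Y}_i(0)$ its model-imputed outcome under control, and $n_1$ the number of treated units.

```latex
\[
\widehat{\mathrm{ATT}}
  = \frac{1}{n_1} \sum_{i \,:\, T_i = 1} \left( Y_i(1) - \widehat{Y}_i(0) \right),
\qquad
n_1 = \sum_{i} T_i
\]
```

When matching with replacement or full matching produces weights or subclasses, those would enter through the model used to obtain $\widehat{Y}_i(0)$, for example as regression weights or fixed effects.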

The imputation from the model can be done in at least two ways.
Recall that the model is used to impute the value that the
outcome variable would take among the treated units if those treated
units were actually controls. Thus, one reasonable approach would
be to fit a model to the matched data and create simulated predicted
values of the dependent variable for the treated units with the
treatment indicator $T_i$
switched counterfactually from 1 to 0. An alternative approach would
be to fit a model without $T$
by using only the outcomes of the
matched control units (i.e., using only observations where $T_i = 0$).
Then, given this fitted model, the missing potential outcomes under
control, $Y_i(0)$, are
imputed for the matched treated units by using the values of the
explanatory variables for the treated units. The first approach will
usually have lower variance, since all observations are used, and the
second may have less bias, since no assumption of constant parameters
across the models of the potential outcomes under treatment and control is needed.
See Ho, Imai, King, and Stuart (2007) for more details.
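
The two imputation approaches can be compared on a toy example. The following is a minimal sketch, not MATCHIT's own procedure: it assumes an already matched data set with a single covariate, uses ordinary least squares as the parametric model, and imputes with point predictions rather than simulation. The data values and the helper names `solve` and `ols` are made up for illustration.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)


# Hypothetical matched sample: one covariate x, outcome y.
# Controls follow y = 1 + 2x; treated units follow y = 4 + 3x,
# so the treatment effect varies with x.
x_treated = [1.0, 2.0, 3.0, 4.0]
y_treated = [7.0, 10.0, 13.0, 16.0]
x_control = [1.0, 2.0, 3.0, 5.0]
y_control = [3.0, 5.0, 7.0, 11.0]

# Approach 1: fit y ~ 1 + T + x on all matched units, then predict for
# the treated units with T switched counterfactually from 1 to 0.
X_all = [[1.0, 1.0, x] for x in x_treated] + [[1.0, 0.0, x] for x in x_control]
y_all = y_treated + y_control
b0, b_t, b_x = ols(X_all, y_all)
y0_hat_pooled = [b0 + b_t * 0.0 + b_x * x for x in x_treated]
diffs_pooled = [y1 - y0 for y1, y0 in zip(y_treated, y0_hat_pooled)]
att_pooled = sum(diffs_pooled) / len(diffs_pooled)

# Approach 2: fit y ~ 1 + x on the matched controls only (no T in the model),
# then predict at the covariate values of the treated units.
c0, c_x = ols([[1.0, x] for x in x_control], y_control)
y0_hat_controls = [c0 + c_x * x for x in x_treated]
diffs_controls = [y1 - y0 for y1, y0 in zip(y_treated, y0_hat_controls)]
att_controls_only = sum(diffs_controls) / len(diffs_controls)

print("ATT, pooled model with T switched to 0:", att_pooled)
print("ATT, controls-only model:", att_controls_only)
```

In this noise-free toy setup the controls-only fit reproduces the control outcomes exactly, so its ATT estimate matches the in-sample average of the true effects (5.5). The pooled fit forces a common slope on both groups and gives a slightly different answer. With noisy data, the pooled fit would typically be the less variable of the two because it uses all matched observations.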

Other quantities of interest can also be computed at the parametric stage, following any procedures you would have followed in the absence of matching. The advantage is that, if matching is done well, your answers will be more robust to many small changes in parametric specification.