MATCHIT is designed for causal inference with a dichotomous treatment variable and a set of pretreatment control variables. Any number or type of dependent variables can be used. (If you are interested in the causal effect of more than one variable in your data set, run MATCHIT separately for each one; it is unlikely in any event that any one parametric model will produce valid causal inferences for more than one treatment variable at a time.) MATCHIT can be used for other types of causal variables by dichotomizing them, perhaps in multiple ways (see also Imai and van Dyk, 2004). MATCHIT works for experimental data, but is usually used for observational studies where the treatment variable is not randomly assigned by the investigator, or the random assignment goes awry.
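As a rough illustration of dichotomizing a causal variable "in multiple ways", the sketch below codes one continuous variable as two alternative binary treatments. The variable `dose`, its values, the helper `dichotomize`, and both cutoffs are hypothetical and chosen only for illustration; each resulting indicator would then be used in a separate matching run.

```python
import statistics

# Hypothetical continuous causal variable (for example, a dose); values are invented.
dose = [0.0, 1.5, 2.0, 3.5, 4.0, 6.0, 7.5, 9.0]

def dichotomize(values, cutoff):
    """Code each value as 1 (treated) if strictly above the cutoff, else 0 (control)."""
    return [1 if v > cutoff else 0 for v in values]

# Two alternative definitions of "treated" for the same underlying variable.
t_median = dichotomize(dose, statistics.median(dose))  # split at the sample median
t_high = dichotomize(dose, 5.0)                        # split at an assumed substantive threshold

print(t_median)
print(t_high)
```

Which cutoff is appropriate is a substantive question; comparing results across several codings is one way to check how sensitive the conclusions are to that choice.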
We adopt the same notation as in Ho, Imai, King, and Stuart (2007). Unless otherwise noted, let $i$ index the $n$ units in the data set, $n_1$ denote the number of treated units, $n_0$ denote the number of control units (such that $n = n_0 + n_1$), and $x_i$ indicate a vector of pretreatment (or control) variables for unit $i$. Let $t_i = 1$ when unit $i$ is assigned treatment, and $t_i = 0$ when unit $i$ is assigned control. (The labels "treatment" and "control" and values 1 and 0 respectively are arbitrary and can be switched for convenience, except that some methods of matching are keyed to the definition of the treated group.) Denote $y_i(1)$ as the potential outcome of unit $i$ under treatment, that is, the value the outcome variable would take if $t_i$ were equal to 1, whether or not $t_i$ is in fact 0 or 1, and $y_i(0)$ as the potential outcome of unit $i$ under control, that is, the value the outcome variable would take if $t_i$ were equal to 0, regardless of its value in fact. The variables $y_i(1)$ and $y_i(0)$ are jointly unobservable: for each $i$, we observe one, $y_i = t_i y_i(1) + (1 - t_i) y_i(0)$, and not the other.
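The bookkeeping behind this notation can be sketched with a toy simulation. In real data only one potential outcome per unit is ever seen; here both are invented so that the rule "observed outcome equals the treatment potential outcome for treated units and the control potential outcome for control units" can be checked directly. All names (`t`, `y1`, `y0`, `y_obs`) and all values are assumptions made for illustration.

```python
# Toy data: every value below is invented for illustration.
t = [1, 0, 1, 0, 0]              # treatment indicator: 1 = treated, 0 = control
y1 = [5.0, 4.0, 7.0, 6.0, 3.0]   # potential outcome under treatment
y0 = [3.0, 4.0, 4.0, 5.0, 1.0]   # potential outcome under control

n = len(t)    # total number of units
n1 = sum(t)   # number of treated units
n0 = n - n1   # number of control units

# Observed outcome: treated units reveal y1, control units reveal y0.
y_obs = [ti * a + (1 - ti) * b for ti, a, b in zip(t, y1, y0)]

# Unit-level effects are computable only because this is a simulation;
# with real data one of the two terms is always missing.
unit_effects = [a - b for a, b in zip(y1, y0)]

print(n, n1, n0)
print(y_obs)
```

The list `unit_effects` is exactly what cannot be computed from observed data, which is why the comparison has to be made across units with similar pretreatment variables.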
Also denote a fixed vector of exogenous, pretreatment measured confounders as $X_i$. These variables are defined in the hope or under the assumption that conditioning on them appropriately will make inferences ignorable. Measures of balance should be computed with respect to all of $X$, even if some methods of matching only use some components.
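To make the last point concrete, the sketch below computes a balance measure for every confounder, including one that a hypothetical matching step did not use. The measure shown, a difference in means divided by the treated-group standard deviation, is one common choice and is an assumption here, not necessarily the statistic MATCHIT reports. The data, the covariate names, and the helper `std_mean_diff` are all invented for illustration.

```python
import statistics

# Invented toy data: treatment indicator and all pretreatment confounders.
t = [1, 1, 1, 0, 0, 0]
X = {
    "age": [30, 40, 50, 28, 42, 47],                 # suppose matching used this covariate
    "income": [20.0, 35.0, 50.0, 15.0, 25.0, 30.0],  # suppose matching ignored this one
}

def std_mean_diff(x, t):
    """Treated mean minus control mean, divided by the treated-group standard deviation."""
    treated = [v for v, ti in zip(x, t) if ti == 1]
    control = [v for v, ti in zip(x, t) if ti == 0]
    return (statistics.mean(treated) - statistics.mean(control)) / statistics.stdev(treated)

# Balance is assessed on every covariate, not only those used for matching.
balance = {name: std_mean_diff(col, t) for name, col in X.items()}

for name, value in balance.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the covariate that was assumed to be matched on looks well balanced while the ignored one does not, which is the situation a balance check restricted to the matching variables would miss.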