

All Matching Methods

  1. formula: the formula used to calculate the distance measure for matching. It takes the usual R formula syntax, treat ~ x1 + x2, where treat is a binary treatment indicator and x1 and x2 are pre-treatment covariates. Both the treatment indicator and the pre-treatment covariates must be contained in the same data frame, which is specified as data (see below). All of the usual R formula syntax works here. For example, x1:x2 represents the first-order interaction between x1 and x2, and I(x1 ^ 2) represents the square of x1. See help(formula) for details.

  2. data: the data frame containing the variables called in formula. You may find it helpful for the diagnostics to specify observation names in the data frame (see Section 5.2.2).

  3. method: the matching method (default=nearest). Currently, exact (exact matching), full (full matching), nearest (nearest neighbor matching), optimal (optimal matching), subclass (subclassification), and genetic (genetic matching) are available. Note that within each of these matching methods, MATCHIT offers a variety of options. See Section 3 for more details.

  4. distance: the method used to estimate the distance measure (default=logistic regression, logit). Before using any of these techniques, it is best to understand their theoretical grounding and to evaluate the results. Most of these methods (such as logistic or probit regression) estimate the propensity score, defined as the probability of receiving treatment conditional on the covariates (Rosenbaum and Rubin, 1983). The distance measure used is then the predicted probability from the model (the propensity score). Currently, the following methods are available:

  5. distance.options: optional arguments passed to the model that estimates the distance measure. The input must be a list. For example, if the distance measure is estimated with a logistic regression, users can raise the maximum number of IWLS iterations with distance.options = list(maxit = 5000).

  6. discard: whether to discard units that fall outside some measure of support of the distance score before matching, and not allow them to be used at all in the matching procedure (default=none). Note that discarding units may change the quantity of interest being estimated.

  7. reestimate: whether the model for the distance measure should be re-estimated after units are discarded (default=FALSE). The input must be a logical value. Re-estimation may be desirable for efficiency reasons, especially if many units were discarded, so that the post-discard sample differs substantially from the original sample.

  8. verbose: whether or not to print out comments indicating the status of the matching (default=FALSE). The input must be a logical value.
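
The arguments above can be illustrated with a short sketch. It is not taken from the MATCHIT sources. The simulated data frame df and the variables treat, x1, and x2 are hypothetical, and the propensity score is fitted directly with glm() only to show what a logistic-regression distance measure looks like. The matchit() call at the end uses only the arguments described in this section, leaves distance at its default, and runs only if the package is installed.

```r
# Simulated data; every variable name here is hypothetical.
set.seed(1)
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$treat <- rbinom(n, 1, plogis(0.5 * df$x1 - 0.25 * df$x2))
rownames(df) <- paste0("unit", seq_len(n))  # observation names help diagnostics

# formula: main effects, a first-order interaction, and a squared term.
f <- treat ~ x1 + x2 + x1:x2 + I(x1^2)
colnames(model.matrix(f, data = df))  # columns the formula expands to

# A logistic-regression distance measure fitted by hand with glm().
# maxit raises the IWLS iteration limit, which is what
# distance.options = list(maxit = 5000) is described as doing.
ps_model <- glm(f, data = df, family = binomial(link = "logit"),
                control = glm.control(maxit = 5000))
pscore <- fitted(ps_model)  # predicted probabilities of treatment
range(pscore)

# The corresponding matchit() call, attempted only if MatchIt is available.
# distance is left at its default (logistic regression).
if (requireNamespace("MatchIt", quietly = TRUE)) {
  m.out <- MatchIt::matchit(f, data = df, method = "nearest",
                            distance.options = list(maxit = 5000),
                            discard = "none", reestimate = FALSE,
                            verbose = FALSE)
  print(m.out)
}
```

Fitting the model by hand is not required for matching. It is shown here only because inspecting the estimated propensity scores is a simple way to evaluate the distance model before relying on it.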


Gary King 2005-09-26