2015. “A Theory of Statistical Inference for Matching Methods in Applied Causal Research”. Abstract.
Matching methods for causal inference have become a popular way of reducing model dependence and bias, in large part because of their convenience and conceptual simplicity. Researchers most commonly use matching as a data preprocessing step, after which they apply whatever statistical model and uncertainty estimators they would have without matching. Unfortunately, for a given sample of any finite size, this approach is theoretically appropriate only under exact matching, which is usually infeasible; approximate matching can be justified under asymptotic theory, if large enough sample sizes are available, but then specialized point and variance estimators are required, which sacrifices some of matching's simplicity and convenience. Researchers also violate statistical theory with ad hoc iterations between formal matching methods and informal balance checks. Instead of asking researchers to change their widely used practices, we develop a comprehensive theory of statistical inference able to justify them. The theory we propose is substantively plausible, requires no asymptotic theory, and is simple to understand. Its core conceptualizes continuous variables as having natural breakpoints, which are common in applications (e.g., high school or college degrees in years of education, a governmental poverty level in income, or phase transitions in temperature). The theory allows binary, multicategory, and continuous treatment variables from the outset and straightforward extensions for imperfect treatment assignment and different versions of treatments. Although this theory provides a valid foundation for most commonly used methods of matching, researchers must still satisfy the assumptions in any real application.
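To make the idea of natural breakpoints concrete, here is a minimal sketch, not the authors' procedure. It coarsens years of education at two breakpoints and then matches exactly on the resulting strata. The breakpoint values (12 and 16 years), the function names, and the toy units are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict

# Assumed breakpoints for years of education: high school (12) and college (16).
EDUCATION_BREAKS = [12, 16]


def stratum(value, breaks):
    """Return the index of the interval that `value` falls in.

    A value equal to a breakpoint is placed in the upper interval,
    so 12 years counts as having completed high school.
    """
    return bisect_right(breaks, value)


def match_within_strata(units, breaks):
    """Group units by coarsened covariate and keep only strata
    that contain at least one treated and one control unit."""
    groups = defaultdict(lambda: {"treated": [], "control": []})
    for unit_id, treated, education in units:
        key = stratum(education, breaks)
        groups[key]["treated" if treated else "control"].append(unit_id)
    return {k: g for k, g in groups.items() if g["treated"] and g["control"]}


# Toy data: (unit id, treated?, years of education). Purely illustrative.
units = [
    ("a", True, 10), ("b", False, 11),
    ("c", True, 13), ("d", False, 15),
    ("e", True, 18),
]

matched = match_within_strata(units, EDUCATION_BREAKS)
for key, group in sorted(matched.items()):
    print(key, group)
```

In this toy example, unit "e" is dropped because its stratum (more than 16 years) contains no control unit. Within each retained stratum, treated and control units are treated as exactly matched on the coarsened variable.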