A Theory of Statistical Inference for Matching Methods in Applied Causal Research. Working Paper.
Abstract.
Applied researchers use matching methods for causal inference most commonly as a data preprocessing step for reducing model dependence and bias, after which they use whatever statistical model and uncertainty estimators they would have without matching, such as a difference in means or regression. They also routinely ignore the requirement of existing theory that all matches be exact. We offer the first theory of statistical inference to justify these widely used procedures, as well as the recommended, but ad hoc, practice of iterating between formal matching methods and informal balance checks. The theory we propose is substantively plausible, requires no asymptotic theory, and is simple to understand. Its core conceptualizes continuous variables as having natural breakpoints (such as high school or college degrees in years of education, a governmental poverty level in income, or phase transitions in temperature), a structure that is common in applications. The theory allows binary, multicategory, and continuous treatment variables from the outset and admits straightforward extensions for imperfect treatment assignment and different versions of treatments. Although this theory provides the foundation for all commonly used methods of matching, researchers must still satisfy the assumptions in any real application.
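As a rough illustration of the breakpoint idea, the following Python sketch coarsens a continuous covariate at fixed cut points, keeps only strata that contain both treated and control units, and then takes a difference in means within each stratum. It is a minimal sketch under stated assumptions, not the procedure developed in the paper: the cut points (12 and 16 years of education), the function names, and the choice to weight strata by their number of treated units are all hypothetical choices made for the example.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical breakpoints for years of education:
# 12 (high school degree) and 16 (college degree).
EDUCATION_BREAKPOINTS = [12, 16]


def stratum(years, breakpoints=EDUCATION_BREAKPOINTS):
    """Return the index of the interval that `years` falls into.

    With breakpoints [12, 16] the intervals are
    (-inf, 12), [12, 16), and [16, +inf).
    """
    return bisect_right(breakpoints, years)


def stratified_difference_in_means(units, breakpoints=EDUCATION_BREAKPOINTS):
    """Estimate a treatment effect from (years, treated, outcome) triples.

    Units are grouped by stratum; strata lacking either treated or
    control units are pruned. The within-stratum differences in means
    are averaged with weights equal to the number of treated units
    (an assumption of this sketch).
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for years, treated, outcome in units:
        groups[stratum(years, breakpoints)][bool(treated)].append(outcome)

    weighted_sum = 0.0
    total_weight = 0
    for arms in groups.values():
        treated, control = arms[True], arms[False]
        if not treated or not control:
            continue  # prune strata without both arms
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += len(treated) * diff
        total_weight += len(treated)

    if total_weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / total_weight
```

Calling `stratified_difference_in_means` on a list of `(years_of_education, treated, outcome)` triples returns a single estimate. Units in strata that lack a counterpart in the other treatment arm do not contribute, which mirrors the pruning role that matching plays as a preprocessing step.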