Working Paper

<p>A Unified Approach to Measurement Error and Missing Data: Overview</p>
Blackwell, Matthew, James Honaker, and Gary King. 2014.

A Unified Approach to Measurement Error and Missing Data: Overview

Although social scientists devote considerable effort to mitigating measurement error during data collection, they often ignore the issue during data analysis. And although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model dependence, difficult computation, or inapplicability with multiple mismeasured variables. We develop an easy-to-use alternative without these problems; it generalizes the popular multiple imputation (MI) framework by treating missing data problems as a limiting special case of extreme measurement error, and corrects for both. Like MI, the proposed framework is a simple two-step procedure, so that in the second step researchers can use whatever statistical method they would have if there had been no problem in the first place. We also offer empirical illustrations, open source software that implements all the methods described herein, and a companion paper with technical details and extensions (Blackwell, Honaker, and King, 2014b).
<p>How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It</p>
King, Gary, and Margaret Roberts. 2014.

How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It

"Robust standard errors" are used in a vast array of scholarship to correct standard errors for model misspecification. However, when misspecification is bad enough to make classical and robust standard errors diverge, assuming that it is nevertheless not so bad as to bias everything else requires considerable optimism. And even if the optimism is warranted, settling for a misspecified model, with or without robust standard errors, will still bias estimators of all but a few quantities of interest. Even though this message is well known to methodologists, it has failed to reach most applied researchers. The resulting cavernous gap between theory and practice suggests that considerable gains in applied statistics may be possible. We seek to help applied researchers realize these gains via an alternative perspective that offers a productive way to use robust standard errors; a new general and easier-to-use "generalized information matrix test" statistic; and practical illustrations via simulations and real examples from published research. Instead of jettisoning this extremely popular tool, as some suggest, we show how robust and classical standard error differences can provide effective clues about model misspecification, likely biases, and a guide to more reliable inferences.
<p>The Balance-Sample Size Frontier in Matching Methods for Causal Inference</p>
King, Gary, Christopher Lucas, and Richard Nielsen. 2014.

The Balance-Sample Size Frontier in Matching Methods for Causal Inference

We propose a simplified approach to matching for causal inference that simultaneously optimizes both balance (between the treated and control groups) and matched sample size. This procedure resolves two widespread tensions in the use of this powerful and popular methodology. First, current practice is to run a matching method that maximizes one balance metric (such as a propensity score or average Mahalanobis distance), but then to check whether it succeeds with respect to a different balance metric for which it was not designed (such as differences in means or L1). Second, current matching methods either fix the sample size and maximize balance (e.g., Mahalanobis or propensity score matching), fix balance and maximize the sample size (such as coarsened exact matching), or are arbitrary compromises between the two (such as calipers with ad hoc thresholds applied to other methods). These tensions lead researchers to either try to optimize manually, by iteratively tweaking their matching method and rechecking balance, or settle for suboptimal solutions. We address these tensions by first defining and showing how to calculate the matching frontier as the set of matching solutions with maximum balance for each possible sample size. Researchers can then choose one, several, or all matching solutions from the frontier for analysis in one step without iteration. The main difficulty in this strategy is that checking all possible solutions is exponentially difficult. We solve this problem with new algorithms that finish fast, optimally, and without iteration or manual tweaking. We (will) also offer easy-to-use software that implements these ideas, along with several empirical applications.
<p>Google Flu Trends Still Appears Sick:&nbsp;An Evaluation of the 2013‐2014 Flu Season</p>
Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014.

Google Flu Trends Still Appears Sick: An Evaluation of the 2013‐2014 Flu Season

Last year was difficult for Google Flu Trends (GFT). In early 2013, Nature reported that GFT was estimating more than double the percentage of doctor visits for influenza like illness than the Centers for Disease Control and Prevention s (CDC) sentinel reports during the 2012 2013 flu season (1). Given that GFT was designed to forecast upcoming CDC reports, this was a problematic finding. In March 2014, our report in Science found that the overestimation problem in GFT was also present in the 2011 2012 flu season (2). The report also found strong evidence of autocorrelation and seasonality in the GFT errors, and presented evidence that the issues were likely, at least in part, due to modifications made by Google s search algorithm and the decision by GFT engineers not to use previous CDC reports or seasonality estimates in their models what the article labeled algorithm dynamics and big data hubris respectively. Moreover, the report and the supporting online materials detailed how difficult/impossible it is to replicate the GFT results, undermining independent efforts to explore the source of GFT errors and formulate improvements.
<p>A Unified Approach to Measurement Error and Missing&nbsp;Data: Details and Extensions</p>
Blackwell, Matthew, James Honaker, and Gary King. 2014.

A Unified Approach to Measurement Error and Missing Data: Details and Extensions

We extend a unified and easy-to-use approach to measurement error and missing data. Blackwell, Honaker, and King (2014a) gives an intuitive overview of the new technique, along with practical suggestions and empirical applications. Here, we offer more precise technical details; more sophisticated measurement error model specifications and estimation procedures; and analyses to assess the approach's robustness to correlated measurement errors and to errors in categorical variables. These results support using the technique to reduce bias and increase efficiency in a wide variety of empirical research.
<p>Computer-Assisted Keyword and Document Set Discovery from Unstructured Text</p>
King, Gary, Patrick Lam, and Margaret Roberts. 2014.

Computer-Assisted Keyword and Document Set Discovery from Unstructured Text

The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend crucially on this choice, researchers typically pick keywords in ad hoc ways, given the lack of formal statistical methods to help. Paradoxically, this often means that the validity of the most sophisticated text analysis methods depends in practice on the inadequate keyword counting or matching methods they are designed to replace. The same ad hoc keyword selection process is also used in many other areas, such as following conversations that rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look-alike modeling; intelligence analysis; and sentiment and topic analysis. We develop a computer-assisted (as opposed to fully automated) statistical approach that suggests keywords from available text, without needing any structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and then summarizing results with Boolean search strings. We illustrate how the technique works with examples in English and Chinese.
How Coarsening Simplifies Matching-Based Causal Inference Theory
Iacus, Stefano M, and Gary King. 2012. How Coarsening Simplifies Matching-Based Causal Inference Theory.Abstract
The simplicity and power of matching methods have made them an increasingly popular approach to causal inference in observational data. Existing theories that justify these techniques are well developed but either require exact matching, which is usually infeasible in practice, or sacrifice some simplicity via asymptotic theory, specialized bias corrections, and novel variance estimators; and extensions to approximate matching with multicategory treatments have not yet appeared. As an alternative, we show how conceptualizing continuous variables as having logical breakpoints (such as phase transitions when measuring temperature or high school or college degrees in years of education) is both natural substantively and can be used to simplify causal inference theory. The result is a finite sample theory that is widely applicable, simple to understand, and easy to implement by using matching to preprocess the data, after which one can use whatever method would have been applied without matching. The theoretical simplicity also allows for binary, multicategory, and continuous treatment variables from the start and for extensions to valid inference under imperfect treatment assignment.
Comparative Effectiveness of Matching Methods for Causal Inference
King, Gary, Richard Nielsen, Carter Coberley, James E Pope, and Aaron Wells. 2011. Comparative Effectiveness of Matching Methods for Causal Inference.Abstract
Matching is an increasingly popular method of causal inference in observational data, but following methodological best practices has proven difficult for applied researchers. We address this problem by providing a simple graphical approach for choosing among the numerous possible matching solutions generated by three methods: the venerable ``Mahalanobis Distance Matching'' (MDM), the commonly used ``Propensity Score Matching'' (PSM), and a newer approach called ``Coarsened Exact Matching'' (CEM). In the process of using our approach, we also discover that PSM often approximates random matching, both in many real applications and in data simulated by the processes that fit PSM theory. Moreover, contrary to conventional wisdom, random matching is not benign: it (and thus PSM) can often degrade inferences relative to not matching at all. We find that MDM and CEM do not have this problem, and in practice CEM usually outperforms the other two approaches. However, with our comparative graphical approach and easy-to-follow procedures, focus can be on choosing a matching solution for a particular application, which is what may improve inferences, rather than the particular method used to generate it.
How Not to Lie Without Statistics
King, Gary, and Eleanor Neff Powell. 2008. How Not to Lie Without Statistics.Abstract
We highlight, and suggest ways to avoid, a large number of common misunderstandings in the literature about best practices in qualitative research. We discuss these issues in four areas: theory and data, qualitative and quantitative strategies, causation and explanation, and selection bias. Some of the misunderstandings involve incendiary debates within our discipline that are readily resolved either directly or with results known in research areas that happen to be unknown to political scientists. Many of these misunderstandings can also be found in quantitative research, often with different names, and some of which can be fixed with reference to ideas better understood in the qualitative methods literature. Our goal is to improve the ability of quantitatively and qualitatively oriented scholars to enjoy the advantages of insights from both areas. Thus, throughout, we attempt to construct specific practical guidelines that can be used to improve actual qualitative research designs, not only the qualitative methods literatures that talk about them.