Publications by Type: Software

2014
MatchingFrontier: R Package for Calculating the Balance-Sample Size Frontier
Gary King, Christopher Lucas, and Richard Nielsen. 2014. “MatchingFrontier: R Package for Calculating the Balance-Sample Size Frontier”.

MatchingFrontier is an easy-to-use R package for making optimal causal inferences from observational data. Despite their popularity, existing matching approaches leave researchers with two fundamental tensions. First, they are designed to maximize one metric (such as propensity score or Mahalanobis distance) but are judged against another for which they were not designed (such as L1 or differences in means). Second, they lack a principled way to reveal the implicit bias-variance trade-off: matching methods need to optimize with respect to both imbalance (between the treated and control groups) and the number of observations pruned, but existing approaches optimize with respect to only one; users then either ignore the other or tweak it by hand, usually suboptimally.

MatchingFrontier resolves both tensions by consolidating previous techniques into a single, optimal, and flexible approach. It calculates the matching solution with maximum balance for each possible sample size (N, N-1, N-2,...). It thus directly calculates the entire balance-sample size frontier, from which the user can easily choose one, several, or all subsamples on which to conduct their final analysis, given their own choice of imbalance metric and quantity of interest. MatchingFrontier solves the joint optimization problem in one run, automatically, without manual tweaking, and without iteration. Although for each subset size k there exists a huge number (N choose k) of unique subsets, MatchingFrontier includes specially designed fast algorithms that give the optimal answer, usually in a few minutes.
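
To make the idea of a frontier concrete, the following is a toy brute-force sketch in Python. It is not the package's algorithm or API: the function name `frontier_brute_force`, the data, and the choice of imbalance metric (absolute difference in means on a single covariate, pruning control units only) are all assumptions made for illustration. It enumerates every subset of each size, which is exactly the (N choose k) search that becomes infeasible on real data.

```python
from itertools import combinations
from statistics import mean


def frontier_brute_force(treated, controls):
    """Toy frontier: for each number of retained controls k, find the subset
    of controls with the smallest absolute difference in means from the
    treated group. Hypothetical helper, exhaustive search, tiny data only.

    Returns a list of (k, imbalance, kept_controls) from largest k to k = 1.
    """
    treated_mean = mean(treated)
    frontier = []
    for k in range(len(controls), 0, -1):
        # Exhaustively check all (N choose k) subsets of the controls.
        best = min(
            combinations(controls, k),
            key=lambda subset: abs(mean(subset) - treated_mean),
        )
        frontier.append((k, abs(mean(best) - treated_mean), best))
    return frontier


# Made-up covariate values for three treated and five control units.
treated = [5.0, 6.0, 7.0]
controls = [1.0, 4.0, 6.0, 8.0, 12.0]

for k, imbalance, kept in frontier_brute_force(treated, controls):
    print(k, round(imbalance, 3), kept)
```

Each printed row is one point on the toy frontier: a sample size paired with the lowest imbalance achievable at that size under the assumed metric. A user would then pick one or more rows and run the final analysis on the corresponding subsample.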

MatchingFrontier implements the methods in this paper:  

King, Gary, Christopher Lucas, and Richard Nielsen. 2014. The Balance-Sample Size Frontier in Matching Methods for Causal Inference, copy at http://j.mp/1dRDMrE

See http://projects.iq.harvard.edu/frontier/

2011
AutoCast: Automated Bayesian Forecasting with YourCast
Jonathan Bischof, Gary King, and Samir Soneji. 2011. “AutoCast: Automated Bayesian Forecasting with YourCast”. Publisher's Version
2010
JudgeIt II: A Program for Evaluating Electoral Systems and Redistricting Plans
A program for analyzing almost any feature of district-level legislative elections data, including prediction, evaluating redistricting plans, and estimating counterfactual hypotheses (such as what would happen if a term-limitation amendment were imposed). This implements statistical procedures described in a series of journal articles and has been used during redistricting in many states by judges, partisans, governments, private citizens, and many others. The earlier version won the APSA Research Software Award.
ReadMe: Software for Automated Content Analysis
Gary King, Matthew Knowles, and Steven Melendez. 2010. “ReadMe: Software for Automated Content Analysis”. Publisher's Version
This program will read and analyze a large set of text documents and report on the proportion of documents in each of a set of given categories.
2009
AMELIA II: A Program for Missing Data
James Honaker, Gary King, and Matthew Blackwell. 2009. “AMELIA II: A Program for Missing Data”. Publisher's Version
This program multiply imputes missing data in cross-sectional, time series, and time series cross-sectional data sets. It includes a Windows version (no knowledge of R required), and a version that works with R either from the command line or via a GUI.
CEM: Coarsened Exact Matching Software
Stefano Iacus, Gary King, and Giuseppe Porro. 2009. “CEM: Coarsened Exact Matching Software”. Publisher's Version
2008
VA: Verbal Autopsies
Gary King and Ying Lu. 2008. “VA: Verbal Autopsies”. Publisher's Version
2007
Anchors: Software for Anchoring Vignettes Data
Jonathan Wand, Gary King, and Olivia Lau. 2007. “Anchors: Software for Anchoring Vignettes Data”. Publisher's Version
MatchIt: Nonparametric Preprocessing for Parametric Causal Inference
Daniel Ho, Kosuke Imai, Gary King, and Elizabeth A Stuart. 2007. “MatchIt: Nonparametric Preprocessing for Parametric Causal Inference”. Publisher's Version
2006
Zelig: Everyone's Statistical Software
Kosuke Imai, Gary King, and Olivia Lau. 2006. “Zelig: Everyone's Statistical Software”. Publisher's Version
2005
WhatIf: Software for Evaluating Counterfactuals
Heather Stoll, Gary King, and Langche Zeng. 2005. “WhatIf: Software for Evaluating Counterfactuals”. Publisher's Version
2004
YourCast
Federico Girosi and Gary King. 2004. “YourCast”. Publisher's Version
YourCast is free, open-source software that makes forecasts by running sets of linear regressions together in a variety of sophisticated ways. YourCast avoids the bias that results when stacking datasets from separate cross-sections and assuming constant parameters, and the inefficiency that results from running independent regressions in each cross-section.
2003
Michael Tomz, Jason Wittenberg, and Gary King. 2003. “CLARIFY: Software for Interpreting and Presenting Statistical Results.” Journal of Statistical Software.
This is a set of easy-to-use Stata macros that implement the techniques described in Gary King, Michael Tomz, and Jason Wittenberg's "Making the Most of Statistical Analyses: Improving Interpretation and Presentation". To install Clarify, type "net from http://gking.harvard.edu/clarify" at the Stata command line. The documentation (available in HTML and PDF) explains how to do this. We also provide a zip archive for users who want to install Clarify on a computer that is not connected to the internet. Winner of the Okidata Best Research Software Award. Also try "ssc install qsim" to install a wrapper, donated by Fred Wolfe, that automates Clarify's simulation of dummy variables.
EI: A Program for Ecological Inference
Gary King and Kenneth Benoit. 2003. “EzI: A(n Easy) Program for Ecological Inference”. Publisher's Version
ReLogit: Rare Events Logistic Regression
Gary King, Michael Tomz, and Langche Zeng. 2003. “ReLogit: Rare Events Logistic Regression”. Publisher's Version
2002
A stand-alone, easy-to-use program for running event count and duration regression models, developed by and/or discussed in a series of journal articles by me. (Event count models have a dependent variable measured as the number of times something happens, such as the number of uncontested seats per state or the number of wars per year. Duration models explain dependent variables measured as the time until some event, such as the number of months a parliamentary cabinet endures.) Winner of the APSA Research Software Award.
1998
AMELIA: A Program for Missing Data
James Honaker, Anne Joseph, Gary King, Kenneth Scheve, and Naunihal Singh. 1998. “AMELIA: A Program for Missing Data”. Publisher's Version
Gary King. 1998. “MAXLIK”.

A set of Gauss programs and datasets (annotated for pedagogical purposes) to implement many of the maximum likelihood-based models I discuss in Unifying Political Methodology: The Likelihood Theory of Statistical Inference, Ann Arbor: University of Michigan Press, 1998, and use in my class. All datasets are real, not simulated.

1992
JudgeIt I: A Program for Evaluating Electoral Systems and Redistricting Plans
A program for analyzing almost any feature of district-level legislative elections data, including prediction, evaluating redistricting plans, estimating counterfactual hypotheses (such as what would happen if a term-limitation amendment were imposed), and others. This implements statistical procedures described in a series of journal articles and has been used during redistricting in many states by judges, partisans, governments, private citizens, and many others. Winner of the APSA Research Software Award.