Statistical methods to accommodate missing information in data sets due to scattered unit nonresponse, missing variables, or cell values or variables measured with error. Easy-to-use algorithms and software for multiple imputation and multiple overimputation for surveys, time series, and time series cross-sectional data. Applications to electoral, and other compositional, data.
The methods developed in this paper greatly expands the size and types of data sets that can be imputed without difficulty, for cross-sectional, time series, and time series cross-sectional data. . 2001. Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation, American Political Science Review 95: 49–69.Abstract
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one’s explanatory and dependent variables than the methods currently used in applied data analysis. The discrepancy occurs because the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and have demanded considerable expertise. We adapt an algorithm and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably easier to use than the leading method recommended in statistics literature. We also quantify the risks of current missing data practices, illustrate how to use the new procedure, and evaluate this alternative through simulated data as well as actual empirical examples. Finally, we offer easy-to-use that implements our suggested methods. (Software: AMELIA)