Gary King is the Albert J. Weatherhead III University Professor at Harvard University -- one of 24 with the title of University Professor, Harvard's most distinguished faculty position. He is based in the Department of Government (in the Faculty of Arts and Sciences) and serves as Director of the Institute for Quantitative Social Science. King develops and applies empirical methods in many areas of social science research, focusing on innovations that span the range from statistical theory to practical application.
King has been elected Fellow in 7 honorary societies (National Academy of Sciences 2010, National Academy of Social Insurance 2014, American Statistical Association 2009, American Association for the Advancement of Science 2004, American Academy of Arts and Sciences 1998, Society for Political Methodology 2008, and American Academy of Political and Social Science 2004) and has won more than 40 "best of" awards for his work (including the Career Achievement Award 2010, Warren Miller Prize 2008, McGraw-Hill Award 2006, Durr Award 2005, Gosnell Prize 1999 and 1997, Outstanding Statistical Application Award 2000, Donald Campbell Award 1997, Eulau Award 1995, Mills Award 1993, Pi Sigma Alpha Award 2005, 1998, and 1993, APSA Research Software Award 2005, 1997, 1994, and 1992, Okidata Best Research Software Award 1999, Okidata Best Research Web Site Award 1999, Mendelsohn Excellence in Mentoring Award 2011, Kellogg/Notre Dame Award 2014, among others). King was elected President of the Society for Political Methodology (1997-1999) and Vice President of the American Political Science Association (2003-2004). He was appointed a Fellow of the Guggenheim Foundation (1994-1995), Visiting Fellow at Oxford (1994), Senior Science Advisor to the World Health Organization (1998-2003), and member of the Senior Editorial Board at Science (2015-). His more than 150 journal articles, 20 open source software packages, and 8 books span most aspects of political methodology, many fields of political science, and several other scholarly disciplines.
King's work is widely read across scholarly fields and beyond academia. He was listed as the most cited political scientist of his cohort; among the group of "political scientists who have made the most important theoretical contributions" to the discipline "from its beginnings in the late-19th century to the present"; and on ISI's list of the most highly cited researchers across the social sciences. His work on legislative redistricting has been used in most American states by legislators, judges, lawyers, political parties, minority groups, and private citizens, as well as the U.S. Supreme Court. His work on inferring individual behavior from aggregate data has been used in as many states by these groups, and in many other practical contexts. His contributions to methods for achieving cross-cultural comparability in survey research have been used in surveys in over eighty countries by researchers, governments, and private concerns. King led an evaluation of the Mexican universal health insurance program, which included the largest randomized health policy experiment to date. The statistical methods and software he develops are used extensively in academia, government, consulting, and private industry. He is a founder, and an inventor of the original technology for, Learning Catalytics (acquired by Pearson) and Crimson Hexagon, among others.
King has had many students and postdocs, many of whom now hold faculty positions at leading universities and companies. He has collaborated with more than 150 scholars, including many of his students, on research for publication. He has served on more than 30 editorial boards; on the governing councils of the American Political Science Association, Inter-university Consortium for Political and Social Research, the Society for Political Methodology, and the Midwest Political Science Association; and on several National Research Council and National Science Foundation panels.
King received a B.A. from SUNY New Paltz (1980) and a Ph.D. from the University of Wisconsin-Madison (1984). His research has been supported by the National Science Foundation, the Centers for Disease Control and Prevention, the World Health Organization, the National Institute on Aging, the Global Forum for Health Research, and centers, corporations, foundations, and other federal agencies.
Proof that previously used estimators of electoral incumbency advantage were biased, and a new unbiased estimator. Also, the first systematic demonstration that constituency service by legislators increases the incumbency advantage.
An evaluation of the Mexican Seguro Popular program (designed to extend health insurance and regular and preventive medical care, pharmaceuticals, and health facilities to 50 million uninsured Mexicans), one of the world's largest health policy reforms of the last two decades. Our evaluation features a new design for field experiments that is more robust to the political interventions and implementation errors that have ruined many similar previous efforts; new statistical methods that produce more reliable and efficient results using fewer resources, assumptions, and data; and an implementation of these methods in the largest randomized health policy experiment to date. (See the Harvard Gazette story on this project.)
Resolution of the paradox of why polls are so variable over time during presidential campaigns even though the vote outcome is easily predictable before it starts. Also, a resolution of a key controversy over absentee ballots during the 2000 presidential election; and the methodology of small-n research on executives.
Replication Standards: New standards, protocols, and software for citing, sharing, analyzing, archiving, preserving, distributing, cataloging, translating, disseminating, naming, verifying, and replicating scholarly research data and analyses. Also includes proposals to improve the norms of data sharing and replication in science.
Methods for coding, analyzing, and forecasting international conflict and state failure. Evidence that the causes of conflict, theorized to be important but often found to be small or ephemeral, are indeed tiny for the vast majority of dyads, but are large, stable, and replicable wherever the ex ante probability of conflict is large.
The definition of partisan symmetry as a standard for fairness in redistricting; methods and software for measuring partisan bias and electoral responsiveness; discussion of U.S. Supreme Court rulings about this work. Evidence that U.S. redistricting reduces bias and increases responsiveness, and that the electoral college is fair; applications to legislatures, primaries, and multiparty systems.
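To make the partisan symmetry standard concrete, here is a minimal Python sketch. It is not the statistical model used in the work described above: it assumes uniform partisan swing (every district shifts by the same amount) and equal turnout across districts, and the function name `partisan_bias_at_half` is hypothetical. Under those assumptions, partisan bias at an even split is the party's seat share when its average vote share is moved to 50%, minus one half. A value of zero means the plan treats the two parties symmetrically at that point.

```python
def partisan_bias_at_half(district_vote_shares):
    """Seat share minus 0.5 when the party's average district vote is shifted to 50%.

    Illustrative assumptions: uniform partisan swing and equal district
    turnout, so the statewide vote equals the mean of the district shares.
    """
    if not district_vote_shares:
        raise ValueError("need at least one district")
    n = len(district_vote_shares)
    mean_share = sum(district_vote_shares) / n
    swing = 0.5 - mean_share  # uniform shift that makes the average vote 50%
    seats_won = sum(1 for v in district_vote_shares if v + swing > 0.5)
    return seats_won / n - 0.5


# Hypothetical two-party vote shares for one party in three districts.
packed_plan = [0.40, 0.45, 0.70]
bias = partisan_bias_at_half(packed_plan)
```

With these made-up numbers the party would win one of three seats with half the vote, so the bias is negative (the plan works against that party). A symmetric plan such as `[0.30, 0.45, 0.55, 0.70]` gives a bias of zero.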
Methods for forecasting mortality rates (overall or for time series data cross-classified by age, sex, country, and cause); estimating mortality rates in areas without vital registration; measuring inequality in risk of death; applications to US mortality, the future of Social Security, armed conflict, heart failure, and human security.
Methods for detecting and reducing model dependence (i.e., when minor model changes produce substantively different inferences) in inferring causal effects and other counterfactuals. Matching methods; "politically robust" and cluster-randomized experimental designs; causal bias decompositions.
Statistical models to explain or predict how many events occur for each fixed time period, or the time between events. An application to cabinet dissolution in parliamentary democracies that united two previously warring scholarly literatures. Other applications to international relations and U.S. Supreme Court appointments.
Inferring individual behavior from group-level data: The first approach to incorporate both unit-level deterministic bounds and cross-unit statistical information, methods for 2x2 and larger tables, Bayesian model averaging, applications to elections, software.
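The unit-level deterministic bounds mentioned above can be illustrated with a short Python sketch. It covers only the classic method of bounds for a single 2x2 table, not the full approach, which also uses cross-unit statistical information. The function and variable names are hypothetical. In one unit (say, a precinct), let `x` be the fraction of people in group 1 and `t` the fraction with the outcome (say, turnout). The accounting identity `t = beta1 * x + beta2 * (1 - x)` restricts the unknown group-specific rates `beta1` and `beta2` to the intervals computed below.

```python
def deterministic_bounds(x, t):
    """Bounds on group-specific rates implied by one unit's aggregate data.

    x: fraction of the unit's population in group 1 (strictly between 0 and 1)
    t: fraction of the unit's population with the outcome (between 0 and 1)
    Returns ((low1, high1), (low2, high2)) for group 1 and group 2.
    """
    if not 0 < x < 1:
        raise ValueError("x must be strictly between 0 and 1")
    if not 0 <= t <= 1:
        raise ValueError("t must be between 0 and 1")
    group1 = (max(0.0, (t - (1 - x)) / x), min(1.0, t / x))
    group2 = (max(0.0, (t - x) / (1 - x)), min(1.0, t / (1 - x)))
    return group1, group2


# Hypothetical precinct: 90% of residents in group 1, 95% overall turnout.
group1_bounds, group2_bounds = deterministic_bounds(0.9, 0.95)
```

In this made-up precinct, group 1's turnout must lie between roughly 0.944 and 1, while group 2's turnout is only known to lie between 0.5 and 1. The bounds are informative for the large group and wide for the small one, which is why adding statistical information across units helps.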
Statistical methods to accommodate missing information in data sets due to scattered unit nonresponse, missing variables, or cell values or variables measured with error. Easy-to-use algorithms and software for multiple imputation and multiple overimputation for surveys, time series, and time series cross-sectional data. Applications to electoral, and other compositional, data.
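As a rough illustration of how results from multiply imputed data sets are combined, the Python sketch below applies the standard combining rules for multiple imputation (often called Rubin's rules). It assumes the same model has already been fit to each of `m` completed data sets; the function name and the numbers are hypothetical, and this is not code from the software described above.

```python
import math
import statistics


def combine_imputations(estimates, standard_errors):
    """Combine one quantity of interest across m imputed data sets.

    estimates: point estimates, one per completed data set
    standard_errors: the corresponding standard errors
    Returns (combined point estimate, combined standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching lists with at least two imputations")
    point = statistics.fmean(estimates)
    within = statistics.fmean([se ** 2 for se in standard_errors])
    between = statistics.variance(estimates)  # sample variance, m - 1 denominator
    total = within + (1 + 1 / m) * between
    return point, math.sqrt(total)


# Hypothetical coefficient estimates from three imputed data sets.
point, se = combine_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The combined standard error is larger than any single-data-set standard error here because the estimates disagree across imputations, and that disagreement reflects uncertainty due to the missing data.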
How the same unified theory of inference underlies quantitative and qualitative research alike; scientific inference when quantification is difficult or impossible; research design; empirical research in legal scholarship.
How to save 99% of your data collection costs; bias corrections for logistic regression in estimating probabilities and causal effects in rare events data; estimating base probabilities or any quantity from case-control data; automated coding of events.
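One piece of this can be sketched briefly: recovering base probabilities after sampling on the outcome. The Python example below shows the standard prior correction of a logit intercept for case-control data, assuming the population fraction of events is known. It adjusts only the intercept and is separate from the finite-sample bias corrections mentioned above. All names and numbers are hypothetical.

```python
import math


def prior_corrected_intercept(intercept, population_rate, sample_rate):
    """Adjust a logit intercept estimated on a sample selected on the outcome.

    population_rate: fraction of events (ones) in the population
    sample_rate: fraction of events (ones) in the analyzed sample
    """
    if not (0 < population_rate < 1 and 0 < sample_rate < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    odds_factor = ((1 - population_rate) / population_rate) * (
        sample_rate / (1 - sample_rate)
    )
    return intercept - math.log(odds_factor)


def predicted_probability(intercept, slope, x):
    """Logistic probability for a single covariate value."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))


# Hypothetical: events are 1% of the population but 50% of the sample.
corrected = prior_corrected_intercept(0.0, 0.01, 0.5)
p_uncorrected = predicted_probability(0.0, 0.8, 0.0)
p_corrected = predicted_probability(corrected, 0.8, 0.0)
```

Without the correction the model would report a probability of 0.5 at `x = 0`, simply because half the sample consists of events; with it, the probability returns to the 1% population rate. The slope is left unchanged.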
"Anchoring Vignette" methods for when different respondents (perhaps from different cultures, countries, or ethnic groups) understand survey questions in different ways; an approach to developing theoretical definitions of complicated concepts apparently definable only by example (i.e., "you know it when you see it"); how surveys work.
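A simplified version of the idea can be sketched in Python: a respondent's self-assessment is re-expressed relative to that same respondent's ratings of the vignettes, so that differences in how people use the response scale drop out. The sketch assumes the respondent rates the vignettes in a strict order with no ties; handling ties and order violations needs more care than is shown here. The function name is hypothetical.

```python
def vignette_relative_rank(self_rating, vignette_ratings):
    """Recode a self-rating relative to one respondent's vignette ratings.

    vignette_ratings must be strictly increasing (vignette 1 lowest).
    With J vignettes the result runs from 1 (below all vignettes) to
    2J + 1 (above all vignettes); even values mean a tie with a vignette.
    """
    ordered = all(a < b for a, b in zip(vignette_ratings, vignette_ratings[1:]))
    if not vignette_ratings or not ordered:
        raise ValueError("this sketch needs strictly increasing vignette ratings")
    rank = 1
    for z in vignette_ratings:
        if self_rating < z:
            return rank
        if self_rating == z:
            return rank + 1
        rank += 2
    return rank


# Two hypothetical respondents give the same self-rating (3 on a 1-5 scale)
# but rate the same two vignettes differently.
respondent_a = vignette_relative_rank(3, [2, 4])  # between the two vignettes
respondent_b = vignette_relative_rank(3, [4, 5])  # below both vignettes
```

Although both respondents chose the same category for themselves, the recoded values differ because they evidently interpret the response categories differently.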
Large errors in flu prediction were largely avoidable, which offers lessons for the use of big data.
In February 2013, Google Flu Trends (GFT) made headlines, but not for a reason that Google executives or the creators of the flu tracking system would have hoped. Nature reported that GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) than the Centers for Disease Control and Prevention (CDC), which bases its estimates on surveillance reports from laboratories across the United States (1, 2). This happened despite the fact that GFT was built to predict CDC reports. Given that GFT is often held up as an exemplary use of big data (3, 4), what lessons can we draw from this error?
The social sciences are undergoing a dramatic transformation from studying problems to solving them; from making do with a small number of sparse data sets to analyzing increasing quantities of diverse, highly informative data; from isolated scholars toiling away on their own to larger scale, collaborative, interdisciplinary, lab-style research teams; and from a purely academic pursuit to having a major impact on the world. To facilitate these important developments, universities, funding agencies, and governments need to shore up and adapt the infrastructure that supports social science research. We discuss some of these developments here, as well as a new type of organization we created at Harvard to help encourage them -- the Institute for Quantitative Social Science. An increasing number of universities are beginning efforts to respond with similar institutions. This paper provides some suggestions for how individual universities might respond and how we might work together to advance social science more generally.
"Robust standard errors" are used in a vast array of scholarship to correct standard errors for model misspecification. However, when misspecification is bad enough to make classical and robust standard errors diverge, assuming that it is nevertheless not so bad as to bias everything else requires considerable optimism. And even if the optimism is warranted, settling for a misspecified model, with or without robust standard errors, will still bias estimators of all but a few quantities of interest. The resulting cavernous gap between theory and practice suggests that considerable gains in applied statistics may be possible. We seek to help researchers realize these gains via a more productive way to understand and use robust standard errors; a new general and easier-to-use "generalized information matrix test" statistic that can formally assess misspecification (based on differences between robust and classical variance estimates); and practical illustrations via simulations and real examples from published research. How robust standard errors are used needs to change, but instead of jettisoning this popular tool we show how to use it to provide effective clues about model misspecification, likely biases, and a guide to considerably more reliable, and defensible, inferences. Accompanying this article [soon!] is software that implements the methods we describe.
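The comparison at the heart of this argument can be shown with a small Python sketch. It does not implement the generalized information matrix test described above. For a simple regression with one covariate, it only computes the classical standard error of the slope and a heteroskedasticity-robust (HC0, sandwich-type) standard error side by side. The function name and the data are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """OLS slope with classical and HC0 robust standard errors.

    Fits y = a + b * x by least squares and returns
    (slope, classical_se, robust_se).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: one common error variance for every observation.
    classical_var = (sum(e * e for e in resid) / (n - 2)) / sxx
    # Robust (HC0): each squared residual stands in for its own error variance.
    robust_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return slope, math.sqrt(classical_var), math.sqrt(robust_var)


# Tiny made-up data set, chosen so the two standard errors differ.
slope, classical_se, robust_se = slope_standard_errors([-1, 0, 1], [-1, 1, 1])
```

When the two standard errors diverge, as they do here, the abstract's point is that the gap itself is a clue that the model may be misspecified, rather than a problem that switching to the robust number makes go away.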
The American system of higher education is under attack by political, economic, and educational forces that threaten to undermine its business model, governmental support, and operating mission. The potential changes are considerably more dramatic and disruptive than what we've already experienced. Traditional colleges and universities urgently need a coherent, thought-out response. Their central role in ensuring the creation, preservation, and distribution of knowledge may be at risk and, as a consequence, so too may be the spectacular progress across fields we have come to expect as a result.
Symposium contributors include Henry E. Brady, John Mark Hansen, Gary King, Nannerl O. Keohane, Michael Laver, Virginia Sapiro, and Maya Sen.
We marshal discoveries about human behavior and learning from social science research and show how they can be used to improve teaching and learning. The discoveries are easily stated as three social science generalizations: (1) social connections motivate, (2) teaching teaches the teacher, and (3) instant feedback improves learning. We show how to apply these generalizations via innovations in modern information technology inside, outside, and across university classrooms. We also give concrete examples of these ideas from innovations we have experimented with in our own teaching.
See also a video presentation of this talk before the Harvard Board of Overseers.
We offer the first large scale, multiple source analysis of the outcome of what may be the most extensive effort to selectively censor human expression ever implemented. To do this, we have devised a system to locate, download, and analyze the content of millions of social media posts originating from nearly 1,400 different social media services all over China before the Chinese government is able to find, evaluate, and censor (i.e., remove from the Internet) the large subset they deem objectionable. Using modern computer-assisted text analytic methods that we adapt to and validate in the Chinese language, we compare the substantive content of posts censored to those not censored over time in each of 85 topic areas. Contrary to previous understandings, posts with negative, even vitriolic, criticism of the state, its leaders, and its policies are not more likely to be censored. Instead, we show that the censorship program is aimed at curtailing collective action by silencing comments that represent, reinforce, or spur social mobilization, regardless of content. Censorship is oriented toward attempting to forestall collective activities that are occurring now or may occur in the future --- and, as such, seem to clearly expose government intent.
In the election for President of the United States, the Electoral College is the body whose members vote to elect the President directly. Each state sends a number of delegates equal to its total number of representatives and senators in Congress; all but two states (Nebraska and Maine) assign electors pledged to the candidate who wins the state's plurality vote. We investigate the effect on presidential elections if states were to assign their electoral votes according to results in each congressional district, and conclude that the direct popular vote and the current electoral college are both substantially fairer than the alternatives in which states would divide their electoral votes by congressional district.
The financial viability of Social Security, the single largest U.S. Government program, depends on accurate forecasts of the solvency of its intergenerational trust fund. We begin by detailing information necessary for replicating the Social Security Administration’s (SSA’s) forecasting procedures, which until now has been unavailable in the public domain. We then offer a way to improve the quality of these procedures due to age- and sex-specific mortality forecasts. The most recent SSA mortality forecasts were based on the best available technology at the time, which was a combination of linear extrapolation and qualitative judgments. Unfortunately, linear extrapolation excludes known risk factors and is inconsistent with long-standing demographic patterns such as the smoothness of age profiles. Modern statistical methods typically outperform even the best qualitative judgments in these contexts. We show how to use such methods here, enabling researchers to forecast using far more information, such as the known risk factors of smoking and obesity and known demographic patterns. Including this extra information makes a substantial difference: For example, by only improving mortality forecasting methods, we predict three fewer years of net surplus, $730 billion less in Social Security trust funds, and program costs that are 0.66% greater of projected taxable payroll compared to SSA projections by 2031. More important than specific numerical estimates are the advantages of transparency, replicability, reduction of uncertainty, and what may be the resulting lower vulnerability to the politicization of program forecasts. In addition, by offering with this paper software and detailed replication information, we hope to marshal the efforts of the research community to include ever more informative inputs and to continue to reduce the uncertainties in Social Security forecasts.
This work builds on our article that provides forecasts of US mortality rates (see King and Soneji, The Future of Death in America), a book developing improved methods for forecasting mortality (Girosi and King, Demographic Forecasting), all data we used (King and Soneji, replication data sets), and open source software that implements the methods (Girosi and King, YourCast). Also available is a New York Times Op-Ed based on this work (King and Soneji, Social Security: It’s Worse Than You Think), and a replication data set for the Op-Ed (King and Soneji, replication data set).
Massive increases in the availability of informative social science data are making dramatic progress possible in analyzing, understanding, and addressing many major societal problems. Yet the same forces pose severe challenges to the scientific infrastructure supporting data sharing, data management, informatics, statistical methodology, and research ethics and policy, and these are collectively holding back progress. I address these changes and challenges and suggest what can be done.
We highlight common problems in the application of random treatment assignment in large scale program evaluation. Random assignment is the defining feature of modern experimental design. Yet, errors in design, implementation, and analysis often result in real world applications not benefiting from the advantages of randomization. The errors we highlight cover the control of variability, levels of randomization, size of treatment arms, and power to detect causal effects, as well as the many problems that commonly lead to post-treatment bias. We illustrate with an application to the Medicare Health Support evaluation, including recommendations for improving the design and analysis of this and other large scale randomized experiments.
We introduce a new framework for forecasting age-sex-country-cause-specific mortality rates that incorporates considerably more information, and thus has the potential to forecast much better, than any existing approach. Mortality forecasts are used in a wide variety of academic fields, and for global and national health policy making, medical and pharmaceutical research, and social security and retirement planning.
As it turns out, the tools we developed in pursuit of this goal also have broader statistical implications, in addition to their use for forecasting mortality or other variables with similar statistical properties. First, our methods make it possible to include different explanatory variables in a time series regression for each cross-section, while still borrowing strength from one regression to improve the estimation of all. Second, we show that many existing Bayesian (hierarchical and spatial) models with explanatory variables use prior densities that incorrectly formalize prior knowledge. Many demographers and public health researchers have fortuitously avoided this problem so prevalent in other fields by using prior knowledge only as an ex post check on empirical results, but this approach excludes considerable information from their models. We show how to incorporate this demographic knowledge into a model in a statistically appropriate way. Finally, we develop a set of tools useful for developing models with Bayesian priors in the presence of partial prior ignorance. This approach also provides many of the attractive features claimed by the empirical Bayes approach, but fully within the standard Bayesian theory of inference.
Ecological Inference: New Methodological Strategies brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half decade has witnessed an explosion of research in ecological inference – the attempt to infer individual behavior from aggregate data. The uncertainties and the information lost in aggregation make ecological inference one of the most difficult areas of statistical inference, but such inferences are required in many academic fields, as well as by legislatures and the courts in redistricting, by businesses in marketing research, and by governments in policy analysis.
MatchingFrontier is an easy-to-use R package for making optimal causal inferences from observational data. Despite their popularity, existing matching approaches leave researchers with two fundamental tensions. First, they are designed to maximize one metric (such as propensity score or Mahalanobis distance) but are judged against another for which they were not designed (such as L1 or differences in means). Second, they lack a principled solution to revealing the implicit bias-variance trade-off: matching methods need to optimize with respect to both imbalance (between the treated and control groups) and the number of observations pruned, but existing approaches optimize with respect to only one; users then either ignore the other, or tweak it, usually suboptimally, by hand.
MatchingFrontier resolves both tensions by consolidating previous techniques into a single, optimal, and flexible approach. It calculates the matching solution with maximum balance for each possible sample size (N, N-1, N-2,...). It thus directly calculates the entire balance-sample size frontier, from which the user can easily choose one, several, or all subsamples from which to conduct their final analysis, given their own choice of imbalance metric and quantity of interest. MatchingFrontier solves the joint optimization problem in one run, automatically, without manual tweaking, and without iteration. Although for each subset size k there exists a huge number (N choose k) of unique subsets, MatchingFrontier includes specially designed fast algorithms that give the optimal answer, usually in a few minutes.
MatchingFrontier implements the methods in this paper:
King, Gary, Christopher Lucas, and Richard Nielsen. 2014. The Balance-Sample Size Frontier in Matching Methods for Causal Inference, copy at http://j.mp/1dRDMrE
A program for analyzing almost any feature of district-level legislative elections data, including prediction, evaluating redistricting plans, and estimating counterfactual hypotheses (such as what would happen if a term-limitation amendment were imposed). This implements statistical procedures described in a series of journal articles and has been used during redistricting in many states by judges, partisans, governments, private citizens, and many others. The earlier version won the APSA Research Software Award.
This program multiply imputes missing data in cross-sectional, time series, and time series cross-sectional data sets. It includes a Windows version (no knowledge of R required), and a version that works with R either from the command line or via a GUI.
YourCast is (open source and free) software that makes forecasts by running sets of linear regressions together in a variety of sophisticated ways. YourCast avoids the bias that results when stacking datasets from separate cross-sections and assuming constant parameters, and the inefficiency that results from running independent regressions in each cross-section.
This is a set of easy-to-use Stata macros that implement the techniques described in Gary King, Michael Tomz, and Jason Wittenberg's "Making the Most of Statistical Analyses: Improving Interpretation and Presentation". To install Clarify, type "net from http://gking.harvard.edu/clarify" at the Stata command line. The documentation explains how to do this. We also provide a zip archive for users who want to install Clarify on a computer that is not connected to the internet. Winner of the Okidata Best Research Software Award. Also try -ssc install qsim- to install a wrapper, donated by Fred Wolfe, to automate Clarify's simulation of dummy variables.