Applications

Evaluating Social Security Forecasts

The accuracy of U.S. Social Security Administration (SSA) demographic and financial forecasts is crucial for the solvency of its Trust Funds, for government programs that together comprise more than 50% of all federal government expenditures, for industry decision making, and for the evidence base of many scholarly articles. Forecasts are also essential for scoring policy proposals put forward by both political parties. Because SSA makes public little replication information and uses ad hoc, qualitative, and antiquated statistical forecasting methods, no one in or out of government has been able to produce fully independent alternative forecasts or policy scorings. Yet no systematic evaluation of SSA forecasts has ever been published by SSA or anyone else. We show that SSA's forecasting errors were approximately unbiased until about 2000 but then began to grow quickly, with increasingly overconfident uncertainty intervals. Moreover, the errors all turn out to be in the same potentially dangerous direction, each making the Social Security Trust Funds look healthier than they actually are. We also discover the cause of these findings with evidence from a large number of interviews we conducted with participants at every level of the forecasting and policy processes. We show that SSA's forecasting procedures meet all the conditions that the modern social-psychology and statistical literatures demonstrate make bias likely. When those conditions mixed with potent new political forces trying to change Social Security and influence the forecasts, SSA's actuaries hunkered down, trying hard to insulate themselves from the intense political pressure. Unfortunately, this otherwise laudable resistance to undue influence, along with their ad hoc qualitative forecasting models, also led them to miss important changes in the input data, such as retirees living longer, and drawing benefits longer, than predicted by simple extrapolations. We explain that solving this problem involves (a) removing human judgment where possible, by using formal statistical methods -- via the revolution in data science and big data; (b) instituting formal structured procedures when human judgment is required -- via the revolution in social psychological research; and (c) requiring transparency and data sharing to catch errors that slip through -- via the revolution in data sharing and replication.

An article at Barron's about our work.

Articles and Presentations

Systematic Bias and Nontransparency in US Social Security Administration Forecasts
Konstantin Kashin, Gary King, and Samir Soneji. 2015. “Systematic Bias and Nontransparency in US Social Security Administration Forecasts.” Journal of Economic Perspectives, 29, 2, Pp. 239-258.

The financial stability of four of the five largest U.S. federal entitlement programs, strategic decision making in several industries, and many academic publications all depend on the accuracy of demographic and financial forecasts made by the Social Security Administration (SSA). Although the SSA has performed these forecasts since 1942, no systematic and comprehensive evaluation of their accuracy has ever been published by SSA or anyone else. The absence of a systematic evaluation of forecasts is a concern because the SSA relies on informal procedures that are potentially subject to inadvertent biases and does not share with the public, the scientific community, or other parts of SSA sufficient data or information necessary to replicate or improve its forecasts. These issues result in SSA holding a monopoly position in policy debates as the sole supplier of fully independent forecasts and evaluations of proposals to change Social Security. To assist with the forecasting evaluation problem, we collect all SSA forecasts for years that have passed and discover error patterns that could have been---and could now be---used to improve future forecasts. Specifically, we find that after 2000, SSA forecasting errors grew considerably larger and most of these errors made the Social Security Trust Funds look more financially secure than they actually were. In addition, SSA's reported uncertainty intervals are overconfident and increasingly so after 2000. We discuss the implications of these systematic forecasting biases for public policy.

Explaining Systematic Bias and Nontransparency in US Social Security Administration Forecasts
Konstantin Kashin, Gary King, and Samir Soneji. 2015. “Explaining Systematic Bias and Nontransparency in US Social Security Administration Forecasts.” Political Analysis, 23, 3, Pp. 336-362.

The accuracy of U.S. Social Security Administration (SSA) demographic and financial forecasts is crucial for the solvency of its Trust Funds, other government programs, industry decision making, and the evidence base of many scholarly articles. Because SSA makes public little replication information and uses qualitative and antiquated statistical forecasting methods, fully independent alternative forecasts (and the ability to score policy proposals to change the system) are nonexistent. Yet, no systematic evaluation of SSA forecasts has ever been published by SSA or anyone else --- until a companion paper to this one (King, Kashin, and Soneji, 2015a). We show that SSA's forecasting errors were approximately unbiased until about 2000, but then began to grow quickly, with increasingly overconfident uncertainty intervals. Moreover, the errors are all in the same potentially dangerous direction, making the Social Security Trust Funds look healthier than they actually are. We extend and then attempt to explain these findings with evidence from a large number of interviews we conducted with participants at every level of the forecasting and policy processes. We show that SSA's forecasting procedures meet all the conditions the modern social-psychology and statistical literatures demonstrate make bias likely. When those conditions mixed with potent new political forces trying to change Social Security, SSA's actuaries hunkered down trying hard to insulate their forecasts from strong political pressures. Unfortunately, this otherwise laudable resistance to undue influence, along with their ad hoc qualitative forecasting models, led the actuaries to miss important changes in the input data. Retirees began living longer lives and drawing benefits longer than predicted by simple extrapolations. We also show that the solution to this problem involves SSA or Congress implementing in government two of the central projects of political science over the last quarter century: [1] promoting transparency in data and methods and [2] replacing with formal statistical models large numbers of qualitative decisions too complex for unaided humans to make optimally.

Frequently Asked Questions

You write that no other institution makes fully independent forecasts. What about the Congressional Budget Office?

The Congressional Budget Office (CBO) uses the SSA’s fertility forecast as an input to its forecasting model. Before 2013, CBO also used SSA's mortality forecasts as inputs to its model (here is why they changed).  

CBO explains on page 103 of The 2014 Long-Term Budget Outlook: "CBO used projected values from the Social Security trustees for fertility rates but produced its own projections for immigration and mortality rates. Together, those projections imply a total U.S. population of 395 million in 2039, compared with 324 million today. CBO also produced its own projection of the rate at which people will qualify for Social Security’s Disability Insurance program in coming decades."

How many ultimate rates of mortality decline does the Social Security Administration choose?

The number of ultimate rates of mortality decline has changed over time. Between 1982 and 2011, SSA chose 210 such rates (5 broad age groups x 2 sexes x 7 causes of death x 3 cost scenarios). Since 2012, SSA has reduced the number of causes of death from 7 to 5, applied the same ultimate rates of decline to males and females, and uniformly scaled the ultimate rates of decline for the low- and high-cost scenarios to ½ and 5/3, respectively, of the intermediate-cost rates of decline.
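To make the arithmetic concrete, here is a minimal sketch that simply recomputes the counts and the scaling rule described above; it assumes the 5 broad age groups are unchanged since 2012, and the 0.78% rate at the end is a hypothetical illustration, not an SSA value.

```python
def n_rates(age_groups, sexes, causes, scenarios):
    """Number of ultimate rates of mortality decline chosen independently."""
    return age_groups * sexes * causes * scenarios

# 1982-2011: 5 broad age groups x 2 sexes x 7 causes of death x 3 cost scenarios.
print(n_rates(5, 2, 7, 3))    # 210

# Since 2012 (per the description above, assuming 5 age groups): 5 causes of death,
# rates uniform across sexes, and only the intermediate-cost rates chosen directly.
print(n_rates(5, 1, 5, 1))    # 25 intermediate-cost rates

# The low- and high-cost rates are then derived by uniform scaling.
LOW_SCALE, HIGH_SCALE = 1 / 2, 5 / 3

def cost_scenarios(intermediate_rate):
    """Derive low- and high-cost ultimate rates of decline from an intermediate rate."""
    return {"low": LOW_SCALE * intermediate_rate,
            "intermediate": intermediate_rate,
            "high": HIGH_SCALE * intermediate_rate}

print(cost_scenarios(0.78))   # a hypothetical 0.78% annual rate of decline
```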

How do you measure uncertainty of SSA policy scores?

As an analogy, we can think of a policy score as the coefficient (an intended causal effect) in a regression of a policy output (such as the balance or cost rate) on the treatment variable (whether or not the proposed policy is adopted) plus an error term. SSA offers no uncertainty estimates for this estimated causal effect, although of course some causal effects are likely to be better estimated or better known than others. Sometimes, because of assumptions known ex ante, we may believe the effects are known with a high degree of certainty. However, causal effects are never observed in the real world; only the policy outputs are ever observed. To empirically estimate what will happen in the real world if a policy is adopted, or to evaluate a claim about a causal effect’s size or its uncertainty in a way that makes oneself vulnerable to being proven wrong, we must rely on forecasts under present law and forecasts under the counterfactual condition of the policy being adopted. It is the uncertainty of the forecast under present law that our papers show how to estimate using the observed forecast errors. In this evaluation, we find that most of the observable impact of these causal effects is swamped by that forecast uncertainty. For example, the most recent SSA evaluation of a policy proposal gives a graphical illustration in its Figure 1, which plots the point estimate of the Trust Fund Ratio for each year in the future under both present law and a proposed law under consideration; each of these lines has uncertainty at least as large as we estimate in our paper. There is also additional uncertainty, over and above forecast errors, because we do not know exactly what would happen if the policy were actually changed, and how all the workers, beneficiaries, government officials, and others would respond under the new regime.
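In schematic notation (ours, not SSA’s), the logic above can be written as follows.

```latex
% \hat{Y}_t(1) and \hat{Y}_t(0) are forecasts of a policy output (e.g., the cost rate)
% in year t with and without the proposed change; the policy score is their difference:
\[
  \widehat{\mathrm{score}}_t \;=\; \hat{Y}_t(1) - \hat{Y}_t(0).
\]
% Neither potential outcome is observed before the policy decision, but for past years
% under present law the forecast errors
\[
  e_t \;=\; Y_t - \hat{Y}_t(0)
\]
% are observed, and their distribution is what we use to estimate present-law forecast
% uncertainty. The score inherits uncertainty from both forecasts, plus additional
% uncertainty about how workers, beneficiaries, and officials would behave under the
% counterfactual regime.
```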

When is it acceptable for the Social Security Administration to bias today’s forecast towards yesterday’s forecast, producing artificially smooth forecasts over time?

Smoothing in this way can be advantageous statistically: it reduces variance and, if there is no systematic bias, possibly mean square error as well. Unfortunately, SSA forecasts are systematically biased, and so smoothing is not helpful here. Another possible justification is paternalistic: to keep the public from worrying about the future of Social Security. Whether this paternalistic position is appropriate is, of course, a normative choice. Our own view is that, whenever possible, the government should be in the position of giving accurate forecasts and telling the public the truth as soon as it is known. The government can and should accompany point estimates with accurate uncertainty estimates. If public officials or the public do not understand these uncertainty estimates, then it is incumbent upon government officials, and those of us who pay attention to what they do, to be good teachers. Politicians and the public may not have the time to deal with the details very often, but in our experience it is not difficult to convey important points like these.
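The statistical point in the first sentence can be seen in a toy simulation (ours, not SSA’s procedure): shrinking this year’s forecast toward last year’s helps when nothing systematic has changed, but locks in error when the quantity being forecast has shifted since the earlier forecast was made.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse(shift, w, sd=0.5, n=200_000):
    """MSE of the smoothed forecast (1 - w) * new + w * old, where the truth has
    moved by `shift` since the old forecast was made."""
    old = 0.0 + rng.normal(0, sd, n)     # last year's forecast, made before the shift
    new = shift + rng.normal(0, sd, n)   # this year's forecast, using the new data
    smoothed = (1 - w) * new + w * old
    return np.mean((smoothed - shift) ** 2)

for shift in (0.0, 2.0):
    for w in (0.0, 0.5):
        print(f"shift={shift}, weight on old forecast={w}: MSE={mse(shift, w):.3f}")
# shift=0: smoothing roughly halves the MSE (pure variance reduction).
# shift=2: smoothing adds bias^2 = (0.5 * 2)^2 = 1.0, swamping the variance gain.
```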

How soon could SSA become aware of errors in their forecasts?

For all the financial indicators, the error in last year’s one-year-ahead forecast is known before this year’s forecast is issued. However, SSA receives mortality data from the National Center for Health Statistics with a 2 to 4 year lag.

Who did you interview and how did you select them?

We interviewed a sample of participants in the forecasting process, including those who try to influence the process, use the forecasts, make proposals to change Social Security, and comment publicly or privately on the process. Our sample included current or former high- and low-profile public officials in Congress, the White House, and the Social Security Administration, including Democrats, Republicans, liberals, conservatives, and members of various advisory boards. We also included some people in academia and the private sector. Our design was a stratified sequential quota sample, with strata defined by role in the process. The sequential part of the design involved sampling and conducting interviews within each stratum until we heard the same stories and the same points often enough that we could reliably predict what the next person would say when prompted with the same question. We tested this hypothesis, making ourselves vulnerable to being proven wrong, by making predictions and seeing what the next person in fact said. Of course, each person added more color, detail, and information, but at some point the information we gathered about our essential questions was well past the point of diminishing returns, and so we stopped. We found individuals by enumeration and snowball sampling; we were able to find all but a few of the people we sought, and almost everyone we asked freely gave of their time to speak with us. Part of the reason for this success in reaching people is that we promised confidentiality to each respondent, whether or not they asked for it.

Related Materials

Scoring Social Security Proposals: Response from Kashin, King, and Soneji
Konstantin Kashin, Gary King, and Samir Soneji. 2016. “Scoring Social Security Proposals: Response from Kashin, King, and Soneji.” Journal of Economic Perspectives, 30, 2, Pp. 245-248.

This is a response to Peter Diamond's comment on a two-paragraph passage in our article: Konstantin Kashin, Gary King, and Samir Soneji. 2015. “Systematic Bias and Nontransparency in US Social Security Administration Forecasts.” Journal of Economic Perspectives, 29, 2, Pp. 239-258.

Statistical Security for Social Security
Samir Soneji and Gary King. 2012. “Statistical Security for Social Security.” Demography, 49, 3, Pp. 1037-1060.

The financial viability of Social Security, the single largest U.S. Government program, depends on accurate forecasts of the solvency of its intergenerational trust fund. We begin by detailing information necessary for replicating the Social Security Administration’s (SSA’s) forecasting procedures, which until now has been unavailable in the public domain. We then offer a way to improve the quality of these procedures via age- and sex-specific mortality forecasts. The most recent SSA mortality forecasts were based on the best available technology at the time, which was a combination of linear extrapolation and qualitative judgments. Unfortunately, linear extrapolation excludes known risk factors and is inconsistent with long-standing demographic patterns, such as the smoothness of age profiles. Modern statistical methods typically outperform even the best qualitative judgments in these contexts. We show how to use such methods here, enabling researchers to forecast using far more information, such as the known risk factors of smoking and obesity and known demographic patterns. Including this extra information makes a substantial difference: for example, by improving the mortality forecasting methods alone, we predict three fewer years of net surplus, $730 billion less in the Social Security trust funds, and program costs that are greater by 0.66% of projected taxable payroll compared to SSA projections by 2031. More important than the specific numerical estimates are the advantages of transparency, replicability, reduction of uncertainty, and what may be the resulting lower vulnerability to the politicization of program forecasts. In addition, by offering software and detailed replication information with this paper, we hope to marshal the efforts of the research community to include ever more informative inputs and to continue to reduce the uncertainties in Social Security forecasts.

This work builds on our article providing forecasts of US mortality rates (see King and Soneji, The Future of Death in America), a book developing improved methods for forecasting mortality (Girosi and King, Demographic Forecasting), all the data we used (King and Soneji, replication data sets), and open source software that implements the methods (Girosi and King, YourCast). Also available are a New York Times op-ed based on this work (King and Soneji, Social Security: It’s Worse Than You Think) and a replication data set for the op-ed (King and Soneji, replication data set).

The Future of Death in America
Gary King and Samir Soneji. 2011. “The Future of Death in America.” Demographic Research, 25, 1, Pp. 1-38.

Population mortality forecasts are widely used for allocating public health expenditures, setting research priorities, and evaluating the viability of public pensions, private pensions, and health care financing systems. In part because existing methods seem to forecast worse when based on more information, most forecasts are still based on simple linear extrapolations that ignore known biological risk factors and other prior information. We adapt a Bayesian hierarchical forecasting model capable of including more known health and demographic information than has previously been possible. This leads to the first age- and sex-specific forecasts of American mortality that simultaneously incorporate, in a formal statistical model, the effects of the recent rapid increase in obesity, the steady decline in tobacco consumption, and the well known patterns of smooth mortality age profiles and time trends. Formally including new information in forecasts can matter a great deal. For example, we estimate an increase in male life expectancy at birth from 76.2 years in 2010 to 79.9 years in 2030, which is 1.8 years greater than the U.S. Social Security Administration projection and 1.5 years more than U.S. Census projection. For females, we estimate more modest gains in life expectancy at birth over the next twenty years from 80.5 years to 81.9 years, which is virtually identical to the Social Security Administration projection and 2.0 years less than U.S. Census projections. We show that these patterns are also likely to greatly affect the aging American population structure. We offer an easy-to-use approach so that researchers can include other sources of information and potentially improve on our forecasts too.

Demographic Forecasting
Federico Girosi and Gary King. 2008. Demographic Forecasting. Princeton: Princeton University Press.

We introduce a new framework for forecasting age-sex-country-cause-specific mortality rates that incorporates considerably more information, and thus has the potential to forecast much better, than any existing approach. Mortality forecasts are used in a wide variety of academic fields, and for global and national health policy making, medical and pharmaceutical research, and social security and retirement planning.

As it turns out, the tools we developed in pursuit of this goal also have broader statistical implications, in addition to their use for forecasting mortality or other variables with similar statistical properties. First, our methods make it possible to include different explanatory variables in a time series regression for each cross-section, while still borrowing strength from one regression to improve the estimation of all. Second, we show that many existing Bayesian (hierarchical and spatial) models with explanatory variables use prior densities that incorrectly formalize prior knowledge. Many demographers and public health researchers have fortuitously avoided this problem so prevalent in other fields by using prior knowledge only as an ex post check on empirical results, but this approach excludes considerable information from their models. We show how to incorporate this demographic knowledge into a model in a statistically appropriate way. Finally, we develop a set of tools useful for developing models with Bayesian priors in the presence of partial prior ignorance. This approach also provides many of the attractive features claimed by the empirical Bayes approach, but fully within the standard Bayesian theory of inference.
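As a generic illustration of what “borrowing strength” across cross-sections with different explanatory variables can mean, here is a toy partial-pooling sketch; it uses simple precision-weighted shrinkage on simulated data and is not the Bayesian prior framework developed in the book -- every variable and number in it is invented.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 30                              # years of simulated data
t = np.arange(T)

# Two cross-sections (say, two age groups), each with its own covariate.
x_a = rng.normal(size=T)            # e.g., a smoking-related covariate (hypothetical)
x_b = rng.normal(size=T)            # e.g., an obesity-related covariate (hypothetical)
y_a = 0.9 + 0.031 * t + 0.5 * x_a + rng.normal(0, 0.3, T)
y_b = 1.2 + 0.029 * t + 0.8 * x_b + rng.normal(0, 0.3, T)

def ols(y, X):
    """OLS coefficients and their estimated variances."""
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * np.linalg.inv(X.T @ X).diagonal()

# Separate regressions: each cross-section keeps its own covariate...
Xa = np.column_stack([np.ones(T), t, x_a])
Xb = np.column_stack([np.ones(T), t, x_b])
(beta_a, var_a), (beta_b, var_b) = ols(y_a, Xa), ols(y_b, Xb)

# ...but the time-trend coefficients (index 1) are shrunk toward their common mean,
# with precision weights -- a crude stand-in for a prior that the trends are similar.
trends = np.array([beta_a[1], beta_b[1]])
vars_ = np.array([var_a[1], var_b[1]])
tau2 = max(trends.var(ddof=1) - vars_.mean(), 1e-6)   # between-section variance
w = tau2 / (tau2 + vars_)                             # shrinkage weights
pooled_mean = np.average(trends, weights=1 / vars_)
shrunk = w * trends + (1 - w) * pooled_mean
print("separate trends:", trends, " partially pooled:", shrunk)
```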

Incumbency Advantage

Proof that previously used estimators of electoral incumbency advantage were biased, and a new unbiased estimator. Also, the first systematic demonstration that constituency service by legislators increases the incumbency advantage.
If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model
Danny Ebanks, Jonathan N. Katz, and Gary King. Working Paper. “If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model”.

Political scientists forecast elections, not primarily to satisfy public interest, but to validate statistical models used for estimating many quantities of scholarly interest. Although scholars have learned a great deal from these models, they can be embarrassingly overconfident: Events that should occur once in 10,000 elections occur almost every year, and even those that should occur once in a trillion-trillion elections are sometimes observed. We develop a novel generative statistical model of US congressional elections, 1954-2020, and validate it with extensive out-of-sample tests. The generatively accurate descriptive summaries provided by this model demonstrate that the 1950s were as partisan and differentiated as the current period, but with parties not based on ideological differences as they are today. The model also shows that even though the size of the incumbency advantage has varied tremendously over time, the risk of an in-party incumbent losing a midterm election contest has been high and essentially constant over at least the last two-thirds of a century.

Please see "How American Politics Ensures Electoral Accountability in Congress," which supersedes this paper.
 

How to Estimate the Electoral Advantage of Incumbency

Estimating Incumbency Advantage Without Bias
Proves that all previous measures of incumbency advantage in the congressional elections literature were biased or inconsistent, and develops an unbiased estimator based on a simple linear regression model. Andrew Gelman and Gary King. 1990. “Estimating Incumbency Advantage Without Bias.” American Journal of Political Science, 34, Pp. 1142–1164.
In this paper we prove theoretically and demonstrate empirically that all existing measures of incumbency advantage in the congressional elections literature are biased or inconsistent. We then provide an unbiased estimator based on a very simple linear regression model. We apply this new method to congressional elections since 1900, providing the first evidence of a positive incumbency advantage in the first half of the century.
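A minimal sketch of the kind of regression just described, assuming statsmodels and a hypothetical district-level data file with self-explanatory column names; see the article for the actual specification, variable coding, and data.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical district-level data for one pair of elections (t-1, t):
#   dem_share:      Democratic share of the two-party vote at t
#   dem_share_lag:  Democratic share at t-1
#   party:          +1 if a Democrat held the seat after t-1, -1 if a Republican did
#   incumbency:     +1 Democratic incumbent running, -1 Republican incumbent, 0 open seat
df = pd.read_csv("house_elections.csv")   # hypothetical file name

model = smf.ols("dem_share ~ dem_share_lag + party + incumbency", data=df).fit()
print(model.params["incumbency"])   # the estimated incumbency advantage, in vote share
```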
A Statistical Model for Multiparty Electoral Data
A general purpose method for analyzing multiparty electoral data, including estimating the incumbency advantage. Jonathan Katz and Gary King. 1999. “A Statistical Model for Multiparty Electoral Data.” American Political Science Review, 93, Pp. 15–32.
We propose a comprehensive statistical model for analyzing multiparty, district-level elections. This model, which provides a tool for comparative politics research analogous to that which regression analysis provides in the American two-party context, can be used to explain or predict how geographic distributions of electoral results depend upon economic conditions, neighborhood ethnic compositions, campaign spending, and other features of the election campaign or aggregate areas. We also provide new graphical representations for data exploration, model evaluation, and substantive interpretation. We illustrate the use of this model by attempting to resolve a controversy over the size of and trend in electoral advantage of incumbency in Britain. Contrary to previous analyses, all based on measures now known to be biased, we demonstrate that the advantage is small but meaningful, varies substantially across the parties, and is not growing. Finally, we show how to estimate the party from which each party’s advantage is predominantly drawn.
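As a small illustration of the compositional-data problem this kind of model addresses, the sketch below shows the additive log-ratio transformation that maps multiparty vote shares onto unbounded quantities suitable for regression; the shares are made up, and the multivariate model the article builds on top of the transformed data is not reproduced here.

```python
import numpy as np

# Hypothetical district-level vote shares for three parties (each row sums to 1).
shares = np.array([
    [0.45, 0.40, 0.15],
    [0.50, 0.30, 0.20],
    [0.35, 0.45, 0.20],
])

def additive_log_ratio(shares):
    """Map K-party shares on the simplex to K-1 log-ratios against the last party."""
    ref = shares[:, -1:]                 # last party as the reference category
    return np.log(shares[:, :-1] / ref)

Y = additive_log_ratio(shares)           # shape: (districts, parties - 1)
print(Y)
# Y can now be modeled with a multivariate regression on district covariates, and
# fitted values mapped back to vote shares via the inverse (softmax-like) transform.
```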
A Fast, Easy, and Efficient Estimator for Multiparty Electoral Data
A generalization of the previous article, more practical for larger numbers of parties (part of a symposium in the same issue of Political Analysis). James Honaker, Gary King, and Jonathan N. Katz. 2002. “A Fast, Easy, and Efficient Estimator for Multiparty Electoral Data.” Political Analysis, 10, Pp. 84–100.
Katz and King (1999) develop a model for predicting or explaining aggregate electoral results in multiparty democracies. This model is, in principle, analogous to what least squares regression provides American politics researchers in that two-party system. Katz and King applied this model to three-party elections in England and revealed a variety of new features of incumbency advantage and where each party pulls support from. Although the mathematics of their statistical model covers any number of political parties, it is computationally very demanding, and hence slow and numerically imprecise, with more than three. The original goal of our work was to produce an approximate method that works quicker in practice with many parties without making too many theoretical compromises. As it turns out, the method we offer here improves on Katz and King’s (in bias, variance, numerical stability, and computational speed) even when the latter is computationally feasible. We also offer easy-to-use software that implements our suggestions.

Causes and Consequences

Systemic Consequences of Incumbency Advantage in the U.S. House
Gary King and Andrew Gelman. 1991. “Systemic Consequences of Incumbency Advantage in the U.S. House.” American Journal of Political Science, 35, Pp. 110–138.
The dramatic increase in the electoral advantage of incumbency has sparked widespread interest among congressional researchers over the last 15 years. Although many scholars have studied the advantages of incumbency for incumbents, few have analyzed its effects on the underlying electoral system. We examine the influence of the incumbency advantage on two features of the electoral system in U.S. House elections: electoral responsiveness and partisan bias. Using a district-level seats-votes model of House elections, we are able to distinguish systematic changes from unique, election-specific variations. Our results confirm the significant drop in responsiveness, and an even steeper decline outside the South, over the past 40 years. Contrary to expectations, we find that increased incumbency advantage explains less than a third of this trend, indicating that some other unknown factor is responsible. Moreover, our analysis also reveals another dramatic pattern, largely overlooked in the congressional literature: in the 1940s and 1950s the electoral system was severely biased in favor of the Republican party. The system shifted incrementally from this severe Republican bias over the next several decades to a moderate Democratic bias by the mid-1980s. Interestingly, changes in incumbency advantage explain virtually all of this trend in partisan bias since the 1940s. By removing incumbency advantage and the existing configuration of incumbents and challengers analytically, our analysis reveals an underlying electoral system that remains consistently biased in favor of the Republican party. Thus, our results indicate that incumbency advantage affects the underlying electoral system, but contrary to conventional wisdom, this changes the trend in partisan bias more than electoral responsiveness.
Constituency Service and Incumbency Advantage
The first systematic evidence that constituency service increases the electoral advantage of incumbency. Gary King. 1991. “Constituency Service and Incumbency Advantage.” British Journal of Political Science, 21, Pp. 119–128.
This Note addresses the long-standing discrepancy between scholarly support for the effect of constituency service on incumbency advantage and a large body of contradictory empirical evidence. I show first that many of the methodological problems noticed in past research reduce to a single methodological problem that is readily resolved. The core of this Note then provides among the first systematic empirical evidence for the constituency service hypothesis. Specifically, an extra $10,000 added to the budget of the average state legislator gives this incumbent an additional 1.54 percentage points in the next election (with a 95% confidence interval of 1.14 to 1.94 percentage points).

Data

Chinese Censorship

We reverse-engineer Chinese information controls -- the most extensive effort to selectively control human expression in the history of the world. We show that this massive effort to slow the flow of information paradoxically also conveys a great deal about the intentions, goals, and actions of the leaders. We downloaded all Chinese social media posts before the government could read and censor them; wrote and posted comments, randomly assigned to our categories, on hundreds of websites across the country to see what would be censored; set up our own social media website in China; discovered that the Chinese government fabricates and posts 450 million social media comments a year in the names of ordinary people; and convinced those posting them (and inadvertently even the government) to admit to their activities. We found that the government does not engage on controversial issues (they do not censor criticism or fabricate posts that argue with those who disagree with the government), but they respond on an emergency basis to stop collective action (with censorship, fabricated posts delivering giant bursts of cheerleading-type distractions, responses to citizen grievances, etc.). They don't care what you think of them or say about them; they only care what you can do.

How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument
Gary King, Jennifer Pan, and Margaret E. Roberts. 2017. “How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument.” American Political Science Review, 111, 3, Pp. 484-501.

The Chinese government has long been suspected of hiring as many as 2,000,000 people to surreptitiously insert huge numbers of pseudonymous and other deceptive writings into the stream of real social media posts, as if they were the genuine opinions of ordinary people. Many academics, and most journalists and activists, claim that these so-called “50c party” posts vociferously argue for the government's side in political and policy debates. As we show, this is also true of the vast majority of posts openly accused on social media of being 50c. Yet, almost no systematic empirical evidence exists for this claim, or, more importantly, for the Chinese regime's strategic objective in pursuing this activity. In the first large scale empirical analysis of this operation, we show how to identify the secretive authors of these posts, the posts written by them, and their content. We estimate that the government fabricates and posts about 448 million social media comments a year. In contrast to prior claims, we show that the Chinese regime's strategy is to avoid arguing with skeptics of the party and the government, and to not even discuss controversial issues. We show that the goal of this massive secretive operation is instead to distract the public and change the subject, as most of these posts involve cheerleading for China, the revolutionary history of the Communist Party, or other symbols of the regime. We discuss how these results fit with what is known about the Chinese censorship program, and suggest how they may change our broader theoretical understanding of “common knowledge” and information control in authoritarian regimes.

This paper is related to our articles in Science, “Reverse-Engineering Censorship In China: Randomized Experimentation And Participant Observation”, and the American Political Science Review, “How Censorship In China Allows Government Criticism But Silences Collective Expression”.

Reverse-engineering censorship in China: Randomized experimentation and participant observation
Gary King, Jennifer Pan, and Margaret E. Roberts. 2014. “Reverse-engineering censorship in China: Randomized experimentation and participant observation.” Science, 345, 6199, Pp. 1-10.

Existing research on the extensive Chinese censorship organization uses observational methods with well-known limitations. We conducted the first large-scale experimental study of censorship by creating accounts on numerous social media sites, randomly submitting different texts, and observing from a worldwide network of computers which texts were censored and which were not. We also supplemented interviews with confidential sources by creating our own social media site, contracting with Chinese firms to install the same censoring technologies as existing sites, and—with their software, documentation, and even customer support—reverse-engineering how it all works. Our results offer rigorous support for the recent hypothesis that criticisms of the state, its leaders, and their policies are published, whereas posts about real-world events with collective action potential are censored.

How Censorship in China Allows Government Criticism but Silences Collective Expression
Gary King, Jennifer Pan, and Margaret E. Roberts. 2013. “How Censorship in China Allows Government Criticism but Silences Collective Expression.” American Political Science Review, 107, 2 (May), Pp. 1-18.

We offer the first large scale, multiple source analysis of the outcome of what may be the most extensive effort to selectively censor human expression ever implemented. To do this, we have devised a system to locate, download, and analyze the content of millions of social media posts originating from nearly 1,400 different social media services all over China before the Chinese government is able to find, evaluate, and censor (i.e., remove from the Internet) the large subset they deem objectionable. Using modern computer-assisted text analytic methods that we adapt to and validate in the Chinese language, we compare the substantive content of posts censored to those not censored over time in each of 85 topic areas. Contrary to previous understandings, posts with negative, even vitriolic, criticism of the state, its leaders, and its policies are not more likely to be censored. Instead, we show that the censorship program is aimed at curtailing collective action by silencing comments that represent, reinforce, or spur social mobilization, regardless of content. Censorship is oriented toward attempting to forestall collective activities that are occurring now or may occur in the future --- and, as such, seems to clearly expose government intent.

Mexican Health Care Evaluation

An evaluation of the Mexican Seguro Popular program (designed to extend health insurance, regular and preventive medical care, pharmaceuticals, and health facilities to 50 million uninsured Mexicans), one of the world's largest health policy reforms of the last two decades. Our evaluation features a new design for field experiments that is more robust to the political interventions and implementation errors that have ruined many similar previous efforts; new statistical methods that produce more reliable and efficient results using fewer resources, assumptions, and data, as well as standard errors that are as much as six times smaller; and an implementation of these methods in the largest randomized health policy experiment to date. (See the Harvard Gazette story on this project.)


A "Politically Robust" Experimental Design for Public Policy Evaluation, with Application to the Mexican Universal Health Insurance Program
1. The evaluation design: Gary King, Emmanuela Gakidou, Nirmala Ravishankar, Ryan T Moore, Jason Lakin, Manett Vargas, Martha María Téllez-Rojo, Juan Eugenio Hernández Ávila, Mauricio Hernández Ávila, and Héctor Hernández Llamas. 2007. “A "Politically Robust" Experimental Design for Public Policy Evaluation, with Application to the Mexican Universal Health Insurance Program.” Journal of Policy Analysis and Management, 26, Pp. 479-506.

We develop an approach to conducting large-scale randomized public policy experiments that is intended to be more robust to the political interventions that have ruined some or all parts of many similar previous efforts. Our proposed design is insulated from selection bias in some circumstances even if we lose observations, our inferences can still be unbiased even if politics disrupts any two of the three steps in our analytical procedures, and other empirical checks are available to validate the overall design. We illustrate with a design and empirical validation of an evaluation of the Mexican Seguro Popular de Salud (Universal Health Insurance) program we are conducting. Seguro Popular, which is intended to grow to provide medical care, drugs, preventative services, and financial health protection to the 50 million Mexicans without health insurance, is one of the largest health reforms of any country in the last two decades. The evaluation is also large in scale, constituting one of the largest policy experiments to date and what may be the largest randomized health policy experiment ever.

The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation
2. The statistical analysis methods: Kosuke Imai, Gary King, and Clayton Nall. 2009. “The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation.” Statistical Science, 24, Pp. 29–53.
A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals---such as households, communities, firms, medical practices, schools, or classrooms---even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary, and that its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level noncompliance, which is common in applications but not allowed for in most existing methods. We show that from the perspective of bias, efficiency, power, robustness, or research costs, and in large or small samples, pairing should be used in cluster-randomized experiments whenever feasible, and failing to do so is equivalent to discarding a considerable fraction of one’s data. We develop these techniques in the context of a randomized evaluation we are conducting of the Mexican Universal Health Insurance Program.
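The design-based idea can be sketched simply: estimate the average treatment effect as a size-weighted average of within-pair differences in cluster means. The snippet below is an illustration under that simplification, with a hypothetical input file and column names; the article's exact estimator and its variance calculation are not reproduced here.

```python
import pandas as pd

# Hypothetical input: one row per cluster, with columns
#   pair (pair id), treated (0/1), n (cluster size), y_mean (cluster mean outcome).
df = pd.read_csv("clusters.csv")

def paired_cluster_ate(df):
    """Size-weighted average of within-pair differences in cluster means."""
    effects, weights = [], []
    for _, pair in df.groupby("pair"):
        treated = pair[pair["treated"] == 1].iloc[0]
        control = pair[pair["treated"] == 0].iloc[0]
        effects.append(treated["y_mean"] - control["y_mean"])
        weights.append(pair["n"].sum())      # weight each pair by its total size
    effects, weights = pd.Series(effects), pd.Series(weights)
    return (effects * weights).sum() / weights.sum()

print(paired_cluster_ate(df))
```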
Matched Pairs and the Future of Cluster-Randomized Experiments: A Rejoinder
2a. Comments from four scholars on the previous article, and our rejoinder: Kosuke Imai, Gary King, and Clayton Nall. 2009. “Matched Pairs and the Future of Cluster-Randomized Experiments: A Rejoinder.” Statistical Science, 24, Pp. 64–72.


Public Policy for the Poor? A Randomised Assessment of the Mexican Universal Health Insurance Programme
3. The results of the evaluation: Gary King, Emmanuela Gakidou, Kosuke Imai, Jason Lakin, Ryan T Moore, Clayton Nall, Nirmala Ravishankar, Manett Vargas, Martha María Téllez-Rojo, Juan Eugenio Hernández Ávila, Mauricio Hernández Ávila, and Héctor Hernández Llamas. 2009. “Public Policy for the Poor? A Randomised Assessment of the Mexican Universal Health Insurance Programme.” The Lancet, 373, Pp. 1447-1454.

Background: We assessed aspects of Seguro Popular, a programme aimed at delivering health insurance, regular and preventive medical care, medicines, and health facilities to 50 million uninsured Mexicans. Methods: We randomly assigned treatment within 74 matched pairs of health clusters---i.e., health facility catchment areas---representing 118,569 households in seven Mexican states, and measured outcomes in a baseline survey (August to September 2005) and a follow-up survey 10 months later (July to August 2006) in 50 pairs (n=32,515). The treatment consisted of encouragement to enrol in a health-insurance programme and upgraded medical facilities. Participant states also received funds to improve health facilities and to provide medications for services in treated clusters. We estimated intention-to-treat and complier average causal effects non-parametrically. Findings: Intention-to-treat estimates indicated a 23% reduction from baseline in catastrophic expenditures (1.9 percentage points; 95% CI 0.14-3.66). The effect in poor households was 3.0 percentage points (0.46-5.54) and in experimental compliers was 6.5 percentage points (1.65-11.28), 30% and 59% reductions, respectively. The intention-to-treat effect on health spending in poor households was 426 pesos (39-812), and the complier average causal effect was 915 pesos (147-1684). Contrary to expectations and previous observational research, we found no effects on medication spending, health outcomes, or utilisation. Interpretation: Programme resources reached the poor. However, the programme did not show some other effects, possibly due to the short duration of treatment (10 months). Although Seguro Popular seems to be successful at this early stage, further experiments and follow-up studies, with longer assessment periods, are needed to ascertain the long-term effects of the programme.

Related Research

Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Experiments
Kosuke Imai, Gary King, and Carlos Velasco Rivera. 1/31/2020. “Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Experiments.” Journal of Politics, 81, 2, Pp. 714-730.

A vast literature demonstrates that voters around the world who benefit from their governments' discretionary spending cast more ballots for the incumbent party than those who do not benefit. But contrary to most theories of political accountability, some suggest that voters also reward incumbent parties for implementing "programmatic" spending legislation, over which incumbents have no discretion, and even when passed with support from all major parties. Why voters would attribute responsibility when none exists is unclear, as is why minority party legislators would approve of legislation that would cost them votes. We study the electoral effects of two large prominent programmatic policies that fit the ideal type especially well, with unusually large scale experiments that bring more evidence to bear on this question than has previously been possible. For the first policy, we design and implement ourselves one of the largest randomized social experiments ever. For the second policy, we reanalyze studies that used a large scale randomized experiment and a natural experiment to study the same question but came to opposite conclusions. Using corrected data and improved statistical methods, we show that the evidence from all analyses of both policies is consistent: programmatic policies have no effect on voter support for incumbents. We conclude by discussing how the many other studies in the literature may be interpreted in light of our results.

Presidency Research; Voting Behavior

Resolution of the paradox of why polls are so variable over time during presidential campaigns even though the vote outcome is easily predictable before the campaign starts. Also, a resolution of a key controversy over absentee ballots during the 2000 presidential election, and the methodology of small-n research on executives.
Expert Report of Gary King, in Bowyer et al. v. Ducey (Governor) et al., US District Court, District of Arizona
Gary King. 2020. “Expert Report of Gary King, in Bowyer et al. v. Ducey (Governor) et al., US District Court, District of Arizona”.

In this report, I evaluate evidence described and conclusions drawn in several Exhibits in this case offered by the Plaintiffs. I conclude that the evidence is insufficient to support conclusions about election fraud. Throughout, the authors break the chain of evidence repeatedly – from the 2020 election, to the data analyzed, to the quantitative results presented, to the conclusions drawn – and as such cannot be relied on. In addition, the Exhibits make many crucial assumptions without justification, discussion, or even recognition – each of which can lead to substantial bias that went unrecognized and uncorrected. The data analytic and statistical procedures used in the Exhibits for data provenance, data analysis, replication information, and statistical analysis all violate professional standards and should be disregarded.

The Court's ruling in this case concluded "Not only have Plaintiffs failed to provide the Court with factual support for their extraordinary claims, but they have wholly failed to establish that they have standing for the Court to consider them. Allegations that find favor in the public sphere of gossip and innuendo cannot be a substitute for earnest pleadings and procedure in federal court. They most certainly cannot be the basis for upending Arizona’s 2020 General Election. The Court is left with no alternative but to dismiss this matter in its entirety."

[Thanks to Soubhik Barari for research assistance.]

How the news media activate public expression and influence national agendas
Gary King, Benjamin Schneer, and Ariel White. 11/10/2017. “How the news media activate public expression and influence national agendas.” Science, 358, Pp. 776-780.

We demonstrate that exposure to the news media causes Americans to take public stands on specific issues, join national policy conversations, and express themselves publicly—all key components of democratic politics—more often than they would otherwise. After recruiting 48 mostly small media outlets, we chose groups of these outlets to write and publish articles on subjects we approved, on dates we randomly assigned. We estimated the causal effect on proximal measures, such as website pageviews and Twitter discussion of the articles’ specific subjects, and distal ones, such as national Twitter conversation in broad policy areas. Our intervention increased discussion in each broad policy area by approximately 62.7% (relative to a day’s volume), accounting for 13,166 additional posts over the treatment week, with similar effects across population subgroups. 

On the Science website: Abstract, Reprint, Full Text, and a comment by Matthew Gentzkow, “Small media, big impact”.

 

 

Voting Behavior

If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model
Danny Ebanks, Jonathan N. Katz, and Gary King. Working Paper. “If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model”.

Political scientists forecast elections, not primarily to satisfy public interest, but to validate statistical models used for estimating many quantities of scholarly interest. Although scholars have learned a great deal from these models, they can be embarrassingly overconfident: Events that should occur once in 10,000 elections occur almost every year, and even those that should occur once in a trillion-trillion elections are sometimes observed. We develop a novel generative statistical model of US congressional elections, 1954-2020, and validate it with extensive out-of-sample tests. The generatively accurate descriptive summaries provided by this model demonstrate that the 1950s were as partisan and differentiated as the current period, but with parties not based on ideological differences as they are today. The model also shows that even though the size of the incumbency advantage has varied tremendously over time, the risk of an in-party incumbent losing a midterm election contest has been high and essentially constant over at least the last two-thirds of a century.

Please see "How American Politics Ensures Electoral Accountability in Congress," which supersedes this paper.
 

Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Experiments
Kosuke Imai, Gary King, and Carlos Velasco Rivera. 1/31/2020. “Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Experiments.” Journal of Politics, 81, 2, Pp. 714-730.

A vast literature demonstrates that voters around the world who benefit from their governments' discretionary spending cast more ballots for the incumbent party than those who do not benefit. But contrary to most theories of political accountability, some suggest that voters also reward incumbent parties for implementing "programmatic" spending legislation, over which incumbents have no discretion, and even when passed with support from all major parties. Why voters would attribute responsibility when none exists is unclear, as is why minority party legislators would approve of legislation that would cost them votes. We study the electoral effects of two large prominent programmatic policies that fit the ideal type especially well, with unusually large scale experiments that bring more evidence to bear on this question than has previously been possible. For the first policy, we design and implement ourselves one of the largest randomized social experiments ever. For the second policy, we reanalyze studies that used a large scale randomized experiment and a natural experiment to study the same question but came to opposite conclusions. Using corrected data and improved statistical methods, we show that the evidence from all analyses of both policies is consistent: programmatic policies have no effect on voter support for incumbents. We conclude by discussing how the many other studies in the literature may be interpreted in light of our results.

Estimating Partisan Bias of the Electoral College Under Proposed Changes in Elector Apportionment
AC Thomas, Andrew Gelman, Gary King, and Jonathan N. Katz. 2012. “Estimating Partisan Bias of the Electoral College Under Proposed Changes in Elector Apportionment.” Statistics, Politics, and Policy, Pp. 1-13.

In the election for President of the United States, the Electoral College is the body whose members vote to elect the President directly. Each state sends a number of electors equal to its total number of representatives and senators in Congress; all but two states (Nebraska and Maine) assign electors pledged to the candidate who wins the state's plurality vote. We investigate the effect on presidential elections if states were to assign their electoral votes according to results in each congressional district, and conclude that the direct popular vote and the current Electoral College are both substantially fairer than alternatives in which states divide their electoral votes by congressional district.

Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler
Gary King, Ori Rosen, Martin Tanner, and Alexander Wagner. 2008. “Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler.” Journal of Economic History, 68, 4, Pp. 996.

The enormous Nazi voting literature rarely builds on modern statistical or economic research. By adding these approaches, we find that the most widely accepted existing theories of this era cannot distinguish the Weimar elections from almost any others in any country. Via a retrospective voting account, we show that voters most hurt by the depression, and most likely to oppose the government, fall into separate groups with divergent interests. This explains why some turned to the Nazis and others turned away. The consequences of Hitler's election were extraordinary, but the voting behavior that led to it was not.

On Party Platforms, Mandates, and Government Spending
Gary King and Michael Laver. 1993. “On Party Platforms, Mandates, and Government Spending.” American Political Science Review, 87, Pp. 744–750.

In their 1990 Review article, Ian Budge and Richard Hofferbert analyzed the relationship between party platform emphases, control of the White House, and national government spending priorities, reporting strong evidence of a "party mandate" connection between them. Gary King and Michael Laver successfully replicate the original analysis, critique the interpretation of the causal effects, and present a reanalysis showing that platforms have small or nonexistent effects on spending. In response, Budge, Hofferbert, and Michael McDonald agree that their language was somewhat inconsistent on both interactions and causality but defend their conceptualization of "mandates" as involving only an association, not necessarily a causal connection, between party commitments and government policy. Hence, while the causes of government policy are of interest, noncausal associations are sufficient as evidence of party mandates in American politics.

Why are American Presidential Election Campaign Polls so Variable when Votes are so Predictable?
Resolution of a paradox in the study of American voting behavior. Andrew Gelman and Gary King. 1993. “Why are American Presidential Election Campaign Polls so Variable when Votes are so Predictable?” British Journal of Political Science, 23, Pp. 409–451.

As most political scientists know, the outcome of the U.S. Presidential election can be predicted within a few percentage points (in the popular vote), based on information available months before the election. Thus, the general election campaign for president seems irrelevant to the outcome (except in very close elections), despite all the media coverage of campaign strategy. However, it is also well known that the pre-election opinion polls can vary wildly over the campaign, and this variation is generally attributed to events in the campaign. How can campaign events affect people’s opinions on whom they plan to vote for, and yet not affect the outcome of the election? For that matter, why do voters consistently increase their support for a candidate during his nominating convention, even though the conventions are almost entirely predictable events whose effects can be rationally forecast? In this exploratory study, we consider several intuitively appealing, but ultimately wrong, resolutions to this puzzle, and discuss our current understanding of what causes opinion polls to fluctuate and yet reach a predictable outcome. Our evidence is based on graphical presentation and analysis of over 67,000 individual-level responses from forty-nine commercial polls during the 1988 campaign and many other aggregate poll results from the 1952–1992 campaigns. We show that responses to pollsters during the campaign are not generally informed or even, in a sense we describe, "rational." In contrast, voters decide which candidate to eventually support based on their enlightened preferences, as formed by the information they have learned during the campaign, as well as basic political cues such as ideology and party identification. We cannot prove this conclusion, but we do show that it is consistent with the aggregate forecasts and individual-level opinion poll responses. Based on the enlightened preferences hypothesis, we conclude that the news media have an important effect on the outcome of Presidential elections---not due to misleading advertisements, sound bites, or spin doctors, but rather by conveying candidates’ positions on important issues.

Party Competition and Media Messages in U.S. Presidential Election Campaigns
A popular version of the previous article. Andrew Gelman, Gary King, and Sandy L Maisel. 1994. “Party Competition and Media Messages in U.S. Presidential Election Campaigns.” In The Parties Respond: Changes in the American Party System, Pp. 255-295. Boulder, Colorado: Westview Press.Abstract

At one point during the 1988 campaign, Michael Dukakis was ahead in the public opinion polls by 17 percentage points, but he eventually lost the election by 8 percent. Walter Mondale was ahead in the polls by 4 percent during the 1984 campaign but lost the election in a landslide. During June and July of 1992, Clinton, Bush, and Perot each had turns in the public opinion poll lead. What explains all this poll variation? Why do so many citizens change their minds so quickly about presidential choices?

Estimating the Probability of Events that Have Never Occurred: When Is Your Vote Decisive?
The first extensive empirical study of the probability of your vote changing the outcome of a U.S. presidential election. Most previous studies of the probability of a tied vote have involved theoretical calculation without data. Andrew Gelman, Gary King, and John Boscardin. 1998. “Estimating the Probability of Events that Have Never Occurred: When Is Your Vote Decisive?” Journal of the American Statistical Association, 93, Pp. 1–9.Abstract
Researchers sometimes argue that statisticians have little to contribute when few realizations of the process being estimated are observed. We show that this argument is incorrect even in the extreme situation of estimating the probabilities of events so rare that they have never occurred. We show how statistical forecasting models allow us to use empirical data to improve inferences about the probabilities of these events. Our application is estimating the probability that your vote will be decisive in a U.S. presidential election, a problem that has been studied by political scientists for more than two decades. The exact value of this probability is of only minor interest, but the number has important implications for understanding the optimal allocation of campaign resources, whether states and voter groups receive their fair share of attention from prospective presidents, and how formal "rational choice" models of voter behavior might be able to explain why people vote at all. We show how the probability of a decisive vote can be estimated empirically from state-level forecasts of the presidential election and illustrate with the example of 1992. Based on generalizations of standard political science forecasting models, we estimate the (prospective) probability of a single vote being decisive as about 1 in 10 million for close national elections such as 1992, varying by about a factor of 10 among states. Our results support the argument that subjective probabilities of many types are best obtained through empirically based statistical prediction models rather than solely through mathematical reasoning. We discuss the implications of our findings for the types of decision analyses used in public choice studies.
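As a rough illustration of the kind of calculation involved (not the authors' model), the chance that one additional vote flips a state can be approximated from a probabilistic forecast of that state's two-party vote share; all numbers in the sketch below are invented for illustration.

# Back-of-the-envelope sketch in Python, assuming a normal predictive
# distribution for a state's two-party vote share (illustrative values only).
from scipy.stats import norm

def prob_vote_decisive_in_state(mean_share, sd_share, n_voters):
    # P(state exactly tied) is approximately the forecast density of the
    # vote share evaluated at 0.5, divided by the number of voters.
    return norm.pdf(0.5, loc=mean_share, scale=sd_share) / n_voters

# A hypothetical close state: 5 million voters, forecast 50% +/- 3%.
p_state_tied = prob_vote_decisive_in_state(0.50, 0.03, 5_000_000)
# Multiplying by the probability that this state's electoral votes decide the
# national outcome (estimated from the full set of state forecasts) gives the
# overall probability that a single vote is decisive.
print(p_state_tied)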
No Evidence on Directional vs. Proximity Voting
Proves that the extensive debate between supporters of the directional and proximity models of voting has been based on theoretical specification rather than empirical evidence. Jeffrey Lewis and Gary King. 1999. “No Evidence on Directional vs. Proximity Voting.” Political Analysis, 8, Pp. 21–33.Abstract
The directional and proximity models offer dramatically different theories for how voters make decisions and fundamentally divergent views of the supposed microfoundations on which vast bodies of literature in theoretical rational choice and empirical political behavior have been built. We demonstrate here that the empirical tests in the large and growing body of literature on this subject amount to theoretical debates about which statistical assumption is right. The key statistical assumptions have not been empirically tested and, indeed, turn out to be effectively untestable with existing methods and data. Unfortunately, these assumptions are also crucial since changing them leads to different conclusions about voter processes.
Did Illegal Overseas Absentee Ballots Decide the 2000 U.S. Presidential Election?
Resolved for the New York Times a key controversy over the 2000 presidential election. Kosuke Imai and Gary King. 2004. “Did Illegal Overseas Absentee Ballots Decide the 2000 U.S. Presidential Election?” Perspectives on Politics, 2, Pp. 537–549.Abstract

Although not widely known until much later, Al Gore received 202 more votes than George W. Bush on election day in Florida. George W. Bush is president because he overcame his election day deficit with overseas absentee ballots that arrived and were counted after election day. In the final official tally, Bush received 537 more votes than Gore. These numbers are taken from the official results released by the Florida Secretary of State's office and so do not reflect overvotes, undervotes, unsuccessful litigation, butterfly ballot problems, recounts that might have been allowed but were not, or any other hypothetical divergence between voter preferences and counted votes. After the election, the New York Times conducted a six month long investigation and found that 680 of the overseas absentee ballots were illegally counted, and no partisan, pundit, or academic has publicly disagreed with their assessment. In this paper, we describe the statistical procedures we developed and implemented for the Times to ascertain whether disqualifying these 680 ballots would have changed the outcome of the election. The methods involve adding formal Bayesian model averaging procedures to King's (1997) ecological inference model. Formal Bayesian model averaging has not been used in political science but is especially useful when substantive conclusions depend heavily on apparently minor but indefensible model choices, when model generalization is not feasible, and when potential critics are more partisan than academic. We show how we derived the results for the Times so that other scholars can use these methods to make ecological inferences for other purposes. We also present a variety of new empirical results that delineate the precise conditions under which Al Gore would have been elected president, and offer new evidence of the striking effectiveness of the Republican effort to convince local election officials to count invalid ballots in Bush counties and not count them in Gore counties.
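For readers unfamiliar with the mechanics, the generic Bayesian model averaging step combines several candidate models' estimates, weighting each by its posterior model probability. The sketch below uses invented numbers and equal prior model probabilities; it is not the article's ecological inference model.

# Generic Bayesian model averaging sketch in Python (illustrative numbers only).
import numpy as np

log_marginal_likelihoods = np.array([-1204.2, -1203.1, -1206.8])  # one per candidate model
estimates = np.array([520.0, 610.0, 455.0])   # each model's estimate of the quantity of interest

log_w = log_marginal_likelihoods - log_marginal_likelihoods.max()
weights = np.exp(log_w) / np.exp(log_w).sum()      # posterior model probabilities (equal priors)
bma_estimate = float((weights * estimates).sum())  # model-averaged estimate
print(weights, bma_estimate)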

The City's Losing Clout
Gerald Benjamin and Gary King. 7/5/1979. “The City's Losing Clout.” New York Times, CXXVIII, 44,269, Pp. A17. Publisher's VersionAbstract
New York City is a modern "rotten borough," not because of population decline, but because of its massive and continuing fall-off in voter participation.  New York City's political base is more apparent than real.

Presidency Research

The Methodology of Presidential Research
Small-n issues in presidency research, and how to resolve the problem. Gary King. 1993. “The Methodology of Presidential Research.” In Researching the Presidency: Vital Questions, New Approaches, edited by George Edwards III, Bert A. Rockman, and John H. Kessel, Pp. 387–412. Pittsburgh: University of Pittsburgh.Abstract
The original purpose of the paper this chapter was based on was to use the Presidency Research Conference’s first-round papers – by John H. Aldrich, Erwin C. Hargrove, Karen M. Hult, Paul Light, and Richard Rose – as my "data." My given task was to analyze the literature ably reviewed by these authors and report what political methodology might have to say about presidency research. I focus in this chapter on the traditional presidency literature, emphasizing research on the president and the office. For the most part, I do not consider research on presidential selection, election, and voting behavior, which has been much more similar to other fields in American politics.
Gary King and Michael Laver. 1999. “Many Publications, but Still No Evidence.” Electoral Studies, 18, Pp. 597–598.Abstract
In 1990, Budge and Hofferbert (B&H) claimed that they had found solid evidence that party platforms cause U.S. budgetary priorities, and thus concluded that mandate theory applies in the United States as strongly as it does elsewhere. This stunning conclusion would imply that virtually every observer of the American party system in this century has been wrong. King and Laver (1993) reanalyzed B&H’s data and demonstrated in two ways that there exists no evidence for a causal relationship. First, accepting their entire statistical model, and correcting only an algebraic error (a mistake in how they computed their standard errors), we showed that their hypothesized relationship holds up in fewer than half the tests they reported. Second, we showed that their statistical model includes a slightly hidden but politically implausible assumption that a new party achieves every budgetary desire immediately upon taking office. We then specified a model without this unrealistic assumption and we found that the assumption was not supported, and that all evidence in the data for platforms causing government budgets evaporated. In their published response to our article, B&H withdrew their key claim and said they were now (in 1993) merely interested in an association and not causation. That is how it was left in 1993—a perfectly amicable resolution as far as we were concerned—since we have no objection to the claim that there is a non-causal or chance association between any two variables. Of course, we see little reason to be interested in non-causal associations in this area any more than in the chance correlation that exists between the winner of the baseball World Series and the party winning the U.S. presidency. Since party mandate theory only makes sense as a causal theory, the conventional wisdom about America’s porous, non-mandate party system stands.

Informatics and Data Sharing

Replication Standards

New standards, protocols, and software for citing, sharing, analyzing, archiving, preserving, distributing, cataloging, translating, disseminating, naming, verifying, and replicating scholarly research data and analyses. Also includes proposals to improve the norms of data sharing and replication in science.
Replication, Replication
"The replication standard holds that sufficient information exists with which to understand, evaluate, and build upon a prior work if a third party can replicate the results without any additional information from the author." This, and the data sharing to support it, was proposed for political science, along with policy suggestions in Gary King. 1995. “Replication, Replication.” PS: Political Science and Politics, 28, Pp. 444-452.Abstract

Political science is a community enterprise and the community of empirical political scientists needs access to the body of data necessary to replicate existing studies to understand, evaluate, and especially build on this work. Unfortunately, the norms we have in place now do not encourage, or in some cases even permit, this aim. Following are suggestions that would facilitate replication and are easy to implement – by teachers, students, dissertation writers, graduate programs, authors, reviewers, funding agencies, and journal and book editors.

Preface: Big Data is Not About the Data!
Gary King. 2016. “Preface: Big Data is Not About the Data!” In Computational Social Science: Discovery and Prediction, edited by R. Michael Alvarez. Cambridge: Cambridge University Press.Abstract

A few years ago, explaining what you did for a living to Dad, Aunt Rose, or your friend from high school was pretty complicated. Answering that you develop statistical estimators, work on numerical optimization, or, even better, are working on a great new Markov Chain Monte Carlo implementation of a Bayesian model with heteroskedastic errors for automated text analysis is pretty much the definition of a conversation stopper.

Then the media noticed the revolution we’re all a part of, and they glued a label to it. Now “Big Data” is what you and I do.  As trivial as this change sounds, we should be grateful for it, as the name seems to resonate with the public and so it helps convey the importance of our field to others better than we had managed to do ourselves. Yet, now that we have everyone’s attention, we need to start clarifying for others -- and ourselves -- what the revolution means. This is much of what this book is about.

Throughout, we need to remember that for the most part, Big Data is not about the data....

Precision mapping child undernutrition for nearly 600,000 inhabited census villages in India
Rockli Kim, Avleen S. Bijral, Yun Xu, Xiuyuan Zhang, Jeffrey C. Blossom, Akshay Swaminathan, Gary King, Alok Kumar, Rakesh Sarwal, Juan M. Lavista Ferres, and S.V. Subramanian. 2021. “Precision mapping child undernutrition for nearly 600,000 inhabited census villages in India.” Proceedings of the National Academy of Sciences, 118, 18, Pp. 1-11. Publisher's VersionAbstract
There are emerging opportunities to assess health indicators at truly small areas with increasing availability of data geocoded to micro geographic units and advanced modeling techniques. The utility of such fine-grained data can be fully leveraged if linked to local governance units that are accountable for implementation of programs and interventions. We used data from the 2011 Indian Census for village-level demographic and amenities features and the 2016 Indian Demographic and Health Survey in a bias-corrected semisupervised regression framework to predict child anthropometric failures for all villages in India. Of the total geographic variation in predicted child anthropometric failure estimates, 54.2 to 72.3% were attributed to the village level followed by 20.6 to 39.5% to the state level. The mean predicted stunting was 37.9% (SD: 10.1%; IQR: 31.2 to 44.7%), and substantial variation was found across villages ranging from less than 5% for 691 villages to over 70% in 453 villages. Estimates at the village level can potentially shift the paradigm of policy discussion in India by enabling more informed prioritization and precise targeting. The proposed methodology can be adapted and applied to diverse population health indicators, and in other contexts, to reveal spatial heterogeneity at a finer geographic scale and identify local areas with the greatest needs and with direct implications for actions to take place.
Indaca
Gary King and Nathaniel Persily. 2019. “A New Model for Industry-Academic Partnerships.” PS: Political Science and Politics, 53, 4, Pp. 703-709. Publisher's VersionAbstract

The mission of the social sciences is to understand and ameliorate society’s greatest challenges. The data held by private companies, collected for different purposes, hold vast potential to further this mission. Yet, because of consumer privacy, trade secrets, proprietary content, and political sensitivities, these datasets are often inaccessible to scholars. We propose a novel organizational model to address these problems. We also report on the first partnership under this model, to study the incendiary issues surrounding the impact of social media on elections and democracy: Facebook provides (privacy-preserving) data access; eight ideologically and substantively diverse charitable foundations provide funding; an organization of academics we created, Social Science One (see SocialScience.One), leads the project; and the Institute for Quantitative Social Science at Harvard and the Social Science Research Council provide logistical help.

A Revised Proposal, Proposal
Comments from nineteen authors and a response to the above: Gary King. 1995. “A Revised Proposal, Proposal.” PS: Political Science and Politics, XXVIII, Pp. 494–499.
Publication, Publication
Gary King. 2006. “Publication, Publication.” PS: Political Science and Politics, 39, Pp. 119–125. Continuing updates to this paperAbstract

I show herein how to write a publishable paper by beginning with the replication of a published article. This strategy seems to work well for class projects in producing papers that ultimately get published, helping to professionalize students into the discipline, and teaching them the scientific norms of the free exchange of academic information. I begin by briefly revisiting the prominent debate on replication our discipline had a decade ago and some of the progress made in data sharing since.

The Dataverse Network Project

The Dataverse Network Project: a major ongoing project to write web applications, standards, protocols, and software for automating the process of citing, archiving, preserving, distributing, cataloging, translating, disseminating, naming, verifying, and replicating data and associated analyses (Website: TheData.Org). See also:
An Introduction to the Dataverse Network as an Infrastructure for Data Sharing
Gary King. 2007. “An Introduction to the Dataverse Network as an Infrastructure for Data Sharing.” Sociological Methods and Research, 36, Pp. 173–199.Abstract

We introduce a set of integrated developments in web application software, networking, data citation standards, and statistical methods designed to put some of the universe of data and data sharing practices on somewhat firmer ground. We have focused on social science data, but aspects of what we have developed may apply more widely. The idea is to facilitate the public distribution of persistent, authorized, and verifiable data, with powerful but easy-to-use technology, even when the data are confidential or proprietary. We intend to solve some of the sociological problems of data sharing via technological means, with the result intended to benefit both the scientific community and the sometimes apparently contradictory goals of individual researchers.

From Preserving the Past to Preserving the Future: The Data-PASS Project and the Challenges of Preserving Digital Social Science Data
Myron P Gutmann, Mark Abrahamson, Margaret O Adams, Micah Altman, Caroline Arms, Kenneth Bollen, Michael Carlson, Jonathan Crabtree, Darrell Donakowski, Gary King, Jared Lyle, Marc Maynard, Amy Pienta, Richard Rockwell, Lois Timms-Ferrara, and Copeland H Young. 2009. “From Preserving the Past to Preserving the Future: The Data-PASS Project and the Challenges of Preserving Digital Social Science Data.” Library Trends, 57, Pp. 315–337.Abstract

Social science data are an unusual part of the past, present, and future of digital preservation. They are both an unqualified success, due to long-lived and sustainable archival organizations, and in need of further development because not all digital content is being preserved. This article is about the Data Preservation Alliance for Social Sciences (Data-PASS), a project supported by the National Digital Information Infrastructure and Preservation Program (NDIIPP), which is a partnership of five major U.S. social science data archives. Broadly speaking, Data-PASS has the goal of ensuring that at-risk social science data are identified, acquired, and preserved, and that a future-oriented organization is in place to collaborate on those preservation tasks. Throughout the life of the Data-PASS project we have worked to identify digital materials that have never been systematically archived, and to appraise and acquire them. As the project has progressed, however, it has increasingly turned its attention from identifying and acquiring legacy and at-risk social science data to identifying ongoing and future research projects that will produce data. This article is about the project's history, with an emphasis on the issues that underlay the transition from looking backward to looking forward.

An update on Dataverse
Gary King. 12/7/2014. “An update on Dataverse.” Oxford University Press Blog. Publisher's VersionAbstract
At the American Political Science Association meetings earlier this year, Gary King, Albert J. Weatherhead III University Professor at Harvard University, gave a presentation on Dataverse. Dataverse is an important tool that many researchers use to archive and share their research materials. As many readers of this blog may already know, the journal that I co-edit, Political Analysis, uses Dataverse to archive and disseminate the replication materials for the articles we publish in our journal. I asked Gary to write some remarks about Dataverse, based on his APSA presentation. His remarks are below.  -- Michael Alvarez, Editor, Political Analysis.
Automating Open Science for Big Data
Merce Crosas, Gary King, James Honaker, and Latanya Sweeney. 2015. “Automating Open Science for Big Data.” ANNALS of the American Academy of Political and Social Science, 659, 1, Pp. 260-273. Publisher's VersionAbstract

The vast majority of social science research presently uses small (MB or GB scale) data sets. These fixed-scale data sets are commonly downloaded to the researcher's computer where the analysis is performed locally, and are often shared and cited with well-established technologies, such as the Dataverse Project (see Dataverse.org), to support the published results.  The trend towards Big Data -- including large scale streaming data -- is starting to transform research and has the potential to impact policy-making and our understanding of the social, economic, and political problems that affect human societies.  However, this research poses new challenges in execution, accountability, preservation, reuse, and reproducibility. Downloading these data sets to a researcher’s computer is often infeasible; hence, analyses take place in the cloud, require unusual expertise, and benefit from collaborative teamwork and novel tool development. The same richness that makes these data sets so informative also means that they are much more likely to contain highly sensitive personally identifiable information. In this paper, we discuss solutions to these new challenges so that the social sciences can realize the potential of Big Data.


A symposium on replication, edited by Nils Petter Gleditsch and Claire Metelits, with several articles including mine, Gary King. 2003. “The Future of Replication.” International Studies Perspectives, 4, Pp. 443–499.Abstract

Since the replication standard was proposed for political science research, more journals have required or encouraged authors to make data available, and more authors have shared their data. The calls for continuing this trend are more persistent than ever, and the agreement among journal editors in this Symposium continues this trend. In this article, I offer a vision of a possible future of the replication movement. The plan is to implement this vision via the Virtual Data Center project, which – by automating the process of finding, sharing, archiving, subsetting, converting, analyzing, and distributing data – may greatly facilitate adherence to the replication standard.

The Virtual Data Center

The Virtual Data Center, the predecessor to the Dataverse Network. See:
A Digital Library for the Dissemination and Replication of Quantitative Social Science Research
Micah Altman, Leonid Andreev, Mark Diggory, Gary King, Daniel L Kiskis, Elizabeth Kolster, Michael Krot, and Sidney Verba. 2001. “A Digital Library for the Dissemination and Replication of Quantitative Social Science Research.” Social Science Computer Review, 19, Pp. 458–470.Abstract
The Virtual Data Center (VDC) software is an open-source, digital library system for quantitative data. We discuss what the software does, and how it provides an infrastructure for the management and dissemination of distributed collections of quantitative data, and the replication of results derived from these data.

See Also

Comment on 'Estimating the Reproducibility of Psychological Science'
Daniel Gilbert, Gary King, Stephen Pettigrew, and Timothy Wilson. 2016. “Comment on 'Estimating the Reproducibility of Psychological Science'.” Science, 351, 6277, Pp. 1037a-1038a. Publisher's VersionAbstract

A recent article by the Open Science Collaboration (a group of 270 coauthors) gained considerable academic and public attention due to its sensational conclusion that the replicability of psychological science is surprisingly low. Science magazine lauded this article as one of the top 10 scientific breakthroughs of the year across all fields of science, reports of which appeared on the front pages of newspapers worldwide. We show that OSC's article contains three major statistical errors and, when corrected, provides no evidence of a replication crisis. Indeed, the evidence is consistent with the opposite conclusion -- that the reproducibility of psychological science is quite high and, in fact, statistically indistinguishable from 100%. (Of course, that doesn't mean that the replicability is 100%, only that the evidence is insufficient to reliably estimate replicability.) The moral of the story is that meta-science must follow the rules of science.

Replication data is available in this dataverse archive. See also the full web site for this article and related materials, and one of the news articles written about it.

A Proposed Standard for the Scholarly Citation of Quantitative Data
Micah Altman and Gary King. 2007. “A Proposed Standard for the Scholarly Citation of Quantitative Data.” D-Lib Magazine, 13. Publisher's VersionAbstract

An essential aspect of science is a community of scholars cooperating and competing in the pursuit of common goals. A critical component of this community is the common language of and the universal standards for scholarly citation, credit attribution, and the location and retrieval of articles and books. We propose a similar universal standard for citing quantitative data that retains the advantages of print citations, adds other components made possible by, and needed due to, the digital form and systematic nature of quantitative data sets, and is consistent with most existing subfield-specific approaches. Although the digital library field includes numerous creative ideas, we limit ourselves to only those elements that appear ready for easy practical use by scientists, journal editors, publishers, librarians, and archivists.

Related Papers on New Forms of Data

Ensuring the Data Rich Future of the Social Sciences
Gary King. 2011. “Ensuring the Data Rich Future of the Social Sciences.” Science, 331, 11 February, Pp. 719-721.Abstract

Massive increases in the availability of informative social science data are making dramatic progress possible in analyzing, understanding, and addressing many major societal problems. Yet the same forces pose severe challenges to the scientific infrastructure supporting data sharing, data management, informatics, statistical methodology, and research ethics and policy, and these are collectively holding back progress. I address these changes and challenges and suggest what can be done.

The Changing Evidence Base of Social Science Research
Gary King. 2009. “The Changing Evidence Base of Social Science Research.” In The Future of Political Science: 100 Perspectives, edited by Gary King, Kay Schlozman, and Norman Nie. New York: Routledge Press.Abstract

This (two-page) article argues that the evidence base of political science and the related social sciences is beginning an underappreciated but historic change.

Preserving Quantitative Research-Elicited Data for Longitudinal Analysis.  New Developments in Archiving Survey Data in the U.S.
Mark Abrahamson, Kenneth A Bollen, Myron P Gutmann, Gary King, and Amy Pienta. 2009. “Preserving Quantitative Research-Elicited Data for Longitudinal Analysis. New Developments in Archiving Survey Data in the U.S.” Historical Social Research, 34, 3, Pp. 51-59.Abstract

Social science data collected in the United States, both historically and at present, have often not been placed in any public archive -- even when the data collection was supported by government grants. The availability of the data for future use is, therefore, in jeopardy. Enforcing archiving norms may be the only way to increase data preservation and availability in the future.

Computational Social Science
David Lazer, Alex Pentland, Lada Adamic, Sinan Aral, Albert-Laszlo Barabasi, Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, and Marshall Van Alstyne. 2009. “Computational Social Science.” Science, 323, Pp. 721-723.Abstract

A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviors.

International Conflict

Methods for coding, analyzing, and forecasting international conflict and state failure. Evidence that the causes of conflict, theorized to be important but often found to be small or ephemeral, are indeed tiny for the vast majority of dyads, but are large, stable, and replicable wherever the ex ante probability of conflict is large.
An Automated Information Extraction Tool For International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design
Gary King and Will Lowe. 2003. “An Automated Information Extraction Tool For International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design.” International Organization, 57, Pp. 617-642.Abstract
Despite widespread recognition that aggregated summary statistics on international conflict and cooperation miss most of the complex interactions among nations, the vast majority of scholars continue to employ annual, quarterly, or occasionally monthly observations. Daily events data, coded from some of the huge volume of news stories produced by journalists, have not been used much for the last two decades. We offer some reason to change this practice, which we feel should lead to considerably increased use of these data. We address advances in event categorization schemes and software programs that automatically produce data by "reading" news stories without human coders. We design a method that makes it feasible for the first time to evaluate these programs when they are applied in areas with the particular characteristics of international conflict and cooperation data, namely event categories with highly unequal prevalences, and where rare events (such as highly conflictual actions) are of special interest. We use this rare events design to evaluate one existing program, and find it to be as good as trained human coders, but obviously far less expensive to use. For large scale data collections, the program dominates human coding. Our new evaluative method should be of use in international relations, as well as more generally in the field of computational linguistics, for evaluating other automated information extraction tools. We believe that the data created by programs similar to the one we evaluated should see dramatically increased use in international relations research. To facilitate this process, we are releasing with this article data on 4.3 million international events, covering the entire world for the last decade.
Event Count Models for International Relations: Generalizations and Applications
Gary King. 1989. “Event Count Models for International Relations: Generalizations and Applications.” International Studies Quarterly, 33, Pp. 123–147.Abstract
International relations theorists tend to think in terms of continuous processes. Yet we observe only discrete events, such as wars or alliances, and summarize them in terms of the frequency of occurrence. As such, most empirical analyses in international relations are based on event count variables. Unfortunately, analysts have generally relied on statistical techniques that were designed for continuous data. This mismatch between theory and method has caused bias, inefficiency, and numerous inconsistencies in both theoretical arguments and empirical findings throughout the literature. This article develops a much more powerful approach to modeling and statistical analysis based explicitly on estimating continuous processes from observed event counts. To demonstrate this class of models, I present several new statistical techniques developed for and applied to different areas of international relations. These include the influence of international alliances on the outbreak of war, the contagious process of multilateral economic sanctions, and reciprocity in superpower conflict. I also show how one can extract considerably more information from existing data and relate substantive theory to empirical analyses more explicitly with this approach.
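As a point of reference for readers, the simplest member of this class of models is an ordinary Poisson regression of event counts on explanatory variables; the article's estimators generalize well beyond it. The sketch below uses simulated data purely for illustration.

# Minimal Poisson event-count regression in Python (simulated data; not the
# article's models or data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))        # e.g., a single explanatory variable
lam = np.exp(0.5 + 0.8 * x[:, 0])    # true underlying event rate
y = rng.poisson(lam)                 # observed event counts

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(fit.summary())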
Improving Quantitative Studies of International Conflict: A Conjecture
Nathaniel Beck, Gary King, and Langche Zeng. 2000. “Improving Quantitative Studies of International Conflict: A Conjecture.” American Political Science Review, 94, Pp. 21–36.Abstract
We address a well-known but infrequently discussed problem in the quantitative study of international conflict: Despite immense data collections, prestigious journals, and sophisticated analyses, empirical findings in the literature on international conflict are often unsatisfying. Many statistical results change from article to article and specification to specification. Accurate forecasts are nonexistent. In this article we offer a conjecture about one source of this problem: The causes of conflict, theorized to be important but often found to be small or ephemeral, are indeed tiny for the vast majority of dyads, but they are large, stable, and replicable wherever the ex ante probability of conflict is large. This simple idea has an unexpectedly rich array of observable implications, all consistent with the literature. We directly test our conjecture by formulating a statistical model that includes critical features. Our approach, a version of a "neural network" model, uncovers some interesting structural features of international conflict, and as one evaluative measure, forecasts substantially better than any previous effort. Moreover, this improvement comes at little cost, and it is easy to evaluate whether the model is a statistical improvement over the simpler models commonly used.
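To make the evaluation standard concrete, the sketch below compares out-of-sample forecasts from a logit and a small neural network on simulated dyadic-style data with rare, nonlinearly generated events. It illustrates the kind of comparison advocated here; it is not the article's model, data, or results.

# Illustrative out-of-sample comparison in Python (simulated data only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 4))
p = 1 / (1 + np.exp(-(-4 + X[:, 0] * X[:, 1] + 0.5 * X[:, 2])))  # rare, interactive process
y = rng.binomial(1, p)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
print("logit out-of-sample log loss:", log_loss(y_te, logit.predict_proba(X_te)[:, 1]))
print("neural net out-of-sample log loss:", log_loss(y_te, net.predict_proba(X_te)[:, 1]))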
Proper Nouns and Methodological Propriety: Pooling Dyads in International Relations Data
Gary King. 2001. “Proper Nouns and Methodological Propriety: Pooling Dyads in International Relations Data.” International Organization, 55, Pp. 497–507.Abstract
The intellectual stakes at issue in this symposium are very high: Green, Kim, and Yoon (2000 and hereinafter GKY) apply their proposed methodological prescriptions and conclude that the key finding in the field is wrong and democracy "has no effect on militarized disputes." GKY are mainly interested in convincing scholars about their methodological points and see themselves as having no stake in the resulting substantive conclusions. However, their methodological points are also high stakes claims: if correct, the vast majority of statistical analyses of military conflict ever conducted would be invalidated. GKY say they "make no attempt to break new ground statistically," but, as we will see, this both understates their methodological contribution to the field and misses some unique features of their application and data in international relations. On the latter, GKY’s critics are united: Oneal and Russett (2000) conclude that GKY’s method "produces distorted results," and show even in GKY’s framework how democracy’s effect can be reinstated. Beck and Katz (2000) are even more unambiguous: "GKY’s conclusion, in table 3, that variables such as democracy have no pacific impact, is simply nonsense...GKY’s (methodological) proposal...is NEVER a good idea." My given task is to sort out and clarify these conflicting claims and counterclaims. The procedure I followed was to engage in extensive discussions with the participants that included joint reanalyses provoked by our discussions and passing computer program code (mostly with Monte Carlo simulations) back and forth to ensure we were all talking about the same methods and agreed with the factual results. I learned a great deal from this process and believe that the positions of the participants are now a lot closer than it may seem from their written statements. Indeed, I believe that all the participants now agree with what I have written here, even though they would each have different emphases (and although my believing there is agreement is not the same as there actually being agreement!).
Explaining Rare Events in International Relations
Gary King and Langche Zeng. 2001. “Explaining Rare Events in International Relations.” International Organization, 55, Pp. 693–715.Abstract
Some of the most important phenomena in international conflict are coded as "rare events data," binary dependent variables with dozens to thousands of times fewer events, such as wars, coups, etc., than "nonevents". Unfortunately, rare events data are difficult to explain and predict, a problem that seems to have at least two sources. First, and most importantly, the data collection strategies used in international conflict are grossly inefficient. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (e.g., wars) and a tiny fraction of non-events (peace). This enables scholars to save as much as 99% of their (non-fixed) data collection costs, or to collect much more meaningful explanatory variables. Second, logistic regression, and other commonly used statistical procedures, can underestimate the probability of rare events. We introduce some corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. We also provide easy-to-use methods and software that link these two results, enabling both types of corrections to work simultaneously.
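One ingredient of this approach, the standard prior correction to the logit intercept after sampling all events and only a fraction of nonevents, can be written in a few lines. The sketch below uses invented numbers and is only a partial illustration, not the full set of corrections or the software that accompanies the article.

# Prior correction for the logit intercept under an all-events / sampled-nonevents
# design (illustrative sketch; tau is the assumed-known population fraction of
# events, ybar the fraction of events in the estimation sample).
import numpy as np

def prior_corrected_intercept(beta0_hat, tau, ybar):
    return beta0_hat - np.log(((1 - tau) / tau) * (ybar / (1 - ybar)))

# Hypothetical numbers: events are 0.1% of dyad-years in the population but
# 50% of the estimation sample.
print(prior_corrected_intercept(beta0_hat=-0.2, tau=0.001, ybar=0.5))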
Improving Forecasts of State Failure
Gary King and Langche Zeng. 2001. “Improving Forecasts of State Failure.” World Politics, 53, Pp. 623–658.Abstract

We offer the first independent scholarly evaluation of the claims, forecasts, and causal inferences of the State Failure Task Force and their efforts to forecast when states will fail. State failure refers to the collapse of the authority of the central government to impose order, as in civil wars, revolutionary wars, genocides, politicides, and adverse or disruptive regime transitions. This task force, set up at the behest of Vice President Gore in 1994, has been led by a group of distinguished academics working as consultants to the U.S. Central Intelligence Agency. State Failure Task Force reports and publications have received attention in the media, in academia, and from public policy decision-makers. In this article, we identify several methodological errors in the task force work that cause their reported forecast probabilities of conflict to be too large, their causal inferences to be biased in unpredictable directions, and their claims of forecasting performance to be exaggerated. However, we also find that the task force has amassed the best and most carefully collected data on state failure in existence, and the required corrections which we provide, although very large in effect, are easy to implement. We also reanalyze their data with better statistical procedures and demonstrate how to improve forecasting performance to levels significantly greater than even corrected versions of their models. Although still a highly uncertain endeavor, we are as a consequence able to offer the first accurate forecasts of state failure, along with procedures and results that may be of practical use in informing foreign policy decision making. We also describe a number of strong empirical regularities that may help in ascertaining the causes of state failure.

Rethinking Human Security
Gary King and Christopher J.L. Murray. 2002. “Rethinking Human Security.” Political Science Quarterly, 116, Pp. 585–610.Abstract

In the last two decades, the international community has begun to conclude that attempts to ensure the territorial security of nation-states through military power have failed to improve the human condition. Despite astronomical levels of military spending, deaths due to military conflict have not declined. Moreover, even when the borders of some states are secure from foreign threats, the people within those states do not necessarily have freedom from crime, enough food, proper health care, education, or political freedom. In response to these developments, the international community has gradually moved to combine economic development with military security and other basic human rights to form a new concept of "human security". Unfortunately, by common assent the concept lacks both a clear definition, consistent with the aims of the international community, and any agreed upon measure of it. In this paper, we propose a simple, rigorous, and measurable definition of human security: the expected number of years of future life spent outside the state of "generalized poverty". Generalized poverty occurs when an individual falls below the threshold in any key domain of human well-being. We consider improvements in data collection and methods of forecasting that are necessary to measure human security and then introduce an agenda for research and action to enhance human security that follows logically in the areas of risk assessment, prevention, protection, and compensation.

Armed Conflict as a Public Health Problem
Christopher JL Murray, Gary King, Alan D Lopez, Niels Tomijima, and Etienne Krug. 2002. “Armed Conflict as a Public Health Problem.” BMJ (British Medical Journal), 324, Pp. 346–349.Abstract
Armed conflict is a major cause of injury and death worldwide, but we need much better methods of quantification before we can accurately assess its effect. Armed conflicts between warring states and groups within states have been major causes of ill health and mortality for most of human history. Conflict obviously causes deaths and injuries on the battlefield, but it also harms health through the displacement of populations, the breakdown of health and social services, and the heightened risk of disease transmission. Despite the size of the health consequences, military conflict has not received the same attention from public health research and policy as many other causes of illness and death. In contrast, political scientists have long studied the causes of war but have primarily been interested in the decision of elite groups to go to war, not in human death and misery. We review the limited knowledge on the health consequences of conflict, suggest ways to improve measurement, and discuss the potential for risk assessment and for preventing and ameliorating the consequences of conflict.
Theory and Evidence in International Conflict: A Response to de Marchi, Gelpi, and Grynaviski
Nathaniel Beck, Gary King, and Langche Zeng. 2004. “Theory and Evidence in International Conflict: A Response to de Marchi, Gelpi, and Grynaviski.” American Political Science Review, 98, Pp. 379-389.Abstract
We thank Scott de Marchi, Christopher Gelpi, and Jeffrey Grynaviski (2003 and hereinafter dGG) for their careful attention to our work (Beck, King, and Zeng, 2000 and hereinafter BKZ) and for raising some important methodological issues that we agree deserve readers’ attention. We are pleased that dGG’s analyses are consistent with the theoretical conjecture about international conflict put forward in BKZ – "The causes of conflict, theorized to be important but often found to be small or ephemeral, are indeed tiny for the vast majority of dyads, but they are large, stable, and replicable whenever the ex ante probability of conflict is large" (BKZ, p.21) – and that dGG agree with our main methodological point that out-of-sample forecasting performance should always be one of the standards used to judge studies of international conflict, and indeed most other areas of political science. However, dGG frequently err when they draw methodological conclusions. Their central claim involves the superiority of logit over neural network models for international conflict data, as judged by forecasting performance and other properties such as ease of use and interpretation ("neural networks hold few unambiguous advantages... and carry significant costs" relative to logit; dGG, p.14). We show here that this claim, which would be regarded as stunning in any of the diverse fields in which both methods are more commonly used, is false. We also show that dGG’s methodological errors and the restrictive model they favor cause them to miss and mischaracterize crucial patterns in the causes of international conflict. We begin in the next section by summarizing the growing support for our conjecture about international conflict. The second section discusses the theoretical reasons why neural networks dominate logistic regression, correcting a number of methodological errors. The third section then demonstrates empirically, in the same data as used in BKZ and dGG, that neural networks substantially outperform dGG’s logit model. We show that neural networks improve on the forecasts from logit as much as logit improves on a model with no theoretical variables. We also show how dGG’s logit analysis assumed, rather than estimated, the answer to the central question about the literature’s most important finding, the effect of democracy on war. Since this and other substantive assumptions underlying their logit model are wrong, their substantive conclusion about the democratic peace is also wrong. The neural network models we used in BKZ not only avoid these difficulties, but they, or one of the other methods available that do not make highly restrictive assumptions about the exact functional form, are just what is called for to study the observable implications of our conjecture.
The Supreme Court During Crisis: How War Affects only Non-War Cases
Lee Epstein, Daniel E Ho, Gary King, and Jeffrey A Segal. 2005. “The Supreme Court During Crisis: How War Affects only Non-War Cases.” New York University Law Review, 80, Pp. 1–116.Abstract
Does the U.S. Supreme Court curtail rights and liberties when the nation’s security is under threat? In hundreds of articles and books, and with renewed fervor since September 11, 2001, members of the legal community have warred over this question. Yet, not a single large-scale, quantitative study exists on the subject. Using the best data available on the causes and outcomes of every civil rights and liberties case decided by the Supreme Court over the past six decades and employing methods chosen and tuned especially for this problem, our analyses demonstrate that when crises threaten the nation’s security, the justices are substantially more likely to curtail rights and liberties than when peace prevails. Yet paradoxically, and in contradiction to virtually every theory of crisis jurisprudence, war appears to affect only cases that are unrelated to the war. For these cases, the effect of war and other international crises is so substantial, persistent, and consistent that it may surprise even those commentators who long have argued that the Court rallies around the flag in times of crisis. On the other hand, we find no evidence that cases most directly related to the war are affected. We attempt to explain this seemingly paradoxical evidence with one unifying conjecture: Instead of balancing rights and security in high stakes cases directly related to the war, the Justices retreat to ensuring the institutional checks of the democratic branches. Since rights-oriented and process-oriented dimensions seem to operate in different domains and at different times, and often suggest different outcomes, the predictive factors that work for cases unrelated to the war fail for cases related to the war. If this conjecture is correct, federal judges should consider giving less weight to legal principles outside of wartime but established during wartime, and attorneys should see it as their responsibility to distinguish cases along these lines.
When Can History Be Our Guide? The Pitfalls of Counterfactual Inference
Gary King and Langche Zeng. 2007. “When Can History Be Our Guide? The Pitfalls of Counterfactual Inference.” International Studies Quarterly, Pp. 183-210.Abstract
Inferences about counterfactuals are essential for prediction, answering "what if" questions, and estimating causal effects. However, when the counterfactuals posed are too far from the data at hand, conclusions drawn from well-specified statistical analyses become based on speculation and convenient but indefensible model assumptions rather than empirical evidence. Unfortunately, standard statistical approaches assume the veracity of the model rather than revealing the degree of model-dependence, and so this problem can be hard to detect. We develop easy-to-apply methods to evaluate counterfactuals that do not require sensitivity testing over specified classes of models. If an analysis fails the tests we offer, then we know that substantive results are sensitive to at least some modeling choices that are not based on empirical evidence. We use these methods to evaluate the extensive scholarly literatures on the effects of changes in the degree of democracy in a country (on any dependent variable) and separate analyses of the effects of UN peacebuilding efforts. We find evidence that many scholars are inadvertently drawing conclusions based more on modeling hypotheses than on their data. For some research questions, history contains insufficient information to be our guide.

Legislative Redistricting

The definition of partisan symmetry as a standard for fairness in redistricting; methods and software for measuring partisan bias and electoral responsiveness; discussion of U.S. Supreme Court rulings about this work. Evidence that U.S. redistricting reduces bias and increases responsiveness, and that the electoral college is fair; applications to legislatures, primaries, and multiparty systems.

U.S. Legislatures

If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model
Danny Ebanks, Jonathan N. Katz, and Gary King. Working Paper. “If a Statistical Model Predicts That Common Events Should Occur Only Once in 10,000 Elections, Maybe it’s the Wrong Model”.Abstract

Political scientists forecast elections, not primarily to satisfy public interest, but to validate statistical models used for estimating many quantities of scholarly interest. Although scholars have learned a great deal from these models, they can be embarrassingly overconfident: Events that should occur once in 10,000 elections occur almost every year, and even those that should occur once in a trillion-trillion elections are sometimes observed. We develop a novel generative statistical model of US congressional elections 1954-2020 and validate it with extensive out-of-sample tests. The generatively accurate descriptive summaries provided by this model demonstrate that the 1950s was as partisan and differentiated as the current period, but with parties not based on ideological differences as they are today. The model also shows that even though the size of the incumbency advantage has varied tremendously over time, the risk of an in-party incumbent losing a midterm election contest has been high and essentially constant over at least the last two thirds of a century.

Please see "How American Politics Ensures Electoral Accountability in Congress," which supersedes this paper.
 

The Essential Role of Statistical Inference in Evaluating Electoral Systems: A Response to DeFord et al.
Jonathan Katz, Gary King, and Elizabeth Rosenblatt. Forthcoming. “The Essential Role of Statistical Inference in Evaluating Electoral Systems: A Response to DeFord et al.” Political Analysis.Abstract
Katz, King, and Rosenblatt (2020) introduces a theoretical framework for understanding redistricting and electoral systems, built on basic statistical and social science principles of inference. DeFord et al. (Forthcoming, 2021) instead focuses solely on descriptive measures, which lead to the problems identified in our article. In this paper, we illustrate the essential role of these basic principles and then offer statistical, mathematical, and substantive corrections required to apply DeFord et al.’s calculations to social science questions of interest, while also showing how to easily resolve all claimed paradoxes and problems. We are grateful to the authors for their interest in our work and for this opportunity to clarify these principles and our theoretical framework.
 
Theoretical Foundations and Empirical Evaluations of Partisan Fairness in District-Based Democracies
Jonathan N. Katz, Gary King, and Elizabeth Rosenblatt. 2020. “Theoretical Foundations and Empirical Evaluations of Partisan Fairness in District-Based Democracies.” American Political Science Review, 114, 1, Pp. 164-178. Publisher's VersionAbstract
We clarify the theoretical foundations of partisan fairness standards for district-based democratic electoral systems, including essential assumptions and definitions that have not been recognized, formalized, or in some cases even discussed. We also offer extensive empirical evidence for assumptions with observable implications. Throughout, we follow a fundamental principle of statistical inference too often ignored in this literature -- defining the quantity of interest separately so its measures can be proven wrong, evaluated, or improved. This enables us to prove which of the many newly proposed fairness measures are statistically appropriate and which are biased, limited, or not measures of the theoretical quantity they seek to estimate at all. Because real world redistricting and gerrymandering involves complicated politics with numerous participants and conflicting goals, measures biased for partisan fairness sometimes still provide useful descriptions of other aspects of electoral systems.
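To convey the intuition behind the symmetry standard (though not the paper's estimator, which addresses inference and uncertainty), the sketch below applies a crude uniform partisan swing to hypothetical district results and checks whether a party receiving half the votes would receive half the seats.

# Crude uniform-swing illustration of partisan symmetry in Python
# (hypothetical district vote shares; not the authors' methodology).
import numpy as np

def uniform_swing_bias(district_vote_shares):
    v = np.asarray(district_vote_shares, dtype=float)
    swung = v + (0.5 - v.mean())          # swing every district to a statewide tie
    seat_share = (swung > 0.5).mean()     # party's seat share at 50% of the vote
    return seat_share - 0.5               # zero under perfect partisan symmetry

print(uniform_swing_bias([0.45, 0.47, 0.48, 0.62, 0.71]))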
How to conquer partisan gerrymandering
Gary King and Robert X Browning. 12/26/2017. “How to conquer partisan gerrymandering.” Boston Globe (Op-Ed), 292, 179, Pp. A10. Publisher's VersionAbstract
PARTISAN GERRYMANDERING has long been reviled for thwarting the will of the voters. Yet while voters are acting disgusted, the US Supreme Court has only discussed acting — declaring they have the constitutional right to fix the problem, but doing nothing. But as better data and computer algorithms are now making gerrymandering increasingly effective, continuing to sidestep the issue could do permanent damage to American democracy. In Gill v. Whitford, the soon-to-be-decided challenge to Wisconsin’s 2011 state Assembly redistricting plan, the court could finally fix the problem for the whole country. Judging from the oral arguments, the key to the case is whether the court endorses the concept of “partisan symmetry,” a specific standard for treating political parties equally in allocating legislative seats based on voting.
Edited transcript of a talk on Partisan Symmetry at the 'Redistricting and Representation Forum'
Gary King. 2018. “Edited transcript of a talk on Partisan Symmetry at the 'Redistricting and Representation Forum'.” Bulletin of the American Academy of Arts and Sciences, Winter, Pp. 55-58.Abstract

The origin, meaning, estimation, and application of the concept of partisan symmetry in legislative redistricting, and the justiciability of partisan gerrymandering. An edited transcript of a talk at the “Redistricting and Representation Forum,” American Academy of Arts & Sciences, Cambridge, MA 11/8/2017.

Here also is a video of the original talk.

How to Measure Legislative District Compactness If You Only Know it When You See It
Aaron Kaufman, Gary King, and Mayya Komisarchik. 2021. “How to Measure Legislative District Compactness If You Only Know it When You See It.” American Journal of Political Science, 65, 3, Pp. 533-550. Publisher's VersionAbstract

To deter gerrymandering, many state constitutions require legislative districts to be "compact." Yet, the law offers few precise definitions other than "you know it when you see it," which effectively implies a common understanding of the concept. In contrast, academics have shown that compactness has multiple dimensions and have generated many conflicting measures. We hypothesize that both are correct -- that compactness is complex and multidimensional, but a common understanding exists across people. We develop a survey to elicit this understanding, with high reliability (in data where the standard paired comparisons approach fails). We create a statistical model that predicts, with high accuracy, solely from the geometric features of the district, compactness evaluations by judges and public officials responsible for redistricting, among others. We also offer compactness data from our validated measure for 20,160 state legislative and congressional districts, as well as open source software to compute this measure from any district.

Winner of the 2018 Robert H Durr Award from the MPSA.

The Future of Partisan Symmetry as a Judicial Test for Partisan Gerrymandering after LULAC v. Perry
The U.S. Supreme Court responds favorably to the nonpartisan Amici Curiae Brief on partisan gerrymandering filed by Gary King, Bernard Grofman, Andrew Gelman, and Jonathan Katz (see brief) and requests additional information. This information is provided in the context of a brief history of the scholarly literature, a summary of the state of the art in conceptualization and measurement of partisan symmetry, and the state of current jurisprudence, in: Bernard Grofman and Gary King. 2008. “The Future of Partisan Symmetry as a Judicial Test for Partisan Gerrymandering after LULAC v. Perry.” Election Law Journal, 6, 1, Pp. 2-35.Abstract

While the Supreme Court in Davis v. Bandemer found partisan gerrymandering to be justiciable, no challenged redistricting plan in the subsequent 20 years has been held unconstitutional on partisan grounds. Then, in Vieth v. Jubelirer, five justices concluded that some standard might be adopted in a future case, if a manageable rule could be found. When gerrymandering next came before the Court, in LULAC v. Perry, we along with our colleagues filed an Amicus Brief (King et al., 2005), proposing the test be based in part on the partisan symmetry standard. Although the issue was not resolved, our proposal was discussed and positively evaluated in three of the opinions, including the plurality judgment, and for the first time for any proposal the Court gave a clear indication that a future legal test for partisan gerrymandering will likely include partisan symmetry. A majority of Justices now appear to endorse the view that the measurement of partisan symmetry may be used in partisan gerrymandering claims as “a helpful (though certainly not talismanic) tool” (Justice Stevens, joined by Justice Breyer), provided one recognizes that “asymmetry alone is not a reliable measure of unconstitutional partisanship” and possibly that the standard would be applied only after at least one election has been held under the redistricting plan at issue (Justice Kennedy, joined by Justices Souter and Ginsburg). We use this essay to respond to the request of Justices Souter and Ginsburg that “further attention … be devoted to the administrability of such a criterion at all levels of redistricting and its review.” Building on our previous scholarly work, our Amicus Brief, the observations of these five Justices, and a supporting consensus in the academic literature, we offer here a social science perspective on the conceptualization and measurement of partisan gerrymandering and the development of relevant legal rules based on what is effectively the Supreme Court’s open invitation to lower courts to revisit these issues in the light of LULAC v. Perry.

The concept of partisan symmetry

The concept of partisan symmetry as a standard for assessing partisan gerrymandering:
Heather K. Gerken, Jonathan N. Katz, Gary King, Larry J. Sabato, and Samuel S.-H. Wang. 2017. “Brief of Heather K. Gerken, Jonathan N. Katz, Gary King, Larry J. Sabato, and Samuel S.-H. Wang as Amici Curiae in Support of Appellees.” Filed with the Supreme Court of the United States in Beverly R. Gill et al. v. William Whitford et al. 16-1161 .Abstract
SUMMARY OF ARGUMENT
Plaintiffs ask this Court to do what it has done many times before. For generations, it has resolved cases involving elections and cases on which elections ride. It has adjudicated controversies that divide the American people and those, like this one, where Americans are largely in agreement. In doing so, the Court has sensibly adhered to its long-standing and circumspect approach: it has announced a workable principle, one that lends itself to a manageable test, while allowing the lower courts to work out the precise contours of that test with time and experience.

Partisan symmetry, the principle put forward by the plaintiffs, is just such a workable principle. The standard is highly intuitive, deeply rooted in history, and accepted by virtually all social scientists. Tests for partisan symmetry are reliable, transparent, and easy to calculate without undue reliance on experts or unnecessary judicial intrusion on state redistricting judgments. Under any of these tests, Wisconsin’s districts cannot withstand constitutional scrutiny.
Seats, Votes, and Gerrymandering: Measuring Bias and Representation in Legislative Redistricting
Robert X Browning and Gary King. 1987. “Seats, Votes, and Gerrymandering: Measuring Bias and Representation in Legislative Redistricting.” Law and Policy, 9, Pp. 305–322.Abstract
The Davis v. Bandemer case focused much attention on the problem of using statistical evidence to demonstrate the existence of political gerrymandering. In this paper, we evaluate the uses and limitations of measures of the seats-votes relationship in the Bandemer case. We outline a statistical method we have developed that can be used to estimate bias and the form of representation in legislative redistricting. We apply this method to Indiana State House and Senate elections for the period 1972 to 1984 and demonstrate a maximum bias of 6.2% toward the Republicans in the House and a 2.8% bias in the Senate.
Democratic Representation and Partisan Bias in Congressional Elections
Defines, distinguishes, and measures "partisan bias" and "electoral responsiveness" (or "representation"), key concepts that had been conflated in much previous academic literature, and "partisan symmetry" as the definition of fairness to parties in districting. A consensus in the academic literature on partisan symmetry as the definition of partisan fairness has held since this article. Gary King and Robert X Browning. 1987. “Democratic Representation and Partisan Bias in Congressional Elections.” American Political Science Review, 81, Pp. 1252–1273.Abstract
The translation of citizen votes into legislative seats is of central importance in democratic electoral systems. It has been a longstanding concern among scholars in political science and in numerous other disciplines. Through this literature, two fundamental tenets of democratic theory, partisan bias and democratic representation, have often been confused. We develop a general statistical model of the relationship between votes and seats and separate these two important concepts theoretically and empirically. In so doing, we also solve several methodological problems with the study of seats, votes and the cube law. An application to U.S. congressional districts provides estimates of bias and representation for each state and demonstrates the model’s utility. Results of this application show distinct types of representation coexisting in U.S. states. Although most states have small partisan biases, there are some with a substantial degree of bias.
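A minimal way to write down the distinction this article draws, using a functional form common in the seats-votes literature (the notation below is illustrative rather than a quotation of the article's model):

\ln\frac{S}{1-S} \;=\; \lambda + \rho\,\ln\frac{V}{1-V},

where V is a party's average district vote share, S its expected seat share, \rho the electoral responsiveness (the classic cube law is the special case \lambda = 0, \rho = 3), and \lambda the partisan bias. Partisan symmetry requires the seats-votes curve to treat the parties identically, S(V) = 1 - S(1-V), which this form satisfies exactly when \lambda = 0.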
Racial Fairness in Legislative Redistricting
Related work that clarifies the normative assumptions underlying proposed standards for fairness to different ethnic groups and formalizes several absolute standards. Gary King, John Bruce, and Andrew Gelman. 1996. “Racial Fairness in Legislative Redistricting.” In Classifying by Race, edited by Paul E Peterson, Pp. 85-110. Princeton: Princeton University Press.Abstract
In this chapter, we study standards of racial fairness in legislative redistricting -- a field that has been the subject of considerable legislation, jurisprudence, and advocacy, but very little serious academic scholarship. We attempt to elucidate how basic concepts about "color-blind" societies, and similar normative preferences, can generate specific practical standards for racial fairness in representation and redistricting. We also provide the normative and theoretical foundations on which concepts such as proportional representation rest, in order to give existing preferences of many in the literature a firmer analytical foundation.

Methods for measuring partisan bias and electoral responsiveness

The methods for measuring partisan bias and electoral responsiveness, and related quantities, that first relaxed the assumptions of exact uniform partisan swing and the exact correspondence between statewide electoral results and legislative electoral results, among other improvements:
Representation Through Legislative Redistricting: A Stochastic Model
The first attempt to eliminate the exact uniform partisan swing assumption, using data from a single election. Gary King. 1989. “Representation Through Legislative Redistricting: A Stochastic Model.” American Journal of Political Science, 33, Pp. 787–824.Abstract
This paper builds a stochastic model of the processes that give rise to observed patterns of representation and bias in congressional and state legislative elections. The analysis demonstrates that partisan swing and incumbency voting, concepts from the congressional elections literature, have determinate effects on representation and bias, concepts from the redistricting literature. The model shows precisely how incumbency and increased variability of partisan swing reduce the responsiveness of the electoral system and how partisan swing affects whether the system is biased toward one party or the other. Incumbency, and other causes of unresponsive representation, also reduce the effect of partisan swing on current levels of partisan bias. By relaxing the restrictive portions of the widely applied "uniform partisan swing" assumption, the theoretical analysis leads directly to an empirical model enabling one more reliably to estimate responsiveness and bias from a single year of electoral data. Applying this to data from seven elections in each of six states, the paper demonstrates that redistricting has effects in predicted directions in the short run: partisan gerrymandering biases the system in favor of the party in control and, by freeing up seats held by opposition party incumbents, increases the system’s responsiveness. Bipartisan-controlled redistricting appears to reduce bias somewhat and to reduce responsiveness dramatically. Nonpartisan redistricting processes substantially increase responsiveness but do not have as clear an effect on bias. However, after only two elections, prima facie evidence for redistricting effects evaporates in most states. Finally, across every state and type of redistricting process, responsiveness declined significantly over the course of the decade. This is clear evidence that the phenomenon of "vanishing marginals," recognized first in the U.S. Congress literature, also applies to these different types of state legislative assemblies. It also strongly suggests that redistricting could not account for this pattern.
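A schematic contrast between the assumption being relaxed and its stochastic generalization (notation chosen here for illustration, not quoted from the article): deterministic uniform partisan swing requires every district to move by the same amount between elections,

v_{i,t+1} = v_{i,t} + \delta_t \quad \text{for all districts } i,

while a stochastic version adds district-level variability,

v_{i,t+1} = v_{i,t} + \delta_t + \varepsilon_{i,t}, \qquad \varepsilon_{i,t} \text{ independent with mean } 0 \text{ and variance } \sigma^2,

so districts share a common swing \delta_t but no longer move in lockstep; \sigma^2 is the "variability of partisan swing" whose effect on responsiveness the abstract describes.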
Estimating the Electoral Consequences of Legislative Redistricting
The most technically sophisticated method, many aspects of which were simplified in the above paper. Andrew Gelman and Gary King. 1990. “Estimating the Electoral Consequences of Legislative Redistricting.” Journal of the American Statistical Association, 85, Pp. 274–282.Abstract
We analyze the effects of redistricting as revealed in the votes received by the Democratic and Republican candidates for state legislature. We develop measures of partisan bias and the responsiveness of the composition of the legislature to changes in statewide votes. Our statistical model incorporates a mixed hierarchical Bayesian and non-Bayesian estimation, requiring simulation along the lines of Tanner and Wong (1987). This model provides reliable estimates of partisan bias and responsiveness along with measures of their variabilities from only a single year of electoral data. This allows one to distinguish systematic changes in the underlying electoral system from typical election-to-election variability.
A Unified Method of Evaluating Electoral Systems and Redistricting Plans
A now widely used set of methods for estimating bias and responsiveness, including applications to redistricting in the states and the U.S. Congress. Andrew Gelman and Gary King. 1994. “A Unified Method of Evaluating Electoral Systems and Redistricting Plans.” American Journal of Political Science, 38, Pp. 514–554.Abstract
We derive a unified statistical method with which one can produce substantially improved definitions and estimates of almost any feature of two-party electoral systems that can be defined based on district vote shares. Our single method enables one to calculate more efficient estimates, with more trustworthy assessments of their uncertainty, than each of the separate multifarious existing measures of partisan bias, electoral responsiveness, seats-votes curves, expected or predicted vote in each district in a legislature, the probability that a given party will win the seat in each district, the proportion of incumbents or others who will lose their seats, the proportion of women or minority candidates to be elected, the incumbency advantage and other causal effects, the likely effects on the electoral system and district votes of proposed electoral reforms, such as term limitations, campaign spending limits, and drawing majority-minority districts, and numerous others. To illustrate, we estimate the partisan bias and electoral responsiveness of the U.S. House of Representatives since 1900 and evaluate the fairness of competing redistricting plans for the 1992 Ohio state legislature.
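For intuition only, here is a bare-bones Python sketch of how partisan bias and responsiveness can be read off district vote shares under a pure uniform-swing calculation. It is not the Gelman-King estimator, which replaces the deterministic swing with a stochastic model, adds covariates, and reports uncertainty; all function names and the example data are hypothetical.

import numpy as np

def seat_share(dem_share):
    # Fraction of districts the Democratic candidate wins (ties split evenly).
    dem_share = np.asarray(dem_share, dtype=float)
    return np.mean(dem_share > 0.5) + 0.5 * np.mean(dem_share == 0.5)

def seats_at_average_vote(dem_share, target_avg):
    # Uniformly swing every district so the statewide average equals target_avg.
    dem_share = np.asarray(dem_share, dtype=float)
    swung = np.clip(dem_share + (target_avg - dem_share.mean()), 0.0, 1.0)
    return seat_share(swung)

def partisan_bias(dem_share):
    # Deviation from symmetry at an even statewide vote split;
    # positive values favor the Democrats.
    return seats_at_average_vote(dem_share, 0.5) - 0.5

def responsiveness(dem_share, delta=0.05):
    # Local slope of the seats-votes curve around an even vote split.
    return (seats_at_average_vote(dem_share, 0.5 + delta)
            - seats_at_average_vote(dem_share, 0.5 - delta)) / (2 * delta)

districts = np.array([0.42, 0.47, 0.49, 0.53, 0.68])
print(partisan_bias(districts), responsiveness(districts))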

Paradoxical benefits of redistricting

Demonstrates the paradoxical benefits of redistricting (even partisan gerrymandering) to American democracy, as compared to no redistricting, in reducing partisan bias and increasing electoral responsiveness. (Of course, if the symmetry standard were imposed, redistricting by any means would produce less bias than any other arrangement.)
Enhancing Democracy Through Legislative Redistricting
Andrew Gelman and Gary King. 1994. “Enhancing Democracy Through Legislative Redistricting.” American Political Science Review, 88, Pp. 541–559.Abstract
We demonstrate the surprising benefits of legislative redistricting (including partisan gerrymandering) for American representative democracy. In so doing, our analysis resolves two long-standing controversies in American politics. First, whereas some scholars believe that redistricting reduces electoral responsiveness by protecting incumbents, others, that the relationship is spurious, we demonstrate that both sides are wrong: redistricting increases responsiveness. Second, while some researchers believe that gerrymandering dramatically increases partisan bias and others deny this effect, we show both sides are in a sense correct. Gerrymandering biases electoral systems in favor of the party that controls the redistricting as compared to what would have happened if the other party controlled it, but any type of redistricting reduces partisan bias as compared to an electoral system without redistricting. Incorrect conclusions in both literatures resulted from misjudging the enormous uncertainties present during redistricting periods, making simplified assumptions about the redistricters’ goals, and using inferior statistical methods.
Advantages of Conflictual Redistricting
A shortened, popular version of the previous article. Andrew Gelman and Gary King. 1996. “Advantages of Conflictual Redistricting.” In Fixing the Boundary: Defining and Redefining Single-Member Electoral Districts, edited by Iain McLean and David Butler, Pp. 207–218. Aldershot, England: Dartmouth Publishing Company.Abstract
This article describes the results of an analysis we did of state legislative elections in the United States, where each state is required to redraw the boundaries of its state legislative districts every ten years. In the United States, redistrictings are sometimes controlled by the Democrats, sometimes by the Republicans, and sometimes by bipartisan committees, but never by neutral boundary commissions. Our goal was to study the consequences of redistricting and at the conclusion of this article, we discuss how our findings might be relevant to British elections.

Other Districting Systems

Electoral Responsiveness and Partisan Bias in Multiparty Democracies
Unifies existing multi-year seats-votes models as special cases of a new general model, and was the first formalization of, and method for estimating, electoral responsiveness and partisan bias in electoral systems with any number of political parties. Gary King. 1990. “Electoral Responsiveness and Partisan Bias in Multiparty Democracies.” Legislative Studies Quarterly, XV, Pp. 159–181.Abstract
Because the goals of local and national representation are inherently incompatible, there is an uncertain relationship between aggregates of citizen votes and the national allocation of legislative seats in almost all democracies. In particular electoral systems, this uncertainty leads to diverse configurations of electoral responsiveness and partisan bias, two fundamental concepts in empirical democratic theory. This paper unifies virtually all existing multiyear seats-votes models as special cases of a new general model. It also permits the first formalization of, and reliable method for empirically estimating, electoral responsiveness and partisan bias in electoral systems with any number of political parties. I apply this model to data from nine democratic countries, revealing clear patterns in responsiveness and bias across different types of electoral rules.
Measuring the Consequences of Delegate Selection Rules in Presidential Nominations
Formalizes normative criteria used to judge presidential selection contests by modeling the translation of citizen votes in primaries and caucuses into delegates to the national party conventions and reveals the patterns of biases and responsiveness in the Democratic and Republican nomination systems. Stephen Ansolabehere and Gary King. 1990. “Measuring the Consequences of Delegate Selection Rules in Presidential Nominations.” Journal of Politics, 52, Pp. 609–621.Abstract
In this paper, we formalize existing normative criteria used to judge presidential selection contests by modeling the translation of citizen votes in primaries and caucuses into delegates to the national party conventions. We use a statistical model that enables us to separate the form of electoral responsiveness in presidential selection systems, as well as the degree of bias toward each of the candidates. We find that (1) the Republican nomination system is more responsive to changes in citizen votes than the Democratic system; (2) non-PR primaries are always more responsive than PR primaries; (3) surprisingly, caucuses are more proportional than even primaries held under PR rules; and (4) significant bias in favor of a candidate was a good predictor of the winner of the nomination contest. We also (5) evaluate the claims of Ronald Reagan in 1976 and Jesse Jackson in 1988 that the selection systems were substantially biased against their candidates. We find no evidence to support Reagan’s claim, but substantial evidence that Jackson was correct.
Empirically Evaluating the Electoral College
Evaluates the partisan bias of the Electoral College and shows that there is little basis for reforming the system; changing to a direct popular vote for president would not even increase individual voting power. Andrew Gelman, Jonathan Katz, and Gary King. 2004. “Empirically Evaluating the Electoral College.” In Rethinking the Vote: The Politics and Prospects of American Electoral Reform, edited by Ann N Crigler, Marion R Just, and Edward J McCaffery, Pp. 75-88. New York: Oxford University Press.Abstract

The 2000 U.S. presidential election rekindled interest in possible electoral reform. While most of the popular and academic accounts focused on balloting irregularities in Florida, such as the now infamous "butterfly" ballot and mishandled absentee ballots, some also noted that this election marked only the fourth time in history that the candidate with a plurality of the popular vote did not also win the Electoral College. This "anti-democratic" outcome has fueled desire for reform or even outright elimination of the electoral college. We show that after appropriate statistical analysis of the available historical electoral data, there is little basis to argue for reforming the Electoral College. We first show that while the Electoral College may once have been biased against the Democrats, the current distribution of voters advantages neither party. Further, the electoral vote will differ from the popular vote only when the average vote shares of the two major candidates are extremely close to 50 percent. As for individual voting power, we show that while there has been much temporal variation in relative voting power over the last several decades, the voting power of individual citizens would not likely increase under a popular vote system of electing the president.

Software

JudgeIt II: A Program for Evaluating Electoral Systems and Redistricting Plans
Andrew Gelman, Gary King, and Andrew Thomas. 2010. “JudgeIt II: A Program for Evaluating Electoral Systems and Redistricting Plans”.Abstract

A program for analyzing almost any feature of district-level legislative elections data, including prediction, evaluation of redistricting plans, and estimation of counterfactual hypotheses (such as what would happen if a term-limitation amendment were imposed). This implements statistical procedures described in a series of journal articles and has been used during redistricting in many states by judges, partisans, governments, private citizens, and many others. The earlier version won the APSA Research Software Award.

Track JudgeIt Changes

Data

Mortality Studies

Methods for forecasting mortality rates (overall or for time series data cross-classified by age, sex, country, and cause); estimating mortality rates in areas without vital registration; measuring inequality in risk of death; applications to US mortality, the future of Social Security, armed conflict, heart failure, and human security.
A simulation-based comparative effectiveness analysis of policies to improve global maternal health outcomes
Zachary J. Ward, Rifat Atun, Gary King, Brenda Sequeira Dmello, and Sue J. Goldie. 4/20/2023. “A simulation-based comparative effectiveness analysis of policies to improve global maternal health outcomes.” Nature Medicine. Publisher's VersionAbstract
The Sustainable Development Goals include a target to reduce the global maternal mortality ratio (MMR) to less than 70 maternal deaths per 100,000 live births by 2030, with no individual country exceeding 140. However, on current trends the goals are unlikely to be met. We used the empirically calibrated Global Maternal Health microsimulation model, which simulates individual women in 200 countries and territories to evaluate the impact of different interventions and strategies from 2022 to 2030. Although individual interventions yielded fairly small reductions in maternal mortality, integrated strategies were more effective. A strategy to simultaneously increase facility births, improve the availability of clinical services and quality of care at facilities, and improve linkages to care would yield a projected global MMR of 72 (95% uncertainty interval (UI) = 58–87) in 2030. A comprehensive strategy adding family planning and community-based interventions would have an even larger impact, with a projected MMR of 58 (95% UI = 46–70). Although integrated strategies consisting of multiple interventions will probably be needed to achieve substantial reductions in maternal mortality, the relative priority of different interventions varies by setting. Our regional and country-level estimates can help guide priority setting in specific contexts to accelerate improvements in maternal health.
Simulation-based estimates and projections of global, regional and country-level maternal mortality by cause, 1990–2050
Zachary J. Ward, Rifat Atun, Gary King, Brenda Sequeira Dmello, and Sue J. Goldie. 4/20/2023. “Simulation-based estimates and projections of global, regional and country-level maternal mortality by cause, 1990–2050.” Nature Medicine. Publisher's VersionAbstract
Maternal mortality is a major global health challenge. Although progress has been made globally in reducing maternal deaths, measurement remains challenging given the many causes and frequent underreporting of maternal deaths. We developed the Global Maternal Health microsimulation model for women in 200 countries and territories, accounting for individual fertility preferences and clinical histories. Demographic, epidemiologic, clinical and health system data were synthesized from multiple sources, including the medical literature, Civil Registration Vital Statistics systems and Demographic and Health Survey data. We calibrated the model to empirical data from 1990 to 2015 and assessed the predictive accuracy of our model using indicators from 2016 to 2020. We projected maternal health indicators from 1990 to 2050 for each country and estimate that between 1990 and 2020 annual global maternal deaths declined by over 40% from 587,500 (95% uncertainty intervals (UI) 520,600–714,000) to 337,600 (95% UI 307,900–364,100), and are projected to decrease to 327,400 (95% UI 287,800–360,700) in 2030 and 320,200 (95% UI 267,100–374,600) in 2050. The global maternal mortality ratio is projected to decline to 167 (95% UI 142–188) in 2030, with 58 countries above 140, suggesting that on current trends, maternal mortality Sustainable Development Goal targets are unlikely to be met. Building on the development of our structural model, future research can identify context-specific policy interventions that could allow countries to accelerate reductions in maternal deaths.
Precision mapping child undernutrition for nearly 600,000 inhabited census villages in India
Rockli Kim, Avleen S. Bijral, Yun Xu, Xiuyuan Zhang, Jeffrey C. Blossom, Akshay Swaminathan, Gary King, Alok Kumar, Rakesh Sarwal, Juan M. Lavista Ferres, and S.V. Subramanian. 2021. “Precision mapping child undernutrition for nearly 600,000 inhabited census villages in India.” Proceedings of the National Academy of Sciences, 118, 18, Pp. 1-11. Publisher's VersionAbstract
There are emerging opportunities to assess health indicators at truly small areas with increasing availability of data geocoded to micro geographic units and advanced modeling techniques. The utility of such fine-grained data can be fully leveraged if linked to local governance units that are accountable for implementation of programs and interventions. We used data from the 2011 Indian Census for village-level demographic and amenities features and the 2016 Indian Demographic and Health Survey in a bias-corrected semisupervised regression framework to predict child anthropometric failures for all villages in India. Of the total geographic variation in predicted child anthropometric failure estimates, 54.2 to 72.3% were attributed to the village level followed by 20.6 to 39.5% to the state level. The mean predicted stunting was 37.9% (SD: 10.1%; IQR: 31.2 to 44.7%), and substantial variation was found across villages ranging from less than 5% for 691 villages to over 70% in 453 villages. Estimates at the village level can potentially shift the paradigm of policy discussion in India by enabling more informed prioritization and precise targeting. The proposed methodology can be adapted and applied to diverse population health indicators, and in other contexts, to reveal spatial heterogeneity at a finer geographic scale and identify local areas with the greatest needs and with direct implications for actions to take place.
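The attribution of variation to villages versus states is the kind of quantity that, in a nested multilevel model, is usually summarized with variance partition coefficients. As a general formula (not necessarily the paper's exact specification), the share of total variation attributed to level \ell is

\frac{\sigma^{2}_{\ell}}{\sum_{k}\sigma^{2}_{k}},

where \sigma^{2}_{k} is the random-effect variance estimated at each nested geographic level k (for example village, district, and state); percentages like those quoted above are shares of this general form.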
Population-scale Longitudinal Mapping of COVID-19 Symptoms, Behaviour and Testing
William E. Allen, Han Altae-Tran, James Briggs, Xin Jin, Glen McGee, Andy Shi, Rumya Raghavan, Mireille Kamariza, Nicole Nova, Albert Pereta, Chris Danford, Amine Kamel, Patrik Gothe, Evrhet Milam, Jean Aurambault, Thorben Primke, Weijie Li, Josh Inkenbrandt, Tuan Huynh, Evan Chen, Christina Lee, Michael Croatto, Helen Bentley, Wendy Lu, Robert Murray, Mark Travassos, Brent A. Coull, John Openshaw, Casey S. Greene, Ophir Shalem, Gary King, Ryan Probasco, David R. Cheng, Ben Silbermann, Feng Zhang, and Xihong Lin. 8/26/2020. “Population-scale Longitudinal Mapping of COVID-19 Symptoms, Behaviour and Testing.” Nature Human Behavior. Publisher's VersionAbstract
Despite the widespread implementation of public health measures, coronavirus disease 2019 (COVID-19) continues to spread in the United States. To facilitate an agile response to the pandemic, we developed How We Feel, a web and mobile application that collects longitudinal self-reported survey responses on health, behaviour and demographics. Here, we report results from over 500,000 users in the United States from 2 April 2020 to 12 May 2020. We show that self-reported surveys can be used to build predictive models to identify likely COVID-19-positive individuals. We find evidence among our users for asymptomatic or presymptomatic presentation; show a variety of exposure, occupational and demographic risk factors for COVID-19 beyond symptoms; reveal factors for which users have been SARS-CoV-2 PCR tested; and highlight the temporal dynamics of symptoms and self-isolation behaviour. These results highlight the utility of collecting a diverse set of symptomatic, demographic, exposure and behavioural self-reported data to fight the COVID-19 pandemic.
Building an International Consortium for Tracking Coronavirus Health Status
Eran Segal, Feng Zhang, Xihong Lin, Gary King, Ophir Shalem, Smadar Shilo, William E. Allen, Yonatan H. Grad, Casey S. Greene, Faisal Alquaddoomi, Simon Anders, Ran Balicer, Tal Bauman, Ximena Bonilla, Gisel Booman, Andrew T. Chan, Ori Cohen, Silvano Coletti, Natalie Davidson, Yuval Dor, David A. Drew, Olivier Elemento, Georgina Evans, Phil Ewels, Joshua Gale, Amir Gavrieli, Benjamin Geiger, Iman Hajirasouliha, Roman Jerala, Andre Kahles, Olli Kallioniemi, Ayya Keshet, Gregory Landua, Tomer Meir, Aline Muller, Long H. Nguyen, Matej Oresic, Svetlana Ovchinnikova, Hedi Peterson, Jay Rajagopal, Gunnar Rätsch, Hagai Rossman, Johan Rung, Andrea Sboner, Alexandros Sigaras, Tim Spector, Ron Steinherz, Irene Stevens, Jaak Vilo, Paul Wilmes, and CCC (Coronavirus Census Collective). 8/2020. “Building an International Consortium for Tracking Coronavirus Health Status.” Nature Medicine, 26, Pp. 1161-1165. Publisher's VersionAbstract
Information is the most potent protective weapon we have to combat a pandemic, at both the individual and global level. For individuals, information can help us make personal decisions and provide a sense of security. For the global community, information can inform policy decisions and offer critical insights into the epidemic of COVID-19 disease. Fully leveraging the power of information, however, requires large amounts of data and access to it. To achieve this, we are making steps to form an international consortium, Coronavirus Census Collective (CCC, coronaviruscensuscollective.org), that will serve as a hub for integrating information from multiple data sources that can be utilized to understand, monitor, predict, and combat global pandemics. These sources may include self-reported health status through surveys (including mobile apps), results of diagnostic laboratory tests, and other static and real-time geospatial data. This collective effort to track and share information will be invaluable in predicting hotspots of disease outbreak, identifying which factors control the rate of spreading, informing immediate policy decisions, evaluating the effectiveness of measures taken by health organizations on pandemic control, and providing critical insight on the etiology of COVID-19. It will also help individuals stay informed on this rapidly evolving situation and contribute to other global efforts to slow the spread of disease. In the past few weeks, several initiatives across the globe have surfaced to use daily self-reported symptoms as a means to track disease spread, predict outbreak locations, guide population measures and help in the allocation of healthcare resources. The aim of this paper is to put out a call to standardize these efforts and spark a collaborative effort to maximize the global gain while protecting participant privacy.
Survey Data and Human Computation for Improved Flu Tracking
Stefan Wojcik, Avleen Bijral, Richard Johnston, Juan Miguel Lavista, Gary King, Ryan Kennedy, Alessandro Vespignani, and David Lazer. 2021. “Survey Data and Human Computation for Improved Flu Tracking.” Nature Communications, 12, 194, Pp. 1-8. Publisher's VersionAbstract
While digital trace data from sources like search engines hold enormous potential for tracking and understanding human behavior, these streams of data lack information about the actual experiences of those individuals generating the data. Moreover, most current methods ignore or under-utilize human processing capabilities that allow humans to solve problems not yet solvable by computers (human computation). We demonstrate how behavioral research, linking digital and real-world behavior, along with human computation, can be utilized to improve the performance of studies using digital data streams. This study looks at the use of search data to track prevalence of Influenza-Like Illness (ILI). We build a behavioral model of flu search based on survey data linked to users’ online browsing data. We then utilize human computation for classifying search strings. Leveraging these resources, we construct a tracking model of ILI prevalence that outperforms strong historical benchmarks using only a limited stream of search data and lends itself to tracking ILI in smaller geographic units. While this paper only addresses searches related to ILI, the method we describe has potential for tracking a broad set of phenomena in near real-time.
Evaluating COVID-19 Public Health Messaging in Italy: Self-Reported Compliance and Growing Mental Health Concerns
Soubhik Barari, Stefano Caria, Antonio Davola, Paolo Falco, Thiemo Fetzer, Stefano Fiorin, Lukas Hensel, Andriy Ivchenko, Jon Jachimowicz, Gary King, Gordon Kraft-Todd, Alice Ledda, Mary MacLennan, Lucian Mutoi, Claudio Pagani, Elena Reutskaja, Christopher Roth, and Federico Raimondi Slepoi. 2020. “Evaluating COVID-19 Public Health Messaging in Italy: Self-Reported Compliance and Growing Mental Health Concerns”. Publisher's VersionAbstract

Purpose: The COVID-19 death rate in Italy continues to climb, surpassing that in every other country. We implement one of the first nationally representative surveys about this unprecedented public health crisis and use it to evaluate the Italian government’s public health efforts and citizen responses.
Findings: (1) Public health messaging is being heard. Except for slightly lower compliance among young adults, all subgroups we studied understand how to keep themselves and others safe from the SARS-CoV-2 virus. Remarkably, even those who do not trust the government, or think the government has been untruthful about the crisis, believe the messaging and claim to be acting in accordance with it. (2) The quarantine is beginning to have serious negative effects on the population’s mental health.
Policy Recommendations: Communications focus should move from explaining to citizens that they should stay at home to what they can do there. We need interventions that make staying at home and following public health protocols more desirable. These interventions could include virtual social interactions, such as online social reading activities, classes, exercise routines, etc. — all designed to reduce the boredom of long term social isolation and to increase the attractiveness of following public health recommendations. Interventions like these will grow in importance as the crisis wears on around the world, and staying inside wears on people.

Replication data for this study in dataverse

Forecasting Mortality

Statistical Security for Social Security
Samir Soneji and Gary King. 2012. “Statistical Security for Social Security.” Demography, 49, 3, Pp. 1037-1060. Publisher's VersionAbstract

The financial viability of Social Security, the single largest U.S. Government program, depends on accurate forecasts of the solvency of its intergenerational trust fund. We begin by detailing information necessary for replicating the Social Security Administration’s (SSA’s) forecasting procedures, which until now has been unavailable in the public domain. We then offer a way to improve the quality of these procedures via better age- and sex-specific mortality forecasts. The most recent SSA mortality forecasts were based on the best available technology at the time, which was a combination of linear extrapolation and qualitative judgments. Unfortunately, linear extrapolation excludes known risk factors and is inconsistent with long-standing demographic patterns such as the smoothness of age profiles. Modern statistical methods typically outperform even the best qualitative judgments in these contexts. We show how to use such methods here, enabling researchers to forecast using far more information, such as the known risk factors of smoking and obesity and known demographic patterns. Including this extra information makes a substantial difference: For example, by only improving mortality forecasting methods, we predict three fewer years of net surplus, $730 billion less in Social Security trust funds, and program costs that are 0.66% of projected taxable payroll higher than SSA projections by 2031. More important than specific numerical estimates are the advantages of transparency, replicability, reduction of uncertainty, and what may be the resulting lower vulnerability to the politicization of program forecasts. In addition, by offering with this paper software and detailed replication information, we hope to marshal the efforts of the research community to include ever more informative inputs and to continue to reduce the uncertainties in Social Security forecasts.

This work builds on our article that provides forecasts of US Mortality rates (see King and Soneji, The Future of Death in America), a book developing improved methods for forecasting mortality (Girosi and King, Demographic Forecasting), all data we used (King and Soneji, replication data sets), and open source software that implements the methods (Girosi and King, YourCast).  Also available is a New York Times Op-Ed based on this work (King and Soneji, Social Security: It’s Worse Than You Think), and a replication data set for the Op-Ed (King and Soneji, replication data set).

Understanding the Lee-Carter Mortality Forecasting Method
Federico Girosi and Gary King. 2007. “Understanding the Lee-Carter Mortality Forecasting Method”.Abstract
We demonstrate here several previously unrecognized or insufficiently appreciated properties of the Lee-Carter mortality forecasting approach, the dominant method used in both the academic literature and practical applications. We show that this model is a special case of a considerably simpler, and less often biased, random walk with drift model, and prove that the age profile forecast from both approaches will always become less smooth and unrealistic after a point (when forecasting forward or backwards in time) and will eventually deviate from any given baseline. We use these and other properties we demonstrate to suggest when the model would be most applicable in practice.
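For reference, the Lee-Carter model discussed here is conventionally written, with its usual normalizations, as

\ln m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t}, \qquad k_t = k_{t-1} + d + e_t, \qquad \sum_x b_x = 1, \quad \sum_t k_t = 0,

where m_{x,t} is the mortality rate at age x in year t, a_x the average age profile, b_x the age-specific sensitivity to the mortality index k_t, and k_t is forecast as a random walk with drift d. The paper's argument concerns what this structure implies about the smoothness and realism of forecast age profiles as the horizon grows.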
Demographic Forecasting
Federico Girosi and Gary King. 2008. Demographic Forecasting. Princeton: Princeton University Press.Abstract

We introduce a new framework for forecasting age-sex-country-cause-specific mortality rates that incorporates considerably more information, and thus has the potential to forecast much better, than any existing approach. Mortality forecasts are used in a wide variety of academic fields, and for global and national health policy making, medical and pharmaceutical research, and social security and retirement planning.

As it turns out, the tools we developed in pursuit of this goal also have broader statistical implications, in addition to their use for forecasting mortality or other variables with similar statistical properties. First, our methods make it possible to include different explanatory variables in a time series regression for each cross-section, while still borrowing strength from one regression to improve the estimation of all. Second, we show that many existing Bayesian (hierarchical and spatial) models with explanatory variables use prior densities that incorrectly formalize prior knowledge. Many demographers and public health researchers have fortuitously avoided this problem so prevalent in other fields by using prior knowledge only as an ex post check on empirical results, but this approach excludes considerable information from their models. We show how to incorporate this demographic knowledge into a model in a statistically appropriate way. Finally, we develop a set of tools useful for developing models with Bayesian priors in the presence of partial prior ignorance. This approach also provides many of the attractive features claimed by the empirical Bayes approach, but fully within the standard Bayesian theory of inference.
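As a rough illustration of the kind of prior the book advocates (illustrative notation only, not the book's full specification), demographic knowledge such as "age profiles of log mortality are smooth" can be encoded directly on expected outcomes rather than on regression coefficients, for example

p(\mu) \;\propto\; \exp\!\Big(-\tfrac{1}{2\theta}\sum_{t}\sum_{x}\big(\mu_{x+1,t}-\mu_{x,t}\big)^{2}\Big),

where \mu_{x,t} is the expected log mortality rate at age x and time t and \theta controls how strongly neighboring ages are pulled together; analogous penalties over time or across cross-sections are what allow separate regressions, each with its own covariates, to borrow strength from one another.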

The Future of Death in America
Gary King and Samir Soneji. 2011. “The Future of Death in America.” Demographic Research, 25, 1, Pp. 1--38. WebsiteAbstract

Population mortality forecasts are widely used for allocating public health expenditures, setting research priorities, and evaluating the viability of public pensions, private pensions, and health care financing systems. In part because existing methods seem to forecast worse when based on more information, most forecasts are still based on simple linear extrapolations that ignore known biological risk factors and other prior information. We adapt a Bayesian hierarchical forecasting model capable of including more known health and demographic information than has previously been possible. This leads to the first age- and sex-specific forecasts of American mortality that simultaneously incorporate, in a formal statistical model, the effects of the recent rapid increase in obesity, the steady decline in tobacco consumption, and the well known patterns of smooth mortality age profiles and time trends. Formally including new information in forecasts can matter a great deal. For example, we estimate an increase in male life expectancy at birth from 76.2 years in 2010 to 79.9 years in 2030, which is 1.8 years greater than the U.S. Social Security Administration projection and 1.5 years more than U.S. Census projection. For females, we estimate more modest gains in life expectancy at birth over the next twenty years from 80.5 years to 81.9 years, which is virtually identical to the Social Security Administration projection and 2.0 years less than U.S. Census projections. We show that these patterns are also likely to greatly affect the aging American population structure. We offer an easy-to-use approach so that researchers can include other sources of information and potentially improve on our forecasts too.

Estimating Overall and Cause-Specific Mortality Rates

Inexpensive methods of estimating the overall and cause-specific mortality rates from surveys when vital registration (death certificates) or other monitoring is unavailable or inadequate.

A method for estimating cause-specific mortality from "verbal autopsy" data that is less expensive and more reliable, requires fewer assumptions, and is normally more accurate.
Estimating Incidence Curves of Several Infections Using Symptom Surveillance Data
Edward Goldstein, Benjamin J Cowling, Allison E Aiello, Saki Takahashi, Gary King, Ying Lu, and Marc Lipsitch. 2011. “Estimating Incidence Curves of Several Infections Using Symptom Surveillance Data.” PLoS ONE, 6, 8, Pp. e23380.Abstract

We introduce a method for estimating incidence curves of several co-circulating infectious pathogens, where each infection has its own probabilities of particular symptom profiles. Our deconvolution method utilizes weekly surveillance data on symptoms from a defined population as well as additional data on symptoms from a sample of virologically confirmed infectious episodes. We illustrate this method by numerical simulations and by using data from a survey conducted on the University of Michigan campus. Last, we describe the data needs to make such estimates accurate.

Link to PLoS version

Designing Verbal Autopsy Studies
Gary King, Ying Lu, and Kenji Shibuya. 2010. “Designing Verbal Autopsy Studies.” Population Health Metrics, 8, 19.Abstract
Background: Verbal autopsy analyses are widely used for estimating cause-specific mortality rates (CSMR) in the vast majority of the world without high quality medical death registration. Verbal autopsies -- survey interviews with the caretakers of imminent decedents -- stand in for medical examinations or physical autopsies, which are infeasible or culturally prohibited. Methods and Findings: We introduce methods, simulations, and interpretations that can improve the design of automated, data-derived estimates of CSMRs, building on a new approach by King and Lu (2008). Our results generate advice for choosing symptom questions and sample sizes that is easier to satisfy than existing practices. For example, most prior effort has been devoted to searching for symptoms with high sensitivity and specificity, which has rarely if ever succeeded with multiple causes of death. In contrast, our approach makes this search irrelevant because it can produce unbiased estimates even with symptoms that have very low sensitivity and specificity. In addition, the new method is optimized for survey questions caretakers can easily answer rather than questions physicians would ask themselves. We also offer an automated method of weeding out biased symptom questions and advice on how to choose the number of causes of death, symptom questions to ask, and observations to collect, among others. Conclusions: With the advice offered here, researchers should be able to design verbal autopsy surveys and conduct analyses with greatly reduced statistical biases and research costs.
Deaths From Heart Failure: Using Coarsened Exact Matching to Correct Cause of Death Statistics
Gretchen Stevens, Gary King, and Kenji Shibuya. 2010. “Deaths From Heart Failure: Using Coarsened Exact Matching to Correct Cause of Death Statistics.” Population Health Metrics, 8, 6.Abstract

Background: Incomplete information on death certificates makes recorded cause of death data less useful for public health monitoring and planning. Certifying physicians sometimes list only the mode of death (and in particular, list heart failure) without indicating the underlying disease(s) that gave rise to the death. This can prevent valid epidemiologic comparisons across countries and over time. Methods and Results: We propose that coarsened exact matching be used to infer the underlying causes of death where only the mode of death is known; we focus on the case of heart failure in U.S., Mexican and Brazilian death records. Redistribution algorithms derived using this method assign the largest proportion of heart failure deaths to ischemic heart disease in all three countries (53%, 26% and 22%), with larger proportions assigned to hypertensive heart disease and diabetes in Mexico and Brazil (16% and 23% vs. 7% for hypertensive heart disease and 13% and 9% vs. 6% for diabetes). Reassigning these heart failure deaths increases US ischemic heart disease mortality rates by 6%. Conclusions: The frequency with which physicians list heart failure in the causal chain for various underlying causes of death allows for inference about how physicians use heart failure on the death certificate in different settings. This easy-to-use method has the potential to reduce bias and increase comparability in cause-of-death data, thereby improving the public health utility of death records. Key Words: vital statistics, heart failure, population health, mortality, epidemiology

Verbal Autopsy Methods with Multiple Causes of Death
Gary King and Ying Lu. 2008. “Verbal Autopsy Methods with Multiple Causes of Death.” Statistical Science, 23, Pp. 78–91.Abstract
Verbal autopsy procedures are widely used for estimating cause-specific mortality in areas without medical death certification. Data on symptoms reported by caregivers along with the cause of death are collected from a medical facility, and the cause-of-death distribution is estimated in the population where only symptom data are available. Current approaches analyze only one cause at a time, involve assumptions judged difficult or impossible to satisfy, and require expensive, time consuming, or unreliable physician reviews, expert algorithms, or parametric statistical models. By generalizing current approaches to analyze multiple causes, we show how most of the difficult assumptions underlying existing methods can be dropped. These generalizations also make physician review, expert algorithms, and parametric statistical assumptions unnecessary. With theoretical results, and empirical analyses in data from China and Tanzania, we illustrate the accuracy of this approach. While no method of analyzing verbal autopsy data, including the more computationally intensive approach offered here, can give accurate estimates in all circumstances, the procedure offered is conceptually simpler, less expensive, more general, as or more replicable, and easier to use in practice than existing approaches. We also show how our focus on estimating aggregate proportions, which are the quantities of primary interest in verbal autopsy studies, may also greatly reduce the assumptions necessary, and thus improve the performance of, many individual classifiers in this and other areas. As a companion to this paper, we also offer easy-to-use software that implements the methods discussed herein.
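The aggregate identity this approach exploits is simple: the distribution of symptom profiles among deaths in the population is a mixture of the profile distributions for each cause, weighted by the cause-specific mortality fractions. The Python sketch below illustrates only that identity (with a hypothetical function name and toy numbers); the published estimator additionally handles subsampling of symptom profiles, estimation uncertainty, and related complications.

import numpy as np
from scipy.optimize import nnls

def estimate_csmf(pop_profile_freq, profile_given_cause):
    # Solve P(S) = P(S|D) P(D) for the cause-specific mortality fractions P(D).
    # pop_profile_freq: length-J vector of symptom-profile frequencies among
    #   population deaths (from the verbal autopsy survey).
    # profile_given_cause: J x K matrix of profile frequencies by cause,
    #   estimated from deaths with medically certified causes.
    fractions, _ = nnls(profile_given_cause, pop_profile_freq)
    return fractions / fractions.sum()

# Toy example with J = 3 symptom profiles and K = 2 causes:
P_s_given_d = np.array([[0.7, 0.2],
                        [0.2, 0.3],
                        [0.1, 0.5]])
true_csmf = np.array([0.6, 0.4])
observed = P_s_given_d @ true_csmf
print(estimate_csmf(observed, P_s_given_d))  # recovers approximately [0.6, 0.4]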

Armed Conflict as a Public Health Problem
Evidence of the massive selection bias in all data on mortality from war (vital registration systems rarely continue to operate when war begins). Uncertainty in mortality estimates from major wars is as large as the estimates. Christopher JL Murray, Gary King, Alan D Lopez, Niels Tomijima, and Etienne Krug. 2002. “Armed Conflict as a Public Health Problem.” BMJ (British Medical Journal), 324, Pp. 346–349.Abstract
Armed conflict is a major cause of injury and death worldwide, but we need much better methods of quantification before we can accurately assess its effect. Armed conflict between warring states and groups within states has been a major cause of ill health and mortality for most of human history. Conflict obviously causes deaths and injuries on the battlefield, but it also has health consequences stemming from the displacement of populations, the breakdown of health and social services, and the heightened risk of disease transmission. Despite the size of the health consequences, military conflict has not received the same attention from public health research and policy as many other causes of illness and death. In contrast, political scientists have long studied the causes of war but have primarily been interested in the decision of elite groups to go to war, not in human death and misery. We review the limited knowledge on the health consequences of conflict, suggest ways to improve measurement, and discuss the potential for risk assessment and for preventing and ameliorating the consequences of conflict.
Death by Survey: Estimating Adult Mortality without Selection Bias from Sibling Survival Data
Unbiased estimates of mortality rates from surveys about the survival of siblings and others; explains and reduces biases in existing methods. Emmanuela Gakidou and Gary King. 2006. “Death by Survey: Estimating Adult Mortality without Selection Bias from Sibling Survival Data.” Demography, 43, Pp. 569–585.Abstract
The widely used methods for estimating adult mortality rates from sample survey responses about the survival of siblings, parents, spouses, and others depend crucially on an assumption that we demonstrate does not hold in real data. We show that when this assumption is violated – so that the mortality rate varies with sibship size – mortality estimates can be massively biased. By using insights from work on the statistical analysis of selection bias, survey weighting, and extrapolation problems, we propose a new and relatively simple method of recovering the mortality rate with both greatly reduced potential for bias and increased clarity about the source of necessary assumptions.
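The selection problem at the heart of this paper can be stated schematically (notation chosen here for illustration): if q_n is the death probability in sibships of size n and \pi_n the share of the population born into sibships of size n, the target is

\bar{q} = \sum_{n} \pi_n q_n,

but a survey reaches a sibship only through its surviving members, so each sibship enters the sample with probability roughly proportional to its number of survivors, sibships in which everyone died are never observed, and low-mortality sibships are over-represented. When q_n does not vary with n this selection washes out, which is the assumption the authors show fails in real data; when it does vary, the naive pooled estimate converges to something other than \bar{q}, motivating the reweighting and extrapolation strategy the paper proposes.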

Uses of Mortality Rates

Rethinking Human Security
Provides a rigorous and measurable definition of human security; discusses the improvements in data collection and methods of forecasting necessary to measure human security; and introduces an agenda to enhance human security that follows logically in the areas of risk assessment, prevention, protection, and compensation. Gary King and Christopher J.L. Murray. 2002. “Rethinking Human Security.” Political Science Quarterly, 116, Pp. 585–610.Abstract

In the last two decades, the international community has begun to conclude that attempts to ensure the territorial security of nation-states through military power have failed to improve the human condition. Despite astronomical levels of military spending, deaths due to military conflict have not declined. Moreover, even when the borders of some states are secure from foreign threats, the people within those states do not necessarily have freedom from crime, enough food, proper health care, education, or political freedom. In response to these developments, the international community has gradually moved to combine economic development with military security and other basic human rights to form a new concept of "human security". Unfortunately, by common assent the concept lacks both a clear definition, consistent with the aims of the international community, and any agreed upon measure of it. In this paper, we propose a simple, rigorous, and measurable definition of human security: the expected number of years of future life spent outside the state of "generalized poverty". Generalized poverty occurs when an individual falls below the threshold in any key domain of human well-being. We consider improvements in data collection and methods of forecasting that are necessary to measure human security and then introduce an agenda for research and action to enhance human security that follows logically in the areas of risk assessment, prevention, protection, and compensation.
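The definition proposed here can be written compactly as an expectation (the notation is only illustrative): for individual i observed at time t_0,

\mathrm{HS}_i \;=\; E\Big[\sum_{t > t_0} \mathbf{1}\big(i \text{ is alive at } t \text{ and above the threshold in every key domain of well-being at } t\big)\Big],

that is, the expected number of future years of life spent outside generalized poverty, computed in the same spirit as a life expectancy but over states defined by well-being thresholds as well as survival.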

The Effects of International Monetary Fund Loans on Health Outcomes
A Perspective article on the effect of International Monetary Fund loans on tuberculosis mortality rates: Megan Murray and Gary King. 2008. “The Effects of International Monetary Fund Loans on Health Outcomes.” PLoS Medicine, 5.Abstract
A "Perspective" article that discusses an article by David Stuckler and colleagues showing that, in Eastern European and former Soviet countries, participation in International Monetary Fund economic programs have been associated with higher mortality rates from tuberculosis.
Statistical Security for Social Security
Samir Soneji and Gary King. 2012. “Statistical Security for Social Security.” Demography, 49, 3, Pp. 1037-1060 . Publisher's versionAbstract

The financial viability of Social Security, the single largest U.S. Government program, depends on accurate forecasts of the solvency of its intergenerational trust fund. We begin by detailing information necessary for replicating the Social Security Administration’s (SSA’s) forecasting procedures, which until now has been unavailable in the public domain. We then offer a way to improve the quality of these procedures due to age-and sex-specific mortality forecasts. The most recent SSA mortality forecasts were based on the best available technology at the time, which was a combination of linear extrapolation and qualitative judgments. Unfortunately, linear extrapolation excludes known risk factors and is inconsistent with long-standing demographic patterns such as the smoothness of age profiles. Modern statistical methods typically outperform even the best qualitative judgments in these contexts. We show how to use such methods here, enabling researchers to forecast using far more information, such as the known risk factors of smoking and obesity and known demographic patterns. Including this extra information makes a sub¬stantial difference: For example, by only improving mortality forecasting methods, we predict three fewer years of net surplus, $730 billion less in Social Security trust funds, and program costs that are 0.66% greater of projected taxable payroll compared to SSA projections by 2031. More important than specific numerical estimates are the advantages of transparency, replicability, reduction of uncertainty, and what may be the resulting lower vulnerability to the politicization of program forecasts. In addition, by offering with this paper software and detailed replication information, we hope to marshal the efforts of the research community to include ever more informative inputs and to continue to reduce the uncertainties in Social Security forecasts.

This work builds on our article that provides forecasts of US mortality rates (see King and Soneji, The Future of Death in America), a book developing improved methods for forecasting mortality (Girosi and King, Demographic Forecasting), all data we used (King and Soneji, replication data sets), and open source software that implements the methods (Girosi and King, YourCast). Also available is a New York Times Op-Ed based on this work (King and Soneji, Social Security: It’s Worse Than You Think), and a replication data set for the Op-Ed (King and Soneji, replication data set).
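For intuition, here is a minimal sketch of the style of mortality model the paper advocates, simplified from the approach in Girosi and King's Demographic Forecasting; the notation and the form of the prior are illustrative assumptions rather than the exact specification used in the paper. Log-mortality for age group a in year t is regressed on covariates such as smoking and obesity, and a Bayesian smoothness prior shrinks neighboring age groups toward similar expected age profiles, the demographic regularity that separate linear extrapolations ignore:

\[
\log m_{at} = Z_{at}\,\beta_a + \epsilon_{at},
\qquad
p(\beta) \;\propto\; \exp\!\Bigl\{-\tfrac{\theta}{2}\sum_{t}\sum_{a}\bigl(Z_{at}\beta_a - Z_{a+1,t}\,\beta_{a+1}\bigr)^{2}\Bigr\},
\]

so forecasts borrow strength across ages and can incorporate known risk factors, rather than projecting each age group's past trend in isolation.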

Determinants of Inequality in Child Survival: Results from 39 Countries
Emmanuela Gakidou and Gary King. 2003. “Determinants of Inequality in Child Survival: Results from 39 Countries.” In Health Systems Performance Assessment: Debates, Methods and Empiricism, edited by Christopher J.L. Murray and David B. Evans, Pp. 497-502. Geneva: World Health Organization.Abstract

Few would disagree that health policies and programmes ought to be based on valid, timely and relevant information, focused on those aspects of health development that are in greatest need of improvement. For example, vaccination programmes rely heavily on information on cases and deaths to document needs and to monitor progress on childhood illness and mortality. The same strong information basis is necessary for policies on health inequality. The reduction of health inequality is widely accepted as a key goal for societies, but any policy needs reliable research on the extent and causes of health inequality. Given that child deaths still constitute 19% of all deaths globally and 24% of all deaths in developing countries (1), reducing inequalities in child survival is a good beginning. Total health inequality decomposes into two components:

total = between + within

The between-group component of total health inequality has been studied extensively by numerous scholars. They have expertly analysed the causes of differences in health status and mortality across population subgroups, defined by income, education, race/ethnicity, country, region, social class, and other group identifiers (2–9).
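One standard way to formalize this decomposition (a sketch in our notation, not necessarily the exact formulation used in the chapter): if y_i denotes an individual's health outcome or risk and g indexes population subgroups, the law of total variance separates total inequality into the between-group component studied in this prior literature and the within-group component added here,

\[
\underbrace{\operatorname{Var}(y_i)}_{\text{total}}
\;=\;
\underbrace{\operatorname{Var}_g\!\bigl(\mathrm{E}[\,y_i \mid g\,]\bigr)}_{\text{between}}
\;+\;
\underbrace{\mathrm{E}_g\!\bigl(\operatorname{Var}[\,y_i \mid g\,]\bigr)}_{\text{within}} .
\]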


Measuring Total Health Inequality: Adding Individual Variation to Group-Level Differences
A method to estimate total and within-group inequality in health (all prior research is about mean differences between groups): Emmanuela Gakidou and Gary King. 2002. “Measuring Total Health Inequality: Adding Individual Variation to Group-Level Differences.” BioMed Central: International Journal for Equity in Health, 1.Abstract
Background: Studies have revealed large variations in average health status across social, economic, and other groups. No study exists on the distribution of the risk of ill-health across individuals, either within groups or across all people in a society, and as such a crucial piece of total health inequality has been overlooked. Some of the reason for this neglect has been that the risk of death, which forms the basis for most measures, is impossible to observe directly and difficult to estimate.

Methods: We develop a measure of total health inequality – encompassing all inequalities among people in a society, including variation between and within groups – by adapting a beta-binomial regression model. We apply it to children under age two in 50 low- and middle-income countries. Our method has been adopted by the World Health Organization and is being implemented in surveys around the world, and preliminary estimates have appeared in the World Health Report (2000).

Results: Countries with similar average child mortality differ considerably in total health inequality. Liberia and Mozambique have the largest inequalities in child survival, while Colombia, the Philippines and Kazakhstan have the lowest levels among the countries measured.

Conclusions: Total health inequality estimates should be routinely reported alongside average levels of health in populations and groups, as they reveal important policy-related information not otherwise knowable. This approach enables meaningful comparisons of inequality across countries and future analyses of the determinants of inequality.
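A hedged sketch of the kind of beta-binomial setup the abstract describes, in our notation (the paper's exact parameterization may differ): for each survey unit i (for example, a mother) with n_i children and d_i deaths before age two, the underlying death risk pi_i is allowed to vary across individuals,

\[
d_i \mid \pi_i \sim \operatorname{Binomial}(n_i, \pi_i),
\qquad
\pi_i \sim \operatorname{Beta}\!\bigl(\mu_i\gamma,\,(1-\mu_i)\gamma\bigr),
\qquad
\operatorname{logit}(\mu_i) = x_i^{\top}\beta,
\]

so the regression captures between-group differences in mean risk while the dispersion parameter gamma captures how much individual risks vary around that mean; total health inequality is then summarized from the estimated distribution of pi_i across all individuals.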
Explaining Systematic Bias and Nontransparency in US Social Security Administration Forecasts
Konstantin Kashin, Gary King, and Samir Soneji. 2015. “Explaining Systematic Bias and Nontransparency in US Social Security Administration Forecasts.” Political Analysis, 23, 3, Pp. 336-362. Publisher's VersionAbstract

The accuracy of U.S. Social Security Administration (SSA) demographic and financial forecasts is crucial for the solvency of its Trust Funds, other government programs, industry decision making, and the evidence base of many scholarly articles. Because SSA makes public little replication information and uses qualitative and antiquated statistical forecasting methods, fully independent alternative forecasts (and the ability to score policy proposals to change the system) are nonexistent. Yet, no systematic evaluation of SSA forecasts has ever been published by SSA or anyone else --- until a companion paper to this one (King, Kashin, and Soneji, 2015a). We show that SSA's forecasting errors were approximately unbiased until about 2000, but then began to grow quickly, with increasingly overconfident uncertainty intervals. Moreover, the errors are all in the same potentially dangerous direction, making the Social Security Trust Funds look healthier than they actually are. We extend and then attempt to explain these findings with evidence from a large number of interviews we conducted with participants at every level of the forecasting and policy processes. We show that SSA's forecasting procedures meet all the conditions the modern social-psychology and statistical literatures demonstrate make bias likely. When those conditions mixed with potent new political forces trying to change Social Security, SSA's actuaries hunkered down trying hard to insulate their forecasts from strong political pressures. Unfortunately, this otherwise laudable resistance to undue influence, along with their ad hoc qualitative forecasting models, led the actuaries to miss important changes in the input data. Retirees began living longer lives and drawing benefits longer than predicted by simple extrapolations. We also show that the solution to this problem involves SSA or Congress implementing in government two of the central projects of political science over the last quarter century: [1] promoting transparency in data and methods and [2] replacing with formal statistical models large numbers of qualitative decisions too complex for unaided humans to make optimally.

Teaching and Administration

Publications and other projects designed to improve teaching, learning, and university administration, as well as broader writings on the future of the social sciences.
Statistical Intuition Without Coding (or Teachers)
Natalie Ayers, Gary King, Zagreb Mukerjee, and Dominic Skinnion. Working Paper. “Statistical Intuition Without Coding (or Teachers)”.Abstract
Two features of quantitative political methodology make teaching and learning especially difficult: (1) Each new concept in probability, statistics, and inference builds on all previous (and sometimes all other relevant) concepts; and (2) motivating substantively oriented students, by teaching these abstract theories simultaneously with the practical details of a statistical programming language (such as R), makes learning each subject harder. We address both problems through a new type of automated teaching tool that helps students see the big theoretical picture and all its separate parts at the same time without having to simultaneously learn to program. This tool, which we make available via one click in a web browser, can be used in a traditional methods class, but is also designed to work without instructor supervision.
Education and Scholarship by Video
Gary King. 2021. “Education and Scholarship by Video”. [Direct link to paper]Abstract

When word processors were first introduced into the workplace, they turned scholars into typists. But they also improved our work: Turnaround time for new drafts dropped from days to seconds. Rewriting became easier and more common, and our papers, educational efforts, and research output improved. I discuss the advantages of and mechanisms for doing the same with do-it-yourself video recordings of research talks and class lectures, so that they may become a fully respected channel for scholarly output and education, alongside books and articles. I consider innovations in video design to optimize education and communication, along with technology to make this change possible.

Excerpts of this paper appeared in Political Science Today (Vol. 1, No. 3, August 2021: Pp.5-6, copy here) and in APSAEducate. See also my recorded videos here.

Instructional Support Platform for Interactive Learning Platforms (2nd)
Gary King, Eric Mazur, Kelly Miller, and Brian Lukoff. 6/23/2020. “Instructional Support Platform for Interactive Learning Platforms (2nd).” United States of America US 10,692,391 B2 (U.S. Patent and Trademark Office).Abstract
In various embodiments, subject matter for improving discussions in connection with an educational resource is identified and summarized by analyzing annotations made by students assigned to a discussion group to identify high-quality annotations likely to generate responses and stimulate discussion threads, identifying clusters of high-quality annotations relating to the same portion or related portions of the educational resource, extracting and summarizing text from the annotations, and combining, in an electronically represented document, the extracted and summarized text and (i) at least some of the annotations and the portion or portions of the educational resource or (ii) clickable links thereto.
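The abstract describes a pipeline: score annotations for quality, cluster the high-quality ones by the portion of the resource they reference, summarize each cluster, and assemble a document. Below is a minimal Python sketch of that flow; the scoring heuristic, clustering rule, and field names are illustrative assumptions, not the patented implementation.

# Hypothetical sketch of the annotation-summarization flow described above;
# the scoring, clustering, and summarization steps are toy stand-ins.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Annotation:
    student_id: str
    start: int        # character offset into the educational resource
    end: int
    text: str
    replies: int      # responses the annotation has already received

def quality_score(a: Annotation) -> float:
    # Toy heuristic: longer annotations that already drew replies are treated
    # as more likely to generate responses and stimulate discussion threads.
    return 0.7 * min(len(a.text) / 200.0, 1.0) + 0.3 * min(a.replies / 5.0, 1.0)

def cluster_by_portion(annotations, window=500):
    # Group annotations referring to the same or nearby portions of the resource.
    clusters = defaultdict(list)
    for a in annotations:
        clusters[a.start // window].append(a)
    return list(clusters.values())

def summarize(cluster, max_chars=300):
    # Placeholder extractive summary: best-scoring annotation texts, truncated.
    ranked = sorted(cluster, key=quality_score, reverse=True)
    return " / ".join(a.text for a in ranked)[:max_chars]

def build_digest(annotations, threshold=0.5):
    # Assemble, for each cluster, the referenced portion, a summary, and the annotations.
    high_quality = [a for a in annotations if quality_score(a) >= threshold]
    return [{"portion": (min(a.start for a in c), max(a.end for a in c)),
             "summary": summarize(c),
             "annotations": c}
            for c in cluster_by_portion(high_quality)]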
Instructional Support Platform for Interactive Learning Platforms
Gary King, Eric Mazur, Kelly Miller, and Brian Lukoff. 10/8/2019. “Instructional Support Platform for Interactive Learning Platforms.” United States of America US 10,438,498 B2 (U.S. Patent and Trademark Office).Abstract
In various embodiments, subject matter for improving discussions in connection with an educational resource is identified and summarized by analyzing annotations made by students assigned to a discussion group to identify high-quality annotations likely to generate responses and stimulate discussion threads, identifying clusters of high-quality annotations relating to the same portion or related portions of the educational resource, extracting and summarizing text from the annotations, and combining, in an electronically represented document, the extracted and summarized text and (i) at least some of the annotations and the portion or portions of the educational resource or (ii) clickable links thereto.
The “Math Prefresher” and The Collective Future of Political Science Graduate Training
Gary King, Shiro Kuriwaki, and Yon Soo Park. 2020. “The “Math Prefresher” and The Collective Future of Political Science Graduate Training.” PS: Political Science and Politics, 53, 3, Pp. 537-541. Publisher's VersionAbstract

The political science math prefresher arose a quarter century ago and has now spread to many of our discipline’s Ph.D. programs. Incoming students arrive for graduate school a few weeks early for ungraded instruction in math, statistics, and computer science as they are useful for political science. The prefresher’s benefits, however, go beyond the technical material taught: it develops lasting camaraderie among students in the entering class, facilitates connections with senior graduate students, opens pathways to mastering methods necessary for research, and eases the transition to the increasingly collaborative nature of graduate work. The prefresher also shows how faculty across a highly diverse discipline can work together to train the next generation. We review this program, highlight its collaborative aspects, and try to take the idea to the next level by building infrastructure to share teaching materials across universities so separate programs can build on each other’s work and improve all our programs.

Participant Grouping for Enhanced Interactive Experience (3rd)
Gary King, Eric Mazur, and Brian Lukoff. 2/26/2019. “Participant Grouping for Enhanced Interactive Experience (3rd).” United States of America US 10,216,827 B2 (U.S. Patent and Trademark Office).Abstract
Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.
Participant Grouping for Enhanced Interactive Experience (2nd)
Gary King, Eric Mazur, and Brian Lukoff. 12/22/2015. “Participant Grouping for Enhanced Interactive Experience (2nd).” United States of America US 9,219,998 (U.S. Patent and Trademark Office).Abstract
Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.
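As a rough Python sketch of the four steps listed in the grouping-patent abstracts above (the record fields, the example policies, and the "communicate" step are assumptions for illustration, not the patented method):

# Illustrative sketch of the grouping workflow; fields and policies are assumptions.
from dataclasses import dataclass

@dataclass
class ParticipantRecord:
    participant_id: str
    characteristic: str   # e.g., the participant's answer to the previous question
    device_id: str        # identifier for the participant's handheld device

def define_groupings(records, policy="heterogeneous", group_size=2):
    # Step (iii): form groups according to the grouping policy.
    #   "homogeneous"   -> group participants with similar characteristics
    #   "heterogeneous" -> interleave so group members' characteristics differ
    s = sorted(records, key=lambda r: r.characteristic)
    if policy == "heterogeneous":
        half = len(s) // 2
        s = [r for pair in zip(s[:half], s[half:]) for r in pair] + s[2 * half:]
    return [s[i:i + group_size] for i in range(0, len(s), group_size)]

def communicate(groups):
    # Step (iv): stand-in for pushing group assignments to each handheld device.
    for gid, group in enumerate(groups):
        for r in group:
            print(f"device {r.device_id}: participant {r.participant_id} -> group {gid}")

records = [ParticipantRecord("p1", "A", "d1"), ParticipantRecord("p2", "B", "d2"),
           ParticipantRecord("p3", "A", "d3"), ParticipantRecord("p4", "B", "d4")]
communicate(define_groupings(records, policy="heterogeneous"))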
Stimulating Online Discussion in Interactive Learning Environments
Gary King, Eric Mazur, Kelly Miller, and Brian Lukoff. 1/29/2019. “Stimulating Online Discussion in Interactive Learning Environments.” United States of America US 10,192,456 B2 (U.S. Patent and Trademark Office).Abstract
In various embodiments, online discussions in connection with an educational resource are improved by analyzing annotations made by students assigned to a discussion group to identify high-quality annotations likely to generate responses and stimulate discussion threads and by making the identified annotations visible to students not assigned to the discussion group.
Management of Off-Task Time in a Participatory Environment
Gary King, Brian Lukoff, and Eric Mazur. 5/8/2018. “Management of Off-Task Time in a Participatory Environment.” United States of America US 9,965,972 B2 (U.S. Patent and Trademark Office).Abstract
Participatory activity carried out using electronic devices is enhanced by occupying the attention of participants who complete a task before a set completion time. For example, a request or question having an expected response time less than the remaining answer time may be provided to early-finishing participants. In another of the many embodiments, the post-response tasks are different for each participant, depending upon, for example, the rate at which the participant has successfully provided answers to previous questions. This ensures continuous engagement of all participants.
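A minimal Python sketch of the idea described above (the task names, expected durations, and use of a past success rate are illustrative assumptions, not the patented logic):

def pick_post_response_task(remaining_seconds, success_rate, task_bank):
    # task_bank entries: (task, expected_seconds, difficulty in [0, 1]).
    # Offer the most challenging task the participant's record supports among
    # tasks expected to finish before the main question's remaining answer time.
    feasible = [t for t in task_bank if t[1] <= remaining_seconds]
    suitable = [t for t in feasible if t[2] <= success_rate] or feasible
    return max(suitable, key=lambda t: t[2])[0] if suitable else None

task_bank = [("review a hint", 20, 0.2),
             ("bonus question", 45, 0.6),
             ("challenge item", 90, 0.9)]
print(pick_post_response_task(remaining_seconds=60, success_rate=0.7, task_bank=task_bank))
# prints: bonus question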
Use of a Social Annotation Platform for Pre-Class Reading Assignments in a Flipped Introductory Physics Class
Kelly Miller, Brian Lukoff, Gary King, and Eric Mazur. 3/2018. “Use of a Social Annotation Platform for Pre-Class Reading Assignments in a Flipped Introductory Physics Class.” Frontiers in Education, 3, 8, Pp. 1-12. Publisher's VersionAbstract

In this paper, we illustrate the successful implementation of pre-class reading assignments through a social learning platform that allows students to discuss the reading online with their classmates. We show how the platform can be used to understand how students are reading before class. We find that, with this platform, students spend an above average amount of time reading (compared to that reported in the literature) and that most students complete their reading assignments before class. We identify specific reading behaviors that are predictive of in-class exam performance. We also demonstrate ways that the platform promotes active reading strategies and produces high-quality learning interactions between students outside class. Finally, we compare the exam performance of two cohorts of students, where the only difference between them is the use of the platform; we show that students do significantly better on exams when using the platform.

Reprinted in Cassidy, R., Charles, E. S., Slotta, J. D., Lasry, N., eds. (2019). Active Learning: Theoretical Perspectives, Empirical Studies and Design Profiles. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-885-1

booc.io: An Education System with Hierarchical Concept Maps
Michail Schwab, Hendrik Strobelt, James Tompkin, Colin Fredericks, Connor Huff, Dana Higgins, Anton Strezhnev, Mayya Komisarchik, Gary King, and Hanspeter Pfister. 2017. “booc.io: An Education System with Hierarchical Concept Maps.” IEEE Transactions on Visualization and Computer Graphics, 23, 1, Pp. 571-580. Publisher's VersionAbstract

Information hierarchies are difficult to express when real-world space or time constraints force traversing the hierarchy in linear presentations, such as in educational books and classroom courses. We present booc.io, which allows linear and non-linear presentation and navigation of educational concepts and material. To support a breadth of material for each concept, booc.io is Web based, which allows adding material such as lecture slides, book chapters, videos, and LTIs. A visual interface assists the creation of the needed hierarchical structures. The goals of our system were formed in expert interviews, and we explain how our design meets these goals. We adapt a real-world course into booc.io, and perform introductory qualitative evaluation with students.
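As one way to picture the underlying structure (the field names and traversal are our illustrative assumptions, not booc.io's actual schema): each concept node carries attached materials and child concepts, and a depth-first traversal recovers the linear ordering a book chapter or lecture sequence requires.

# Hypothetical data structure for a hierarchical concept map with linearization.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Concept:
    name: str
    materials: List[str] = field(default_factory=list)   # slides, chapters, videos, LTIs
    children: List["Concept"] = field(default_factory=list)

    def linear_order(self):
        # Depth-first traversal: flatten the hierarchy into a linear presentation.
        yield self
        for child in self.children:
            yield from child.linear_order()

course = Concept("Statistics", children=[
    Concept("Probability", materials=["slides_week1.pdf"]),
    Concept("Inference", children=[Concept("Likelihood"), Concept("Bayesian updating")]),
])
print([c.name for c in course.linear_order()])
# ['Statistics', 'Probability', 'Inference', 'Likelihood', 'Bayesian updating']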

How Human Subjects Research Rules Mislead You and Your University, and What to Do About it
Gary King and Melissa Sands. 2016. “How Human Subjects Research Rules Mislead You and Your University, and What to Do About it”.Abstract

Universities require faculty and students planning research involving human subjects to pass formal certification tests and then submit research plans for prior approval. Those who diligently take the tests may better understand certain important legal requirements but, at the same time, are often misled into thinking they can apply these rules to their own work which, in fact, they are not permitted to do. They will also be missing many other legal requirements not mentioned in their training but which govern their behaviors. Finally, the training leaves them likely to completely misunderstand the essentially political situation they find themselves in. The resulting risks to their universities, collaborators, and careers may be catastrophic, in addition to contributing to the more common ordinary frustrations of researchers with the system. To avoid these problems, faculty and students conducting research about and for the public need to understand that they are public figures, to whom different rules apply, ones that political scientists have long studied. University administrators (and faculty in their part-time roles as administrators) need to reorient their perspectives as well. University research compliance bureaucracies have grown, in well-meaning but sometimes unproductive ways that are not required by federal laws or guidelines. We offer advice to faculty and students for how to deal with the system as it exists now, and suggestions for changes in university research compliance bureaucracies, that should benefit faculty, students, staff, university budgets, and our research subjects.

Participant Grouping for Enhanced Interactive Experience
Gary King, Brian Lukoff, and Eric Mazur. 2014. “Participant Grouping for Enhanced Interactive Experience.” United States of America US 8,914,373 B2 (U.S. Patent and Trademark Office).Abstract

Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant’s handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.

Restructuring the Social Sciences: Reflections from Harvard's Institute for Quantitative Social Science
Gary King. 2014. “Restructuring the Social Sciences: Reflections from Harvard's Institute for Quantitative Social Science.” PS: Political Science and Politics, 47, 1, Pp. 165-172. Cambridge University Press versionAbstract

The social sciences are undergoing a dramatic transformation from studying problems to solving them; from making do with a small number of sparse data sets to analyzing increasing quantities of diverse, highly informative data; from isolated scholars toiling away on their own to larger scale, collaborative, interdisciplinary, lab-style research teams; and from a purely academic pursuit to having a major impact on the world. To facilitate these important developments, universities, funding agencies, and governments need to shore up and adapt the infrastructure that supports social science research. We discuss some of these developments here, as well as a new type of organization we created at Harvard to help encourage them -- the Institute for Quantitative Social Science.  An increasing number of universities are beginning efforts to respond with similar institutions. This paper provides some suggestions for how individual universities might respond and how we might work together to advance social science more generally.

The Troubled Future of Colleges and Universities (with comments from five scholar-administrators)
Gary King and Maya Sen. 2013. “The Troubled Future of Colleges and Universities (with comments from five scholar-administrators).” PS: Political Science and Politics, 46, 1, Pp. 81-113.Abstract

The American system of higher education is under attack by political, economic, and educational forces that threaten to undermine its business model, governmental support, and operating mission. The potential changes are considerably more dramatic and disruptive than what we've already experienced. Traditional colleges and universities urgently need a coherent, thought-out response. Their central role in ensuring the creation, preservation, and distribution of knowledge may be at risk and, as a consequence, so too may be the spectacular progress across fields we have come to expect as a result.

Symposium contributors include Henry E. Brady, John Mark Hansen, Gary King, Nannerl O. Keohane, Michael Laver, Virginia Sapiro, and Maya Sen.
