Causal Inference

Methods for detecting and reducing model dependence (i.e., when minor model changes produce substantively different inferences) in inferring causal effects and other counterfactuals. Matching methods; "politically robust" and cluster-randomized experimental designs; causal bias decompositions.
Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Randomized Experiments
Kosuke Imai, Gary King, and Carlos Velasco Rivera. Working Paper. “Do Nonpartisan Programmatic Policies Have Partisan Electoral Effects? Evidence from Two Large Scale Randomized Experiments”.

A vast literature demonstrates that voters around the world who benefit from their governments' discretionary spending cast ballots for the incumbent party in larger proportions than those not receiving funds. But contrary to most theories of political accountability, the evidence seems to indicate that voters also reward incumbent parties for implementing "programmatic" spending legislation, over which incumbents have no discretion, and even when passed with support from all major parties. Why voters would attribute responsibility when none exists is unclear, as is why minority party legislators would approve of legislation that will cost them votes. We address this puzzle with one of the largest randomized social experiments ever, resulting in clear rejection of the claim, at least in this context, that programmatic policies greatly increase voter support for incumbents. We also reanalyze the study cited as claiming the strongest support for the electoral effects of programmatic policies, which is also a very large scale randomized experiment. We show that its key results vanish after correcting either a simple coding error affecting only two observations or highly unconventional data analysis procedures (or both). We discuss how these consistent empirical results from the only two probative experiments on this question may be reconciled with several observational and theoretical studies touching on similar questions in other contexts.

Methods for Observational Data

The Balance-Sample Size Frontier in Matching Methods for Causal Inference
Gary King, Christopher Lucas, and Richard Nielsen. In Press. “The Balance-Sample Size Frontier in Matching Methods for Causal Inference.” American Journal of Political Science, 2016.

We propose a simplified approach to matching for causal inference that simultaneously optimizes balance (similarity between the treated and control groups) and matched sample size. Existing approaches either fix the matched sample size and maximize balance or fix balance and maximize sample size, leaving analysts to settle for suboptimal solutions or attempt manual optimization by iteratively tweaking their matching method and rechecking balance. To jointly maximize balance and sample size, we introduce the matching frontier, the set of matching solutions with maximum possible balance for each sample size. Rather than iterating, researchers can choose matching solutions from the frontier for analysis in one step. We derive fast algorithms that calculate the matching frontier for several commonly used balance metrics. We demonstrate with analyses of the effect of sex on judging and job training programs that show how the methods we introduce can extract new knowledge from existing data sets.

Easy-to-use, open source software implementing all methods in the paper is available.

Evaluating Model Dependence

Evaluating whether counterfactual questions (predictions, what-if questions, and causal effects) can be reasonably answered from given data, or whether inferences will instead be highly model-dependent; also, a new decomposition of bias in causal inference. These articles overlap (and each has been the subject of a journal symposium):
The Dangers of Extreme Counterfactuals
For complete mathematical proofs, general notation, and other technical material, see: Gary King and Langche Zeng. 2006. “The Dangers of Extreme Counterfactuals.” Political Analysis, 14: 131–159.
We address the problem that occurs when inferences about counterfactuals – predictions, "what if" questions, and causal effects – are attempted far from the available data. The danger of these extreme counterfactuals is that substantive conclusions drawn from statistical models that fit the data well turn out to be based largely on speculation hidden in convenient modeling assumptions that few would be willing to defend. Yet existing statistical strategies provide few reliable means of identifying extreme counterfactuals. We offer a proof that inferences farther from the data are more model-dependent, and then develop easy-to-apply methods to evaluate how model-dependent our answers would be to specified counterfactuals. These methods require neither sensitivity testing over specified classes of models nor evaluating any specific modeling assumptions. If an analysis fails the simple tests we offer, then we know that substantive results are sensitive to at least some modeling choices that are not based on empirical evidence.
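
To make the problem concrete, here is a minimal Python sketch. It is an illustration only and not the diagnostic developed in the paper, which explicitly avoids comparing specific models. It fits two simple specifications to the same made-up data and compares their predictions inside the observed range and far outside it. The data, the two functional forms, and the helper names `fit_line` and `model_gap` are assumptions chosen for illustration.

```python
def fit_line(z, y):
    """Ordinary least squares for y = a + b*z; returns (a, b)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    b = sxy / szz
    return y_bar - b * z_bar, b


# Hypothetical data observed only for x between 0 and 5.
x = [0, 1, 2, 3, 4, 5]
y = [xi + 0.05 * xi ** 2 for xi in x]

# Two specifications that both track the data reasonably well in sample:
# y = a + b*x and y = a + b*x^2.
a_lin, b_lin = fit_line(x, y)
a_sq, b_sq = fit_line([xi ** 2 for xi in x], y)


def model_gap(x0):
    """Absolute disagreement between the two fitted models at x0."""
    return abs((a_lin + b_lin * x0) - (a_sq + b_sq * x0 ** 2))


gap_near = model_gap(2.5)   # counterfactual inside the observed range
gap_far = model_gap(20.0)   # counterfactual far from the data
print(f"gap near data: {gap_near:.2f}, gap far from data: {gap_far:.2f}")
```

In this toy setting the two models disagree far more at x = 20 than at x = 2.5, which is the pattern the abstract describes: the farther the counterfactual is from the data, the more the answer depends on the modeling choice.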
When Can History Be Our Guide? The Pitfalls of Counterfactual Inference
For more intuitive, but less general, notation, but with additional examples and more pedagogically oriented material, see: Gary King and Langche Zeng. 2007. “When Can History Be Our Guide? The Pitfalls of Counterfactual Inference.” International Studies Quarterly, 183-210, March.
Inferences about counterfactuals are essential for prediction, answering "what if" questions, and estimating causal effects. However, when the counterfactuals posed are too far from the data at hand, conclusions drawn from well-specified statistical analyses become based on speculation and convenient but indefensible model assumptions rather than empirical evidence. Unfortunately, standard statistical approaches assume the veracity of the model rather than revealing the degree of model-dependence, and so this problem can be hard to detect. We develop easy-to-apply methods to evaluate counterfactuals that do not require sensitivity testing over specified classes of models. If an analysis fails the tests we offer, then we know that substantive results are sensitive to at least some modeling choices that are not based on empirical evidence. We use these methods to evaluate the extensive scholarly literatures on the effects of changes in the degree of democracy in a country (on any dependent variable) and separate analyses of the effects of UN peacebuilding efforts. We find evidence that many scholars are inadvertently drawing conclusions based more on modeling hypotheses than on their data. For some research questions, history contains insufficient information to be our guide.

Matching Methods

Causal Inference Without Balance Checking: Coarsened Exact Matching
A simple and powerful method of matching: Stefano M Iacus, Gary King, and Giuseppe Porro. 2011. “Causal Inference Without Balance Checking: Coarsened Exact Matching.” Political Analysis.

We discuss a method for improving causal inferences called "Coarsened Exact Matching" (CEM), and the new "Monotonic Imbalance Bounding" (MIB) class of matching methods from which CEM is derived. We summarize what is known about CEM and MIB, derive and illustrate several new desirable statistical properties of CEM, and then propose a variety of useful extensions. We show that CEM possesses a wide range of desirable statistical properties not available in most other matching methods, but is at the same time exceptionally easy to comprehend and use. We focus on the connection between theoretical properties and practical applications. We also make available easy-to-use open source software for R and Stata that implements all our suggestions.

Political Analysis version

An Explanation of CEM Weights

CEM: Software for Coarsened Exact Matching
Stefano M Iacus, Gary King, and Giuseppe Porro. 2009. “CEM: Software for Coarsened Exact Matching.” Journal of Statistical Software, 30.

This program is designed to improve causal inference via a method of matching that is widely applicable in observational data and easy to understand and use (if you understand how to draw a histogram, you will understand this method). The program implements the coarsened exact matching (CEM) algorithm, described below. CEM may be used alone or in combination with any existing matching method. This algorithm, and its statistical properties, are described in Iacus, King, and Porro (2008).

Multivariate Matching Methods That are Monotonic Imbalance Bounding
A technical paper that describes a new class of matching methods, of which coarsened exact matching is an example: Stefano M Iacus, Gary King, and Giuseppe Porro. 2011. “Multivariate Matching Methods That are Monotonic Imbalance Bounding.” Journal of the American Statistical Association, 106(493): 345–361.

We introduce a new "Monotonic Imbalance Bounding" (MIB) class of matching methods for causal inference with a surprisingly large number of attractive statistical properties. MIB generalizes and extends in several new directions the only existing class, "Equal Percent Bias Reducing" (EPBR), which is designed to satisfy weaker properties and only in expectation. We also offer strategies to obtain specific members of the MIB class, and analyze in more detail a member of this class, called Coarsened Exact Matching, whose properties we analyze from this new perspective. We offer a variety of analytical results and numerical simulations that demonstrate how members of the MIB class can dramatically improve inferences relative to EPBR-based matching methods.

Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference
A unified approach to matching methods as a way to reduce model dependence by preprocessing data and then using any model you would have without matching: Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart. 2007. “Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference.” Political Analysis, 15: 199–236.

Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author’s favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these misinterpretations and propose a unified approach that makes it possible for researchers to preprocess data with matching (such as with the easy-to-use software we offer) and then to apply the best parametric techniques they would have used anyway. This procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.

MatchIt: Nonparametric Preprocessing for Parametric Causal Inference
Daniel E Ho, Kosuke Imai, Gary King, and Elizabeth A Stuart. 2011. “MatchIt: Nonparametric Preprocessing for Parametric Causal Inference.” Journal of Statistical Software, 42(8).
MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing data with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. MatchIt is an R program, and also works seamlessly with Zelig.
CEM: Coarsened Exact Matching in Stata
Matthew Blackwell, Stefano Iacus, Gary King, and Giuseppe Porro. 2009. “CEM: Coarsened Exact Matching in Stata.” The Stata Journal, 9: 524–546.
In this article, we introduce a Stata implementation of coarsened exact matching, a new method for improving the estimation of causal effects by reducing imbalance in covariates between treated and control groups. Coarsened exact matching is faster, is easier to use and understand, requires fewer assumptions, is more easily automated, and possesses more attractive statistical properties for many applications than do existing matching methods. In coarsened exact matching, users temporarily coarsen their data, exact match on these coarsened data, and then run their analysis on the uncoarsened, matched data. Coarsened exact matching bounds the degree of model dependence and causal effect estimation error by ex ante user choice, is monotonic imbalance bounding (so that reducing the maximum imbalance on one variable has no effect on others), does not require a separate procedure to restrict data to common support, meets the congruence principle, is approximately invariant to measurement error, balances all nonlinearities and interactions in sample (i.e., not merely in expectation), and works with multiply imputed datasets. Other matching methods inherit many of the coarsened exact matching method’s properties when applied to further match data preprocessed by coarsened exact matching. The cem command implements the coarsened exact matching algorithm in Stata.
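
The three steps described above (temporarily coarsen, exact match on the coarsened values, then analyze the uncoarsened matched data) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the `cem` command or its algorithm: it uses a single covariate, fixed-width bins, and a hypothetical function name `cem_match` with a hypothetical `bin_width` parameter, and it omits the weights a full CEM analysis would use when strata contain unequal numbers of treated and control units.

```python
from collections import defaultdict


def cem_match(units, bin_width):
    """Keep units whose coarsened covariate stratum holds both groups.

    units: list of dicts with keys 'treated' (bool), 'x' (covariate),
    and 'y' (outcome). All names here are illustrative assumptions.
    """
    strata = defaultdict(list)
    for unit in units:
        # Step 1: temporary coarsening, used only to form strata.
        strata[int(unit["x"] // bin_width)].append(unit)

    matched = []
    for members in strata.values():
        # Step 2: exact match on the coarsened value; a stratum is kept
        # only if it contains at least one treated and one control unit.
        if {u["treated"] for u in members} == {True, False}:
            matched.extend(members)

    # Step 3: the returned units still carry their original,
    # uncoarsened x values for the downstream analysis.
    return matched


# Hypothetical data: x could be age in years, coarsened into decades.
data = [
    {"treated": True, "x": 21, "y": 5.0},
    {"treated": False, "x": 24, "y": 4.0},
    {"treated": True, "x": 35, "y": 6.0},   # no control in its stratum
    {"treated": False, "x": 47, "y": 3.0},
    {"treated": True, "x": 48, "y": 4.5},
    {"treated": False, "x": 62, "y": 2.0},  # no treated unit in its stratum
]

matched = cem_match(data, bin_width=10)
print(sorted(u["x"] for u in matched))
```

A wider `bin_width` keeps more observations at the cost of coarser matches, which is the user choice, made ahead of time, that the abstract refers to.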
Comparative Effectiveness of Matching Methods for Causal Inference
Gary King, Richard Nielsen, Carter Coberley, James E Pope, and Aaron Wells. 2011. “Comparative Effectiveness of Matching Methods for Causal Inference”.

Matching is an increasingly popular method of causal inference in observational data, but following methodological best practices has proven difficult for applied researchers. We address this problem by providing a simple graphical approach for choosing among the numerous possible matching solutions generated by three methods: the venerable "Mahalanobis Distance Matching" (MDM), the commonly used "Propensity Score Matching" (PSM), and a newer approach called "Coarsened Exact Matching" (CEM). In the process of using our approach, we also discover that PSM often approximates random matching, both in many real applications and in data simulated by the processes that fit PSM theory. Moreover, contrary to conventional wisdom, random matching is not benign: it (and thus PSM) can often degrade inferences relative to not matching at all. We find that MDM and CEM do not have this problem, and in practice CEM usually outperforms the other two approaches. However, with our comparative graphical approach and easy-to-follow procedures, focus can be on choosing a matching solution for a particular application, which is what may improve inferences, rather than the particular method used to generate it.

A Theory of Statistical Inference for Matching Methods in Applied Causal Research
Stefano M. Iacus, Gary King, and Giuseppe Porro. 2015. “A Theory of Statistical Inference for Matching Methods in Applied Causal Research”.

To reduce model dependence and bias in causal inference, researchers usually use matching as a data preprocessing step, after which they apply whatever statistical model and uncertainty estimators they would have without matching. Unfortunately, this approach is appropriate in finite samples only under exact matching, which is usually infeasible, or approximate matching only under asymptotic theory if large enough sample sizes are available, but even then requires unfamiliar specialized point and variance estimators. Instead of attempting to change common practices, we show how those analyzing certain specific (but extremely common) types of data can instead appeal to a much easier version of existing theory. This alternative theory is substantively plausible, requires no asymptotic theory, and is simple to understand. Its core conceptualizes continuous variables as having natural breakpoints, which are common in applications (e.g., high school or college degrees in years of education, a governmental poverty level in income, or phase transitions in temperature). The theory allows binary, multicategory, and continuous treatment variables from the outset and straightforward extensions for imperfect treatment assignment and different versions of treatments.

Why Propensity Scores Should Not Be Used for Matching
Gary King and Richard Nielsen. Working Paper. “Why Propensity Scores Should Not Be Used for Matching”.

We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal -- increasing imbalance, inefficiency, model dependence, and bias. PSM supposedly makes it easier to find matches by projecting a large number of covariates to a scalar propensity score and applying a single model to produce an unbiased estimate. However, in observational analysis the data generation process is rarely known and so users typically try many models before choosing one to present. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest that researchers replace PSM with one of the other available methods when performing matching, propensity scores have many other productive uses.

Additional Approaches

Estimating Risk and Rate Levels, Ratios, and Differences in Case-Control Studies
A method to estimate base probabilities or any quantity of interest from case-control data, even with no (or partial) auxiliary information. Discusses problems with odds-ratios. Gary King and Langche Zeng. 2002. “Estimating Risk and Rate Levels, Ratios, and Differences in Case-Control Studies.” Statistics in Medicine, 21: 1409–1427.
Classic (or "cumulative") case-control sampling designs do not admit inferences about quantities of interest other than risk ratios, and then only by making the rare events assumption. Probabilities, risk differences, and other quantities cannot be computed without knowledge of the population incidence fraction. Similarly, density (or "risk set") case-control sampling designs do not allow inferences about quantities other than the rate ratio. Rates, rate differences, cumulative rates, risks, and other quantities cannot be estimated unless auxiliary information about the underlying cohort such as the number of controls in each full risk set is available. Most scholars who have considered the issue recommend reporting more than just the relative risks and rates, but auxiliary population information needed to do this is not usually available. We address this problem by developing methods that allow valid inferences about all relevant quantities of interest from either type of case-control study when completely ignorant of or only partially knowledgeable about relevant auxiliary population information.
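
The dependence on the population incidence fraction can be illustrated with Bayes' rule. The Python sketch below is a minimal illustration and not the estimators developed in the paper: it assumes a single binary exposure, made-up exposure frequencies among cases and controls, and hypothetical function names. It shows that the risk ratio implied by the same case-control data changes with the assumed incidence `tau`, and approaches the odds ratio only as `tau` goes to zero (the rare events assumption).

```python
def risk(p_exposure_cases, p_exposure_controls, tau):
    """P(case | exposure status) via Bayes' rule.

    p_exposure_cases / p_exposure_controls: probability of the given
    exposure status among cases / controls (estimable from the sample).
    tau: assumed population incidence fraction (auxiliary information).
    """
    numerator = tau * p_exposure_cases
    return numerator / (numerator + (1 - tau) * p_exposure_controls)


def risk_ratio(p_exp_cases, p_exp_controls, tau):
    """Risk among the exposed divided by risk among the unexposed."""
    exposed = risk(p_exp_cases, p_exp_controls, tau)
    unexposed = risk(1 - p_exp_cases, 1 - p_exp_controls, tau)
    return exposed / unexposed


# Hypothetical case-control sample: 60% of cases and 30% of controls exposed.
p_cases, p_controls = 0.6, 0.3
odds_ratio = (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))

# With only partial knowledge of tau (say, between 0.05 and 0.2),
# the implied risk ratio is a range rather than a single number.
for tau in (1e-9, 0.05, 0.2):
    print(f"tau={tau}: risk ratio={risk_ratio(p_cases, p_controls, tau):.3f}")
print(f"odds ratio={odds_ratio:.3f}")
```

Evaluating the function at the endpoints of a plausible interval for `tau` gives a simple sense of how much the conclusion depends on auxiliary information the sample alone does not supply.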
'Truth' is Stranger than Prediction, More Questionable Than Causal Inference
Gary King. 1991. “'Truth' is Stranger than Prediction, More Questionable Than Causal Inference.” American Journal of Political Science, 35: 1047–1053, November.
Robert Luskin’s article in this issue provides a useful service by appropriately qualifying several points I made in my 1986 American Journal of Political Science article. Whereas I focused on how to avoid common mistakes in quantitative political science, Luskin clarifies ways to extract some useful information from usually problematic statistics: correlation coefficients, standardized coefficients, and especially R². Since these three statistics are very closely related (and indeed deterministic functions of one another in some cases), I focus in this discussion primarily on R², the most widely used and abused. Luskin also widens the discussion to various kinds of specification tests, a general issue I also address. In fact, as Beck (1991) reports, a large number of formal specification tests are just functions of R², with differences among them primarily due to how much each statistic penalizes one for including extra parameters and fewer observations. Quantitative political scientists often worry about model selection and specification, asking questions about parameter identification, autocorrelated or heteroscedastic disturbances, parameter constancy, variable choice, measurement error, endogeneity, functional forms, stochastic assumptions, and selection bias, among numerous others. These model specification questions are all important, but we may have forgotten why we pose them. Political scientists commonly give three reasons: (1) finding the "true" model, or the "full" explanation; (2) prediction; and (3) estimating specific causal effects. I argue here that (1) is used the most but useful the least; (2) is very useful but not usually in political science, where forecasting is not often a central concern; and (3) correctly represents the goals of political scientists and should form the basis of most of our quantitative empirical work.

Experimental Design

Methods for Extremely Large Scale Media Experiments and Observational Studies (Poster)
Gary King, Benjamin Schneer, and Ariel White. 2014. “Methods for Extremely Large Scale Media Experiments and Observational Studies (Poster).” In Society for Political Methodology. Athens, GA, 24 July.

This is a poster presentation describing (1) the largest ever experimental study of media effects, with more than 50 cooperating traditional media sites, normally unavailable web site analytics, the text of hundreds of thousands of news articles, and tens of millions of social media posts, and (2) a design we used in preparation that attempts to anticipate experimental outcomes.

Avoiding Randomization Failure in Program Evaluation
Gary King, Richard Nielsen, Carter Coberley, James E Pope, and Aaron Wells. 2011. “Avoiding Randomization Failure in Program Evaluation.” Population Health Management, 1, 14: S11-S22, 2011.

We highlight common problems in the application of random treatment assignment in large scale program evaluation. Random assignment is the defining feature of modern experimental design. Yet, errors in design, implementation, and analysis often result in real world applications not benefiting from the advantages of randomization. The errors we highlight cover the control of variability, levels of randomization, size of treatment arms, and power to detect causal effects, as well as the many problems that commonly lead to post-treatment bias. We illustrate with an application to the Medicare Health Support evaluation, including recommendations for improving the design and analysis of this and other large scale randomized experiments.

An evaluation of the Mexican Seguro Popular program (designed to extend health insurance and regular and preventive medical care, pharmaceuticals, and health facilities to 50 million uninsured Mexicans), one of the world's largest health policy reforms of the last two decades. The evaluation features the largest randomized health policy experiment in history, a new design for field experiments that is more robust to the political interventions that have ruined many similar previous efforts, and new statistical methods that produce more reliable and efficient results using substantially fewer resources, assumptions, and data. (Articles on the Seguro Popular Evaluation: Website)
Misunderstandings Among Experimentalists and Observationalists about Causal Inference
Clarifying serious misunderstandings in the advantages and uses of the most common research designs for making causal inferences. Kosuke Imai, Gary King, and Elizabeth Stuart. 2008. “Misunderstandings Among Experimentalists and Observationalists about Causal Inference.” Journal of the Royal Statistical Society, Series A, 171, part 2: 481–502.

We attempt to clarify, and suggest how to avoid, several serious misunderstandings about and fallacies of causal inference in experimental and observational research. These issues concern some of the most basic advantages and disadvantages of each basic research design. Problems include improper use of hypothesis tests for covariate balance between the treated and control groups, and the consequences of using randomization, blocking before randomization, and matching after treatment assignment to achieve covariate balance. Applied researchers in a wide range of scientific disciplines seem to fall prey to one or more of these fallacies, and as a result make suboptimal design or analysis choices. To clarify these points, we derive a new four-part decomposition of the key estimation errors in making causal inferences. We then show how this decomposition can help scholars from different experimental and observational research traditions better understand each other’s inferential problems and attempted solutions.

Software

CEM: Coarsened Exact Matching in Stata
Matthew Blackwell, Stefano Iacus, Gary King, and Giuseppe Porro. 2009. “CEM: Coarsened Exact Matching in Stata.” The Stata Journal, 9: 524–546.
MatchingFrontier: R Package for Calculating the Balance-Sample Size Frontier
Gary King, Christopher Lucas, and Richard Nielsen. 2014. “MatchingFrontier: R Package for Calculating the Balance-Sample Size Frontier”.

MatchingFrontier is an easy-to-use R Package for making optimal causal inferences from observational data.  Despite their popularity, existing matching approaches leave researchers with two fundamental tensions. First, they are designed to maximize one metric (such as propensity score or Mahalanobis distance) but are judged against another for which they were not designed (such as L1 or differences in means). Second, they lack a principled solution to revealing the implicit bias-variance trade off: matching methods need to optimize with respect to both imbalance (between the treated and control groups) and the number of observations pruned, but existing approaches optimize with respect to only one; users then either ignore the other, or tweak it, usually suboptimally, by hand.

MatchingFrontier resolves both tensions by consolidating previous techniques into a single, optimal, and flexible approach. It calculates the matching solution with maximum balance for each possible sample size (N, N-1, N-2,...). It thus directly calculates the entire balance-sample size frontier, from which the user can easily choose one, several, or all subsamples from which to conduct their final analysis, given their own choice of imbalance metric and quantity of interest. MatchingFrontier solves the joint optimization problem in one run, automatically, without manual tweaking, and without iteration.  Although for each subset size k, there exist a huge (N choose k) number of unique subsets, MatchingFrontier includes specially designed fast algorithms that give the optimal answer, usually in a few minutes.  
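
The definition of the frontier (the best attainable balance at each sample size) can be illustrated by brute force on a tiny example. The Python sketch below is not the MatchingFrontier package or its fast algorithms: it enumerates every subset, which is feasible only for a handful of observations. It also assumes, for illustration, one covariate, absolute difference in means as the imbalance metric, a fixed treated group with only control units pruned, and a hypothetical function name `brute_force_frontier`.

```python
from itertools import combinations
from statistics import mean


def brute_force_frontier(treated, controls):
    """Map each control-group size k to (lowest imbalance, controls kept).

    Imbalance is the absolute difference between the treated mean and
    the mean of the retained controls (an illustrative metric choice).
    """
    target = mean(treated)
    frontier = {}
    for k in range(len(controls), 0, -1):
        # Exhaustive search over all (n choose k) control subsets.
        best = min(combinations(controls, k),
                   key=lambda subset: abs(mean(subset) - target))
        frontier[k] = (abs(mean(best) - target), best)
    return frontier


# Hypothetical covariate values.
treated = [2.0, 3.0, 4.0]
controls = [1.0, 3.0, 5.0, 9.0]

frontier = brute_force_frontier(treated, controls)
for k, (imbalance, kept) in frontier.items():
    print(f"{k} controls kept: imbalance={imbalance:.2f}, subset={kept}")
```

Here dropping the single outlying control (9.0) removes all of the mean imbalance, and pruning further buys no additional balance, so an analyst reading this frontier would have little reason to discard more data.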

MatchingFrontier implements the methods in this paper:  

King, Gary, Christopher Lucas, and Richard Nielsen. 2014. The Balance-Sample Size Frontier in Matching Methods for Causal Inference, copy at http://j.mp/1dRDMrE

See http://projects.iq.harvard.edu/frontier/

Michael Tomz, Jason Wittenberg, and Gary King. 2003. “CLARIFY: Software for Interpreting and Presenting Statistical Results.” Journal of Statistical Software 8.
This is a set of easy-to-use Stata macros that implement the techniques described in Gary King, Michael Tomz, and Jason Wittenberg's "Making the Most of Statistical Analyses: Improving Interpretation and Presentation". To install Clarify, type "net from http://gking.harvard.edu/clarify" at the Stata command line. The documentation explains how to do this. We also provide a zip archive for users who want to install Clarify on a computer that is not connected to the internet. Winner of the Okidata Best Research Software Award. Also try -ssc install qsim- to install a wrapper, donated by Fred Wolfe, to automate Clarify's simulation of dummy variables.

Applications

The Supreme Court During Crisis: How War Affects only Non-War Cases
A brief summary of the above article for an undergraduate audience: Lee Epstein, Daniel E Ho, Gary King, and Jeffrey A Segal. 2005. “The Supreme Court During Crisis: How War Affects only Non-War Cases.” New York University Law Review, 80: 1–116, April.
Does the U.S. Supreme Court curtail rights and liberties when the nation’s security is under threat? In hundreds of articles and books, and with renewed fervor since September 11, 2001, members of the legal community have warred over this question. Yet, not a single large-scale, quantitative study exists on the subject. Using the best data available on the causes and outcomes of every civil rights and liberties case decided by the Supreme Court over the past six decades and employing methods chosen and tuned especially for this problem, our analyses demonstrate that when crises threaten the nation’s security, the justices are substantially more likely to curtail rights and liberties than when peace prevails. Yet paradoxically, and in contradiction to virtually every theory of crisis jurisprudence, war appears to affect only cases that are unrelated to the war. For these cases, the effect of war and other international crises is so substantial, persistent, and consistent that it may surprise even those commentators who long have argued that the Court rallies around the flag in times of crisis. On the other hand, we find no evidence that cases most directly related to the war are affected. We attempt to explain this seemingly paradoxical evidence with one unifying conjecture: Instead of balancing rights and security in high stakes cases directly related to the war, the Justices retreat to ensuring the institutional checks of the democratic branches. Since rights-oriented and process-oriented dimensions seem to operate in different domains and at different times, and often suggest different outcomes, the predictive factors that work for cases unrelated to the war fail for cases related to the war. If this conjecture is correct, federal judges should consider giving less weight to legal principles outside of wartime but established during wartime, and attorneys should see it as their responsibility to distinguish cases along these lines.
The Effect of War on the Supreme Court
Lee Epstein, Daniel E. Ho, Gary King, and Jeffrey A. Segal. 2006. “The Effect of War on the Supreme Court.” In Principles and Practice in American Politics: Classic and Contemporary Readings, edited by Samuel Kernell and Steven S. Smith, 3rd ed. Washington, D.C.: Congressional Quarterly Press.
