In the last two decades, the international community has begun to conclude that attempts to ensure the territorial security of nation-states through military power have failed to improve the human condition. Despite astronomical levels of military spending, deaths due to military conflict have not declined. Moreover, even when the borders of some states are secure from foreign threats, the people within those states do not necessarily have freedom from crime, enough food, proper health care, education, or political freedom. In response to these developments, the international community has gradually moved to combine economic development with military security and other basic human rights to form a new concept of "human security". Unfortunately, by common assent the concept lacks both a clear definition, consistent with the aims of the international community, and any agreed upon measure of it. In this paper, we propose a simple, rigorous, and measurable definition of human security: the expected number of years of future life spent outside the state of "generalized poverty". Generalized poverty occurs when an individual falls below the threshold in any key domain of human well-being. We consider improvements in data collection and methods of forecasting that are necessary to measure human security and then introduce an agenda for research and action to enhance human security that follows logically in the areas of risk assessment, prevention, protection, and compensation.
Although the term "empirical research" has become commonplace in legal scholarship over the past two decades, law professors have, in fact, been conducting research that is empirical – that is, learning about the world using quantitative data or qualitative information – for almost as long as they have been conducting research. For just as long, however, they have been proceeding with little awareness of, much less compliance with, the rules of inference, and without paying heed to the key lessons of the revolution in empirical analysis that has been taking place over the last century in other disciplines. The tradition of including some articles devoted to exclusively to the methododology of empirical analysis – so well represented in journals in traditional academic fields – is virtually nonexistent in the nation’s law reviews. As a result, readers learn considerably less accurate information about the empirical world than the studies’ stridently stated, but overconfident, conclusions suggest. To remedy this situation both for the producers and consumers of empirical work, this Article adapts the rules of inference used in the natural and social sciences to the special needs, theories, and data in legal scholarship, and explicate them with extensive illustrations from existing research. The Article also offers suggestions for how the infrastructure of teaching and research at law schools might be reorganized so that it can better support the creation of first-rate empirical research without compromising other important objectives.
This is an invited response to an article by Anselin and Cho. I make two main points: The numerical results in this article violate no conclusions from prior literature, and the absence of the deterministic information from the bounds in the article’s analyses invalidates its theoretical discussion of spatial autocorrelation and all of its actual simulation results. An appendix shows how to draw simulations correctly.
Armed conflict is a major cause of injury and death worldwide, but we need much better methods of quantification before we can accurately assess its effect. Armed conflict between warring states and groups within states have been major causes of ill health and mortality for most of human history. Conflict obviously causes deaths and injuries on the battlefield, but also health consequences from the displacement of populations, the breakdown of health and social services, and the heightened risk of disease transmission. Despite the size of the health consequences, military conflict has not received the same attention from public health research and policy as many other causes of illness and death. In contrast, political scientists have long studied the causes of war but have primarily been interested in the decision of elite groups to go to war, not in human death and misery. We review the limited knowledge on the health consequences of conflict, suggest ways to improve measurement, and discuss the potential for risk assessment and for preventing and ameliorating the consequences of conflict.
Background: Studies have revealed large variations in average health status across social, economic, and other groups. No study exists on the distribution of the risk of ill-health across individuals, either within groups or across all people in a society, and as such a crucial piece of total health inequality has been overlooked. Some of the reason for this neglect has been that the risk of death, which forms the basis for most measures, is impossible to observe directly and difficult to estimate. Methods: We develop a measure of total health inequality – encompassing all inequalities among people in a society, including variation between and within groups – by adapting a beta-binomial regression model. We apply it to children under age two in 50 low- and middle-income countries. Our method has been adopted by the World Health Organization and is being implemented in surveys around the world and preliminary estimates have appeared in the World Health Report (2000). Results: Countries with similar average child mortality differ considerably in total health inequality. Liberia and Mozambique have the largest inequalities in child survival, while Colombia, the Philippines and Kazakhstan have the lowest levels among the countries measured. Conclusions: Total health inequality estimates should be routinely reported alongside average levels of health in populations and groups, as they reveal important policy-related information not otherwise knowable. This approach enables meaningful comparisons of inequality across countries and future analyses of the determinants of inequality.
A stand-alone, easy-to-use program for running event count and duration regression models, developed by and/or discussed in a series of journal articles by me. (Event count models have a dependent variable measured as the number of times something happens, such as the number of uncontested seats per state or the number of wars per year. Duration models explain dependent variables measured as the time until some event, such as the number of months a parliamentary cabinet endures.) Winner of the APSA Research Software Award.
Classic (or "cumulative") case-control sampling designs do not admit inferences about quantities of interest other than risk ratios, and then only by making the rare events assumption. Probabilities, risk differences, and other quantities cannot be computed without knowledge of the population incidence fraction. Similarly, density (or "risk set") case-control sampling designs do not allow inferences about quantities other than the rate ratio. Rates, rate differences, cumulative rates, risks, and other quantities cannot be estimated unless auxiliary information about the underlying cohort such as the number of controls in each full risk set is available. Most scholars who have considered the issue recommend reporting more than just the relative risks and rates, but auxiliary population information needed to do this is not usually available. We address this problem by developing methods that allow valid inferences about all relevant quantities of interest from either type of case-control study when completely ignorant of or only partially knowledgeable about relevant auxiliary population information.
Binary, count and duration data all code discrete events occurring at points in time. Although a single data generation process can produce all of these three data types, the statistical literature is not very helpful in providing methods to estimate parameters of the same process from each. In fact, only single theoretical process exists for which know statistical methods can estimate the same parameters - and it is generally used only for count and duration data. The result is that seemingly trivial decisions abut which level of data to use can have important consequences for substantive interpretations. We describe the theoretical event process for which results exist, based on time independence. We also derive a set of models for a time-dependent process and compare their predictions to those of a commonly used model. Any hope of understanding and avoiding the more serious problems of aggregation bias in events data is contingent on first deriving a much wider arsenal of statistical models and theoretical processes that are not constrained by the particular forms of data that happen to be available. We discuss these issues and suggest an agenda for political methodologists interested in this very large class of aggregation problems.
The Virtual Data Center (VDC) software is an open-source, digital library system for quantitative data. We discuss what the software does, and how it provides an infrastructure for the management and dissemination of disturbed collections of quantitative data, and the replication of results derived from this data.
Some of the most important phenomena in international conflict are coded s "rare events data," binary dependent variables with dozens to thousands of times fewer events, such as wars, coups, etc., than "nonevents". Unfortunately, rare events data are difficult to explain and predict, a problem that seems to have at least two sources. First, and most importantly, the data collection strategies used in international conflict are grossly inefficient. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (e.g., wars) and a tiny fraction of non-events (peace). This enables scholars to save as much as 99% of their (non-fixed) data collection costs, or to collect much more meaningful explanatory variables. Second, logistic regression, and other commonly used statistical procedures, can underestimate the probability of rare events. We introduce some corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. We also provide easy-to-use methods and software that link these two results, enabling both types of corrections to work simultaneously.
We study rare events data, binary dependent variables with dozens to thousands of times fewer ones (events, such as wars, vetoes, cases of political activism, or epidemiological infections) than zeros ("nonevents"). In many literatures, these variables have proven difficult to explain and predict, a problem that seems to have at least two sources. First, popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events. We recommend corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. Second, commonly used data collection strategies are grossly inefficient for rare events data. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables, such as in international conflict data with more than a quarter-million dyads, only a few of which are at war. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all variable events (e.g., wars) and a tiny fraction of nonevents (peace). This enables scholars to save as much as 99% of their (nonfixed) data collection costs or to collect much more meaningful explanatory variables. We provide methods that link these two results, enabling both types of corrections to work simultaneously, and software that implements the methods developed.
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one’s explanatory and dependent variables than the methods currently used in applied data analysis. The discrepancy occurs because the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and have demanded considerable expertise. We adapt an algorithm and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably easier to use than the leading method recommended in statistics literature. We also quantify the risks of current missing data practices, illustrate how to use the new procedure, and evaluate this alternative through simulated data as well as actual empirical examples. Finally, we offer easy-to-use that implements our suggested methods. (Software: AMELIA)
We offer the first independent scholarly evaluation of the claims, forecasts, and causal inferences of the State Failure Task Force and their efforts to forecast when states will fail. State failure refers to the collapse of the authority of the central government to impose order, as in civil wars, revolutionary wars, genocides, politicides, and adverse or disruptive regime transitions. This task force, set up at the behest of Vice President Gore in 1994, has been led by a group of distinguished academics working as consultants to the U.S. Central Intelligence Agency. State Failure Task Force reports and publications have received attention in the media, in academia, and from public policy decision-makers. In this article, we identify several methodological errors in the task force work that cause their reported forecast probabilities of conflict to be too large, their causal inferences to be biased in unpredictable directions, and their claims of forecasting performance to be exaggerated. However, we also find that the task force has amassed the best and most carefully collected data on state failure in existence, and the required corrections which we provide, although very large in effect, are easy to implement. We also reanalyze their data with better statistical procedures and demonstrate how to improve forecasting performance to levels significantly greater than even corrected versions of their models. Although still a highly uncertain endeavor, we are as a consequence able to offer the first accurate forecasts of state failure, along with procedures and results that may be of practical use in informing foreign policy decision making. We also describe a number of strong empirical regularities that may help in ascertaining the causes of state failure.
The intellectual stakes at issue in this symposium are very high: Green, Kim, and Yoon (2000 and hereinafter GKY) apply their proposed methodological prescriptions and conclude that they key findings in the field is wrong and democracy "has no effect on militarized disputes." GKY are mainly interested in convincing scholars about their methodological points and see themselves as having no stake in the resulting substantive conclusions. However, their methodological points are also high stakes claims: if correct, the vast majority of statistical analyses of military conflict ever conducted would be invalidated. GKY say they "make no attempt to break new ground statistically," but, as we will see, this both understates their methodological contribution to the field and misses some unique features of their application and data in international relations. On the ltter, GKY’s critics are united: Oneal and Russett (2000) conclude that GKY’s method "produces distorted results," and show even in GKY’s framework how democracy’s effect can be reinstated. Beck and Katz (2000) are even more unambiguous: "GKY’s conclusion, in table 3, that variables such as democracy have no pacific impact, is simply nonsense...GKY’s (methodological) proposal...is NEVER a good idea." My given task is to sort out and clarify these conflicting claims and counterclaims. The procedure I followed was to engage in extensive discussions with the participants that included joint reanalyses provoked by our discussions and passing computer program code (mostly with Monte Carlo simulations) back and forth to ensure we were all talking about the same methods and agreed with the factual results. I learned a great deal from this process and believe that the positions of the participants are now a lot closer than it may seem from their written statements. Indeed, I believe that all the participants now agree with what I have written here, even though they would each have different emphases (and although my believing there is agreement is not the same as there actually being agreement!).
In this paper we propose Bayesian and frequentist approaches to ecological inference, based on R x C contingency tables, including a covariate. The proposed Bayesian model extends the binomial-beta hierarchical model developed by King, Rosen and Tanner (1999) from the 2 x 2 case to the R x C case, the inferential procedure employs Markov chain Monte Carlo (MCMC) methods. As such the resulting MCMC analysis is rich but computationally intensive. The frequentist approach, based on first moments rather than on the entire likelihood, provides quick inference via nonlinear least-squares, while retaining good frequentist properties. The two approaches are illustrated with simulated data, as well as with real data on voting patterns in Weimar Germany. In the final section of the paper we provide an overview of a range of alternative inferential approaches which trade-off computational intensity for statistical efficiency.
In this paper, we present an overview of the Virtual Data Center (VDC) software, an open-source digital library system for the management and dissemination of distributed collections of quantitative data. (see http://TheData.org). The VDC functionality provides everything necessary to maintain and disseminate an individual collection of research studies, including facilities for the storage, archiving, cataloging, translation, and on-line analysis of a particular collection. Moreover, the system provides extensive support for distributed and federated collections including: location-independent naming of objects, distributed authentication and access control, federated metadata harvesting, remote repository caching, and distributed "virtual" collections of remote objects.
I am grateful for such thoughtful review from these three distinguished geographers. Fotheringham provides an excellent summary of the approach offered, including how it combines the two methods that have dominated applications (and methodological analysis) for nearly half a century– the method of bounds (Duncan and Davis, 1953) and Goodman’s (1953) least squares regression. Since Goodman’s regression is the only method of ecological inference "widely used in Geography" (O’Loughlin), adding information that is known to be true from the method of bounds (for each observation) would seem to have the chance to improve a lot of research in this field. The other addition that EI provides is estimates at the lowest level of geography available, making it possible to map results, instead of giving only single summary numbers for the entire geographic region. Whether one considers the combined method offered "the" solution (as some reviewers and commentators have portrayed it), "a" solution (as I tried to describe it), or, perhaps better and more simply, as an improved method of ecological inference, is not importatnt. The point is that more data are better, and this method incorporates more. I am gratified that all three reviewers seem to support these basic points. In this response, I clarify a few points, correct some misunderstandings, and present additional evidence. I conclude with some possible directions for future research.
We address a well-known but infrequently discussed problem in the quantitative study of international conflict: Despite immense data collections, prestigious journals, and sophisticated analyses, empirical findings in the literature on international conflict are often unsatisfying. Many statistical results change from article to article and specification to specification. Accurate forecasts are nonexistant. In this article we offer a conjecture about one source of this problem: The causes of conflict, theorized to be important but often found to be small or ephemeral, are indeed tiny for the vast majority of dyads, but they are large, stable, and replicable wherever the ex ante probability of conflict is large. This simple idea has an unexpectedly rich array of observable implications, all consistent with the literature. We directly test our conjecture by formulating a statistical model that includes critical features. Our approach, a version of a "neural network" model, uncovers some interesting structural features of international conflict, and as one evaluative measure, forecasts substantially better than any previous effort. Moreover, this improvement comes at little cost, and it is easy to evaluate whether the model is a statistical improvement over the simpler models commonly used.
Social Scientists rarely take full advantage of the information available in their statistical results. As a consequence, they miss opportunities to present quantities that are of greatest substantive interest for their research and express the appropriate degree of certainty about these quantities. In this article, we offer an approach, built on the technique of statistical simulation, to extract the currently overlooked information from any statistical method and to interpret and present it in a reader-friendly manner. Using this technique requires some expertise, which we try to provide herein, but its application should make the results of quantitative articles more informative and transparent. To illustrate our recommendations, we replicate the results of several published works, showing in each case how the authors’ own conclusions can be expressed more sharply and informatively, and, without changing any data or statistical assumptions, how our approach reveals important new information about the research questions at hand. We also offer very easy-to-use Clarify software that implements our suggestions.
We present a method of analyzing a series of independent cross-sectional surveys in which some questions are not answered in some surveys and some respondents do not answer some of the questions posed. The method is also applicable to a single survey in which different questions are asked or different sampling methods are used in different strata or clusters. Our method involves multiply imputing the missing items and questions by adding to existing methods of imputation designed for single surveys a hierarchical regression model that allows covariates at the individual and survey levels. Information from survey weights is exploited by including in the analysis the variables on which the weights are based, and then reweighting individual responses (observed and imputed) to estimate population quantities. We also develop diagnostics for checking the fit of the imputation model based on comparing imputed data to nonimputed data. We illustrate with the example that motivated this project: a study of pre-election public opinion polls in which not all the questions of interest are asked in all the surveys, so that it is infeasible to impute within each survey separately.
I appreciate the editor’s invitation to reply to Freedman et al.’s (1998) review of "A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data" (Princeton University Press.) I welcome this scholarly critique and JASA’s decision to publish in this field. Ecological inference is a large and very important area for applications that is especially rich with open statistical questions. I hope this discussion stimulates much new scholarship. Freedman et al. raise several interesting issues, but also misrepresent or misunderstand the prior literature, my approach, and their own empirical analyses, and compound the problem, by refusing requests from me and the editor to make their data and software available for this note. Some clarification is thus in order.
We propose a comprehensive statistical model for analyzing multiparty, district-level elections. This model, which provides a tool for comparative politics research analagous to that which regression analysis provides in the American two-party context, can be used to explain or predict how geographic distributions of electoral results depend upon economic conditions, neighborhood ethnic compositions, campaign spending, and other features of the election campaign or aggregate areas. We also provide new graphical representations for data exploration, model evaluation, and substantive interpretation. We illustrate the use of this model by attempting to resolve a controversy over the size of and trend in electoral advantage of incumbency in Britain. Contrary to previous analyses, all based on measures now known to be biased, we demonstrate that the advantage is small but meaningful, varies substantially across the parties, and is not growing. Finally, we show how to estimate the party from which each party’s advantage is predominantly drawn.
In 1990, Budge and Hofferbert (B&H) claimed that they had found solid evidence that party platforms cause U.S. budgetary priorities, and thus concluded that mandate theory applies in the United States as strongly as it does elsewhere. The implications of this stunning conclusion would mean that virtually every observer of the American party system in this century has been wrong. King and Laver (1993) reanalyzed B&H’s data and demonstrated in two ways that there exists no evidence for a causal relationship. First, accepting their entire statistical model, and correcting only an algebraic error (a mistake in how they computed their standard errors), we showed that their hypothesized relationship holds up in fewer than half the tests they reported. Second, we showed that their statistical model includes a slightly hidden but politically implausible assumption that a new party achieves every budgetary desire immediately upon taking office. We then specified a model without this unrealistic assumption and we found that the assumption was not supported, and that all evidence in the data for platforms causing government budgets evaporated. In their published response to our article, B&H withdrew their key claim and said they were now (in 1993) merely interested in an association and not causation. That is how it was left in 1993—a perfectly amicable resolution as far as we were concerned—since we have no objection to the claim that there is a non-causal or chance association between any two variables. Of course, we see little reason to be interested in non-causal associations in this area any more than in the chance correlation that exists between the winner of the baseball World Series and the party winning the U.S. presidency. Since party mandate theory only makes sense as a causal theory, the conventional wisdom about America’s porous, non-mandate party system stands.
The authors develop binomial-beta hierarchical models for ecological inference using insights from the literature on hierarchical models based on Markov chain Monte Carlo algorithms and King’s ecological inference model. The new approach reveals some features of the data that King’s approach does not, can easily be generalized to more complicated problems such as general R x C tables, allows the data analyst to adjust for covariates, and provides a formal evaluation of the significance of the covariates. It may also be better suited to cases in which the observed aggregate cells are estimated from very few observations or have some forms of measurement error. This article also provides an example of a hierarchical model in which the statistical idea of "borrowing strength" is used not merely to increase the efficiency of the estimates but to enable the data analyst to obtain estimates.