Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Existing methods of ecological inference generate very inaccurate conclusions about the empirical world, which gives rise to the ecological inference problem. Most scholars who analyze aggregate data routinely encounter some form of this problem. EI (by Gary King) and EzI (by Kenneth Benoit and Gary King) are freely available software packages that implement the statistical and graphical methods detailed in Gary King’s book A Solution to the Ecological Inference Problem. These methods make it possible to infer the attributes of individual behavior from aggregate data. EI works within the statistics program Gauss and will run on any computer hardware and operating system that runs Gauss (the Gauss module CML, for constrained maximum likelihood, by Ronald J. Schoenberg, is also required). EzI is a menu-oriented stand-alone version of the program that runs under MS-DOS (and soon Windows 95, OS/2, and HP-UNIX). EI allows users to make ecological inferences as part of the powerful and open Gauss statistical environment. In contrast, EzI requires no additional software and provides an attractive menu-based user interface for non-Gauss users, although it lacks the flexibility afforded by the Gauss version. Both programs presume that the user has read or is familiar with A Solution to the Ecological Inference Problem.
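To make the problem concrete, here is a minimal Python sketch of the deterministic "method of bounds" for a single aggregate unit. It is an illustration only, not code from EI or EzI. The function name, the variable names, and the accounting identity t = beta_b * x + beta_w * (1 - x) are assumptions introduced here: x is the share of the unit's population in the group of interest and t is the unit's observed aggregate rate.

```python
def group_rate_bounds(x, t):
    """Deterministic bounds on beta_b, the unknown rate within the group of interest.

    Assumes the accounting identity t = beta_b * x + beta_w * (1 - x),
    with beta_b and beta_w both constrained to lie in [0, 1].
    x: share of the unit's population in the group (0 < x <= 1)
    t: observed aggregate rate in the unit (0 <= t <= 1)
    """
    if not (0.0 < x <= 1.0) or not (0.0 <= t <= 1.0):
        raise ValueError("x must be in (0, 1] and t in [0, 1]")
    # Lowest beta_b occurs when the other group's rate is as high as possible (1).
    lower = max(0.0, (t - (1.0 - x)) / x)
    # Highest beta_b occurs when the other group's rate is as low as possible (0).
    upper = min(1.0, t / x)
    return lower, upper


if __name__ == "__main__":
    print(group_rate_bounds(0.5, 0.8))
```

The bounds use only the aggregate data and make no statistical assumptions. When the interval is wide, the aggregate data alone say little about the individual-level quantity, which is the gap a statistical model has to fill.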
This paper is an invited comment on a paper by John Agnew. I largely agree with Agnew’s comments and thus focus on remaining areas where an alternative perspective might be useful. My argument is that political geographers should not be so concerned with demonstrating that context matters. My reasoning is based on three arguments. First, in fact context rarely counts (Section 1) and, second, the most productive practical goal for political researchers should be to show that it does not count (Section 2). Finally, a disproportionate focus on ‘context counting’ can lead, and has led, to some serious problems in practical research situations, such as attempting to give theoretical answers to empirical questions (Section 3) and empirical answers to theoretical questions (Section 4).
We demonstrate that the expected value and variance commonly given for a well-known probability distribution are incorrect. We also provide corrected versions and report changes in a computer program to account for the known practical uses of this distribution.
Receiving five serious reviews in this symposium is gratifying and confirms our belief that research design should be a priority for our discipline. We are pleased that our five distinguished reviewers appear to agree with our unified approach to the logic of inference in the social sciences, and with our fundamental point: that good quantitative and good qualitative research designs are based fundamentally on the same logic of inference. The reviewers also raised virtually no objections to the main practical contribution of our book: our many specific procedures for avoiding bias, getting the most out of qualitative data, and making reliable inferences. However, the reviews make clear that although our book may be the latest word on research design in political science, it is surely not the last. We are taxed for failing to include important issues in our analysis and for dealing inadequately with some of what we included. Before responding to the reviewers’ more direct criticisms, let us explain what we emphasize in Designing Social Inquiry and how it relates to some of the points raised by the reviewers.
Before every presidential election, journalists, pollsters, and politicians commission dozens of public opinion polls. Although the primary function of these surveys is to forecast the election winners, they also generate a wealth of political data valuable even after the election. These preelection polls are useful because they are conducted with such frequency that they allow researchers to study change in estimates of voter opinion within very narrow time increments (Gelman and King 1993). Additionally, so many are conducted that the cumulative sample size of these polls is large enough to construct aggregate measures of public opinion within small demographic or geographical groupings (Wright, Erikson, and McIver 1985).
These advantages, however, are mitigated by the decentralized origin of the many preelection polls. The surveys are conducted by diverse private enterprises with procedures that differ significantly. Moreover, important methodological detail does not appear in the public record. Codebooks provided by the survey organizations are all incomplete; many are outdated and most are at least partly inaccurate. The most recent treatment in the academic literature, by Brady and Orren (1992), discusses the approach used by three companies but conceals their identities and omits most of the detail. ...
Political science is a community enterprise, and the community of empirical political scientists needs access to the body of data necessary to replicate existing studies in order to understand, evaluate, and especially build on this work. Unfortunately, the norms we have in place now do not encourage, or in some cases even permit, this aim. Following are suggestions that would facilitate replication and are easy to implement by teachers, students, dissertation writers, graduate programs, authors, reviewers, funding agencies, and journal and book editors.
We demonstrate the surprising benefits of legislative redistricting (including partisan gerrymandering) for American representative democracy. In so doing, our analysis resolves two long-standing controversies in American politics. First, whereas some scholars believe that redistricting reduces electoral responsiveness by protecting incumbents, and others believe that the relationship is spurious, we demonstrate that both sides are wrong: redistricting increases responsiveness. Second, while some researchers believe that gerrymandering dramatically increases partisan bias and others deny this effect, we show both sides are in a sense correct. Gerrymandering biases electoral systems in favor of the party that controls the redistricting as compared to what would have happened if the other party controlled it, but any type of redistricting reduces partisan bias as compared to an electoral system without redistricting. Incorrect conclusions in both literatures resulted from misjudging the enormous uncertainties present during redistricting periods, making simplified assumptions about the redistricters’ goals, and using inferior statistical methods.
King, Alt, Burns, and Laver (1990) proposed and estimated a unified model in which cabinet durations depended on seven explanatory variables reflecting features of the cabinets and the bargaining environments in which they formed, along with a stochastic component in which the risk of a cabinet falling was treated as a constant across its tenure. Two recent research reports take issue with one aspect of this model. Warwick and Easton replicate the earlier findings for explanatory variables but claim that the stochastic risk should be seen as rising, and at a rate which varies, across the life of the cabinet. Bienen and van de Walle, using data on the duration of leaders, allege that random risk is falling. We continue in our goal of unifying this literature by providing further estimates with both cabinet and leader duration data that confirm the original explanatory variables’ effects, showing that leaders’ durations are affected by many of the same factors that affect the durability of the cabinets they lead, demonstrating that cabinets have stochastic risk of ending that is indeed constant across the theoretically most interesting range of durations, and suggesting that stochastic risk for leaders in countries with cabinet government is, if not constant, more likely to rise than fall.
We derive a unified statistical method with which one can produce substantially improved definitions and estimates of almost any feature of two-party electoral systems that can be defined based on district vote shares. Our single method enables one to calculate more efficient estimates, with more trustworthy assessments of their uncertainty, than each of the separate multifarious existing measures of partisan bias, electoral responsiveness, seats-votes curves, expected or predicted vote in each district in a legislature, the probability that a given party will win the seat in each district, the proportion of incumbents or others who will lose their seats, the proportion of women or minority candidates to be elected, the incumbency advantage and other causal effects, the likely effects on the electoral system and district votes of proposed electoral reforms, such as term limitations, campaign spending limits, and drawing majority-minority districts, and numerous others. To illustrate, we estimate the partisan bias and electoral responsiveness of the U.S. House of Representatives since 1900 and evaluate the fairness of competing redistricting plans for the 1992 Ohio state legislature.
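To show what partisan bias and electoral responsiveness measure, here is a minimal Python sketch of a textbook seats-votes calculation under uniform partisan swing. It is explicitly not the unified statistical method described above, which also provides uncertainty estimates. Uniform swing is a simplification assumed here, and every function name and the example vote shares are hypothetical.

```python
def seat_share(district_votes, mean_vote):
    """Seat share for a party if its average district vote were `mean_vote`.

    Assumes uniform partisan swing: every district shifts by the same amount.
    district_votes: the party's two-party vote share in each district (0 to 1).
    """
    shift = mean_vote - sum(district_votes) / len(district_votes)
    won = sum(1 for v in district_votes if v + shift > 0.5)
    return won / len(district_votes)


def partisan_bias(district_votes):
    """Seat share at an average vote of 50 percent, minus one half."""
    return seat_share(district_votes, 0.5) - 0.5


def responsiveness(district_votes, low=0.45, high=0.55):
    """Change in seat share per unit change in average vote between low and high."""
    gain = seat_share(district_votes, high) - seat_share(district_votes, low)
    return gain / (high - low)


if __name__ == "__main__":
    votes = [0.30, 0.55, 0.56, 0.57, 0.58]  # hypothetical district vote shares
    print(partisan_bias(votes), responsiveness(votes))
```

A positive bias means the party would win more than half the seats with exactly half the average vote. Responsiveness is the slope of the seats-votes curve over the chosen range.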
Herbert Zimiles has written a provocative article on quantitative research. Because his specific critiques of research on infant day care are nominal examples of his much broader arguments, we focus only on his general methodological perspectives in this brief comment. We write as methodologists: a qualitative researcher with a quantitative background (Walsh) and a quantitative researcher completing a book on qualitative research (King; see King, Keohane & Verba, in preparation).
In their 1990 Review article, Ian Budge and Richard Hofferbert analyzed the relationship between party platform emphases, control of the White House, and national government spending priorities, reporting strong evidence of a "party mandate" connection between them. Gary King and Michael Laver successfully replicate the original analysis, critique the interpretation of the causal effects, and present a reanalysis showing that platforms have small or nonexistent effects on spending. In response, Budge, Hofferbert, and Michael McDonald agree that their language was somewhat inconsistent on both interactions and causality but defend their conceptualization of "mandates" as involving only an association, not necessarily a causal connection, between party commitments and government policy. Hence, while the causes of government policy are of interest, noncausal associations are sufficient as evidence of party mandates in American politics.
As political scientists, we spend much time teaching and doing scholarly research, and more time than we may wish to remember on university committees. However, just as many of us believe that teaching and research are not fundamentally different activities, we also need not use fundamentally different standards of inference when studying government, policy, and politics than when participating in the governance of departments and universities. In this article, we describe our attempts to bring somewhat more systematic methods to the process and policies of graduate admissions.
As most political scientists know, the outcome of the U.S. Presidential election can be predicted within a few percentage points (in the popular vote), based on information available months before the election. Thus, the general election campaign for president seems irrelevant to the outcome (except in very close elections), despite all the media coverage of campaign strategy. However, it is also well known that the pre-election opinion polls can vary wildly over the campaign, and this variation is generally attributed to events in the campaign. How can campaign events affect people’s opinions on whom they plan to vote for, and yet not affect the outcome of the election? For that matter, why do voters consistently increase their support for a candidate during his nominating convention, even though the conventions are almost entirely predictable events whose effects can be rationally forecast? In this exploratory study, we consider several intuitively appealing, but ultimately wrong, resolutions to this puzzle, and discuss our current understanding of what causes opinion polls to fluctuate and yet reach a predictable outcome. Our evidence is based on graphical presentation and analysis of over 67,000 individual-level responses from forty-nine commercial polls during the 1988 campaign and many other aggregate poll results from the 1952–1992 campaigns. We show that responses to pollsters during the campaign are not generally informed or even, in a sense we describe, "rational." In contrast, voters decide which candidate to eventually support based on their enlightened preferences, as formed by the information they have learned during the campaign, as well as basic political cues such as ideology and party identification. We cannot prove this conclusion, but we do show that it is consistent with the aggregate forecasts and individual-level opinion poll responses. 
Based on the enlightened preferences hypothesis, we conclude that the news media have an important effect on the outcome of Presidential elections, not due to misleading advertisements, sound bites, or spin doctors, but rather by conveying candidates’ positions on important issues.
Whenever we report predicted values, we should also report some measure of the uncertainty of these estimates. In the linear case, this is relatively simple, and the answer well-known, but with nonlinear models the answer may not be apparent. This short article shows how to make these calculations. I first present this for the familiar linear case, also reviewing the two forms of uncertainty in these estimates, and then show how to calculate these for any arbitrary function. An example appears last.
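For the familiar linear case, the calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the article's own derivation. The abstract does not name the two forms of uncertainty, so the sketch assumes the standard textbook distinction for a bivariate least-squares regression: uncertainty about the expected value (from estimating the coefficients) and uncertainty about a new observation (which adds the disturbance variance). The function name and the data are hypothetical.

```python
import math


def prediction_standard_errors(x, y, x0):
    """Least-squares prediction at x0 with two standard errors.

    Returns (predicted value, SE of the expected value, SE of a new observation)
    for the bivariate model y = a + b * x + e with constant error variance.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # unbiased error variance
    # Estimation uncertainty grows as x0 moves away from the mean of x.
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    se_expected = math.sqrt(sigma2 * leverage)
    # A new observation also carries the disturbance variance sigma2.
    se_new_obs = math.sqrt(sigma2 * (1.0 + leverage))
    return intercept + slope * x0, se_expected, se_new_obs


if __name__ == "__main__":
    print(prediction_standard_errors([1, 2, 3, 4], [1, 3, 2, 4], 2.5))
```

The second standard error is always the larger of the two, because it includes the first plus the variability of the outcome around its expected value.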
This Note addresses the long-standing discrepancy between scholarly support for the effect of constituency service on incumbency advantage and a large body of contradictory empirical evidence. I show first that many of the methodological problems noticed in past research reduce to a single methodological problem that is readily resolved. The core of this Note then provides among the first systematic empirical evidence for the constituency service hypothesis. Specifically, an extra $10,000 added to the budget of the average state legislator gives this incumbent an additional 1.54 percentage points in the next election (with a 95% confidence interval of 1.14 to 1.94 percentage points).
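The reported interval can be turned into an approximate standard error with simple arithmetic. The Python sketch below assumes the interval is a symmetric normal-approximation interval (point estimate plus or minus 1.96 standard errors). The abstract does not state how the interval was computed, so this is an assumption, and the function name is hypothetical.

```python
def implied_standard_error(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval of the form estimate +/- z * SE."""
    return (upper - lower) / (2.0 * z)


if __name__ == "__main__":
    # Interval reported above: 1.14 to 1.94 percentage points around 1.54.
    print(round(implied_standard_error(1.14, 1.94), 3))
```

Under that assumption the half-width of 0.40 percentage points corresponds to a standard error of roughly 0.2 percentage points.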
"Politimetrics" (Gurr 1972), "polimetrics" (Alker 1975), "politometrics" (Hilton 1976), "political arithmetic" (Petty  1971), "quantitative Political Science (QPS)," "governmetrics," "posopolitics" (Papayanopoulos 1973), "political science statistics (Rai and Blydenburgh 1973), "political statistics" (Rice 1926). These are some of the names that scholars have used to describe the field we now call "political methodology." The history of political methodology has been quite fragmented until recently, as reflected by this patchwork of names. The field has begun to coalesce during the past decade and we are developing persistent organizations, a growing body of scholarly literature, and an emerging consensus about important problems that need to be solved. I make one main point in this article: If political methodology is to play an important role in the future of political science, scholars will need to find ways of representing more interesting political contexts in quantitative analyses. This does not mean that scholars should just build more and more complicated statistical models. Instead, we need to represent more of the essence of political phenomena in our models. The advantage of formal and quantitative approaches is that they are abstract representations of the political world and are, thus, much clearer. We need methods that enable us to abstract the right parts of the phenomenon we are studying and exclude everything superfluous. Despite the fragmented history of quantitative political analysis, a version of this goal has been voiced frequently by both quantitative researchers and their critics (Sec. 2). However, while recognizing this shortcoming, earlier scholars were not in the position to rectify it, lacking the mathematical and statistical tools and, early on, the data. Since political methodologists have made great progress in these and other areas in recent years, I argue that we are now capable of realizing this goal. 
In section 3, I suggest specific approaches to this problem. Finally, in section 4, I provide two modern examples, ecological inference and models of spatial autocorrelation, to illustrate these points.
In an interesting and provocative article, Michael Lewis-Beck and Andrew Skalaban make an important contribution by emphasizing several philosophical issues in political methodology that have received too little attention from methodologists and quantitative researchers. These issues involve the role of systematic, and especially stochastic, variation in statistical models. After briefly discussing a few points of disagreement, hoping to reduce them to points of clarification, I turn to the philosophical issues. Examples with real data follow.
The dramatic increase in the electoral advantage of incumbency has sparked widespread interest among congressional researchers over the last 15 years. Although many scholars have studied the advantages of incumbency for incumbents, few have analyzed its effects on the underlying electoral system. We examine the influence of the incumbency advantage on two features of the electoral system in the U.S. House elections: electoral responsiveness and partisan bias. Using a district-level seats-votes model of House elections, we are able to distinguish systematic changes from unique, election-specific variations. Our results confirm the significant drop in responsiveness, and even steeper decline outside the South, over the past 40 years. Contrary to expectations, we find that increased incumbency advantage explains less than a third of this trend, indicating that some other unknown factor is responsible. Moreover, our analysis also reveals another dramatic pattern, largely overlooked in the congressional literature: in the 1940’s and 1950’s the electoral system was severely biased in favor of the Republican party. The system shifted incrementally from this severe Republican bias over the next several decades to a moderate Democratic bias by the mid-1980’s. Interestingly, changes in incumbency advantage explain virtually all of this trend in partisan bias since the 1940’s. By removing incumbency advantage and the existing configuration of incumbents and challengers analytically, our analysis reveals an underlying electoral system that remains consistently biased in favor of the Republican party. Thus, our results indicate that incumbency advantage affects the underlying electoral system, but contrary to conventional wisdom, this changes the trend in partisan bias more than electoral responsiveness.
Robert Luskin’s article in this issue provides a useful service by appropriately qualifying several points I made in my 1986 American Journal of Political Science article. Whereas I focused on how to avoid common mistakes in quantitative political science, Luskin clarifies ways to extract some useful information from usually problematic statistics: correlation coefficients, standardized coefficients, and especially R². Since these three statistics are very closely related (and indeed deterministic functions of one another in some cases), I focus in this discussion primarily on R², the most widely used and abused. Luskin also widens the discussion to various kinds of specification tests, a general issue I also address. In fact, as Beck (1991) reports, a large number of formal specification tests are just functions of R², with differences among them primarily due to how much each statistic penalizes one for including extra parameters and fewer observations. Quantitative political scientists often worry about model selection and specification, asking questions about parameter identification, autocorrelated or heteroscedastic disturbances, parameter constancy, variable choice, measurement error, endogeneity, functional forms, stochastic assumptions, and selection bias, among numerous others. These model specification questions are all important, but we may have forgotten why we pose them. Political scientists commonly give three reasons: (1) finding the "true" model, or the "full" explanation; (2) prediction; and (3) estimating specific causal effects. I argue here that (1) is used the most but useful the least; (2) is very useful but not usually in political science, where forecasting is not often a central concern; and (3) correctly represents the goals of political scientists and should form the basis of most of our quantitative empirical work.