- ...history.
-
What is ``ecological'' about the aggregate data from which individual behavior is to be inferred? The name has been used at least since the late 1800s and stems from the word ecology, the science of the interrelationship of living things and their environments. Statistical measures taken at the level of the environment, such as summaries of geographic areas or other aggregate units, are widely known as ecological data. Ecological inference is the process of using ecological data to learn about the behavior of individuals within these aggregates.
- ...module.
-
Gauss is available from Aptech Systems, Inc.; 23804 S.E. Kent-Kangley Road; Maple Valley, Washington 98038; (206) 432-7855; sales@aptech.com.
- ...(p. 414).
-
In 1919, the possibility of what has since come to be known as the ``gender gap'' was a central issue for academics and a nontrivial concern for political leaders seeking reelection: Not only were women about to have the vote for the first time nationwide; because women made up slightly over fifty percent of the population, they were about to have most of the votes.
- ...women.
-
That is, given these aggregate numbers, a minimum of 0% of females in precinct 1 and 20% in precinct 2 (for an average of 10%) could have opposed the referenda, whereas a maximum of 40% of males in each precinct could have opposed them. Chapter
provides easy graphical methods of making calculations like these.
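
The arithmetic behind bounds of this kind can be sketched in a few lines of Python. This is a minimal illustration of the standard method of bounds, not the graphical procedure the text refers to. The function name `ecological_bounds`, the symbols `x` and `t`, and the two precincts are hypothetical and are not the figures discussed above.

```python
def ecological_bounds(x, t):
    """Bounds on the share of women and of men who opposed a measure.

    x: fraction of the precinct's voters who are women (0 < x < 1)
    t: fraction of the precinct's voters who opposed the measure (0 <= t <= 1)
    Returns ((women_min, women_max), (men_min, men_max)).
    """
    if not 0 < x < 1 or not 0 <= t <= 1:
        raise ValueError("need 0 < x < 1 and 0 <= t <= 1")
    # If every man opposed, women must still account for t - (1 - x) of voters.
    women = (max(0.0, (t - (1 - x)) / x), min(1.0, t / x))
    # The same argument with the roles of the two groups exchanged.
    men = (max(0.0, (t - x) / (1 - x)), min(1.0, t / (1 - x)))
    return women, men


# Hypothetical precincts, given as (x, t); these are illustrative values only.
precincts = [(0.5, 0.2), (0.6, 0.7)]
for x, t in precincts:
    women, men = ecological_bounds(x, t)
    print(f"x={x}, t={t}: women in {women}, men in {men}")
```

The bounds use only the two marginals of each precinct, so they hold whatever the unobserved individual-level behavior was.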
- ...earnest.
-
Other early works that recognized the ecological inference problem include Allport (1924), Bernstein (1932), Gehlke and Biehl (1934), Thorndike (1939), Deming and Stephan (1940), and Yule and Kendall (1950). Robinson (1950) cited several of these studies as well as Ogburn and Goltra. Scholars writing even earlier than Ogburn and Goltra (1919) made ecological inferences, even though they did not recognize the problems with doing so. In fact, even the works usually cited as the first statistical works of any kind, which incidentally concerned political topics, included ecological inferences (see Graunt, 1662, and Petty, 1690, 1691). See Achen and Shively (1995) for other details of the history of ecological inference research.
- ...times.
-
This is a vast underestimate, as it depends on data from the Social Science Citation Index, which did not even begin publishing (or counting) until six years after Robinson's article appeared.
- ...level.
-
There are even several largely independent lines of research that give conditions under which aggregate data is not worse than individual-level data for certain purposes. In political science, see Kramer (1983); in epidemiology, see Morgenstern (1982); in psychology, see Epstein (1986); in economics, see Grunfeld and Griliches (1960), Fromm and Schink (1973), Aigner and Goldfeld (1974), and Shin (1987); and in input-output analysis, a field within economics, see Malinvaud (1955) and Venezia (1978).
- ...voters.
-
In this book, I use ``African American'' and ``black'' interchangeably and, when appropriate or for expository simplicity, often define ``white'' as non-black or occasionally as a residual category such as non-black and non-Hispanic.
- ...white.
-
In some states, precincts must be aggregated to a somewhat higher geographical level to match electoral and census data.
- ...example.)
-
The litigation based on the Voting Rights Act is vast; see Grofman, Handley, and Niemi (1992) for a review.
- ...1994).
-
Most epidemiological questions require relatively certain answers and thus, in most cases, large-scale, randomized experiments on individuals. Because each such experiment can cost hundreds of millions of dollars, a valid method of ecological inference would probably be of primary use in this field for helping scholars (and funding agencies) choose which experiments to conduct.
- ...level.
-
I had a small role in this case as a consultant to the state of Ohio and therefore witnessed the following story firsthand. My primary task in the case was to evaluate the relative fairness of the state's redistricting plan to the political parties, using methods developed in King and Browning (1987), King (1989b), and Gelman and King (1990, 1994a, b).
- ...bounds.
-
That is, although the row total is 55,054, the total number of people in the upper left cell of Table
cannot exceed 19,896, or it would contradict its column marginal.
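
The constraint can be written as a general inequality. In this sketch the symbols are assumptions introduced for illustration: $n_{11}$ is the upper left cell count, $R$ its row total, $C$ its column total, and $N$ the grand total of the table, which is not given here.

```latex
\max(0,\; R + C - N) \;\le\; n_{11} \;\le\; \min(R,\; C),
\qquad\text{so with } R = 55{,}054 \text{ and } C = 19{,}896:\quad
n_{11} \le \min(55{,}054,\; 19{,}896) = 19{,}896 .
```

The upper bound is set by whichever marginal is smaller, which here is the column total.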
- ...answer.
-
This estimate of the number of times authors in the ecological inference literature have made themselves vulnerable to being wrong is based on counting data sets original to this literature. Individual cross-tabulations that were used to study the method of bounds are excluded because no uncertainty, and thus no vulnerability, exists. I obviously also exclude studies that use data sets previously introduced to this literature. The data sets, and the studies in which they were first used, are as follows: race and illiteracy from the 1930 U.S. Census (Robinson, 1950); race by domestic service from community area data (Goodman, 1959; used originally to study bounds by Duncan and Davis, 1953); infant mortality by race and by urbanicity in U.S. states (Duncan et al., 1961: 71-72); 1964-1966 voter transitions in British constituencies (Hawkes, 1969); a voter transition between Democratic primaries in Florida (Irwin and Meeter, 1969); a 1961 German survey (Stokes, 1969); voter transition in England from Butler and Stokes (1969) data (Miller, 1972); survey of first-year university students (Hannan and Burstein, 1974); vote for Labour by worker category (Crewe and Payne, 1976); voter transition in England compared to a poll (McCarthy and Ryan, 1977); voter transition February to October 1974 in England compared to a poll (Upton, 1978); voter transition from a general election in 1983 to an election to the European parliament in 1984 compared to an ITN poll (Brown and Payne, 1986); one comparison based on twenty-four observations from Lee County, South Carolina, comparing registration and turnout by race (Loewen and Grofman, 1989); two comparisons of a survey to Swedish election data (Ersson and Wörlund, 1990); twenty comparisons of aggregate electoral data in California and nationally compared to exit polls, comparisons using census data, and official data on registration and voter turnout (Freedman et al., 1991); eight voter transition studies in Denmark compared to survey data (Thomsen et al., 1991); race and registration data from Matthews and Prothro (1966) (Alt, 1993); race and literacy from the 1910 U.S. Census (Palmquist, 1994); housing tenure transitions from 1971 to 1981 in England from census data (Cleave, Brown, and Payne, 1995). If you know of any work that belongs on this list that I missed, I would appreciate hearing from you.
- ...methods.
-
Surveys are also very underused in this literature, perhaps in part because many scholars came to this field out of skepticism about public opinion polls.
- ...procedure.)
-
The 3,262 evaluations of the model in this section are from the same data set and, as such, are obviously related. However, each comparison between the truth and an estimate provides a separate instance in which the model is vulnerable to being wrong. These model evaluations simulate the usual situation in which the ecological analyst has no definite prior knowledge about whether the parameters of interest are dependent, unrelated, or all identical.
- ...model.
-
As an analogy, consider how much information could be added to the usual linear regression if we knew for certain a different narrow range within which each observation's
must fall.