The only commonly used methods before EI were Goodman's regression and
the method of bounds. Goodman's regression worked when the
assumptions held but, as Leo Goodman made clear, it did not work when
the assumptions were wrong. Within the Goodman framework, the data
alone provided no information about whether the assumptions were right
or wrong. The method of bounds always gave correct ranges into which
the quantities of interest fell, but the ranges were often wider than
was desirable (only in part because the wrong method of computing them
was frequently used).
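As a minimal sketch of the method of bounds, assume a two-group problem in which x is the fraction of a unit's population in the first group and t is the fraction of the unit with the outcome of interest, so that t = beta_b * x + beta_w * (1 - x). The function and variable names here are illustrative assumptions, not taken from any EI software. The width of each interval is one simple way to see how informative the aggregate data are for a given unit.

```python
def bounds(x, t):
    """Deterministic bounds on (beta_b, beta_w) for a single areal unit.

    x: fraction of the unit's population in group b (assumed 0 < x < 1)
    t: fraction of the unit's population with the outcome (0 <= t <= 1)
    Both names are hypothetical; the identity used is
    t = beta_b * x + beta_w * (1 - x) with each beta in [0, 1].
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    lower_b = max(0.0, (t - (1.0 - x)) / x)
    upper_b = min(1.0, t / x)
    lower_w = max(0.0, (t - x) / (1.0 - x))
    upper_w = min(1.0, t / (1.0 - x))
    return (lower_b, upper_b), (lower_w, upper_w)


def width(interval):
    """Width of a (lower, upper) interval; smaller means more informative."""
    lower, upper = interval
    return upper - lower


if __name__ == "__main__":
    # Made-up units, chosen only to show narrow and wide bounds.
    for x, t in [(0.5, 0.75), (0.25, 0.5), (0.9, 0.95)]:
        b, w = bounds(x, t)
        print(x, t, b, width(b), w, width(w))
```

A unit dominated by one group tends to give narrow bounds for that group and wide bounds for the other, which is why the informativeness of the bounds varies from unit to unit and can be checked before any model is fit.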
EI combines the two methods (hence resolving most controversies
between adherents of these two popular approaches) and adds some
additional features. Instead of there being two situations, as under
Goodman's approach (i.e., the assumptions apply and the method
works, or they don't and it doesn't), we now have five, only the last
of which is a problem for EI:
- Under EI, if the assumptions are correct, you get the right
answer. For an example, see Chapter 10 or the Monte Carlos in
Chapter 9.
- If the assumptions are wrong, EI still does "well" (in the
sense of small MSE or squared bias) when the bounds (and other
information in the tomography lines) are sufficiently informative.
An important point is that the degree to which the bounds are
informative can easily be assessed from the aggregate data, and so
the risks of making ecological inferences are largely known. As one
example, see Chapter 11.
- If the assumptions are wrong and the bounds are not sufficiently
informative, but the diagnostics are sufficiently informative, then
the assumptions can easily be changed, and EI will do well. The
analyses reported in Figure 9.5 (p. 179) and the left graph in
Figure 13.2 (p. 238) for aggregation bias and Figures 9.7 and 9.9
(pp. 187, 195) for distributional violations are examples. Violations
of the third assumption, no spatial autocorrelation, seem to have
only minor effects.
- If the assumptions are wrong and the bounds and the diagnostics
are not sufficiently informative, but the researcher has additional
qualitative knowledge of the problem, then appropriate assumptions
can be chosen. In this case, either EI will do well, or the formal
measures of uncertainty produced by EI (standard errors and
confidence intervals, etc., which are based only upon the data and
model) can be supplemented and expanded accordingly. Since the
ecological inference problem is about information that has been
aggregated away, only by adding some information is it possible to
make reliable inferences in general. Qualitative information is of
course subject to more interpretation and hence more uncertainty,
but reliable inferences permit no option other than to add
assumptions or other information. The book discusses many ways
to bring in qualitative information (see also Gary King, Robert
Keohane, and Sidney Verba. 1994. Designing Social Inquiry:
Scientific Inference in Qualitative Research. Princeton University
Press).
- If the assumptions are wrong and the bounds and the diagnostics
are not sufficiently informative, and the researcher has no time or
resources to collect additional qualitative information, then EI
will perform poorly. An example of data like this appears in Figure
9.2 (p. 163). Even in this worst-case scenario, and in the others, EI
will be more robust than Goodman's regression. By this I mean that the maximum
amount of bias from EI is capped at a fixed and knowable level, in
contrast to Goodman's approach. The dotted line (corresponding to
for the default model) in Figure 9.6 (p. 180) shows that
bias in EI estimates increases with the degree of aggregation bias
for small levels of aggregation bias; at some point, however, the
bias reaches its maximum and increases no further. The point at which
the error reaches this cap depends on the data. Under Goodman's approach,
the error increases linearly without limit as aggregation bias
increases.
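The behavior of Goodman's regression under aggregation bias can be illustrated with a small deterministic sketch. Everything in it is an assumption made for illustration: the units are taken to be equally sized, the second group's fraction is fixed at 0.5, and the first group's fraction is made to depend on x through a hypothetical strength parameter g (g = 0 means no aggregation bias). The sketch covers only the Goodman side of the comparison; it does not implement EI.

```python
def goodman_estimate(xs, ts):
    """Goodman's regression via OLS of t on x with an intercept.

    Since t = B_w + (B_b - B_w) * x under Goodman's assumptions,
    the intercept estimates B_w and intercept + slope estimates B_b.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    return intercept + slope, intercept


def goodman_error(g, xs):
    """Error in Goodman's estimate of B_b when beta_b varies with x.

    g is a hypothetical aggregation-bias strength: the unit-level
    fraction is beta_b = 0.5 + g * (x - 0.5), with beta_w fixed at 0.5.
    Units are assumed equally sized, so the true aggregate B_b is the
    x-weighted mean of the unit-level beta_b values.
    """
    beta_b = [0.5 + g * (x - 0.5) for x in xs]
    ts = [b * x + 0.5 * (1.0 - x) for b, x in zip(beta_b, xs)]
    true_b = sum(b * x for b, x in zip(beta_b, xs)) / sum(xs)
    estimated_b, _ = goodman_estimate(xs, ts)
    return estimated_b - true_b


if __name__ == "__main__":
    xs = [i / 10 for i in range(1, 10)]  # x = 0.1, 0.2, ..., 0.9
    # g is kept at or below 1.0 so every unit-level fraction stays in [0, 1].
    for g in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(g, goodman_error(g, xs))
```

In this setup the error is zero when g = 0 and grows in proportion to g, and nothing in the regression itself restrains it. The deterministic bounds on each unit are what supply such a restraint.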
The likelihood of the first four cases coming up relative to the fifth
(as compared to the likelihood of the assumptions applying vs. not
applying under Goodman's approach) summarizes the advantage of EI. In
effect, EI chips off pieces of Goodman's worst case (the
assumptions not applying). The benefits of EI will therefore
depend on the area and application and on how much effort is
put into collecting qualitative information.
Gary King
2006-09-13