Can EI give misleading answers?

Yes. Nothing in $ {\mathfrak{E}I}$, this manual, or A Solution to the Ecological Inference Problem promises to give you the correct answer every time without thought. ``The method'' proposed in the book is not what comes spinning out of $ {\mathfrak{E}I}$ with all globals set at their defaults. Appropriate inferences, according to the argument put forward in the book, require full use of the diagnostics, to evaluate the amount of information lost in aggregation (such as how wide the bounds are for different groups of observations) and how well the model fits and its assumptions apply. Since assumptions about joint distributions for $ \beta_i^b$ and $ \beta_i^w$ cannot be rejected if they merely have positive mass over any curve that connects the bottom left and top right points of a tomography plot (p.191), there is no way to make certain inferences about individual level behavior from aggregate data alone. The only solution to this fundamental lack of information is to bring in some of the vast array of qualitative information available to most social scientists about the problems we study -- including ethnographies, participant observations, partial survey data, journalistic accounts, historical studies, prior quantitative research, and the like -- the full range of data collection schemes used in modern social science. Interpreting qualitative information in the context of statistical inference is of course open to more interpretation and ambiguity than formal statistical tests, but stopping at quantitative data, especially for this problem, is insufficient.
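To make the phrase ``how wide the bounds are'' concrete, here is a minimal sketch of the deterministic bounds on $ \beta_i^b$ for a single observation. It is written in Python for illustration only; it is not code from $ {\mathfrak{E}I}$. It assumes the usual accounting identity $ T_i = \beta_i^b X_i + \beta_i^w (1 - X_i)$, where $ X_i$ (the fraction of unit $ i$ in group $ b$) and $ T_i$ (the observed aggregate outcome fraction) are symbols not defined in this answer. The function names are hypothetical.

```python
def bounds_beta_b(x, t):
    """Deterministic bounds on beta_b for one areal unit.

    Assumes t = beta_b * x + beta_w * (1 - x), with beta_b and beta_w
    both restricted to [0, 1].

    x: fraction of the unit in group b (0 < x <= 1)   [assumed notation]
    t: observed aggregate outcome fraction (0 <= t <= 1)
    """
    if not (0.0 < x <= 1.0) or not (0.0 <= t <= 1.0):
        raise ValueError("need 0 < x <= 1 and 0 <= t <= 1")
    # Lower bound: the other group contributes as much as it possibly can.
    lower = max(0.0, (t - (1.0 - x)) / x)
    # Upper bound: the other group contributes nothing.
    upper = min(1.0, t / x)
    return lower, upper


def bounds_width(x, t):
    """Width of the interval; wider means less information survives aggregation."""
    lower, upper = bounds_beta_b(x, t)
    return upper - lower


if __name__ == "__main__":
    # A unit that is 90% group b versus one that is 10% group b,
    # both with the same aggregate outcome.
    for x in (0.9, 0.1):
        print(x, bounds_beta_b(x, 0.5), bounds_width(x, 0.5))
```

With the same aggregate outcome, the unit dominated by group $ b$ yields a narrow interval, while the unit where group $ b$ is a small minority yields bounds of [0, 1], which carry no information. Comparing widths across groups of observations in this way is one rough gauge of how much was lost in aggregation; it does not replace the diagnostics described in the book.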

Considerable thought, analysis, and qualitative information may be necessary to settle on the right version of the model to run. $ {\mathfrak{E}I}$ includes dozens of global variables that govern the main parts of the model; combinations of these globals can produce estimates from millions of possible specifications, even given identical input variables. The choice among these models requires the same degree of reasoned analysis and reanalysis, checking assumptions, and rerunning that the appropriate use of any method does. The actual method of ecological inference proposed in the book requires careful attention to each item in the checklist provided in the concluding chapter (Chapter 16); since several of the items require the user to consult qualitative evidence and other substantive knowledge about the problem, this program alone implements only part of the proposed method. Moreover, even with considerable thought, some misinformation or lack of information can sometimes lead to incorrect estimates; Chapter 9 provides extensive examples of precisely what can go wrong and under what conditions. If you have an example where you suspect that EI does not recover the truth, then one of the problems discussed in that chapter is likely at fault, and so you might consider some of the alternative approaches and model extensions also given there.

Finally, if you are comparing EI results to an external source of information to judge the ``truth'', consider whether the external source may be biased. For example, an estimate from survey data is just an estimate and is not necessarily better than an ecological inference. One of the best academic surveys, the National Election Studies, overestimates turnout by 8-10% and vote for incumbent House candidates by about 8%. (Even the NES's ``voter validation studies,'' which check each respondent's turnout from public records, contain errors.) Other surveys, especially about controversial issues, or politically or personally sensitive topics, often generate larger biases. The point is that every source of information, ecological and individual, comes with some potential biases or errors.

See also the questions below on the advantages of EI, computational problems, statistical fit, and standard error interpretation.



Gary King 2006-09-13