Publications by Year: 2003

2003
Gary King. 2003. “10 Million International Dyadic Events.”
Christopher Adolph and Gary King. 2003. “Analyzing Second Stage Ecological Regressions.” Political Analysis, 11, Pp. 65-76.
Gary King and Will Lowe. 2003. “An Automated Information Extraction Tool For International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design.” International Organization, 57, Pp. 617-642.
Despite widespread recognition that aggregated summary statistics on international conflict and cooperation miss most of the complex interactions among nations, the vast majority of scholars continue to employ annual, quarterly, or occasionally monthly observations. Daily events data, coded from some of the huge volume of news stories produced by journalists, have not been used much for the last two decades. We offer some reason to change this practice, which we feel should lead to considerably increased use of these data. We address advances in event categorization schemes and software programs that automatically produce data by "reading" news stories without human coders. We design a method that makes it feasible for the first time to evaluate these programs when they are applied in areas with the particular characteristics of international conflict and cooperation data, namely event categories with highly unequal prevalences, and where rare events (such as highly conflictual actions) are of special interest. We use this rare events design to evaluate one existing program, and find it to be as good as trained human coders, but obviously far less expensive to use. For large scale data collections, the program dominates human coding. Our new evaluative method should be of use in international relations, as well as more generally in the field of computational linguistics, for evaluating other automated information extraction tools. We believe that the data created by programs similar to the one we evaluated should see dramatically increased use in international relations research. To facilitate this process, we are releasing with this article data on 4.3 million international events, covering the entire world for the last decade.
Lee Epstein and Gary King. 2003. “Building An Infrastructure for Empirical Research in the Law.” Journal of Legal Education, 53, Pp. 311–320.
In every discipline in which "empirical research" has become commonplace, scholars have formed a subfield devoted to solving the methodological problems unique to that discipline’s data and theoretical questions. Although students of economics, political science, psychology, sociology, business, education, medicine, public health, and so on primarily focus on specific substantive questions, they cannot wait for those in other fields to solve their methodological problems or to teach them "new" methods, wherever they were initially developed. In "The Rules of Inference," we argued for the creation of an analogous methodological subfield devoted to legal scholarship. We also had two other objectives: (1) to adapt the rules of inference used in the natural and social sciences, which apply equally to quantitative and qualitative research, to the special needs, theories, and data in legal scholarship, and (2) to offer recommendations on how the infrastructure of teaching and research at law schools might be reorganized so that it could better support the creation of first-rate quantitative and qualitative empirical research without compromising other important objectives. Published commentaries on our paper, along with citations to it, have focused largely on the first: our application of the rules of inference to legal scholarship. Until now, discussions of our second goal (suggestions for the improvement of legal scholarship, as well as our argument for the creation of a group that would focus on methodological problems unique to law) have been relegated to less public forums, even though, judging from the volume of correspondence we have received, they seem to be no less extensive.
Michael Tomz, Jason Wittenberg, and Gary King. 2003. “CLARIFY: Software for Interpreting and Presenting Statistical Results.” Journal of Statistical Software.
This is a set of easy-to-use Stata macros that implement the techniques described in Gary King, Michael Tomz, and Jason Wittenberg's "Making the Most of Statistical Analyses: Improving Interpretation and Presentation". To install Clarify, type "net from https://gking.harvard.edu/clarify" at the Stata command line.

Winner of the Okidata Best Research Software Award. Also try -ssc install qsim- to install a wrapper, donated by Fred Wolfe, to automate Clarify's simulation of dummy variables.
Christopher Adolph, Gary King, Kenneth W Shotts, and Michael C Herron. 2003. “A Consensus on Second Stage Analyses in Ecological Inference Models.” Political Analysis, 11, Pp. 86–94.
Since Herron and Shotts (2003a and hereinafter HS), Adolph and King (2003 and hereinafter AK), and Herron and Shotts (2003b and hereinafter HS2), the four of us have iterated many more times, learned a great deal, and arrived at a consensus on this issue. This paper describes our joint recommendations for how to run second-stage ecological regressions, and provides detailed analyses to back up our claims.
Emmanuela Gakidou and Gary King. 2003. “Determinants of Inequality in Child Survival: Results from 39 Countries.” In Health Systems Performance Assessment: Debates, Methods and Empiricism, edited by Christopher J.L. Murray and David B. Evans, Pp. 497-502. Geneva: World Health Organization.

Few would disagree that health policies and programmes ought to be based on valid, timely and relevant information, focused on those aspects of health development that are in greatest need of improvement. For example, vaccination programmes rely heavily on information on cases and deaths to document needs and to monitor progress on childhood illness and mortality. The same strong information basis is necessary for policies on health inequality. The reduction of health inequality is widely accepted as a key goal for societies, but any policy needs reliable research on the extent and causes of health inequality. Given that child deaths still constitute 19% of all deaths globally and 24% of all deaths in developing countries (1), reducing inequalities in child survival is a good beginning.

total = between + within

The between-group component of total health inequality has been studied extensively by numerous scholars. They have expertly analysed the causes of differences in health status and mortality across population subgroups, defined by income, education, race/ethnicity, country, region, social class, and other group identifiers (2–9).
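
The decomposition total = between + within can be illustrated numerically. The abstract does not say which inequality measure the chapter uses, so the sketch below assumes variance purely for illustration; the function name `decompose_variance` and the sample data are hypothetical and are not taken from the chapter.

```python
from statistics import fmean


def decompose_variance(groups):
    """Split total variance into between-group and within-group parts.

    `groups` maps a group label to a non-empty list of individual values.
    Population variances (divide by n) are used, so that
    total == between + within holds exactly, up to rounding.
    Variance is an assumed inequality measure, chosen for illustration.
    """
    values = [v for vals in groups.values() for v in vals]
    n = len(values)
    grand_mean = fmean(values)

    # Total inequality: spread of all individuals around the grand mean.
    total = sum((v - grand_mean) ** 2 for v in values) / n

    # Between-group part: spread of group means around the grand mean,
    # weighted by group size.
    between = sum(
        len(vals) * (fmean(vals) - grand_mean) ** 2 for vals in groups.values()
    ) / n

    # Within-group part: spread of individuals around their own group mean.
    within = sum(
        (v - fmean(vals)) ** 2 for vals in groups.values() for v in vals
    ) / n

    return total, between, within


# Hypothetical data: two population subgroups (illustrative numbers only).
example = {"group_a": [1.0, 2.0, 3.0], "group_b": [5.0, 7.0, 9.0]}
total, between, within = decompose_variance(example)
print(total, between, within)
```

In this made-up example most of the total variance sits in the between-group term, because the two group means are far apart relative to the spread inside each group.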

 

Gary King. 2003. “EI: A Program for Ecological Inference.”
Gary King and Kenneth Benoit. 2003. “EzI: A(n Easy) Program for Ecological Inference.”
Gary King. 2003. “The Future of Replication.” International Studies Perspectives, 4, Pp. 443–499.

Since the replication standard was proposed for political science research, more journals have required or encouraged authors to make data available, and more authors have shared their data. The calls for continuing this trend are more persistent than ever, and the agreement among journal editors in this Symposium continues this trend. In this article, I offer a vision of a possible future of the replication movement. The plan is to implement this vision via the Virtual Data Center project, which – by automating the process of finding, sharing, archiving, subsetting, converting, analyzing, and distributing data – may greatly facilitate adherence to the replication standard.

Jeff Gill and Gary King. 2003. “Numerical Issues Involved in Inverting Hessian Matrices.” In Numerical Issues in Statistical Computing for the Social Scientist, edited by Micah Altman and Michael P. McDonald, Pp. 143-176. Hoboken, NJ: John Wiley and Sons, Inc.
Michael Tomz, Gary King, and Langche Zeng. 2003. “ReLogit: Rare Events Logistic Regression.” Journal of Statistical Software, 8.
Gary King, Michael Tomz, and Langche Zeng. 2003. “ReLogit: Rare Events Logistic Regression.”
Will Lowe and Gary King. 2003. “Some Statistical Methods for Evaluating Information Extraction Systems.” Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, Pp. 19-26.

We present new statistical methods for evaluating information extraction systems. The methods were developed to evaluate a system used by political scientists to extract event information from news leads about international politics. The nature of this data presents two problems for evaluators: (1) the frequency distribution of event types in international event data is strongly skewed, so a random sample of news leads will typically fail to contain any low-frequency events; and (2) the manual information extraction necessary to create evaluation sets is costly, and most effort is wasted coding high-frequency categories. We present an evaluation scheme that overcomes these problems with considerably less manual effort than traditional methods, and also allows us to interpret an information extraction system as an estimator (in the statistical sense) and to estimate its bias.
