10 Million International Dyadic Events
This page has been superseded. Please see the data in a more
convenient form on my Dataverse:
PLEASE CLICK
HERE

Can You Spot the Android? These graphs show estimated
conflict-cooperation scores (plotted horizontally) by errors (plotted
vertically). The machine's performance appears in one of these graphs
and the performance of trained human coders appears in the other
three. Can you tell which is which? For the answer, see Figure 3 in
King and Lowe [ PDF ].
When the Palestinians launch a mortar attack into Israel, the
Israeli army does not wait until the end of the calendar year to
react. Yet, most modern data collections are aggregated to the month
or year. The data available here include almost 10 million
individual events, each coded to the exact day it occurred or
became known. Each event is summarized in the data as "Actor A does
something to Actor B", with Actors A and B drawn from about 450
countries and other (within-country) actors and "does something to"
coded in an ontology of about 200 types of actions. The data are coded
by computer from millions of Reuters news reports. The software system
(produced by VRA) that performs this task has been independently
evaluated by King and Lowe (2003). That article found that, for the
number of events that humans (trained Harvard undergraduates) could
be convinced to code by hand, the machine did as well as the
humans. For much larger numbers of events, where no expert coder
could keep up, the machine dominates.
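
To make the "Actor A does something to Actor B" structure concrete, here is a minimal Python sketch of one way such a record might be represented, and of what is lost when daily events are aggregated to the month. The field names, actor labels, and event-type strings are hypothetical placeholders chosen for illustration; they are not the column names or codes used in the actual files, which are described in the documentation linked below.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DyadicEvent:
    """One 'Actor A does something to Actor B' record (illustrative field names)."""
    day: date         # exact day the event occurred or became known
    source: str       # Actor A
    target: str       # Actor B
    event_type: str   # one of the roughly 200 action types in the ontology


def monthly_counts(events):
    """Count events per (year, month, source, target) directed dyad."""
    counts = Counter()
    for ev in events:
        counts[(ev.day.year, ev.day.month, ev.source, ev.target)] += 1
    return counts


# Made-up records for illustration only; not taken from the data set.
events = [
    DyadicEvent(date(2001, 3, 4), "ACTOR_A", "ACTOR_B", "hypothetical_attack"),
    DyadicEvent(date(2001, 3, 5), "ACTOR_B", "ACTOR_A", "hypothetical_attack"),
    DyadicEvent(date(2001, 3, 20), "ACTOR_A", "ACTOR_B", "hypothetical_attack"),
]

counts = monthly_counts(events)

# Daily coding preserves the order of action and reaction.
first_mover = min(events, key=lambda ev: ev.day).source
```

In the monthly counts, both directed dyads simply appear in March 2001, so it is no longer possible to tell who acted first; the day-level records keep that information.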
- Improved and extended data for 1990-2004 for the entire world - almost
10 million events in total. [ Data: ZIP 121 MB | Documentation: PDF ]
- Aggregated forms of the above data in MS Access files. [ 1990-1995 Events:
ZIP 126 MB | 1995-1999 Events: ZIP 185 MB | 2000-2004 Events: ZIP 161 MB |
Documentation: DOC ]
- These data are made available as a companion to the article that conducted
the independent evaluation: Gary King and Will Lowe. 2003. "An
Automated Information Extraction Tool For International Conflict
Data with Performance as Good as Human Coders: A Rare Events
Evaluation Design," International Organization, 57, 3
(July, 2003): Pp. 617-642. [ PDF |
Abstract ]
- The first version of the data, including 3.7 million
international dyadic events from the last decade, is still available.
[ Documentation & data in MS Access: ZIP 150 MB | ASCII text data: ZIP 43 MB ]
Dale Thomas generously contributed this Windows program [ ZIP ] to
extract data from these large files. Additional documentation on
the event codes is also available [ XLS ].
- Methods for the analysis of rare events: Gary
King and Langche Zeng. "Explaining Rare Events in International Relations,"
International Organization, 55, 3 (Spring, 2001): 693-715; Gary King
and Langche Zeng. "Logistic Regression in Rare Events Data,"
Political Analysis, Vol. 9, No. 2, (Spring, 2001): Pp. 137-163; Gary
King and Langche Zeng. 2002. "Estimating Risk and Rate Levels, Ratios, and Differences
in Case-Control Studies," Statistics in Medicine, Vol. 21,
Pp. 1409-1427; and Gary King and Langche Zeng. 2004. "Inference
in Case-Control Studies," in Shein-Chung Chow, ed., Encyclopedia
of Biopharmaceutical Statistics, 2nd edition. New York: Marcel Dekker.
- ReLogit Software for analyzing rare events: for Stata and for Zelig and R.
- Concluding comment in a symposium on the analysis of dyadic
international conflict data: Gary King. "Proper Nouns and Methodological Propriety: Pooling
Dyads in International Relations Data," International Organization,
55, 2 (Fall, 2001): 497-507.