10 Million International Dyadic Events

This page has been superseded. Please see the data in a more convenient form on my dataverse:

PLEASE CLICK HERE


Can You Spot the Android? These graphs show estimated conflict-cooperation scores (plotted horizontally) by errors (plotted vertically). The machine's performance appears in one of these graphs, and the performance of trained human coders appears in the other three. Can you tell which is which? For the answer, see Figure 3 in King and Lowe [PDF].

When the Palestinians launch a mortar attack into Israel, the Israeli army does not wait until the end of the calendar year to react. Yet most modern data collections are aggregated to the month or year. The data available here include almost 10 million individual events, each coded to the exact day it occurred or became known. Each event is summarized in the data as "Actor A does something to Actor B", where Actors A and B are drawn from about 450 countries and other (within-country) actors, and "does something to" is coded in an ontology of about 200 types of actions. The data are coded by computer from millions of Reuters news reports. The software system (produced by VRA) that performs this task has been independently evaluated by King and Lowe (2003). That article found that, for the numbers of events it was possible to convince humans (trained Harvard undergraduates) to code by hand, the machine did as well as the humans. For the much larger numbers of events that no expert coder could keep up with, the machine dominates.
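To make the "Actor A does something to Actor B" structure concrete, here is a minimal Python sketch of one possible in-memory representation, along with the difference between keeping events at daily resolution and aggregating them to the month. The field names (`day`, `source`, `action`, `target`), the actor and action codes, and the three sample records are all hypothetical placeholders chosen for illustration; they are not the actual variable names, codes, or contents of the distributed files.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DyadicEvent:
    """One 'Actor A does something to Actor B' record.

    All field names are assumptions made for this sketch.
    """
    day: date      # exact day the event occurred or became known
    source: str    # Actor A (hypothetical code)
    action: str    # event type from the action ontology (hypothetical code)
    target: str    # Actor B (hypothetical code)


def count_by_day(events):
    """Count events per (day, source, target), preserving daily timing."""
    return Counter((e.day, e.source, e.target) for e in events)


def count_by_month(events):
    """Count events per ((year, month), source, target), discarding the day."""
    return Counter(((e.day.year, e.day.month), e.source, e.target) for e in events)


# Made-up records: an action, a response one day later, and a later action.
events = [
    DyadicEvent(date(2000, 3, 1), "A", "ATTACK", "B"),
    DyadicEvent(date(2000, 3, 2), "B", "ATTACK", "A"),
    DyadicEvent(date(2000, 3, 20), "A", "ATTACK", "B"),
]

daily = count_by_day(events)
monthly = count_by_month(events)
```

In the daily counts, the one-day gap between the first action and the response is still visible. In the monthly counts, all three events collapse into a single month, and the ordering of action and reaction is lost, which is the limitation of aggregated collections described above.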