Content Analysis
- A method that gives unbiased estimates of the proportion of
text documents in investigator-chosen categories, given only a small
subset of hand-coded documents. It also includes the first correction
for the far less-than-perfect levels of inter-coder reliability that
typically characterize hand coding. Applications include sentiment
detection about politicians in blog posts. Daniel Hopkins and Gary
King. A Method of Automated Nonparametric Content
Analysis for Social Science, forthcoming, American Journal
of Political Science (Abstract: HTML | Paper: PDF)
- A method of cluster analysis that encompasses (and thus
outperforms) all existing methods; includes new cluster analysis
evaluation techniques. Justin Grimmer and Gary King. Quantitative
Discovery from Qualitative Information: A General-Purpose Document
Clustering Methodology (Abstract: HTML | Paper: PDF)
- Methods to evaluate automated information extraction systems when
coding rare events, an evaluation of the success of one such system,
and considerable data. Gary King and Will Lowe. An
Automated Information Extraction Tool For International Conflict Data
with Performance as Good as Human Coders: A Rare Events Evaluation
Design, International Organization, Vol. 57, No. 3
(July 2003): pp. 617-642. (Article: PDF |
Abstract: HTML)
- A version of the previous article for a different audience: Will
Lowe and Gary King. Some Statistical Methods for
Evaluating Information Extraction Systems, in K. Pastra,
ed. Proceedings of the Workshop on Evaluation Initiatives in
Natural Language Processing. 10th Conference of the European
Association for Computational Linguistics, Budapest, Hungary
(2003): pp. 19-26.
Software
- ReadMe: Software for Automated Content Analysis. (Website: ReadMe)
Data
- 10 Million International Dyadic Events:
conflict and cooperation in international relations, 1990-2004, as
evaluated by King and Lowe (2003), automatically coded from Reuters
news reports. (Website: Events | Abstract: HTML)