Conjoint survey designs are spreading across the social sciences due to their unusual capacity to identify many causal effects from a single randomized experiment. Unfortunately, because conjoint designs violate aspects of best practices in questionnaire construction, they generate substantial measurement-error-induced bias, which can exaggerate, attenuate, or flip the signs of causal and descriptive estimates. By replicating both the data collection and analysis of eight prominent conjoint studies, all of which closely reproduce published results, we show that about half of all observed variation in the most common type of conjoint experiment is effectively random noise. We then discover a common empirical pattern in how measurement error appears in conjoint studies and use it to derive an easy-to-use statistical method that corrects the bias.
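Because the mechanics of attenuation can be hard to picture from prose alone, here is a minimal simulation sketch, not the paper's estimator, of how random responding biases a conjoint estimate toward zero and how, under the strong and purely illustrative assumption that the noise share is known, a simple deflation recovers the truth. All names and numbers are illustrative.

```python
# A minimal sketch (not the paper's method): random responding attenuates
# a conjoint effect estimate, and dividing by the attentive share undoes
# the attenuation *if* that share were known -- an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000          # profile pairs shown
true_amce = 0.20     # true effect of a binary attribute on choice probability
p_noise = 0.5        # share of responses that are effectively random

x = rng.integers(0, 2, n)                  # attribute level shown (0/1)
p_choose = 0.5 + true_amce * (x - 0.5)     # systematic choice probability
y_signal = rng.random(n) < p_choose        # attentive responses
y_random = rng.random(n) < 0.5             # pure-noise responses
y = np.where(rng.random(n) < p_noise, y_random, y_signal)

amce_hat = y[x == 1].mean() - y[x == 0].mean()
print(f"naive estimate:    {amce_hat:.3f}")                  # ~0.10, attenuated
print(f"deflated estimate: {amce_hat / (1 - p_noise):.3f}")  # ~0.20
```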
Election surprises are hardly surprising. Unexpected challengers, deaths, retirements, scandals, campaign strategies, real-world events, and heresthetical maneuvers all conspire to confound the best models. Quantitative researchers usually model district-level elections with linear functions of measured covariates, to account for systematic variation, and normal error terms, to account for surprises. Although these models work well in many situations, they can be embarrassingly overconfident: Events that commonly used models indicate should occur once in 10,000 elections occur almost every year, and even events the models indicate should occur once in a trillion-trillion elections are sometimes observed. We develop a new general-purpose statistical model of district-level legislative elections, validated with extensive out-of-sample (and distribution-free) tests. As an illustration, we use this model to generate the first correctly calibrated probabilities of incumbent losses in US Congressional elections, one of the most important quantities for evaluating the functioning of a representative democracy. Our analyses lead to an optimistic conclusion about American democracy: Even when marginals vanish, incumbency advantage grows, and dramatic changes occur, the risk of an incumbent losing an election has been high and essentially constant from the 1950s to the present day.
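To see how stark the overconfidence can be, here is a back-of-the-envelope sketch comparing tail probabilities under normal errors and under a generic heavy-tailed alternative (a Student-t with 3 degrees of freedom, chosen for illustration only; it is not the model developed in the paper).

```python
# How often should a 5-standard-deviation electoral surprise occur?
# Normal errors versus an illustrative heavy-tailed alternative.
from scipy import stats

swing = 5.0                           # surprise size, in standard deviations
p_normal = 2 * stats.norm.sf(swing)   # two-sided tail probability
p_t3 = 2 * stats.t.sf(swing, df=3)
print(f"normal model: about 1 in {1 / p_normal:,.0f} elections")
print(f"t(3) model:   about 1 in {1 / p_t3:,.0f} elections")
# With ~435 House races per cycle, the normal model implies such swings are
# essentially never observed; the heavy-tailed model expects several per cycle.
```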
Survey researchers have long sought to protect the privacy of their respondents via de-identification (removing names and other directly identifying information) before sharing data. Although these procedures can help, recent research demonstrates that they fail to protect respondents from intentional re-identification attacks, a problem that threatens to undermine vast survey enterprises in academia, government, and industry. The problem is especially acute in political science because political beliefs are not merely the subject of our scholarship; they represent some of the most important information respondents want to keep private. We confirm the problem in practice by re-identifying individuals from a survey about a controversial referendum on declaring that life begins at conception. We then build on the concept of "differential privacy" to offer new data sharing procedures with mathematical guarantees for protecting respondent privacy and statistical validity guarantees for social scientists analyzing differentially private data. The cost of these new procedures is larger standard errors, which can be overcome with somewhat larger sample sizes.
Katz, King, and Rosenblatt (2020) introduces a theoretical framework for understanding redistricting and electoral systems, built on basic statistical and social science principles of inference. DeFord et al. (Forthcoming, 2021) instead focuses solely on descriptive measures, which lead to the problems identified in our article. In this paper, we illustrate the essential role of these basic principles and then offer the statistical, mathematical, and substantive corrections required to apply DeFord et al.'s calculations to social science questions of interest, while also showing how to easily resolve all claimed paradoxes and problems. We are grateful to the authors for their interest in our work and for this opportunity to clarify these principles and our theoretical framework.
Some scholars build models to classify documents into chosen categories. Others, especially social scientists who tend to focus on population characteristics, instead usually estimate the proportion of documents in each category -- using either parametric "classify-and-count" methods or "direct" nonparametric estimation of proportions without individual classification. Unfortunately, classify-and-count methods can be highly model-dependent or can generate more bias in the proportions even as the percentage of documents correctly classified increases. Direct estimation avoids these problems but can suffer when the meaning of language changes between training and test sets or is too similar across categories. We develop an improved direct estimation approach without these issues by including and optimizing continuous text features, along with a form of matching adapted from the causal inference literature. Our approach substantially improves performance in a diverse collection of 73 data sets. We also offer easy-to-use software that implements all ideas discussed herein.
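For readers unfamiliar with direct estimation, the following is a minimal sketch of the basic idea this paper improves on, in the spirit of Hopkins and King (2010): recover category proportions in an unlabeled corpus by solving a linear system, without ever classifying an individual document. For simplicity the sketch uses marginal feature frequencies where the original approach uses full feature profiles; all names are illustrative.

```python
# Direct estimation of category proportions (simplified sketch):
# solve  P(feature) = P(feature | category) @ proportions  by
# nonnegative least squares, with no individual classification.
import numpy as np
from scipy.optimize import nnls

def direct_proportions(X_labeled, y_labeled, X_unlabeled):
    """X_*: binary document-feature matrices; y_labeled: category labels."""
    cats = np.unique(y_labeled)
    # P(feature | category), estimated from the labeled (training) set
    F = np.column_stack(
        [X_labeled[y_labeled == c].mean(axis=0) for c in cats]
    )
    p = X_unlabeled.mean(axis=0)     # P(feature) in the unlabeled set
    w, _ = nnls(F, p)                # nonnegative least squares
    return cats, w / w.sum()         # renormalize to proportions
```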
We are grateful to DeFord et al. for the continued attention to our work and the crucial issues of fair representation in democratic electoral systems. Our response (Katz, King, and Rosenblatt, forthcoming) was designed to help readers avoid being misled by mistaken claims in DeFord et al. (forthcoming-a), and does not address other literature or uses of our prior work. As it happens, none of our corrections were addressed (or contradicted) in the most recent submission (DeFord et al., forthcoming-b).
We also offer a recommendation regarding DeFord et al.’s (forthcoming-b) concern with how expert witnesses, consultants, and commentators should present academic scholarship to academic novices, such as judges, public officials, the media, and the general public. In these public service roles, scholars attempt to translate academic understanding of sophisticated scholarly literatures, technical methodologies, and complex theories for those without sufficient background in social science or statistics.
Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of privacy concerns. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for research subjects, and statistical validity guarantees for researchers seeking social science insights. We build on the standard of ``differential privacy,'' correct for biases induced by the privacy-preserving procedures, provide a proper accounting of uncertainty, and impose minimal constraints on the choice of statistical methods and quantities estimated. We also replicate two recent published articles and show how we can obtain approximately the same substantive results while simultaneously protecting the privacy. Our approach is simple to use and computationally efficient; we also offer open source software that implements all our methods.
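As background, here is a minimal sketch of the Laplace mechanism, the canonical way to satisfy differential privacy for a counting query. It illustrates the standard the paper builds on, not the full data access and analysis system; the function and data are illustrative.

```python
# The Laplace mechanism for a counting query: a count changes by at most 1
# when one person's record changes (sensitivity 1), so Laplace noise with
# scale 1/epsilon yields epsilon-differential privacy.
import numpy as np

def private_count(values, predicate, epsilon, rng=None):
    """Release a noisy count satisfying epsilon-differential privacy."""
    rng = rng or np.random.default_rng()
    true_count = sum(predicate(v) for v in values)
    return true_count + rng.laplace(scale=1.0 / epsilon)

# One noisy release; this added noise is what downstream bias corrections
# (and the resulting larger standard errors) must account for.
ages = [34, 29, 41, 58, 22]
print(private_count(ages, lambda a: a >= 30, epsilon=1.0))
```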
Ian Ayres, Richard A. Berk, Richard R.W. Brooks, Daniel E. Ho, Gary King, Kevin Quinn, Donald B. Rubin, and Sherod Thaxton. 2022. “Brief of Empirical Scholars as Amici Curiae in Support of Respondents.” Filed with the Supreme Court of the United States in Students for Fair Admissions v. President and Fellows of Harvard College.
Amici curiae are leaders in the field of quantitative social science and statistical methodology. Amici submit this brief to point out the substantial methodological flaws in the “mismatch” research discussed in the Brief for Richard Sander as Amicus Curiae in Support of Petitioner. Professor Sander’s mismatch hypothesis is unsupported and based on work that fails to adhere to basic tenets of research design.
We offer methods to analyze the "differentially private" Facebook URLs Dataset which, at over 40 trillion cell values, is one of the largest social science research datasets ever constructed. The version of differential privacy used in the URLs dataset adds specially calibrated random noise, which provides mathematical guarantees for the privacy of individual research subjects while still making it possible to learn about aggregate patterns of interest to social scientists. Unfortunately, random noise creates measurement error, which induces statistical bias -- including attenuation, exaggeration, switched signs, or incorrect uncertainty estimates. We adapt methods developed to correct for naturally occurring measurement error, with special attention to computational efficiency for large datasets. The result is statistically valid linear regression estimates and descriptive statistics that can be interpreted as ordinary analyses of non-confidential data but with appropriately larger standard errors.
We have implemented these methods in open source software for R called PrivacyUnbiased. Facebook has ported PrivacyUnbiased to open source Python code called svinfer. We have extended these results in Evans and King (2021).
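The core intuition behind this family of corrections can be shown in a few lines. The sketch below, a simplified illustration rather than the PrivacyUnbiased or svinfer implementation, uses the textbook moment-based correction: when noise of known variance is added to the regressors, subtracting that variance from the Gram matrix removes the attenuation bias of ordinary least squares.

```python
# Moment-based correction for regressors with known added noise (sketch):
# E[X*'X*] = X'X + n * noise_var * I, so subtract the known noise term.
import numpy as np

def noise_corrected_ols(X_noisy, y, noise_var):
    """OLS of y on X when X_noisy = X + noise, noise ~ N(0, noise_var)."""
    n, k = X_noisy.shape
    XtX = X_noisy.T @ X_noisy - n * noise_var * np.eye(k)  # remove noise bias
    return np.linalg.solve(XtX, X_noisy.T @ y)

rng = np.random.default_rng(1)
n = 50_000
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(size=n)
X_noisy = X + rng.normal(scale=1.0, size=X.shape)  # known noise_var = 1.0

naive = np.linalg.lstsq(X_noisy, y, rcond=None)[0]
print("naive:    ", naive)                                 # attenuated toward 0
print("corrected:", noise_corrected_ols(X_noisy, y, 1.0))  # ~[1.0, -2.0]
```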
A letter, submitted on behalf of a large group of expert signatories, requesting that the Census Bureau release the "noisy measurements file" and other redistricting data by September 30, 2021. The noisy measurements file comprises the data the Bureau created in preparing its differentially private data release, before its unnecessary (and, in many important situations, information-destroying) post-processing.
Textual responses to open-ended (i.e., free-response) items provided by participants (e.g., by means of mobile wireless devices) are automatically classified, enabling an instructor to assess the responses in a convenient, organized fashion and adjust instruction accordingly.
Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.
"The classic work on qualitative methods in political science"
Designing Social Inquiry presents a unified approach to qualitative and quantitative research in political science, showing how the same logic of inference underlies both. This stimulating book discusses issues related to framing research questions, measuring the accuracy of data and the uncertainty of empirical inferences, discovering causal effects, and getting the most out of qualitative research. It addresses topics such as interpretation and inference, comparative case studies, constructing causal theories, dependent and explanatory variables, the limits of random selection, selection bias, and errors in measurement. The book uses mathematical notation only to clarify concepts and assumes no prior knowledge of mathematics or statistics.
Featuring a new preface by Robert O. Keohane and Gary King, this edition makes an influential work available to new generations of qualitative researchers in the social sciences.
When word processors were first introduced into the workplace, they turned scholars into typists. But they also improved our work: Turnaround time for new drafts dropped from days to seconds. Rewriting became easier and more common, and our papers, educational efforts, and research output improved. I discuss the advantages of and mechanisms for doing the same with do-it-yourself video recordings of research talks and class lectures, so that they may become a fully respected channel for scholarly output and education, alongside books and articles. I consider innovations in video design to optimize education and communication, along with technology to make this change possible.
Excerpts of this paper appeared in Political Science Today (Vol. 1, No. 3, August 2021, pp. 5-6) and in APSA Educate.
To deter gerrymandering, many state constitutions require legislative districts to be "compact." Yet the law offers few precise definitions beyond "you know it when you see it," which effectively implies a common understanding of the concept. In contrast, academics have shown that compactness has multiple dimensions and have generated many conflicting measures. We hypothesize that both are correct -- that compactness is complex and multidimensional, but that a common understanding exists across people. We develop a survey to elicit this understanding, with high reliability (in data where the standard paired-comparisons approach fails). We then create a statistical model that, using only the geometric features of a district, predicts with high accuracy the compactness evaluations of judges, public officials responsible for redistricting, and others. We also offer compactness data from our validated measure for 20,160 state legislative and congressional districts, as well as open source software to compute this measure for any district.
Winner of the 2018 Robert H. Durr Award from the MPSA.
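As one illustration of the kind of geometric feature such a model can take as input, consider the Polsby-Popper score, a standard single-number compactness measure (not the multidimensional measure developed in the paper): 4π times a district's area divided by its squared perimeter, which equals 1 for a circle and approaches 0 for highly contorted shapes.

```python
# Polsby-Popper compactness: 4*pi*area / perimeter^2 (1 = circle, ~0 = contorted).
# Shown only as an example geometric feature, not the paper's measure.
import math

def polsby_popper(area, perimeter):
    return 4 * math.pi * area / perimeter ** 2

print(polsby_popper(area=math.pi, perimeter=2 * math.pi))  # unit circle -> 1.0
print(polsby_popper(area=100.0, perimeter=400.0))          # contorted -> ~0.008
```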
There are emerging opportunities to assess health indicators at truly small areas, given the increasing availability of data geocoded to micro geographic units and advanced modeling techniques. The utility of such fine-grained data can be fully leveraged if it is linked to local governance units that are accountable for implementing programs and interventions. We used data from the 2011 Indian Census for village-level demographic and amenities features and the 2016 Indian Demographic and Health Survey in a bias-corrected semisupervised regression framework to predict child anthropometric failures for all villages in India. Of the total geographic variation in predicted child anthropometric failure estimates, 54.2 to 72.3% was attributed to the village level, followed by 20.6 to 39.5% to the state level. Mean predicted stunting was 37.9% (SD: 10.1%; IQR: 31.2 to 44.7%), and substantial variation was found across villages, ranging from less than 5% in 691 villages to over 70% in 453 villages. Estimates at the village level can potentially shift the paradigm of policy discussion in India by enabling more informed prioritization and precise targeting. The proposed methodology can be adapted and applied to diverse population health indicators, and in other contexts, to reveal spatial heterogeneity at a finer geographic scale and to identify the local areas with the greatest needs, with direct implications for where action should take place.
While digital trace data from sources like search engines hold enormous potential for tracking and understanding human behavior, these data streams lack information about the actual experiences of the individuals generating the data. Moreover, most current methods ignore or underutilize human processing capabilities that allow humans to solve problems not yet solvable by computers (human computation). We demonstrate how behavioral research linking digital and real-world behavior, along with human computation, can improve the performance of studies using digital data streams. This study examines the use of search data to track the prevalence of Influenza-Like Illness (ILI). We build a behavioral model of flu search based on survey data linked to users’ online browsing data. We then utilize human computation to classify search strings. Leveraging these resources, we construct a tracking model of ILI prevalence that outperforms strong historical benchmarks using only a limited stream of search data and lends itself to tracking ILI in smaller geographic units. While this paper addresses only searches related to ILI, the method we describe has potential for tracking a broad set of phenomena in near real-time.
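To make the tracking-model idea concrete, here is a minimal sketch of the general form such a model can take: regress official ILI prevalence on a behaviorally filtered search-volume signal plus a recent-history term. The single-predictor autoregressive form and all variable names are illustrative, not the model estimated in the paper.

```python
# A minimal search-based ILI tracker (illustrative form): regress ILI_t on
# a filtered search signal and ILI_{t-1}, then forecast the next period.
import numpy as np

def fit_tracker(ili, search):
    """OLS of ILI_t on [1, search_t, ILI_{t-1}]; returns coefficients."""
    X = np.column_stack([np.ones(len(ili) - 1), search[1:], ili[:-1]])
    beta, *_ = np.linalg.lstsq(X, ili[1:], rcond=None)
    return beta

def predict_next(beta, search_now, ili_prev):
    return beta[0] + beta[1] * search_now + beta[2] * ili_prev
```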