Publications by Year: Working Paper

Working Paper
Differentially Private Survey Research
Georgina Evans, Gary King, Adam D. Smith, and Abhradeep Thakurta. Working Paper. “Differentially Private Survey Research”.
Survey researchers have long sought to protect the privacy of their respondents via de-identification (removing names and other directly identifying information) before sharing data. Although these procedures can help, recent research demonstrates that they fail to protect respondents from intentional re-identification attacks, a problem that threatens to undermine vast survey enterprises in academia, government, and industry. This is especially a problem in political science because political beliefs are not merely the subject of our scholarship; they represent some of the most important information respondents want to keep private. We confirm the problem in practice by re-identifying individuals from a survey about a controversial referendum declaring life beginning at conception. We build on the concept of "differential privacy" to offer new data sharing procedures with mathematical guarantees for protecting respondent privacy and statistical validity guarantees for social scientists analyzing differentially private data. The cost of these new procedures is larger standard errors, which can be overcome with somewhat larger sample sizes.
Paper Supplementary Appendix
Statistically Valid Inferences from Differentially Private Data Releases, II: Extensions to Nonlinear Transformations
Georgina Evans and Gary King. Working Paper. “Statistically Valid Inferences from Differentially Private Data Releases, II: Extensions to Nonlinear Transformations”.

We extend Evans and King (Forthcoming, 2021) to nonlinear transformations, using proportions and weighted averages as our running examples.

Paper
Statistically Valid Inferences from Privacy Protected Data
Georgina Evans, Gary King, Margaret Schwenzfeier, and Abhradeep Thakurta. Working Paper. “Statistically Valid Inferences from Privacy Protected Data”.
Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of privacy concerns. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for research subjects, and statistical validity guarantees for researchers seeking social science insights. We build on the standard of "differential privacy," correct for biases induced by the privacy-preserving procedures, provide a proper accounting of uncertainty, and impose minimal constraints on the choice of statistical methods and quantities estimated. We also replicate two recently published articles and show how we can obtain approximately the same substantive results while simultaneously protecting privacy. Our approach is simple to use and computationally efficient; we also offer open source software that implements all our methods.
Paper