Friday, November 8, 2019
The vast majority of data that could help social scientists understand and ameliorate the challenges of human society is presently locked away inside companies, in part because of worries about privacy violations. We address this problem with a general-purpose data access and analysis system that offers mathematical guarantees of privacy for individuals who may be represented in the data, statistical guarantees for researchers seeking insights from it, and protection for society from some fallacious scientific conclusions. We build on the standard of "differential privacy" but, unlike most such approaches, we also correct for the serious statistical biases induced by privacy-preserving procedures, properly account for statistical uncertainty, and impose minimal constraints on the choice of data analytic methods and the types of quantities estimated. Throughout, we emphasize ease of implementation and use, computational efficiency, and open source software that illustrates how to implement our algorithms. Based on joint work with Georgie Evans, Meg Schwenzfeier, and Abhradeep Thakurta; see GaryKing.org/dp.