Automating Open Science for Big Data
Merce Crosas, Gary King, James Honaker, Latanya Sweeney. 2015.
"Automating Open Science for Big Data".
The ANNALS of the American Academy of Political and Social Science, 659, 1, Pp. 260–273.

Abstract
The vast majority of social science research presently uses small (MB or GB scale) data sets. These fixed-scale data sets are commonly downloaded to the researcher’s computer, where the analysis is performed locally, and are often shared and cited with well-established technologies, such as the Dataverse Project (see Dataverse.org), to support the published results. The trend toward Big Data, including large-scale streaming data, is starting to transform research and has the potential to impact policy-making and our understanding of the social, economic, and political problems that affect human societies. However, this research poses new challenges in execution, accountability, preservation, reuse, and reproducibility. Downloading these data sets to a researcher’s computer is infeasible or impractical; hence, analyses take place in the cloud, require unusual expertise, and benefit from collaborative teamwork and novel tool development. The same richness that makes these data sets so informative also makes them much more likely to contain highly sensitive personally identifiable information. In this paper, we discuss solutions to these new challenges so that the social sciences can realize the potential of Big Data.
See Also
- [Presentation] Empowering Social Science to Understand and Ameliorate Major Challenges of Human Society (Federal Interagency Conference on Social Science and Big Data) (2020)
- [Presentation] Big Data Is Not About the Data! (2018)
- [Presentation] Big Data Reveals Made Up Data: How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument (2017)
- [Presentation] Big Data Is Not About the Data! The Power of Modern Analytics (2016)
- [Book] Preface: Big Data Is Not About the Data! (2016)
- [Presentation] Big Data Is Not About the Data, With Applications (2015)
- [Presentation] The Next Big [Social Science] Thing. Some Suggestions for Science Magazine (2015)
- [Presentation] Big Data Is Not About The Data! (2013)