Universities require faculty and students planning research involving human subjects to pass formal certification tests and then submit research plans for prior approval. Those who diligently take the tests may better understand certain important legal requirements but, at the same time, are often misled into thinking they can apply these rules to their own work which, in fact, they are not permitted to do. They will also be missing many other legal requirements not mentioned in their training but which govern their behaviors. Finally, the training leaves them likely to completely misunderstand the essentially political situation they find themselves in. The resulting risks to their universities, collaborators, and careers may be catastrophic, in addition to contributing to the more common ordinary frustrations of researchers with the system. To avoid these problems, faculty and students conducting research about and for the public need to understand that they are public figures, to whom different rules apply, ones that political scientists have long studied. University administrators (and faculty in their part-time roles as administrators) need to reorient their perspectives as well. University research compliance bureaucracies have grown, in well-meaning but sometimes unproductive ways that are not required by federal laws or guidelines. We offer advice to faculty and students for how to deal with the system as it exists now, and suggestions for changes in university research compliance bureaucracies, that should benefit faculty, students, staff, university budgets, and our research subjects.
Computer scientists and statisticians often try to classify individual textual documents into chosen categories. In contrast, social scientists more commonly focus on populations and thus estimate the proportion of documents falling in each category. The two existing types of techniques for estimating these category proportions are parametric “classify and count” methods and “direct” nonparametric estimation of category proportions without an individual classification step. Unfortunately, classify and count methods can sometimes be highly model dependent or generate more bias in the proportions even as the percent correctly classified increases. Direct estimation avoids these problems, but can suffer when the meaning and usage of language is too similar across categories or too different between training and test sets. We develop an improved direct estimation approach without these issues by introducing continuously valued text features optimized for this problem, along with a form of matching adapted from the causal inference literature. We evaluate our approach in analyses of a diverse collection of 73 data sets, showing that it substantially improves performance compared to existing approaches. As a companion to this paper, we offer easy-to-use software that implements all ideas discussed herein.
The mission of the social sciences is to understand and ameliorate society’s greatest challenges. The data held by private companies, collected for different purposes, hold vast potential to further this mission. Yet, because of consumer privacy, trade secrets, proprietary content, and political sensitivities, these datasets are often inaccessible to scholars. We propose a novel organizational model to address these problems. We also report on the first partnership under this model, to study the incendiary issues surrounding the impact of social media on elections and democracy: Facebook provides (privacy-preserving) data access; eight ideologically and substantively diverse charitable foundations provide funding; an organization of academics we created, Social Science One, leads the project; and the Institute for Quantitative Social Science at Harvard and the Social Science Research Council provide logistical help.
We provide an overview of PSI ("a Private data Sharing Interface"), a system we are developing to enable researchers in the social sciences and other fields to share and explore privacy-sensitive datasets with the strong privacy protections of differential privacy.
We clarify the theoretical foundations of partisan fairness standards for district-based democratic electoral systems, including essential assumptions and definitions that have not been formalized or in some cases even discussed. We pare assumptions down to their minimal essential components and add extensive empirical evidence for those with observable implications. Throughout, we follow a fundamental principle of statistics too often ignored -- defining the quantity of interest separately so its measures are vulnerable to being proven wrong, evaluated, and improved. This enables us to prove which approaches -- claimed in the literature to be estimators of partisan symmetry, the most widely accepted standard -- are statistically appropriate and which are biased, limited, or not measures of symmetry at all. Because real world redistricting involves complicated politics with numerous participants and conflicting goals, measures biased for partisan fairness sometimes still provide useful descriptions of other aspects of electoral systems.
Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong, but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one's own hypotheses. Since the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. We derive "hacking intervals," which are the range of a summary statistic one may obtain given a class of possible endogenous manipulations of the data. Hacking intervals require no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval, and is often easier to interpret than a classical confidence interval. Some versions of hacking intervals turn out to be equivalent to classical confidence intervals, which means they may also provide a more intuitive and potentially more useful interpretation of classical confidence intervals.
Information hierarchies are difficult to express when real-world space or time constraints force traversing the hierarchy in linear presentations, such as in educational books and classroom courses. We present booc.io, which allows linear and non-linear presentation and navigation of educational concepts and material. To support a breadth of material for each concept, booc.io is Web based, which allows adding material such as lecture slides, book chapters, videos, and LTIs. A visual interface assists the creation of the needed hierarchical structures. The goals of our system were formed in expert interviews, and we explain how our design meets these goals. We adapt a real-world course into booc.io, and perform introductory qualitative evaluation with students.
A vast literature demonstrates that voters around the world who benefit from their governments' discretionary spending cast more ballots for the incumbent party than those who do not benefit. But contrary to most theories of political accountability, some suggest that voters also reward incumbent parties for implementing "programmatic" spending legislation, over which incumbents have no discretion, and even when passed with support from all major parties. Why voters would attribute responsibility when none exists is unclear, as is why minority party legislators would approve of legislation that would cost them votes. We study the electoral effects of two large prominent programmatic policies that fit the ideal type especially well, with unusually large scale experiments that bring more evidence to bear on this question than has previously been possible. For the first policy, we design and implement ourselves one of the largest randomized social experiments ever. For the second policy, we reanalyze studies that used a large scale randomized experiment and a natural experiment to study the same question but came to opposite conclusions. Using corrected data and improved statistical methods, we show that the evidence from all analyses of both policies is consistent: programmatic policies have no effect on voter support for incumbents. We conclude by discussing how the many other studies in the literature may be interpreted in light of our results.
Ecological inference (EI) is the process of learning about individual behavior from aggregate data. We relax assumptions by allowing for ``linear contextual effects,'' which previous works have regarded as plausible but avoided due to non-identification, a problem we sidestep by deriving bounds instead of point estimates. In this way, we offer a conceptual framework to improve on the Duncan-Davis bound, derived more than sixty-five years ago. To study the effectiveness of our approach, we collect and analyze 8,430 2x2 EI datasets with known ground truth from several sources --- thus bringing considerably more data to bear on the problem than the existing dozen or so datasets available in the literature for evaluating EI estimators. For the 88% of real data sets in our collection that fit a proposed rule, our approach reduces the width of the Duncan-Davis bound, on average, by about 44%, while still capturing the true district level parameter about 99% of the time. The remaining 12% revert to the Duncan-Davis bound.
Easy-to-use software is available that implements all the methods described in the paper.
The origin, meaning, estimation, and application of the concept of partisan symmetry in legislative redistricting, and the justiciability of partisan gerrymandering. An edited transcript of a talk at the “Redistricting and Representation Forum,” American Academy of Arts & Sciences, Cambridge, MA 11/8/2017.
To deter gerrymandering, many state constitutions require legislative districts to be "compact." Yet, the law offers few precise definitions other than "you know it when you see it," which effectively implies a common understanding of the concept. In contrast, academics have shown that compactness has multiple dimensions and have generated many conflicting measures. We hypothesize that both are correct -- that compactness is complex and multidimensional, but a common understanding exists across people. We develop a survey to elicit this understanding, with high reliability (in data where the standard paired comparisons approach fails). We create a statistical model that predicts, with high accuracy, solely from the geometric features of the district, compactness evaluations by judges and public officials responsible for redistricting, among others. We also offer compactness data from our validated measure for 20,160 state legislative and congressional districts, as well as software to compute this measure from any district.
Winner of the 2018 Robert H Durr Award from the MPSA.
In various embodiments, documents are searched and retrieved via receipt of a search query, electronically identifying a reference set of relevant documents, providing a search set of documents, creating a database comprising at least some of the documents of the search set and the reference set , computationally classifying the documents in the database , extracting keywords from the search set and one or more classified sets , optionally filtering the extracted keywords, and electronically identifying at least some of the documents from the database that contain one or more of the extracted keywords.
Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.
In various embodiments, online discussions in connection with an eductional resource are improved by analyzing annotations made by students assigned to a discussion group to identify high-quality annotations likely to generate responses and stimulate discussion threads and by making the identified annotations visibile to students not assigned to the discussion group.
Researchers who generate data often optimize efficiency and robustness by choosing stratified over simple random sampling designs. Yet, all theories of inference proposed to justify matching methods are based on simple random sampling. This is all the more troubling because, although these theories require exact matching, most matching applications resort to some form of ex post stratification (on a propensity score, distance metric, or the covariates) to find approximate matches, thus nullifying the statistical properties these theories are designed to ensure. Fortunately, the type of sampling used in a theory of inference is an axiom, rather than an assumption vulnerable to being proven wrong, and so we can replace simple with stratified sampling, so long as we can show, as we do here, that the implications of the theory are coherent and remain true. Properties of estimators based on this theory are much easier to understand and can be satisfied without the unattractive properties of existing theories, such as assumptions hidden in data analyses rather than stated up front, asymptotics, unfamiliar estimators, and complex variance calculations. Our theory of inference makes it possible for researchers to treat matching as a simple form of preprocessing to reduce model dependence, after which all the familiar inferential techniques and uncertainty calculations can be applied. This theory also allows binary, multicategory, and continuous treatment variables from the outset and straightforward extensions for imperfect treatment assignment and different versions of treatments.
We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal --- thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.
Participatory activity carried out using electronic devices is enhanced by occupying the attention of participants who complete a task before a set completion time. For example, a request or question having an expected response time less than the remaining answer time may be provided to early-finishing participants. In another of the many embodiments, the post-response tasks are different for each participant, depending upon, for example, the rate at which the participant has successfully provided answers to previous questions. This ensures continuous engagement of all participants.
In this paper, we illustrate the successful implementation of pre-class reading assignments through a social learning platform that allows students to discuss the reading online with their classmates. We show how the platform can be used to understand how students are reading before class. We find that, with this platform, students spend an above average amount of time reading (compared to that reported in the literature) and that most students complete their reading assignments before class. We identify specific reading behaviors that are predictive of in-class exam performance. We also demonstrate ways that the platform promotes active reading strategies and produces high-quality learning interactions between students outside class. Finally, we compare the exam performance of two cohorts of students, where the only difference between them is the use of the platform; we show that students do significantly better on exams when using the platform.
PARTISAN GERRYMANDERING has long been reviled for thwarting the will of the voters. Yet while voters are acting disgusted, the US Supreme Court has only discussed acting — declaring they have the constitutional right to fix the problem, but doing nothing. But as better data and computer algorithms are now making gerrymandering increasingly effective, continuing to sidestep the issue could do permanent damage to American democracy. In Gill v. Whitford, the soon-to-be-decided challenge to Wisconsin’s 2011 state Assembly redistricting plan, the court could finally fix the problem for the whole country. Judging from the oral arguments, the key to the case is whether the court endorses the concept of “partisan symmetry,” a specific standard for treating political parties equally in allocating legislative seats based on voting.
We demonstrate that exposure to the news media causes Americans to take public stands on specific issues, join national policy conversations, and express themselves publicly—all key components of democratic politics—more often than they would otherwise. After recruiting 48 mostly small media outlets, we chose groups of these outlets to write and publish articles on subjects we approved, on dates we randomly assigned. We estimated the causal effect on proximal measures, such as website pageviews and Twitter discussion of the articles’ specific subjects, and distal ones, such as national Twitter conversation in broad policy areas. Our intervention increased discussion in each broad policy area by approximately \(\approx 62.7\%\) (relative to a day’s volume), accounting for 13,166 additional posts over the treatment week, with similar effects across population subgroups.