Background: We assessed aspects of Seguro Popular, a programme intended to deliver health insurance, regular and preventive medical care, medicines, and health facilities to 50 million uninsured Mexicans. Methods: We randomly assigned treatment within 74 matched pairs of health clusters (i.e., health facility catchment areas) representing 118,569 households in seven Mexican states, and measured outcomes in a baseline survey (August to September 2005) and a follow-up survey 10 months later (July to August 2006) in 50 pairs (n=32 515). The treatment consisted of encouragement to enrol in a health-insurance programme and upgraded medical facilities. Participant states also received funds to improve health facilities and to provide medications for services in treated clusters. We estimated intention-to-treat and complier average causal effects non-parametrically. Findings: Intention-to-treat estimates indicated a 23% reduction from baseline in catastrophic expenditures (1·9% points, 95% CI 0·14-3·66). The effect in poor households was 3·0% points (0·46-5·54) and in experimental compliers was 6·5% points (1·65-11·28), 30% and 59% reductions, respectively. The intention-to-treat effect on health spending in poor households was 426 pesos (39-812), and the complier average causal effect was 915 pesos (147-1684). Contrary to expectations and previous observational research, we found no effects on medication spending, health outcomes, or utilisation. Interpretation: Programme resources reached the poor. However, the programme did not show some other effects, possibly due to the short duration of treatment (10 months). Although Seguro Popular seems to be successful at this early stage, further experiments and follow-up studies, with longer assessment periods, are needed to ascertain the long-term effects of the programme.
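The relation between the two estimands named in the Methods can be illustrated with a minimal sketch. It assumes unit-level data, ignores the matched-pair cluster design, and uses the standard ratio form of the complier average causal effect (the intention-to-treat effect on the outcome divided by the intention-to-treat effect on enrolment), which relies on the usual encouragement-design assumptions. The function name and all data below are hypothetical and are not taken from the study.

```python
from statistics import mean


def itt_and_cace(assigned, enrolled, outcome):
    """Return (ITT on outcome, ITT on enrolment, CACE) from unit-level data.

    assigned: 1 if the unit was randomly encouraged to enrol, else 0
    enrolled: 1 if the unit actually enrolled, else 0
    outcome:  numeric outcome (e.g., 1 for a catastrophic expenditure)
    """
    def group_mean(values, arm):
        return mean([v for z, v in zip(assigned, values) if z == arm])

    # Difference in mean outcome between the randomized arms.
    itt_outcome = group_mean(outcome, 1) - group_mean(outcome, 0)
    # Difference in enrolment rates, i.e., the estimated share of compliers.
    itt_enrol = group_mean(enrolled, 1) - group_mean(enrolled, 0)
    if itt_enrol == 0:
        raise ValueError("encouragement did not change enrolment; CACE is undefined")
    return itt_outcome, itt_enrol, itt_outcome / itt_enrol


# Hypothetical data: 10 encouraged units followed by 10 control units.
assigned = [1] * 10 + [0] * 10
enrolled = [1] * 5 + [0] * 15
outcome = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0] + [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

itt_y, itt_d, cace = itt_and_cace(assigned, enrolled, outcome)
print(round(itt_y, 3), round(itt_d, 3), round(cace, 3))
```

In this toy example the effect among compliers is twice the intention-to-treat effect because only half of the encouraged units enrolled, which is the same mechanism that makes the complier estimates in the Findings larger than the intention-to-treat estimates.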
We introduce a new framework for forecasting age-sex-country-cause-specific mortality rates that incorporates considerably more information, and thus has the potential to forecast much better, than any existing approach. Mortality forecasts are used in a wide variety of academic fields, and for global and national health policy making, medical and pharmaceutical research, and social security and retirement planning.
As it turns out, the tools we developed in pursuit of this goal also have broader statistical implications, in addition to their use for forecasting mortality or other variables with similar statistical properties. First, our methods make it possible to include different explanatory variables in a time series regression for each cross-section, while still borrowing strength from one regression to improve the estimation of all. Second, we show that many existing Bayesian (hierarchical and spatial) models with explanatory variables use prior densities that incorrectly formalize prior knowledge. Many demographers and public health researchers have fortuitously avoided this problem so prevalent in other fields by using prior knowledge only as an ex post check on empirical results, but this approach excludes considerable information from their models. We show how to incorporate this demographic knowledge into a model in a statistically appropriate way. Finally, we develop a set of tools useful for developing models with Bayesian priors in the presence of partial prior ignorance. This approach also provides many of the attractive features claimed by the empirical Bayes approach, but fully within the standard Bayesian theory of inference.
A "Perspective" article that discusses an article by David Stuckler and colleagues showing that, in Eastern European and former Soviet countries, participation in International Monetary Fund economic programs has been associated with higher mortality rates from tuberculosis.
While the Supreme Court in Davis v. Bandemer found partisan gerrymandering to be justiciable, no challenged redistricting plan in the subsequent 20 years has been held unconstitutional on partisan grounds. Then, in Vieth v. Jubelirer, five justices concluded that some standard might be adopted in a future case, if a manageable rule could be found. When gerrymandering next came before the Court, in LULAC v. Perry, we along with our colleagues filed an Amicus Brief (King et al., 2005), proposing that the test be based in part on the partisan symmetry standard. Although the issue was not resolved, our proposal was discussed and positively evaluated in three of the opinions, including the plurality judgment, and, for the first time for any proposal, the Court gave a clear indication that a future legal test for partisan gerrymandering will likely include partisan symmetry. A majority of Justices now appear to endorse the view that the measurement of partisan symmetry may be used in partisan gerrymandering claims as “a helpful (though certainly not talismanic) tool” (Justice Stevens, joined by Justice Breyer), provided one recognizes that “asymmetry alone is not a reliable measure of unconstitutional partisanship” and possibly that the standard would be applied only after at least one election has been held under the redistricting plan at issue (Justice Kennedy, joined by Justices Souter and Ginsburg).
We use this essay to respond to the request of Justices Souter and Ginsburg that “further attention … be devoted to the administrability of such a criterion at all levels of redistricting and its review.” Building on our previous scholarly work, our Amicus Brief, the observations of these five Justices, and a supporting consensus in the academic literature, we offer here a social science perspective on the conceptualization and measurement of partisan gerrymandering and the development of relevant legal rules based on what is effectively the Supreme Court’s open invitation to lower courts to revisit these issues in the light of LULAC v. Perry.
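As a rough illustration of what measuring partisan symmetry involves, the sketch below checks whether a party would win the same share of seats as its opponent would if the two parties' vote shares were exchanged. It assumes uniform partisan swing (every district shifts by the same amount), which is a simplifying assumption made here for illustration and not the measurement procedure proposed in the essay. The function names and district vote shares are hypothetical.

```python
def seat_share(district_votes, target_mean):
    """Party A's seat share if its average district vote share were target_mean.

    Assumes uniform partisan swing: each district's vote share is shifted by
    the same amount so that the average equals target_mean.
    """
    shift = target_mean - sum(district_votes) / len(district_votes)
    seats_won = sum(1 for v in district_votes if v + shift > 0.5)
    return seats_won / len(district_votes)


def partisan_bias(district_votes, vote_share=0.5):
    """Half the gap between A's seats at vote_share and B's seats at the same share.

    Zero means the plan treats the parties symmetrically at that vote share;
    positive values favour party A.
    """
    seats_a = seat_share(district_votes, vote_share)
    # When A receives 1 - vote_share, party B receives vote_share.
    seats_b = 1 - seat_share(district_votes, 1 - vote_share)
    return (seats_a - seats_b) / 2


# Hypothetical plans: party A's two-party vote share in each district.
packed_plan = [0.35, 0.55, 0.56, 0.57, 0.58]
balanced_plan = [0.3, 0.45, 0.55, 0.7]

print(round(partisan_bias(packed_plan), 3))
print(round(partisan_bias(balanced_plan), 3))
```

In the first hypothetical plan, party A would win four of five seats with half the votes, so the plan is asymmetric in A's favour; the second plan shows no bias at an even split.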
We highlight, and suggest ways to avoid, a large number of common misunderstandings in the literature about best practices in qualitative research. We discuss these issues in four areas: theory and data, qualitative and quantitative strategies, causation and explanation, and selection bias. Some of the misunderstandings involve incendiary debates within our discipline that are readily resolved either directly or with results known in research areas that happen to be unknown to political scientists. Many of these misunderstandings can also be found in quantitative research, often with different names, and some can be fixed with reference to ideas better understood in the qualitative methods literature. Our goal is to improve the ability of quantitatively and qualitatively oriented scholars to enjoy the advantages of insights from both areas. Thus, throughout, we attempt to construct specific practical guidelines that can be used to improve actual qualitative research designs, not only the qualitative methods literatures that talk about them.
We attempt to clarify, and suggest how to avoid, several serious misunderstandings about and fallacies of causal inference in experimental and observational research. These issues concern some of the most basic advantages and disadvantages of each basic research design. Problems include improper use of hypothesis tests for covariate balance between the treated and control groups, and the consequences of using randomization, blocking before randomization, and matching after treatment assignment to achieve covariate balance. Applied researchers in a wide range of scientific disciplines seem to fall prey to one or more of these fallacies, and as a result make suboptimal design or analysis choices. To clarify these points, we derive a new four-part decomposition of the key estimation errors in making causal inferences. We then show how this decomposition can help scholars from different experimental and observational research traditions better understand each other’s inferential problems and attempted solutions.
The enormous Nazi voting literature rarely builds on modern statistical or economic research. By adding these approaches, we find that the most widely accepted existing theories of this era cannot distinguish the Weimar elections from almost any others in any country. Via a retrospective voting account, we show that voters most hurt by the depression, and most likely to oppose the government, fall into separate groups with divergent interests. This explains why some turned to the Nazis and others turned away. The consequences of Hitler's election were extraordinary, but the voting behavior that led to it was not.
We describe some progress toward a common framework for statistical analysis and software development built on and within the R language, including R’s numerous existing packages. The framework we have developed offers a simple unified structure and syntax that can encompass a large fraction of statistical procedures already implemented in R, without requiring any changes in existing approaches. We conjecture that it can be used to encompass and present simply a vast majority of existing statistical methods, regardless of the theory of inference on which they are based, notation with which they were developed, and programming syntax with which they have been implemented. This development enabled us, and should enable others, to design statistical software with a single, simple, and unified user interface that helps overcome the conflicting notation, syntax, jargon, and statistical methods existing across the methods subfields of numerous academic disciplines. The approach also enables one to build a graphical user interface that automatically includes any method encompassed within the framework. We hope that the result of this line of research will greatly reduce the time from the creation of a new statistical innovation to its widespread use by applied researchers whether or not they use or program in R.
Verbal autopsy procedures are widely used for estimating cause-specific mortality in areas without medical death certification. Data on symptoms reported by caregivers along with the cause of death are collected from a medical facility, and the cause-of-death distribution is estimated in the population where only symptom data are available. Current approaches analyze only one cause at a time, involve assumptions judged difficult or impossible to satisfy, and require expensive, time consuming, or unreliable physician reviews, expert algorithms, or parametric statistical models. By generalizing current approaches to analyze multiple causes, we show how most of the difficult assumptions underlying existing methods can be dropped. These generalizations also make physician review, expert algorithms, and parametric statistical assumptions unnecessary. With theoretical results, and empirical analyses in data from China and Tanzania, we illustrate the accuracy of this approach. While no method of analyzing verbal autopsy data, including the more computationally intensive approach offered here, can give accurate estimates in all circumstances, the procedure offered is conceptually simpler, less expensive, more general, as or more replicable, and easier to use in practice than existing approaches. We also show how our focus on estimating aggregate proportions, which are the quantities of primary interest in verbal autopsy studies, may also greatly reduce the assumptions necessary, and thus improve the performance of, many individual classifiers in this and other areas. As a companion to this paper, we also offer easy-to-use software that implements the methods discussed herein.
When respondents use the ordinal response categories of standard survey questions in different ways, analyses based on the resulting data can be biased. Anchoring vignettes is a survey design technique, introduced by King, Murray, Salomon, and Tandon (2004), intended to correct for some of these problems. We develop new methods both for evaluating and choosing anchoring vignettes, and for analyzing the resulting data. With surveys on a diverse range of topics in a range of countries, we illustrate how our proposed methods can improve the ability of anchoring vignettes to extract information from survey data and reduce survey administration costs.
Inferences about counterfactuals are essential for prediction, answering "what if" questions, and estimating causal effects. However, when the counterfactuals posed are too far from the data at hand, conclusions drawn from well-specified statistical analyses become based on speculation and convenient but indefensible model assumptions rather than empirical evidence. Unfortunately, standard statistical approaches assume the veracity of the model rather than revealing the degree of model-dependence, and so this problem can be hard to detect. We develop easy-to-apply methods to evaluate counterfactuals that do not require sensitivity testing over specified classes of models. If an analysis fails the tests we offer, then we know that substantive results are sensitive to at least some modeling choices that are not based on empirical evidence. We use these methods to evaluate the extensive scholarly literatures on the effects of changes in the degree of democracy in a country (on any dependent variable) and separate analyses of the effects of UN peacebuilding efforts. We find evidence that many scholars are inadvertently drawing conclusions based more on modeling hypotheses than on their data. For some research questions, history contains insufficient information to be our guide.
We introduce a set of integrated developments in web application software, networking, data citation standards, and statistical methods designed to put some of the universe of data and data sharing practices on somewhat firmer ground. We have focused on social science data, but aspects of what we have developed may apply more widely. The idea is to facilitate the public distribution of persistent, authorized, and verifiable data, with powerful but easy-to-use technology, even when the data are confidential or proprietary. We intend to solve some of the sociological problems of data sharing via technological means, with the result intended to benefit both the scientific community and the sometimes apparently contradictory goals of individual researchers.
Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author’s favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these misinterpretations and propose a unified approach that makes it possible for researchers to preprocess data with matching (such as with the easy-to-use software we offer) and then to apply the best parametric techniques they would have used anyway. This procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.
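The preprocessing idea can be sketched as follows, under loud assumptions: a single covariate, greedy one-to-one nearest-neighbour matching without replacement, and a caliper. None of this is the authors' software or their recommended matching procedure; the function name and the data are hypothetical. After pruning, any parametric analysis the researcher would have run anyway can be applied to the matched data; here the simplest possible one, a difference in means, stands in for it.

```python
from statistics import mean


def nearest_neighbor_match(units, caliper):
    """Return the matched subset of units, each a (treated, x, y) tuple.

    Greedy 1:1 nearest-neighbour matching on x without replacement. Treated
    units with no remaining control within the caliper are dropped.
    """
    treated = [u for u in units if u[0] == 1]
    controls = [u for u in units if u[0] == 0]
    matched = []
    for t in treated:
        if not controls:
            break
        best = min(controls, key=lambda c: abs(c[1] - t[1]))
        if abs(best[1] - t[1]) <= caliper:
            matched.extend([t, best])
            controls.remove(best)  # each control is used at most once
    return matched


def difference_in_means(units):
    """Mean outcome of treated units minus mean outcome of control units."""
    return (mean([y for d, _, y in units if d == 1])
            - mean([y for d, _, y in units if d == 0]))


# Hypothetical data: (treated, covariate x, outcome y).
units = [(1, 1.0, 5.0), (1, 2.0, 7.0), (1, 9.0, 20.0),
         (0, 1.0, 3.0), (0, 2.5, 5.0), (0, 5.0, 9.0)]

matched = nearest_neighbor_match(units, caliper=1.0)
print(difference_in_means(units), difference_in_means(matched))
```

The treated unit at x = 9.0 has no comparable control and is pruned, so the estimate on the matched data no longer depends on extrapolating the model far from the observed controls.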
An essential aspect of science is a community of scholars cooperating and competing in the pursuit of common goals. A critical component of this community is the common language of and the universal standards for scholarly citation, credit attribution, and the location and retrieval of articles and books. We propose a similar universal standard for citing quantitative data that retains the advantages of print citations, adds other components made possible by, and needed due to, the digital form and systematic nature of quantitative data sets, and is consistent with most existing subfield-specific approaches. Although the digital library field includes numerous creative ideas, we limit ourselves to only those elements that appear ready for easy practical use by scientists, journal editors, publishers, librarians, and archivists.
We develop an approach to conducting large scale randomized public policy experiments intended to be more robust to the political interventions that have ruined some or all parts of many similar previous efforts. Our proposed design is insulated from selection bias in some circumstances even if we lose observations; our inferences can still be unbiased even if politics disrupts any two of the three steps in our analytical procedures; and other empirical checks are available to validate the overall design. We illustrate with a design and empirical validation of an evaluation of the Mexican Seguro Popular de Salud (Universal Health Insurance) program we are conducting. Seguro Popular, which is intended to grow to provide medical care, drugs, preventative services, and financial health protection to the 50 million Mexicans without health insurance, is one of the largest health reforms of any country in the last two decades. The evaluation is also large scale, constituting one of the largest policy experiments to date and what may be the largest randomized health policy experiment ever.
We demonstrate here several previously unrecognized or insufficiently appreciated properties of the Lee-Carter mortality forecasting approach, the dominant method used in both the academic literature and practical applications. We show that this model is a special case of a considerably simpler, and less often biased, random walk with drift model, and prove that the age profile forecast from both approaches will always become less smooth and unrealistic after a point (when forecasting forward or backwards in time) and will eventually deviate from any given baseline. We use these and other properties we demonstrate to suggest when the model would be most applicable in practice.
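A minimal sketch of a random walk with drift forecast, applied separately to each age group, may help make the point about age profiles concrete. It assumes log mortality rates observed at equally spaced times and produces point forecasts only; the function name and the rates are hypothetical, and this is not the Lee-Carter estimation procedure itself.

```python
def rwd_forecast(log_rates, horizon):
    """Point forecasts from a random walk with drift.

    The estimated drift is the average one-period change, which equals
    (last - first) / (T - 1). Forecasts extend the last observation
    linearly at that rate.
    """
    if len(log_rates) < 2:
        raise ValueError("need at least two observations to estimate the drift")
    drift = (log_rates[-1] - log_rates[0]) / (len(log_rates) - 1)
    return [log_rates[-1] + h * drift for h in range(1, horizon + 1)]


# Hypothetical log mortality rates for two adjacent age groups.
history = {
    "60-64": [-4.0, -4.25, -4.5],
    "65-69": [-3.5, -3.625, -3.75],
}
forecasts = {age: rwd_forecast(series, horizon=2) for age, series in history.items()}

# Gap between adjacent age groups: last observed value, then at each horizon.
observed_gap = history["65-69"][-1] - history["60-64"][-1]
forecast_gaps = [b - a for a, b in zip(forecasts["60-64"], forecasts["65-69"])]
print(observed_gap, forecast_gaps)
```

Because each age group follows its own drift, the gap between adjacent age groups changes linearly with the forecast horizon, so the forecast age profile moves steadily away from the last observed profile.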