Anchoring Vignettes

Can I use anchoring vignettes to understand why respondents understand survey questions in such different ways?

Yes, once you have anchors in the form of vignettes, you can study the reasons for respondents' different understandings. In the chopit model, the thresholds between the response categories are explained with a set of explanatory variables. We can therefore estimate the effects of these variables on the thresholds. Another way to say this is that the model has multiple systematic components: one predicts the actual values of the concept being measured, and the others predict the thresholds between the response categories, both of which vary across respondents.
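As a rough illustration of thresholds that depend on explanatory variables, here is a minimal Python sketch. It assumes one common parametrization: the first threshold is linear in the covariates, and each later threshold equals the previous one plus an exponentiated linear term, which keeps the thresholds ordered. The function names, covariates, and coefficient values are all hypothetical and are not taken from any chopit software.

```python
import math

def thresholds(v, gammas):
    """Return ordered thresholds for one respondent.

    v      -- covariate values (1.0 first, as an intercept)
    gammas -- one coefficient vector per threshold (assumed values)
    """
    taus = []
    for k, g in enumerate(gammas):
        lin = sum(gi * vi for gi, vi in zip(g, v))
        if k == 0:
            taus.append(lin)                        # first threshold: linear
        else:
            taus.append(taus[-1] + math.exp(lin))   # later ones: strictly above the previous
    return taus

def category(latent, taus):
    """Map a latent value to a response category 1..len(taus)+1."""
    return 1 + sum(latent >= t for t in taus)

# Hypothetical coefficients: intercept plus one binary covariate.
gammas = [[-1.0, 0.5], [0.0, 0.2], [0.0, -0.3]]
taus_a = thresholds([1.0, 0.0], gammas)   # respondent with covariate = 0
taus_b = thresholds([1.0, 1.0], gammas)   # respondent with covariate = 1

# Same latent level, different reported categories.
print(category(0.5, taus_a), category(0.5, taus_b))
```

In this sketch the two respondents share the same latent level but report different categories, purely because the covariate shifts their thresholds.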

Studies such as these are useful in their own...

Are universally applicable, culture-independent survey questions possible?

Such a goal is probably not achievable across all domains of inquiry. It is probably not even workable for individual domains in many areas, although it is still important to try. Whether or not universal measurement devices (or universally applicable vignettes) can be invented, we still will often want to compare many aspects of health and other concepts across many different places. Our preference for how to do this in most situations is to get it right in specific contexts, and to build up to more generality when possible by comparing across different small sets of areas in separate...

If I have a direct physical measurement, such as a medical test, do I need anchoring vignettes?

The basic process of measurement involves comparing an object under study with some standard. Without the standard, we have no (valid or meaningful) measurement. Anchoring vignettes provide one possible standard, or anchor, to make measurements meaningful. They serve the same purpose as medical tests or other physical measurements, when those are available. If you can afford to do the physical tests, and if they are accurate in the area you are measuring, then you have no need for vignettes as anchors.

For some concepts, direct physical measurement is infeasible. Consider...

Don't anchoring vignettes merely move the problem of coming up with DIF-free survey questions back one level (from self-assessments to vignettes), so that in the end you have the same problem?

No, the goal of survey design under this approach is not to design DIF-free vignette questions (which is as difficult or impossible as for self-assessment questions). The approach allows respondents to interpret vignette questions in completely different ways. Instead, the goal of survey design is to write vignette questions that have the same types of DIF as the self-assessments, since that provides the necessary information with which we can measure DIF, and with that we can then correct the self-assessments. Since the same respondent will be...

Can I use anchoring vignettes if I don't have variables to predict the thresholds?

Variables that predict thresholds help chopit if they are available. Both chopit and our nonparametric procedure will work without variables that can predict threshold variation, but both procedures would then require respondents who are asked both the self-assessments and the vignettes.

Do I need one vignette for each response category?

No, there is no necessary relationship between the two. You may have more vignettes or fewer vignettes than response categories.
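To see why the two counts are independent, consider a simple nonparametric recode that places a respondent's self-assessment relative to that same respondent's vignette ratings. The Python sketch below is an illustration under stated assumptions: the function name `recode` is hypothetical, the vignettes are listed from lowest to highest described level, and the respondent rates them in strictly increasing order. Ties and order violations need additional treatment that is not shown here.

```python
def recode(y, z):
    """Place self-assessment y relative to vignette ratings z.

    z holds one respondent's ratings of J vignettes, listed from the
    lowest to the highest described level. Returns a value from 1 to
    2*J + 1: odd values fall strictly between vignettes, even values
    equal a vignette rating.
    """
    if any(a >= b for a, b in zip(z, z[1:])):
        raise ValueError("ties or order violations are not handled in this sketch")
    c = 1
    for zj in z:
        if y < zj:
            return c        # strictly below this vignette
        if y == zj:
            return c + 1    # same rating as this vignette
        c += 2              # above this vignette; move to the next gap
    return c                # above every vignette

# Two vignettes rated on a five-category scale give a five-value recode.
print([recode(y, [2, 4]) for y in range(1, 6)])
# Three vignettes on the same scale give a recode with up to seven values.
print(recode(4, [1, 3, 5]), recode(5, [1, 3, 5]))
```

The range of the recoded variable depends only on the number of vignettes, J, and not on the number of response categories, so either count can exceed the other.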

Is there a simpler way of asking questions so we can avoid any statistical analysis?

Direct measurement, that is, measurement without statistical analysis, is preferable when possible. We have tried a variety of simpler strategies in a diverse array of national surveys, but none seems to do remotely as well as anchoring vignettes. For example, we tried asking which of a set of vignettes the respondent is most like, but we found that respondents had a difficult time remembering them all at the same time. Another possibility is to ask whether the respondent has a higher or lower level of health, efficacy, etc., than the first vignette, and then the second, and so on. This is better, but it also does...

Why will anchoring vignettes work when we know that putting educational achievement tests on a common scale has not been possible?

The one research area where our approach clearly does not work is educational testing. The difficulty with educational testing is that no matter how carefully you write the common test questions as anchors, test takers will differ in their responses to them according to both DIF and their knowledge or achievement. Anchoring vignettes solve the problem in other areas because a respondent's answer is only a function of DIF (and estimation variability), and so can be used to adjust the self-assessments. An appropriate anchoring vignette in educational testing would be a test question where all...

What has to go wrong for anchoring vignette corrections to bias my results?

Here are several ways to think about this issue:

First, for simplicity and since statistical methods can deal with it in fairly straightforward ways, imagine that random perceptual and measurement error were nonexistent. Then what needs to happen for all the problems to be fixed is that respondents differ in their interpretation of the vignettes only due to DIF (differential item functioning, or interpersonal incomparability), whereas the responses to the self-assessments must differ due to DIF and the actual values (A) on the concept of interest. In...

Should the vignette describe the age, sex, etc., of the hypothetical person? Should it be self-referential?

Vignette answers are a function of both the actual level of the person in the vignette (θ, the same for all respondents) and the DIF applied by each respondent (differing over respondents). We can think of these answers as responses to the portions of the vignette text that are, respectively, (1) an integral part of describing θ and (2) words used to package these concepts. DIF is generated by the packaging, which human language of course prevents us from eliminating entirely. Fortunately, to meet the assumption of the model,...