
Purpose:

Deletes districts from all variables already read in by the commands YVOTE, YVOTE2, XVARS, XVARS2, and XNEW. This command actually deletes the districts (rows of your data) rather than just setting them to missing values, so be careful. If you read in more data after a DELIF command (and before issuing a new YVOTE statement, which wipes out all data in memory), the new variables (and dataset) must contain exactly the districts remaining in memory after the deletion; the deleted districts must not be included in the new data. Otherwise, the new data will not have the same number of rows as the data already in memory.
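The row-deletion behavior described above can be illustrated with a minimal Python sketch. It is not the program's own code: the names `delete_if` and `add_variable`, and the sample values, are hypothetical and only model the idea that deletion removes rows from every variable in memory, so that later data must match the remaining rows.

```python
# Hypothetical in-memory data: one list per variable, one entry per district.
memory = {
    "vote":  [0.55, 0.48, 0.61, 0.40],
    "south": [0, 1, 0, 1],
}

def delete_if(data, condition):
    """Drop every row for which condition is true, from all variables."""
    return {
        name: [v for v, drop in zip(values, condition) if not drop]
        for name, values in data.items()
    }

def add_variable(data, name, values):
    """Add a variable; it must have as many rows as remain in memory."""
    n_rows = len(next(iter(data.values())))
    if len(values) != n_rows:
        raise ValueError(
            f"{name} has {len(values)} rows, but {n_rows} remain in memory"
        )
    out = dict(data)
    out[name] = list(values)
    return out

# Delete the southern districts; the rows are gone, not marked missing.
north = delete_if(memory, [s == 1 for s in memory["south"]])

# New data must cover only the two remaining (northern) districts.
north = add_variable(north, "money", [10.0, 9.0])

# Data that still includes the deleted districts no longer lines up.
try:
    add_variable(north, "turnout", [0.7, 0.6, 0.8, 0.5])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

In this sketch the mismatch is caught explicitly; the point is only that rows removed by a deletion cannot be restored by reading in data that still contains them.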

For example, suppose you have some variables measured for all districts in the entire state but other variables measured for only the northern districts, and you wish to perform some analyses for all districts and others for just the northern part of the state. You could accommodate this situation in two ways:

  1. Put all the data in a single dataset, with southern districts coded as missing for some variables. When you use only variables for which you have all the data, every district will be included in the analysis; when you include at least one variable with missing data in the southern districts, those rows automatically will be left out of the analysis. If you have only a few variables with missing data in the south, then this is the preferred option.
  2. If you have many variables with missing data for the south, then Option 1 would be tedious because you would have to type in a large number of observations containing only missing value codes. It also would waste disk space. The alternative is to create two datasets. The first contains the variables for which you have all the observations, plus an extra variable coded 0 for northern districts and 1 for southern districts. The second contains the variables for which you have data on the northern districts only. To do an analysis with variables based on all the districts, just use the first dataset. To use these variables together with some variables from the second dataset in an analysis of the northern districts only, (a) read in the desired variables from the first dataset, (b) execute a DELIF using the northern/southern variable you added to the first dataset, and (c) read in any variables you wish to use from the second dataset. For example: XVARS money < c:iowa; delif south < c:southIA; XVARS vote66 < iowa2;.
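As a rough illustration of why the two options lead to the same analysis sample, the following Python sketch models both. The variable names `money`, `south`, and `vote66` follow the example above; the values, the `MISSING` marker, and the helper `complete_rows` are assumptions made only for this sketch.

```python
MISSING = None  # stand-in for a missing value code (assumption)

# Option 1: one dataset; southern districts are coded missing on vote66.
single = {
    "money":  [10.0, 12.0, 9.0, 11.0],
    "vote66": [0.52, MISSING, 0.47, MISSING],
}

def complete_rows(data, names):
    """Keep only rows with no missing value on any requested variable."""
    n_rows = len(data[names[0]])
    return [
        tuple(data[v][i] for v in names)
        for i in range(n_rows)
        if all(data[v][i] is not MISSING for v in names)
    ]

option1_rows = complete_rows(single, ["money", "vote66"])

# Option 2: two datasets; the second holds northern districts only.
first = {"money": [10.0, 12.0, 9.0, 11.0], "south": [0, 1, 0, 1]}
second = {"vote66": [0.52, 0.47]}

# (a) read from the first dataset, (b) delete the southern rows,
# (c) attach the northern-only variable from the second dataset.
north_money = [m for m, s in zip(first["money"], first["south"]) if s == 0]
assert len(north_money) == len(second["vote66"])  # rows must line up
option2_rows = list(zip(north_money, second["vote66"]))
```

Both routes leave the same two northern districts in the analysis; the second simply avoids storing rows that hold nothing but missing value codes.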



Gary King 2006-01-07