

estsimp

Format:

estsimp modelname depvar [indepvars] [weight] [if exp] [in range]
    [, sims(M) genname(newvar) antisim mi(file1 file2 ... filek) iout dropsims]

Description:

estsimp estimates a variety of statistical models and generates $ M$ simulations of each parameter. Currently supported models include regress, logit, probit, ologit, oprobit, mlogit, poisson, nbreg, weibull, sureg, and the additive normal model for compositional data. The simulations are stored in new variables bearing the names $ newvar1, newvar2, \ldots, newvark$, where $ k$ is the number of parameters. Each variable has $ M$ observations corresponding to the $ M$ simulations. estsimp labels the simulated variables and lists their names on the screen, so you can verify what was simulated. The estsimp command accepts nearly all options that are typically available for the supported models. It also accepts several special options that are described below.

Options:

sims(M)
specifies the number of simulations, $ M$, which must be a positive integer. The default is 1000 simulations. If you choose a large number of simulations, you may need to allocate more memory to Stata. See [R] memory in the Stata reference manual for more details about memory allocation.

genname(newvar)
specifies a stub-name for the newly generated variables. If no stub is given, Stata will generate the variables $ b1, b2, \ldots, bk$; otherwise it will generate $ newvar1, newvar2, \ldots, newvark$, provided that variables with those names do not already exist in memory.

antisim
instructs estsimp to use antithetical simulations, in which numbers are drawn in pairs from the uniform[0,1] distribution, with the second draw being the complement of the first. The antithetical draws are then used to obtain simulations from a multivariate normal distribution. This procedure ensures that the mean of the simulations for a particular parameter is equal to the point estimate of that parameter.

mi(filelist)
allows estsimp to analyze multiply-imputed datasets: files in which missing values have been multiply imputed, such as those created by Amelia. Enter the name of each imputed dataset you want to use, such as mi(file1 file2 file3). Alternatively, you can enter a common stub name for all imputed datasets, such as mi(file). In this case, estsimp assumes that you want to use all files in the working directory that are part of the uninterrupted sequence file1, file2, file3, and so on. estsimp will estimate the parameters for each dataset and use the estimates to generate simulations, which will reflect not only estimation uncertainty but also the uncertainty arising from the imputation process. Note: if the data in memory have been changed, you cannot specify the mi() option until you clear the memory or save the altered dataset.

iout
instructs estsimp to print intermediate output (a table of parameter estimates) for each imputed dataset that it analyzes. By default, estsimp suppresses the intermediate output and displays only the final estimates produced by combining the results from each imputed dataset.

dropsims
drops the simulated parameters from the previous call to estsimp.

Examples:

To estimate a linear regression of $ y$ on $ x1$, $ x2$, $ x3$, and a constant term; simulate 1000 sets of parameter estimates; and then save the simulations as $ b1$, $ b2$, ..., $ bk$, type:

     . estsimp regress y x1 x2 x3

In this example, Stata will create five new variables. The variables $ b1$, $ b2$ and $ b3$ will contain simulated coefficients for $ x1$, $ x2$ and $ x3$; $ b4$ will hold simulations of the constant term; and $ b5$ will contain simulated values for sigma squared, the mean squared error of the regression.
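
Once estsimp has run, the simulations are ordinary Stata variables and can be inspected with standard commands. The following is a minimal sketch, assuming the default names $ b1$ through $ b5$ from the example above; summarize, _pctile, and display are built-in Stata commands, not part of estsimp:

     . * means and standard deviations of the simulated parameters
     . summarize b1 b2 b3 b4 b5
     . * 2.5th and 97.5th percentiles of the simulated coefficient on x1
     . _pctile b1, p(2.5 97.5)
     . display r(r1) "  " r(r2)

The two values displayed bound the central 95 percent of the simulations of the coefficient on $ x1$.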

To simulate 500 sets of parameters from a logit regression and save the results as variables beginning with the letter "s", type:

     . estsimp logit y x1 x2 x3, sims(500) genname(s)

Since the logit model contains no ancillary parameters, this command will generate four new variables: $ s1$, $ s2$, $ s3$, and $ s4$. Variables $ s1$-$ s3$ are simulated coefficients for $ x1$, $ x2$ and $ x3$, and the final variable, $ s4$, is the simulated constant term.

To simulate 1000 sets of parameters from an ordered probit regression in which the dependent variable can assume three values (low, medium, and high), type:

     . estsimp oprobit y x1 x2 x3

The ordered probit model does not contain a constant term, but it does have ancillary parameters called cutpoints. Thus, the estsimp command listed above will generate five new variables. The variables $ b1$, $ b2$ and $ b3$ will hold simulated coefficients for $ x1$, $ x2$ and $ x3$. Variables $ b4$ and $ b5$ will contain simulations for the two cutpoints (cut1 and cut2).

To obtain antithetical variates, simply use the antisim option, as in:

     . estsimp oprobit y x1 x2 x3, antisim
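
The mean-matching property described for antisim can be checked directly after the command above. This is a minimal sketch, assuming the default stub, so that $ b1$ holds the simulated coefficient on $ x1$; summarize is a built-in Stata command:

     . * mean of the antithetical simulations of the coefficient on x1
     . summarize b1

The mean reported by summarize should agree, up to rounding, with the coefficient on $ x1$ in the table of estimates that estsimp prints.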

Suppose that we have three imputed datasets, called imp1.dta, imp2.dta, and imp3.dta. We could analyze all three datasets and combine the results by issuing the following command:

     . estsimp oprobit y x1 x2 x3, mi(imp1 imp2 imp3)

The resulting simulations of the main and ancillary parameters would reflect both estimation uncertainty and the variability associated with the multiple imputations.
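
Because the three files share the stub imp and are numbered consecutively, the stub form of mi() described under Options should serve as a shorthand for the same analysis. The sketch below assumes that imp1.dta, imp2.dta, and imp3.dta are the only files of that sequence in the working directory:

     . * same analysis, naming the imputed datasets by their common stub
     . estsimp oprobit y x1 x2 x3, mi(imp)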

To view the intermediate output from each ordered probit estimation, add the iout option to the previous command, as in:

     . estsimp oprobit y x1 x2 x3, mi(imp1 imp2 imp3) iout
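
Each of the commands above creates new variables, and estsimp generates them only if variables with those names do not already exist in memory. When estimating a second model in the same session, the simulations left by the previous call can be removed with dropsims. This is a minimal sketch, assuming dropsims is passed as an option to the new call, as the Format line indicates:

     . * discard the earlier simulations, then simulate from a new logit model
     . estsimp logit y x1 x2 x3, sims(500) genname(s) dropsims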



Gary King 2006-01-04