Recall that the estsimp command performs two functions: it
estimates the main and ancillary parameters ($\gamma$) of the
statistical model, and it draws simulations of those parameters from
their asymptotic sampling distribution.
Typically, the sampling distribution is multivariate normal with mean
equal to the point estimates of the parameters ($\hat{\gamma}$) and
variance equal to the variance-covariance matrix of the estimates,
$\hat{V}(\hat{\gamma})$. The current version of Clarify contains
two exceptions to this rule.
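As a rough illustration of the default case, the following sketch draws simulations from a multivariate normal centered on the point estimates. It is written in Python rather than Stata and is not Clarify's own code: the function names, the Cholesky-based method, and the numeric values of gamma_hat and v_hat are all assumptions made for the example.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric, positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def draw_mvnormal(mean, cov, n_sims, rng):
    """Draw n_sims vectors from N(mean, cov) as mean + L z, with z ~ N(0, I)."""
    low = cholesky(cov)
    k = len(mean)
    draws = []
    for _ in range(n_sims):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        draws.append(
            [mean[i] + sum(low[i][j] * z[j] for j in range(i + 1)) for i in range(k)]
        )
    return draws


# Hypothetical point estimates and variance-covariance matrix.
gamma_hat = [1.0, -0.5]
v_hat = [[0.04, 0.01], [0.01, 0.09]]
draws = draw_mvnormal(gamma_hat, v_hat, 1000, random.Random(12345))
```

Each element of draws is one simulated parameter vector; the column means should lie close to gamma_hat, though not exactly on it.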
In the case of linear regression, the effect coefficients ($\beta$s)
are drawn from a multivariate normal, but simulations of the
homoskedastic variance $\sigma^2$ are obtained in a separate step from
a scaled inverse $\chi^2$ distribution with $\nu = n - k$ degrees of
freedom, where $n$ is the number of observations in the dataset and
$k$ is the number of explanatory variables, including the constant
term (Gelman, et al., 1995, p. 237). The two-step procedure is
legitimate because the effect coefficients and the variance parameter
are orthogonal in a linear regression; the procedure is desirable
because $\sigma^2$ is strictly positive, and therefore more
appropriately drawn from its exact posterior than from a normal
distribution. To obtain simulations of $\sigma^2$, the program draws
$p$ from a $\chi^2$ with $\nu$ degrees of freedom, and then calculates
$\tilde{\sigma}^2 = \nu\hat{\sigma}^2 / p$. The resulting draws have an
expected value of $\frac{\nu}{\nu - 2}\hat{\sigma}^2$,
which approaches $\hat{\sigma}^2$ as $\nu \to \infty$.
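A minimal sketch of this step, assuming the draw is computed as $\nu\hat{\sigma}^2$ divided by a $\chi^2_\nu$ variate with $\nu = n - k$. The function name draw_sigma2 and the input values are hypothetical; the $\chi^2$ variate is generated as a gamma variate with shape $\nu/2$ and scale 2, which is a standard identity rather than anything specific to Clarify.

```python
import random


def draw_sigma2(sigma2_hat, n, k, n_sims, rng):
    """Simulate sigma^2 from a scaled inverse chi-square with nu = n - k df."""
    nu = n - k
    draws = []
    for _ in range(n_sims):
        # A chi-square variate with nu df is Gamma(shape=nu/2, scale=2).
        p = rng.gammavariate(nu / 2.0, 2.0)
        draws.append(nu * sigma2_hat / p)
    return draws


# Hypothetical inputs: estimated variance 2.0, 30 observations, 3 regressors.
sigma2_draws = draw_sigma2(2.0, n=30, k=3, n_sims=1000, rng=random.Random(2024))
```

Every draw is strictly positive by construction. With these inputs $\nu = 27$, so the mean of the draws should settle near $(27/25) \times 2.0 = 2.16$ as the number of simulations grows.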
Likewise, the effect coefficients ($\beta$s) of a seemingly unrelated
regression are drawn from the multivariate normal, but simulations of
the variance matrix $\Sigma$ are obtained in a separate step. Here,
the appropriate posterior distribution is the inverse Wishart (Gelman,
et al., 1995, p. 481) with $\nu = n - k$ degrees of freedom and
dimension $d$, where $n$ is the number of observations, $k$ is the
number of explanatory variables, and $d$ is the number of equations in
the seemingly unrelated regression model. In cases where the number of
explanatory variables varies from one equation to the next, Clarify
calculates $n - k$ for each equation and sets $\nu$ equal to the mean
of those values. To obtain simulations of $\Sigma$, the program draws
from a Wishart with scale factor $(\nu\hat{\Sigma})^{-1}$ and inverts
the draws. The algorithm for drawing from the Wishart relies on
Bartlett's decomposition, which is concisely summarized in Johnson
(1987, p. 204) and Ripley (1987, pp. 99-100). estsimp produces draws
that have an expected value of $\frac{\nu}{\nu - d - 1}\hat{\Sigma}$,
which approaches $\hat{\Sigma}$ as $\nu$ goes to infinity. In small
samples this procedure is conservative, since
$\frac{\nu}{\nu - d - 1} > 1$, implying that the simulated variances
exceed their point estimates on average.
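The sketch below shows one way to draw from a Wishart with Bartlett's decomposition and then invert the result. It assumes the Wishart scale matrix is $(\nu\hat{\Sigma})^{-1}$, so that the inverted draws have mean $\nu\hat{\Sigma}/(\nu - d - 1)$. All function names and the example matrix are hypothetical, and the code illustrates the technique only; it does not reproduce Clarify's Stata implementation.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L L^T = a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def draw_wishart(nu, scale, rng):
    """Bartlett: W = (L A)(L A)^T, with L the Cholesky factor of the scale matrix."""
    d = len(scale)
    low = cholesky(scale)
    a = [[0.0] * d for _ in range(d)]
    for i in range(d):
        # Diagonal: square root of a chi-square with nu - i df (i is 0-based).
        a[i][i] = math.sqrt(rng.gammavariate((nu - i) / 2.0, 2.0))
        for j in range(i):
            a[i][j] = rng.gauss(0.0, 1.0)  # below the diagonal: standard normals
    la = matmul(low, a)
    return matmul(la, transpose(la))


def draw_sigma(sigma_hat, nu, rng):
    """One inverse-Wishart draw for Sigma: invert a Wishart(nu, (nu*Sigma_hat)^-1) draw."""
    scale = inverse([[nu * x for x in row] for row in sigma_hat])
    return inverse(draw_wishart(nu, scale, rng))


# Hypothetical 2-equation example with nu = 20.
sigma_hat = [[1.0, 0.3], [0.3, 2.0]]
sigma_tilde = draw_sigma(sigma_hat, 20, random.Random(99))
```

Averaging many such draws should give a matrix close to $(20/17)\hat{\Sigma}$ for this example, which is slightly larger than $\hat{\Sigma}$ itself.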
For all models, simulations of the main and ancillary parameters are
random. This means that, in any given run of estsimp, the
average value of the simulated parameters $\tilde{\gamma}$ may be
slightly smaller or larger than the point estimate $\hat{\gamma}$,
though the approximation becomes more precise with a higher number of
simulations. Users can force the mean of the simulated parameters to
equal the vector of point estimates by requesting antithetical
simulations (Stern 1997, pp. 2028-29). The antisim option instructs
the program to draw random numbers in pairs from the uniform[0,1]
distribution, with the second draw being the complement of the first.
For instance, if the first draw is 0.3 then the complementary draw is
0.7. The draws are, therefore, exactly balanced around the mean of the
uniform distribution. These antithetical simulations are then used to
obtain antithetical or balanced draws from the multivariate normal.
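To see why complementary uniform draws balance the normal draws, consider the univariate sketch below. It assumes the uniforms are mapped to normals through the inverse normal CDF, which is one common way to do this; the text does not say how Clarify performs the transformation. The function name and the values 0.8 and 0.1 are hypothetical.

```python
import random
from statistics import NormalDist


def antithetic_normals(n_pairs, rng):
    """Return 2 * n_pairs standard normal draws built from uniform pairs (u, 1 - u)."""
    std_normal = NormalDist()
    draws = []
    for _ in range(n_pairs):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs 0 < u < 1
            u = rng.random()
        draws.append(std_normal.inv_cdf(u))
        draws.append(std_normal.inv_cdf(1.0 - u))  # mirror image of the first draw
    return draws


# Hypothetical point estimate 0.8 with standard error 0.1.
z = antithetic_normals(500, random.Random(7))
sims = [0.8 + 0.1 * value for value in z]
mean_of_sims = sum(sims) / len(sims)
```

Because each pair of normal draws sums to zero (up to floating-point rounding), mean_of_sims matches the point estimate to within rounding error regardless of the number of pairs.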
When users are analyzing a single dataset, Clarify estimates a single
vector $\hat{\gamma}$ with variance $\hat{V}(\hat{\gamma})$ and draws
all $M$ simulations based on those estimates. The table that appears on
the screen gives the exact point estimates and standard errors,
instead of reporting the means and standard deviations of the
simulations.
The procedure is somewhat more complicated when the researcher employs
the mi option to analyze several imputed datasets. In this
case, estsimp repeats the following algorithm $m$ times,
where $m$ is the number of completed datasets: estimate the parameters
and their variance-covariance matrix conditional on the information in
dataset $j$ ($j = 1, \ldots, m$), and then draw $M/m$ sets of
parameters from their sampling distribution, where $M$ is the total
number of simulations. By repeating this algorithm $m$ times, the
program generates $M$ sets of simulated parameters. The output table
gives the analytical point estimate, standard error, and $t$-statistic
for each parameter, instead of reporting the means and standard
deviations of the simulations.
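Specifically, the multiple-imputation point estimate for parameter $q$
is $\bar{q} = \frac{1}{m}\sum_{j=1}^{m}\hat{q}_j$, the average of the
estimates $\hat{q}_j$ from the $m$ completed datasets, and the variance
associated with $\bar{q}$ is a weighted combination of the
within-imputation and between-imputation variances:
$T = \bar{W} + \left(1 + \frac{1}{m}\right)B$, where
$\bar{W} = \frac{1}{m}\sum_{j=1}^{m}\widehat{SE}(\hat{q}_j)^2$ and
$B = \frac{1}{m-1}\sum_{j=1}^{m}(\hat{q}_j - \bar{q})^2$. The ratio of
$\bar{q}$ (the parameter estimate) to $\sqrt{T}$ (its standard error)
forms a $t$-statistic with degrees of freedom
$(m - 1)\left[1 + \frac{\bar{W}}{(1 + 1/m)B}\right]^2$. For more
information about these procedures, see King, et al. (2001) and
Schafer (1997, pp. 109-110).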
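The combining rules can be sketched as follows, assuming the standard multiple-imputation formulas: the average of the per-dataset estimates, plus a total variance made up of the mean within-imputation variance and the between-imputation variance inflated by $1 + 1/m$. The function name combine_mi and the input numbers are hypothetical.

```python
import math


def combine_mi(estimates, std_errors):
    """Combine per-dataset estimates and standard errors for one parameter.

    Returns (point estimate, standard error, t-statistic, degrees of freedom).
    Requires at least two datasets and some variation across the estimates.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m
    w_bar = sum(se ** 2 for se in std_errors) / m           # within-imputation
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    total_var = w_bar + (1.0 + 1.0 / m) * b
    se = math.sqrt(total_var)
    df = (m - 1) * (1.0 + w_bar / ((1.0 + 1.0 / m) * b)) ** 2
    return q_bar, se, q_bar / se, df


# Hypothetical estimates of one coefficient from m = 5 completed datasets.
q_bar, se, t_stat, df = combine_mi(
    [1.0, 1.2, 0.8, 1.1, 0.9],
    [0.2, 0.2, 0.2, 0.2, 0.2],
)
```

For these inputs the within-imputation variance is 0.04 and the between-imputation variance is 0.025, giving a total variance of 0.07 and roughly 21.8 degrees of freedom. If every dataset yields the same estimate, the between-imputation variance is zero and the degrees-of-freedom formula is undefined, so a real implementation would need to handle that case separately.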