- _EalphaB
- (cols(Zb) $\times$ 2) matrix of means (in the first
column) and standard deviations (in the second) of an independent
normal prior distribution on elements of $\alpha^b$. If you specify
Zb, you should probably specify a prior, at least with mean
zero and some variance (default={.}, which indicates no prior).
(See Equation 9.2, page 170, to interpret $\alpha^b$.) (If you are
using EzI and have trouble setting something to 0, try 0.000001 or
some such; this gets around an error in a Gauss proc and should give
essentially the same answer empirically.)
- _EalphaW
- (cols(Zw) $\times$ 2) matrix of means (in the first
column) and standard deviations (in the second) of an independent
normal prior distribution on elements of $\alpha^w$. If you specify
Zw, you should probably specify a prior, at least with mean
zero and some variance (default={.}, which indicates no prior).
(See Equation 9.2, page 170, to interpret $\alpha^w$.) (If you are
using EzI and have trouble setting something to 0, try 0.000001 or
some such; this gets around an error in a Gauss proc and should give
essentially the same answer empirically.)
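As a rough illustration of the two prior globals above, the following GAUSS fragment sets mean-zero priors for a hypothetical model with two covariates in Zb and one in Zw. The standard deviation of 2 is an arbitrary placeholder chosen for the example, not a recommended value.

```gauss
/* Hypothetical setup: Zb has 2 columns, Zw has 1 column.        */
/* Each row is one coefficient: mean (col 1), std. dev. (col 2). */
_EalphaB = { 0 2,
             0 2 };     /* one row per column of Zb */
_EalphaW = { 0 2 };     /* one row per column of Zw */
```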
- _Ebeta
- Standard deviation of the ``flat normal'' prior on
$\breve{\mathfrak{B}}^b$ and $\breve{\mathfrak{B}}^w$. The flat normal
prior is uniform within the unit square and drops off outside the
square according to the normal distribution. Set to zero for no prior
(default).
Setting to positive values probabilistically keeps the estimated
mode within the unit square. 0.25 is a reasonable value to
experiment with at first.
- _Ebounds
- 1 to set CML bounds on parameters automatically
unless z's are included; 0 to use no bounds; or a
$k \times 2$ matrix (where $k$ is the number of starting values) or
$1 \times 2$ matrix to indicate upper and lower bounds. (Do not
confuse the bounds referred to here with the bounds on the quantities
of interest.)
Default=1.
- _Ecdfbvn
- Determines which procedure to use for computing the
area of the bivariate normal distribution above the unit square: 1
based on the Gauss function CDFBVN; 2 Martin van der Ende's
method (based on D.R. Divgi, ``Calculation of the univariate and
bivariate Normal integral,'' Annals of Statistics, 1979,
903-910, with additional options available for this method in the
proc cdfbvn_div); 3 Integration of log of the unit square;
4 Direct integration on unit square; 5, fairly accurate and fast,
based on direct integration on the unit square from a new Gauss
internal procedure (DEFAULT); 6, most accurate but slow, based on a
cdfbvn procedure by Alan Genz (using results from Drezner, Z. and
G.O. Wesolowsky, 1989. ``On the computation of the bivariate
normal integral,'' Journal of Statist. Comput. Simul. 35:
101-107). See Appendix F.
Option 5 (the default) appears to be the best tradeoff between speed
and accuracy currently available (and so this global should not be
changed to anything other than 6, which is more accurate but much
slower, unless you have a good reason to do so). However,
fundamental progress remains to be made on methods of integrating
the bivariate normal, as all currently available methods are
inaccurate and jump discontinuously for very small values.
Because of this, small values are truncated at the global
_EcdfTol, which you may wish to adjust.
- _EcdfTol
- Tolerance for the lncdfbvn function (when
_Ecdfbvn=5, its default), with smaller calculated values
truncated at the value of this global (DEFAULT=2.220446e-11). This
can be any positive number, although lncdfbvn gets imprecise for
small values. Only set to smaller values if you think you need the
precision, such as if most of your values of $X_i$ or $T_i$ are very
small.
- _Echeck
- 1 check inputs and globals and give nice error
messages if problems (default); 0 don't check, which saves some
time. There is little reason to choose 0 unless you are running a
large number of estimations and you are certain all the inputs are
correctly specified. (Inessential global: not stored in dbuf.)
- _EdirTol
- direction tolerance for CML convergence.
Default=0.0001. Set to smaller values if most of your values of
$X_i$ or $T_i$ are very small.
- _EdoML
- 1 do maximum likelihood (default); 0 don't do maximum
likelihood, using instead the values of $\phi$ stored in
_EdoML_phi and vcphi in _EdoML_vcphi.
- _EdoML_phi
- if _EdoML=0, this should include a
vector of values of $\phi$, which will be used instead of the output of
the likelihood maximization. (This global is ignored unless
_EdoML=0.)
- _EdoML_vcphi
- if _EdoML=0, this should include a
matrix of values of the estimated variance matrix of $\phi$, which will be
used instead of the output of the likelihood maximization procedure.
(This global is ignored unless _EdoML=0.)
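A sketch of reusing estimates from an earlier run. It assumes that phi and vcphi can be retrieved from a previous data buffer with eiread, and that ei is called as ei(t,x,n,Zb,Zw); neither call is documented in this section, so check both against the ei and eiread references. The names dbuf0 and dbuf1 are hypothetical.

```gauss
/* Assumed: dbuf0 is the data buffer returned by an earlier ei run. */
_EdoML       = 0;                       /* skip likelihood maximization   */
_EdoML_phi   = eiread(dbuf0, "phi");    /* previously estimated phi       */
_EdoML_vcphi = eiread(dbuf0, "vcphi");  /* and its variance matrix        */
dbuf1 = ei(t, x, n, 1, 1);              /* assumed call, no covariates    */
```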
- _EdoSim
- 1 do simulations (default); 0 don't do simulations;
$-1$ don't do simulations or compute the maxlik variance (use this
option for computing conditional log-likelihood of eta's).
- _Eeta
- Automatically includes $X_i$ in the inputs Zb
and/or Zw. The actual inputs Zb and Zw
must be set to 1 if the default is changed. Using this global is
better than explicitly including $X_i$ in the inputs, because eiread
and eigraph will be ``aware'' of the contents of Zb and
Zw. If you set this global, it is generally best to also
set the priors _EalphaB and _EalphaW. See
Chapter 9, and the parameterization in Equation 9.2 (page 170).
Options include:
- _Eeta=0 excludes $X_i$, which is equivalent
to setting $\alpha^b=\alpha^w=0$ (default).
- _Eeta=1 sets zb=x, zw=1,
which estimates $\alpha^b$ and fixes $\alpha^w=0$.
- _Eeta=2 sets zb=1, zw=x,
which estimates $\alpha^w$ and fixes $\alpha^b=0$.
- _Eeta=3 sets zb=zw=x,
which estimates $\alpha^b$ and $\alpha^w$.
- Set to a
vector with elements _Eeta =
se
se
to fix
and
, and their standard deviations,
during estimation.
- Finally, set to a
vector
to set
zb=x and zw=1, to estimate
,
and fix
a and its standard error to b. Set to a
vector
to set zb=1 and
zw=x, to estimate
, and fix
a and its standard error to b.
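A minimal sketch of the _Eeta=3 case combined with the priors recommended above. The prior standard deviation of 2 is a placeholder, and the call ei(t,x,n,Zb,Zw) is an assumption about the main procedure's signature, which this section does not document.

```gauss
_Eeta    = 3;          /* zb = zw = x                                   */
_EalphaB = { 0 2 };    /* prior mean 0, sd 2 (placeholder) for Zb term  */
_EalphaW = { 0 2 };    /* same for the Zw term                          */
dbuf = ei(t, x, n, 1, 1);   /* assumed call; Zb and Zw passed as 1      */
```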
- _EI_vc
- $k \times 2$ matrix, each row of which
represents instructions for one attempt to compute an estimated
positive definite variance matrix of $\phi$. The procedure exits
after the first positive definite hessian is found. Options to
include in various rows are: {1 0} the usual numerical hessian
computation (using Gauss's hessp.src proc); {1 $a$} use the usual
hessian procedure and then adjust eigenvalues together so they are
greater than $a$; {2 $f$} use wide step lengths at
fraction $f$ falloff in the likelihood function; {3 $f$} use quadratic
approximation with falloff in likelihood function set at $f$; {4
0} use a generalized inverse (to deal with singularity) and a
generalized cholesky (to deal with non-positive definiteness) based
on work in progress by Jeff Gill and Gary King; {5 0} use wide
step lengths but check that the gradients for each are correct (and
if necessary search for better ones); {-1 0} avoid the computation
of the variance covariance matrix in case of non-positive
definiteness and use the singular value decomposition for the
multivariate normal sampling (i.e., _EisT has to be set to 0). In
order to use this option, also make sure to define relatively narrow
upper and lower bounds of the parameters by using _Ebounds.
DEFAULT={1 0, 4 0, 2 0.1, 2 0.05, 3 0.1, 1 0.1, 1 0.2}.
The variance computation only very rarely gets beyond the second
try.
When the likelihood surface is normal (i.e., quadratic), which is
true asymptotically, all options produce identical results. In
practice, this procedure is useful for ensuring that a positive
definite variance matrix can be found due to numerical, rather than
theoretical or empirical, difficulties, as can happen when the mode
of the truncated normal is far from the unit square due to
imprecision in the function that computes the bivariate normal CDF.
(Another, sometimes better, way to fix these numerical problems is
to reduce the variances of the priors in _Erho and
_Esigma.) Because importance sampling is used after this
procedure, different values of the variance matrix can produce
identical estimates of the quantities of interest. Be sure to
verify that the simulations are being appropriately drawn from the
estimated contours (compare the right two figures in eigraph's
tomogS).
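For illustration only, a shorter sequence of attempts than the default could be specified as follows. The particular rows and their order are an arbitrary example, not a recommendation.

```gauss
/* Each row is one attempt, tried in order until a positive  */
/* definite variance matrix is found.                        */
_EI_vc = { 1 0,        /* usual numerical hessian                   */
           2 0.1,      /* wide step lengths, falloff 0.1            */
           2 0.05,     /* wide step lengths, falloff 0.05           */
           1 0.1 };    /* usual hessian, eigenvalues raised above 0.1 */
```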
- _EIgraph_bvsmth
- smoothing parameter for nonparametric
estimation; used only if _EnonPar=1. Default=0.08. (The
same parameter controls the nonparametric bivariate density
estimation for diagnostic purposes in eigraph.) See
Section 9.3.2.
- _EisChk
- 0 to do nothing (default); 1 to change lnir from
the scalar mean importance ratio to a
(_Esims*_Eisn) $\times$ (rows($\phi$)+1) matrix
containing the log of the importance ratio as the first column and
normal simulations of $\phi$ as the remaining columns. Also
changes PhiSims from the mean and standard deviation of the
posterior phi's to a _Esims $\times$ rows($\phi$) matrix of
normal simulations of phi.
- _EiLlikS
- 1 to save the (_Esims $\times$ 1) log-likelihoods
evaluated for each simulation; 0 saves only the mean of these
likelihoods (default). These can be used for computing the marginal
likelihood.
- _EisFac
- factor to multiply by estimated variance matrix in the
normal approximation for use in importance sampling, or set to
$-1$ to use normal approximation only or
$-2$ to condition on the maximum
posterior estimates. Adjust this, _Eisn, or
_EisT if eiread's resamp is larger than 15 or 20.
If this is set too low, estimation variability will not be
sufficient and your confidence intervals may be too narrow; it must
be greater than zero and should probably be at least one. See
Section 7.5. (Default=4).
- _Eisn
- factor to multiply by _Esims to compute the
number of normals to draw before resampling. This is used to try
to get _Esims samples from exact posterior. Increase this
or change _EisFac or _EisT if resamp is
larger than 15 or 20. Default=10. See Section 7.5.
- _EisT
- 0 (default) to use multivariate normal density to draw
random numbers for initial approximation for importance sampling; or
if greater than 2, use the multivariate Student $t$
density, with
degrees of freedom _EisT. Use this, _EisFac, or
_Eisn if resamp is larger than 15 or 20. See
Section 7.5.
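The three importance sampling globals interact, so it can help to see them set together. The sketch below keeps the documented defaults except for _Eisn, which is doubled as one possible response to a large resamp value; ndraws is a hypothetical name used only to show the arithmetic.

```gauss
_Esims  = 100;     /* default number of simulations              */
_Eisn   = 20;      /* doubled from the default of 10             */
_EisFac = 4;       /* default factor on the variance matrix      */
_EisT   = 0;       /* default: multivariate normal approximation */

/* Number of normals drawn before resampling: */
ndraws = _Esims * _Eisn;    /* 100 * 20 = 2000 */
```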
- _EmaxIter
- Maximum number of iterations for CML. Default=500.
- _EnonEval
- Number of nonparametric density evaluations for each
tomography line (default=11). Only used if _EnonPar=1.
- _EnonNumInt
- Number of points to evaluate for numerical
integration in computing the denominator for the bivariate kernel
density (default=50). Only used if _EnonPar=1.
- _EnonPar
- 0 do not run nonparametric model (default); 1 run
nonparametric model. (When choosing nonparametric estimation, only
relevant options will be available under eigraph and eiread.) See
Section 9.3.2.
- _EnumTol
- Numerical tolerance. A homogeneous precinct is one
for which $X_i <$ _EnumTol or $X_i >$ (1-_EnumTol).
Default is 0.0001. Set to smaller values if most of your values of
$X_i$ or $T_i$ are very small.
- _Eprt
- 0 print nothing; 1 print only final output from each
stage; 2 also prints friendly iteration numbers etc (default); 3
also prints all sorts of checks along the way. Use eiread and
eigraph instead of this global to see output. (Inessential global:
not stored in dbuf.)
- _Eres
- If items are vput into _Eres before
running ei, they are passed through into dbuf. For
example, identifiers for each aggregate unit would be useful in
interpreting the results, or using them in subsequent analyses (try:
_Eres=vput(_Eres,caseid,"caseid") before calling EI). If
a title is vput and given the name titl, the title is
printed in convenient places. See eiread for further
information. Do not use the name of any globals to this procedure
or options listed under eiread(), or your variable will be lost.
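A brief sketch of passing items through to dbuf. Here caseid is a hypothetical vector of identifiers with one element per aggregate unit, and storing a string title with vput is an assumption about how titl is meant to be supplied.

```gauss
/* caseid: hypothetical identifiers, one per aggregate unit */
_Eres = vput(_Eres, caseid, "caseid");

/* Assumed: a string stored under the name titl serves as the title */
_Eres = vput(_Eres, "Example run", "titl");
```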
- _Erho
- The first element is the standard deviation of the normal
prior on $\phi_5$ for the correlation; set to 0 to fix $\phi_5$ to a
second element, _Erho[2]; set to $-1$ to estimate without
a prior. Default=0.5. _Erho should be a scalar unless
the first element is 0, in which case it should be a
$2 \times 1$ vector, where the second element is the value at which the
correlation parameter is fixed (and not estimated). See Section 7.4.
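The two forms of _Erho described above might look like this in practice. The fixed value 0.2 is an arbitrary placeholder.

```gauss
_Erho = 0.5;       /* default: normal prior with standard deviation 0.5 */

/* Alternative: first element 0 means "do not estimate";    */
/* the second element is the value the parameter is held at. */
_Erho = 0|0.2;
```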
- _Eselect
- Controls which observations are included in the
estimation stage, including both likelihood maximization and
importance sampling. All observations are included in the simulation
stage unless you delete them from the data set before starting
ei.
This allows users to base the truncated bivariate normal contours on
a subset of observations that might be more representative (such as
those for which
is not 0 or 1). Set to
a vector, with one element per observation,
of 1's to include and 0's to exclude individual observations.
- _EselRnd
- Set to scalar 1 to include all observations not
already deleted by _Eselect (default), or a scalar greater
than 0 and less than 1 to randomly select this fraction of
observations in the estimation stage. This global is especially
useful for speeding up estimation in very large datasets, since
thousands of observations are not always needed for estimating
$\phi$. Since all observations will still be included in the
simulation stage, precinct-level estimates of all quantities of
interest will still be available. (If used with EI2, each iteration
of EI includes a different randomly selected set of observations.)
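As a sketch of how _Eselect and _EselRnd can be combined: the selection rule below (keeping observations whose x lies strictly between 0.05 and 0.95) is purely illustrative, and keep is a hypothetical variable name.

```gauss
/* keep: hypothetical 0/1 flags, one element per observation */
keep = (x .> 0.05) .and (x .< 0.95);    /* illustrative rule only */

_Eselect = keep;    /* estimation stage uses only flagged observations  */
_EselRnd = 0.25;    /* and, of those, a randomly selected 25 percent    */
```

All observations still enter the simulation stage, so precinct-level estimates remain available for the excluded units.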
- _Esigma
- Standard deviation of an underlying normal
distribution, from which a half normal is constructed as a prior for
both $\breve{\sigma}_b$ and $\breve{\sigma}_w$. Note: the expected value under
this prior is _Esigma$\sqrt{2/\pi} \approx$ _Esigma $\times$ 0.8.
Set to zero or negative for no prior.
Default = 0.5. See Section 7.4.
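The factor of roughly 0.8 is the mean of a half normal distribution. For reference, writing $s$ for the value of _Esigma (a symbol introduced here only for the derivation) and $Z \sim N(0, s^2)$:

```latex
\[
E\,|Z| \;=\; \int_0^\infty z \,\frac{2}{s\sqrt{2\pi}}\,
             e^{-z^2/(2s^2)}\,dz
       \;=\; \frac{2}{s\sqrt{2\pi}}\; s^2
       \;=\; s\sqrt{\frac{2}{\pi}} \;\approx\; 0.798\,s .
\]
```

With the default of 0.5, the prior mean is therefore about 0.4.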
- _Esims
- Number of simulations. Default is 100.
- _Estval
- For gradient methods: Scalar 1, use best guess
starting values (default); or set to a $k \times 1$ vector of starting
values. If _Eeta[1]=0 (its default), $k=5$, with elements
guesses of $\phi$, that is, on the scale of estimation. If you have
starting values on the untruncated normal scale, $\breve{\psi}$, you can
reparameterize as in this example:
_Estval=eireparinv(.5|.5|.2|.2|-.1). If
_Eeta[1]=1, 2, 4, or 5, $k=6$; if _Eeta=3, $k=7$;
and if covariates are used and rows(_Eeta)=4, then $k$
is 5 plus the number of covariates included, with Zb coming
before Zw.
For a grid search: Set _Estval to scalar 0 (with 5
divisions per zoom), or to a scalar integer greater than or equal to
3 for a grid search with this number of divisions per zoom. (That
is, the grid search procedure divides the parameter space into a
number of divisions, evaluates the likelihood for every combination
of values on all the parameters, chooses the region of highest
likelihood, zooms in and repeats the procedure on the narrower
parameter region. This continues until the parameters differ by
less than the global _EdirTol.)
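The alternatives for _Estval described above, shown side by side. Only one assignment would be used in a given run; the value 7 is an arbitrary example of a grid size.

```gauss
_Estval = 1;                               /* default: best guess starting values   */
_Estval = eireparinv(.5|.5|.2|.2|-.1);     /* explicit values, untruncated scale    */
_Estval = 0;                               /* grid search, 5 divisions per zoom     */
_Estval = 7;                               /* grid search, 7 divisions per zoom     */
```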
- _EvTol
- Numerical tolerance for the conditional variance
calculation. Must be greater than 0; Default is
.