JudgeIt is based on a random components
regression model:

$$ v = X\beta + \gamma + \varepsilon \tag{2} $$

where $v_i$ is (for example) the Democratic proportion of the two-party vote
for district $i$ ($i = 1, \ldots, n$ legislative districts), $X$ is a set of
$k$ explanatory variables (such as vote in the last election, incumbency status,
partisan control, campaign spending, etc.) and $\beta$ is a vector of $k$
regression coefficients, such that $E(v) = X\beta$. The parameter $\gamma$
represents the part of the district vote that is not explained by $X$
but is still a systematic feature of the electoral system and therefore
persists over time. For each $i$, the error terms have independent normal
distributions, $\gamma_i$ with mean zero and variance $\sigma^2_\gamma$ and
$\varepsilon_i$ with mean zero and variance $\sigma^2_\varepsilon$. We also
define $\sigma^2 = \sigma^2_\gamma + \sigma^2_\varepsilon$ and
$\lambda = \sigma^2_\gamma / \sigma^2$.
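
As a rough illustration of the random components model, the following Python sketch draws one election from a systematic part plus a persistent district effect and a transient error. It is not JudgeIt's implementation: the function name `simulate_actual`, the design matrix, the coefficients, and the variance values are all made up for the example, and the total variance is split into its two components using a persistent share `lam`.

```python
import random

def simulate_actual(X, beta, sigma2, lam, rng):
    """Draw one election: systematic part + persistent effect + transient error.

    sigma2 is the total error variance; lam is the share of it assigned
    to the persistent component (an assumption of this sketch).
    """
    sd_gamma = (lam * sigma2) ** 0.5          # persistent district effect
    sd_eps = ((1.0 - lam) * sigma2) ** 0.5    # election-specific noise
    gamma = [rng.gauss(0.0, sd_gamma) for _ in X]
    eps = [rng.gauss(0.0, sd_eps) for _ in X]
    xb = [sum(x * b for x, b in zip(row, beta)) for row in X]
    v = [m + g + e for m, g, e in zip(xb, gamma, eps)]
    return v, gamma

rng = random.Random(0)
# Hypothetical data: an intercept plus last election's Democratic share.
X = [[1.0, 0.35 + 0.3 * rng.random()] for _ in range(200)]
beta = [0.10, 0.80]   # illustrative coefficients, not estimates
v, gamma = simulate_actual(X, beta, sigma2=0.004, lam=0.5, rng=rng)
```

The sketch does not constrain the simulated vote proportions to lie between 0 and 1; with small variances and moderate expected votes this rarely matters.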
We define hypothetical election results as the set of all possible
election outcomes that could have occurred if all political conditions up to
the start of the campaign were held constant and the campaign were run again.
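The vector $v^{(\mathrm{hyp})}$ of hypothetical vote proportions is determined
by an analogous probability model:

$$ v^{(\mathrm{hyp})} = X^{(\mathrm{hyp})}\beta + \delta^{(\mathrm{hyp})} + \gamma + \varepsilon^{(\mathrm{hyp})} \tag{3} $$

where $\varepsilon^{(\mathrm{hyp})}$ is a new vector of $n$ independent error
terms with variance $\sigma^2_\varepsilon$, and $\delta^{(\mathrm{hyp})}$ is a
known constant used to model statewide partisan swing. The parameter
$\delta^{(\mathrm{hyp})}$ in this model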
allows us to easily vary the average district vote in a hypothetical (or
predicted) election, without affecting the relative positions of the districts.
This partitioning reflects the common result that it is often quite easy to
predict which districts will vote more Republican than others, but it is harder
to forecast exactly what the average vote would be across districts.
The hypothetical outcome, $v^{(\mathrm{hyp})}$, differs from the actual $v$ in
three ways:
- The matrix, $X$, of explanatory variables is replaced by $X^{(\mathrm{hyp})}$,
to recognize that we may wish to specify different conditions under
which the hypothetical election may be run (such as no incumbents
running).
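- A constant, $\delta^{(\mathrm{hyp})}$, is added, to allow a statewide partisan
swing to be specified. One can specify either $\delta^{(\mathrm{hyp})}$ or a
corresponding value for the expected average district vote,
$E(\bar{v}^{(\mathrm{hyp})})$, since
$E(\bar{v}^{(\mathrm{hyp})}) = \bar{X}^{(\mathrm{hyp})}\beta + \delta^{(\mathrm{hyp})}$.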
- The new error term, $\varepsilon^{(\mathrm{hyp})}$, models the fact that, even
if the variables in $X$ were unchanged, we would not expect
$v^{(\mathrm{hyp})}$ to be identical to $v$. Across many hypothetical elections,
$\gamma$ remains unchanged, while $\varepsilon^{(\mathrm{hyp})}$ varies.
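
The three differences listed above can be sketched in Python as follows. This is only an illustration under stated assumptions, not JudgeIt's code: the helper names (`linear_predictor`, `delta_for_target`, `simulate_hypothetical`) and all numbers are hypothetical, and the swing constant is chosen by treating the persistent effects and the new errors as mean zero, so that the expected average vote is the average linear predictor plus the swing.

```python
import random

def linear_predictor(X, beta):
    """Systematic part of the vote for each district."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def delta_for_target(X_hyp, beta, target_mean):
    """Swing constant that sets the expected average district vote,
    treating the persistent effects and new errors as mean zero."""
    xb = linear_predictor(X_hyp, beta)
    return target_mean - sum(xb) / len(xb)

def simulate_hypothetical(X_hyp, beta, delta, gamma, sd_eps, rng):
    """One hypothetical election: the persistent effects in gamma are
    reused unchanged, and only the transient errors are drawn afresh."""
    xb = linear_predictor(X_hyp, beta)
    return [m + delta + g + rng.gauss(0.0, sd_eps) for m, g in zip(xb, gamma)]

rng = random.Random(1)
X_hyp = [[1.0, 0.40], [1.0, 0.50], [1.0, 0.60]]   # made-up districts
beta = [0.10, 0.80]                               # illustrative coefficients
gamma = [0.01, -0.02, 0.01]                       # made-up persistent effects
delta = delta_for_target(X_hyp, beta, target_mean=0.45)
draw = simulate_hypothetical(X_hyp, beta, delta, gamma, sd_eps=0.03, rng=rng)
```

Calling `simulate_hypothetical` repeatedly with the same `gamma` yields a set of hypothetical elections that share the same persistent district effects and differ only in their transient errors.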
The stochastic model is interpreted slightly differently for prediction and
evaluation: for prediction, we ask how many seats the Democrats will win
with an average of (say) 45% of the votes; for evaluation, we ask how many
seats they would have won if essentially the same election campaign had
been run again. The only difference between evaluation and prediction is that
we observe one of the possible hypothetical election outcomes for the former
and do not observe any for the latter.
The parameters of this model to be estimated ($\beta$, $\sigma^2$, and
$\lambda$) are not usually of primary interest in evaluating electoral systems
and redistricting plans (although $\beta$ is in some cases of interest in
estimating causal effects). Instead, we define all the quantities of interest,
including the seats-votes curve, district vote predictions, etc., in terms of
the posterior distribution of hypothetical election outcomes
$v^{(\mathrm{hyp})}$, given the average district vote
$\bar{v}^{(\mathrm{hyp})}$ or the actual election outcomes $v$
when available (which, in turn, depend on the parameters). From this, we can
easily calculate estimates and standard errors of any quantity of interest
(such as those listed in our summary of prior research).
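
To make the last step concrete, here is a minimal Python sketch of how an estimate and a standard error for one quantity of interest, the Democratic seat share, might be read off a set of simulated hypothetical elections. It is a simplification, not JudgeIt's procedure: it holds the expected district votes fixed (ignoring uncertainty in the estimated parameters), and the function names and all numbers are invented for the example.

```python
import random
import statistics

def seat_share(votes):
    """Fraction of districts where the Democratic vote share exceeds 0.5."""
    return sum(v > 0.5 for v in votes) / len(votes)

def summarize_seats(expected_votes, sd_eps, n_sims, rng):
    """Mean and standard deviation of the seat share across simulated
    hypothetical elections that differ only in their transient errors."""
    shares = []
    for _ in range(n_sims):
        draw = [m + rng.gauss(0.0, sd_eps) for m in expected_votes]
        shares.append(seat_share(draw))
    return statistics.mean(shares), statistics.stdev(shares)

rng = random.Random(2)
# Made-up expected votes (systematic part + swing + persistent effect).
expected_votes = [0.38, 0.44, 0.49, 0.52, 0.61]
est, se = summarize_seats(expected_votes, sd_eps=0.03, n_sims=2000, rng=rng)
```

Repeating the same summary over a grid of average district votes would trace out a simulated seats-votes curve under the same simplifying assumptions.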
Table: Model Structure (column headings: Actual, Hypothetical Replications)
Gary King
2006-01-07