eco {eco} | R Documentation |
eco
is used to fit the parametric and nonparametric Bayesian
models for ecological inference in 2 x 2 tables via Markov
chain Monte Carlo. It gives in-sample predictions as well as
out-of-sample predictions for population inference. The parametric
model uses a normal/inverse-Wishart prior, while the nonparametric
model uses a Dirichlet process prior. The models and algorithms are
described in Imai and Lu (2004).
eco <- function(Y, X, data = parent.frame(), n.draws = 5000, burnin = 0, thin = 5, verbose = FALSE, nonpar = TRUE, nu0 = 4, tau0 = 1, mu0 = c(0,0), S0 = diag(10,2), supplement = NULL, alpha = NULL, a0 = 1, b0 = 0.1, predict = FALSE, parameter = FALSE)
Y |
A numeric vector of proportions, representing the weighted average of the missing internal cells of a 2 x 2 ecological table. |
X |
A numeric vector of proportions, representing the weights. |
data |
An optional data frame in which to interpret the variables
in Y and X . The default is the environment in which
eco is called.
|
n.draws |
A positive integer. The number of MCMC draws.
The default is 5000 .
|
burnin |
A non-negative integer. The burn-in interval for the Markov
chain; i.e. the number of initial draws that should not be stored. The
default is 0 .
|
thin |
A positive integer. The thinning interval for the
Markov chain; i.e. the number of Gibbs draws between the recorded
values that are skipped. The default is 5 .
|
verbose |
Logical. If TRUE , the progress of the Gibbs
sampler is printed to the screen. The default is FALSE .
|
nonpar |
Logical. If TRUE , the nonparametric
model will be fit. Otherwise, the parametric model will be
estimated. The default is TRUE .
|
nu0 |
A positive integer. The prior degrees of freedom
parameter. The default is 4 .
|
tau0 |
A positive integer. The prior scale parameter. The default
is 1 .
|
mu0 |
A 2 x 1 numeric vector. The prior mean. The default is c(0,0) . |
S0 |
A 2 x 2 numeric matrix, representing a positive
definite prior scale matrix. The default is diag(10,2) .
|
supplement |
A numeric matrix. The matrix has two columns, which
contain additional individual-level data such as survey data for
W_1 and W_2, respectively. If NULL , no
additional individual-level data are included in the model. The
default is NULL .
|
alpha |
A positive scalar. If NULL , the concentration
parameter α will be updated at each Gibbs draw. The prior
parameters a0 and b0 need to be specified. Otherwise,
α is fixed at a user specified value.
The default is NULL .
|
a0 |
A positive integer. The shape parameter of the gamma prior
for α. The default is 1 .
|
b0 |
A positive scalar. The scale parameter of the gamma prior
for α. The default is 0.1 .
|
predict |
Logical. If TRUE , out-of-sample predictions will
be returned. The default is FALSE .
|
parameter |
Logical. If TRUE , the Gibbs draws of the population
parameters such as mu and sigma are returned. The default is FALSE .
|
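As a rough illustration of how n.draws , burnin , and thin interact, the sketch below counts the draws that would be stored under a literal reading of the descriptions above: the first burnin draws are discarded, and then one draw is recorded after every thin skipped draws. The helper stored_draws is hypothetical and is not part of the package; the package's internal bookkeeping may differ.

```r
# Hypothetical helper (NOT part of eco): indices of the draws that would be
# stored if `burnin` initial draws are dropped and `thin` draws are skipped
# between recorded values. This is an assumption based on the argument
# descriptions, not the package's actual implementation.
stored_draws <- function(n.draws = 5000, burnin = 0, thin = 5) {
  stopifnot(n.draws > 0, burnin >= 0, thin >= 0, burnin < n.draws)
  first <- burnin + thin + 1
  # No draw is recorded if the chain ends before the first recorded value.
  if (first > n.draws) return(integer(0))
  seq(from = first, to = n.draws, by = thin + 1)
}

# Number of stored draws under the default settings.
length(stored_draws())                               # 833
# A shorter chain with some burn-in.
stored_draws(n.draws = 50, burnin = 10, thin = 4)    # 15 20 25 30 35 40 45 50
```

Under this reading, the default settings would keep roughly one sixth of the 5000 draws, which is worth bearing in mind when choosing n.draws .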
An example of a 2 x 2 ecological table for racial voting is as follows:
          | black voters | white voters |       |
Voted     | W_{1i}       | W_{2i}       | Y_i   |
Not voted | 1-W_{1i}     | 1-W_{2i}     | 1-Y_i |
          | X_i          | 1-X_i        |       |
where Y_i and X_i represent the observed margins, and W_{1i} and W_{2i} are unknown variables. The following deterministic relationship holds for each i: Y_i = X_i W_{1i} + (1-X_i) W_{2i}
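The deterministic relationship can be checked numerically. It also implies bounds on the unknown cells, because W_{1i} and W_{2i} are proportions and must lie in [0, 1]. The sketch below uses made-up values for X_i, W_{1i}, and W_{2i} (not real data); the helper w1_bounds is hypothetical and is not part of the package.

```r
# Made-up illustrative values for three units (not real data).
X  <- c(0.2, 0.5, 0.8)
W1 <- c(0.3, 0.6, 0.9)
W2 <- c(0.7, 0.4, 0.5)

# Observed margin implied by the deterministic relationship.
Y <- X * W1 + (1 - X) * W2

# Hypothetical helper (NOT part of eco): bounds on W1 obtained by solving
# the relationship for W1 and letting W2 range over [0, 1].
w1_bounds <- function(Y, X) {
  stopifnot(all(X > 0 & X < 1), all(Y >= 0 & Y <= 1))
  cbind(lower = pmax(0, (Y - (1 - X)) / X),
        upper = pmin(1, Y / X))
}

b <- w1_bounds(Y, X)
b
```

In this toy example the bounds are informative only for the third unit, where X_i is large; for the other two units any value of W_{1i} in [0, 1] is compatible with the observed margins.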
An object of class eco
containing the following elements:
model |
The name of the model used to produce the
predictions. If the nonparametric model is used,
model="Dirichlet Process prior" ; if the parametric model is used,
model="Normal prior" .
|
X |
The vector of data X. |
Y |
The vector of data Y. |
nu0 |
The prior degrees of freedom. |
tau0 |
The prior scale parameter. |
mu0 |
The prior means. |
S0 |
The prior scale matrix. |
burnin |
The number of initial burnin draws. |
thin |
Thinning interval. |
mu1.post |
The posterior draws of the population mean parameter
of W_1. In the nonparametric model, mu1.post is an m by n
matrix, where m is the number of Gibbs draws saved and n is the number of
observations. In the parametric model, mu1.post is a vector of length
m. Exported only if parameter=TRUE . |
mu2.post |
The posterior draws of the population mean parameter
of W_2. The dimension of mu2.post is the same as that of
mu1.post . Exported only if parameter=TRUE . |
Sigma11.post |
The posterior draws of the population variance
parameter of W_1. In the nonparametric model, Sigma11.post
is an m by n matrix, where m is the number of Gibbs draws saved and n is the
number of observations. In the parametric model, Sigma11.post is a
vector of length m. Exported only if parameter=TRUE . |
Sigma12.post |
The posterior draws of the population covariance
parameter between W_1 and W_2. The dimension of
Sigma12.post is the same as that of Sigma11.post . Exported only if
parameter=TRUE . |
Sigma22.post |
The posterior draws of the population variance
parameter of W_2. The dimension of Sigma22.post is the same
as that of Sigma11.post . Exported only if parameter=TRUE . |
W1.post |
The posterior draws or in-sample predictions of W_1. |
W2.post |
The posterior draws or in-sample predictions of W_2. |
W1.pred |
The posterior predictive draws or out-of-sample
predictions of W_1. Exported only if predict=TRUE .
|
W2.pred |
The posterior predictive draws or out-of-sample
predictions of W_2. Exported only if predict=TRUE .
|
alpha |
Whether α is being updated at each Gibbs draw. |
a0 |
The prior shape parameter. |
b0 |
The prior scale parameter. |
a.post |
The Gibbs draws of α. |
nstar |
The number of clusters at each Gibbs draw. |
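To illustrate how an m by n matrix of stored draws, such as mu1.post under the nonparametric model, might be summarized, the sketch below builds a placeholder matrix of simulated values and computes per-observation means and 95% intervals. The matrix draws is simulated for illustration only and does not come from eco ; in practice the corresponding element of the fitted object would take its place.

```r
set.seed(1)

m <- 100  # number of stored Gibbs draws (placeholder value)
n <- 4    # number of observations (placeholder value)

# Simulated stand-in for an m by n matrix of posterior draws;
# these numbers are NOT output from eco.
draws <- matrix(rnorm(m * n), nrow = m, ncol = n)

# One summary per observation (i.e. per column).
post_mean <- colMeans(draws)
post_ci   <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

# Combine into an n by 3 table: mean, lower and upper interval limits.
summary_table <- cbind(mean = post_mean,
                       lower = post_ci[1, ],
                       upper = post_ci[2, ])
summary_table
```

For the parametric model, where the stored draws form a vector of length m, mean() and quantile() applied to the vector would play the same role.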
Ying Lu, Woodrow Wilson School of International and Public Affairs, Princeton University yinglu@Princeton.Edu; Kosuke Imai, Department of Politics, Princeton University kimai@Princeton.Edu, http://www.princeton.edu/~kimai
Imai, Kosuke and Ying Lu. (2004) “Parametric and Nonparametric Bayesian Models for Ecological Inference in 2 x 2 Tables.” Proceedings of the American Statistical Association. http://www.princeton.edu/~kimai/research/einonpar.html
summary.eco
## load the registration data
data(reg)

## run the nonparametric model to give in-sample & out-of-sample predictions
res <- eco(Y = Y, X = X, data = reg, n.draws = 50, verbose = TRUE)

## summarize the results
summary(res)