eco {eco} R Documentation

Fitting the Parametric and Nonparametric Bayesian Models of Ecological Inference in 2 by 2 Tables

Description

eco is used to fit the parametric and nonparametric Bayesian models for ecological inference in 2 x 2 tables via Markov chain Monte Carlo. It gives in-sample predictions as well as out-of-sample predictions for population inference. The parametric model uses a normal/inverse-Wishart prior, while the nonparametric model uses a Dirichlet process prior. The models and algorithms are described in Imai and Lu (2004).

Usage

eco(formula, data = parent.frame(), nonpar = FALSE, supplement = NULL,
    mu0 = c(0,0), tau0 = 2, nu0 = 4, S0 = diag(10,2), alpha = NULL,
    a0 = 1, b0 = 0.1, predict = FALSE, parameter = FALSE, grid = FALSE,
    n.draws = 5000, burnin = 0, thin = 0, verbose = FALSE)

Arguments

formula A symbolic description of the model to be fit, specifying the column and row margins of 2 x 2 ecological tables. Y ~ X specifies Y as the column margin and X as the row margin. Details and specific examples are given below.
data An optional data frame in which to interpret the variables in formula. The default is the environment in which eco is called.
nonpar Logical. If TRUE, the nonparametric model will be fit. Otherwise, the parametric model will be estimated. The default is FALSE.
supplement A numeric matrix. The matrix has two columns, which contain additional individual-level data such as survey data for W_1 and W_2, respectively. If NULL, no additional individual-level data are included in the model. The default is NULL.
mu0 A 2 x 1 numeric vector. The prior mean. The default is c(0,0).
tau0 A positive integer. The prior scale parameter. The default is 2.
nu0 A positive integer. The prior degrees of freedom parameter. The default is 4.
S0 A 2 x 2 numeric matrix, representing a positive definite prior scale matrix. The default is diag(10,2).
alpha A positive scalar. If NULL, the concentration parameter α is updated at each Gibbs draw, in which case the prior parameters a0 and b0 need to be specified. Otherwise, α is fixed at the user-specified value. The default is NULL.
a0 A positive integer. The shape parameter of the gamma prior for α. The default is 1.
b0 A positive scalar. The scale parameter of the gamma prior for α. The default is 0.1.
predict Logical. If TRUE, out-of-sample predictions will be returned. The default is FALSE.
parameter Logical. If TRUE, the Gibbs draws of the population parameters such as mu and Sigma are returned. The default is FALSE.
grid Logical. If TRUE, the grid method is used to sample W in the Gibbs sampler. If FALSE, the Metropolis algorithm is used where candidate draws are sampled from the uniform distribution on the tomography line for each unit. Note that the grid method is significantly slower than the Metropolis algorithm. The default is FALSE.
n.draws A positive integer. The number of MCMC draws. The default is 5000.
burnin A non-negative integer. The burnin interval for the Markov chain; i.e. the number of initial draws that should not be stored. The default is 0.
thin A non-negative integer. The thinning interval for the Markov chain; i.e. the number of Gibbs draws between the recorded values that are skipped. The default is 0.
verbose Logical. If TRUE, the progress of the Gibbs sampler is printed to the screen. The default is FALSE.
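
To make the interaction of n.draws, burnin and thin concrete, the following sketch counts how many draws would be stored under the definitions above. It assumes that a thinning interval of thin means one draw is kept out of every thin + 1 draws after the burnin; the helper n_stored is hypothetical and is not part of the eco package.

```r
## Hypothetical helper (not part of eco): number of stored draws,
## assuming the first `burnin` draws are discarded and `thin` draws
## are then skipped between consecutive recorded values.
n_stored <- function(n.draws, burnin = 0, thin = 0) {
  stopifnot(n.draws > burnin, burnin >= 0, thin >= 0)
  floor((n.draws - burnin) / (thin + 1))
}

## defaults: no burnin, no thinning
n_stored(5000)
## discard the first 1000 draws, then keep 1 draw in every 5
n_stored(5000, burnin = 1000, thin = 4)
```

Under this assumption the defaults store every draw, so any burnin or thinning has to be requested explicitly.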

Details

An example of a 2 x 2 ecological table for racial voting is given below:

             black voters    white voters
Voted        W_{1i}          W_{2i}          Y_i
Not voted    1-W_{1i}        1-W_{2i}        1-Y_i
             X_i             1-X_i

where Y_i and X_i represent the observed margins, and W_{1i} and W_{2i} are unknown variables. All variables are proportions and hence bounded between 0 and 1. For each i, the following deterministic relationship holds: Y_i = X_i W_{1i} + (1-X_i) W_{2i}.
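
The deterministic relationship implies that, once the margins are observed, the pair of unknowns for a unit must lie on a line segment inside the unit square (the tomography line mentioned in the description of grid). A minimal base R sketch of that constraint follows. The function names tomography_bounds and w2_given_w1 are hypothetical and not part of eco, and the sketch assumes the row margin is strictly between 0 and 1.

```r
## Hypothetical helpers (not part of eco); assume 0 < x < 1.

## Range of W1 compatible with margins x and y, obtained by requiring
## both W1 and W2 = (y - x * W1) / (1 - x) to lie in [0, 1].
tomography_bounds <- function(x, y) {
  c(lower = max(0, (y - (1 - x)) / x),
    upper = min(1, y / x))
}

## W2 implied by a given W1 through Y = X * W1 + (1 - X) * W2.
w2_given_w1 <- function(w1, x, y) {
  (y - x * w1) / (1 - x)
}

x <- 0.4
y <- 0.3
b <- tomography_bounds(x, y)
b                      # admissible range of W1 for this unit
w2_given_w1(b, x, y)   # W2 at the two end points of the segment
```

Any candidate value of W1 drawn uniformly between the two bounds, together with the W2 it implies, reproduces the observed Y exactly.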

Value

An object of class eco containing the following elements:

call The matched call.
nonpar The logical variable indicating whether the nonparametric model is fit.
X The row margin, X.
Y The column margin, Y.
nu0 The prior degrees of freedom.
tau0 The prior scale parameter.
mu0 The prior mean.
S0 The prior scale matrix.
burnin The number of initial burnin draws.
thin Thinning interval.
W1 The posterior in-sample predictions of W_1.
W2 The posterior in-sample predictions of W_2.
W1.pred The posterior predictive draws or out-of-sample predictions of W_1. Returned only if predict = TRUE.
W2.pred The posterior predictive draws or out-of-sample predictions of W_2. Returned only if predict = TRUE.
a0 The prior shape parameter.
b0 The prior scale parameter.
mu The posterior draws of the population mean parameter, mu.
Sigma The posterior draws of the population variance matrix, Sigma.
mu1 The posterior draws of the population mean parameter for W_1. It is an m x n matrix, where m is the number of Gibbs draws saved and n is the number of units.
mu2 The posterior draws of the population mean parameter for W_2. The dimension of mu2 is the same as mu1.
Sigma11 The posterior draws of the population variance parameter for W_1. It is an m x n matrix, where m is the number of Gibbs draws saved and n is the number of units.
Sigma12 The posterior draws of the population covariance parameter for W_1 and W_2. The dimension of Sigma12 is the same as Sigma11.
Sigma22 The posterior draws of the population variance parameter for W_2. The dimension of Sigma22 is the same as Sigma11.
alpha The posterior draws of α.
nstar The number of clusters at each Gibbs draw.
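
Several of the elements above (for example mu1 and Sigma11) are m x n matrices with one row per saved Gibbs draw and one column per unit. The sketch below shows one way to reduce such a matrix to per-unit posterior means and 95% intervals. It uses a simulated stand-in matrix named draws rather than actual eco output, so the numbers are purely illustrative.

```r
## Simulated stand-in for an m x n matrix of posterior draws
## (rows = saved Gibbs draws, columns = units); not real eco output.
set.seed(1)
m <- 200
n <- 3
draws <- matrix(rnorm(m * n), nrow = m, ncol = n)

## Per-unit posterior mean and 95% equal-tailed interval.
post_mean <- colMeans(draws)
post_int  <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

## One row per unit: mean, lower and upper interval limits.
post_summary <- cbind(mean = post_mean, t(post_int))
post_summary
```

With a fitted object, the same two calls would be applied to the relevant element in place of draws.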

Author(s)

Kosuke Imai, Department of Politics, Princeton University kimai@Princeton.Edu, http://www.princeton.edu/~kimai; Ying Lu, Institute for Quantitative Social Sciences, Harvard University ylu@Latte.Harvard.Edu

References

Imai, Kosuke and Ying Lu. (2004) “Parametric and Nonparametric Bayesian Models for Ecological Inference in 2 x 2 Tables.” Proceedings of the American Statistical Association. http://www.princeton.edu/~kimai/research/einonpar.html

See Also

summary.eco

Examples


## load the registration data
data(reg)

## NOTE: convergence has not been properly assessed for the following
## examples.

## fit the parametric model to give in-sample predictions and store
## parameter estimates
res <- eco(Y ~ X, data = reg, parameter = TRUE, verbose = TRUE)
## summarize the results
summary(res)

## fit the nonparametric model to give in-sample predictions
res1 <- eco(Y ~ X, data = reg, nonpar = TRUE, n.draws = 500, verbose = TRUE)
## summarize the results
summary(res1)

[Package eco version 1.1-1 Index]