eco {eco} R Documentation

Fitting the Parametric Bayesian Model of Ecological Inference in 2x2 Tables

Description

eco is used to fit the parametric Bayesian model (based on a Normal/Inverse-Wishart prior) for ecological inference in 2 x 2 tables via Markov chain Monte Carlo. It returns the in-sample predictions as well as the estimates of the model parameters. The model and algorithm are described in Imai and Lu (2004).
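To make the Normal/Inverse-Wishart prior concrete, here is a minimal sketch of one draw of (μ, Σ) using the default hyperparameters from the Usage section. It assumes a common parametrization, Σ ~ Inverse-Wishart(nu0, S0) and μ | Σ ~ N(mu0, Σ/tau0); the exact parametrization used inside eco should be checked against Imai and Lu (2004). Only base R is used, and nothing here calls the eco package.

```r
set.seed(42)

# default hyperparameters, copied from the Usage section
mu0  <- c(0, 0)
tau0 <- 2
nu0  <- 4
S0   <- diag(10, 2)

# ASSUMPTION: Sigma ~ Inverse-Wishart(nu0, S0), drawn by inverting a
# Wishart(nu0, S0^{-1}) variate; stats::rWishart returns a p x p x n array
Sigma <- solve(rWishart(1, df = nu0, Sigma = solve(S0))[, , 1])

# ASSUMPTION: mu | Sigma ~ N(mu0, Sigma / tau0); chol() returns the upper
# triangular factor R with t(R) %*% R equal to its argument
mu <- mu0 + drop(t(chol(Sigma / tau0)) %*% rnorm(2))

print(mu)
print(Sigma)
```

Larger values of tau0 concentrate μ around mu0, and larger values of nu0 concentrate Σ more tightly under this parametrization.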

Usage

eco(formula, data = parent.frame(), N = NULL, supplement = NULL,
    mu0 = c(0,0), tau0 = 2, nu0 = 4, S0 = diag(10,2), mu.start = c(0,0),
    Sigma.start = diag(10, 2), parameter = TRUE, grid = FALSE,
    n.draws = 5000, burnin = 0, thin = 0, verbose = FALSE)

Arguments

formula A symbolic description of the model to be fit, specifying the column and row margins of 2 x 2 ecological tables. Y ~ X specifies Y as the column margin and X as the row margin. Details and specific examples are given below.
data An optional data frame in which to interpret the variables in formula. The default is the environment in which eco is called.
N An optional variable representing the size of the unit; e.g., the total number of voters.
supplement An optional matrix of supplemental data. The matrix has two columns, which contain additional individual-level data such as survey data for W_1 and W_2, respectively. If NULL, no additional individual-level data are included in the model. The default is NULL.
mu0 A 2 x 1 numeric vector of the prior mean for the mean parameter μ. The default is c(0,0).
tau0 A positive integer representing the prior scale for the mean parameter μ. The default is 2.
nu0 A positive integer representing the prior degrees of freedom of the variance matrix Σ. The default is 4.
S0 A 2 x 2 numeric matrix representing a positive definite prior scale matrix for the variance matrix Σ. The default is diag(10,2).
mu.start A 2 x 1 numeric vector. The starting values of the mean parameter μ. The default is c(0,0).
Sigma.start A 2 x 2 numeric matrix representing a starting value of the variance matrix Σ. The default is diag(10,2).
parameter Logical. If TRUE, the Gibbs draws of the population parameters, μ and Σ, are returned in addition to the in-sample predictions of the missing internal cells, W. The default is TRUE.
grid Logical. If TRUE, the grid method is used to sample W in the Gibbs sampler. If FALSE, the Metropolis algorithm is used, where candidate draws are sampled from the uniform distribution on the tomography line for each unit. Note that the grid method is significantly slower than the Metropolis algorithm. The default is FALSE.
n.draws A positive integer. The number of MCMC draws. The default is 5000.
burnin A non-negative integer. The burnin interval for the Markov chain; i.e., the number of initial draws that should not be stored. The default is 0.
thin A non-negative integer. The thinning interval for the Markov chain; i.e., the number of Gibbs draws between the recorded values that are skipped. The default is 0.
verbose Logical. If TRUE, the progress of the Gibbs sampler is printed to the screen. The default is FALSE.
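The interaction of n.draws, burnin, and thin can be sketched in base R. The helper below, stored_iterations, is hypothetical and is not part of the eco package; it only encodes the definitions given above (the first burnin draws are dropped, and thin draws are skipped between recorded values). Whether eco records the first draw after burn-in or the first draw after a full thinning gap is an assumption here.

```r
# Hypothetical helper: iterations that would be stored under the documented
# meaning of `burnin` and `thin`. This is an assumption about the bookkeeping,
# not code taken from the eco package.
stored_iterations <- function(n.draws = 5000, burnin = 0, thin = 0) {
  stopifnot(n.draws > burnin, burnin >= 0, thin >= 0)
  # skip `thin` draws between recorded values, i.e. keep every
  # (thin + 1)-th draw once the burn-in period is over
  seq(from = burnin + thin + 1, to = n.draws, by = thin + 1)
}

# with the defaults every draw is kept
n.default <- length(stored_iterations())

# drop 1000 initial draws, then keep every fifth draw
kept <- stored_iterations(n.draws = 5000, burnin = 1000, thin = 4)
print(n.default)
print(length(kept))
print(head(kept))
```

Under these assumptions the number of stored draws is floor((n.draws - burnin) / (thin + 1)), which is useful for anticipating the size of the first dimension of the returned W array.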

Details

An example of a 2 x 2 ecological table for racial voting is given below:
            black voters   white voters
Voted       W_{1i}         W_{2i}         Y_i
Not voted   1-W_{1i}       1-W_{2i}       1-Y_i
            X_i            1-X_i

where Y_i and X_i represent the observed margins, and W_{1i} and W_{2i} are unknown variables. All variables are proportions and hence bounded between 0 and 1. For each i, the following deterministic relationship holds: Y_i = X_i W_{1i} + (1-X_i) W_{2i}.
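Because all quantities lie in [0, 1], the deterministic relationship restricts each (W_{1i}, W_{2i}) pair to a line segment, the tomography line. The base R sketch below computes the endpoints of that segment from the margins. The helper tomography_bounds and the toy margins are hypothetical and are not part of the eco package; the bounds are presumably what the Wmin and Wmax elements of the returned object store, but that link is an assumption.

```r
# Deterministic bounds on W_1 and W_2 implied by Y = X * W_1 + (1 - X) * W_2
# with all quantities in [0, 1]. Hypothetical helper, not exported by eco.
tomography_bounds <- function(X, Y) {
  stopifnot(all(X > 0 & X < 1), all(Y >= 0 & Y <= 1))
  # solve the identity for W_1 at the extremes W_2 = 1 and W_2 = 0
  W1min <- pmax(0, (X + Y - 1) / X)
  W1max <- pmin(1, Y / X)
  # solve the identity for W_2 at the extremes W_1 = 1 and W_1 = 0
  W2min <- pmax(0, (Y - X) / (1 - X))
  W2max <- pmin(1, Y / (1 - X))
  cbind(W1min, W1max, W2min, W2max)
}

# toy margins, made up for illustration
X <- c(0.2, 0.5, 0.8)
Y <- c(0.3, 0.5, 0.9)
b <- tomography_bounds(X, Y)
print(round(b, 3))
```

Units with X_i near 0 or 1, or Y_i near 0 or 1, tend to have narrow bounds for at least one of the two cells, so they carry more information about the unknown cells than units with margins near 0.5.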

Value

An object of class eco containing the following elements:

call The matched call.
X The row margin, X.
Y The column margin, Y.
N The size of each table, N.
burnin The number of initial burnin draws.
thin The thinning interval.
nu0 The prior degrees of freedom.
tau0 The prior scale parameter.
mu0 The prior mean.
S0 The prior scale matrix.
W A three-dimensional array storing the posterior in-sample predictions of W. The first dimension indexes the Monte Carlo draws, the second dimension indexes the columns of the table, and the third dimension indexes the observations.
Wmin A numeric matrix storing the lower bounds of W.
Wmax A numeric matrix storing the upper bounds of W.
mu The posterior draws of the population mean parameter, μ.
Sigma The posterior draws of the population variance matrix, Σ.
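The layout of the W element (draws by table columns by observations) lends itself to summaries with apply. The sketch below uses a simulated array as a stand-in for the W element of a fitted object, so it runs without the eco package; the sizes n.store and n.obs and the uniform draws are made up for illustration.

```r
set.seed(1)
n.store <- 200   # number of stored draws (illustrative)
n.obs   <- 5     # number of 2 x 2 tables (illustrative)

# Stand-in for the W element of an eco object:
# [draw, column of the table (W_1, W_2), observation]
W <- array(runif(n.store * 2 * n.obs), dim = c(n.store, 2, n.obs))

# posterior mean of W_1 and W_2 for each observation: a 2 x n.obs matrix
W.mean <- apply(W, c(2, 3), mean)

# 95% posterior interval for W_1 in each observation: a 2 x n.obs matrix
W1.ci <- apply(W[, 1, ], 2, quantile, probs = c(0.025, 0.975))

print(dim(W.mean))
print(dim(W1.ci))
```

For routine use, summary.eco is likely the more convenient route; the manual version is mainly helpful for custom summaries of the draws.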

Author(s)

Kosuke Imai, Department of Politics, Princeton University kimai@Princeton.Edu, http://www.princeton.edu/~kimai; Ying Lu, Institute for Quantitative Social Sciences, Harvard University ylu@Latte.Harvard.Edu

References

Imai, Kosuke and Ying Lu. (2004) “Parametric and Nonparametric Bayesian Models for Ecological Inference in 2 x 2 Tables.” Proceedings of the American Statistical Association. http://www.princeton.edu/~kimai/research/einonpar.html

See Also

ecoNP, predict.eco, summary.eco

Examples


## load the registration data
data(reg)

## NOTE: convergence has not been properly assessed for the following
## examples. See Imai and Lu (2004) for more complete examples.

## fit the parametric model with the default prior specification
res <- eco(Y ~ X, data = reg, verbose = TRUE)
## summarize the results
summary(res)

## obtain out-of-sample prediction
out <- predict(res, verbose=TRUE)
## summarize the results
summary(out)


[Package eco version 2.1-1 Index]