bas.lm {BAS}          R Documentation

Bayesian Adaptive Sampling Without Replacement for Variable Selection in Linear Models

Description

Sample without replacement from a posterior distribution on models

Usage

bas.lm(formula, data, n.models=NULL,  prior="ZS-null", alpha=NULL,
 modelprior=uniform(),
 initprobs="Uniform", random=TRUE, method="BAS", update=NULL,
 bestmodel = NULL, bestmarg = NULL, prob.local = 0) 

Arguments

formula linear model formula for the full model with all predictors, Y ~ X. All code assumes that an intercept will be included in each model.
data data frame
n.models number of models to sample. If NULL, BAS will enumerate unless p > 25
prior prior distribution for regression coefficients. Choices include "AIC", "BIC", "g-prior", "ZS-null", "ZS-full", "hyper-g", "hyper-g-laplace", "EB-local", and "EB-global"
alpha optional hyperparameter for the g-prior or hyper-g prior. For Zellner's g-prior, alpha = g; for the hyper-g method of Liang et al., recommended values of alpha lie between 2 and 4, with alpha = 3 recommended.
modelprior Family of prior distributions on the models. Choices include uniform, Bernoulli, or beta.binomial
initprobs vector of length p with the initial inclusion probabilities used for sampling without replacement (the intercept should be included with probability one), or a character string giving the method used to construct the sampling probabilities. If "Uniform", each predictor variable is equally likely to be sampled (equivalent to random sampling without replacement). If "eplogp", use the eplogprob function to approximate the Bayes factor, obtain initial marginal inclusion probabilities, and sample without replacement using these inclusion probabilities. For variables that should always be included, set the corresponding entries of initprobs to 1.
random Logical variable; if TRUE, use random sampling (see method); if FALSE, use deterministic sampling
method A character variable indicating whether to use Bayesian Adaptive Sampling without replacement ("BAS") or adaptive MCMC ("AMCMC", not yet implemented)
update number of iterations between potential updates of the sampling probabilities. If NULL, do not update; otherwise the algorithm updates the sampling probabilities using the estimated marginal inclusion probabilities as they change during sampling. For large model spaces, updating is recommended.
bestmodel optional binary vector representing a model used to initialize the sampling. If NULL, sampling starts with the full model
bestmarg optional value of the log marginal likelihood associated with bestmodel
prob.local An experimental option to allow sampling of models "near" the median probability model. Not recommended for use at this time
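
As a minimal sketch of how these arguments fit together (simulated data; the variable names and argument values below are illustrative choices, not defaults taken from this page):

library(BAS)
set.seed(42)
df <- data.frame(y = rnorm(100), x1 = rnorm(100),
                 x2 = rnorm(100), x3 = rnorm(100))
fit <- bas.lm(y ~ x1 + x2 + x3, data = df,
              prior = "ZS-null",        # Zellner-Siow null-based prior
              modelprior = uniform(),   # uniform prior over models
              initprobs = "Uniform",    # equal initial sampling probabilities
              method = "BAS")           # sampling without replacement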

Details

BAS provides two search algorithms to find high probability models for use in Bayesian model averaging or Bayesian model selection. For p less than about 20-25, BAS can enumerate all models, depending on memory availability; for larger p, BAS samples without replacement using random or deterministic sampling. The Bayesian Adaptive Sampling algorithm of Clyde, Ghosh, and Littman (2009) samples models without replacement using the initial sampling probabilities, and can optionally update the sampling probabilities every "update" models using the estimated marginal inclusion probabilities. If the predictor variables are orthogonal, the deterministic sampler provides a list of the top models in order of their approximate posterior probability, and it provides an effective search when the correlations among variables are small to modest. The priors on coefficients include Zellner's g-prior, the hyper-g prior (Liang et al. 2008), the Zellner-Siow Cauchy prior, and empirical Bayes (local and global) g-priors. AIC and BIC are also included.
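
For illustration, a hedged sketch of sampling a subset of a larger model space with "eplogp" starting probabilities and periodic updates (simulated data; the particular values of n.models and update are assumptions for this example, not recommendations from this page):

set.seed(1)
n <- 100; p <- 30
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:3] %*% c(2, -1, 1)) + rnorm(n)
big <- data.frame(y = y, X)
fit.big <- bas.lm(y ~ ., data = big,
                  n.models = 2^15,       # sample 2^15 of the 2^30 possible models
                  prior = "ZS-null",
                  initprobs = "eplogp",  # starting probabilities from eplogprob
                  method = "BAS",
                  update = 100)          # revise sampling probabilities every 100 models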

Value

bas.lm returns an object of class BMA
An object of class BMA is a list containing at least the following components:

postprob the posterior probabilities of the models selected
priorprobs the prior probabilities of the models selected
namesx the names of the variables
R2 R2 values for the models
logmarg values of the log of the marginal likelihood for the models
n.vars total number of independent variables in the full model, including the intercept
size the number of independent variables in each of the models, including the intercept
which a list of lists, with one list per model giving the variables included in that model
probne0 the posterior probability that each variable is non-zero
ols list of lists with one list per model giving the OLS estimate of each (nonzero) coefficient for each model
ols.se list of lists with one list per model giving the OLS standard error of each coefficient for each model
prior the name of the prior that created the BMA object
alpha value of hyperparameter in prior used to create the BMA object.
modelprior the prior distribution on models that created the BMA object
Y response
X matrix of predictors
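
A brief sketch of inspecting some of these components, using the fit object from the sketch following the Arguments section (component names are those listed above):

head(fit$postprob)   # posterior probabilities of the sampled models
fit$probne0          # marginal posterior inclusion probability of each variable
fit$which[[1]]       # variables included in the first listed model
head(fit$logmarg)    # log marginal likelihoods of the sampled models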


The function summary.bma is used to print a summary of the results. The function plot.bma is used to plot posterior distributions for the coefficients, and image.bma provides an image of the distribution over models. Posterior summaries of coefficients can be extracted using coefficients.bma. Fitted values and predictions can be obtained using the functions fitted.bma and predict.bma. BMA objects may be updated to use a different prior (without rerunning the sampler) using the function update.bma.
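
A hedged sketch of these generics applied to the fit object from the earlier examples, assuming standard S3 dispatch to the .bma methods named above (output and plot layout may differ across versions of BAS):

summary(fit)        # summary.bma: summary of the sampled models
plot(fit)           # plot.bma: posterior distributions of the coefficients
image(fit)          # image.bma: image of the posterior distribution over models
coefficients(fit)   # coefficients.bma: posterior summaries of the coefficients
fitted(fit)         # fitted.bma: fitted values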

Author(s)

Merlise Clyde (clyde@stat.duke.edu) and Michael Littman

References

Clyde, M., Ghosh, J. and Littman, M. (2009) Bayesian Adaptive Sampling for Variable Selection and Model Averaging. Department of Statistical Science Discussion Paper 2009-16, Duke University.

Clyde, M. and George, E. I. (2004) Model Uncertainty. Statist. Sci., 19, 81-94.
http://www.isds.duke.edu/~clyde/papers/statsci.pdf

Clyde, M. (1999) Bayesian Model Averaging and Model Search Strategies (with discussion). In Bayesian Statistics 6. J.M. Bernardo, A.P. Dawid, J.O. Berger, and A.F.M. Smith eds. Oxford University Press, pages 157-185.

Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999) Bayesian model averaging: a tutorial (with discussion). Statist. Sci., 14, 382-401.
http://www.stat.washington.edu/www/research/online/hoeting1999.pdf

Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2005) Mixtures of g-priors for Bayesian Variable Selection.
http://www.stat.duke.edu/05-12.pdf

Zellner, A. (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233-243. North-Holland/Elsevier.

Zellner, A. and Siow, A. (1980) Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain), pp. 585-603.

See Also

summary.bma, coefficients.bma, print.bma, predict.bma, fitted.bma, plot.bma, image.bma, eplogprob, update.bma

Examples

demo(BAS.hald)
## Not run: demo(BAS.USCrime) 
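
## A further hedged sketch, not part of the packaged demos, assuming the
## Hald cement data ships with BAS as data(Hald) with response column Y:
## Not run: 
data(Hald)
hald.zs <- bas.lm(Y ~ ., data = Hald, prior = "ZS-null",
                  modelprior = uniform(), initprobs = "Uniform")
summary(hald.zs)

## End(Not run)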
