bas.lm {BAS} | R Documentation |
Sample without replacement from a posterior distribution on models
bas.lm(formula, data, n.models = NULL, prior = "ZS-null", alpha = NULL,
       modelprior = uniform(), initprobs = "Uniform", random = TRUE,
       method = "BAS", update = NULL, bestmodel = NULL, bestmarg = NULL,
       prob.local = 0)
formula |
linear model formula for the full model with all predictors, Y ~ X. All code assumes that an intercept will be included in each model. |
data |
data frame |
n.models |
number of models to sample. If NULL, BAS will enumerate unless p > 25 |
prior |
prior distribution for regression coefficients. Choices include "AIC", "BIC", "g-prior", "ZS-null", "ZS-full", "hyper-g", "hyper-g-laplace", "EB-local", and "EB-global" |
alpha |
optional hyperparameter in the g-prior or hyper-g prior. For Zellner's g-prior, alpha = g; for the hyper-g method of Liang et al., recommended choices of alpha lie between 2 and 4, with alpha = 3 recommended. |
modelprior |
Family of prior distributions on the models. Choices include uniform, Bernoulli, or beta.binomial |
initprobs |
vector of length p with the initial inclusion probabilities used for sampling without replacement (the intercept should be included with probability one), or a character string giving the method used to construct the sampling probabilities. If "Uniform", each predictor variable is equally likely to be sampled (equivalent to random sampling without replacement). If "eplogp", the eplogprob function is used to approximate the Bayes factor, obtain initial marginal inclusion probabilities, and sample without replacement using these inclusion probabilities. For variables that should always be included, set the corresponding entries of initprobs to 1 (see the example sketched after the arguments). |
random |
Logical variable; if TRUE use random sampling (see method), otherwise use deterministic sampling |
method |
A character variable indicating whether to use Bayesian Adaptive Sampling without replacement ("BAS") or adaptive MCMC ("AMCMC", not yet implemented) |
update |
number of iterations between potential updates of the sampling probabilities. If NULL, do not update; otherwise the algorithm updates the sampling probabilities using the marginal inclusion probabilities as they change while sampling takes place. For large model spaces, updating is recommended. |
bestmodel |
optional binary vector representing a model with which to initialize the sampling. If NULL, sampling starts with the full model |
bestmarg |
optional value of the log marginal likelihood associated with bestmodel |
prob.local |
An experimental option to allow sampling of models "near" the median probability model. Not recommended for use at this time |
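A minimal sketch of a call using several of these arguments. It assumes the BAS package is attached; the data frame sim.data and its variables are simulated purely for illustration and are not part of the package.

library(BAS)
set.seed(1)
## simulated data: four candidate predictors and a response
sim.data <- data.frame(x1 = rnorm(100), x2 = rnorm(100),
                       x3 = rnorm(100), x4 = rnorm(100))
sim.data$y <- 1 + 0.5 * sim.data$x1 - 0.7 * sim.data$x3 + rnorm(100)

## Zellner-Siow null prior on coefficients, uniform prior over models,
## initial sampling probabilities approximated via eplogprob
fit <- bas.lm(y ~ x1 + x2 + x3 + x4, data = sim.data,
              prior = "ZS-null", modelprior = uniform(),
              initprobs = "eplogp")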
BAS provides two search algorithms for finding high probability models for use in Bayesian model averaging or Bayesian model selection. For p less than about 20-25, BAS can enumerate all models, depending on memory availability; for larger p, BAS samples models without replacement using either random or deterministic sampling. The Bayesian Adaptive Sampling algorithm of Clyde, Ghosh, and Littman (2009) samples models without replacement using the initial sampling probabilities, and will optionally update the sampling probabilities every "update" models using the estimated marginal inclusion probabilities. If the predictor variables are orthogonal, the deterministic sampler provides a list of the top models in order of their approximate posterior probability, and it provides an effective search when the correlations among variables are small to modest. The priors on coefficients include Zellner's g-prior, the hyper-g prior (Liang et al. 2008), the Zellner-Siow Cauchy prior, and empirical Bayes (local and global) g-priors. AIC and BIC are also included.
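Continuing the simulated setup above (where four predictors would simply be enumerated), the sketch below illustrates sampling without replacement when p is too large to enumerate; the number of predictors, the number of models, and the update frequency are arbitrary choices for illustration only.

## larger p: sample 2^13 models without replacement, updating the
## sampling probabilities every 100 models
big.data <- data.frame(matrix(rnorm(200 * 30), 200, 30))
names(big.data) <- paste0("x", 1:30)
big.data$y <- rnorm(200)
fit.bas <- bas.lm(y ~ ., data = big.data, n.models = 2^13,
                  prior = "hyper-g", alpha = 3,
                  method = "BAS", update = 100)

## deterministic sampling in order of the initial probabilities,
## most effective when correlations among predictors are small to modest
fit.det <- bas.lm(y ~ ., data = big.data, n.models = 2^13,
                  prior = "hyper-g", alpha = 3, random = FALSE)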
bas.lm returns an object of class BMA, which is a list containing at least the following components:
postprob |
the posterior probabilities of the models selected |
priorprobs |
the prior probabilities of the models selected |
namesx |
the names of the variables |
R2 |
R2 values for the models |
logmarg |
values of the log of the marginal likelihood for the models |
n.vars |
total number of independent variables in the full model, including the intercept |
size |
the number of independent variables in each of the models, including the intercept |
which |
a list with one element per model giving the variables included in that model |
probne0 |
the posterior probability that each variable is non-zero |
ols |
list with one element per model giving the OLS estimates of the (nonzero) coefficients in that model |
ols.se |
list with one element per model giving the OLS standard errors of the coefficients in that model |
prior |
the name of the prior that created the BMA object |
alpha |
value of hyperparameter in prior used to create the BMA object. |
modelprior |
the prior distribution on models that created the BMA object |
Y |
response |
X |
matrix of predictors |
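A brief sketch of inspecting some of these components, assuming fit is the object returned by the earlier example; the component names are those listed above.

fit$postprob[1:5]      # posterior probabilities of the first five models in the list
fit$probne0            # posterior probability that each variable is non-zero
fit$namesx             # names of the variables
fit$which[[1]]         # variables included in the first model in the list
cbind(R2 = fit$R2, size = fit$size)[1:5, ]   # fit and size of the first few models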
The function summary.bma is used to print a summary of the results. The function plot.bma is used to plot posterior distributions of the coefficients, and image.bma provides an image of the distribution over models. Posterior summaries of coefficients can be extracted using coefficients.bma. Fitted values and predictions can be obtained using the functions fitted.bma and predict.bma. BMA objects may be updated to use a different prior (without rerunning the sampler) using the function update.bma.
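The sketch below exercises these methods on the object from the earlier example. The generic calls are assumed to dispatch to the .bma methods named above, and the argument names supplied to predict and update are assumptions rather than part of this documentation.

summary(fit)                       # summary table of the top models
plot(fit)                          # posterior plots
image(fit)                         # image of the distribution over models
coefficients(fit)                  # posterior summaries of the coefficients
fitted(fit)                        # fitted values
pred <- predict(fit, newdata = sim.data)   # predictions ("newdata" name assumed)
fit.bic <- update(fit, newprior = "BIC")   # reuse sampled models under BIC
                                           # ("newprior" name assumed)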
Merlise Clyde (clyde@stat.duke.edu) and Michael Littman
Clyde, M., Ghosh, J. and Littman, M. (2009) Bayesian Adaptive Sampling for Variable Selection and Model Averaging. Department of Statistical Science Discussion Paper 2009-16, Duke University.
Clyde, M. and George, E. I. (2004) Model Uncertainty. Statist. Sci.,
19, 81-94.
http://www.isds.duke.edu/~clyde/papers/statsci.pdf
Clyde, M. (1999) Bayesian Model Averaging and Model Search Strategies (with discussion). In Bayesian Statistics 6. J.M. Bernardo, A.P. Dawid, J.O. Berger, and A.F.M. Smith eds. Oxford University Press, pages 157-185.
Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999)
Bayesian model averaging: a tutorial (with discussion). Statist. Sci.,
14, 382-401.
http://www.stat.washington.edu/www/research/online/hoeting1999.pdf
Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger,
J.O. (2005) Mixtures of g-priors for Bayesian Variable
Selection.
http://www.stat.duke.edu/05-12.pdf
Zellner, A. (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233-243. North-Holland/Elsevier.
Zellner, A. and Siow, A. (1980) Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain), pp. 585-603.
summary.bma, coefficients.bma, print.bma, predict.bma, fitted.bma, plot.bma, image.bma, eplogprob, update.bma
demo(BAS.hald)
## Not run: demo(BAS.USCrime)