bmonomvn {monomvn}    R Documentation

Bayesian Estimation for Multivariate Normal Data with Monotone Missingness

Description

Bayesian estimation via sampling from the posterior distribution of the mean and covariance matrix of multivariate normal (MVN) distributed data with a monotone missingness pattern, via Gibbs sampling. Through the use of parsimonious/shrinkage regressions (lasso & ridge), where standard regressions fail, this function can handle an (almost) arbitrary amount of missing data.

Usage

bmonomvn(y, pre = TRUE, p = 0.9, B = 100, T = 200, thin = 1,
         economy = FALSE, method = c("lasso", "ridge", "lsr"),
         RJ = c("bpsn", "p", "none"), capm = method!="lasso",
         start = NULL, rd = NULL, rao.s2 = TRUE, verb = 1,
         trace = FALSE)

Arguments

y data matrix where each row is interpreted as a random sample from an MVN distribution, with missing values indicated by NA
pre logical indicating whether pre-processing of y is to be performed. This sorts the columns so that the number of NAs is non-decreasing with the column index
p when performing regressions, p is the proportion of the number of columns to rows in the design matrix before an alternative regression (lasso, ridge, or RJ) is performed as if least-squares regression had “failed”. Least-squares regression is known to fail when the number of columns equals or exceeds the number of rows, hence a default of p = 0.9 close to 1. Alternatively, setting p = 0 forces the lasso to be used for every regression. Intermediate settings of p allow the user to control when least-squares regressions stop and the lasso ones start; see the sketch following this list
B number of Burn-In MCMC sampling rounds, during which samples are discarded
T total number of MCMC sampling rounds to take place after burn-in, during which samples are saved
thin multiplicative thinning in the MCMC. Each Bayesian (lasso) regression will discard thin*M MCMC rounds, where M is the number of columns in its design matrix, before a sample is saved as a draw from the posterior distribution
economy indicates whether memory should be economized at the expense of speed. When TRUE the individual Bayesian (lasso) regressions are cleaned between uses so that only one of them has a large footprint at any time during sampling from the Markov chain. When FALSE (default) all regressions are pre-allocated and the full memory footprint is realized at the outset, saving dynamic allocations
method indicates the Bayesian parsimonious regression specification to be used, choosing between the lasso (default) of Park & Casella, a ridge regression special case of the lasso, and least-squares
RJ indicates the Reversible Jump strategy to be employed. The default argument of "bpsn" only uses RJ for regressions with p >= n (the “big p small n” case). The "p" method uses RJ whenever a parsimonious regression is used, and "none" never uses RJ
capm when TRUE this argument indicates that the number of components of beta should not exceed n, the number of response variables in a particular regression. This argument is ignored when using method = "lasso"
start a list depicting starting values for the parameters that are used to initialize the Markov chain. Usually this will be a "monomvn"-class object depicting maximum likelihood estimates output from the monomvn function. The relevant fields are the mean vector $mu, covariance matrix $S, monotone ordering $o (for sanity checking with input y), component vector $ncomp and penalty parameter vector $lambda. See note below
rd = c(r, delta); a 2-vector of prior parameters for lambda^2, whose interpretation depends on the regression method. When method = "lasso" the components are the alpha (shape) and beta (rate) parameters of a gamma distribution G(r, delta); when method = "ridge" the components are the alpha (shape) and beta (scale) parameters of an inverse-gamma distribution IG(r/2, delta/2). See the sketch following this list
rao.s2 indicates whether Rao-Blackwellized samples for s^2 should be used (default TRUE); see the details section of blasso for more information
verb verbosity level; currently only verb = 0 and verb = 1 are supported
trace if TRUE then samples from the regressions are saved to files in the current working directory (CWD), and then read back into the "monomvn"-class object upon return
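
As a hedged illustration of how several of these arguments combine, consider the following call on a hypothetical data matrix y containing NAs; the prior and thinning settings are arbitrary, chosen only to show the interface:

## p = 0 forces the parsimonious (lasso) regression for every column;
## B and T give 100 discarded burn-in rounds and 200 saved samples;
## rd = c(2, 0.1) places a G(2, 0.1) prior on lambda^2 under "lasso"
out <- bmonomvn(y, p = 0, B = 100, T = 200, thin = 2,
                method = "lasso", rd = c(2, 0.1), verb = 0)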

Details

If pre = TRUE then bmonomvn first re-arranges the columns of y into non-decreasing order with respect to the number of missing (NA) entries. Then (at least) the first column is completely observed.
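
Conceptually, the re-ordering performed by pre = TRUE can be sketched in a few lines of R (a hand-rolled illustration of the idea, not the package's internal code):

na <- colSums(is.na(y))   ## number of missing (NA) entries per column
o  <- order(na)           ## non-decreasing in the NA counts
y  <- y[, o]              ## now (at least) the first column is complete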

Samples from the posterior distribution of the multivariate normal mean vector and covariance matrix are obtained by sampling from the posterior distributions of a sequence of Bayesian regression models. The methodology for converting these into samples of the mean vector and covariance matrix is outlined in the monomvn documentation, which details a similarly structured maximum likelihood approach. See also the references below.

Whenever the regression model is ill-posed (i.e., when there are more covariates than responses, the “big p small n” problem), Bayesian lasso or ridge regressions, possibly augmented with Reversible Jump (RJ) for model selection, are used instead. See the Park & Casella reference below, and the blasso documentation. To guarantee that each regression is well posed, the combination of method = "lsr" and RJ = "none" is not allowed. As in monomvn, the p argument can be used to turn on lasso or ridge regressions (possibly with RJ) at other times.
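
As a rough sketch of this switching rule (an assumed reading of the p argument's description above, not code taken from the package source):

## for one regression: the design matrix X has the covariates as its
## columns and the (non-NA) responses as its rows
use_parsimonious <- function(X, p = 0.9) {
  ## TRUE: abandon least squares for lasso/ridge (possibly with RJ)
  ncol(X) >= p * nrow(X)
}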

Value

bmonomvn returns an object of class "monomvn", which is a list containing the inputs above and a subset of the components below.

call a copy of the function call as used
mu estimated mean vector, with entries corresponding to the columns of y
S estimated covariance matrix with rows and columns corresponding to the columns of y
mu.var estimated variance of the mean vector, with entries corresponding to the columns of y
S.var estimated variance of the individual components of the covariance matrix with columns and rows corresponding to the columns of y
na when pre = TRUE this is a vector containing the number of NA entries in each column of y
o when pre = TRUE this is a vector containing the index of each column in the sorting of the columns of y obtained by o <- order(na)
method method of regression used on each column, or "bcomplete" indicating that no regression was used
thin the number of thinning rounds used for the regression (indicated in method) for each column
lambda2 records the mean lambda^2 value found in the trace of the Bayesian lasso regressions. This value will be zero when the corresponding column is a complete case or was fit by ordinary least squares (these would be NA entries from monomvn)
ncomp records the mean number of components (columns of the design matrix) used in the regression model for each column of y. If input RJ = "none" then this simply corresponds to the monotone ordering (these would correspond to the NA entries from monomvn). When RJ is active the monotone ordering is an upper bound on each entry; see the sketch following this list
trace if input trace = TRUE then this field contains traces of the samples of mu in the field $mu and of S in the field $S, and of all regression parameters for each of the m = length(mu) columns in the field $reg. This $reg field is a stripped-down "blasso"-class object so that the methods of that object may be used for analysis
B from inputs: number of Burn-In MCMC sampling rounds, during which samples are discarded
T from inputs: total number of MCMC sampling rounds to take place after burn-in, during which samples are saved
r from inputs: alpha (shape) parameter of the gamma prior on the lasso penalty parameter lambda^2
delta from inputs: beta (rate) parameter of the gamma prior on the lasso penalty parameter lambda^2
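
A brief sketch of inspecting some of these fields, assuming a fit such as out <- bmonomvn(xmiss) on a matrix xmiss with NAs (as in the Examples below):

out$na              ## NA counts per column (when pre = TRUE)
out$o               ## the monotone ordering, order(out$na)
out$method          ## regression type used for each column
out$ncomp           ## mean number of design-matrix components used
sqrt(diag(out$S))   ## standard deviations implied by the estimated S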

Note

Whenever the bmonomvn algorithm requires a regression where p >= n, i.e., whenever any column of the y matrix has fewer non-NA entries than the number of columns with more non-NA entries, it is helpful to employ both a lasso/ridge method and RJ.
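
For example, on a hypothetical matrix xmiss whose shortest columns put the corresponding regressions in the p >= n regime, one might choose (settings shown for illustration only):

## ridge shrinkage, with RJ applied to every parsimonious regression
ob <- bmonomvn(xmiss, method = "ridge", RJ = "p")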

It is important that any starting values provided in start be compatible with the regression model specified by the inputs RJ and method. Any incompatibility will result in a warning that (alternative) default action was taken, and may result in an undesired (possibly inferior) model being fit

Author(s)

Robert B. Gramacy bobby@statslab.cam.ac.uk

References

Robert B. Gramacy and Joo Hee Lee (2007). On estimating covariances between many assets with histories of highly variable length. Preprint available on arXiv:0710.5837:
http://arxiv.org/abs/0710.5837

Roderick J.A. Little and Donald B. Rubin (2002). Statistical Analysis with Missing Data, Second Edition. Wiley.

Park, T. and Casella, G. (2008). The Bayesian Lasso. (unpublished)
http://www.stat.ufl.edu/~casella/Papers/bayeslasso.pdf

http://www.statslab.cam.ac.uk/~bobby/monomvn.html

See Also

blasso, monomvn, em.norm in the norm package, and mlest in the mvnmle package

Examples

## standard usage, duplicating the results in
## Little and Rubin, section 7.4.3
data(cement.miss)
out <- bmonomvn(cement.miss)
out
out$mu
out$S

##
## A bigger example, comparing the various methods
##

## generate N=100 samples from a random 20-d MVN
xmuS <- randmvn(100, 20)

## randomly impose monotone missingness
xmiss <- rmono(xmuS$x)

## using least squares only when necessary
obl <- bmonomvn(xmiss)
obl
rmse.muS(obl$mu, obl$S, xmuS$mu, xmuS$S)
## compare with the maximum likelihood (lasso) solution from monomvn
oml <- monomvn(xmiss, method="lasso")
rmse.muS(oml$mu, oml$S, xmuS$mu, xmuS$S)

## using least squares sparingly
obls <- bmonomvn(xmiss, p=0.25)
rmse.muS(obls$mu, obls$S, xmuS$mu, xmuS$S)
omls <- monomvn(xmiss, p=0.25, method="lasso")
rmse.muS(omls$mu, omls$S, xmuS$mu, xmuS$S)

## compare to ridge regression
obrs <- bmonomvn(xmiss, p=0.25, method="ridge")
rmse.muS(obrs$mu, obrs$S, xmuS$mu, xmuS$S)
omrs <- monomvn(xmiss, p=0.25, method="ridge")
rmse.muS(omrs$mu, omrs$S, xmuS$mu, xmuS$S)

## using the maximum likelihood solution to initialize
## the Markov chain and avoid burn-in.  
ob2s <- bmonomvn(xmiss, p=0.25, B=0, start=omls, RJ="p")
rmse.muS(ob2s$mu, ob2s$S, xmuS$mu, xmuS$S)
