bagging {GeneTS} | R Documentation |
bagged.cov, bagged.cor, and bagged.pcor calculate the bootstrap aggregated (=bagged) versions of the covariance and (partial) correlation estimators.
The bagged covariance and correlation estimators are advantageous especially for small sample size problems. For example, the bagged correlation matrix typically remains positive definite even when the sample size is much smaller than the number of variables.
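The small-sample behaviour described above can be checked with a short base R experiment. This is only a sketch on simulated data: it does not call GeneTS, and the sizes n, p and the number of replicates R are arbitrary choices made for illustration.

```r
set.seed(1)
n <- 10   # observations (illustrative choice)
p <- 20   # variables, more than observations
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Sample correlation: its rank is at most n - 1, so it is singular when n <= p
r.raw <- cor(x)

# Bagged correlation: average of the correlation matrices of bootstrap resamples
R <- 200
boot.cors <- replicate(R, cor(x[sample.int(n, n, replace = TRUE), ]),
                       simplify = FALSE)
r.bag <- Reduce(`+`, boot.cors) / R

# Compare the smallest eigenvalues of the two matrices
ev.raw <- eigen(r.raw, symmetric = TRUE, only.values = TRUE)$values
ev.bag <- eigen(r.bag, symmetric = TRUE, only.values = TRUE)$values
min(ev.raw)
min(ev.bag)
```

The smallest eigenvalue of the plain estimate is zero up to rounding error, whereas the bagged estimate averages matrices with different null spaces and would be expected to have a strictly positive smallest eigenvalue.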
In Schaefer and Strimmer (2005) the inverse of the bagged correlation matrix is used to estimate graphical Gaussian models from sparse microarray data; see also ggm.estimate.pcor for various strategies to estimate partial correlation coefficients.
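For orientation, the standard relation between an inverse correlation matrix and partial correlations can be sketched in base R as follows. The helper name pcor_from_cor is hypothetical and this is not necessarily how ggm.estimate.pcor or partial.cor are implemented; it assumes the input matrix is invertible.

```r
# Hypothetical helper (not part of GeneTS): partial correlations from an
# invertible correlation matrix via the standardized, sign-flipped inverse.
pcor_from_cor <- function(r) {
  prec <- solve(r)        # inverse correlation (precision) matrix
  pc <- -cov2cor(prec)    # scale to unit diagonal, flip the sign
  diag(pc) <- 1           # by convention the diagonal is set to 1
  pc
}

set.seed(2)
x <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)  # simulated data, n > p
r <- cor(x)
pc <- pcor_from_cor(r)
pc
```

With three variables, pc[1, 2] agrees with the textbook formula (r12 - r13 r23) / sqrt((1 - r13^2)(1 - r23^2)). In the small-sample setting the same step would be applied to a bagged correlation matrix, which is what makes its positive definiteness useful.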
bagged.cov(x, R=1000, ...)
bagged.cor(x, R=1000, ...)
bagged.pcor(x, R=1000, ...)
x | data matrix or data frame
R | number of bootstrap replicates (default: 1000)
... | options passed to cov, cor, and partial.cor (e.g., to control handling of missing values)
Bagging was first suggested by Breiman (1996) as a means to improve an estimator using the bootstrap. The bagged estimate is simply the mean of the bootstrap sampling distribution.
Bagging is essentially a non-parametric variance reduction method. The bagged estimate can also be interpreted as an (approximate) posterior mean estimate assuming some implicit prior.
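The idea of taking the mean of the bootstrap sampling distribution can be sketched generically in base R. The function bagged_estimate below is a hypothetical illustration, not the GeneTS implementation; it assumes the estimator returns a square matrix with one row and column per variable, and it resamples observations (rows) with replacement.

```r
# Hypothetical helper (not part of GeneTS): average a matrix-valued
# estimator over R bootstrap resamples of the rows of x.
bagged_estimate <- function(x, estimator = cov, R = 200, ...) {
  x <- as.matrix(x)
  n <- nrow(x)
  total <- matrix(0, ncol(x), ncol(x))
  for (b in seq_len(R)) {
    idx <- sample.int(n, n, replace = TRUE)   # one bootstrap resample
    total <- total + estimator(x[idx, , drop = FALSE], ...)
  }
  total / R   # mean over the bootstrap replicates
}

set.seed(3)
x <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)  # simulated data
b.cov <- bagged_estimate(x, cov, R = 200)
b.cor <- bagged_estimate(x, cor, R = 200)
sum((b.cov - cov(x))^2)   # total squared difference from the plain estimate
```

Accumulating a running sum avoids storing all R replicate matrices; extra arguments in ... are forwarded to the estimator, mirroring the argument description above.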
A symmetric matrix.
Juliane Schaefer (http://www.stat.uni-muenchen.de/~schaefer/) and Korbinian Strimmer (http://www.stat.uni-muenchen.de/~strimmer/).
Breiman, L. (1996). Bagging predictors. Machine Learning 24:123-140.
Schaefer, J., and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754-764.
Schaefer, J., and Strimmer, K. (2005). Learning large-scale graphical Gaussian models from genomic data. Proceedings of CNET 2004, Aveiro, Pt. (AIP)
cov, cor, partial.cor, ggm.estimate.pcor, robust.boot.
# load GeneTS library
library(GeneTS)

# small example data set
data(caulobacter)
dat <- caulobacter[,1:15]
dim(dat)

# bagged estimates
b.cov <- bagged.cov(dat)
b.cor <- bagged.cor(dat)
b.pcor <- bagged.pcor(dat)

# total squared difference
sum( (b.cov - cov(dat))^2 )
sum( (b.cor - cor(dat))^2 )
sum( (b.pcor - partial.cor(dat))^2 )

# positive definiteness of bagged correlation
is.positive.definite(cor(dat))
is.positive.definite(b.cor)