gm.validation {gmvalid} | R Documentation |
Analyzes bootstrapped graphical models and applies several new uncertainty measures to quantify the uncertainty of a selected graphical model.
gm.validation(data, N = 0, Umax = 0.5, conf.level = 0.95, ...)
data |
Output list from gm.boot.coco or gm.boot.mim, or a data frame or array.
Variables must be discrete and should have names. |
N |
Number of bootstrap replications. Only needed if data is not yet a bootstrap output. |
Umax |
Parameter that defines the maximum uncertainty in the edge selection frequency (default is 0.5). |
conf.level |
Confidence level for bootstrap percentile interval (default is 0.95). |
... |
Additional options for the model selection strategy. Only needed if data is not yet a bootstrap output.
See gm.boot.coco. |
The bootstrap functions produce multivariate output on the uncertainty of a selected graphical model. This function offers several ways to reduce that uncertainty to a univariate measure, based either on how frequently each edge is present in the bootstrapped models or on the differences between models, measured in edges.
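As a rough illustration of the two ingredients mentioned above, the following base R sketch computes edge selection frequencies and per-replication edge differences from a handful of made-up bootstrap models. The edge labels, the list `boot_edges`, the `reference` model and the helper `edge_diff` are hypothetical and not part of gmvalid. The difference between two models is assumed here to be the number of edges present in exactly one of them.

```r
# Hypothetical edge sets of four bootstrapped models (not gmvalid output)
boot_edges <- list(
  c("a-b", "b-c"),
  c("a-b"),
  c("a-b", "b-c", "c-d"),
  c("a-b", "c-d")
)

# All possible edges between the variables a, b, c and d
all_edges <- c("a-b", "a-c", "a-d", "b-c", "b-d", "c-d")

# Edge selection frequency: share of replications that contain each edge
edge_freq <- sapply(all_edges, function(e) {
  mean(sapply(boot_edges, function(m) e %in% m))
})

# Assumed edge difference: edges present in exactly one of the two models
edge_diff <- function(m1, m2) {
  length(union(setdiff(m1, m2), setdiff(m2, m1)))
}

reference <- c("a-b", "b-c")  # hypothetical reference model
diffs <- sapply(boot_edges, edge_diff, m2 = reference)
```

Here `edge_freq` is the multivariate view (one number per edge) and `diffs` is the per-replication view (one number per bootstrapped model). The measures documented on this page summarise vectors of this kind.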
"original model" |
Character string of the selected graphical model, using the original unsampled data. |
"mode model" |
Character string of the model that was selected most frequently. |
"mean model" |
Character string of a graphical model that consists of those edges whose selection frequency
over the bootstrap replications is greater than Umax. |
"MEU" |
Mean edge uncertainty. Linear measure for the uncertainty of the mean model based on the edge frequency f*(e).
MEU = 1/|VxV| sum[min[(1-Umax)/0.5 f*(e),Umax/0.5 (1-f*(e))] / [Umax (1-Umax) / 0.5 / 0.5 ]] with vertices set V. |
"MSEU" |
Mean squared edge uncertainty.
MSEU = 1/|VxV| sum[min[(1-Umax)/0.5 f*(e),Umax/0.5 (1-f*(e))] / [Umax (1-Umax) / 0.5 ]]^2 |
"edge differences" |
Frequency list of the edges in which the bootstrapped models differ from the mean model. |
"total possible edges" |
Number of edges in the saturated model. |
"model std" |
std = 1/(B-1) sum[d(G*b,G*)^2], where B is the number of bootstrap replications and d(G*b,G*) are the edge differences. |
"MED" |
Mean edge deviation. Mean of the edge differences. |
"bootstrap percentile 95" |
The value at or below which at least 95% of the edge differences fall. It gives an upper bound
on the model uncertainty. |
"variable names" |
Matrix. Assigns a letter to each variable used in the model formulas. |
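The quantities listed above can be sketched in base R. This is an illustration only, not the gmvalid implementation: the vectors `freq` and `d` are made-up inputs, the normalising constants of MEU and MSEU are left out, and the percentile is computed with `quantile(..., type = 1)`, which is one plausible reading of "the value that includes at least the lower 95% of edge differences".

```r
Umax <- 0.5
conf.level <- 0.95

# Hypothetical edge selection frequencies f*(e)
freq <- c("a-b" = 0.95, "a-c" = 0.40, "b-c" = 0.60, "b-d" = 0.05)

# Inner term of MEU and MSEU before normalisation.
# It is zero for f*(e) = 0 or 1 and largest when f*(e) equals Umax.
edge_term <- pmin((1 - Umax) / 0.5 * freq, Umax / 0.5 * (1 - freq))

# Hypothetical edge differences d(G*b, G*) for B = 10 replications
d <- c(0, 1, 1, 2, 0, 1, 3, 1, 0, 2)
B <- length(d)

# Mean edge deviation: mean of the edge differences
MED <- mean(d)

# "model std", following the formula as written on this page
model_std <- sum(d^2) / (B - 1)

# Assumed percentile definition: smallest observed value with at least
# conf.level of the edge differences at or below it
upper <- quantile(d, probs = conf.level, type = 1, names = FALSE)
```

Edges selected almost always or almost never (`a-b`, `b-d`) contribute little to the uncertainty term, while edges with a frequency near Umax (`a-c`, `b-c`) contribute most.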
The question of when an edge is maximally uncertain has not yet been answered satisfactorily.
Can we say that an association that is selected in 40% of the bootstrap replications
is not present? Or that an edge that is present in 60% of them was not selected
randomly? The argument Umax therefore leaves this decision to you.
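To make the role of Umax concrete, this small base R sketch applies the mean model rule described on this page (keep the edges whose selection frequency is greater than Umax) to made-up frequencies for three thresholds. The vector `freq` and the helper `mean_model` are hypothetical names used for illustration only.

```r
# Hypothetical edge selection frequencies over the bootstrap replications
freq <- c("a-b" = 0.95, "a-c" = 0.40, "b-c" = 0.60, "b-d" = 0.05)

# Mean model rule: keep the edges selected more often than Umax
mean_model <- function(freq, Umax) names(freq)[freq > Umax]

# Edges kept under a lenient, the default, and a strict threshold
thresholds <- c(0.3, 0.5, 0.7)
models <- lapply(thresholds, function(u) mean_model(freq, u))
names(models) <- paste0("Umax=", thresholds)
```

With these numbers the edge selected in 40% of the replications enters the mean model only for Umax = 0.3, and the edge selected in 60% drops out for Umax = 0.7, which is exactly the judgment call the argument leaves open.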
If you have already run a bootstrap, make sure it was run with all the possible calculations.
Ronja Foraita, Fabian Sobotka
Bremen Institute for Prevention Research and Social Medicine
(BIPS) http://www.bips.uni-bremen.de
Efron B, Tibshirani RJ (1993) An Introduction to the Bootstrap. Chapman & Hall
Foraita R, Sobotka F, Pigeot I (2009) The uncertainty of a selected graphical model. unpublished
Sobotka F, Foraita R, Eberle A, Pigeot I (2008) GMVALID: An R-package to validate graphical models using data of external causes of morbidity and mortality in Bremen, 1999-2006 (in German, poster) http://www.bips.uni-bremen.de/ma_downloads/bips_gmvalid.pdf
### Standard procedure
data(wam)
boot.out <- gm.boot.coco(1000, wam, strategy = "f", recursive = TRUE, follow = TRUE, all.significant = FALSE)
gm.validation(boot.out)