gm.validation {gmvalid}    R Documentation

Validation and uncertainty measures for graphical models.

Description

Analyzes bootstrapped graphical models and applies several uncertainty measures to quantify the uncertainty of a selected graphical model.

Usage

gm.validation(data, N = 0, Umax = 0.5, conf.level = 0.95, ...)

Arguments

data Output list from gm.boot.coco or gm.boot.mim, data frame or array. Variables need to be discrete and should have names.
N Number of bootstrap replications. Only needed if data is not yet a bootstrap output.
Umax Edge selection frequency at which an edge is considered maximally uncertain (default is 0.5).
conf.level Confidence level for bootstrap percentile interval (default is 0.95).
... Further options for the model selection strategy. Only needed if data is not yet a bootstrap output. See gm.boot.coco.

Details

The bootstrap functions return multivariate output describing the uncertainty of a selected graphical model. This function offers several ways to reduce that uncertainty to a univariate measure, based either on how frequently each edge is present in the bootstrapped models or on the differences between models, measured in edges.
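
As a rough illustration of these two routes, the sketch below uses only base R and a made-up set of bootstrapped models, each stored as a character vector of edge labels. This representation and the names boot_models, edge_freq and edge_diff are assumptions for illustration, not the internal format of gm.boot.coco. Counting differences as the size of the symmetric difference of two edge sets is likewise an assumption.

```r
# Hypothetical bootstrap result: each model is a set of undirected edges.
boot_models <- list(
  c("a-b", "b-c"),
  c("a-b"),
  c("a-b", "b-c", "a-c"),
  c("a-b", "b-c")
)
all_edges <- c("a-b", "a-c", "b-c")  # edges of the saturated model on 3 vertices

# Route 1: selection frequency f*(e) of every possible edge.
edge_freq <- sapply(all_edges, function(e) {
  mean(sapply(boot_models, function(m) e %in% m))
})

# Route 2: edge differences d(G*b, G*) to a reference model, counted here
# as the size of the symmetric difference of the two edge sets (assumption).
reference <- all_edges[edge_freq > 0.5]
edge_diff <- sapply(boot_models, function(m) {
  length(union(setdiff(m, reference), setdiff(reference, m)))
})
```

Either vector can then be summarised by a single number, which is what the measures listed under Value do.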

Value

"original model" Character string of the selected graphical model, using the original unsampled data.
"mode model" Character string of the model that was selected most frequently.
"mean model" Character string of a graphical model that consists of those edges whose selection frequency is greater than Umax over the bootstrap replications.
"MEU" Mean edge uncertainty. Linear measure for the uncertainty of the mean model based on the edge frequency f*(e).

MEU = 1/|VxV| sum[min[(1-Umax)/0.5 f*(e),Umax/0.5 (1-f*(e))] / [Umax (1-Umax) / 0.5 / 0.5 ]]

with vertices set V.

"MSEU" Mean squared edge uncertainty.

MSEU = 1/|VxV| sum[min[(1-Umax)/0.5 f*(e),Umax/0.5 (1-f*(e))] / [Umax (1-Umax) / 0.5 ]]^2

"edge differences" Frequency list of edges that differ in the bootstrapped models from the mean model.
"total possible edges" Number of edges in the saturated model.
"model std"

std = 1/(B-1) sum[d(G*b,G*)^2]

where B is the number of bootstrap replications and d(G*b,G*) are the edge differences.

"MED" Mean edge deviation. Mean of edge differences.
"bootstrap percentile 95" The value that covers at least the lower 95% of the edge differences; it serves as an upper bound for the model uncertainty.
"variable names" Matrix. Assigns a letter to each variable used in the model formulas.

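The measures listed above can be sketched in base R as follows. This is not the package's implementation; it only evaluates the formulas as printed on hypothetical inputs. Three assumptions are made: |VxV| is taken to be the number of possible edges, the square in MSEU is applied to each edge term before summing, and the percentile is computed with the inverse empirical distribution function (type = 1 in quantile).

```r
# Hypothetical inputs (not produced by gmvalid here).
f    <- c(1.00, 0.25, 0.75)   # selection frequencies f*(e), one per possible edge
d    <- c(0, 1, 1, 0)         # edge differences d(G*b, G*), one per replication
Umax <- 0.5
conf.level <- 0.95

# Per-edge term shared by the MEU and MSEU formulas.
edge_term <- pmin((1 - Umax) / 0.5 * f, Umax / 0.5 * (1 - f))

# Assumption: |VxV| is taken as the number of possible edges, length(f).
MEU  <- sum(edge_term / (Umax * (1 - Umax) / 0.5 / 0.5)) / length(f)
# Assumption: the square applies to each edge term, not to the whole sum.
MSEU <- sum((edge_term / (Umax * (1 - Umax) / 0.5))^2) / length(f)

# Difference-based measures, following the formulas as printed.
B         <- length(d)
model_std <- sum(d^2) / (B - 1)
MED       <- mean(d)
# Assumption: smallest observed value covering at least conf.level of d.
upper     <- quantile(d, probs = conf.level, type = 1, names = FALSE)
```

With Umax = 0.5 the edge term reduces to min[f*(e), 1 - f*(e)], so an edge selected in every replication, or in none, contributes nothing to MEU or MSEU.
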
Note

When an edge is maximally uncertain is a question that has not yet been answered satisfactorily. Can we say that an association that is selected in 40% of the replications is not present? Or that an edge that is present in 60% was selected only randomly? The argument Umax therefore leaves this choice to you.
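
To get a feeling for this choice, the following base R sketch shows how different values of Umax change which edges enter the mean model and how uncertain each edge is rated. The frequencies and the helper edge_uncertainty are hypothetical. The helper uses the per-edge term of the formulas under Value, scaled by Umax (1-Umax) / 0.5 so that it equals 1 exactly where f*(e) = Umax.

```r
# Hypothetical selection frequencies for three edges.
f <- c(strong = 0.90, borderline = 0.45, weak = 0.10)

# Per-edge term from the formulas under Value, scaled by
# Umax (1 - Umax) / 0.5 so that it reaches 1 where f equals Umax.
edge_uncertainty <- function(f, Umax) {
  pmin((1 - Umax) / 0.5 * f, Umax / 0.5 * (1 - f)) / (Umax * (1 - Umax) / 0.5)
}

for (Umax in c(0.4, 0.5, 0.6)) {
  # Edges whose selection frequency exceeds Umax form the mean model.
  in_mean_model <- names(f)[f > Umax]
  cat("Umax =", Umax, "-> mean model edges:",
      paste(in_mean_model, collapse = ", "), "\n")
  print(round(edge_uncertainty(f, Umax), 3))
}
```

In this made-up example the borderline edge belongs to the mean model for Umax = 0.4 but not for Umax = 0.5 or 0.6, while the strong and weak edges are classified the same way throughout.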

If you have already run a bootstrap, make sure it was run with all available calculations.

Author(s)

Ronja Foraita, Fabian Sobotka
Bremen Institute for Prevention Research and Social Medicine
(BIPS) http://www.bips.uni-bremen.de

References

Efron B, Tibshirani RJ (1993) An Introduction to the Bootstrap. Chapman & Hall

Foraita R, Sobotka F, Pigeot I (2009) The uncertainty of a selected graphical model. unpublished

Sobotka F, Foraita R, Eberle A, Pigeot I (2008) GMVALID: An R-package to validate graphical models using data of external causes of morbidity and mortality in Bremen, 1999-2006 (in German, poster) http://www.bips.uni-bremen.de/ma_downloads/bips_gmvalid.pdf

See Also

gm.boot.coco

Examples

  ### Standard procedure: bootstrap first, then validate
  data(wam)
  # 1000 bootstrap replications of the model selection
  boot.out <- gm.boot.coco(1000, wam, strategy = "f", recursive = TRUE,
                           follow = TRUE, all.significant = FALSE)
  gm.validation(boot.out)



[Package gmvalid version 1.2 Index]