gm.boot.mim {gmvalid}    R Documentation

Graphical model validation using the bootstrap (MIM)

Description

Validates a discrete undirected graphical model using the bootstrap. Relative frequencies of the bootstrapped models, cliques or edges are counted. Make sure that MIM is running.

Usage

gm.boot.mim(N, data, strategy = c("backwards", "forwards", "eh", "combined"),
            calculations = c("diff", "edge", "clique"),
            model = FALSE, options = "")

Arguments

N Number of bootstrap replications.
data Data frame or array. Variables need to be discrete and should have names.
strategy Type of model selection. "backwards" eliminates non-significant edges, starting by default from the saturated model. "forwards" adds significant edges, starting from the main effects model. The "eh" (Edwards-Havranek) model search rejects complete models in every step and finishes with one or more accepted models. The "combined" strategy is a three-step procedure: gm.screening, then "backwards", then "forwards". The default strategy is "backwards". Selections may be abbreviated.
calculations String vector specifying the analysis methods. "clique" and "edge" report the frequency of occurrence of cliques and edges in the models selected from the bootstrap samples, while "diff" counts the edge differences between each bootstrap replication and the edges selected from the original data set. The frequencies of the selected models over all bootstrap samples are always calculated. By default all calculations are done. Selections may be abbreviated.
model Character string specifying a start model for the "backwards" or "forwards" selection procedures. For the "eh" procedure, a minimal and a maximal model have to be given in one string, connected with " - " (see Examples). For "combined", no model can be given; the start model is determined by gm.screening. The model formula has to use the first lowercase letters of the alphabet, e.g. "abc,cde". Variable names cannot be used.
options Optional character string to specify further options for the search strategy. Possible options can be found in the MIM help searching for "stepwise" (backwards, forwards) or "startsearch" (eh). See details.
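
The letter notation used by the model argument can be illustrated with a small base R sketch. The helpers letter_key, parse_model and parse_eh are hypothetical and not part of gmvalid, and the variable names are invented for illustration; the sketch only shows how a formula such as "abc,cde" or an "eh" string "min - max" breaks down into cliques.

```r
# Hypothetical helpers (not part of gmvalid): illustrate the letter-based
# model formula notation described for the 'model' argument.

# Assign the first lowercase letters to the variables, in column order.
letter_key <- function(var_names) {
  stopifnot(length(var_names) <= length(letters))
  setNames(var_names, letters[seq_along(var_names)])
}

# Split a model formula such as "abc,cde" into its generators (cliques).
parse_model <- function(model) {
  generators <- strsplit(gsub("\\s+", "", model), ",", fixed = TRUE)[[1]]
  lapply(generators, function(g) strsplit(g, "", fixed = TRUE)[[1]])
}

# Split an "eh" specification "min - max" into its two models.
parse_eh <- function(spec) {
  parts <- strsplit(spec, " - ", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  list(minimal = parse_model(parts[1]), maximal = parse_model(parts[2]))
}

# Invented variable names, for illustration only.
key <- letter_key(c("Sex", "Smoking", "Alcohol", "Age", "Region"))
cliques <- parse_model("abc,cde")

# Translate the letters of each clique back to variable names.
lapply(cliques, function(cl) unname(key[cl]))

eh <- parse_eh("a,bc,de,f - abcde,bcdef")
```

In the function's return value, the "variable names" element plays the role of key here: it records which letter stands for which variable.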

Details

This function uses a nonparametric bootstrap.
Run numbers are displayed during the computation to indicate the progress of the bootstrap.
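
The general idea of a nonparametric bootstrap of a model selection step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the implementation used by gm.boot.mim: boot_model_freq and toy_select are hypothetical names, and the toy selector (a single chi-squared test) merely stands in for the selection that MIM performs.

```r
# Sketch of a nonparametric bootstrap of a model selection step.
# 'select_model' is a hypothetical stand-in; gm.boot.mim delegates the
# actual selection to MIM.
boot_model_freq <- function(N, data, select_model) {
  n <- nrow(data)
  selected <- character(N)
  for (b in seq_len(N)) {
    # Draw n rows with replacement from the original data.
    idx <- sample.int(n, size = n, replace = TRUE)
    selected[b] <- select_model(data[idx, , drop = FALSE])
    if (b %% 10 == 0) message("bootstrap run ", b)  # progress display
  }
  # Relative selection frequency of each distinct model, most frequent first.
  sort(table(selected) / N, decreasing = TRUE)
}

# Toy selector (assumption, for illustration only): keep the edge "ab" if a
# chi-squared test of independence between the first two columns rejects
# at the 5% level, otherwise return the main effects model "a,b".
toy_select <- function(d) {
  p <- suppressWarnings(chisq.test(table(d[[1]], d[[2]]))$p.value)
  if (!is.na(p) && p < 0.05) "ab" else "a,b"
}

set.seed(1)
x <- rbinom(500, 1, 0.5)
y <- rbinom(500, 1, 0.3 + 0.4 * x)
toy <- data.frame(a = factor(x), b = factor(y))
freq <- boot_model_freq(50, toy, toy_select)
```

The resulting table corresponds in spirit to the "bootstrapped models" element of the return value: each distinct model together with the share of bootstrap samples in which it was selected.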

MIM options for stepwise procedures (backwards, forwards):
"A" - uses the AIC as selection criterion
"B" - uses the BIC as selection criterion
"J" - joggles between backward and forward
"N" - non-coherent mode
"U" - unrestricted, allows for non-decomposable models

MIM options for the eh model search, given in this order and separated by white space:
(positive number) - maximum number of models fitted
(letters) - "U" for upward search, "D" for downward search; the default is both.

Value

A list containing:

"bootstrapped models" Matrix with the selected models in first column and their selection frequency in the second.
"bootstrapped cliques" Relative frequency vector of selected cliques. Returned if calculation "clique" is selected.
"edge frequencies" Matrix of relative edge frequencies. Returned if calculation "edge" is selected.
"original model" Character string of selected model using the original unsampled data. Returned if calculation "diff" is selected.
"edge differences" List of frequencies of the numbers of additional edges, missing edges, and edges differing in absolute terms, relative to the original model (see argument calculations). Sorted by occurrence. Returned if calculation "diff" is selected.
"replications" Number of bootstrap replications.
"variable names" Matrix that assigns a letter to each variable that is used in the model formulae.

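How edge frequencies and edge differences can be derived from a set of selected models is sketched below in base R. The helpers model_edges, edge_frequencies and edge_diff are hypothetical and the list of bootstrap models is made up; the exact layout of the values returned by gm.boot.mim may differ from this illustration.

```r
# Hypothetical helper: list the edges implied by a model formula such as
# "abc,cde" (every pair of letters within a generator is an edge).
model_edges <- function(model) {
  generators <- strsplit(model, ",", fixed = TRUE)[[1]]
  edges <- unlist(lapply(generators, function(g) {
    v <- sort(strsplit(g, "", fixed = TRUE)[[1]])
    if (length(v) < 2) return(character(0))
    apply(combn(v, 2), 2, paste, collapse = "")
  }))
  unique(edges)
}

# Relative frequency with which each edge appears across selected models.
edge_frequencies <- function(models) {
  all_edges <- unlist(lapply(models, model_edges))
  sort(table(all_edges) / length(models), decreasing = TRUE)
}

# Edges gained and lost by one bootstrap model relative to the original.
edge_diff <- function(boot_model, original_model) {
  b <- model_edges(boot_model)
  o <- model_edges(original_model)
  c(more = length(setdiff(b, o)), less = length(setdiff(o, b)))
}

# Made-up bootstrap results, for illustration only.
boot_models <- c("abc,cde", "ab,bc,cde", "abc,cd,de", "abc,cde")
ef <- edge_frequencies(boot_models)
d  <- edge_diff("ab,bc,cde", "abc,cde")
```

Here the edge "ac" is present in three of the four models, and the second bootstrap model has one edge fewer than the original model "abc,cde" and no additional edges.
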
Note

The function requires the MIM program. Make sure that it is running before using the function.
The package mimR only works properly if the path of your R working directory does not contain hyphens ("-"). For the requirements of mimR, see the package's help page (mimR) and its homepage http://gbi.agrsci.dk/~shd/public/mimR/index.html. mimR requires the Rgraphviz package, so you need to add "Bioconductor" to your R repositories.

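The EH strategy is very time-consuming. Its run time depends strongly on the number of variables, since the number of possible models grows very rapidly: with p variables there are 2^(p(p-1)/2) undirected graphs, so adding one more variable multiplies that count by 2^p.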
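
As a rough illustration of how quickly the search space can grow, the following base R sketch counts the undirected graphs on p labelled variables. The formula is a standard combinatorial fact and is used here only as a proxy (n_graphs is a hypothetical helper); the set of models MIM actually fits during an EH search is typically much smaller.

```r
# Number of undirected graphs on p labelled variables: each of the
# choose(p, 2) possible edges is either present or absent.
n_graphs <- function(p) 2^choose(p, 2)

# Tabulate the count for 3 to 8 variables.
growth <- data.frame(variables = 3:8, graphs = n_graphs(3:8))
growth
```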

Author(s)

Ronja Foraita, Fabian Sobotka
Bremen Institute for Prevention Research and Social Medicine
(BIPS) http://www.bips.uni-bremen.de

References

Efron B, Tibshirani RJ (1993) An Introduction to the Bootstrap. Chapman & Hall.

Edwards D (2000) An Introduction to Graphical Modelling. Second Edition, Springer Verlag.

See Also

gm.boot.coco, gm.screening

Examples

  ### Examples work!
## Not run: 
  ### should provide good results because of simulated data
  gm.a <- gm.modelsim(2000,"ABC,CDE")
  gm.boot.mim(50,gm.a)
  
  ### on real data sets a forward bootstrap seems to have better results
  data(wynder)
  ### "c" and "e" abbreviate the calculations "clique" and "edge"
  gm.boot.mim(100,wynder,strategy="f",calculations=c("c","e"),options="u")
  
  ### with model given
  data(wam)
  gm.boot.mim(10,wam,model="a,bcde,cdef")
  
  ### EH-strategy
  gm.boot.mim(50,wam,strategy="eh",model="a,bc,de,f - abcde,bcdef")
  
## End(Not run)

[Package gmvalid version 1.2 Index]