generate.data {Oncotree} | R Documentation |
Generates random event occurrence data based on an oncogenetic tree model.
generate.data(N, otree, with.errors=TRUE, edge.weights=if (with.errors) "estimated" else "observed")
N |
The required sample size. |
otree |
An object of class oncotree. |
with.errors |
A logical value specifying whether false positive and negative errors should be applied. |
edge.weights |
Whether the observed or the estimated
edge transition probabilities should be used in the calculation
of probabilities. See oncotree.fit for an explanation
of the difference. By default, the estimated edge transition probabilities
are used if with.errors=TRUE and the observed ones if
with.errors=FALSE. |
Technically, the distribution generated by the
tree is calculated exactly (using distribution.oncotree),
and the observations are generated by sampling from this distribution.
Thus, if N is small and with.errors=TRUE, it might
be faster to avoid the computational overhead of calculating the
entire distribution, and instead generate data without
false positives/negatives and then randomly ‘corrupt’ it
(see Examples below).
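The "calculate exactly, then sample" idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: the table toy.dist, its Prob column, and the helper sample.from.dist are hypothetical names, and the probabilities are made up rather than derived from a fitted tree.

```r
# Toy joint distribution over two binary events A and B.
# The probabilities are invented for illustration; they do not come from a fitted tree.
toy.dist <- data.frame(
  A    = c(0, 1, 0, 1),
  B    = c(0, 0, 1, 1),
  Prob = c(0.4, 0.3, 0.1, 0.2)
)

# Hypothetical helper: draw N independent observations from a tabulated distribution.
sample.from.dist <- function(N, dist) {
  # Choose row indices with probability given by the Prob column
  idx <- sample(nrow(dist), size = N, replace = TRUE, prob = dist$Prob)
  # Drop the probability column; each remaining row is one observation
  obs <- dist[idx, names(dist) != "Prob", drop = FALSE]
  rownames(obs) <- NULL
  obs
}

set.seed(1)
toy.data <- sample.from.dist(10, toy.dist)
dim(toy.data)
```

In this sketch the whole table has to exist before the first observation is drawn, so the cost of building it is the same whether 10 or 10,000 rows are requested. That is the overhead the paragraph above suggests avoiding when N is small.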
A data set where each row is an independent observation.
Aniko Szabo
data(ov.cgh)
ov.tree <- oncotree.fit(ov.cgh)
set.seed(7365)
rd <- generate.data(200, ov.tree, with.errors=TRUE)

#corrupt data - useful for small N
system.time({
  # generate error-free data first
  rd2 <- generate.data(20, ov.tree, with.errors=FALSE)
  epos <- ov.tree$eps[["epos"]]
  eneg <- ov.tree$eps[["eneg"]]
  # a true 0 becomes 1 with probability epos; a true 1 stays 1 with probability 1-eneg
  corrupt.data <- matrix(rbinom(prod(dim(rd2)), size=1,
                                prob=ifelse(rd2==0, epos, 1-eneg)),
                         nrow=nrow(rd2), ncol=ncol(rd2),
                         dimnames=list(NULL, names(rd2)))
})
system.time(generate.data(20, ov.tree, with.errors=TRUE))