simulate {MAclinical} | R Documentation |
These functions simulate a list of data sets as described in Boulesteix et al. (2008), Section 3.1.
simuldata_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)
simuldatacluster_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)
niter |
The number of data sets to be simulated. |
n |
The number of observations. |
p |
The number of microarray variables (genes). |
psig |
The number of significant microarray variables (must be < p). |
q |
The number of clinical variables. |
muX |
The class mean difference for the psig relevant genes. |
muZ |
The class mean difference for the q clinical variables. |
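To illustrate how muX and muZ act as class mean differences, the following base-R sketch generates a single data set of the shape documented here. It is not the package's code: the helper name sketch_one_dataset, the equal class probabilities, the independent unit-variance normal draws, and the placement of the relevant genes in the first psig columns are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of MAclinical: one data set with class-dependent means.
sketch_one_dataset <- function(n = 100, p = 150, psig = 10, q = 5, muX = 2, muZ = 1) {
  stopifnot(psig < p)
  # Assumption: classes 0 and 1 are equally likely.
  y <- rbinom(n, size = 1, prob = 0.5)
  # Assumption: all variables start as independent standard normal noise.
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  z <- matrix(rnorm(n * q), nrow = n, ncol = q)
  # Shift the first psig genes by muX for observations in class 1.
  x[y == 1, seq_len(psig)] <- x[y == 1, seq_len(psig)] + muX
  # Shift all q clinical variables by muZ for observations in class 1.
  z[y == 1, ] <- z[y == 1, ] + muZ
  list(y = y, x = x, z = z)
}

set.seed(1)
d <- sketch_one_dataset()
dim(d$x)
dim(d$z)
```

With muX=0 and muZ=0 (the documented defaults), neither block of variables carries any class information in this sketch.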
With the function simuldatacluster_list, observations with y=1 are assumed to come from two different subgroups, 1a and 1b, each with probability 0.5. Relevant genes are generated such that they separate class 1a from the rest, whereas clinical variables separate class 1b from the rest.
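The subgroup structure described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper name sketch_cluster_dataset, the equal class probabilities, the independent standard normal noise, the use of the first psig columns as relevant genes, and the extra subgroup element in the result are all hypothetical.

```r
# Hypothetical helper, NOT part of MAclinical: class 1 is split into subgroups 1a and 1b.
sketch_cluster_dataset <- function(n = 100, p = 150, psig = 10, q = 5, muX = 2, muZ = 1) {
  stopifnot(psig < p)
  # Assumption: classes 0 and 1 are equally likely.
  y <- rbinom(n, size = 1, prob = 0.5)
  # Each class-1 observation falls into 1a or 1b with probability 0.5.
  subgroup <- ifelse(y == 1, sample(c("1a", "1b"), n, replace = TRUE), "0")
  # Assumption: all variables start as independent standard normal noise.
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  z <- matrix(rnorm(n * q), nrow = n, ncol = q)
  is1a <- subgroup == "1a"
  is1b <- subgroup == "1b"
  # Relevant genes separate subgroup 1a from everything else.
  x[is1a, seq_len(psig)] <- x[is1a, seq_len(psig)] + muX
  # Clinical variables separate subgroup 1b from everything else.
  z[is1b, ] <- z[is1b, ] + muZ
  list(y = y, x = x, z = z, subgroup = subgroup)
}

set.seed(2)
dc <- sketch_cluster_dataset()
table(dc$y, dc$subgroup)
```

In this sketch, observations in subgroup 1a look like class 0 on the clinical variables, and observations in subgroup 1b look like class 0 on the genes, so neither type of variable alone identifies all of class 1.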
A list of length niter containing the simulated data sets. Each data set is given as a list with three elements:
y |
the n-vector of class memberships, coded as 0, 1. |
x |
the n x p matrix of gene expression levels. Each row corresponds to an observation, each column to a variable (gene). |
z |
the n x q matrix of clinical variables. Each row corresponds to an observation, each column to a clinical variable. |
Anne-Laure Boulesteix (http://www.slcmsr.net/boulesteix)
Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.
testclass, testclass_simul, plsrf_x_pv, plsrf_xz_pv, plsrf_x, plsrf_xz, logistic_z, rf_z, svm_x.
# load MAclinical library
# library(MAclinical)

# Generating 3 simulated data sets
my.data<-simuldata_list(niter=3,n=100,p=150,psig=10,q=5,muX=2,muZ=1)
length(my.data)
dim(my.data[[1]]$x)
dim(my.data[[1]]$z)
length(my.data[[1]]$y)