generateArtificialLongData {longitudinalData} | R Documentation |
This function builds an artificial longitudinal data set and turns it
into an object of class LongData.
gald(nbEachClusters=50, time=0:10, decimal=2, percentOfMissing=0,
    functionClusters=list(function(t){0}, function(t){t}, function(t){10-t}, function(t){-0.4*t^2+4*t}),
    functionNoise=function(t){rnorm(1,0,3)})

generateArtificialLongData(nbEachClusters=50, time=0:10, decimal=2, percentOfMissing=0,
    functionClusters=list(function(t){0}, function(t){t}, function(t){10-t}, function(t){-0.4*t^2+4*t}),
    functionNoise=function(t){rnorm(1,0,3)})
nbEachClusters |
[numeric] or [vector(numeric)]: number of trajectories that each cluster must contain. If a single number is given, it is duplicated for all groups (see Details). |
functionClusters |
[function] or [list(function)]: the functions defining the average trajectory of each cluster. If a single function is given, it is duplicated for all groups (see Details). |
functionNoise |
[function] or [list(function)]: the functions generating the noise of each trajectory within its own cluster. If a single function is given, it is duplicated for all groups (see Details). |
time |
[vector(numeric)]: times at which measurements are made. |
decimal |
[numeric]: number of decimal places to which the values are rounded. |
percentOfMissing |
[numeric]: proportion (between 0 and 1) of missing data generated in each cluster. If a single value is given, it is duplicated for all groups (see Details). |
generateArtificialLongData (gald for short) is a
function that constructs a set of artificial longitudinal data.
Each individual is considered as belonging to a group. This group
follows a theoretical trajectory, which is a function of time. These functions (one per group) are given via the argument functionClusters.
Within a group, each individual undergoes individual variations. These variations are given via the argument functionNoise.
The number of individuals in each group is given by nbEachClusters.
Finally, missing values can be added at random positions in the
data via percentOfMissing.
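The generation scheme described above can be sketched in base R for a single group. This is only an illustration of the idea (theoretical trajectory plus noise, then rounding, then missing values), not the package's actual implementation. It assumes that the noise function is called once per time point and that missing cells are drawn uniformly at random; the names functionCluster, nbInCluster and nbMissing are hypothetical.

```r
# Illustrative base-R sketch; NOT the code used by generateArtificialLongData.
set.seed(1)                                      # reproducible noise
time <- 0:10
functionCluster <- function(t) {10 - t}          # theoretical trajectory of one group
functionNoise   <- function(t) {rnorm(1, 0, 3)}  # individual variation (assumed: one draw per time point)
nbInCluster      <- 5                            # hypothetical group size
decimal          <- 2
percentOfMissing <- 0.2

# One row per individual, one column per time point
traj <- t(sapply(seq_len(nbInCluster), function(i) {
  sapply(time, function(t) functionCluster(t) + functionNoise(t))
}))
traj <- round(traj, decimal)

# Replace a random fraction of the cells with NA (assumed: uniform over all cells)
nbMissing <- round(length(traj) * percentOfMissing)
traj[sample(length(traj), nbMissing)] <- NA

dimnames(traj) <- list(paste0("i", seq_len(nbInCluster)), paste0("t", time))
traj
```

Repeating this for every group and stacking the resulting matrices gives a data set of the kind the function returns inside a LongData object.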
Note that the number of clusters is defined as the largest length among the
arguments nbEachClusters, functionClusters,
functionNoise and percentOfMissing. So at least one of
these four arguments should be specified for each cluster.
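The rule above can be sketched in base R as follows. This is a hedged illustration of how the number of clusters and the duplication of single values might be computed, not the package's own code; the helper asList and the variable nbClusters are hypothetical names.

```r
# Illustrative sketch of the recycling rule; NOT the package's own code.
nbEachClusters   <- 50                                   # single value, to be duplicated
functionClusters <- list(function(t) {0}, function(t) {t}, function(t) {10 - t})
functionNoise    <- function(t) {rnorm(1, 0, 3)}         # single function, to be duplicated
percentOfMissing <- c(0.1, 0.2, 0.3)

# A bare function cannot be replicated with rep(), so wrap it in a list first
asList <- function(x) if (is.function(x)) list(x) else x
functionClusters <- asList(functionClusters)
functionNoise    <- asList(functionNoise)

# Number of clusters = largest length among the four arguments
nbClusters <- max(length(nbEachClusters), length(functionClusters),
                  length(functionNoise), length(percentOfMissing))

# Duplicate shorter arguments so that each cluster has its own value
nbEachClusters   <- rep(nbEachClusters,   length.out = nbClusters)
functionClusters <- rep(functionClusters, length.out = nbClusters)
functionNoise    <- rep(functionNoise,    length.out = nbClusters)
percentOfMissing <- rep(percentOfMissing, length.out = nbClusters)

nbClusters
```

Here three clusters are created because functionClusters and percentOfMissing both have length three; the single group size and the single noise function are reused for each of them.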
An object of class LongData. Note that the field other
of the LongData object contains the information that was used to generate
the data set: functionClusters, functionNoise,
percentOfMissing and trueClusters.
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact author: <genolini@u-paris10.fr>
LongData, longData, as.longData, plot
par(ask=TRUE)

### Default example
ex1 <- generateArtificialLongData()
ex1
plot(ex1,col=1,type.mean="n")
part1 <- partition(rep(1:4,each=50),4)
plot(ex1,part1)

### Three diverging lines
ex2 <- generateArtificialLongData(functionClusters=list(function(t)0,function(t)-t,function(t)t))
part2 <- partition(rep(1:3,each=50),3)
plot(ex2,part2)

### Three diverging lines with high variance, unbalanced groups and missing values
ex3 <- generateArtificialLongData(
    functionClusters=list(function(t)0,function(t)-t,function(t)t),
    nbEachClusters=c(100,30,10),
    functionNoise=function(t){rnorm(1,0,3)},
    percentOfMissing=c(0.25,0.5,0.25)
)
part3 <- partition(rep(1:3,c(100,30,10)),3)
plot(ex3,part3)

### Four strange functions
ex4 <- generateArtificialLongData(
    nbEachClusters=c(300,200,100,100),
    functionClusters=list(function(t){-10+2*t},function(t){-0.6*t^2+6*t-7.5},function(t){10*sin(t)},function(t){30*dnorm(t,2,1.5)}),
    functionNoise=function(t){rnorm(1,0,3)},
    time=0:10,decimal=2,percentOfMissing=0.3)
part4 <- partition(rep(1:4,c(300,200,100,100)),4)
plot(ex4,part4)

### To get only the longitudinal data (if you want some artificial data
### to use with another algorithm), use the getter ["traj"]
ex5 <- gald(nbEachClusters=3,time=1:3)
ex5["traj"]

par(ask=FALSE)