generateArtificialLongData {kml} | R Documentation |
This function builds an artificial data set of longitudinal data.
gald(name = "", clusterNames = "", nbEachClusters = rep(50, 3),
     functionClusters = list(function(t){t}, function(t){0}, function(t){-t}),
     functionNoise = function(t){rnorm(1, 0, 0.1)},
     time = 0:7, decimal = 2, percentOfMissing = 0)

generateArtificialLongData(name = "", clusterNames = "", nbEachClusters = rep(50, 3),
     functionClusters = list(function(t){t}, function(t){0}, function(t){-t}),
     functionNoise = function(t){rnorm(1, 0, 0.1)},
     time = 0:7, decimal = 2, percentOfMissing = 0)
name |
[character]: name of the data set. |
clusterNames |
[character]: name of each cluster. |
nbEachClusters |
[numeric]: number of trajectories that each cluster must contain. The length of nbEachClusters defines the number of groups. |
time |
[numeric]: times at which measurements are made. |
functionClusters |
[list(function)]: lists the functions generating the average trajectories of each cluster. If a single function is given, it is used for all groups. If several functions are given, the number of functions must correspond to the number of groups. |
functionNoise |
[list(function)]: list of functions generating the noise of each trajectory within its own cluster. If a single function is given, it is used for all groups. If several functions are given, the number of functions must correspond to the number of groups. |
decimal |
[numeric]: number of decimal places to which values are rounded. |
percentOfMissing |
[numeric]: proportion (between 0 and 1) of missing data generated in each cluster. If a single value is given, it is used for all groups. If several values are given, the number of values must correspond to the number of groups. |
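The "one value for all groups, or one value per group" rule described for functionClusters, functionNoise and percentOfMissing can be illustrated with a small base R sketch. The helper recycleToGroups is hypothetical and is not part of kml; it only shows one plausible way such an argument could be expanded, with the number of groups taken from the length of nbEachClusters.

```r
# Hypothetical helper, NOT part of kml: expand a per-group argument so that
# it has exactly one element per group.
recycleToGroups <- function(x, nbGroups) {
  if (is.function(x)) x <- list(x)      # a bare function becomes a one-element list
  if (length(x) == 1) {
    x <- rep(x, nbGroups)               # one value: reuse it for every group
  } else if (length(x) != nbGroups) {
    stop("expected 1 or ", nbGroups, " elements, got ", length(x))
  }
  x
}

nbEachClusters <- c(120, 10, 40)
nbGroups <- length(nbEachClusters)      # the length defines the number of groups

noise   <- recycleToGroups(function(t) rnorm(1, 0, 0.1), nbGroups)  # shared noise
missing <- recycleToGroups(c(0.5, 0, 0.25), nbGroups)               # one rate per group
shared  <- recycleToGroups(0.3, nbGroups)                           # same rate everywhere
```

Under this assumed rule, passing two values for three groups is an error rather than a silent recycling, which matches the requirement that the number of values correspond to the number of groups.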
generateArtificialLongData (gald in short) is a function that lets the user construct artificial trajectories. Each individual belongs to a group. Each group follows a theoretical trajectory that is a function of time; these functions (one per group) are given via the argument functionClusters. Within a group, each individual undergoes individual variations, given via the argument functionNoise. The number of individuals in each group is given by nbEachClusters. Finally, missing values can be added at random positions in the data via percentOfMissing.
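The generation scheme described above can be sketched in base R without the package. This is an illustration under stated assumptions, not the package's implementation: the variable traj and the way missing values are drawn (a fixed share of all cells, chosen uniformly at random, rather than per cluster) are choices made here for brevity.

```r
set.seed(1)  # make the random noise reproducible

time <- 0:7
nbEachClusters <- c(3, 2)                         # two small groups
functionClusters <- list(function(t) t,           # group 1: rising line
                         function(t) -t)          # group 2: falling line
functionNoise <- function(t) rnorm(1, 0, 0.1)     # individual variation
decimal <- 2
percentOfMissing <- 0.25

# Group membership of each individual: 1 1 1 2 2
group <- rep(seq_along(nbEachClusters), times = nbEachClusters)

# One row per individual, one column per time point:
# group mean trajectory plus a fresh noise draw at every time.
traj <- t(sapply(group, function(g) {
  sapply(time, function(t) functionClusters[[g]](t) + functionNoise(t))
}))
traj <- round(traj, decimal)

# Assumption: blank out a fixed share of all values, uniformly at random.
nMissing <- floor(length(traj) * percentOfMissing)
traj[sample(length(traj), nMissing)] <- NA
```

Each row of traj is then one artificial trajectory: rows belonging to the first group stay close to t, rows belonging to the second group stay close to -t, and a quarter of the cells are NA.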
An object of class ArtificialLongData.
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact: <genolini@u-paris10.fr>
Raphaël Ricaud
Laboratoire "Sport & Culture" / "Sports & Culture" Laboratory
University of Paris 10 / Nanterre
kml-package, ArtificialLongData
### Three diverging lines
ex1 <- generateArtificialLongData()
ex1
#plot(ex1)

### Three diverging lines with high variance
ex2 <- generateArtificialLongData(functionNoise=function(t){rnorm(1,0,3)})
ex2
#plot(ex2)

### Three diverging lines with unbalanced groups
ex3 <- generateArtificialLongData(nbEachClusters=c(120,10,40))
ex3
#plot(ex3)

### Three diverging lines with missing data
ex4 <- generateArtificialLongData(percentOfMissing=c(0.5,0,0.25))
ex4
#plot(ex4)

### Four strange functions
### (cluster names are listed in the same order as the functions)
ex5 <- generateArtificialLongData(
    name="Four strange functions",
    clusterNames=c("Line","Poly2","Sinus","Normal"),
    nbEachClusters=rep(100,4),
    functionClusters=list(
        function(t){-10+2*t},
        function(t){-0.6*t^2+6*t-7.5},
        function(t){10*sin(t)},
        function(t){30*dnorm(t,2,1.5)}
    ),
    functionNoise=function(t){rnorm(1,0,4)},
    time=0:10,decimal=2,percentOfMissing=0.3)
ex5
#plot(ex5)

### Here is a data set. Our objective is to find some clusters...
#layout(1)
#plot(ex5,color=c("black","no","no"))