disstree {TraMineR} | R Documentation |
Tree-structured discrepancy analysis of non-measurable objects described by their pairwise dissimilarities.
disstree(formula, data = NULL, minSize = 0.05, maxdepth = 5, R = 1000, pval = 0.01)
formula |
A formula whose left-hand side is a dissimilarity matrix and whose right-hand side lists the candidate variables used to partition the population |
data |
A data frame in which the variables named in formula are looked up |
minSize |
Minimum number of cases in a node; a value less than 1 is interpreted as a proportion of the total number of cases. |
maxdepth |
maximum depth of the tree |
R |
Number of permutations used to assess the significance of the split. |
pval |
Maximum p-value allowed for a split to be retained, expressed as a proportion (the default 0.01 corresponds to 1%) |
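The procedure splits the data iteratively. At each step, it selects the variable and the split that explain the largest share of the discrepancy, i.e. the split with the highest pseudo R2. The significance of the selected split is then assessed through a permutation test based on R permutations, and the split is retained only if its p-value does not exceed pval. The minSize and maxdepth arguments further limit the growth of the tree.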
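The split criterion and its permutation test can be illustrated with a minimal base R sketch. It is not the TraMineR implementation: the helper names diss_ss, pseudo_r2 and perm_pvalue are hypothetical, and the sketch assumes that the discrepancy sum of squares is computed from unsquared dissimilarities as SS = (1 / (2n)) * sum of all d_ij, with the pseudo R2 defined as the share of the total sum of squares that is not within groups.

```r
## Sum of squares derived from a full symmetric dissimilarity matrix
## (assumption: unsquared dissimilarities, SS = sum(d) / (2 * n))
diss_ss <- function(d) sum(d) / (2 * nrow(d))

## Pseudo R2 of a grouping: 1 - SS_within / SS_total
pseudo_r2 <- function(d, group) {
  ss_total <- diss_ss(d)
  ss_within <- sum(vapply(
    split(seq_len(nrow(d)), group),
    function(idx) diss_ss(d[idx, idx, drop = FALSE]),
    numeric(1)
  ))
  1 - ss_within / ss_total
}

## Permutation p-value: share of permuted groupings whose pseudo R2
## is at least as large as the observed one
perm_pvalue <- function(d, group, R = 1000) {
  observed <- pseudo_r2(d, group)
  permuted <- replicate(R, pseudo_r2(d, sample(group)))
  (sum(permuted >= observed) + 1) / (R + 1)
}

## Toy data: two groups of 20 points with clearly different means
set.seed(1)
x <- c(rnorm(20, mean = 0), rnorm(20, mean = 3))
d <- as.matrix(dist(x))
group <- rep(c("a", "b"), each = 20)

r2 <- pseudo_r2(d, group)
p <- perm_pvalue(d, group, R = 199)
```

In this sketch, a candidate split would be kept when p does not exceed the chosen threshold, which plays the role of the pval argument.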
An object of class disstree that contains the following components:
root |
A node object (see below), root of the tree |
adjustment |
A dissassoc object |
formula |
The formula used to generate the tree |
A node object contains the following components:
split |
Selected predictor, NULL for terminal nodes |
vardis |
Node discrepancy, see dissvar |
children |
Child nodes, NULL for terminal nodes |
ind |
Index of individuals in this node |
depth |
Depth of the node, starting from root node |
label |
Node label |
R2 |
R squared of the split, NULL for terminal nodes |
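The node structure described above lends itself to recursive traversal. The following sketch collects the terminal nodes and derives a leaf membership for each individual. It runs on a small mock tree built only from the documented fields; the helper names collect_leaves and leaf are hypothetical, and the sketch assumes that children is a list of node objects and that ind holds integer row indices.

```r
## Hypothetical helper: return the terminal nodes below (and including) a node
collect_leaves <- function(node) {
  if (is.null(node$children)) return(list(node))
  do.call(c, lapply(node$children, collect_leaves))
}

## Hypothetical constructor for a mock terminal node
leaf <- function(ind, depth, label) {
  list(split = NULL, children = NULL, ind = ind, depth = depth, label = label)
}

## Mock tree with the documented fields (not produced by disstree)
mock_root <- list(
  split = "male", ind = 1:6, depth = 1, label = "root",
  children = list(
    leaf(1:2, 2, "male = no"),
    list(
      split = "Grammar", ind = 3:6, depth = 2, label = "male = yes",
      children = list(
        leaf(3:4, 3, "Grammar = no"),
        leaf(5:6, 3, "Grammar = yes")
      )
    )
  )
)

leaves <- collect_leaves(mock_root)

## Assign to each individual the position of its terminal node
membership <- integer(6)
for (k in seq_along(leaves)) membership[leaves[[k]]$ind] <- k
```

With a fitted tree, the same function could be tried on the root component, provided the assumptions above hold for the installed version of the package.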
Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2009). Analyse de dissimilarités par arbre d'induction. Revue des Nouvelles Technologies de l'Information, EGC'2009.
Batagelj, V. (1988). Generalized Ward and related clustering problems. In H. Bock (Ed.), Classification and related methods of data analysis, pp. 67-74. North-Holland, Amsterdam.
Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology 26, 32-46.
Piccarreta, R. and F. C. Billari (2007). Clustering work and family trajectories by using a divisive algorithm. Journal of the Royal Statistical Society A 170(4), 1061-1078.
seqtree2dot to generate a graphical representation of disstree objects when analyzing state sequences.
disstree2dot is a more general interface to generate such representations.
dissvar to compute discrepancy from dissimilarities and for a basic introduction to discrepancy analysis.
dissassoc to test the association between objects represented by their dissimilarities and a covariate.
dissmfac to perform multi-factor analysis of variance from pairwise dissimilarities.
disscenter to compute the distance of each object to the center of its group from pairwise dissimilarities.
data(mvad)

## Defining a state sequence object
mvad.seq <- seqdef(mvad[, 17:86])

## Computing dissimilarities
mvad.lcs <- seqdist(mvad.seq, method="LCS")
dt <- disstree(mvad.lcs ~ male + Grammar + funemp + gcse5eq + fmpr + livboth,
    data=mvad, R = 10)
print(dt)

## Using simplified interface to generate a file for GraphViz
seqtree2dot(dt, "mvadseqtree", seqdata=mvad.seq, type="d", border=NA,
    withlegend=FALSE, axes=FALSE, ylab="", yaxis=FALSE)

## Generating a file for GraphViz
disstree2dot(dt, "mvadtree", imagefunc=seqdplot, imagedata=mvad.seq,
    ## Additional parameters passed to seqdplot
    withlegend=FALSE, axes=FALSE, ylab="")

## Second method, using a specific function
myplotfunction <- function(individuals, seqs, ...) {
    par(font.sub=2, mar=c(3,0,6,0), mgp=c(0,0,0))
    ## Ordering the sequences of the node in seqiplot
    ## according to a one-dimensional MDS of their LCS distances
    mds <- cmdscale(seqdist(seqs[individuals,], method="LCS"), k=1)
    seqiplot(seqs[individuals,], sortv=mds, ...)
}

## Generating a file for GraphViz
## If imagedata is not set, the indices of individuals are sent to imagefunc
disstree2dot(dt, "mvadtree", imagefunc=myplotfunction, title.cex=3,
    ## Additional parameters passed to myplotfunction
    seqs=mvad.seq,
    ## Additional parameters passed to seqiplot (through myplotfunction)
    withlegend=FALSE, axes=FALSE, tlim=0, space=0, ylab="", border=NA)

## To run GraphViz (dot) from R and generate an "svg" file
## shell("dot -Tsvg -O mvadtree.dot")