find.interaction {randomSurvivalForest} | R Documentation |
Test for pairwise interactions between variables by comparing pairwise importance values to additive individual importance values.
find.interaction(object, predictorNames = NULL, sorted = TRUE, npred = NULL, subset = NULL, nrep = 1, rough = FALSE, importance = c("randomsplit", "permute")[1], ...)
object |
An object of class (rsf, grow) or (rsf, forest). Note: forest=TRUE must be used in the original rsf call. |
predictorNames |
Character vector of variable names to be considered. Default is to use all variables. |
sorted |
Should variables be sorted by importance values? Only applies when predictorNames=NULL. |
npred |
Use the first npred variables as ordered by VIMP (only applies when predictorNames=NULL). Default uses all variables. |
subset |
An index vector indicating which rows should be used. Default is to use all the data. |
nrep |
Number of Monte Carlo replicates. |
rough |
Logical value indicating whether fast approximation should be used. Default is FALSE. |
importance |
Method used to compute variable importance (VIMP): "randomsplit" (the default) or "permute". |
... |
Further arguments passed to or from other methods. |
Using a previously grown forest, identify pairwise interactions for all pairs of variables from a specified list. Two variables are paired and their paired VIMP is calculated (referred to as 'Paired' importance). The VIMP for each separate variable is also calculated. The sum of these two values is referred to as 'Additive' importance. A large positive or negative difference between 'Paired' and 'Additive' indicates an association worth pursuing if the VIMPs for each variable are reasonably large (Ishwaran, 2007).
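The comparison described above can be sketched in a few lines of base R. The VIMP numbers and pair labels below are made up for illustration and are not output of find.interaction; the column name 'Difference' is an assumption introduced here.

```r
# Hypothetical VIMP values for three variable pairs (illustration only).
vimp.a <- c(0.050, 0.050, 0.030)   # VIMP of the first variable in each pair
vimp.b <- c(0.030, 0.020, 0.020)   # VIMP of the second variable in each pair
paired <- c(0.081, 0.120, 0.049)   # 'Paired' importance of the two variables together

additive   <- vimp.a + vimp.b      # 'Additive' importance: sum of individual VIMPs
difference <- paired - additive    # large absolute values are worth a closer look

tbl <- cbind(Paired = paired, Additive = additive, Difference = difference)
rownames(tbl) <- c("x1:x2", "x1:x3", "x2:x3")
print(round(tbl, 3))
```

In this made-up table only the pair x1:x3 shows a difference that is large relative to the individual VIMPs; the other two pairs are essentially additive.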
Depending on the size of the data, computations might be slow. If that is the case, users should consider setting npred to a smaller number or restricting the analysis to a subset of the data.
If nrep is greater than 1, the analysis is repeated nrep times and the results are averaged over the replications.
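The averaging over replicates can be pictured with the base R sketch below. It is not the package's code: one.replicate is a hypothetical stand-in that returns a small table of noisy 'Paired' and 'Additive' values, and the element-wise mean is one plausible reading of "averaged over the replications".

```r
set.seed(1)
nrep <- 3

# Hypothetical stand-in for a single run of the analysis (illustration only).
one.replicate <- function() {
  paired   <- c(0.12, 0.05) + rnorm(2, sd = 0.01)
  additive <- c(0.07, 0.05) + rnorm(2, sd = 0.01)
  cbind(Paired = paired, Additive = additive)
}

# Run the analysis nrep times, then take the element-wise mean of the tables.
reps <- lapply(seq_len(nrep), function(i) one.replicate())
avg  <- Reduce(`+`, reps) / nrep
rownames(avg) <- c("x1:x2", "x1:x3")
print(round(avg, 3))
```

Averaging smooths out the Monte Carlo noise in each run, at the cost of roughly nrep times the computation.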
find.interaction calls the lower-level function interaction.rsf. For programming purposes only, users may consider calling interaction.rsf directly.
Invisibly, the interaction table.
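Because the value is returned invisibly, it has to be assigned to be kept for later use. The base R sketch below shows the general mechanism with a hypothetical stand-in function, demo.table, and made-up numbers; it does not call the package.

```r
# Hypothetical function that returns a table invisibly (illustration only).
demo.table <- function() {
  tbl <- matrix(c(0.10, 0.04), nrow = 1,
                dimnames = list("x1:x2", c("Paired", "Additive")))
  invisible(tbl)
}

demo.table()          # the return value is not auto-printed at top level
res <- demo.table()   # assignment captures the table
print(res)            # explicit printing works as usual
```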
Hemant Ishwaran hemant.ishwaran@gmail.com and Udaya B. Kogalur ubk2101@columbia.edu
H. Ishwaran (2007). Variable importance in binary regression trees and forests, Electronic J. Statist., 1:519-537.
interaction.rsf.
data(veteran, package = "randomSurvivalForest")
v.out <- rsf(Survrsf(time, status) ~ ., veteran, ntree = 1000, forest = TRUE)
find.interaction(v.out, npred = 2, nrep = 1)

## Not run:
# All pairwise interactions: PBC data.
# Use fast approximation to speed up computations.
data(pbc, package = "randomSurvivalForest")
rsf.out <- rsf(Survrsf(days, status) ~ ., pbc, ntree = 1000, forest = TRUE)
find.interaction(rsf.out, nrep = 3, rough = TRUE)
## End(Not run)