randomVarImpsRF {varSelRF} | R Documentation |
Return variable importances from random forests fitted to data sets identical to the original, except that the class labels have been randomly permuted.
randomVarImpsRF(xdata, Class, forest, numrandom = 100, whichImp = "impsUnscaled", usingCluster = TRUE, TheCluster = NULL, ...)
xdata |
A data frame or matrix, with subjects/cases in rows and variables in columns. NAs not allowed. |
Class |
The dependent variable; must be a factor. |
forest |
A previously fitted random forest (see randomForest). |
numrandom |
The number of random permutations of the class labels. |
whichImp |
A vector of one or more of impsUnscaled, impsScaled, impsGini, which correspond, respectively, to the (unscaled) mean decrease in accuracy, the scaled mean decrease in accuracy, and the Gini index. See below, randomForest, importance, and the references for further explanations of the measures of variable importance. |
usingCluster |
If TRUE, use a cluster to parallelize the calculations. |
TheCluster |
The name of the cluster, if one is used. |
... |
Not used. |
The measure of variable importance most often used is based on the decrease in classification accuracy when the values of a variable in a node of a tree are permuted randomly (see references); we use the unscaled version (see our paper and supplementary material). Note that, by default, importance returns the scaled version.
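The difference between the three measures can be seen directly on a fitted forest. The following is a minimal sketch, assuming the randomForest package is installed; the simulated data and the object names (imp_unscaled, imp_scaled, imp_gini, imps) are illustrative only and are not part of varSelRF.

```r
library(randomForest)

set.seed(1)
# Simulated data: 45 cases, 30 variables; the first two variables
# separate class "A" from class "B"
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
cl <- factor(c(rep("A", 20), rep("B", 25)))

# importance = TRUE is needed for the permutation-based measures
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

# type = 1: mean decrease in accuracy; type = 2: mean decrease in Gini
imp_unscaled <- importance(rf, type = 1, scale = FALSE)
imp_scaled   <- importance(rf, type = 1)  # scale = TRUE is the default
imp_gini     <- importance(rf, type = 2)

# Side-by-side view of the three measures for the first few variables
imps <- cbind(unscaled = imp_unscaled[, 1],
              scaled   = imp_scaled[, 1],
              gini     = imp_gini[, 1])
head(imps)
```

These three columns presumably correspond to what whichImp calls impsUnscaled, impsScaled and impsGini, respectively.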
An object of class randomVarImpsRF, which is a list with one to three named components. The name of each component corresponds to the type of variable importance measure selected (i.e., impsUnscaled, impsScaled, impsGini). Each component is a matrix of dimensions number of variables by numrandom; each element (i,j) of this matrix is the variable importance for variable i and random permutation j.
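As an illustration of the returned structure, the sketch below requests two measures and then compares each variable's observed unscaled importance with its values under permuted labels. It assumes that varSelRF and randomForest are installed and that the rows of each component follow the column order of xdata. The summary prop_ge is a hypothetical, ad hoc quantity computed here for illustration; it is not something the package returns.

```r
library(randomForest)
library(varSelRF)

set.seed(2)
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
cl <- factor(c(rep("A", 20), rep("B", 25)))
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

# Ask for two of the three measures; no cluster is used
rvi <- randomVarImpsRF(x, cl, rf, numrandom = 20,
                       whichImp = c("impsUnscaled", "impsGini"),
                       usingCluster = FALSE)

names(rvi)             # one component per requested measure
dim(rvi$impsUnscaled)  # number of variables by numrandom

# Observed unscaled importances from the original forest
obs <- importance(rf, type = 1, scale = FALSE)[, 1]

# For each variable i, the proportion of permutations j whose
# importance is at least as large as the observed one
# (obs is recycled down the columns, so row i is compared with obs[i])
prop_ge <- rowMeans(rvi$impsUnscaled >= obs)
head(sort(prop_ge))
```

With only 20 permutations these proportions are coarse; a larger numrandom gives a finer picture at the cost of fitting more forests.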
Ramon Diaz-Uriarte rdiaz02@gmail.com
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32.
Diaz-Uriarte, R. and Alvarez de Andres, S. (2005) Variable selection from random forests: application to gene expression data. Tech. report. http://ligarto.org/rdiaz/Papers/rfVS/randomForestVarSel.html
Svetnik, V., Liaw, A., Tong, C. and Wang, T. (2004) Application of Breiman's random forest to modeling structure-activity relationships of pharmaceutical molecules. Pp. 334-343 in F. Roli, J. Kittler, and T. Windeatt (eds.). Multiple Classifier Systems, Fifth International Workshop, MCS 2004, Proceedings, 9-11 June 2004, Cagliari, Italy. Lecture Notes in Computer Science, vol. 3077. Berlin: Springer.
randomForest, varSelRF, varSelRFBoot, varSelImpSpecRF, randomVarImpsRFplot
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
cl <- factor(c(rep("A", 20), rep("B", 25)))
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)
rf.rvi <- randomVarImpsRF(x, cl, rf, numrandom = 20, usingCluster = FALSE)
randomVarImpsRFplot(rf.rvi, rf)