partial.plot {randomForest} | R Documentation |
A partial dependence plot gives a graphical depiction of the marginal effect of a variable on the class probability (classification) or response (regression).
partial.plot(x, pred.data, x.var, which.class, add=FALSE, n.pt=min(length(unique(pred.data[, deparse(substitute(x.var))])), 51), rug=TRUE, ...)
x |
an object of class randomForest, which contains a forest component. |
pred.data |
a data frame used for constructing the plot, usually the training data used to construct the random forest. |
x.var |
name of the variable for which partial dependence is to be examined (can be either character or unquoted name). |
which.class |
For classification data, the class to focus on (default the first class). |
add |
whether to add to an existing plot (TRUE) or create a new plot (FALSE). |
n.pt |
if x.var is continuous, the number of points on the
grid for evaluating partial dependence. |
rug |
whether to draw hash marks at the bottom of the plot indicating the deciles of x.var. |
... |
other graphical parameters to be passed on to plot or lines. |
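As a rough illustration of the n.pt default and the rug argument, the following base R sketch computes the grid size and the deciles for one column of airquality. It assumes the grid is evenly spaced over the observed range of the variable; the names xv, n_pt, grid, and deciles are illustrative and not part of the package.

```r
# Sketch only: mimics the n.pt default and the decile rug described above.
data(airquality)
aq <- na.omit(airquality)
xv <- aq$Temp                      # the column that x.var would refer to

# Default n.pt: number of distinct values, capped at 51.
n_pt <- min(length(unique(xv)), 51)

# Assumption: the evaluation grid is evenly spaced over the observed range.
grid <- seq(min(xv), max(xv), length.out = n_pt)

# Deciles of the variable, i.e. where rug = TRUE would draw hash marks.
deciles <- quantile(xv, probs = seq(0.1, 0.9, by = 0.1))
```

Passing a smaller n.pt shortens the grid and reduces the number of predictions that must be computed.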
The function being plotted is defined as:
\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^n f(x, x_{iC}),
where x is the variable for which partial dependence is sought, and x_{iC} denotes the values of the other variables for the i-th observation in the data. The summand is the predicted regression function for regression, and the logit (i.e., the log of the fraction of votes) for which.class for classification:
f(x) = \log p_k(x) - \frac{1}{K} \sum_{j=1}^K \log p_j(x),
where K is the number of classes, k is which.class, and p_j is the proportion of votes for class j.
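The two formulas above can be sketched in base R as follows. This is not the package's implementation: partial_dependence and centered_logit are hypothetical helper names, predict_fun stands in for the fitted model's predictor, and a linear model is used in place of a random forest so that the example runs without extra packages. The sketch also does not guard against vote proportions of exactly zero, which would give -Inf.

```r
# Sketch only: average the predictions over the data for each grid value.
# `predict_fun` is a hypothetical stand-in for the fitted model's predictor.
partial_dependence <- function(predict_fun, data, var, grid) {
  sapply(grid, function(v) {
    modified <- data
    modified[[var]] <- v           # fix x at v, keep x_{iC} as observed
    mean(predict_fun(modified))    # (1/n) * sum_i f(v, x_{iC})
  })
}

# Centered logit for class k from a matrix of vote proportions
# (rows = observations, columns = classes).
centered_logit <- function(p, k) {
  log_p <- log(p)
  log_p[, k] - rowMeans(log_p)     # log p_k - (1/K) * sum_j log p_j
}

# Stand-in model: a linear regression instead of a random forest.
data(airquality)
aq <- na.omit(airquality)
fit <- lm(Ozone ~ ., data = aq)
grid <- seq(min(aq$Temp), max(aq$Temp), length.out = 10)
pd <- partial_dependence(function(d) predict(fit, newdata = d),
                         aq, "Temp", grid)

# Made-up vote proportions: two observations, three classes.
votes <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.1, 0.8), nrow = 2, byrow = TRUE)
cl <- centered_logit(votes, 1)
```

For the linear stand-in the resulting curve is a straight line whose slope is the Temp coefficient, which makes the averaging easy to check by hand. Each grid value requires a prediction for every row of the data, so the cost grows with both the grid size and the number of rows.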
A list with two components, x and y, which are the values used in the plot.
The randomForest object must contain the forest component; i.e., it must have been created with randomForest(..., keep.forest=TRUE).
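A quick way to check this requirement before plotting is to test whether the forest component is present. The sketch below assumes the randomForest package is installed; the object names rf_kept and rf_dropped are illustrative.

```r
# Sketch only: check whether a fitted object still carries its forest.
library(randomForest)

data(airquality)
aq <- na.omit(airquality)

set.seed(131)
rf_kept    <- randomForest(Ozone ~ ., aq, keep.forest = TRUE)
rf_dropped <- randomForest(Ozone ~ ., aq, keep.forest = FALSE)

# An object without a forest cannot be used for partial dependence.
has_forest_kept    <- !is.null(rf_kept$forest)
has_forest_dropped <- !is.null(rf_dropped$forest)
```

If the check fails, the model has to be refitted with keep.forest=TRUE, since the trees cannot be recovered from the remaining components.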
This function runs quite slowly for large data sets.
Andy Liaw andy_liaw@merck.com
Friedman, J. (2001). Greedy function approximation: a gradient boosting machine, Ann. of Stat.
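library(randomForest)

# Regression: partial dependence of Ozone on Temp
data(airquality)
airquality <- na.omit(airquality)
set.seed(131)
ozone.rf <- randomForest(Ozone ~ ., airquality)
partial.plot(ozone.rf, airquality, Temp)

# Classification: partial dependence on Petal.Width for class "versicolor"
data(iris)
set.seed(543)
iris.rf <- randomForest(Species ~ ., iris)
partial.plot(iris.rf, iris, Petal.Width, "versicolor")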