treeEval {chemometrics}    R Documentation

Classification tree evaluation by CV

Description

Evaluation for classification trees by cross-validation

Usage

treeEval(X, grp, train, kfold = 10, cp = seq(0.01, 0.1, by = 0.01), plotit = TRUE, 
   legend = TRUE, legpos = "bottomright", ...)

Arguments

X standardized complete X data matrix (training and test data)
grp factor with groups for complete data (training and test data)
train row indices of X indicating training data objects
kfold number of folds for cross-validation
cp range for tree complexity parameter, see rpart
plotit if TRUE a plot will be generated
legend if TRUE a legend will be added to the plot
legpos positioning of the legend in the plot
... additional plot arguments

Details

The data are split into a calibration set and a test set (the calibration set is defined by "train", the row indices of X). Within the calibration set, "kfold"-fold CV is performed: the classification method is applied to "kfold"-1 parts and evaluated on the remaining part, in turn for each part and for each value of the complexity parameter "cp". The misclassification error is then computed for the training data, for the CV test data (CV error), and for the test data.
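
For a single value of "cp", the CV part of this scheme can be sketched as follows. This is an illustration only, not the internal code of treeEval; X, grp and train are assumed to be available (e.g. constructed as in the Examples), and the names cp1, folds and cverr are chosen here for the sketch.

## Sketch of kfold-fold CV within the calibration set for one cp value
library(rpart)
set.seed(1)
cp1 <- 0.05                                   # one value from the cp range
Xcal <- X[train, , drop = FALSE]              # calibration (training) part
grpcal <- grp[train]
folds <- sample(rep(1:10, length.out = length(train)))    # kfold = 10
cverr <- numeric(10)
for (j in 1:10) {
  datj <- data.frame(grp = grpcal[folds != j], Xcal[folds != j, , drop = FALSE])
  treej <- rpart(grp ~ ., data = datj, method = "class", cp = cp1)
  newj <- data.frame(Xcal[folds == j, , drop = FALSE])
  predj <- predict(treej, newdata = newj, type = "class")
  cverr[j] <- mean(predj != grpcal[folds == j])           # misclassification rate
}
c(mean(cverr), sd(cverr)/sqrt(10))            # mean and standard error over folds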

Value

trainerr training error rate
testerr test error rate
cvMean mean of CV errors
cvSe standard error of CV errors
cverr all errors from CV
cp range for tree complexity parameter, taken from input
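
The returned CV errors can, for example, be used to choose the complexity parameter. A minimal sketch, assuming that cvMean and cvSe contain one value per element of cp and using the object "restree" from the Examples; the one-standard-error rule applied here (take the largest cp whose mean CV error is within one standard error of the minimum) is a common choice, not part of treeEval itself:

ind <- which.min(restree$cvMean)                     # cp with smallest CV error
thresh <- restree$cvMean[ind] + restree$cvSe[ind]    # one-standard-error bound
cpopt <- max(restree$cp[restree$cvMean <= thresh])   # largest cp within the bound
cpopt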

Author(s)

Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

References

K. Varmuza and P. Filzmoser: Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press. To appear.

See Also

rpart

Examples

data(fgl,package="MASS")
grp=fgl$type
X=scale(fgl[,1:9])
k=length(unique(grp))
dat=data.frame(grp,X)
n=nrow(X)
ntrain=round(n*2/3)
require(rpart)
set.seed(123)
train=sample(1:n,ntrain)
par(mar=c(4,4,3,1))
restree=treeEval(X,grp,train,cp=c(0.01,0.02:0.05,0.1,0.15,0.2:0.5,1))
title("Classification trees")

