evaluate.learn {ofw} | R Documentation |
The e.632+ bootstrap error rate (Efron and Tibshirani, 1997) is estimated for ofw applied to CART or SVM. The function learn must be run first.
## S3 method for class 'learn':
evaluate(obj, maxvar=15, type=obj$type,
         nvar=if(obj$type=="CART") obj$nclass+1 else NULL,
         ntreeTest=if(obj$type=="CART") 100 else NULL,
         weight=FALSE, ...)
obj |
An object of class learn. |
maxvar |
Size of the evaluated variable selection. |
type |
Classifier used in the object of class learn. |
nvar |
If CART, number of randomly sampled variables in the selection that are used to construct each tree. Should be at least obj$nclass+1 to ensure generalizable trees. |
ntreeTest |
If CART, number of trees aggregated to evaluate the performance of ofwCART as each variable enters the selection. |
weight |
Should the weighting procedure be applied during the evaluation phase? |
... |
Not currently used. |
For data sets with a small number of samples (e.g. microarray data), the e.632+ bootstrap error seems appropriate to assess the performance of the algorithm. With CART, ntreeTest trees are aggregated because classification trees are unstable by nature. Furthermore, nvar variables are randomly sampled from the evaluated selection to construct each tree. This ensures a fair evaluation by avoiding the case where the ntreeTest trees would be built only from the 'best' variables among the selected variables.
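The random sampling of nvar variables per tree described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used by ofw: the helper name sample_tree_vars and the variable names in the usage lines are hypothetical, and sampling without replacement within a tree is an assumption.

```r
# Hypothetical helper (not part of ofw): for each of the ntreeTest trees,
# draw nvar variables at random from the currently evaluated selection.
sample_tree_vars <- function(selected, nvar, ntreeTest) {
  k <- min(nvar, length(selected))  # cannot draw more variables than are selected
  lapply(seq_len(ntreeTest), function(i) {
    # sample.int avoids the sample(x) pitfall when 'selected' has length 1
    selected[sample.int(length(selected), k)]
  })
}

# Usage with made-up variable names: 10 trees, 3 variables per tree
set.seed(1)
selected <- c("g1", "g2", "g3", "g4", "g5")
tree_vars <- sample_tree_vars(selected, nvar = 3, ntreeTest = 10)
```

Because each tree sees a different random subset, the aggregated prediction does not depend only on the top-ranked variables of the selection.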
An object of class evaluate, which is a list with the following components:
maxvar |
Size of the evaluated variable selection. |
nvar |
Number of randomly sampled variables in the selection that are used to construct each tree. |
weight.eval |
Was the weighting procedure applied during the evaluation step? |
weight.learn |
Was the weighting procedure applied during the learning step? |
ntreeTest |
If CART, number of trees aggregated as each variable enters the selection. |
matTrain |
An nsample by Bsample matrix indicating the training samples in each bootstrap sample. |
matProb |
An nvariable by Bsample matrix containing each learnt probability distribution. |
error |
The evaluated e.632+ bootstrap error as each variable enters the selection. |
sampleWeight |
If weight==TRUE, the n by Bsample matrix indicating each sample weight in each bootstrap sample. |
matPredInbag |
An nvariable by Bsample matrix indicating the prediction of the inbag samples. |
matPredTest |
An nvariable by Bsample matrix indicating the prediction of the test samples. |
The e.632+ code comes from the ipred package.
This type of evaluation should only be used to compare several methods and not to assess the performance of only one method.
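For orientation, the e.632+ estimate of Efron and Tibshirani (1997) combines the resubstitution error with the error measured on samples left out of each bootstrap sample. The sketch below is an independent illustration of that formula, not the ipred or ofw code: the function name e632plus and its arguments are hypothetical, and the left-out bootstrap error is assumed to have been computed beforehand.

```r
# Hypothetical helper (not part of ofw or ipred): e.632+ error estimate.
#   y          true class labels
#   pred_resub predictions for the same samples from a model fit on all data
#   err_oob    error rate on samples left out of the bootstrap samples
e632plus <- function(y, pred_resub, err_oob) {
  y <- factor(y)
  n <- length(y)
  # Resubstitution (apparent) error
  err_resub <- mean(as.character(pred_resub) != as.character(y))
  # No-information error rate from observed (p) and predicted (q) class proportions
  p <- as.numeric(table(y)) / n
  q <- as.numeric(table(factor(pred_resub, levels = levels(y)))) / n
  gamma <- sum(p * (1 - q))
  # The left-out error is capped at the no-information rate
  err1 <- min(err_oob, gamma)
  # Relative overfitting rate, set to 0 when it is not well defined
  R <- if (err1 > err_resub && gamma > err_resub) {
    (err1 - err_resub) / (gamma - err_resub)
  } else {
    0
  }
  w <- 0.632 / (1 - 0.368 * R)  # weight moves from 0.632 to 1 as R goes from 0 to 1
  (1 - w) * err_resub + w * err1
}

# Usage: two balanced classes, perfect resubstitution, 25% left-out error
y <- c("a", "a", "b", "b")
est <- e632plus(y, pred_resub = y, err_oob = 0.25)
```

With no overfitting (R = 0) the result reduces to the plain .632 estimate; with maximal overfitting (R = 1) it equals the capped left-out error.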
Kim-Anh Lê Cao Kim-Anh.Le-Cao@toulouse.inra.fr
Patrick Chabrier Patrick.Chabrier@toulouse.inra.fr
Efron, B. and Tibshirani, R.J. (1997), Improvements on cross-validation: the e.632+ bootstrap method, Journal of the American Statistical Association 92, 548-560.
Lê Cao, K.-A., Gonçalves, O., Besse, P. and Gadat, S. (2007), Selection of biologically relevant genes with a wrapper stochastic algorithm, Statistical Applications in Genetics and Molecular Biology, Vol. 6, Iss. 1, Article 29.
## On data set "srbct"
#data(srbct)
#attach(srbct)
#learn.boot.cart <- learn(srbct, as.factor(class), type="CART", ntree=50, nforest=100, mtry=5, Bsample=3)
#eval.boot.cart <- evaluate(learn.boot.cart, ntreeTest=50, maxvar=10)
#plot(eval.boot.cart$error, type="l")
#learn.boot.svm <- learn(srbct, as.factor(class), type="SVM", nsvm=500, mtry=5, Bsample=3)
#eval.boot.svm <- evaluate(learn.boot.svm, maxvar=10)
#plot(eval.boot.svm$error, type="l")
#detach(srbct)