msc.classifier.test {caMassClass}    R Documentation

Test a Classifier through Cross-validation

Description

Test a classifier through cross-validation. Provides a common interface for cross-validating several standard classifiers, including optional feature selection and feature scaling steps. It also allows specifying that some test samples are multiple copies of the same sample and should therefore receive the same label.

Usage

 msc.classifier.test( X, Y, iters=50, SplitRatio=2/3, verbose=FALSE,
     RemCorrCol=0, KeepCol=0, prior=1, same.sample=NULL,
     ScaleType=c("none", "min-max", "avr-std", "med-mad"),
     method=c("svm", "nnet", "lda", "qda", "LogitBoost", "rpart"), ...) 

Arguments

X A matrix or data frame with training/testing data. Rows contain samples and columns contain features/variables.
Y Class labels for the training data samples. A response vector with one label for each row of X. Can be a factor, a character vector, or a numeric vector. Samples whose labels are NA are treated as the test data set.
iters Number of iterations. Each iteration consists of splitting the data into training and test sets, performing the classification, and storing the results.
SplitRatio Splitting ratio used to divide the available data during cross-validation (see the sketch after this argument list):
  • if (0<=SplitRatio<1) then a SplitRatio fraction of the samples is used for training and the rest for validation.
  • if (SplitRatio==1) then leave-one-out cross-validation is performed: all but one sample are used for training, and validation is done on the single remaining sample in each iteration.
  • if (SplitRatio>1) then SplitRatio samples are used for training and the rest for validation.
RemCorrCol See msc.classifier.run.
KeepCol See msc.classifier.run.
ScaleType See msc.classifier.run.
prior See msc.classifier.run.
same.sample See msc.classifier.run.
method See msc.classifier.run.
verbose Boolean flag; if TRUE, debugging printouts are turned on.
... Additional parameters to be passed to classifiers. See method for suggestions.
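
The three SplitRatio regimes could be selected as follows; this is only an illustrative sketch reusing the iris call from the Examples section, and the exact accuracies will vary with the random splits.

  ## Illustration only: same data and classifier as in the Examples section.
  data(iris)
  a1 = msc.classifier.test(iris[,-5], iris[,5], iters=10, SplitRatio=0.75,
                           method="LogitBoost", nIter=2)  # 75% of samples train
  a2 = msc.classifier.test(iris[,-5], iris[,5], iters=10, SplitRatio=1,
                           method="LogitBoost", nIter=2)  # leave-one-out
  a3 = msc.classifier.test(iris[,-5], iris[,5], iters=10, SplitRatio=100,
                           method="LogitBoost", nIter=2)  # 100 samples train
  sapply(list(a1, a2, a3), function(a) mean(a$Res))       # average accuracies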

Details

This function follows standard cross-validation steps: in each iteration the data are split into training and test sets (controlled by SplitRatio), optional feature selection and feature scaling are applied (see RemCorrCol, KeepCol and ScaleType), the chosen classifier is trained on the training set, labels of the test set are predicted, and the fraction of correct predictions is stored. These steps are repeated iters times.
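
Conceptually, each iteration does something like the sketch below. This is a simplified illustration, not the package's actual implementation; the real work of feature selection, scaling, and classification is delegated to msc.classifier.run, and 'train.and.predict' here is only a hypothetical stand-in for that step.

  ## Simplified sketch of the cross-validation loop (illustration only).
  cv.sketch = function(X, Y, iters=50, SplitRatio=2/3, train.and.predict) {
    n   = nrow(X)
    Res = numeric(iters)
    for (i in 1:iters) {
      nTrn = if (SplitRatio < 1) round(SplitRatio*n) else
             if (SplitRatio == 1) n-1 else SplitRatio    # interpret SplitRatio
      trn  = sample(n, nTrn)                             # random train/test split
      pred = train.and.predict(X[trn,], Y[trn], X[-trn,])
      Res[i] = mean(pred == Y[-trn])                     # fraction correct
    }
    Res                                                  # one accuracy per iteration
  }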

Value

Y Predicted class labels. If any unknown samples were present in the input data (marked by NA in the input Y), then the output Y holds predictions only for those samples; otherwise predictions are returned for all samples.
Res Fraction of correct predictions in each cross-validation iteration. mean(Res) gives the average accuracy.
Tabl Contingency table comparing input labels with predicted labels.
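
For example, when some input labels are NA the returned Y covers only those samples. The snippet below is a small illustration; which rows are hidden is arbitrary.

  ## Hide the labels of 10 randomly chosen samples; predictions for exactly
  ## those samples are then returned in $Y.
  data(iris)
  Y = iris[,5]
  unknown = sample(nrow(iris), 10)
  Y[unknown] = NA
  A = msc.classifier.test(iris[,-5], Y, method="LogitBoost", nIter=2)
  length(A$Y)   # 10 - one prediction per NA-labeled sample
  mean(A$Res)   # average cross-validation accuracy on the labeled samples
  A$Tabl        # contingency table of input vs. predicted labels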

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

See Also

msc.classifier.run

Examples

  data(iris)
  A = msc.classifier.test(iris[,-5],iris[,5], method="LogitBoost", nIter=2) 
  print(A)
  cat("correct classification in",100*mean(A$Res),"+-",100*sd(A$Res),"percent of cases\n")
  stopifnot( 100*mean(A$Res) > 89 )  # sanity check: expect roughly 89%+ accuracy
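
  ## A further illustrative call (settings chosen only for demonstration, not
  ## part of the original example): leave-one-out cross-validation combined
  ## with median/MAD feature scaling.
  B = msc.classifier.test(iris[,-5], iris[,5], iters=20, SplitRatio=1,
        ScaleType="med-mad", method="LogitBoost", nIter=2)
  cat("leave-one-out accuracy:", 100*mean(B$Res), "percent\n")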

[Package caMassClass version 1.6 Index]