msc.classifier.run {caMassClass} | R Documentation |
Common interface for training and testing several standard classifiers. Includes feature selection and feature scaling steps. The caller can specify that some test samples are multiple copies of the same sample, in which case all copies are assigned the same label.
msc.classifier.run( xtrain, ytrain, xtest, ret.prob=FALSE,
    RemCorrCol=0, KeepCol=0, prior=1, same.sample=NULL,
    ScaleType=c("none", "min-max", "avr-std", "med-mad"),
    method=c("svm", "nnet", "lda", "qda", "LogitBoost", "rpart"), ...)
xtrain |
A matrix or data frame with training data. Rows contain samples and columns contain features/variables |
ytrain |
Class labels for the training data samples: a response vector with one label for each row of xtrain.
Can be a factor, a character vector, or a numeric vector. |
xtest |
A matrix or data frame with test data. Rows contain samples and columns contain features/variables |
ret.prob |
If set to TRUE, then the a posteriori probabilities for each class are returned as an attribute called "probabilities". |
same.sample |
Optional parameter specifying that some (or all) test samples have multiple copies, which should be used together to predict a single label for all of them. Can be a factor, a character vector, or a numeric vector, with unique values for different samples and identical values for copies of the same sample. |
RemCorrCol |
If non-zero, then some of the highly correlated columns are
removed using the msc.features.remove function with
ccMin=RemCorrCol. |
KeepCol |
If non-zero, then columns with low AUC are removed.
|
ScaleType |
Optional parameter. If provided, the following types are
recognized: "none", "min-max", "avr-std", "med-mad".
|
prior |
Class weights. The following types are recognized
|
method |
Classifier to be used. The following ones are recognized:
"svm", "nnet", "lda", "qda", "LogitBoost", "rpart".
Classifier-specific parameters can be passed through ...
|
... |
Additional parameters to be passed to classifiers. See
method for suggestions. |
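The scaling options named by ScaleType can be illustrated with a minimal base-R sketch. The meaning given to each name here (min-max as range scaling, avr-std as mean and standard deviation, med-mad as median and median absolute deviation) is an assumption inferred from the names alone, and scale_columns is a hypothetical helper, not the code used by msc.features.scale. The sketch estimates the scaling parameters on the training data and applies them unchanged to the test data.

```r
# Hypothetical helper, not part of caMassClass.
# Assumed meanings: "min-max" = (x - min) / (max - min),
#                   "avr-std" = (x - mean) / sd,
#                   "med-mad" = (x - median) / mad
scale_columns <- function(xtrain, xtest,
                          type = c("none", "min-max", "avr-std", "med-mad")) {
  type   <- match.arg(type)
  xtrain <- as.matrix(xtrain)
  xtest  <- as.matrix(xtest)
  if (type == "none") return(list(xtrain = xtrain, xtest = xtest))
  # per-column centre and spread, estimated on the training data only
  centre <- switch(type,
    "min-max" = apply(xtrain, 2, min),
    "avr-std" = colMeans(xtrain),
    "med-mad" = apply(xtrain, 2, median))
  spread <- switch(type,
    "min-max" = apply(xtrain, 2, max) - centre,
    "avr-std" = apply(xtrain, 2, sd),
    "med-mad" = apply(xtrain, 2, mad))  # mad() uses R's default constant 1.4826
  spread[spread == 0] <- 1              # leave constant columns unscaled
  list(xtrain = sweep(sweep(xtrain, 2, centre, "-"), 2, spread, "/"),
       xtest  = sweep(sweep(xtest,  2, centre, "-"), 2, spread, "/"))
}

data(iris)
s <- scale_columns(iris[1:100, -5], iris[101:150, -5], type = "min-max")
range(s$xtrain)   # every training column now spans 0 to 1
```

Test values may fall outside the training range after scaling, because the parameters come from the training data alone.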
This function performs the following steps:
1. Perform feature selection using the msc.features.select function.
2. Perform feature scaling using the msc.features.scale function.
3. Train the classifier using xtrain and ytrain.
4. Predict labels for xtest using the trained model.
5. If the same.sample variable is given, then synchronize the predicted labels so that all copies of the same sample return the same label.
6. If ret.prob=TRUE, then return a posteriori probabilities as well.
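The label synchronization step can be sketched in base R as follows. This page does not state which rule the function uses to pick the common label, so the majority vote shown here is an assumption, and sync_labels is a hypothetical helper rather than part of caMassClass.

```r
# Hypothetical helper, not part of caMassClass.
# Assumption: copies of one sample are reconciled by majority vote.
sync_labels <- function(pred, same.sample) {
  pred   <- as.character(pred)
  groups <- as.character(same.sample)
  # most frequent label within each group of copies
  # (ties go to the label that sorts first in table())
  winner <- vapply(split(pred, groups),
                   function(p) names(which.max(table(p))),
                   character(1))
  unname(winner[groups])   # broadcast the winning label back to every copy
}

pred        <- c("a", "b", "a", "c", "c", "b")   # raw per-copy predictions
same.sample <- c( 1,   1,   1,   2,   2,   3)    # copies share an id
sync_labels(pred, same.sample)
# returns "a" "a" "a" "c" "c" "b"
```

The second copy of sample 1 was predicted as "b" but is overruled by the two "a" votes, so all three copies end up with the same label.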
Predicted class labels for each sample in xtest.
If ret.prob=TRUE, then the a posteriori probabilities of each sample
belonging to each class are returned as an attribute called "probabilities".
The returned probabilities do not take into account the same.sample
variable used to synchronize the predicted labels.
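Reading the attribute back uses the standard attr accessor. The sketch below builds a mock result by hand so that it runs without the package; the layout of the probability matrix (one row per sample, one column per class) and the numbers in it are assumptions made for illustration only.

```r
# Mock return value built by hand (illustrative numbers, not real output).
# With the package loaded, the object would instead come from a call such as
#   labels <- msc.classifier.run(xtrain, ytrain, xtest, ret.prob = TRUE)
labels <- factor(c("setosa", "virginica", "setosa"))
attr(labels, "probabilities") <- matrix(
  c(0.90, 0.10,
    0.20, 0.80,
    0.55, 0.45),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("setosa", "virginica")))  # assumed layout

prob <- attr(labels, "probabilities")           # extract the attribute
top  <- colnames(prob)[apply(prob, 1, which.max)]  # most probable class per row
top
# returns "setosa" "virginica" "setosa"
```

Because the probabilities ignore same.sample, the most probable class computed this way can differ from the synchronized label returned for a copy of a repeated sample.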
This function is not fully tested and might be changed in future versions.
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
msc.classifier.test function.
tune function from the e1071 package.
msc.features.select and msc.features.scale functions.
svm, nnet, LogitBoost, lda, qda, rpart
library(caMassClass)  # provides msc.classifier.run
library(caTools)      # provides sample.split
data(iris)
mask   = sample.split(iris[,5], SplitRatio=1/4) # very few points to train
xtrain = iris[ mask,-5]  # use output of sample.split to ...
xtest  = iris[!mask,-5]  # create train and test subsets
ytrain = iris[ mask, 5]
ytest  = iris[!mask, 5]
# confusion tables; predictions are made on the training set itself
table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="svm") )
table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="nnet") )
table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="lda") )
table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="qda") )
table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="LogitBoost") )
# all training samples must be classified correctly
a=table(ytrain, msc.classifier.run(xtrain,ytrain,xtrain, method="LogitBoost") )
stopifnot( sum(diag(a))==length(ytrain) )