classificationError {optBiomarker} | R Documentation |
Estimates misclassification errors (generalisation errors), sensitivity and specificity using cross-validation,
bootstrap and .632+ bias-corrected bootstrap estimators, based on the Random Forest,
Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour classification methods.
## S3 method for class 'data.frame':
classificationError(formula, data,
    method = c("RF", "SVM", "LDA", "KNN"),
    errorType = c("cv", "boot", "six32plus"),
    senSpec = TRUE, negLevLowest = TRUE, na.action = na.omit,
    control = control.errorest(k = NROW(na.action(data)), nboot = 100),
    ...)
formula |
A formula of the form lhs ~ rhs relating the response (class)
variable to the explanatory variables. See lm for
more detail. |
data |
A data frame containing the response (class membership) variable and the explanatory variables in the formula. |
method |
A character vector of length 1 to 4 representing the classification
methods to be used. Can be one or more of "RF" (Random Forest), "SVM"
(Support Vector Machines), "LDA" (Linear Discriminant Analysis) and "KNN"
(k-Nearest Neighbour). Defaults to all four methods. |
errorType |
A character vector of length 1 to 3 representing the type of
estimators to be used for computing misclassification errors.
Can be one or more of "cv" (cross-validation), "boot"
(bootstrap) and "six32plus" (.632+ bias-corrected bootstrap).
Defaults to all three estimators. |
senSpec |
Logical. Should sensitivity and specificity be computed (available for the
cross-validation estimator only)? Defaults to TRUE. |
negLevLowest |
Logical. Does the lowest of the ordered levels of the class variable represent
the negative control? Defaults to TRUE. |
na.action |
Function which indicates what should happen when the data
contain NAs. Defaults to na.omit. |
control |
Control parameters of the function errorest. |
... |
Additional parameters passed to method. |
In the current version of the package, estimation of sensitivity and
specificity is limited to the cross-validation estimator. For LDA, the
sample size must be greater than the number of explanatory variables to
avoid singularity. The function classificationError does not
check whether this condition is satisfied, but the underlying function
lda produces warnings if it is violated.
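The LDA condition above can be checked by the caller before "LDA" is requested. The following is a minimal sketch in base R, not part of optBiomarker: the helper name ldaSampleSizeOK and its response argument are hypothetical, and it assumes that every column other than the response is a numeric explanatory variable and that incomplete rows are dropped by na.omit, mirroring the default na.action.

```r
# Hypothetical helper (NOT part of optBiomarker): returns TRUE when the
# number of complete cases exceeds the number of explanatory variables.
ldaSampleSizeOK <- function(data, response = "class", na.action = na.omit) {
  stopifnot(response %in% names(data))
  cleaned <- na.action(data)                                # drop incomplete rows
  nPredictors <- length(setdiff(names(cleaned), response))  # all non-response columns
  NROW(cleaned) > nPredictors
}

# Toy data (made up for illustration): 5 samples, 3 numeric predictors
toy <- data.frame(class = factor(c("a", "b", "a", "b", "a")),
                  x1 = c(1, 2, 3, 4, 5),
                  x2 = c(2, 1, 4, 3, 6),
                  x3 = c(5, 3, 1, 2, 4))

okToy   <- ldaSampleSizeOK(toy)          # 5 samples against 3 predictors
okSmall <- ldaSampleSizeOK(toy[1:3, ])   # 3 samples against 3 predictors
```

When the check fails, one option is to leave "LDA" out of the method argument for that data set.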
Returns an object of class classificationError
with components
call |
The call of the classificationError function. |
errorRate |
A length(errorType) by length(method)
matrix of classification errors. |
rocData |
A 2 by length(method) matrix of
sensitivities (first row) and specificities (second row). |
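To make the layout of rocData concrete, the following base R sketch computes sensitivity and specificity for a made-up two-class example in which the lowest factor level is treated as the negative control, as with negLevLowest = TRUE. It illustrates the standard definitions only; how classificationError computes these quantities internally is not described on this page, and the data, the column name "toy" and the row names are assumptions made for the illustration.

```r
# Illustration only: made-up true classes and predictions for 7 samples.
truth <- factor(c("ctrl", "ctrl", "ctrl", "case", "case", "case", "case"),
                levels = c("ctrl", "case"))
pred  <- factor(c("ctrl", "case", "ctrl", "case", "case", "ctrl", "case"),
                levels = c("ctrl", "case"))

negLevel <- levels(truth)[1]   # lowest level is taken as the negative control
posLevel <- levels(truth)[2]

# Sensitivity: proportion of true positives among all positives
sensitivity <- sum(pred == posLevel & truth == posLevel) / sum(truth == posLevel)
# Specificity: proportion of true negatives among all negatives
specificity <- sum(pred == negLevel & truth == negLevel) / sum(truth == negLevel)

# Same shape as rocData for a single method:
# sensitivity in the first row, specificity in the second
rocLike <- matrix(c(sensitivity, specificity), nrow = 2,
                  dimnames = list(c("sensitivity", "specificity"), "toy"))
```

With several methods, each method would contribute one such column.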
Mizanur Khondoker, Till Bachmann, Peter Ghazal
Maintainer: Mizanur Khondoker mizanur.khondoker@googlemail.com.
Breiman, L. (2001). Random Forests. Machine Learning 45(1), 5–32.
Chang, C.-C. and Lin, C.-J. LIBSVM: a library for Support Vector Machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
Efron, B. and Tibshirani, R. (1997). Improvements on Cross-Validation: The .632+ Bootstrap Estimator. Journal of the American Statistical Association 92(438), 548–560.
mydata <- simData(nTrain = 30, nBiom = 3)
classificationError(formula = class ~ ., data = mydata)
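A further example, offered as a sketch, restricts the call to two methods and two estimators and then reads the components listed under the return value. It assumes that optBiomarker is installed and that the returned object is a list whose components can be accessed with $; the object name fit and the seed are arbitrary choices. The requireNamespace guard only keeps the snippet from failing where the package is unavailable.

```r
if (requireNamespace("optBiomarker", quietly = TRUE)) {
  library(optBiomarker)
  set.seed(1)                                   # arbitrary seed, for reproducibility
  mydata <- simData(nTrain = 30, nBiom = 3)

  # Two methods and two estimators instead of the defaults
  fit <- classificationError(formula = class ~ ., data = mydata,
                             method = c("RF", "LDA"),
                             errorType = c("cv", "boot"))

  fit$errorRate   # length(errorType) by length(method) matrix of errors
  fit$rocData     # sensitivities (row 1) and specificities (row 2)
}
```

Restricting method and errorType in this way reduces run time, since each combination requires its own resampling.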