cv.nfeaturesLDA {animation} | R Documentation |
This function illustrates the process of finding the optimum number of variables using k-fold cross-validation in a linear discriminant analysis (LDA).
cv.nfeaturesLDA(data = matrix(rnorm(600), 60), cl = gl(3, 20), k = 5, cex.rg = c(0.5, 3), col.av = c("blue", "red"))
data |
a data matrix containing the predictors in columns |
cl |
a factor indicating the classification of the rows of data |
k |
the number of folds |
cex.rg |
the range of magnification to be applied to the points in the plot |
col.av |
the two colors used to denote, respectively, the rates of correct predictions in the i-th fold and the average rates over all k folds |
For a classification problem, we usually wish to use as few variables as possible because of the difficulties brought by high dimensionality.
The selection procedure is like this:
Note that g_{max} is set by ani.options("nmax").
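As a rough illustration of this kind of selection, the sketch below runs a k-fold cross-validation over an increasing number of features with `MASS::lda()`. It is not the code of `cv.nfeaturesLDA()`. The ranking criterion (a one-way ANOVA F-statistic computed on the training rows), the helper `f_stat`, and the fixed `g_max` value are assumptions made for the example.

```r
library(MASS)  # provides lda(); MASS is a recommended package shipped with R

set.seed(1)
x  <- matrix(rnorm(600), 60)   # 60 observations, 10 predictors (same as the default data)
cl <- gl(3, 20)                # 3 classes with 20 rows each
k  <- 5                        # number of folds
g_max <- 5                     # hypothetical upper bound on the number of features

# Randomly assign each row to one of k equally sized folds
fold <- sample(rep(seq_len(k), length.out = nrow(x)))

# Assumed ranking criterion: one-way ANOVA F-statistic of one predictor vs. the classes
f_stat <- function(v, g) summary(aov(v ~ g))[[1]][1, "F value"]

# acc[i, j]: rate of correct predictions in fold i using the j top-ranked features
acc <- matrix(NA_real_, nrow = k, ncol = g_max)
for (i in seq_len(k)) {
  train <- fold != i
  # Rank predictors using the training rows only, strongest first
  f <- apply(x[train, , drop = FALSE], 2, f_stat, g = cl[train])
  ranked <- order(f, decreasing = TRUE)
  for (j in seq_len(g_max)) {
    vars <- ranked[seq_len(j)]
    fit  <- lda(x[train, vars, drop = FALSE], cl[train])
    pred <- predict(fit, x[!train, vars, drop = FALSE])$class
    acc[i, j] <- mean(pred == cl[!train])
  }
}

# Number of features with the highest average accuracy across folds
optimum <- which.max(colMeans(acc))
```

Because the predictors here are pure noise, the accuracies should hover around chance level (about 1/3), so the selected `optimum` carries no real meaning for this data.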
A list containing
accuracy |
a matrix in which the element in the i-th row and j-th column is the rate of correct predictions from an LDA model built with j variables and tested on the data in the i-th fold (the test set) |
optimum |
the optimum number of features based on the cross-validation |
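To show how the two components could relate, here is a small sketch on a made-up accuracy matrix. It assumes that the optimum is the column with the highest average accuracy across folds, which the descriptions above suggest but do not state outright. The numbers are hypothetical, not output of the function.

```r
# Hypothetical accuracy matrix: 3 folds (rows) by 4 feature counts (columns)
accuracy <- matrix(c(0.50, 0.75, 0.75, 0.50,
                     0.25, 0.75, 0.50, 0.50,
                     0.50, 1.00, 0.75, 0.25),
                   nrow = 3, byrow = TRUE)

# Average rate of correct predictions for each number of features
avg <- colMeans(accuracy)

# Assumed rule: pick the number of features with the highest average rate
optimum <- which.max(avg)
```

With the real function, the same inspection would start from the `accuracy` component of the returned list.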
Yihui Xie <http://yihui.name>
Maindonald J, Braun J (2007). Data Analysis and Graphics Using R - An Example-Based Approach. Cambridge University Press, 2nd edition. p. 400
http://animation.yihui.name/da:biostat:select_features_via_cv
op = par(pch = 19, mar = c(3, 3, 0.2, 0.7), mgp = c(1.5, 0.5, 0))
cv.nfeaturesLDA()
par(op)

## Not run:
# save the animation in HTML pages
oopt = ani.options(ani.height = 480, ani.width = 600, interval = 0.5, nmax = 10,
    title = "Cross-validation to find the optimum number of features in LDA",
    description = "This animation has provided an illustration of the process of finding out the optimum number of variables using k-fold cross-validation in a linear discriminant analysis (LDA).")
ani.start()
par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0), tcl = -0.3, pch = 19, cex = 1.5)
cv.nfeaturesLDA()
ani.stop()
ani.options(oopt)
## End(Not run)