knn.ani {animation} | R Documentation |
Demonstrate the process of k-Nearest Neighbour classification on the 2D plane.
knn.ani(train, test, cl, k = 10, interact = FALSE, tt.col = c("blue", "red"), cl.pch = seq_along(unique(cl)), dist.lty = 2, dist.col = "gray", knn.col = "green")
train |
matrix or data frame of training set cases containing only 2 columns |
test |
matrix or data frame of test set cases. A vector will be interpreted as a row vector for a single case. It must also contain exactly 2 columns. This data set is ignored if interact = TRUE; see interact below. |
cl |
factor of true classifications of training set |
k |
number of neighbours considered. |
interact |
logical. If TRUE, the user chooses the test set by clicking on the plot with the mouse; otherwise the kNN classification is computed from the argument test. |
tt.col |
a vector of length 2 specifying the colors for the training data and test data. |
cl.pch |
a vector specifying symbols for each class |
dist.lty, dist.col |
the line type and color to annotate the distances |
knn.col |
the color of the polygon used to annotate the k nearest neighbour points |
For each row of the test set, the k nearest (in Euclidean distance) training set vectors are found, and the classification is decided by majority vote, with ties broken at random. For a single test sample point, the basic steps are:
As each iteration consists of four steps, the total number of animation frames is 4 * min(nrow(test), ani.options("nmax")).
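The procedure described above (compute Euclidean distances, keep the k nearest training points, vote by majority with ties broken at random) can be sketched in base R without any plotting. This is only an illustrative sketch: knn_one is a hypothetical helper, not a function of the animation package, and nmax below is a plain variable standing in for ani.options("nmax").

```r
# Hypothetical helper (not part of the animation package): classify a single
# test point by majority vote among its k nearest training points.
knn_one <- function(train, point, cl, k = 10) {
  train <- as.matrix(train)
  cl <- as.factor(cl)
  k <- min(k, nrow(train))  # cannot use more neighbours than training cases
  # Euclidean distance from the test point to every training row
  d <- sqrt(rowSums(sweep(train, 2, point)^2))
  # indices of the k shortest distances
  nearest <- order(d)[seq_len(k)]
  # tally the true classes of those neighbours
  votes <- table(cl[nearest])
  # majority vote, with ties broken at random
  winners <- names(votes)[votes == max(votes)]
  winners[sample.int(length(winners), 1)]
}

set.seed(1)
train <- rbind(matrix(rnorm(40, mean = -3), ncol = 2),
               matrix(rnorm(40, mean = 3), ncol = 2))
cl <- rep(c("first class", "second class"), each = 20)
knn_one(train, c(-3, -3), cl, k = 5)

# Frame count from the formula above; nmax is an assumed value here
test <- matrix(rnorm(20), ncol = 2)
nmax <- 10
n_frames <- 4 * min(nrow(test), nmax)  # 4 * min(10, 10) = 40
```

The helper returns the winning class label as a character string; applying it to each row of a test matrix gives one label per test case.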
A vector of class labels for the test set.
The training and test data sets are restricted to two columns only so that they can be drawn as a scatterplot. This is only a rough demonstration; for practical applications, please refer to existing kNN functions such as knn in the class package.
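For comparison, a minimal sketch of the same kind of classification with knn from the class package, using data shaped like the first example on this page. The seed and variable names are arbitrary choices for illustration.

```r
library(class)  # a recommended package, normally installed alongside R

set.seed(42)
# 40 training cases per class, two columns each
train <- matrix(c(rnorm(80, mean = -1), rnorm(80, mean = 1)),
                ncol = 2, byrow = TRUE)
test <- matrix(rnorm(20, mean = 0, sd = 1.2), ncol = 2)
cl <- factor(rep(c("first class", "second class"), each = 40))

# predicted class labels for the 10 test cases, one per row of test
pred <- knn(train, test, cl, k = 30)
table(pred)
```

Unlike knn.ani, knn is not limited to two columns, since it does not need to draw the data.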
If either train or test is missing, a random matrix is generated in its place. (The same applies to cl.)
Yihui Xie
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
http://animation.yihui.name/dmml:k-nearest_neighbour_algorithm
## a binary classification problem
oopt = ani.options(interval = 2, nmax = 10)
x = matrix(c(rnorm(80, mean = -1), rnorm(80, mean = 1)), ncol = 2, byrow = TRUE)
y = matrix(rnorm(20, mean = 0, sd = 1.2), ncol = 2)
knn.ani(train = x, test = y, cl = rep(c("first class", "second class"), each = 40), k = 30)

x = matrix(c(rnorm(30, mean = -2), rnorm(30, mean = 2), rnorm(30, mean = 0)), ncol = 2, byrow = TRUE)
y = matrix(rnorm(20, sd = 2), ncol = 2)
knn.ani(train = x, test = y, cl = rep(c("first", "second", "third"), each = 15), k = 25,
    cl.pch = c(2, 3, 19), dist.lty = 3)

## Not run:
# an interactive demo: choose the test set by mouse-clicking
ani.options(nmax = 5)
knn.ani(interact = TRUE)

ani.options(ani.height = 500, ani.width = 600, outdir = getwd(), nmax = 10, interval = 2,
    title = "Demonstration for kNN Classification",
    description = "For each row of the test set, the k nearest (in Euclidean distance) training set vectors are found, and the classification is decided by majority vote, with ties broken at random.")
ani.start()
par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0))
knn.ani()
ani.stop()
## End(Not run)

ani.options(oopt)