findCorrelation {caret} | R Documentation |
This function searches through a correlation matrix and returns a vector of integers corresponding to columns to remove to reduce pair-wise correlations.
findCorrelation(x, cutoff = .90, verbose = FALSE)
x | A correlation matrix |
cutoff | A numeric value for the pairwise absolute correlation cutoff |
verbose | A logical value: should the details be printed? |
The absolute values of pair-wise correlations are considered. If two variables have a high correlation, the function looks at the mean absolute correlation of each variable and removes the variable with the largest mean absolute correlation.
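The heuristic described above can be sketched in base R. This is a minimal illustration only, not caret's implementation: the helper name `drop_correlated`, the pair ordering, and the tie-breaking rule are all assumptions, so its output may differ from what `findCorrelation` returns on the same matrix.

```r
# Hypothetical helper illustrating the heuristic; NOT caret's actual code.
drop_correlated <- function(x, cutoff = 0.90) {
  ax <- abs(x)                      # only absolute correlations matter
  diag(ax) <- NA                    # ignore each variable's correlation with itself
  mean_abs <- rowMeans(ax, na.rm = TRUE)  # mean absolute correlation per variable
  p <- ncol(ax)
  drop <- integer(0)
  for (i in seq_len(max(p - 1, 0))) {
    for (j in (i + 1):p) {
      # skip pairs where one variable has already been flagged
      if (i %in% drop || j %in% drop) next
      if (ax[i, j] > cutoff) {
        # flag the variable with the larger mean absolute correlation
        # (ties go to the first variable; this rule is an assumption)
        drop <- c(drop, if (mean_abs[i] >= mean_abs[j]) i else j)
      }
    }
  }
  sort(drop)
}

# Small test matrix: identity plus three strong off-diagonal correlations
m <- diag(rep(1, 5))
m[2, 3] <- m[3, 2] <- 0.7
m[5, 3] <- m[3, 5] <- -0.7
m[4, 1] <- m[1, 4] <- -0.67

res65 <- drop_correlated(m, cutoff = 0.65)
res99 <- drop_correlated(m, cutoff = 0.99)
```

In the pair (2, 3), variable 3 is flagged because it is also correlated with variable 5, which gives it the larger mean absolute correlation. With a cutoff of 0.99 no pair qualifies and the result is empty.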
A vector of indices denoting the columns to remove. If no correlations meet the criteria, numeric(0) is returned.
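Because the result can be empty, some care is needed when using it for negative indexing. The sketch below uses a stand-in data frame `dat` and a stand-in vector `to_drop` (both hypothetical, in place of real data and the function's return value) to show why a length check is a reasonable guard.

```r
# Stand-in objects; 'to_drop' plays the role of the returned index vector
dat <- data.frame(a = 1:3, b = 4:6, c = 7:9)
to_drop <- integer(0)

# Negating an empty index vector still yields an empty vector,
# so naive negative indexing selects zero columns instead of all of them
naive <- dat[, -to_drop, drop = FALSE]

# Guarded version: only drop columns when something was flagged
kept <- if (length(to_drop) > 0) dat[, -to_drop, drop = FALSE] else dat
```

Here `naive` has no columns, while `kept` retains all three.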
Original R code by Dong Li, modified by Max Kuhn
library(caret)
library(lattice)  # provides levelplot()

# 5 x 5 identity matrix with a few strong off-diagonal correlations
corrMatrix <- diag(rep(1, 5))
corrMatrix[2, 3] <- corrMatrix[3, 2] <- .7
corrMatrix[5, 3] <- corrMatrix[3, 5] <- -.7
corrMatrix[4, 1] <- corrMatrix[1, 4] <- -.67

# Long format for plotting the matrix
corrDF <- expand.grid(row = 1:5, col = 1:5)
corrDF$correlation <- as.vector(corrMatrix)
levelplot(correlation ~ row + col, corrDF)

findCorrelation(corrMatrix, cutoff = .65, verbose = TRUE)
findCorrelation(corrMatrix, cutoff = .99, verbose = TRUE)