prepare {clustTool} | R Documentation |
This function can be used for transformation and standardisation of the data.
prepare(x, scaling = "classical", transformation = "logarithm", powers = "none")
x |
data frame or matrix |
scaling |
Scaling of the data.
Possible values are: “classical”, “robust”, “none” |
transformation |
Transformation of the data.
Possible values are: “logarithm”, “boxcox”, “bcOpt”, “logratio”, “logcentered”, “iso”, “none” |
powers |
Powers for Box-Cox transformation for each variable (if “boxcox” is chosen) |
Transformation:
“logarithm” replaces the values of x with the natural logarithm by using function ‘log’.
“boxcox” applies a Box-Cox transformation to each variable. The powers must be specified via the argument ‘powers’.
“bcOpt” applies a Box-Cox transformation to each variable. The powers are calculated with function ‘box.cox.powers’.
“none” is also possible.
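As a rough illustration of the “logarithm” and “boxcox” options, the following base R sketch transforms a small data frame column by column. The helper boxcox_col, the example data and the chosen powers are hypothetical. The formula is the standard Box-Cox form; whether prepare uses exactly this parameterisation internally is an assumption.

```r
# Hypothetical helper: standard Box-Cox transform of one positive variable.
# For power 0 the transform is defined as the natural logarithm.
boxcox_col <- function(v, power) {
  if (power == 0) log(v) else (v^power - 1) / power
}

# Small illustrative data set with strictly positive values
x <- data.frame(a = c(1, 2, 4, 8), b = c(10, 100, 1000, 10000))

# "logarithm": natural log of every value
x_log <- log(x)

# "boxcox": one power per variable (illustrative powers, not estimated)
powers <- c(0.5, 0)
x_bc <- as.data.frame(mapply(boxcox_col, x, powers))
```

Both transformations require strictly positive values, so data containing zeros or negative values would need to be handled before this step.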
Transformation before clustering: Cluster analysis in general does not need normally distributed data. However, it is advisable that heavily skewed data are first transformed to a more symmetric distribution. If a good cluster structure exists for a variable we can expect a distribution which has two or more modes. A transformation to more symmetry will preserve the modes but remove large skewness.
Standardisation:
“classical” applies a z-transformation to each variable by using function ‘scale’.
“robust” applies a robustified z-transformation to each variable by using the median and the MAD.
“none” is also possible.
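The two scaling options can be sketched in base R as follows. The example data are hypothetical, and the robust variant shown here (centre by the median, scale by the MAD as returned by ‘mad’) is an assumption about how prepare implements “robust”, not a copy of its code.

```r
# Illustrative data: two variables on different scales, one outlier in b
x <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 20, 30, 40, 500))

# "classical": centre by the mean, scale by the standard deviation
x_classical <- scale(x)

# "robust" (sketch): centre by the median, scale by the MAD
med  <- apply(x, 2, median)
madv <- apply(x, 2, mad)
x_robust <- scale(x, center = med, scale = madv)
```

In the robust version the outlier in b does not inflate the centre or the scale estimate, so the remaining observations keep their spread after standardisation.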
Standardisation before clustering: Standardisation is needed if the variables show a striking difference in the amount of variability.
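A minimal sketch of why this matters, assuming a distance-based clustering method with Euclidean distances: without standardisation the variable with the larger variability dominates the distances. The data and variable names are hypothetical.

```r
# Two variables with very different variability
x <- data.frame(small = c(0.1, 0.9, 0.2, 0.8),
                large = c(1000, 1010, 2000, 2010))

# Euclidean distances on the raw data: driven almost entirely by 'large'
d_raw <- as.matrix(dist(x))

# Euclidean distances after classical standardisation: both variables count
d_std <- as.matrix(dist(scale(x)))

# Compare observation 1 with observations 2 and 3 in both versions
d_raw[1, c(2, 3)]
d_std[1, c(2, 3)]
```

On the raw data observation 1 is far closer to observation 2 than to observation 3, because only ‘large’ matters. After standardisation the difference in ‘small’ between observations 1 and 2 carries comparable weight.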
The function returns the transformed and standardised data.
Matthias Templ
require(mvoutlier)
data(humus)
x <- humus[,4:40]
xNew <- prepare(x, scaling="classical", transformation="logarithm")