prepare {clustTool}                R Documentation

Function for transformation and standardisation

Description

This function can be used to transform and standardise data, for example before a cluster analysis.

Usage

prepare(x, scaling = "classical", transformation = "logarithm", powers = "none")

Arguments

x                 data frame or matrix
scaling           Scaling of the data. Possible values are: “classical”, “robust”, “none”
transformation    Transformation of the data. Possible values are: “logarithm”, “boxcox”, “bcOpt”, “logratio”, “logcentered”, “iso”, “none”
powers            Powers for the Box-Cox transformation, one for each variable (used if “boxcox” is chosen)

Details

Transformation:

“logarithm” replaces the values of x with their natural logarithm, using the function ‘log’.

“boxcox” applies a Box-Cox transformation to each variable. The powers must be specified via the argument powers.

“bcOpt” applies a Box-Cox transformation to each variable. The powers are calculated with the function ‘box.cox.powers’.

“none” leaves the data untransformed.
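The Box-Cox options above can be illustrated with a minimal base R sketch. The helper names box_cox, box_cox_loglik and estimate_powers are hypothetical and are not part of clustTool. The power estimate maximises a univariate profile likelihood for each column separately, which is a stand-in for, and not necessarily equivalent to, what ‘box.cox.powers’ computes.

```r
# Box-Cox transformation of a strictly positive numeric vector.
# lambda = 0 is treated as the limiting case log(v).
box_cox <- function(v, lambda) {
  stopifnot(is.numeric(v), all(v > 0, na.rm = TRUE))
  if (abs(lambda) < 1e-8) log(v) else (v^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for a mean-only normal model
# (up to an additive constant).
box_cox_loglik <- function(lambda, v) {
  z <- box_cox(v, lambda)
  n <- length(v)
  -n / 2 * log(mean((z - mean(z))^2)) + (lambda - 1) * sum(log(v))
}

# Estimate one power per column by maximising the profile likelihood.
# The search interval is an assumption made for this sketch.
estimate_powers <- function(x, interval = c(-2, 2)) {
  apply(as.matrix(x), 2, function(v) {
    optimize(box_cox_loglik, interval = interval, v = v, maximum = TRUE)$maximum
  })
}

# Simulated, strictly positive example data (not from the package).
set.seed(1)
x <- data.frame(a = exp(rnorm(200)), b = rnorm(200, mean = 10)^2)
powers <- estimate_powers(x)

# Apply the column-wise transformation with the estimated powers.
x_bc <- as.data.frame(Map(box_cox, x, powers))
```

With “boxcox” the powers are supplied by the user, so only the box_cox step applies. With “bcOpt” the powers are estimated first, as estimate_powers does in this sketch.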

Transformation before clustering: In general, cluster analysis does not require normally distributed data. However, it is advisable to transform heavily skewed data to a more symmetric distribution first. If a good cluster structure exists for a variable, we can expect its distribution to have two or more modes. A transformation towards symmetry preserves these modes while removing large skewness.
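The effect described above can be checked on simulated data. The following sketch assumes a right-skewed variable made up of two log-normal groups; the data and the helper skewness are hypothetical and serve only as an illustration.

```r
# Moment-based sample skewness (no bias correction).
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Simulated right-skewed variable consisting of two groups.
set.seed(42)
v <- c(rlnorm(300, meanlog = 0, sdlog = 0.4),
       rlnorm(300, meanlog = 3, sdlog = 0.4))

skew_raw <- skewness(v)       # clearly positive on the original scale
skew_log <- skewness(log(v))  # much closer to zero after the log transformation

# The group means stay well separated on the log scale,
# so the two modes are still visible after the transformation.
group <- rep(c("low", "high"), each = 300)
log_means <- tapply(log(v), group, mean)
```

A histogram of log(v), for example hist(log(v), breaks = 40), would show both modes with roughly symmetric spread around each.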

Standardisation:

“classical” applies a z-transformation to each variable, using the function ‘scale’.

“robust” applies a robustified z-transformation, using the median and the MAD in place of the mean and the standard deviation.

“none” leaves the data unscaled.
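The two scaling options can be sketched in base R as follows. The helper names scale_classical and scale_robust are hypothetical. The robust version uses stats::mad with its default consistency constant of 1.4826; whether prepare uses the same constant is an assumption.

```r
# Classical z-transformation: centre by the mean, divide by the standard deviation.
scale_classical <- function(x) {
  scale(as.matrix(x), center = TRUE, scale = TRUE)
}

# Robust variant: centre by the median, divide by the MAD.
# stats::mad() multiplies by 1.4826 by default so that it estimates the
# standard deviation under normality (assumed here, not confirmed for prepare()).
scale_robust <- function(x) {
  x <- as.matrix(x)
  med <- apply(x, 2, median, na.rm = TRUE)
  mads <- apply(x, 2, mad, na.rm = TRUE)
  sweep(sweep(x, 2, med, "-"), 2, mads, "/")
}

# Simulated example data; column b carries five gross outliers.
set.seed(7)
x <- cbind(a = rnorm(100, mean = 50, sd = 10),
           b = c(rnorm(95), rep(40, 5)))

z_classical <- scale_classical(x)
z_robust <- scale_robust(x)
```

In column b the outliers inflate the mean and the standard deviation, so they look less extreme after classical scaling than after robust scaling.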

Standardisation before clustering: Standardisation is needed if the variables differ strongly in their variability.

Value

Transformed and standardised data.

Author(s)

Matthias Templ

See Also

scale, box.cox.powers

Examples

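library(clustTool)   # provides prepare()
library(mvoutlier)   # provides the humus data set
data(humus)
x <- humus[, 4:40]   # columns 4 to 40 of the humus data
# log-transform and standardise each variable
xNew <- prepare(x, scaling = "classical", transformation = "logarithm")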

[Package clustTool version 1.6.1]