clust {clustTool}    R Documentation

Wrapper function for a variety of clustering algorithms

Description

Performs cluster analysis on data.

Usage

clust(x = Cassini$x, k = 3, method = "kmeansHartigan", seed = set.seed(123), distMethod = "euclidean", qtclustsize = 0.7, iter.max = 100, eps = 0.1, vals = TRUE, alt = NULL, coord = NULL, bic = NULL)

Arguments

x data frame or matrix
k Number of clusters
method Cluster algorithm
seed Seed (can be useful if results from clustering should be reproduced exactly)
distMethod Distance measure
qtclustsize Only relevant if method qtclust is chosen (see ‘qtclust’ in package flexclust)
iter.max Only relevant if method kmeans is chosen (see ‘kmeans’ in package stats)
eps Only relevant if method ‘dbscan’ is chosen
vals If set to TRUE, validity measures are calculated for the resulting clusters
alt An integer vector giving, for each observation, the cluster number in an alternative clustering. If provided, the corrected Rand index for 'clustering' vs. 'alt.clustering' is computed (see also in package fpc).
coord Cluster validity measures will be calculated based on coordinates.
bic Alternative way to specify bic values for each cluster.
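
The corrected Rand index mentioned for the alt argument compares two partitions of the same observations while adjusting for chance agreement. The following is a minimal base R sketch of the standard Hubert-Arabie adjusted Rand index. The helper name adjusted_rand is hypothetical, and it is an assumption that this matches the value computed by the package in every detail.

```r
# Hypothetical helper (not part of clustTool): adjusted Rand index
# for two integer cluster label vectors of equal length.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                      # contingency table of the two partitions
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))           # pairs placed together in both partitions
  sum_a <- sum(choose(rowSums(tab), 2))   # pairs placed together in 'a'
  sum_b <- sum(choose(colSums(tab), 2))   # pairs placed together in 'b'
  expected <- sum_a * sum_b / choose(n, 2)
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Two clusterings that are identical up to relabelling
a <- c(1, 1, 2, 2, 3, 3)
b <- c(2, 2, 1, 1, 3, 3)
ari <- adjusted_rand(a, b)
```

Because the index depends only on which observations are grouped together, the cluster numbers used in alt do not need to match those of the main clustering.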

Details

This function is a wrapper for applying a variety of clustering algorithms. It is called from the clustTool-GUI. To specify additional parameters for a particular algorithm, call that algorithm directly and structure its output like the output of this function (an object of class ‘clust’).
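
As a rough illustration of calling an algorithm directly and structuring the result like the output of this function, the sketch below runs kmeans from package stats with an extra parameter (nstart) and collects the result in a list of class ‘clust’. The element names follow the Value section. That “kmeansHartigan” corresponds to algorithm = "Hartigan-Wong", and that these elements are sufficient for the package's methods, are assumptions.

```r
# Sketch only: build a 'clust'-like object by hand from a direct kmeans call.
set.seed(123)
x <- scale(iris[, 1:4])                   # example data from package datasets

km <- kmeans(x, centers = 3, iter.max = 100, nstart = 5,
             algorithm = "Hartigan-Wong")

res <- list(
  cluster    = km$cluster,                # cluster membership per observation
  centers    = km$centers,                # matrix of cluster centres
  size       = km$size,                   # number of points per cluster
  xdata      = x,                         # the input data
  method     = "kmeansHartigan",          # assumed label for this algorithm
  distMethod = "euclidean",
  k          = 3,
  valTF      = FALSE                      # no validity measures computed here
)
class(res) <- "clust"
```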

Number of clusters: Since a large number of clusters is rarely necessary, the number of clusters should not exceed 12.

Cluster algorithms: Possible values are: “kmeansHartigan”, “kmeansLloyd”, “kmeansForgy”, “kmeansMacQueen”, “cmeans”, “cmeansUfcl”, “pam”, “clara”, “fanny”, “bclust”, “cshell”, “Mclust”, “kccaKmeans”, “kccaKmedians”, “kccaAngle”, “kccaJaccard”, “kccaEjaccard”, “cclustKmeans”, “cclustHardcl”, “cclustNeuralgas”, “qtclustKmeans”, “qtclustKmedian”, “qtclustAngle”, “qtclustJaccard”, “qtclustEjaccard”, “dbscan”, “speccPolydot”, “fixmahal”, “hclustSingle”, “hclustComplete”, “hclustAverage”, “hclustWard”, “hclustMcquitty”, “hclustMedian”, “hclustcentroid”.

Cluster algorithms which are supported by clustTool-GUI: “kmeansHartigan”, “clara”, “bclust”, “Mclust”, “kccaKmeans”, “speccPolydot”, “cclustNeuralgas”, “cmeans”, “kccaKmedians”.

For details see the help files listed below.

distMethod: Possible values are: “euclidean”, “manhattan”, “maximum”, “canberra”, “cosa”, “rf” (dissimilarity measure based on random forest proximity measure), “gower”, “bray”, “kulczynski”, “chord”, “morisita”, “horn”, “mountford”, “correlation” (dissimilarity measure based on correlations).

Distance measures which are supported by spatClust-GUI: “euclidean”, “manhattan”, “rf”, “bray”, “gower”, “kulczynski”, “morisita”, “correlation”.

For details see the help files listed below.
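
To give a feel for some of the distance measures listed above, the following base R sketch computes a few of them with dist from package stats. The correlation-based dissimilarity is shown as one minus the Pearson correlation between observations, which is an assumption about the definition and not necessarily what this function uses internally.

```r
# Sketch: a few of the listed distance measures on a small numeric matrix.
x <- as.matrix(iris[1:10, 1:4])

d_euc <- dist(x, method = "euclidean")
d_man <- dist(x, method = "manhattan")
d_max <- dist(x, method = "maximum")
d_can <- dist(x, method = "canberra")

# Assumed definition of a correlation-based dissimilarity:
# 1 - Pearson correlation between rows (observations).
d_cor <- as.dist(1 - cor(t(x)))
```

All of these are objects of class "dist" holding the pairwise dissimilarities between the ten observations.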

Value

cluster A vector of integers indicating the cluster to which each point is allocated.
centers A matrix of cluster centres.
size The number of points in each cluster.
xdata The input data.
method Clustering method
distMethod Distance measure
k Number of clusters
valTF Logical, TRUE if global validity measures are provided
valMeasures global validity measures
silwidths local validity measure
separation local validity measure
diameter local validity measure
average.distance local validity measure
median.distance local validity measure
average.toother local validity measure
vp Logical, TRUE if column names are provided

Author(s)

Matthias Templ

See Also

Clustering methods:

kmeans, cmeans, pam, clara, fanny, bclust, Mclust, kcca, cclust, specc, hclust

Distance measures:

dist, vegdist, g.dist, randomForest, “cosa”, cor

Cluster validity measures:

cluster.stats

Examples

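 # humus data set from package mvoutlier
 require(mvoutlier)
 data(humus)
 # select five element concentrations and prepare them for clustering
 x <- prepare(humus[,c("As", "Ca", "Co", "Mo", "Ni")])
 # 9 clusters with clara on Manhattan distances
 cl1 <- clust(x, k=9, method="clara", distMethod="manhattan")
 cl1
 # components of the returned object (see Value)
 names(cl1)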

[Package clustTool version 1.6.1 Index]