clust {clustTool} | R Documentation |
Performs cluster analysis on data.
clust(x = Cassini$x, k = 3, method = "kmeansHartigan", seed = set.seed(123), distMethod = "euclidean", qtclustsize = 0.7, iter.max = 100, eps = 0.1, vals = TRUE, alt = NULL, coord = NULL, bic = NULL)
x |
data frame or matrix |
k |
Number of clusters |
method |
Cluster algorithm |
seed |
Seed (can be useful if results from clustering should be reproduced exactly) |
distMethod |
Distance Measure |
qtclustsize |
Only important if method qtclust is chosen (see ‘qtclust’ in package flexclust) |
iter.max |
Only important if method kmeans is chosen (see ‘kmeans’ in package stats) |
eps |
Only important if method ‘dbscan’ is chosen |
vals |
Validity measures for the resulting clusters are calculated if this parameter is set to TRUE |
alt |
An integer vector giving, for each observation, the cluster number in an alternative clustering. If provided, the corrected Rand index for 'clustering' vs. 'alt.clustering' is computed (see also package fpc). |
coord |
Cluster validity measures will be calculated based on coordinates. |
bic |
Alternative way to specify BIC values for each cluster. |
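The `alt` argument refers to the corrected Rand index between two clusterings. As a rough illustration of what such an index measures, here is a minimal base-R sketch of the Hubert-Arabie adjusted Rand index. It assumes that this is the index meant by "corrected Rand index"; the helper name `adjusted_rand` is hypothetical and is not part of clustTool or fpc.

```r
# Hypothetical helper (not part of clustTool or fpc): adjusted Rand index
# between two label vectors of equal length, using the Hubert-Arabie formula.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                       # contingency table of the two labelings
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))            # pairs placed together in both clusterings
  sum_a <- sum(choose(rowSums(tab), 2))    # pairs placed together in clustering a
  sum_b <- sum(choose(colSums(tab), 2))    # pairs placed together in clustering b
  expected <- sum_a * sum_b / choose(n, 2) # expected agreement under random labeling
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Two clusterings that differ only in how the clusters are numbered
labels1 <- c(1, 1, 2, 2, 3, 3)
labels2 <- c(3, 3, 1, 1, 2, 2)
ari_same <- adjusted_rand(labels1, labels2)

# A clustering that splits the observations differently
labels3 <- c(1, 2, 1, 2, 1, 2)
ari_diff <- adjusted_rand(labels1, labels3)
```

The index does not depend on how the clusters are numbered, so `ari_same` equals 1 even though the labels differ, while `ari_diff` is lower.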
This function is a wrapper for applying a variety of clustering algorithms. It is called from the clustTool-GUI. To specify additional parameters for a particular algorithm, call that algorithm directly and structure its output like the output of this function (as class ‘clust’ suggests).
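A minimal sketch of that last suggestion, assuming a `stats::kmeans` fit that uses the extra parameter `nstart`: the result is repackaged as a list carrying the component names documented for this function. The helper name `as_clust_like` is hypothetical, and only a subset of the documented components is filled in (no validity measures, so `valTF` is set to FALSE). Whether clustTool's own methods accept such a partial object is not confirmed here.

```r
# Hypothetical helper (not part of clustTool): repackage a kmeans fit as a
# list with the component names documented for clust(), then tag it with
# class "clust".
as_clust_like <- function(x, fit, method, distMethod = "euclidean") {
  k <- nrow(fit$centers)
  res <- list(
    cluster    = fit$cluster,                      # cluster index per observation
    centers    = fit$centers,                      # matrix of cluster centres
    size       = tabulate(fit$cluster, nbins = k), # number of points per cluster
    xdata      = x,                                # the input data
    method     = method,
    distMethod = distMethod,
    k          = k,
    valTF      = FALSE                             # no validity measures computed here
  )
  class(res) <- "clust"
  res
}

set.seed(123)
x <- as.matrix(iris[, 1:4])
# nstart is an algorithm-specific parameter that the wrapper's usage does not list
fit <- kmeans(x, centers = 3, nstart = 25, algorithm = "Hartigan-Wong")
cl <- as_clust_like(x, fit, method = "kmeansHartigan")
names(cl)
```

Components such as `valMeasures`, `silwidths` or `vp` are left out of this sketch and would need to be added if downstream code relies on them.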
Number of Clusters: Since a large number of clusters is rarely necessary, the maximum number of clusters should not exceed 12.
Cluster algorithms: Possible values are: “kmeansHartigan”, “kmeansLloyd”, “kmeansForgy”, “kmeansMacQueen”, “cmeans”, “cmeansUfcl”, “pam”, “clara”, “fanny”, “bclust”, “cshell”, “Mclust”, “kccaKmeans”, “kccaKmedians”, “kccaAngle”, “kccaJaccard”, “kccaEjaccard”, “cclustKmeans”, “cclustHardcl”, “cclustNeuralgas”, “qtclustKmeans”, “qtclustKmedian”, “qtclustAngle”, “qtclustJaccard”, “qtclustEjaccard”, “dbscan”, “speccPolydot”, “fixmahal”, “hclustSingle”, “hclustComplete”, “hclustAverage”, “hclustWard”, “hclustMcquitty”, “hclustMedian”, “hclustcentroid”.
Cluster algorithms which are supported by clustTool-GUI: “kmeansHartigan”, “clara”, “bclust”, “Mclust”, “kccaKmeans”, “speccPolydot”, “cclustNeuralgas”, “cmeans”, “kccaKmedians”.
For details see the help files listed below.
distMethod: Possible values are: “euclidean”, “manhattan”, “maximum”, “canberra”, “cosa”, “rf” (dissimilarity measure based on the random forest proximity measure), “gower”, “bray”, “kulczynski”, “chord”, “morisita”, “horn”, “mountford”, “correlation” (dissimilarity measure based on correlations).
Distance measures which are supported by spatClust-GUI: “euclidean”, “manhattan”, “rf”, “bray”, “gower”, “kulczynski”, “morisita”, “correlation”.
For details see the help files listed below.
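To illustrate how the choice of distance measure feeds into a clustering, the following base-R sketch computes a Manhattan distance with `dist` and a correlation-based dissimilarity, then clusters on the former with `hclust`. The definition `1 - cor` between observations is an assumption made for illustration; the exact definition this function uses for “correlation” is not stated here.

```r
# Standardise the variables so that no single variable dominates the distance
x <- scale(as.matrix(iris[, 1:4]))

# Manhattan distance between observations (rows)
d_man <- dist(x, method = "manhattan")

# Correlation-based dissimilarity between observations.
# Assumption: 1 - Pearson correlation of the rows; other definitions exist.
d_cor <- as.dist(1 - cor(t(x)))

# Average-linkage hierarchical clustering on the Manhattan distances,
# cut into k = 3 clusters
hc <- hclust(d_man, method = "average")
cl_man <- cutree(hc, k = 3)
table(cl_man)
```

Passing `d_cor` instead of `d_man` to `hclust` clusters the same data under the other measure, which makes it easy to compare how sensitive the result is to the distance.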
cluster |
A vector of integers indicating the cluster to which each point is allocated. |
centers |
A matrix of cluster centres. |
size |
The number of points in each cluster. |
xdata |
The input data. |
method |
Clustering method |
distMethod |
Distance measure |
k |
Number of clusters |
valTF |
logical; TRUE if global validity measures are provided |
valMeasures |
global validity measures |
silwidths |
local validity measure |
separation |
local validity measure |
diameter |
local validity measure |
average.distance |
local validity measure |
median.distance |
local validity measure |
average.toother |
local validity measure |
vp |
logical; TRUE if column names are provided |
Matthias Templ
Clustering methods: kmeans, cmeans, pam, clara, fanny, bclust, Mclust, kcca, cclust, specc, hclust
Distance measures: dist, vegdist, g.dist, randomForest, “cosa”, cor
Cluster validity measures:
require(mvoutlier)
data(humus)
x <- prepare(humus[,c("As", "Ca", "Co", "Mo", "Ni")])
cl1 <- clust(x, k=9, method="clara", distMethod="manhattan")
cl1
names(cl1)