specc {kernlab}    R Documentation

Spectral Clustering

Description

A spectral clustering algorithm. This algorithm clusters points using eigenvectors of kernel matrices derived from the data.

Usage

## S4 method for signature 'formula':
specc(x, data = NULL, na.action = na.omit, ...)

## S4 method for signature 'matrix':
specc(x, centers, kernel = "rbfdot", kpar = list(sigma = 0.1), 
       iterations = 200, mod.sample = 0.6, na.action = na.omit, ...)

Arguments

x the matrix of data to be clustered or a symbolic description of the model to be fit.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment from which specc is called.
centers either the number of clusters or a set of initial cluster centers. If a number is given, a random set of rows of the eigenvector matrix is chosen as the initial centers.
kernel the kernel function used in training and predicting. This parameter can be set to any function of class kernel that computes a dot product between two vector arguments. kernlab provides the most popular kernel functions, which can be selected by setting the kernel parameter to one of the following strings:
  • rbfdot Radial Basis kernel function "Gaussian"
  • polydot Polynomial kernel function
  • vanilladot Linear kernel function
  • tanhdot Hyperbolic tangent kernel function
  • laplacedot Laplacian kernel function
  • besseldot Bessel kernel function
  • anovadot ANOVA RBF kernel function
  • splinedot Spline kernel
The kernel parameter can also be set to a user-defined function of class kernel by passing the function name as an argument.
kpar the list of hyper-parameters (kernel parameters). This is a list which contains the parameters to be used with the kernel function. Valid parameters for the existing kernels are:
  • sigma inverse kernel width for the Radial Basis kernel function "rbfdot" and the Laplacian kernel "laplacedot".
  • degree, scale, offset for the Polynomial kernel "polydot"
  • scale, offset for the Hyperbolic tangent kernel function "tanhdot"
  • sigma, order, degree for the Bessel kernel "besseldot".
  • sigma, degree for the ANOVA kernel "anovadot".

Hyper-parameters for user defined kernels can be passed through the kpar parameter as well.
mod.sample
iterations The maximum number of iterations allowed.
na.action The action to perform on NA values.
... additional parameters
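
As an illustration of how the kernel and kpar arguments fit together, the sketch below builds kernel objects directly with the generator functions named above and evaluates them on two small vectors. The vectors u and v, the function k_user, and the value sigma = 100 are made up for this illustration and are not recommended settings; the user-defined kernel follows the usual kernlab convention of assigning class "kernel" to a plain two-argument function.

```r
library(kernlab)

# Two small example vectors (hypothetical values, for illustration only).
u <- c(1, 2)
v <- c(2, 4)

# Radial Basis kernel: sigma is the inverse kernel width,
# so the value is exp(-sigma * sum((u - v)^2)).
rbf <- rbfdot(sigma = 0.1)
rbf(u, v)

# Polynomial kernel: (scale * <u, v> + offset)^degree.
poly <- polydot(degree = 2, scale = 1, offset = 1)
poly(u, v)

# A user-defined kernel: any function of two vectors, tagged with class "kernel".
k_user <- function(x, y) sum(x * y)^2
class(k_user) <- "kernel"
k_user(u, v)

# The same kernel name and parameter list passed to specc.
# sigma = 100 is an arbitrary choice for this sketch.
data(spirals)
set.seed(1)
sc_rbf <- specc(spirals, centers = 2, kernel = "rbfdot", kpar = list(sigma = 100))
```

Evaluating a kernel on a pair of points in this way can be a quick check that a given kpar list produces similarities on a sensible scale before clustering.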

Details

In spectral clustering one uses the top k (number of clusters) eigenvectors of a matrix derived from the distances between points. Very good results are obtained by using a standard clustering technique, such as k-means, to cluster the rows of the resulting eigenvector matrix.
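
The procedure described above can be sketched in base R roughly as follows. This is a minimal illustration in the spirit of the Ng, Jordan and Weiss paper listed under References, not the code used by specc; the function spectral_sketch, its arguments, and the toy data are hypothetical names and values chosen for this example.

```r
# Minimal sketch of spectral clustering using only base R.
# spectral_sketch is a hypothetical helper, not part of kernlab.
spectral_sketch <- function(x, k, sigma = 1) {
  x <- as.matrix(x)
  # Gaussian affinity from squared Euclidean distances; sigma is an
  # inverse width. The diagonal is set to zero.
  d2 <- as.matrix(dist(x))^2
  A <- exp(-sigma * d2)
  diag(A) <- 0
  # Symmetric normalization: L = D^(-1/2) A D^(-1/2).
  dinv <- 1 / sqrt(rowSums(A))
  L <- A * outer(dinv, dinv)
  # Top k eigenvectors; for a symmetric matrix eigen() orders the
  # eigenvalues from largest to smallest.
  V <- eigen(L, symmetric = TRUE)$vectors[, seq_len(k), drop = FALSE]
  # Rescale every row to unit length, then cluster the rows with k-means.
  Y <- V / sqrt(rowSums(V^2))
  kmeans(Y, centers = k, nstart = 10)$cluster
}

# Toy data: two well-separated groups of 20 points each.
set.seed(1)
pts <- rbind(matrix(rnorm(40, mean = 0, sd = 0.2), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.2), ncol = 2))
cl <- spectral_sketch(pts, k = 2)
table(cl)
```

The row normalization step places points of the same group close together on the unit sphere, which is why a simple method such as k-means is enough for the final step.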

Value

An S4 object of class specc which extends the class vector, containing integers indicating the cluster to which each point is allocated. The following slots contain useful information:

centers A matrix of cluster centers.
size The number of points in each cluster
withinss The within-cluster sum of squares for each cluster
kernelf The kernel function used

Author(s)

Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at

References

Andrew Y. Ng, Michael I. Jordan, Yair Weiss
On Spectral Clustering: Analysis and an Algorithm
Neural Information Processing Systems (NIPS) 2001
http://www.nips.cc/NIPS2001/papers/psgz/AA35.ps.gz

See Also

kpca, kcca

Examples

## Cluster the spirals data set.
data(spirals)

sc <- specc(spirals, centers=2)

sc
centers(sc)
size(sc)
withinss(sc)

plot(spirals, col=sc)


[Package kernlab version 0.4-4 Index]