Weka_clusterers {RWeka} R Documentation

R/Weka Clusterers

Description

R interfaces to Weka clustering algorithms.

Usage

Cobweb(x, control = NULL)
FarthestFirst(x, control = NULL)
SimpleKMeans(x, control = NULL)
DBScan(x, control = NULL)

Arguments

x an R object with the data to be clustered.
control a character vector with control options, or NULL (default), in which case the Weka defaults are used. Available options can be listed interactively with the Weka Option Wizard WOW, or looked up in the Weka documentation.
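As an illustration of the control argument, the sketch below lists the options of one clusterer and then passes two of them as a character vector of flag/value pairs. It assumes that WOW accepts the name of the interface function as a string, and that the underlying Weka SimpleKMeans understands "-N" (number of clusters) and "-S" (random seed); the seed value 42 is arbitrary.

```r
library(RWeka)

## List the options Weka reports for this clusterer
## (assumption: WOW() accepts the interface function name as a string).
WOW("SimpleKMeans")

## Options are passed as alternating flags and values, all as strings.
## "-N" and "-S" are assumed to be the cluster-count and seed flags.
cl <- SimpleKMeans(iris[, -5], control = c("-N", "3", "-S", "42"))
cl
```

Passing control = NULL instead leaves every option at its Weka default.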

Details

There is a predict method for class prediction from the fitted clusterers.

Cobweb implements the Cobweb (Fisher, 1987) and Classit (Gennari et al., 1989) clustering algorithms.

FarthestFirst provides the “farthest first traversal algorithm” by Hochbaum and Shmoys (1985), a fast, simple, approximate clusterer modelled after simple k-means.

DBScan provides the “density-based clustering algorithm” by Ester, Kriegel, Sander, and Xu. Note that noise points are assigned to NA.
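The following sketch shows how the NA convention for DBScan noise points might be inspected, with a FarthestFirst fit for comparison. The flags "-E" (epsilon) and "-M" (minimum points) for DBScan and "-N" (number of clusters) for FarthestFirst are assumed from the Weka option lists, and the values are arbitrary rather than tuned. It also assumes the DBScan clusterer is available in the installed Weka.

```r
library(RWeka)

x <- iris[, -5]

## Density-based clustering; "-E" and "-M" are assumed to be the
## epsilon and minimum-points flags, with illustrative values.
db <- DBScan(x, control = c("-E", "0.2", "-M", "5"))

## Noise points carry NA as their class id, so count them explicitly.
table(db$class_ids, useNA = "ifany")
n_noise <- sum(is.na(db$class_ids))
n_noise

## Farthest-first traversal with an assumed "-N" cluster-count flag.
ff <- FarthestFirst(x, control = c("-N", "3"))
table(ff$class_ids)
```

Because of the NA entries, summaries of DBScan results should handle missing values deliberately, for example with useNA in table.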

Value

A list inheriting from class Weka_clusterers with components including

clusterer a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClusterer method to the training instances using the given control options.
class_ids a vector of integers indicating the class to which each training instance is allocated (the results of calling the Weka clusterInstance method for the built clusterer and each instance).
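A minimal sketch of working with the two components described above. It assumes rJava is installed alongside RWeka, and that the underlying Weka clusterer exposes a numberOfClusters method returning an int, which is called here through rJava's .jcall.

```r
library(RWeka)
library(rJava)

cl <- SimpleKMeans(iris[, -5], control = c("-N", "3"))

## Cluster assignments for the training instances.
ids <- cl$class_ids
head(ids)

## The fitted Weka object is held as a Java reference.
class(cl$clusterer)

## Call a method on the Java object directly
## (assumption: numberOfClusters() exists and returns an int, signature "I").
k <- .jcall(cl$clusterer, "I", "numberOfClusters")
k
```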

References

M. Ester, H.-P. Kriegel, J. Sander and X. Xu (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD'96), Portland, OR, 226–231.

D. H. Fisher (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2/2, 139–172.

J. Gennari, P. Langley and D. H. Fisher (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11–62.

D. S. Hochbaum and D. B. Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2), 180–184.

I. H. Witten and Eibe Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

Examples

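data(iris)
## Cluster the four numeric measurements; "-N" sets the number of clusters.
cl <- SimpleKMeans(iris[, -5], c("-N", "3"))
cl
## Cross-tabulate the cluster assignments against the known species.
table(predict(cl), iris$Species)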

[Package RWeka version 0.2-4 Index]