Weka_clusterers {RWeka} | R Documentation
R interfaces to Weka clustering algorithms.
Cobweb(x, control = NULL)
FarthestFirst(x, control = NULL)
SimpleKMeans(x, control = NULL)
XMeans(x, control = NULL)
DBScan(x, control = NULL)
x
an R object with the data to be clustered.
control
an object of class Weka_control, or a character vector of control options, or NULL (default). Available options can be obtained online using the Weka Option Wizard WOW, or the Weka documentation.
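As a rough illustration of the two ways to pass options, the sketch below fits SimpleKMeans once with a Weka_control object and once with a character vector that is assumed to be equivalent. It presumes that RWeka and a working Java installation are available, and it uses the built-in iris data purely as sample input. The N option (number of clusters) is the one used in the example at the end of this page.

```r
library(RWeka)

## List the options SimpleKMeans accepts, via the Weka Option Wizard.
WOW("SimpleKMeans")

x <- iris[, -5]   # numeric columns only; sample data for illustration

## Options given as a Weka_control object ...
cl_a <- SimpleKMeans(x, control = Weka_control(N = 3))

## ... or as a character vector of Weka command-line style options.
cl_b <- SimpleKMeans(x, control = c("-N", "3"))
```

The Weka_control form is usually easier to read, while the character vector form is handy when options are copied from Weka's own documentation.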
There is a predict method for predicting class ids or memberships from the fitted clusterers.
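A minimal sketch of the predict method follows. It assumes the method takes a newdata argument and a type argument accepting "class_ids" and "memberships", and that memberships are requested together with newdata. The reuse of the first five iris rows as "new" observations is purely illustrative.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## Cluster ids for the training data (no new data supplied).
ids <- predict(cl)

## Cluster ids for new observations; here the first five rows are reused.
new_ids <- predict(cl, newdata = iris[1:5, -5])

## Cluster memberships: one row per instance in newdata.
memb <- predict(cl, newdata = iris[1:5, -5], type = "memberships")
```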
Cobweb
implements the Cobweb (Fisher, 1987) and Classit (Gennari et al., 1989) clustering algorithms.
FarthestFirst
provides the “farthest first traversal algorithm” by Hochbaum and Shmoys (1985), which works as a fast, simple, approximate clusterer modelled after simple k-means.
SimpleKMeans
provides clustering with the k-means algorithm.
XMeans
provides k-means extended by an “Improve-Structure part” and automatically determines the number of clusters.
DBScan
provides the “density-based clustering algorithm” by Ester, Kriegel, Sander, and Xu (1996). Note that noise points are assigned to NA.
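Because noise points come back as NA, downstream code generally needs to handle missing ids explicitly. The sketch below uses a made-up vector of class ids standing in for DBScan output (it does not call DBScan itself) to show how noise points might be counted and flagged in base R.

```r
## Hypothetical cluster ids as a DBScan fit might return them:
## 0 and 1 are clusters, NA marks noise. Values are invented for illustration.
ids <- c(0L, 0L, 1L, NA, 1L, 0L, NA, 1L)

## table() drops NA by default; request it explicitly to count noise points.
counts <- table(ids, useNA = "ifany")

## Logical index of noise points, e.g. for excluding them from later steps.
is_noise <- is.na(ids)
n_noise <- sum(is_noise)
```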
A list inheriting from class Weka_clusterers with components including
clusterer
a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClusterer method to the training instances using the given control options.
class_ids
a vector of integers indicating the class to which each training instance is allocated (the results of calling the Weka clusterInstance method for the built clusterer and each instance).
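The two components can be inspected directly, as in the sketch below. It assumes RWeka, rJava, and Java are installed. Calling toString on the Java reference through rJava's .jcall is one possible way to reach the underlying Weka object, shown here as an illustration rather than as the package's recommended interface.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## Integer cluster id for each training instance.
head(cl$class_ids)
table(cl$class_ids)

## The underlying Java object; .jcall() invokes one of its methods,
## here toString() for Weka's textual summary of the fitted model.
summary_text <- rJava::.jcall(cl$clusterer, "S", "toString")
cat(summary_text)
```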
M. Ester, H.-P. Kriegel, J. Sander, and X. Xu (1996). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD'96), Portland, OR, 226–231. AAAI Press.
D. H. Fisher (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2/2, 139–172.
J. Gennari, P. Langley and D. H. Fisher (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11–62.
D. S. Hochbaum and D. B. Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2), 180–184.
D. Pelleg and A. W. Moore (2000). X-means: Extending K-means with Efficient Estimation of the Number of Clusters. In: Seventeenth International Conference on Machine Learning, 727–734. Morgan Kaufmann.
I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.
cl1 <- SimpleKMeans(iris[, -5], Weka_control(N = 3))
cl1
table(predict(cl1), iris$Species)

## Use XMeans with a KDTree.
cl2 <- XMeans(iris[, -5],
              c("-L", 3, "-H", 7, "-use-kdtree",
                "-K", "weka.core.neighboursearch.KDTree -P"))
cl2
table(predict(cl2), iris$Species)