agreement {clue} | R Documentation |
Compute the agreement between (ensembles of) partitions or hierarchies.
cl_agreement(x, y = NULL, method = "euclidean", ...)
x | an ensemble of partitions or hierarchies and dissimilarities, or something coercible to that (see cl_ensemble). |
y | NULL (default), or as for x. |
method | a character string specifying one of the built-in methods for computing agreement, or a function to be taken as a user-defined method. If a character string, its lower-cased version is matched against the lower-cased names of the available built-in methods using pmatch. See Details for available built-in methods. |
... | further arguments to be passed to methods. |
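The matching rule for method can be sketched in base R as follows. This is only an illustration of the documented behaviour, not the package's internal code: match_method is a hypothetical helper, and builtin_methods merely copies the partition method names listed on this page.

```r
## Partition method names as listed on this page (the installed package
## may provide more).
builtin_methods <- c("euclidean", "manhattan", "Rand", "cRand", "NMI",
                     "KP", "angle", "diag", "Jaccard", "FM")

## Hypothetical helper mirroring the documented rule: lower-case both
## sides, then partially match with pmatch().
match_method <- function(method, choices = builtin_methods) {
    ind <- pmatch(tolower(method), tolower(choices))
    if (is.na(ind))
        stop("no unique match for method ", sQuote(method))
    choices[ind]
}

match_method("eucl")    # "euclidean"
match_method("crand")   # "cRand"
match_method("RAND")    # "Rand"
```

Under this rule, an abbreviation is accepted as long as it identifies exactly one method name, regardless of case.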
If y is given, its components must be of the same kind as those of x (i.e., components must either all be partitions, or all be hierarchies or dissimilarities).
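If all components are partitions, the following built-in methods for measuring agreement between two partitions with respective membership matrices u and v (brought to a common number of columns) are available:
"euclidean": 1 - d / m, where d is the Euclidean dissimilarity of the memberships (minimized over the column permutations of v) and m is an upper bound for the maximal Euclidean dissimilarity; see Dimitriadou, Weingessel and Hornik (2002).
"manhattan": 1 - d / m, where d is the Manhattan dissimilarity of the memberships and m is an upper bound for the maximal Manhattan dissimilarity.
"Rand": the Rand index, i.e., the rate of distinct pairs of objects that are either in the same class or in different classes in both partitions; see Rand (1971) or Gordon (1999).
"cRand": the Rand index corrected for agreement by chance; see Hubert and Arabie (1985).
"NMI": normalized mutual information; see Strehl and Ghosh (2002).
"KP": the Katz-Powell index, i.e., the product-moment correlation between the elements of the co-membership matrices of the two partitions; see Katz and Powell (1953).
"angle": the maximal cosine of the angle between the elements of u and all column permutations of v.
"diag": the maximal co-classification rate, i.e., the maximal rate of objects with the same class ids in both partitions after arbitrarily permuting the ids.
"Jaccard": the Jaccard index, i.e., the ratio of the number of distinct pairs of objects in the same class in both partitions to the number of pairs in the same class in at least one partition.
"FM": the index of Fowlkes and Mallows (1983), i.e., the number of distinct pairs of objects in the same class in both partitions divided by the geometric mean of the numbers of such pairs in each partition.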
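If all components are hierarchies, available built-in methods for measuring agreement between two hierarchies with respective ultrametrics u and v are as follows.
"euclidean": 1 / (1 + d), where d is the Euclidean dissimilarity of the ultrametrics (i.e., the square root of the sum of the squared differences of u and v).
"manhattan": 1 / (1 + d), where d is the Manhattan dissimilarity of the ultrametrics (i.e., the sum of the absolute differences of u and v).
"cophenetic": the cophenetic correlation coefficient, i.e., the product-moment correlation of the ultrametrics.
"angle": the cosine of the angle between the ultrametrics.
"gamma": 1 - d, where d is the rate of inversions between the associated ultrametrics.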
The measures based on ultrametrics also allow computing agreement with “raw” dissimilarities on the underlying objects (R objects inheriting from class "dist").
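As a rough illustration of the ultrametric-based measures, the following base R sketch computes three of them by hand for two hclust trees, plus a cophenetic correlation against the raw dissimilarities. It assumes that the ultrametric of an hclust tree is its cophenetic distance, that "euclidean" is 1 / (1 + d) with d the Euclidean distance between the ultrametrics, and that "angle" is the cosine between them. The variable names are illustrative, and cl_agreement itself is not called.

```r
d <- dist(USArrests)

## Ultrametrics of two hierarchies, as plain numeric vectors.
u <- as.vector(cophenetic(hclust(d, "single")))
v <- as.vector(cophenetic(hclust(d, "complete")))

coph  <- cor(u, v)                                # cophenetic correlation
angle <- sum(u * v) / sqrt(sum(u^2) * sum(v^2))   # cosine of the angle
eucl  <- 1 / (1 + sqrt(sum((u - v)^2)))           # small when heights differ

## Agreement of one hierarchy with the raw dissimilarities.
coph_raw <- cor(as.vector(d), u)

round(c(cophenetic = coph, angle = angle, euclidean = eucl,
        raw = coph_raw), 4)
```

The correlation and angle measures do not depend on the overall scale of the merge heights, whereas the Euclidean one does. This is why rescaling the ultrametrics can change the latter substantially.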
If a user-defined agreement method is to be employed, it must be a function taking two clusterings as its arguments.
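A user-defined method might look like the sketch below. The measure (a symmetrized purity) and the names sym_purity and purity_method are illustrative choices, not part of this page. The example assumes that cl_class_ids() returns the class ids of a partition and that kmeans results can be collected with cl_ensemble(), and it only calls clue if the package is installed.

```r
## Symmetrized purity of two hard partitions given as class ids
## (illustrative measure; not among the built-in methods listed above).
sym_purity <- function(ids_x, ids_y) {
    tab <- table(ids_x, ids_y)
    (sum(apply(tab, 1, max)) + sum(apply(tab, 2, max))) / (2 * sum(tab))
}

## Wrapper with the required signature: two clusterings in, one number out.
purity_method <- function(x, y)
    sym_purity(as.vector(clue::cl_class_ids(x)),
               as.vector(clue::cl_class_ids(y)))

if (requireNamespace("clue", quietly = TRUE)) {
    set.seed(1)
    ## Four k-means partitions of the iris measurements (illustrative data).
    parts <- lapply(1:4, function(i) kmeans(iris[, 1:4], centers = 3))
    ens <- clue::cl_ensemble(list = parts)
    print(clue::cl_agreement(ens, method = purity_method))
}
```

The measure is symmetric in its arguments and equals one for identical partitions, which fits the symmetric agreement objects returned when y is NULL.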
Symmetric agreement objects of class "cl_agreement" are implemented as symmetric proximity objects with self-proximities identical to one, and inherit from class "cl_proximity". They can be coerced to dense square matrices using as.matrix. It is possible to use 2-index matrix-style subscripting for such objects; unless this uses identical row and column indices, this results in a (non-symmetric agreement) object of class "cl_cross_agreement".
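The coercion and subscripting behaviour can be tried out along the following lines. This is a minimal sketch on made-up data (k-means partitions of iris), run only if clue is installed; the object names are illustrative.

```r
if (requireNamespace("clue", quietly = TRUE)) {
    set.seed(1)
    ## Four k-means partitions of the iris measurements (illustrative data).
    parts <- lapply(1:4, function(i) kmeans(iris[, 1:4], centers = 3))
    ens <- clue::cl_ensemble(list = parts)

    agr <- clue::cl_agreement(ens, method = "Rand")

    m <- as.matrix(agr)       # dense square matrix of agreements
    print(diag(m))            # self-agreements

    cross <- agr[1:2, 3:4]    # row and column indices differ
    print(class(cross))
}
```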
If y is NULL, an object of class "cl_agreement" containing the agreements between all pairs of components of x. Otherwise, an object of class "cl_cross_agreement" with the agreements between the components of x and the components of y.
E. Dimitriadou, A. Weingessel and K. Hornik (2002). A combination scheme for fuzzy clustering. International Journal of Pattern Recognition and Artificial Intelligence, 16, 901–912.
E. B. Fowlkes and C. L. Mallows (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78, 553–569.
A. D. Gordon (1999). Classification (2nd edition). Boca Raton, FL: Chapman & Hall/CRC.
L. Hubert and P. Arabie (1985). Comparing partitions. Journal of Classification, 2, 193–218.
W. M. Rand (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66, 846–850.
L. Katz and J. H. Powell (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.
A. Strehl and J. Ghosh (2002). Cluster ensembles — A knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research, 3, 583–617.
cl_dissimilarity; classAgreement in package e1071.
## An ensemble of partitions.
data("CKME")
pens <- CKME[1 : 20]    # for saving precious time ...
summary(c(cl_agreement(pens)))
summary(c(cl_agreement(pens, method = "Rand")))
summary(c(cl_agreement(pens, method = "diag")))
cl_agreement(pens[1:5], pens[6:7], method = "NMI")
## Equivalently, using subscripting.
cl_agreement(pens, method = "NMI")[1:5, 6:7]

## An ensemble of hierarchies.
d <- dist(USArrests)
hclust_methods <-
    c("ward.D", "single", "complete", "average", "mcquitty",
      "median", "centroid")
hclust_results <- lapply(hclust_methods, function(m) hclust(d, m))
names(hclust_results) <- hclust_methods
hens <- cl_ensemble(list = hclust_results)
summary(c(cl_agreement(hens)))
## Note that the Euclidean agreements are *very* small.
## This is because the ultrametrics differ substantially in height:
u <- lapply(hens, cl_ultrametric)
round(sapply(u, max), 3)
## Rescaling the ultrametrics to [0, 1] gives:
u <- lapply(u, function(x) (x - min(x)) / (max(x) - min(x)))
shens <- cl_ensemble(list = lapply(u, as.cl_dendrogram))
summary(c(cl_agreement(shens)))
## Au contraire ...
summary(c(cl_agreement(hens, method = "cophenetic")))
cl_agreement(hens[1:3], hens[4:5], method = "gamma")