dissimilarity {arules} | R Documentation |
Provides the generic function dissimilarity and the S4 methods to compute and return distances for binary data in a matrix, transactions or associations.
dissimilarity(x, y = NULL, method = NULL, args = NULL, ...)

## S4 method for signature 'itemMatrix':
dissimilarity(x, y = NULL, method = NULL, args = NULL, which = "transactions")

## S4 method for signature 'associations':
dissimilarity(x, y = NULL, method = NULL, args = NULL, which = "transactions")

## S4 method for signature 'matrix':
dissimilarity(x, y = NULL, method = NULL, args = NULL)
x |
the set of elements (e.g., matrix, itemMatrix, transactions, itemsets, rules). |
y |
NULL or a second set to calculate cross dissimilarities. |
method |
the distance measure to be used. Implemented measures
are (defaults to "jaccard"):
|
args |
a list of additional arguments for the methods.
To calculate "affinity" for associations, the affinities between the items in
the transactions are needed and must be passed to the method as the first
element in args. |
which |
a character string indicating whether the dissimilarity should be
calculated between transactions (default) or items (use "items"). |
... |
further arguments. |
Returns an object of class dist.
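For intuition about the default "jaccard" measure, the which argument, and the returned dist object, here is a minimal base-R sketch. It does not use arules and is not the package's implementation. The toy matrix m and the helper jaccard_dist are hypothetical names introduced only for illustration. The sketch assumes the usual definition of Jaccard dissimilarity for binary data: one minus the number of shared items divided by the number of items present in either row.

```r
# Toy binary data: 3 transactions (rows) x 4 items (columns); names are made up.
m <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              1, 1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("t1", "t2", "t3"), c("a", "b", "c", "d")))

# Hypothetical helper (not part of arules): Jaccard dissimilarity between rows.
jaccard_dist <- function(x) {
  x <- (x != 0) * 1                   # treat non-zero entries as "present"
  both <- tcrossprod(x)               # pairwise counts of shared items
  n <- rowSums(x)                     # number of items per row
  either <- outer(n, n, "+") - both   # pairwise counts of items in either row
  as.dist(1 - both / either)          # NaN for a pair of two empty rows
}

d_rows <- jaccard_dist(m)      # between transactions (rows)
d_cols <- jaccard_dist(t(m))   # between items: same computation on the transpose

# Base R's "binary" distance is the same measure, so the two should agree.
all.equal(as.vector(d_rows), as.vector(dist(m, method = "binary")))
```

Because the result is a standard dist object, it can be passed directly to functions such as hclust. Computing on the transpose is meant to mirror the idea behind which = "items"; whether the arules methods work this way internally is not confirmed here.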
Sneath, P. H. A. (1957) Some thoughts on bacterial classification. Journal of General Microbiology 17, pages 184–200.
Sokal, R. R. and Michener, C. D. (1958) A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 38, pages 1409–1438.
Dice, L. R. (1945) Measures of the amount of ecologic association between species. Ecology 26, pages 297–302.
Charu C. Aggarwal, Cecilia Procopiuc, and Philip S. Yu. (2002) Finding localized associations in market basket data. IEEE Trans. on Knowledge and Data Engineering 14(1):51–62.
affinity, dist-class, itemMatrix-class, associations-class.
## cluster items in Groceries with support > 5%
data("Groceries")
s <- Groceries[, itemFrequency(Groceries) > 0.05]
d_jaccard <- dissimilarity(s, which = "items")
## hclust's "ward" method was renamed to "ward.D" in R 3.1.0
plot(hclust(d_jaccard, method = "ward.D"))

## cluster transactions for a sample of Adult
data("Adult")
s <- sample(Adult, 200)

## calculate Jaccard distances and do hclust
d_jaccard <- dissimilarity(s)
plot(hclust(d_jaccard))

## calculate affinity-based distances and do hclust
d_affinity <- dissimilarity(s, method = "affinity")
plot(hclust(d_affinity))

## cluster rules
rules <- apriori(Adult)
rules <- subset(rules, subset = lift > 2)

## we need to supply the item affinities from the dataset (sample)
d_affinity <- dissimilarity(rules, method = "affinity",
  args = list(affinity = affinity(s)))
plot(hclust(d_affinity))