vegdist {vegan}    R Documentation

Dissimilarity Indices for Community Ecologists

Description

The function computes dissimilarity indices that are useful for or popular with community ecologists. Gower, Bray–Curtis, Jaccard and Kulczynski indices are good at detecting underlying ecological gradients (Faith et al. 1987). Morisita and Horn–Morisita indices should be able to handle different sample sizes (Wolda 1981, Krebs 1999), and the Mountford (1962) index for presence–absence data should be able to handle unknown (and variable) sample sizes.

Usage

 vegdist(x, method="bray", binary=FALSE, diag=FALSE, upper=FALSE, ...) 

Arguments

x Community data matrix.
method Dissimilarity index, partial match to "manhattan", "euclidean", "canberra", "bray", "kulczynski", "jaccard", "gower", "morisita", "horn" or "mountford".
binary Perform presence/absence standardization before analysis using decostand.
diag Compute diagonals.
upper Return only the upper diagonal.
... Other parameters (ignored).

Details

Jaccard and Mountford indices are discussed below. The other indices are defined as:
euclidean d[jk] = sqrt(sum((x[ij]-x[ik])^2))
manhattan d[jk] = sum(abs(x[ij]-x[ik]))
gower d[jk] = sum(abs(x[ij]-x[ik])/(max(x[i])-min(x[i])))
canberra d[jk] = (1/NZ) * sum(abs(x[ij]-x[ik])/(x[ij]+x[ik]))
where NZ is the number of non-zero entries.
bray d[jk] = sum(abs(x[ij]-x[ik]))/sum(x[ij]+x[ik])
kulczynski d[jk] = 1 - 0.5*(sum(min(x[ij],x[ik]))/sum(x[ij]) + sum(min(x[ij],x[ik]))/sum(x[ik]))
morisita d[jk] = 1 - 2*sum(x[ij]*x[ik])/((lambda[j]+lambda[k]) * sum(x[ij])*sum(x[ik]))
where lambda[j] = sum(x[ij]*(x[ij]-1))/(sum(x[ij])*(sum(x[ij])-1))
horn Like morisita, but lambda[j] = sum(x[ij]^2)/(sum(x[ij])^2)
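
As a minimal sketch in base R (not the vegdist implementation), some of the formulas above can be evaluated by hand for two sites. The vectors xj and xk are made-up counts used only for illustration, and the sums run over species.

```r
# Hypothetical counts of four species at two sites (illustration only)
xj <- c(10, 0, 3, 7)
xk <- c(4, 2, 0, 10)

# Direct transcriptions of the formulas; pmin() takes the species-wise minimum
manhattan  <- sum(abs(xj - xk))
euclidean  <- sqrt(sum((xj - xk)^2))
bray       <- sum(abs(xj - xk)) / sum(xj + xk)
kulczynski <- 1 - 0.5 * (sum(pmin(xj, xk)) / sum(xj) +
                         sum(pmin(xj, xk)) / sum(xk))

print(c(manhattan = manhattan, euclidean = euclidean,
        bray = bray, kulczynski = kulczynski))
```

The Euclidean value should agree with dist(rbind(xj, xk)), which gives a quick check of the transcription.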

Jaccard index is computed as 2B/(1+B), where B is Bray–Curtis dissimilarity.
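
The relation 2B/(1+B) can be checked by hand. The sketch below uses made-up counts and compares the result with one minus the ratio of summed minima to summed maxima, which is assumed here to be the usual quantitative (Ruzicka) form of the Jaccard index.

```r
# Hypothetical counts of four species at two sites (illustration only)
xj <- c(10, 0, 3, 7)
xk <- c(4, 2, 0, 10)

# Bray-Curtis dissimilarity B, then Jaccard as 2B/(1+B)
B <- sum(abs(xj - xk)) / sum(xj + xk)
jaccard <- 2 * B / (1 + B)

# Assumed quantitative Jaccard (Ruzicka) form: 1 - sum(min)/sum(max)
ruzicka <- 1 - sum(pmin(xj, xk)) / sum(pmax(xj, xk))

print(c(bray = B, jaccard = jaccard, ruzicka = ruzicka))
```

Because 2B/(1+B) increases monotonically with B, the two indices always order pairs of sites in the same way.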

The Mountford index is defined as M = 1/α, where α is the parameter of Fisher's logseries, assuming that the compared communities are samples from the same community (cf. fisherfit, fisher.alpha). The index M is found as the positive root of the equation exp(a*M) + exp(b*M) = 1 + exp((a+b-j)*M), where j is the number of species occurring in both communities, and a and b are the numbers of species in each separate community (so the index uses presence–absence information). The Mountford index is usually misrepresented in the literature: Mountford (1962) did suggest an approximation, but only as a starting value for iterations, and the proper index is defined as the root of the equation above. The function vegdist solves for M with Newton's method. Please note that if either a or b is equal to j, one of the communities could be a subset of the other, and the dissimilarity is 0. This means that non-identical objects may be regarded as similar, and the index is non-metric. The Mountford index is in the range 0 ... log(2), but the dissimilarities are divided by log(2) so that the results will be in the conventional range 0 ... 1.
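
The Newton iteration for the equation above might be sketched in base R as follows. This is an illustration, not the code used by vegdist: the function name mountford_root, the tolerance and the iteration limit are hypothetical, and the starting value 2j/(2ab - (a+b)j) is assumed to be the approximation attributed to Mountford. The sketch only covers the case 0 < j < min(a, b); the subset cases discussed above need separate handling.

```r
# Positive root M of exp(a*M) + exp(b*M) = 1 + exp((a+b-j)*M) by Newton's method.
# Hypothetical helper for illustration; requires 0 < j < min(a, b).
mountford_root <- function(a, b, j, tol = 1e-12, maxit = 50) {
  stopifnot(j > 0, j < a, j < b)
  f  <- function(M) exp(a * M) + exp(b * M) - 1 - exp((a + b - j) * M)
  df <- function(M) a * exp(a * M) + b * exp(b * M) -
                    (a + b - j) * exp((a + b - j) * M)
  # Assumed starting value (the approximation attributed to Mountford)
  M <- 2 * j / (2 * a * b - (a + b) * j)
  for (it in seq_len(maxit)) {
    step <- f(M) / df(M)
    M <- M - step
    if (abs(step) < tol) break
  }
  M
}

# Two communities with 2 and 3 species, 1 of them shared
M <- mountford_root(a = 2, b = 3, j = 1)
print(M)           # index on its natural scale, 0 ... log(2)
print(M / log(2))  # divided by log(2), giving the range 0 ... 1
```

Note that M = 0 always satisfies the equation, so the starting value matters: the iteration has to begin close enough to the positive root to avoid converging to the trivial one.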

Morisita index can be used with genuine count data (integers) only. Its Horn–Morisita variant is able to handle any abundance data.
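
The difference between the two variants lies only in lambda, as the following base R sketch suggests. It assumes that the dissimilarity is one minus the overlap term, and that the Morisita lambda is sum(x*(x-1))/(N*(N-1)) with N = sum(x); the helper names and the counts are made up for illustration.

```r
# Hypothetical counts of four species at two sites (illustration only)
xj <- c(10, 0, 3, 7)
xk <- c(4, 2, 0, 10)

# lambda for Morisita (needs integer counts) and for Horn-Morisita
lambda_morisita <- function(x) sum(x * (x - 1)) / (sum(x) * (sum(x) - 1))
lambda_horn     <- function(x) sum(x^2) / sum(x)^2

# Assumed dissimilarity form: one minus the overlap term
morisita_type <- function(x, y, lambda) {
  1 - 2 * sum(x * y) / ((lambda(x) + lambda(y)) * sum(x) * sum(y))
}

morisita <- morisita_type(xj, xk, lambda_morisita)
horn     <- morisita_type(xj, xk, lambda_horn)
print(c(morisita = morisita, horn = horn))

# Horn-Morisita is unchanged when abundances are rescaled to non-integers
horn_scaled <- morisita_type(xj / 2, xk / 2, lambda_horn)
print(horn_scaled)
```

With the Morisita lambda, values between 0 and 1 would give negative terms x*(x-1), which is one way to see why that variant needs genuine counts.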

Euclidean and Manhattan dissimilarities are not good at gradient separation without proper standardization but are still included for comparison and special needs.

Bray–Curtis and Jaccard indices are rank-order similar, and some other indices become identical or rank-order similar after some standardizations, especially with presence/absence transformation or equalizing site totals with decostand.
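
The presence/absence case can be checked by hand. The base R sketch below uses made-up counts and converts them to 0/1 data. It then compares the Bray–Curtis and Jaccard formulas with the familiar expressions based on shared and unshared species, which are assumed here to be the usual binary Sørensen and Jaccard forms.

```r
# Hypothetical counts of four species at two sites (illustration only)
xj <- c(10, 0, 3, 7)
xk <- c(4, 2, 0, 10)

# Presence/absence standardization: 1 if the species is present, else 0
pj <- as.numeric(xj > 0)
pk <- as.numeric(xk > 0)

n_shared <- sum(pj * pk)        # species present at both sites
n_only_j <- sum(pj * (1 - pk))  # species present only at site j
n_only_k <- sum((1 - pj) * pk)  # species present only at site k

# Bray-Curtis and Jaccard formulas applied to the 0/1 data
bray_pa    <- sum(abs(pj - pk)) / sum(pj + pk)
jaccard_pa <- 2 * bray_pa / (1 + bray_pa)

# Assumed binary forms written with shared and unshared species
sorensen_diss <- (n_only_j + n_only_k) / (2 * n_shared + n_only_j + n_only_k)
jaccard_diss  <- (n_only_j + n_only_k) / (n_shared + n_only_j + n_only_k)

print(c(bray_pa = bray_pa, sorensen = sorensen_diss,
        jaccard_pa = jaccard_pa, jaccard = jaccard_diss))
```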

The naming conventions vary. The one adopted here is traditional rather than truthful to priority. The function finds either quantitative or binary variants of the indices under the same name, which correctly may refer only to one of these alternatives. For instance, the Bray index is known also as the Steinhaus, Czekanowski and Sørensen index. The quantitative version of Jaccard should probably be called the Ruzicka index (but spelled with characters that cannot be shown here). The abbreviation "horn" for the Horn–Morisita index is misleading, since there is a separate Horn index. The abbreviation will be changed if that index is implemented in vegan.

Value

Should provide a drop-in replacement for dist and return a distance object of the same type.

Note

The function is an alternative to dist adding some ecologically meaningful indices. Both methods should produce similar types of objects which can be interchanged in any method accepting either. Manhattan and Euclidean dissimilarities should be identical in both methods. The Canberra index is divided by the number of variables in vegdist, but not in dist. So these differ by a constant multiplier, and the alternative in vegdist is in the range (0,1).
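
The constant multiplier can be illustrated with dist alone. The base R sketch below uses made-up counts in which no species is absent from both sites, so that the number of variables and the number of non-zero entries coincide. The division mimics the scaling described above and is not a call to vegdist.

```r
# Hypothetical counts of four species at two sites (illustration only);
# no species is absent from both sites
xj <- c(10, 0, 3, 7)
xk <- c(4, 2, 0, 10)

# Canberra distance as computed by dist: an unscaled sum over variables
canberra_dist <- as.numeric(dist(rbind(xj, xk), method = "canberra"))

# Dividing by the number of variables brings the value into the range 0 ... 1
canberra_scaled <- canberra_dist / length(xj)

print(c(dist = canberra_dist, scaled = canberra_scaled))
```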

Author(s)

Jari Oksanen

References

Faith, D. P., Minchin, P. R. and Belbin, L. (1987). Compositional dissimilarity as a robust measure of ecological distance. Vegetatio 69, 57–68.

Krebs, C. J. (1999). Ecological Methodology. Addison Wesley Longman.

Mountford, M. D. (1962). An index of similarity and its application to classification problems. In: P.W.Murphy (ed.), Progress in Soil Zoology, 43–50. Butterworths.

Wolda, H. (1981). Similarity indices, sample size and diversity. Oecologia 50, 296–302.

See Also

decostand, dist, rankindex, isoMDS, stepacross.

Examples

library(vegan)
data(varespec)
# Bray-Curtis dissimilarities (the default method)
vare.dist <- vegdist(varespec)
# Orlóci's Chord distance: range 0 .. sqrt(2)
vare.dist <- vegdist(decostand(varespec, "norm"), "euclidean")

[Package vegan version 1.6-8 Index]