simpson {untb} | R Documentation
Simpson's diversity index
simpson(x, with.replacement=FALSE)
x | Ecosystem vector; coerced to class count
with.replacement | Boolean, with default FALSE meaning to sample without replacement; see details section
Returns the Simpson index D: the probability that two randomly sampled individuals belong to different species.
There is some confusion as to the precise definition: some authors specify that the two individuals are necessarily distinct (i.e. sampling without replacement), and some do not.
Simpson (1949) assumed sampling without replacement and gave
1-\frac{\sum_{i=1}^{S} n_i(n_i-1)}{J(J-1)}
in our notation.
He and Hu (2005) assumed sampling with replacement:
1-\frac{\sum_{i=1}^{S} n_i^2}{J^2}.
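The two definitions can be sketched in base R as follows. This is only an illustration and not the package's implementation: the helper name simpson_sketch and the abundance vector are hypothetical, and the sketch assumes that n_i is the count of species i, S the number of species, and J the total number of individuals.

```r
# Hypothetical helper (not part of untb): Simpson's index from raw species counts.
simpson_sketch <- function(n, with.replacement = FALSE) {
  n <- n[n > 0]   # drop species with zero count
  J <- sum(n)     # total number of individuals
  if (with.replacement) {
    # sampling with replacement, as in He and Hu (2005)
    1 - sum(n^2) / J^2
  } else {
    # sampling without replacement, as in Simpson (1949);
    # undefined (NaN) when J = 1, because two distinct individuals are needed
    1 - sum(n * (n - 1)) / (J * (J - 1))
  }
}

counts <- c(10, 5, 3, 1, 1)   # made-up abundances, J = 20
simpson_sketch(counts, with.replacement = FALSE)
simpson_sketch(counts, with.replacement = TRUE)

# Two singleton species: the two definitions disagree most strongly here
simpson_sketch(c(1, 1), with.replacement = FALSE)   # 1
simpson_sketch(c(1, 1), with.replacement = TRUE)    # 0.5
```

The sketch takes a plain numeric vector of counts, whereas simpson() coerces its argument to class count.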
The difference is largely academic but is most pronounced when many species occur with low counts (i.e. close to 1).
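The size of the difference can be made explicit with a short derivation. The labels D_without and D_with are introduced here for illustration only and do not appear in the cited papers; the derivation assumes that the n_i sum to J.

```latex
\begin{aligned}
D_{\mathrm{without}}
  &= 1-\frac{\sum_{i=1}^{S} n_i(n_i-1)}{J(J-1)}
   = \frac{J^2-\sum_{i=1}^{S} n_i^2}{J(J-1)}
   = \frac{J}{J-1}\,D_{\mathrm{with}},\\
D_{\mathrm{without}}-D_{\mathrm{with}}
  &= \frac{D_{\mathrm{with}}}{J-1}.
\end{aligned}
```

The gap therefore shrinks like 1/(J-1) as the sample grows. For two individuals of different species, J = 2 and D_with = 1/2, so the gap is 1/2.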
Robin K. S. Hankin
10.1111/j.1461-0248.2005.00729.x
data(butterflies)
D <- simpson(butterflies)
theta <- optimal.prob(butterflies)*2*no.of.ind(butterflies)

# compare theta with D/(1-D) (should be roughly equal; see He & Hu 2005):
theta
D/(1-D)

# Second argument pedantic in practice.

# Mostly, the difference is small:
simpson(butterflies,FALSE) - simpson(butterflies,TRUE)

# Most extreme example:
x <- count(c(1,1))
simpson(x,TRUE)
simpson(x,FALSE)