bioenv {vegan} | R Documentation |
Function finds the best subset of environmental variables, so that the Euclidean distances of scaled environmental variables have the maximum (rank) correlation with community dissimilarities.
## Default S3 method:
bioenv(comm, env, method = "spearman", index = "bray", upto = ncol(env), trace = FALSE, ...)
## S3 method for class 'formula':
bioenv(formula, data, ...)
comm |
Community data frame. |
env |
Data frame of continuous environmental variables. |
method |
The correlation method used in cor.
index |
The dissimilarity index used for community data in vegdist.
upto |
Maximum number of parameters in studied subsets. |
formula, data |
Model formula and data. |
trace |
Trace the progress of calculations.
... |
Other arguments passed to cor.
The function calculates a community dissimilarity matrix using vegdist. Then it selects all possible subsets of environmental variables, scales the variables, and calculates Euclidean distances for this subset using dist. Then it finds the correlation between community dissimilarities and environmental distances, and for each size of subsets, saves the best result.
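The search described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by bioenv: the helper name best_subsets is hypothetical, and the community dissimilarities are passed in as a precomputed "dist" object rather than computed with vegdist.

```r
# Hypothetical helper: for each subset size, keep the subset of scaled
# environmental variables whose Euclidean distances correlate best with
# the community dissimilarities. Assumes `comm_dist` is a "dist" object.
best_subsets <- function(comm_dist, env, method = "spearman", upto = ncol(env)) {
  env <- scale(env)                              # scale the variables
  p <- ncol(env)
  out <- vector("list", upto)
  for (k in seq_len(upto)) {
    subsets <- combn(p, k, simplify = FALSE)     # all subsets of size k
    cors <- vapply(subsets, function(idx) {
      env_dist <- dist(env[, idx, drop = FALSE]) # Euclidean distances
      cor(as.vector(comm_dist), as.vector(env_dist), method = method)
    }, numeric(1))
    best <- which.max(cors)                      # best subset of this size
    out[[k]] <- list(vars = colnames(env)[subsets[[best]]], cor = cors[best])
  }
  out
}

# Toy data: the "community" distances depend on variable a only.
set.seed(1)
env <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
comm_dist <- dist(env$a)
res <- best_subsets(comm_dist, env)
res[[1]]$vars   # best single variable
```

In this toy case the best one-variable subset should be a, because the distances were built from that variable alone.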
There are 2^p-1 subsets of p variables, so an exhaustive search may take a very long time. The upto argument offers partial relief by limiting the maximum size of the subsets studied.
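To give a sense of scale, the number of subsets examined can be counted directly. The helper n_subsets below is hypothetical and only does the arithmetic; it assumes that upto limits the search to subsets of at most that many variables.

```r
# Hypothetical helper: number of non-empty subsets of p variables
# with at most `upto` members.
n_subsets <- function(p, upto = p) sum(choose(p, seq_len(upto)))

n_subsets(6)        # 63, i.e. 2^6 - 1
n_subsets(6, 3)     # 41
n_subsets(20)       # 1048575, i.e. 2^20 - 1
n_subsets(20, 3)    # 1350
```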
The function can be called with a model formula where the LHS is the data matrix and the RHS lists the environmental variables. The formula interface is practical for selecting or transforming environmental variables.
Clarke & Ainsworth (1993) suggested this method to be used for selecting the best subset of environmental variables in interpreting results of nonmetric multidimensional scaling (NMDS). They recommended a parallel display of NMDS of community dissimilarities and NMDS of Euclidean distances from the best subset of scaled environmental variables. They warned against the use of Procrustes analysis, but to me this looks like a good way of comparing these two ordinations.
Clarke & Ainsworth wrote a computer program BIO-ENV giving the name to the current function. Presumably BIO-ENV was later incorporated in Clarke's PRIMER software (available for Windows). In addition, Clarke & Ainsworth suggested a novel method of rank correlation which is not available in the current function.
The function returns an object of class bioenv with a summary method.
Jari Oksanen. The code for selecting all possible subsets was posted to the R mailing list by Prof. B. D. Ripley in 1999.
Clarke, K. R. & Ainsworth, M. 1993. A method of linking multivariate community structure to environmental variables. Marine Ecology Progress Series, 92, 205–219.
vegdist, dist, cor for underlying routines, isoMDS for ordination, procrustes for Procrustes analysis, protest for an alternative, and rankindex for studying alternatives to the default Bray-Curtis index.
# The method is very slow for a large number of possible subsets.
# Therefore only 6 variables in this example.
data(varespec)
data(varechem)
sol <- bioenv(wisconsin(varespec) ~ log(N) + P + K + Ca + pH + Al, varechem)
sol
summary(sol)