wapls {paltran} | R Documentation |
This function computes, from a given training set and environmental parameter, a weighted averaging partial least squares (WA-PLS) transfer function as used in paleolimnology. To estimate the prediction error of the model, 10-fold cross-validation, bootstrap, or leave-one-out can be chosen.
wapls(..., comp = 4, d.plot = TRUE, plot.comp = "RMSEP", env.trans = FALSE, spec.trans = FALSE, diagno = TRUE, seed = 1, run = 10, val = c("none", "10-cross", "loo", "boot"), scale = FALSE, out = TRUE, drop.non.sig = FALSE, min.occ = 1)
... |
Required: x, y: a matrix or data frame of the species training set (x) and a vector or data frame of the related environmental parameter (y). Optional: core samples (z), a vector or data frame of species data from a sediment core. |
comp |
number of components that will be calculated |
d.plot |
TRUE/FALSE: if TRUE diagnostic plots are given at the end of the analysis. |
plot.comp |
if "RMSEP" is chosen, the diagnostic plot is given for the component with the lowest RMSEP |
env.trans |
should the environmental parameter be transformed? "sqrt" (square root) and "log10" (base-10 logarithm) are possible choices; default is FALSE. |
spec.trans |
should the species data be transformed? "sqrt" (square root) and "log10" (base-10 logarithm) are possible choices; default is FALSE. |
diagno |
should N2 and the number of non-zero values be calculated for the training set and test set? Default is TRUE |
seed |
set the seed for the random number generator (used with "boot" or "10-cross"), default = 1 |
run |
if "boot" or "10-cross" is chosen: number of cycles to run |
val |
validation method: one of "none", "boot" (bootstrap), "loo" (leave-one-out), or "10-cross" (10-fold cross-validation) |
scale |
should the data be scaled up to 100 percent? (Default is FALSE) |
out |
should the results be printed on the console? |
drop.non.sig |
should taxa that have a non-significant response to the environmental variable be deleted? Whether there is a significant relation between a taxon and the environmental variable of interest is tested using a generalized additive model (GAM) from the package mgcv. As a GAM only works if a taxon occurs several times, only those taxa are included that occur more than 5 times (k=3). |
min.occ |
minimum occurrence: all taxa with fewer than min.occ occurrences will be deleted from the training set |
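The species-related options above (scale, spec.trans, min.occ) can be pictured with a small base R sketch. This is not the paltran code: `prep_species` is a hypothetical helper, and the order of the steps, the definition of an occurrence as a non-zero abundance, and the `+ 1` offset inside the logarithm are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of paltran: imitates the documented options
# scale, spec.trans and min.occ on a samples-by-taxa abundance matrix.
prep_species <- function(x, scale = FALSE, spec.trans = FALSE, min.occ = 1) {
  x <- as.matrix(x)
  # min.occ (assumed meaning): keep taxa with a non-zero abundance
  # in at least min.occ samples
  occ <- colSums(x > 0)
  x <- x[, occ >= min.occ, drop = FALSE]
  # scale: rescale every sample (row) so that its abundances sum to 100
  if (scale) x <- 100 * x / rowSums(x)
  # spec.trans: optional transformation of the abundances
  if (identical(spec.trans, "sqrt")) x <- sqrt(x)
  # the + 1 offset is an assumption so that zero abundances stay finite
  if (identical(spec.trans, "log10")) x <- log10(x + 1)
  x
}

# Made-up abundances: 3 samples, 3 taxa
spec <- matrix(c(10, 0, 5,
                 20, 0, 0,
                 30, 4, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("s", 1:3), c("taxA", "taxB", "taxC")))

scaled   <- prep_species(spec, scale = TRUE)   # every row sums to 100
filtered <- prep_species(spec, min.occ = 2)    # only taxA occurs twice or more
```

Filtering before scaling means the percentages are computed over the retained taxa only; the real function may well do this in a different order.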
The 10-fold cross-validation is much slower than the bootstrap or leave-one-out, because 10 times more wapls runs must be performed than with, for example, the bootstrap (for the same number of runs). The RMSEP of leave-one-out is slightly different from that of C2. In this algorithm, before each run of the loop the new training set (each time one sample is taken out) is checked for species with only zero values, and these species are removed (the same procedure as in wa, where C2 does the same). If that step is deleted from the algorithm, the results for LOO are equal. As C2 and R run with different random numbers, the results of bootstrap and 10-fold cross-validation are only equal when using a high number of runs.
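The leave-one-out step described above can be sketched as follows. To keep the example short it uses plain weighted averaging (WA) without deshrinking as a stand-in for the WA-PLS fit; `wa_fit`, `wa_predict` and `loo_rmsep` are hypothetical names, the data are made up, and none of this is taken from the paltran source.

```r
# Plain weighted averaging (WA) without deshrinking, used here only as a
# simple stand-in for the WA-PLS fit. All names are hypothetical.
wa_fit <- function(x, y) colSums(x * y) / colSums(x)       # taxon optima
wa_predict <- function(optima, x) as.vector(x %*% optima) / rowSums(x)

loo_rmsep <- function(x, y) {
  x <- as.matrix(x)
  pred <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    train <- x[-i, , drop = FALSE]
    # drop taxa that have no occurrence left once sample i is taken out;
    # their optimum would otherwise be 0/0
    keep <- colSums(train) > 0
    optima <- wa_fit(train[, keep, drop = FALSE], y[-i])
    pred[i] <- wa_predict(optima, x[i, keep, drop = FALSE])
  }
  sqrt(mean((pred - y)^2))
}

# Made-up data: taxC occurs only in sample 4, so it is dropped in that fold
x <- matrix(c(10,  0, 0,
               8,  2, 0,
               5,  5, 0,
               2,  8, 1,
               0, 10, 0),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("s", 1:5), c("taxA", "taxB", "taxC")))
y <- c(1, 2, 3, 4, 5)

rmsep <- loo_rmsep(x, y)
```

Without the `keep` filter, the optimum of taxC would be NaN in the fold that leaves out sample 4, and the prediction for that sample would be NaN as well.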
species in train.set |
Number of non zero species in each sample of the training set |
N2 train.set |
Hill's N2 of each sample of the training set |
updated opt. |
updated optima (see reference) |
sample scores |
sample scores of the training set |
inferred train.set |
inferred environmental parameter for the training set |
performance |
performance of the WA-PLS regression |
inferred train.set.val |
inferred environmental parameter for the training set using leave-one-out |
species in core.samples |
Number of non zero species in each sample of the core data set |
n species core.samples in train.set |
How many species in the core samples are represented in the training set |
N2 in core.samples |
Hill's N2 of each sample of the core data |
reconstruction_core.samples |
reconstructed environmental parameter for the samples of the core |
mean(reconstruction_core.samples).val |
mean reconstructed environmental parameter for the samples of the core using "boot" or "loo" |
sd(reconstruction_core.samples).val |
standard deviation of the reconstructed environmental parameter for the samples of the core using "boot" or "loo" |
s1 (boot) |
component s1 of the bootstrap |
s2 (boot) |
component s2 of the bootstrap |
mean(inferred train.set).val |
mean inferred environmental variable for the training set using "boot" |
sd(inferred train.set).val |
standard deviation of inferred environmental variable for the training set using "boot" |
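For orientation, the two per-sample diagnostics listed above (number of non-zero species and Hill's N2) can be computed in base R as in the sketch below. Hill's N2 is the reciprocal of the sum of squared proportions within a sample. The function names `n_species` and `hill_n2` are hypothetical and the code is not taken from paltran.

```r
# Hypothetical helpers, not paltran code.
# Number of taxa with non-zero abundance in each sample (row)
n_species <- function(x) rowSums(as.matrix(x) > 0)

# Hill's N2 per sample: 1 / sum(p^2), with p the within-sample proportions
hill_n2 <- function(x) {
  x <- as.matrix(x)
  p <- x / rowSums(x)
  1 / rowSums(p^2)
}

# Made-up samples: one perfectly even, one dominated by a single taxon
m <- rbind(even   = c(25, 25, 25, 25),
           single = c(100, 0, 0, 0))

n_species(m)  # 4 and 1
hill_n2(m)    # 4 and 1
```

N2 equals the number of taxa when all are equally abundant and falls towards 1 as a single taxon dominates, which is why it is often read as an effective number of species.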
Sven Adler
ter Braak, C.J.F. & Juggins, S. 1993. Weighted averaging partial least squares regression (WA-PLS): an improved method for reconstructing environmental variables from species assemblages. Hydrobiologia 269:485-502.
package analogue by G. Simpson
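# load the species training set, the environmental data and the core samples
data(train_set.MV)
data(train_env.MV)
data(dud.df)
# fit the WA-PLS transfer function with bootstrap validation
fit <- wapls(train_set.MV, train_env.MV, dud.df, val = "boot")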