bootstrap {analogue} R Documentation

Bootstrap estimation and errors

Description

Calculates bootstrap statistics for transfer function models: bootstrap estimates, model RMSEP, sample-specific errors for predictions, and summary statistics such as bias and R^2 between observed and estimated environment.

Usage


bootstrap(object, ...)

## Default S3 method:
bootstrap(object, ...)

## S3 method for class 'mat':
bootstrap(object, newdata, newenv, k,
          weighted = FALSE, n.boot = 1000, ...)

Arguments

object an R object for which bootstrap statistics are to be generated. Only objects of class "mat" are currently supported.
newdata a data frame containing samples for which bootstrap predictions and sample-specific errors are to be generated. May be missing; see Details. "newdata" must have the same number of columns as the training set data.
newenv a vector containing environmental data for the samples in "newdata". Used to calculate the full suite of errors for new data, such as a test set with known environmental values. May be missing; see Details. "newenv" must have one value per row of "newdata".
k numeric; how many modern analogues to use to generate the bootstrap statistics and, if requested, the predictions.
weighted logical; should the weighted mean of the environment for the "k" modern analogues be used instead of the mean?
n.boot Number of bootstrap samples to take.
... arguments passed to other methods.
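
The effect of k and weighted can be illustrated with a small base R sketch. The dissimilarities and environmental values below are made up, and the inverse-dissimilarity weights are an assumption used for illustration; they are not taken from the package source.

```r
# Hypothetical dissimilarities between one fossil sample and five
# training set samples, with the training set environmental values
dissim <- c(0.12, 0.30, 0.18, 0.55, 0.25)
env    <- c(5.1, 6.0, 5.4, 6.8, 5.7)

k <- 3
closest <- order(dissim)[seq_len(k)]  # the k closest analogues: samples 1, 3, 5

# weighted = FALSE: plain mean of the environment of the k closest analogues
est_mean <- mean(env[closest])        # 5.4

# weighted = TRUE: weighted mean; weights of 1 / dissimilarity are an assumption
est_wmean <- weighted.mean(env[closest], w = 1 / dissim[closest])
```

With these weights the closest analogue has the largest influence, so the weighted estimate is pulled towards its value. A dissimilarity of exactly zero would need special handling in this sketch.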

Details

bootstrap is a fairly flexible function and can be called with or without the arguments newdata and newenv.

If called with only object specified, then bootstrap estimates for the training set data are returned. In this case, the returned object will not include the predictions component.

If called with both object and newdata, then in addition to the above, bootstrap estimates for the new samples are also calculated and returned. In this case, the predictions component will contain the apparent and bootstrap-derived predictions and sample-specific errors for the new samples.

If called with object, newdata and newenv, then the full bootstrap object is returned (as described in the Value section below). With environmental data now available for the new samples, residuals, RMSE(P) and R^2 and bias statistics can be calculated.

The individual components of predictions are the same as the corresponding components for the training set data. For example, returned.object$predictions$bootstrap contains the same components as returned.object$bootstrap.

Environmental data are not usually available for the new samples for which predictions are required. In typical palaeolimnological studies the new samples come from sediment cores, for which past environmental values are unknown, so newenv cannot be supplied. However, if there are enough training set samples to justify splitting them into a training set and a test set, then newenv will be available for the test set. bootstrap can accommodate this extra information and calculate apparent and bootstrap estimates for the test set, allowing an independent assessment of the model RMSEP.
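
A training/test split of this kind might look like the following sketch. It assumes the swapdiat and swappH data sets shipped with the package; the test set size of 40 and the object names are arbitrary choices made for illustration.

```r
library(analogue)

data(swapdiat)
data(swappH)

## hold back 40 randomly chosen samples as a test set (arbitrary size)
set.seed(1)
test_idx <- sample(nrow(swapdiat), 40)

train_spp <- swapdiat[-test_idx, ]
train_env <- swappH[-test_idx]
test_spp  <- swapdiat[test_idx, ]
test_env  <- swappH[test_idx]

## fit the MAT model to the training samples only
train_mat <- mat(train_spp, train_env, method = "SQchord")

## bootstrap, supplying the known environment of the test samples
test_boot <- bootstrap(train_mat, newdata = test_spp, newenv = test_env,
                       n.boot = 100)

## bootstrap-derived RMSEP for the independent test set
test_boot$predictions$bootstrap$rmsep
```

Because the test samples play no part in fitting the model, the statistics stored in the predictions component give an assessment of model performance that is independent of the training set.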

Value

An object is returned containing some or all of the following components, depending on whether newdata and newenv are supplied.

observed vector of observed environmental values.
apparent a list containing the apparent (non-bootstrapped) estimates for the training set, with the following components:
  estimated Estimated values for "y", the environment.
  residuals Model residuals.
  r.squared Apparent R^2 between observed and estimated values of "y".
  avg.bias Average bias of the model residuals.
  max.bias Maximum bias of the model residuals.
  rmse Apparent error (RMSE) for the model.
  k numeric; the size of model used in estimates and predictions.
bootstrap a list containing the bootstrap estimates for the training set, with the following components:
  estimated Bootstrap estimates for "y".
  residuals Bootstrap residuals for "y".
  r.squared Bootstrap-derived R^2 between observed and estimated values of "y".
  avg.bias Average bias of the bootstrap-derived model residuals.
  max.bias Maximum bias of the bootstrap-derived model residuals.
  rmsep Bootstrap-derived RMSEP for the model.
  s1 Bootstrap-derived S1 error component for the model.
  s2 Bootstrap-derived S2 error component for the model.
  k numeric; the size of model used in estimates and predictions.
sample.errors a list containing the bootstrap-derived sample-specific errors for the training set, with the following components:
  rmsep Bootstrap-derived RMSEP for the training set samples.
  s1 Bootstrap-derived S1 error component for training set samples.
  s2 Bootstrap-derived S2 error component for training set samples.
weighted logical; whether the weighted mean was used instead of the mean of the environment for k-closest analogues.
auto logical; whether "k" was chosen automatically or user-selected.
n.boot numeric; the number of bootstrap samples taken.
predictions a list containing the apparent and bootstrap-derived estimates for the new data, with the following components:
  observed The observed values for the new samples; only present if newenv is provided.
  apparent A list containing the apparent (non-bootstrapped) estimates for the new samples, with the same components as apparent, above.
  bootstrap A list containing the bootstrap estimates for the new samples, with some or all of the same components as bootstrap, above.
  sample.errors A list containing the bootstrap-derived sample-specific errors for the new samples, with some or all of the same components as sample.errors, above.
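
How the s1, s2 and rmsep components relate to one another can be sketched in base R. The code below follows the general bootstrap scheme of Birks et al. (1990), in which RMSEP is the square root of s1^2 + s2^2. It is an illustration on simulated data with hypothetical helper functions (sq_chord, predict_mat), not the code used by bootstrap, and details may differ from the package.

```r
set.seed(42)

# Simulated training set: 30 samples, 5 "taxa" whose abundances track the environment
n   <- 30
env <- runif(n, 4.5, 7)
spp <- sapply(1:5, function(j) exp(-(env - (4 + 0.6 * j))^2) + runif(n, 0, 0.05))
spp <- spp / rowSums(spp)  # convert to proportions

# Squared chord distance between one sample x and every row of matrix m
sq_chord <- function(x, m) {
  rowSums((sqrt(m) - matrix(sqrt(x), nrow(m), ncol(m), byrow = TRUE))^2)
}

# Estimate the environment for x as the mean over its k closest analogues
predict_mat <- function(x, train_spp, train_env, k = 3) {
  d <- sq_chord(x, train_spp)
  mean(train_env[order(d)[seq_len(k)]])
}

n_boot <- 200
oob <- matrix(NA_real_, n, n_boot)  # out-of-bag predictions, one column per cycle
for (b in seq_len(n_boot)) {
  in_bag  <- sample.int(n, replace = TRUE)   # bootstrap training set
  out_bag <- setdiff(seq_len(n), in_bag)     # samples left out of this cycle
  for (i in out_bag) {
    oob[i, b] <- predict_mat(spp[i, ], spp[in_bag, , drop = FALSE], env[in_bag])
  }
}

boot_est <- rowMeans(oob, na.rm = TRUE)      # bootstrap estimate per sample
s1_i     <- apply(oob, 1, sd, na.rm = TRUE)  # spread of predictions per sample
s2       <- sqrt(mean((boot_est - env)^2))   # error of the mean predictions
rmsep_i  <- sqrt(s1_i^2 + s2^2)              # sample-specific RMSEP

s1    <- sqrt(mean(s1_i^2))                  # model-level S1
rmsep <- sqrt(s1^2 + s2^2)                   # model-level RMSEP
```

In this scheme s1 reflects the variability of the estimates across bootstrap cycles, while s2 reflects the systematic difference between the mean bootstrap estimates and the observed values.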

Author(s)

Gavin L. Simpson

References

Birks, H.J.B., Line, J.M., Juggins, S., Stevenson, A.C. and ter Braak, C.J.F. (1990). Diatoms and pH reconstruction. Philosophical Transactions of the Royal Society of London, Series B 327, 263–278.

See Also

mat, plot.mat, summary.bootstrap

Examples

## continue the RLGH and SWAP example from ?join
example(join)

## fit the MAT model using the squared chord distance measure
swap.mat <- mat(swapdiat, swappH, method = "SQchord")

## bootstrap training set
swap.boot <- bootstrap(swap.mat, n.boot = 100)
swap.boot
summary(swap.boot)

## bootstrap with predictions:
rlgh.boot <- bootstrap(swap.mat, rlgh, n.boot = 100)


[Package analogue version 0.3-3 Index]