IKFA {rioja} | R Documentation |
Functions for reconstructing (predicting) environmental values from biological assemblages using Imbrie & Kipp Factor Analysis (IKFA), as used in palaeoceanography.
IKFA(y, x, nFact = 5, IsPoly = FALSE, IsRot = TRUE, ccoef = 1:nFact, check.data=TRUE, lean=FALSE, ...)

IKFA.fit(y, x, nFact = 5, IsPoly = FALSE, IsRot = TRUE, ccoef = 1:nFact, lean=FALSE)

## S3 method for class 'IKFA':
predict(object, newdata=NULL, sse=FALSE, nboot=100, match.data=TRUE, verbose=TRUE, ...)

communality <- function(object, y)

## S3 method for class 'IKFA':
crossval(object, cv.method="loo", verbose=TRUE, ngroups=10, nboot=100, ...)

## S3 method for class 'IKFA':
performance(object, ...)

## S3 method for class 'IKFA':
rand.t.test(object, n.perm=999, ...)

## S3 method for class 'IKFA':
screeplot(x, rand.test=TRUE, ...)

## S3 method for class 'IKFA':
print(x, ...)

## S3 method for class 'IKFA':
summary(object, full=FALSE, ...)

## S3 method for class 'IKFA':
plot(x, resid=FALSE, xval=FALSE, nFact=max(x$ccoef), xlab="", ylab="", ylim=NULL, xlim=NULL, add.ref=TRUE, add.smooth=FALSE, ...)

## S3 method for class 'IKFA':
residuals(object, ...)

## S3 method for class 'IKFA':
coef(object, ...)

## S3 method for class 'IKFA':
fitted(object, ...)
y |
a data frame or matrix of biological abundance data. |
x, object |
a vector of environmental values to be modelled or an object of class IKFA. |
newdata |
new biological data to be predicted. |
nFact |
number of factors to extract. |
IsRot |
logical to rotate factors. |
ccoef |
vector of factor numbers to include in the predictions. |
IsPoly |
logical to include quadratic terms of the factors as predictors in the regression. |
check.data |
logical to perform simple checks on the input data. |
match.data |
logical indicating whether the function will match two species datasets by their column names. You should only set this to FALSE if you are sure the column names match exactly. |
lean |
logical to exclude some output from the resulting models (used when cross-validating to speed calculations). |
full |
logical to show head and tail of output in summaries. |
resid |
logical to plot residuals instead of fitted values. |
xval |
logical to plot cross-validation estimates. |
xlab, ylab, xlim, ylim |
additional graphical arguments to plot.IKFA. |
add.ref |
add 1:1 line on plot. |
add.smooth |
add loess smooth to plot. |
cv.method |
cross-validation method, either "loo", "lgo" or "bootstrap". |
verbose |
logical or integer to show feedback during cross-validation. If TRUE, print feedback every 50 cycles; if an integer, print feedback at that interval. |
nboot |
number of bootstrap samples. |
ngroups |
number of groups in leave-group-out cross-validation, or a vector containing leave-out group membership. |
sse |
logical indicating that sample specific errors should be calculated. |
rand.test |
logical to perform a randomisation t-test to test significance of cross validated factors. |
n.perm |
number of permutations for randomisation t-test. |
... |
additional arguments. |
Function IKFA performs Imbrie and Kipp Factor Analysis, a form of Principal Components Regression (Imbrie & Kipp 1971).
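To make the "principal components regression" idea concrete, the following base R sketch extracts a few factors from row-normalised abundance data and regresses an environmental variable on them. It is an illustration under stated assumptions, not the code used by IKFA: the data are simulated, the object names (env, optima, spec, scores) are hypothetical, and no factor rotation or polynomial terms are applied.

```r
# Illustrative sketch only: NOT the rioja implementation; data are simulated.
set.seed(1)
n <- 50                                   # number of samples
m <- 8                                    # number of species
env <- runif(n, 5, 25)                    # hypothetical environmental variable
optima <- seq(5, 25, length.out = m)      # hypothetical species optima

# Simulate unimodal species responses along the gradient, plus a little noise
spec <- sapply(optima, function(u) exp(-(env - u)^2 / (2 * 4^2)))
spec <- spec + matrix(runif(n * m, 0, 0.05), n, m)
spec <- spec / rowSums(spec)              # convert to relative abundances

# Normalise each sample (row) to unit length
U <- spec / sqrt(rowSums(spec^2))

# Extract the first nFact factors with a singular value decomposition
nFact <- 3
sv <- svd(U)
scores <- U %*% sv$v[, 1:nFact]           # sample scores on the factors

# Regress the environmental variable on the factor scores
dat <- data.frame(env = env, scores)
mod <- lm(env ~ ., data = dat)
rmse <- sqrt(mean(residuals(mod)^2))      # apparent (training set) RMSE
```

The apparent RMSE computed this way is optimistic; cross-validation gives a more realistic error estimate.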
Function predict predicts values of the environmental variable for newdata, or returns the fitted (predicted) values from the original modern dataset if newdata is NULL. Variables are matched between the training set and newdata by column name (if match.data is TRUE). Use compare.datasets to assess conformity of two species datasets and identify possible no-analogue samples.
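The general idea of matching two species datasets by column name can be sketched in base R as below. This is an assumption about what such matching involves, not a description of rioja's internal code: here taxa missing from the new data are filled with zeros and taxa absent from the training set are dropped. The objects train, newd and aligned are hypothetical.

```r
# Illustrative sketch only: one plausible way to align columns by name.
train <- matrix(1:6, nrow = 2,
                dimnames = list(NULL, c("spA", "spB", "spC")))
newd  <- matrix(1:4, nrow = 2,
                dimnames = list(NULL, c("spC", "spX")))

# Taxa present in both datasets
common <- intersect(colnames(train), colnames(newd))

# Build a matrix with the training set's columns, zero-filled,
# then copy across the taxa the two datasets share
aligned <- matrix(0, nrow(newd), ncol(train),
                  dimnames = list(rownames(newd), colnames(train)))
aligned[, common] <- newd[, common]
```

If the column names of the two datasets already agree exactly, this step is redundant, which is the situation in which match.data=FALSE is appropriate.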
IKFA has methods fitted and residuals that return the fitted values (estimates) and residuals for the training set; performance, which returns summary performance statistics (see below); coef, which returns the species coefficients; and print and summary to summarise the output. IKFA also has a plot method that produces scatter plots of predicted vs observed measurements for the training set.
Function rand.t.test performs a randomisation t-test to test the significance of the cross-validated components after van der Voet (1994).
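The logic of a randomisation test that compares the prediction errors of two models can be sketched in base R as follows. This is a simplified sign-flipping test in the spirit of van der Voet (1994), not the code used by rand.t.test, and the residual vectors e1 and e2 are simulated, hypothetical values.

```r
# Illustrative sketch only: sign-flipping randomisation test on the
# difference in squared prediction errors between two models.
set.seed(42)
n <- 40
e1 <- rnorm(n, sd = 1.5)   # hypothetical CV residuals, model with k factors
e2 <- rnorm(n, sd = 1.0)   # hypothetical CV residuals, model with k + 1 factors

d <- e1^2 - e2^2           # per-sample difference in squared error
t.obs <- mean(d)           # observed statistic: mean reduction in squared error

# Under the null hypothesis of equal accuracy the sign of each d is arbitrary,
# so build the reference distribution by flipping signs at random
n.perm <- 999
t.perm <- replicate(n.perm, mean(d * sample(c(-1, 1), n, replace = TRUE)))

# One-sided p-value: proportion of permuted statistics at least as large
# as the observed one (the observed value counts as one permutation)
p.value <- (sum(t.perm >= t.obs) + 1) / (n.perm + 1)
```

A small p-value suggests that the extra factor gives a real reduction in prediction error rather than one that could arise by chance.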
Function screeplot displays the RMSE of prediction for the training set as a function of the number of factors and is useful for estimating the optimal number for use in prediction. By default screeplot will also carry out a randomisation t-test and add a line to the scree plot indicating the percentage change in RMSE with each component, annotated with the p-value from the randomisation test.
Function IKFA returns an object of class IKFA with the following named elements:
coefficients |
species coefficients (the updated "optima"). |
meanY |
weighted mean of the environmental variable. |
iswapls |
logical indicating whether analysis was IKFA (TRUE) or PLS (FALSE). |
T |
sample scores. |
P |
variable (species) scores. |
npls |
number of pls components extracted. |
fitted.values |
fitted values for the training set. |
call |
original function call. |
x |
environmental variable used in the model. |
standx, meanT, sdx |
additional information returned for a PLS model. |
predicted |
predicted values of each training set sample under cross-validation. |
residuals.cv |
prediction residuals. |
fit |
predicted values for newdata . |
fit.boot |
mean of the bootstrap estimates of newdata. |
v1 |
squared standard error of the bootstrap estimates for each new sample. |
v2 |
mean squared error for the training set samples, across all bootstrap samples. |
SEP |
standard error of prediction, calculated as the square root of (v1 + v2). |
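As a worked illustration of how the two error components combine, the base R lines below compute SEP from v1 and v2. The numbers are invented for the example, and treating v2 as a single value shared by all new samples is an assumption made here for simplicity.

```r
# Illustrative sketch only: hypothetical values, not output from predict().
v1 <- c(0.20, 0.35, 0.15)  # squared bootstrap standard error, one per new sample
v2 <- 1.10                 # mean squared error for the training set (assumed scalar)

SEP <- sqrt(v1 + v2)       # standard error of prediction for each new sample
```

Because v2 is added to every sample, SEP can never fall below sqrt(v2), however stable the bootstrap estimates for an individual sample are.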
Function performance returns a matrix of performance statistics for the IKFA model. See performance for a description of the summary.
Function rand.t.test returns a matrix of performance statistics together with columns indicating the p-value and percentage change in RMSE with each higher component (see van der Voet (1994) for details).
Steve Juggins
Imbrie, J. & Kipp, N.G. (1971). A new micropaleontological method for quantitative paleoclimatology: application to a Late Pleistocene Caribbean core. In The Late Cenozoic Glacial Ages (ed K.K. Turekian), pp. 77-181. Yale University Press, New Haven.
van der Voet, H. (1994). Comparing the predictive accuracy of models using a simple randomization test. Chemometrics and Intelligent Laboratory Systems, 25, 313-323.
WA, MAT, performance, and compare.datasets for diagnostics.
library(rioja)

# load the Imbrie & Kipp dataset
data(IK)
spec <- IK$spec
SumSST <- IK$env$SumSST
core <- IK$core

# fit the model and print a summary
fit <- IKFA(spec, SumSST)
fit

# cross-validate model
fit.cv <- crossval(fit, cv.method="lgo")

# How many components to use?
screeplot(fit.cv)

# predict the core
pred <- predict(fit, core)

# plot predictions - depths are in rownames
depth <- as.numeric(rownames(core))
plot(depth, pred$fit[, 2], type="b")

# fit using only factors 1, 2, 4, & 5
# and using polynomial terms
# as Imbrie & Kipp (1971)
fit2 <- IKFA(spec, SumSST, ccoef=c(1, 2, 4, 5), IsPoly=TRUE)
fit2.cv <- crossval(fit2, cv.method="lgo")
screeplot(fit2.cv)

## Not run:
# predictions with sample specific errors
# takes approximately 1 minute to run
pred <- predict(fit, core, sse=TRUE, nboot=1000)
pred
## End(Not run)