crossval {pls}                                        R Documentation

Cross-validation of PLSR and PCR models

Description

A “stand alone” cross-validation function for mvr objects.

Usage

crossval(object, segments = 10,
         segment.type = c("random", "consecutive", "interleaved"),
         length.seg, trace = 15, ...)

Arguments

object an mvr object; the regression to cross-validate.
segments the number of segments to use, or a list with segments (see below).
segment.type the type of segments to use. Ignored if segments is a list.
length.seg Positive integer. The length of the segments to use. If specified, it overrides segments unless segments is a list.
trace if TRUE, tracing is turned on. If numeric, it denotes a time limit (in seconds). If the estimated total time of the cross-validation exceeds this limit, tracing is turned on.
... additional arguments, sent to the underlying fit function.

Details

This function performs cross-validation on a model fit by mvr. It can handle models such as plsr(y ~ msc(X), ...) or other models where the predictor variables need to be recalculated for each segment. When recalculation is not needed, the result of crossval(mvr(...)) is identical to mvr(..., validation = "CV"), but slower.
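
The idea of recalculating the predictors for each segment can be sketched in a plain base-R loop. This is only an illustration of the pattern, not the code crossval uses: the data are simulated, lm stands in for the PLSR/PCR fit, and column centring and scaling stands in for a data-dependent transformation such as msc. All variable names are hypothetical.

```r
set.seed(42)
n <- 30
X <- matrix(rnorm(n * 3), n, 3)                      # simulated predictors
y <- drop(X %*% c(1, -0.5, 0.25)) + rnorm(n, sd = 0.1)
segments <- split(seq_len(n), rep_len(1:5, n))       # 5 interleaved segments

pred <- numeric(n)
for (seg in segments) {
  Xtrain <- X[-seg, , drop = FALSE]
  Xtest  <- X[seg, , drop = FALSE]
  ## Preprocessing statistics are computed from the training rows only ...
  ctr <- colMeans(Xtrain)
  scl <- apply(Xtrain, 2, sd)
  ## ... and then applied to both the training and the left-out rows
  train <- data.frame(y = y[-seg], scale(Xtrain, center = ctr, scale = scl))
  test  <- data.frame(scale(Xtest, center = ctr, scale = scl))
  fit <- lm(y ~ ., data = train)                     # stand-in for the mvr fit
  pred[seg] <- predict(fit, newdata = test)
}
msep_cv <- mean((y - pred)^2)                        # cross-validated MSEP
```

The left-out rows never contribute to the preprocessing statistics, which is the property that matters when the transformation depends on the data.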

If segments is a list, the arguments segment.type and length.seg are ignored. The elements of the list should be integer vectors specifying the indices of the segments. See cvsegments for details.

Otherwise, segments of type segment.type are generated. The number of segments is set either directly through segments or indirectly by giving the segment length in length.seg. If both are specified, segments is ignored.
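
As a rough illustration of the three segment types, the following base-R sketch builds a list of integer index vectors of the kind described above. The helper make_segments is hypothetical and is not the cvsegments implementation; the conversion from a segment length to a segment count on the last lines is likewise an assumption about one reasonable rule.

```r
## Hypothetical helper: split 1..n into k segments of the given type
make_segments <- function(n, k,
                          type = c("random", "consecutive", "interleaved")) {
  type <- match.arg(type)
  labels <- rep_len(seq_len(k), n)          # 1, 2, ..., k, 1, 2, ...
  groups <- switch(type,
    random      = sample(labels),           # shuffled segment labels
    consecutive = sort(labels),             # contiguous blocks
    interleaved = labels                    # every k-th observation
  )
  unname(split(seq_len(n), groups))
}

set.seed(1)
n <- 10
segs_inter <- make_segments(n, k = 3, type = "interleaved")
segs_cons  <- make_segments(n, k = 3, type = "consecutive")
segs_rand  <- make_segments(n, k = 3, type = "random")

## Assumed rule for turning a segment length into a segment count
length.seg <- 4
k_from_length <- ceiling(n / length.seg)
```

A list in this shape, with each element holding the indices of one segment, is what the segments argument accepts when it is given as a list.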

The R2 component returned is calculated as the squared correlation between the cross-validated predictions and the responses.
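
A small sketch of this definition, using made-up numbers for a single response and a single model size (the vectors y and y_pred are hypothetical, not output from crossval):

```r
y      <- c(2.1, 3.4, 1.8, 4.0, 2.9)   # observed responses (toy values)
y_pred <- c(2.0, 3.0, 2.2, 3.7, 3.1)   # cross-validated predictions (toy values)

## R2 as the squared correlation between predictions and responses
r2_cv <- cor(y_pred, y)^2

## For comparison: 1 - SSE/SST, which in general gives a different number
r2_alt <- 1 - sum((y - y_pred)^2) / sum((y - mean(y))^2)
```

Because it is a squared correlation, this R2 always lies between 0 and 1, even when the predictions are biased.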

When tracing is turned on, the segment number is printed for each segment.

Value

The supplied object is returned, with an additional component validation, which is a list with components

method equals "CV" for cross-validation.
pred an array with the cross-validated predictions.
MSEP0 a vector of MSEP values (one for each response variable) for a model with zero components, i.e., only the intercept.
MSEP a matrix of MSEP values for models with 1, ..., ncomp components. Each row corresponds to one response variable.
adj a matrix of adjustment values for calculating bias-corrected MSEP. It is used by the MSEP function.
R2 a matrix of R2 values for models with 1, ..., ncomp components. Each row corresponds to one response variable.
segments the list of segments used in the cross-validation.
ncomp the number of components.

Note

The MSEP0 is always cross-validated using leave-one-out cross-validation. This usually makes little difference in practice, but should be fixed for correctness.
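
What leave-one-out cross-validation of the zero-component model amounts to can be sketched in base R for a single response. The data are made up, and this is an illustration of the calculation, not the package's code: with only an intercept, the prediction for observation i is the mean of the remaining responses.

```r
y <- c(2.1, 3.4, 1.8, 4.0, 2.9)   # toy response values
n <- length(y)

## Leave-one-out prediction from the intercept-only model
loo_pred <- sapply(seq_len(n), function(i) mean(y[-i]))
msep0_loo <- mean((y - loo_pred)^2)

## Equivalent closed form: each LOO residual is n/(n-1) times the
## ordinary residual y[i] - mean(y)
msep0_closed <- (n / (n - 1))^2 * mean((y - mean(y))^2)
```

The two values agree up to floating-point rounding, which shows the leave-one-out estimate can be obtained without an explicit loop.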

Author(s)

Ron Wehrens and Bjørn-Helge Mevik

References

Mevik, B.-H., Cederkvist, H. R. (2004) Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics, 18(9), 422–429.

See Also

mvr, mvrCv, cvsegments, MSEP

Examples

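library(pls)
data(NIR)
## Fit a PCR model with 6 components; msc() is part of the model formula
NIR.pcr <- pcr(y ~ msc(X), 6, data = NIR)
## Cross-validate with 10 random segments; msc() is recalculated per segment
NIR.cv <- crossval(NIR.pcr, segments = 10)
## Plot the cross-validated MSEP against the number of components
plot(MSEP(NIR.cv))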

[Package pls version 1.2-0 Index]