prm_cv {chemometrics} | R Documentation |
Cross-validation (CV) is carried out with robust PLS based on partial robust M-regression. A plot with the choice for the optimal number of components is generated. This only works for univariate y-data.
prm_cv(X, y, a, fairct = 4, opt = "median", subset = NULL, segments = 10, segment.type = "random", trim = 0.2, sdfact = 2, plot.opt = TRUE)
X |
predictor matrix |
y |
response variable |
a |
number of PLS components |
fairct |
tuning constant, by default fairct=4 |
opt |
if "l1m" the mean centering is done by the l1-median, otherwise by the coordinate-wise median |
subset |
optional vector defining a subset of objects |
segments |
the number of segments to use or a list with segments (see mvrCv ) |
segment.type |
the type of segments to use. Ignored if 'segments' is a list |
trim |
trimming percentage for the computation of the SEP |
sdfact |
factor for the multiplication of the standard deviation for
the determination of the optimal number of components, see mvr_dcv |
plot.opt |
if TRUE a plot will be generated that shows the selection of the
optimal number of components for each step of the CV, see mvr_dcv |
A function for robust PLS based on partial robust M-regression is available at prm
.
The optimal number of robust PLS components is chosen according to the following criterion:
Within the CV scheme, the mean of the trimmed SEPs SEPtrimave is computed for each number of components,
as well as their standard errors SEPtrimse. Then one searches for the minimum of the SEPtrimave values
and adds sdfact*SEPtrimse. The optimal number of components is the most parsimonious model that
is below this bound.
predicted |
matrix with length(y) rows and a columns with predicted values |
SEPall |
vector of length a with SEP values for each number of components |
SEPtrim |
vector of length a with trimmed SEP values for each number of components |
SEPj |
matrix with segments rows and a columns with SEP values within the CV for each number of components |
SEPtrimj |
matrix with segments rows and a columns with trimmed SEP values within the CV for each number of components |
optcomp |
final optimal number of PLS components |
SEPop |
trimmed SEP value for final optimal number of PLS components |
Peter Filzmoser <P.Filzmoser@tuwien.ac.at>
K. Varmuza and P. Filzmoser: Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press. To appear.
data(cereal) set.seed(123) res <- prm_cv(cereal$X,cereal$Y[,1],a=10,segments=4,plot.opt=TRUE)