kernel.pls.cv {plsdof}		R Documentation

Cross-validation for Kernel Partial Least Squares

Description

This function computes the optimal model parameters for Kernel Partial Least Squares via cross-validation.

Usage

kernel.pls.cv(X, y, k = 10, m = ncol(X), type = "vanilla", sigma = 1)

Arguments

X matrix of predictor observations.
y vector of response observations. The length of y must equal the number of rows of X.
k number of cross-validation splits. Default is k=10.
m maximal number of Partial Least Squares components. Default is m=ncol(X).
type type of kernel. type="vanilla" is a linear kernel, and type="gaussian" is a Gaussian kernel. Default is type="vanilla".
sigma vector of kernel parameters. If type="gaussian", these are the candidate kernel widths. If the vanilla kernel is used, sigma is ignored. Default value is sigma=1.

Details

For the linear kernel (type="vanilla"), we standardize X to zero mean and unit variance. For the Gaussian kernel (type="gaussian"), we normalize X such that the range of each column is [-1,1].

The default value for sigma is in general NOT a sensible choice, and sigma should always be selected via cross-validation from a RANGE of values (see the Examples below). The default value for m is a sensible upper bound only for the vanilla kernel.

Value

cv.error cross-validated error. If type="vanilla", this is a vector of length m. If type="gaussian", this is a matrix of dimensions length(sigma) x m.
m.opt optimal number of Partial Least Squares components
sigma.opt optimal kernel parameter. This is only returned if type="gaussian".
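
For illustration, the minimizer of cv.error can also be located by hand; a minimal sketch for a Gaussian-kernel fit is given below. The object name pls.object is hypothetical, m.opt and sigma.opt already report these values, and the mapping from column index to number of components follows the package's indexing convention.

X <- matrix(rnorm(50 * 5), ncol = 5)
y <- rnorm(50)
sigma <- exp(seq(0, 4, length = 10))
pls.object <- kernel.pls.cv(X, y, m = 10, type = "gaussian", sigma = sigma)

cv <- pls.object$cv.error                  # length(sigma) x m matrix
best <- arrayInd(which.min(cv), dim(cv))   # row = sigma index, column = component index
sigma[best[1, 1]]                          # should agree with pls.object$sigma.opt
best[1, 2]                                 # should agree with pls.object$m.opt (up to indexing convention)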

Author(s)

Nicole Kraemer, Mikio L. Braun

See Also

kernel.pls.ic, kernel.pls

Examples

n<-50 # number of observations
p<-5 # number of variables
X<-matrix(rnorm(n*p),ncol=p)
y<-rnorm(n)

# compute linear PLS
linear.pls <- kernel.pls.cv(X, y, m = ncol(X), k = 5)

# compute nonlinear PLS over a range of kernel widths
sigma <- exp(seq(0, 4, length = 10))
nonlinear.pls <- kernel.pls.cv(X, y, m = 10, type = "gaussian", sigma = sigma)
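
# inspect the cross-validation results; these lines are a sketch that uses the
# components documented under Value (sigma.opt is only returned for type="gaussian")
linear.pls$m.opt             # optimal number of components, linear kernel
nonlinear.pls$m.opt          # optimal number of components, Gaussian kernel
nonlinear.pls$sigma.opt      # optimal kernel width
dim(nonlinear.pls$cv.error)  # length(sigma) x m matrix of cross-validated errors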
