csi {kernlab} | R Documentation |
The csi
function in kernlab is an implementation of an
incomplete Cholesky decomposition algorithm which exploits side
information (e.g., classification labels, regression responses) to
compute a low rank decomposition of a kernel matrix from the data.
## S4 method for signature 'matrix':
csi(x, y, kernel = "rbfdot", kpar = list(sigma = 0.1), rank, centering = TRUE, kappa = 0.99, delta = 40, tol = 1e-5)
x |
The data matrix indexed by row |
y |
The classification labels or regression responses. In classification, y is an m-by-n matrix, where m is the number of data points and n is the number of classes, and y_i is 1 if the corresponding x belongs to class i. |
kernel |
the kernel function used in training and predicting.
This parameter can be set to any function, of class kernel ,
which computes the inner product in feature space between two
vector arguments. kernlab provides the most popular kernel functions
which can be used by setting the kernel parameter to the following
strings:
|
kpar |
the list of hyper-parameters (kernel parameters).
This is a list which contains the parameters to be used with the
kernel function. Valid parameters for existing kernels are :
|
rank |
maximal rank of the computed kernel matrix |
centering |
if TRUE centering is performed (default: TRUE) |
kappa |
trade-off between approximation of K and prediction of Y (default: 0.99) |
delta |
number of columns of the Cholesky decomposition computed in advance (default: 40) |
tol |
minimum gain at each iteration (default: 1e-5) |
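The indicator-matrix layout described for y can be built from a factor of class labels. The following is a minimal sketch in base R, assuming the labels are stored as a factor (here the Species column of iris); the variable names are illustrative only.

```r
# Build an m-by-n 0/1 indicator matrix from a factor of class labels.
labels <- iris$Species                       # factor with 3 levels, 150 entries
ymat <- matrix(0, nrow = length(labels), ncol = nlevels(labels))

# Matrix indexing: one (row, column) pair per observation.
ymat[cbind(seq_along(labels), as.integer(labels))] <- 1
colnames(ymat) <- levels(labels)

dim(ymat)
head(ymat)
```

Each row then contains exactly one 1, in the column of the class the observation belongs to.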
An incomplete Cholesky decomposition calculates
Z where K = ZZ', K being the kernel matrix.
Since the rank of a kernel matrix is usually low, Z tends to
be smaller than the complete kernel matrix. The decomposed matrix can be
used to create memory-efficient kernel-based algorithms without the
need to compute and store a complete kernel matrix in memory.
csi
uses the class labels or regression responses to compute a
more appropriate approximation for the problem at hand, taking the
additional information from the response variable into account.
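To make the relation K = ZZ' concrete, here is a minimal base R sketch of a plain pivoted incomplete Cholesky decomposition. It is not the algorithm implemented by csi: it ignores the side information in y and simply picks the largest residual diagonal entry as the next pivot. The helper name inc_chol and its arguments are hypothetical, and the full kernel matrix is formed here only so the approximation error can be checked.

```r
# Plain pivoted incomplete Cholesky (no side information, illustration only).
# Returns an n-by-r matrix Z such that K is approximately Z %*% t(Z).
inc_chol <- function(K, max_rank, tol = 1e-8) {
  n <- nrow(K)
  Z <- matrix(0, n, max_rank)      # columns not yet computed stay zero
  d <- diag(K)                     # diagonal of the residual K - Z Z'
  pivots <- integer(0)
  for (j in seq_len(max_rank)) {
    p <- which.max(d)              # greedy pivot: largest residual diagonal
    if (d[p] < tol) break          # remaining residual is negligible
    # Column p of the residual matrix, scaled by the square root of the pivot.
    resid_col <- K[, p] - drop(Z %*% Z[p, ])
    Z[, j] <- resid_col / sqrt(d[p])
    d <- pmax(d - Z[, j]^2, 0)     # update residual diagonal, guard round-off
    pivots <- c(pivots, p)
  }
  list(Z = Z[, seq_along(pivots), drop = FALSE], pivots = pivots)
}

# RBF kernel matrix on the iris measurements, k(x, y) = exp(-sigma * ||x - y||^2).
X <- as.matrix(iris[, -5])
sigma <- 0.1
K <- exp(-sigma * as.matrix(dist(X))^2)

fit <- inc_chol(K, max_rank = 30)
Z <- fit$Z
dim(Z)                             # n rows, at most 30 columns
max(abs(K - tcrossprod(Z)))        # largest entrywise approximation error
```

Storing Z requires n times r numbers instead of the n times n entries of K, which is where the memory saving comes from when r is much smaller than n.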
An S4 object of class "csi", which is an extension of the class "matrix". The object is the decomposed kernel matrix along with the slots:
pivots |
Indices on which pivots were done |
diagresidues |
Residuals left on the diagonal |
maxresiduals |
Residuals picked for pivoting |
predgain |
predicted gain before adding each column |
truegain |
actual gain after adding each column |
Q |
Q factor of the QR decomposition of the kernel matrix |
R |
R factor of the QR decomposition of the kernel matrix |
Slots can be accessed either by object@slot
or by accessor functions with the same name
(e.g., pivots(object)).
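Both access styles can be compared side by side. The sketch below assumes kernlab is installed (the code is skipped otherwise) and reuses the iris setup from this page; the variable names x, y, and Z are illustrative.

```r
# Compare direct slot access with the accessor function of the same name.
if (requireNamespace("kernlab", quietly = TRUE)) {
  library(kernlab)

  x <- as.matrix(iris[, -5])
  y <- matrix(0, nrow(x), nlevels(iris$Species))
  y[cbind(seq_len(nrow(x)), as.integer(iris$Species))] <- 1

  Z <- csi(x, y, kernel = rbfdot(sigma = 0.1), rank = 30)

  print(Z@pivots)                          # direct slot access
  print(pivots(Z))                         # accessor function
  print(identical(Z@pivots, pivots(Z)))    # both return the same slot content
}
```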
Alexandros Karatzoglou (based on Matlab code by
Francis Bach)
alexandros.karatzoglou@ci.tuwien.ac.at
library(kernlab)
data(iris)

## create multidimensional y matrix
yind <- t(matrix(1:3, 3, 150))
ymat <- matrix(0, 150, 3)
ymat[yind == as.integer(iris[, 5])] <- 1

datamatrix <- as.matrix(iris[, -5])

# initialize kernel function
rbf <- rbfdot(sigma = 0.1)
rbf

Z <- csi(datamatrix, ymat, kernel = rbf, rank = 30)
dim(Z)
pivots(Z)

# calculate kernel matrix
K <- crossprod(t(Z))

# difference between approximated and real kernel matrix
(K - kernelMatrix(kernel = rbf, datamatrix))[6, ]