mgsim {plsgenomics} | R Documentation |
The function mgsim
performs prediction using Lambert-Lacroix and Peyre's MGSIM algorithm.
mgsim(Ytrain,Xtrain,Lambda,h,Xtest=NULL,NbIterMax=50)
Xtrain |
a (ntrain x p) data matrix of predictors. Xtrain must be a matrix.
Each row corresponds to an observation and each column to a predictor variable. |
Ytrain |
a ntrain vector of responses. Ytrain must be a vector.
Ytrain is a {1,...,c+1}-valued vector and contains the response variable for each
observation. c+1 is the number of classes. |
Xtest |
a (ntest x p) matrix containing the predictors for the test data
set. Xtest may also be a vector of length p (corresponding to only one
test observation). If Xtest is not equal to NULL, then the prediction
step is made for these new predictor variables. |
Lambda |
a positive real value. Lambda is the ridge regularization parameter. |
h |
a strictly positive real value. h is the bandwidth for GSIM step A. |
NbIterMax |
a positive integer. NbIterMax is the maximal number of iterations in the
Newton-Raphson parts. |
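The argument requirements above can be checked before calling mgsim. The following is a minimal sketch only: check_mgsim_inputs is a hypothetical helper that is not part of plsgenomics, and the data are synthetic. It checks the shape of Xtrain and Ytrain, checks that the labels lie in {1,...,c+1}, and turns a length-p test vector into a (1 x p) matrix.

```r
# Hypothetical helper (NOT part of plsgenomics): sanity-check inputs
# against the argument descriptions before calling mgsim.
check_mgsim_inputs <- function(Ytrain, Xtrain, Xtest = NULL) {
  stopifnot(is.matrix(Xtrain), is.vector(Ytrain),
            length(Ytrain) == nrow(Xtrain))
  # Labels are expected to take values in {1, ..., c+1}
  n_classes <- max(Ytrain)
  stopifnot(all(Ytrain %in% seq_len(n_classes)))
  # A single test observation given as a length-p vector becomes a 1 x p matrix
  if (!is.null(Xtest) && !is.matrix(Xtest)) {
    Xtest <- matrix(Xtest, nrow = 1)
  }
  if (!is.null(Xtest)) stopifnot(ncol(Xtest) == ncol(Xtrain))
  list(Xtest = Xtest, n_classes = n_classes)
}

# Synthetic data: 20 observations, 5 predictors, 4 classes
set.seed(1)
Xtrain <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
Ytrain <- rep(1:4, each = 5)
checked <- check_mgsim_inputs(Ytrain, Xtrain, Xtest = rnorm(5))
dim(checked$Xtest)
```

Whether mgsim performs equivalent checks internally is not stated here, so the helper is best read as a defensive convenience rather than a description of the package's behaviour.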
The columns of the data matrices Xtrain and Xtest need not be standardized,
since the function mgsim standardizes them as a preliminary step
before the algorithm is run.
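For illustration, column standardization of this kind can be sketched in base R as follows. This is an assumption about the general technique, not a copy of the internal code of mgsim: the exact centering and scaling conventions used by the package may differ. The data matrix is synthetic and contains one constant column to show why zero-variance predictors have to be set aside first.

```r
# Illustrative only: mgsim's internal standardization may differ in detail.
set.seed(1)
X <- cbind(rnorm(10, mean = 5, sd = 2),
           rep(3, 10),                      # constant column, variance 0
           rnorm(10, mean = -1, sd = 0.5))

# Columns with zero variance cannot be scaled; record and drop them
col_sd <- apply(X, 2, sd)
deleted <- which(col_sd == 0)
X_kept <- if (length(deleted) > 0) X[, -deleted, drop = FALSE] else X

# Center each remaining column to mean 0 and scale to unit standard deviation
X_std <- scale(X_kept, center = TRUE, scale = TRUE)

deleted                # index of the constant column
colMeans(X_std)        # numerically close to 0
apply(X_std, 2, sd)    # numerically close to 1
```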
The procedure described in Lambert-Lacroix and Peyre (2006) is used to estimate
the c projection directions and the coefficients of the parametric fit obtained
after projecting predictor variables onto the estimated directions. When Xtest
is not equal to NULL, the procedure predicts the labels for these new predictor variables.
A list with the following components:
Ytest |
the ntest vector containing the predicted labels for the observations from
Xtest. |
beta |
the (p x c) matrix containing the c estimated projection directions. |
Coefficients |
the (2 x c) matrix containing the coefficients of the parametric fit obtained after projecting predictor variables onto these estimated directions. |
DeletedCol |
the vector containing the column numbers of Xtrain for which the
variance of the corresponding predictor variable is zero. Otherwise DeletedCol = NULL. |
Cvg |
the 0-1 value indicating convergence of the algorithm (1 for convergence, 0 otherwise). |
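The components beta and Coefficients can be combined by hand to classify a new observation. The sketch below uses made-up values in place of a fitted model (beta, Coefficients and Xnew are synthetic stand-ins, not output of mgsim) and assumes, as in the package example, that the first row of Coefficients holds intercepts, the second row holds slopes, and class 1 acts as the reference class with linear predictor 0.

```r
# Synthetic stand-ins for res$beta (p x c) and res$Coefficients (2 x c);
# all values are made up for illustration.
p <- 4
n_dir <- 3                                   # c, the number of directions
beta <- matrix(c(1, 0, 0, 0,
                 0, 1, 0, 0,
                 0, 0, 1, 0), nrow = p, ncol = n_dir)
Coefficients <- rbind(c(-1, 0, 1),           # row 1: intercepts
                      c( 2, 1, 1))           # row 2: slopes
Xnew <- c(0.5, 3, -2, 7)

# Project the new observation onto each estimated direction
Xproj <- as.vector(Xnew %*% beta)

# Linear predictor for classes 2, ..., c+1 (class 1 is the reference)
eta <- Coefficients[1, ] + Coefficients[2, ] * Xproj

# Predicted label: the class with the largest linear predictor
Ypred <- which.max(c(0, eta))
Ypred
```

Writing the linear predictor elementwise avoids forming a c x c matrix only to take its diagonal, which is what the diag(...) expression in the package example does.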
Sophie Lambert-Lacroix (http://www-lmc.imag.fr/lmc-sms/Sophie.Lambert) and Julie Peyre (http://www-lmc.imag.fr/lmc-sms/Julie.Peyre/).
S. Lambert-Lacroix, J. Peyre (2006). Local likelihood regression in generalized linear single-index models with applications to microarray data. Computational Statistics and Data Analysis, vol. 51, no. 3, 2091-2113.
# load plsgenomics library
library(plsgenomics)

# load SRBCT data
data(SRBCT)
IndexLearn <- c(sample(which(SRBCT$Y==1),10),sample(which(SRBCT$Y==2),4),
                sample(which(SRBCT$Y==3),7),sample(which(SRBCT$Y==4),9))

# perform prediction by MGSIM
res <- mgsim(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],Lambda=0.001,
             h=19,Xtest=SRBCT$X[-IndexLearn,])
res$Cvg
# number of misclassified test observations
sum(res$Ytest!=SRBCT$Y[-IndexLearn])

# prediction for another sample
Xnew <- SRBCT$X[83,]
# projection of Xnew onto the c estimated directions
Xproj <- Xnew %*% res$beta
# compute the linear predictor for each class except class 1
eta <- diag(cbind(rep(1,3),t(Xproj)) %*% res$Coefficients)
Ypred <- which.max(c(0,eta))
Ypred
SRBCT$Y[83]