gsim {plsgenomics} | R Documentation |
The function gsim
performs prediction using Lambert-Lacroix and Peyre's GSIM algorithm.
gsim(Xtrain, Ytrain, Xtest=NULL, Lambda, hA, hB=NULL, NbIterMax=50)
Xtrain |
a (ntrain x p) data matrix of predictors. Xtrain must be a matrix.
Each row corresponds to an observation and each column to a predictor variable. |
Ytrain |
a vector of length ntrain containing the responses. Ytrain must be a
{1,2}-valued vector giving the class label of each
observation. |
Xtest |
a (ntest x p) matrix containing the predictors for the test data
set. Xtest may also be a vector of length p (corresponding to only one
test observation). If Xtest is not equal to NULL, then the prediction
step is made for these new predictor variables. |
Lambda |
a positive real value. Lambda is the ridge regularization parameter. |
hA |
a strictly positive real value. hA is the bandwidth for GSIM step A. |
hB |
a strictly positive real value. hB is the bandwidth for
GSIM step B. If hB is NULL, its value is chosen using a
plug-in method. |
NbIterMax |
a positive integer. NbIterMax is the maximal number of
iterations in the Newton-Raphson parts. |
The columns of the data matrices Xtrain and Xtest need not be standardized,
since gsim standardizes them as a preliminary step before the algorithm is run.
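The preliminary standardization described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name standardize_columns is hypothetical, and the choices made here (sample standard deviation, reuse of the training means and deviations for Xtest) are assumptions, not a description of what gsim does internally.

```r
# Hypothetical helper for illustration; it is not exported by plsgenomics.
standardize_columns <- function(Xtrain, Xtest = NULL) {
  stopifnot(is.matrix(Xtrain))
  col_means <- colMeans(Xtrain)
  col_sds <- apply(Xtrain, 2, sd)

  # Columns with zero variance cannot be scaled: record and drop them.
  deleted <- unname(which(col_sds == 0))
  keep <- setdiff(seq_len(ncol(Xtrain)), deleted)

  # Center and scale one matrix with the training means and deviations
  # (assumption: the test data reuse the training statistics).
  scale_with_train <- function(X) {
    X <- sweep(X[, keep, drop = FALSE], 2, col_means[keep], "-")
    sweep(X, 2, col_sds[keep], "/")
  }

  sXtest <- NULL
  if (!is.null(Xtest)) {
    # A single test observation may be supplied as a vector of length p.
    if (!is.matrix(Xtest)) Xtest <- matrix(Xtest, nrow = 1)
    sXtest <- scale_with_train(Xtest)
  }

  list(sXtrain = scale_with_train(Xtrain),
       sXtest = sXtest,
       DeletedCol = if (length(deleted) > 0) deleted else NULL)
}
```

In this sketch the DeletedCol component plays the same role as the component of that name returned by gsim: it lists the columns of Xtrain that were removed because their variance is zero, and is NULL when no column was removed.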
The procedure described in Lambert-Lacroix and Peyre (2006) is used to estimate
the projection direction beta. When Xtest
is not equal to NULL, the procedure also predicts the labels of the new observations in Xtest.
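To give a feel for how Lambda, NbIterMax and Cvg interact, here is a minimal sketch of a ridge-penalized Newton-Raphson iteration for ordinary logistic regression. It is not the GSIM procedure: GSIM relies on local likelihood with the bandwidths hA and hB, which this sketch omits entirely. The function name ridge_logistic_newton, the tolerance tol and the stopping rule are all assumptions made for illustration.

```r
# Hypothetical illustration; this is NOT the code used by gsim().
# y01 must be coded 0/1 (for a {1,2}-valued Ytrain, use Ytrain - 1).
ridge_logistic_newton <- function(X, y01, Lambda, NbIterMax = 50, tol = 1e-8) {
  p <- ncol(X)
  beta <- rep(0, p)
  Cvg <- 0
  for (iter in seq_len(NbIterMax)) {
    eta <- as.vector(X %*% beta)
    mu <- 1 / (1 + exp(-eta))   # fitted probabilities
    w <- mu * (1 - mu)          # Newton-Raphson weights
    # Gradient and negative Hessian of the ridge-penalized log-likelihood
    # l(beta) - (Lambda / 2) * sum(beta^2).
    grad <- as.vector(crossprod(X, y01 - mu)) - Lambda * beta
    neg_hess <- crossprod(X, X * w) + Lambda * diag(p)
    step <- solve(neg_hess, grad)
    beta <- beta + step
    # Assumed stopping rule: the Newton step has become negligible.
    if (sqrt(sum(step^2)) < tol) {
      Cvg <- 1
      break
    }
  }
  list(beta = beta, Cvg = Cvg)
}
```

As with the Cvg component returned by gsim, a value of 0 here means that NbIterMax iterations were exhausted before the stopping rule was met; increasing NbIterMax or Lambda are the natural things to try in that case.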
A list with the following components:
Ytest |
the vector of length ntest containing the predicted labels for the observations from
Xtest. |
beta |
the vector of length p giving the estimated projection direction. |
hB |
the value of hB used in step B of GSIM (given by the user, or estimated by the plug-in method if the argument was NULL). |
DeletedCol |
the vector containing the indices of the columns of Xtrain whose
predictor variable has zero variance. If there is no such column, DeletedCol = NULL. |
Cvg |
the 0-1 value indicating convergence of the algorithm (1 for convergence, 0 otherwise). |
Sophie Lambert-Lacroix (http://www-lmc.imag.fr/lmc-sms/Sophie.Lambert) and Julie Peyre (http://www-lmc.imag.fr/lmc-sms/Julie.Peyre/).
S. Lambert-Lacroix, J. Peyre (2006). Local likelihood regression in generalized linear single-index models with applications to microarrays data. Computational Statistics and Data Analysis, vol 51, n 3, 2091-2113.
# load plsgenomics library
library(plsgenomics)

# load Colon data
data(Colon)
IndexLearn <- c(sample(which(Colon$Y == 2), 12), sample(which(Colon$Y == 1), 8))
Xtrain <- Colon$X[IndexLearn, ]
Ytrain <- Colon$Y[IndexLearn]
Xtest <- Colon$X[-IndexLearn, ]

# preprocess data
resP <- preprocess(Xtrain = Xtrain, Xtest = Xtest, Threshold = c(100, 16000),
                   Filtering = c(5, 500), log10.scale = TRUE, row.stand = TRUE)

# perform prediction by GSIM
res <- gsim(Xtrain = resP$pXtrain, Ytrain = Ytrain, Xtest = resP$pXtest,
            Lambda = 10, hA = 50, hB = NULL)
res$Cvg
sum(res$Ytest != Colon$Y[-IndexLearn])