plsrf_x_pv {MAclinical} | R Documentation |
This function builds a prediction rule based on the learning data (microarray predictors only) and applies it to the test data. The classifier consists of two steps: PLS dimension reduction with a pre-validation step to summarize the microarray data, and random forests applied to the resulting PLS components. See Boulesteix et al. (2008) for more details.
The function plsrf_x_pv uses the functions cforest and varimp from the package party and the function pls.regression from the package plsgenomics.
plsrf_x_pv(Xlearn, Zlearn=NULL, Ylearn, Xtest, Ztest=NULL, ncomp=0:3, ordered=NULL, nbgene=NULL, fold=10, ...)
Xlearn |
An nlearn x p matrix giving the microarray predictors for the learning data set. |
Zlearn |
An nlearn x q matrix giving the clinical predictors for the learning data set. This argument is ignored. |
Ylearn |
A numeric vector of length nlearn giving the class membership of the learning observations, coded as 0,...,K-1 (where K is the number of classes). |
Xtest |
An ntest x p matrix giving the microarray predictors for the test data set. |
Ztest |
An ntest x q matrix giving the clinical predictors for the test data set. This argument is ignored. |
ncomp |
A numeric vector giving the candidate numbers of pre-validated PLS components. All numbers must be >0. |
ordered |
A vector of length p giving the order of the microarray predictors in terms of relevance for prediction. For instance, if the first three elements of ordered are 30, 2, 2400, the most relevant genes are those in the 30th, 2nd and 2400th columns of the gene expression data matrix Xlearn. Note: if ordered=NULL, the columns of Xlearn and Xtest are assumed to be already ordered. |
nbgene |
The number of genes to be selected for use in dimension reduction. Default is nbgene=NULL, in which case all genes are used. |
fold |
The number of folds for the pre-validation step. See Boulesteix et al. (2008) for more details. The default is fold=10. |
... |
Other arguments to be passed to the function cforest_control from the party package. |
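The following base R sketch illustrates how inputs matching the argument descriptions above might be prepared. The toy data, the label names, and the t-statistic ranking are assumptions made for illustration only; the page does not prescribe how ordered should be obtained.

```r
set.seed(1)

# Hypothetical toy data: 30 learning samples, 100 genes, two balanced classes
xlearn <- matrix(rnorm(3000), 30, 100)
labels <- factor(rep(c("control", "case"), each = 15))

# Recode the class labels as 0,...,K-1, the coding documented for Ylearn
ylearn <- as.integer(labels) - 1L

# Illustrative ranking (an assumption): order genes by absolute
# two-sample t-statistic, computed on the learning data only
tstat <- apply(xlearn, 2, function(g) t.test(g ~ labels)$statistic)
ordered <- order(abs(tstat), decreasing = TRUE)

# Documented meaning of ordered and nbgene: keep the nbgene top-ranked columns
nbgene <- 20
xsel <- xlearn[, ordered[seq_len(nbgene)], drop = FALSE]
```

With these objects, ordered[1] is the column index of the top-ranked gene, and xsel has the shape that the dimension reduction step would see for nbgene=20.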
See Boulesteix et al (2008).
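As a rough illustration of the pre-validation idea (Tibshirani and Efron, 2002), the base R sketch below computes a pre-validated component for each learning observation using only the other folds. It is a simplified stand-in, not the package's implementation: it extracts a single PLS component by hand instead of calling pls.regression, and the helper first_pls is a hypothetical name.

```r
set.seed(1)
n <- 30; p <- 100; fold <- 10
xlearn <- matrix(rnorm(n * p), n, p)
ylearn <- rep(0:1, each = n / 2)

# Hypothetical helper: first PLS component, whose weight vector is
# proportional to t(X) %*% y on centred data
first_pls <- function(X, y) {
  mu <- colMeans(X)
  w <- crossprod(sweep(X, 2, mu), y - mean(y))
  list(center = mu, weight = w / sqrt(sum(w^2)))
}

# Randomly assign each learning observation to one of `fold` folds
fold_id <- sample(rep(seq_len(fold), length.out = n))

prevalidated <- numeric(n)
for (k in seq_len(fold)) {
  held_out <- fold_id == k
  # Fit the dimension reduction without the held-out fold ...
  fit <- first_pls(xlearn[!held_out, , drop = FALSE], ylearn[!held_out])
  # ... then project the held-out observations onto that component
  prevalidated[held_out] <-
    sweep(xlearn[held_out, , drop = FALSE], 2, fit$center) %*% fit$weight
}
```

Each entry of prevalidated is computed without using that observation's own class label, which is the point of pre-validation. Setting fold equal to the number of learning observations gives a leave-one-out variant.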
A list with the elements:
prediction |
A numeric vector of length nrow(Xtest) giving the predicted class for each observation from the test data set. |
importance |
The variable importance information output by the function varimp from the package party for the corresponding forest. |
bestncomp |
The best number of pre-validated PLS components, as obtained using the model selection method based on the out-of-bag error. |
OOB |
A numeric vector of the same length as ncomp giving the out-of-bag error of the forest constructed with the corresponding number of pre-validated PLS components. |
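The relationship between OOB and bestncomp can be sketched in base R as follows. The error values are hypothetical, not output from the package, and the tie-breaking behaviour of which.min (first minimum wins) is an assumption, since this page does not say how ties are resolved.

```r
# Hypothetical candidate numbers of components and out-of-bag errors
ncomp <- 1:3
OOB <- c(0.40, 0.25, 0.30)

# Select the candidate with the smallest out-of-bag error
bestncomp <- ncomp[which.min(OOB)]
```

Here the second candidate has the smallest out-of-bag error, so bestncomp would be 2 under this selection rule.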
Anne-Laure Boulesteix (http://www.slcmsr.net/boulesteix)
Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.
Tibshirani R, Efron B, 2002. Pre-validation and inference in microarrays. Stat. Appl. Genet. Mol. Biol. 1:1.
testclass, testclass_simul, simulate, plsrf_x, plsrf_xz, plsrf_xz_pv, rf_z, logistic_z, svm_x.
# load MAclinical library
# library(MAclinical)

# Generating xlearn, ylearn, xtest
xlearn<-matrix(rnorm(3000),30,100)
ylearn<-sample(0:1,30,replace=TRUE)
xtest<-matrix(rnorm(2000),20,100)

# Default settings
my.prediction1<-plsrf_x_pv(Xlearn=xlearn,Ylearn=ylearn,Xtest=xtest)

# Using a gene ordering and only the 20 top-ranked genes
ordered<-sample(100)
my.prediction2<-plsrf_x(Xlearn=xlearn,Ylearn=ylearn,Xtest=xtest,ordered=ordered,nbgene=20)

# Pre-validation with 30 folds
my.prediction3<-plsrf_x_pv(Xlearn=xlearn,Ylearn=ylearn,Xtest=xtest,fold=30)