superpc.cv {superpc}    R Documentation

Cross-validation for supervised principal components

Description

This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components.

Usage

superpc.cv(fit, data, n.threshold = 20, n.fold = NULL, folds = NULL,
           n.components = 3, min.features = 5,
           max.features = nrow(data$x), compute.fullcv = TRUE,
           compute.preval = TRUE, xl.mode = c("regular",
           "firsttime", "onetime", "lasttime"), xl.time = NULL,
           xl.prevfit = NULL)

Arguments

fit Object returned by superpc.train
data Data object of form described in superpc.train documentation
n.threshold Number of thresholds to consider. Default 20.
n.fold Number of cross-validation folds. Default is around 10 (the program picks a convenient value based on the sample size).
folds List of indices of cross-validation folds (optional)
n.components Number of principal components to use in cross-validation: 1, 2 or 3.
min.features Minimum number of features to include, in determining range for threshold. Default 5.
max.features Maximum number of features to include, in determining range for threshold. Default is the total number of features in the dataset.
compute.fullcv Should full cross-validation be done?
compute.preval Should full pre-validation be done?
xl.mode Used by Excel interface only
xl.time Used by Excel interface only
xl.prevfit Used by Excel interface only
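
The folds argument is described only as a list of indices, so a small sketch may help. It assumes each list element holds the sample (column) indices held out in one fold; that reading is an assumption and not stated on this page. The names n, fold.id and the choice of 5 folds are illustrative only.

```r
set.seed(42)

n <- 40        # number of samples, i.e. columns of data$x
n.fold <- 5    # illustrative choice, not the package default

# Assign every sample to exactly one fold, in random order
fold.id <- sample(rep(seq_len(n.fold), length.out = n))

# One integer vector of held-out sample indices per fold
folds <- unname(split(seq_len(n), fold.id))

# Intended use (requires the superpc package and a trained fit):
# cv.obj <- superpc.cv(fit, data, folds = folds)
```

Supplying folds explicitly makes the partition reproducible and lets several calls be compared on identical splits.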

Details

This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components. To avoid problems with fitting Cox models to small validation datasets, it uses the "pre-validation" approach of Tibshirani and Efron (2002).
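
The pre-validation idea can be sketched in base R. This is a toy illustration under stated assumptions, not the code superpc.cv runs: it uses a quantitative outcome with lm in place of a Cox model, a single principal component, a hypothetical correlation cutoff of 0.3 as the feature threshold, and a sign convention (orient the component to correlate positively with the training outcome) chosen here so that components from different folds are comparable.

```r
set.seed(1)

# Toy data: p features in rows, n samples in columns (same layout as data$x)
n <- 60
p <- 200
x <- matrix(rnorm(p * n), nrow = p)
latent <- rnorm(n)
x[1:20, ] <- x[1:20, ] + matrix(latent, nrow = 20, ncol = n, byrow = TRUE)
y <- latent + 0.5 * rnorm(n)

# Random partition of the samples into folds
n.fold <- 5
folds <- split(seq_len(n), sample(rep(seq_len(n.fold), length.out = n)))

threshold <- 0.3            # hypothetical cutoff on |correlation with y|
v.preval <- numeric(n)      # pre-validated predictor, filled fold by fold

for (held.out in folds) {
  x.train <- x[, -held.out, drop = FALSE]
  y.train <- y[-held.out]

  # Feature scores computed from the training samples only
  scores <- as.vector(cor(t(x.train), y.train))
  keep <- which(abs(scores) > threshold)

  # First principal component direction of the selected, centred features
  mu <- rowMeans(x.train[keep, , drop = FALSE])
  xc.train <- x.train[keep, , drop = FALSE] - mu
  u <- svd(xc.train, nu = 1, nv = 0)$u[, 1]

  # Fix the arbitrary sign of the component using the training outcome
  if (cor(drop(crossprod(xc.train, u)), y.train) < 0) u <- -u

  # Project the held-out samples; they played no part in building u
  xc.test <- x[keep, held.out, drop = FALSE] - mu
  v.preval[held.out] <- drop(crossprod(xc.test, u))
}

# One model fitted to all n samples, instead of one small model per fold
fit0 <- lm(y ~ 1)
fit1 <- lm(y ~ v.preval)
lr.stat <- 2 * (as.numeric(logLik(fit1)) - as.numeric(logLik(fit0)))
```

Because every entry of v.preval comes from a model that never saw that sample, the final fit uses the full sample size while still giving an honest assessment of the predictor.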

Value

list(threshold = th, nonzero = nonzero, scor = out, scor.preval = out.preval, folds = folds, featurescores.folds = featurescores.folds, v.preval = cur2, type = type, call = this.call)

threshold Vector of thresholds considered
nonzero Number of features exceeding each value of the threshold
scor.preval Likelihood ratio scores from pre-validation
scor Full CV scores
folds Indices of CV folds used
featurescores.folds Feature scores for each fold
v.preval The pre-validated predictors
type Problem type
call Calling sequence
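
As a rough illustration of how the returned components fit together, the sketch below picks the threshold with the largest pre-validation score. It runs on a mock list with made-up values, not on real superpc.cv output, and it assumes scor.preval is a matrix with one row per component and one column per threshold; that layout is an assumption not documented on this page.

```r
# Mock object with the documented component names; all values are invented
cv.obj <- list(
  threshold   = c(0.5, 1.0, 1.5, 2.0),
  nonzero     = c(400, 120, 35, 8),
  scor.preval = rbind(c(2.1, 5.3, 7.8, 4.0),   # scores using 1 component
                      c(2.5, 5.9, 7.1, 4.2))   # scores using 2 components
)

component <- 1   # which row of scor.preval to inspect

# Position of the largest pre-validation score for that component
best <- which.max(cv.obj$scor.preval[component, ])

best.threshold <- cv.obj$threshold[best]   # threshold achieving that score
n.features     <- cv.obj$nonzero[best]     # features passing that threshold
```

With real output, checking dim(cv.obj$scor.preval) against length(cv.obj$threshold) first would confirm the assumed layout.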

Author(s)

Eric Bair and Robert Tibshirani

Examples

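set.seed(332)
# Simulated data: 1000 features (rows) measured on 40 samples (columns)
x <- matrix(rnorm(1000 * 40), ncol = 40)
# Outcome driven by the leading right singular vector of the first 60 features
y <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
# 30 ones and 10 zeros in random order
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y, censoring.status = censoring.status,
             featurenames = featurenames)

# Train the model, then cross-validate to choose the feature threshold
a <- superpc.train(data, type = "survival")
aa <- superpc.cv(a, data)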

[Package superpc version 1.05 Index]