superpc.cv {superpc} | R Documentation |
This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components.
superpc.cv(fit, data, n.threshold = 20, n.fold = NULL, folds = NULL, n.components = 3, min.features = 5, max.features = nrow(data$x), compute.fullcv = TRUE, compute.preval = TRUE, xl.mode = c("regular", "firsttime", "onetime", "lasttime"), xl.time = NULL, xl.prevfit = NULL)
fit |
Object returned by superpc.train |
data |
Data object of form described in superpc.train documentation |
n.threshold |
Number of thresholds to consider. Default 20. |
n.fold |
Number of cross-validation folds. The default is around 10 (the program picks a convenient value based on the sample size). |
folds |
List of indices of cross-validation folds (optional) |
n.components |
Number of principal components to use in cross-validation: 1, 2, or 3. Default 3. |
min.features |
Minimum number of features to include, in determining range for threshold. Default 5. |
max.features |
Maximum number of features to include, in determining range for threshold. Default is the total number of features in the dataset. |
compute.fullcv |
Should full cross-validation be done? |
compute.preval |
Should full pre-validation be done? |
xl.mode |
Used by Excel interface only |
xl.time |
Used by Excel interface only |
xl.prevfit |
Used by Excel interface only |
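When reproducible or custom folds are wanted, the folds argument can be built by hand. The following is a minimal sketch in base R, assuming that folds is a list with one integer vector per fold, each holding the indices of the samples (columns of data$x) held out in that fold. The sample size n and the fold count are illustrative values, not package defaults.

```r
set.seed(1)
n <- 40        # number of samples, i.e. ncol(data$x) (illustrative value)
n.fold <- 5    # number of folds (illustrative value)

# Assign each sample to exactly one fold, in random order
fold.id <- sample(rep(seq_len(n.fold), length.out = n))

# One integer vector of held-out sample indices per fold
folds <- unname(split(seq_len(n), fold.id))

# Assumed usage: superpc.cv(fit, data, folds = folds)
```

Note that this simple split does not balance censored and uncensored samples across folds, which may matter for survival data.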
This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components. To avoid problems with fitting Cox models to small validation datasets, it uses the "pre-validation" approach of Tibshirani and Efron (2002).
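The idea of pre-validation can be illustrated with a short base R sketch. This is not the package's implementation: it assumes a continuous outcome and a linear model in place of a Cox model, keeps a fixed number of top-scoring features (n.keep, a hypothetical parameter) instead of scanning thresholds, and uses a single principal component. Each sample's predictor is computed from a model that never saw that sample, and one model is then fitted to all pre-validated predictors at once, so no model has to be fitted on a small validation fold.

```r
set.seed(2)
p <- 200; n <- 40; n.keep <- 10            # toy sizes; n.keep is hypothetical
x <- matrix(rnorm(p * n), nrow = p)        # features in rows, samples in columns
y <- x[1, ] + x[2, ] + rnorm(n, sd = 0.5)  # toy continuous outcome
folds <- unname(split(seq_len(n), sample(rep(1:5, length.out = n))))

v.preval <- numeric(n)
for (idx in folds) {
  x.tr <- x[, -idx, drop = FALSE]
  y.tr <- y[-idx]

  # Univariate feature scores computed on the training samples only
  score <- abs(apply(x.tr, 1, cor, y = y.tr))
  keep <- order(score, decreasing = TRUE)[seq_len(n.keep)]

  # First principal component direction of the selected, centered features
  ctr <- rowMeans(x.tr[keep, , drop = FALSE])
  u1 <- svd(x.tr[keep, , drop = FALSE] - ctr)$u[, 1]

  # Resolve the sign ambiguity so the component correlates positively with y
  comp.tr <- drop(crossprod(x.tr[keep, , drop = FALSE] - ctr, u1))
  if (cor(comp.tr, y.tr) < 0) u1 <- -u1

  # Pre-validated predictor for the held-out samples
  v.preval[idx] <- drop(crossprod(x[keep, idx, drop = FALSE] - ctr, u1))
}

# A single model fitted on all n pre-validated predictors
preval.fit <- lm(y ~ v.preval)
```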
list(threshold = th, nonzero = nonzero, scor = out, scor.preval = out.preval, folds = folds, featurescores.folds = featurescores.folds, v.preval = cur2, type = type, call = this.call)
threshold |
Vector of thresholds considered |
nonzero |
Number of features exceeding each value of the threshold |
scor.preval |
Likelihood ratio scores from pre-validation |
scor |
Full CV scores |
folds |
Indices of CV folds used |
featurescores.folds |
Feature scores for each fold |
v.preval |
The pre-validated predictors |
type |
Problem type
call |
Calling sequence
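The returned components can be used to pick a threshold. The sketch below works on a mock object with made-up toy values rather than real output, and it assumes that scor.preval is a matrix with one row per number of components and one column per threshold; check the structure of the actual object before relying on this layout.

```r
# Mock result with made-up toy values, mimicking the assumed layout
cv <- list(
  threshold   = c(0.5, 1.0, 1.5, 2.0),
  nonzero     = c(800, 300, 60, 8),
  scor.preval = rbind(c(2.1, 4.7, 6.3, 3.0),   # 1 component
                      c(2.5, 5.0, 6.1, 2.8),   # 2 components
                      c(2.6, 4.9, 5.9, NA))    # 3 components
)

# Locate the largest pre-validated score, ignoring missing entries
best <- which(cv$scor.preval == max(cv$scor.preval, na.rm = TRUE),
              arr.ind = TRUE)[1, ]

best.n.components <- best[["row"]]
best.threshold    <- cv$threshold[best[["col"]]]
best.nonzero      <- cv$nonzero[best[["col"]]]
```

With a real result, cv would be replaced by the object returned from superpc.cv.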
Eric Bair and Robert Tibshirani
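set.seed(332)

# Simulated data: 1000 features (rows) by 40 samples (columns)
x <- matrix(rnorm(1000 * 40), ncol = 40)

# Outcome driven by the leading right singular vector of the first 60 features
y <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)

# 30 observed events and 10 censored observations, in random order
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y,
             censoring.status = censoring.status,
             featurenames = featurenames)

# Train the model, then cross-validate to choose the feature threshold
a <- superpc.train(data, type = "survival")
aa <- superpc.cv(a, data)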