superpc.predict.red {superpc}    R Documentation

Feature selection for supervised principal components

Description

Forms reduced models to approximate the supervised principal component predictor.

Usage

superpc.predict.red(fit, data, data.test, threshold, n.components = 3,
                    n.shrinkage = 20, shrinkages = NULL, compute.lrtest = TRUE,
                    sign.wt = "both",
                    prediction.type = c("continuous", "discrete"), n.class = 2)

Arguments

fit Object returned by superpc.train
data Training data object, of the form described in the superpc.train documentation
data.test Test data object; same form as data
threshold Feature score threshold; usually estimated from superpc.cv
n.components Number of principal components to examine; an integer from 1 up to the number of components used in training. Default 3.
n.shrinkage Number of shrinkage values to consider. Default 20.
shrinkages Shrinkage values to consider. Default NULL.
compute.lrtest Should the likelihood ratio test be computed? Default TRUE
sign.wt Signs of feature weights allowed: "both", "pos", or "neg"
prediction.type Type of prediction: "continuous" (Default) or "discrete". In the latter, the supervised principal component score is divided into n.class groups
n.class Number of groups for discrete predictor. Default 2.
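
As an illustration of what prediction.type = "discrete" means, the sketch below divides a continuous score into n.class groups. It is only a sketch: the score values are simulated, the names score, n_class, breaks and groups are hypothetical, and cutting at equally spaced quantiles is an assumption, since this page does not state how the package chooses its cut points.

```r
set.seed(1)

# Hypothetical continuous predictor for 40 test observations
score <- rnorm(40)

# Number of groups, playing the role of the n.class argument
n_class <- 2

# Assumed rule: cut points at equally spaced quantiles of the score
breaks <- quantile(score, probs = seq(0, 1, length.out = n_class + 1))

# Integer group label (1..n_class) for each observation
groups <- cut(score, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Group sizes
print(table(groups))
```

With n_class = 2 this amounts to a split at the median of the score.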

Details

Soft-thresholding by each of the "shrinkages" values is applied to the PC loadings. This reduces the number of features used in the model, because loadings that are shrunk to zero drop out. The reduced predictor is then used in place of the supervised PC predictor.
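
The effect of soft-thresholding on the number of features can be sketched in a few lines of R. This is a generic illustration of the soft-thresholding operator, not the package's internal code: the function soft_threshold, the loadings vector and the grid of shrinkage values are all hypothetical, and the package's own computation may differ in detail.

```r
# Soft-thresholding operator: move each value toward zero by `shrinkage`,
# setting values whose magnitude is at most `shrinkage` to exactly zero.
soft_threshold <- function(w, shrinkage) {
  sign(w) * pmax(abs(w) - shrinkage, 0)
}

# Hypothetical loadings for six features (illustrative values only)
loadings <- c(0.90, -0.45, 0.20, -0.10, 0.05, 0.60)

# Hypothetical grid of shrinkage values
shrinkages <- c(0, 0.15, 0.5)

# One row per shrinkage value, one column per feature
reduced <- t(sapply(shrinkages, function(s) soft_threshold(loadings, s)))

# Number of features with a nonzero weight at each shrinkage value
num_features <- rowSums(reduced != 0)
print(num_features)
```

Larger shrinkage values zero out more loadings, so each successive reduced model uses fewer features.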

Value

shrinkages Shrinkage values used
lrtest.reduced Likelihood ratio tests for reduced models
num.features Number of features used in each reduced model
feature.list List of features used in each reduced model
coef Least squares coefficients for each reduced model
import Importance scores for features
wt Weight for each feature, in constructing the reduced predictor
v.test Outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)
v.test.1df Combined outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)
n.components Number of principal components used
type Type of outcome
call Calling sequence

Author(s)

Eric Bair and Robert Tibshirani

Examples


library(superpc)

set.seed(332)

# Generate some data: 1000 features (rows) by 40 samples (columns)
x <- matrix(rnorm(1000 * 40), ncol = 40)
y <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
ytest <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))
censoring.status.test <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y, censoring.status = censoring.status,
             featurenames = featurenames)
data.test <- list(x = x, y = ytest, censoring.status = censoring.status.test,
                  featurenames = featurenames)

# Train the supervised principal components model
a <- superpc.train(data, type = "survival")

# Full supervised PC predictor on the test data
fit <- superpc.predict(a, data, data.test, threshold = 1.0, n.components = 1,
                       prediction.type = "continuous")

# Reduced models and a plot of their likelihood ratio tests
fit.red <- superpc.predict.red(a, data, data.test, threshold = 0.6)
superpc.plotred.lrtest(fit.red)


[Package superpc version 1.05 Index]