svm.fs {penalizedSVM}    R Documentation

Fits SVM with variable selection using penalties.

Description

Fits an SVM with variable selection (clone selection) using the SCAD and L1-norm penalties.

Usage

## Default S3 method:
svm.fs(x, y, fs.method = "1norm", cross.outer = 0, lambda1.set, lambda2.set = NULL, calc.class.weights = FALSE, seed = 240907, maxIter = NULL,...)
run.scad(x,y,  lambda1.set=NULL, class.weights)
run.1norm(x,y,k=5,nu=0, lambda1.set=NULL, output=1, seed=seed)

Arguments

x input matrix with genes in columns and samples in rows!
y vector of class labels
fs.method feature selection method. Available: 'scad' and '1norm'
cross.outer number of folds for the outer cross validation; default is 0 (no cv).
lambda1.set set of tuning parameters lambda1
lambda2.set set of tuning parameters lambda2, not yet in use
calc.class.weights calculate class.weights for SVM, default: FALSE
class.weights a named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named.
k k-fold cross validation, default: 5
nu weight parameter:
  • 1 - easy estimation,
  • 0 - hard estimation,
  • any other value - used as nu by the algorithm, default - 0.
output 0 - no output, 1 - produce output, default is 0
seed seed
maxIter maximum number of iterations; not used yet
... additional argument(s)
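
The class.weights argument expects a named numeric vector. A minimal sketch of building one is shown here, assuming an inverse-frequency rule; that rule and the toy labels are illustrative choices for this sketch, not necessarily what calc.class.weights = TRUE computes.

```r
# Hypothetical labels: 30 samples of class -1 and 10 samples of class 1.
y <- factor(c(rep(-1, 30), rep(1, 10)))

# Inverse-frequency weighting (an assumption made for this sketch):
# the largest class gets weight 1, smaller classes get larger weights.
counts <- table(y)
class.weights <- setNames(as.numeric(max(counts) / counts), names(counts))

# Every component is named after a class label, as the argument requires.
print(class.weights)
```

A vector built this way could then be passed as class.weights to run.scad.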

Details

The quality of the model depends strongly on the choice of the tuning parameter lambda. Therefore the model is trained with different lambdas, and the best model, with the optimal tuning parameter, is used in further analyses.

The feature selection methods use different techniques for finding the optimal tuning parameter. For SCAD SVM, the generalized approximate cross validation (gacv) error is calculated for each pre-defined tuning parameter.

For L1-norm SVM, the cross validation (default 5-fold) misclassification error is calculated for each lambda. After training and cross validation, the lambda with minimal misclassification error is chosen, and a final model with this optimal lambda is fitted to the whole data set.
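
The cross validation procedure for choosing lambda can be sketched in base R as follows. This is an illustration only: fit_stub, predict_stub and cv_error are hypothetical helpers, and the stand-in classifier (soft-thresholded difference of class means) is not the L1-norm SVM solver used by the package. Only the selection logic mirrors the text: compute the k-fold misclassification error for each lambda, take the minimizer, and refit on all samples.

```r
# Stand-in sparse linear classifier (NOT the L1-norm SVM solver):
# soft-threshold the difference of class means by lambda.
fit_stub <- function(x, y, lambda) {
  mu_pos <- colMeans(x[y == 1, , drop = FALSE])
  mu_neg <- colMeans(x[y == -1, , drop = FALSE])
  d <- mu_pos - mu_neg
  w <- sign(d) * pmax(abs(d) - lambda, 0)
  b <- -sum(w * (mu_pos + mu_neg) / 2)
  list(w = w, b = b)
}

# Classify by the sign of the linear decision value.
predict_stub <- function(fit, x) {
  ifelse(as.vector(x %*% fit$w + fit$b) >= 0, 1, -1)
}

# k-fold cross-validated misclassification error for each lambda.
cv_error <- function(x, y, lambda.set, k = 5, seed = 123) {
  set.seed(seed)
  fold <- sample(rep(seq_len(k), length.out = nrow(x)))
  sapply(lambda.set, function(lambda) {
    wrong <- 0
    for (i in seq_len(k)) {
      test <- fold == i
      fit <- fit_stub(x[!test, , drop = FALSE], y[!test], lambda)
      wrong <- wrong + sum(predict_stub(fit, x[test, , drop = FALSE]) != y[test])
    }
    wrong / nrow(x)
  })
}

# Toy data: 60 samples in rows, 20 features in columns, first 3 informative.
set.seed(1)
n <- 60; p <- 20
y <- rep(c(-1, 1), each = n / 2)
x <- matrix(rnorm(n * p), n, p)
x[, 1:3] <- x[, 1:3] + 1.5 * y

lambda.set <- c(0.1, 0.5, 1, 2)
err <- cv_error(x, y, lambda.set)
lam.opt <- lambda.set[which.min(err)]   # on ties, the first minimizer is taken
final <- fit_stub(x, y, lam.opt)        # final model refitted on the whole data set
```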

Value

classes vector of class labels, as in input 'y'
sample.names sample names
class.method feature selection method
cross.outer outer cv
seed seed
model final model
  • w - coefficients of the hyperplane
  • b - intercept of the hyperplane
  • xind - the index of the selected features (genes) in the data matrix.
  • index - the index of the resulting support vectors in the data matrix.
  • type - type of svm, from svm function
  • lam.opt - optimal lambda
  • gacv - corresponding gacv
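
To illustrate how w, b and xind relate to each other, the sketch below computes linear decision values by hand. It uses a mock model list with made-up numbers and assumes that w is aligned with xind (one coefficient per selected feature); for real predictions, predict.penSVM is the documented route.

```r
# Mock 'model' component with hypothetical values; in practice it would be
# taken from a fitted object, e.g. fit$model.
model <- list(
  w    = c(0.8, -0.5),  # coefficients of the selected features
  b    = 0.1,           # intercept of the hyperplane
  xind = c(2, 4)        # columns of the data matrix that were selected
)

# New data: samples in rows, features (genes) in columns.
newx <- rbind(c(0, 1, 0, 0, 0),
              c(0, 0, 0, 1, 0),
              c(0, 0, 0, 0, 0))

# Assumption: w is aligned with xind, so f(x) = x[xind] %*% w + b.
f <- as.vector(newx[, model$xind, drop = FALSE] %*% model$w + model$b)

# Predicted class is the sign of the decision value.
pred <- ifelse(f >= 0, 1, -1)
```

Features outside xind do not enter the decision value at all, which is what makes the fitted model a feature selector.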

Author(s)

Natalia Becker natalia.becker at dkfz.de

References

Zhang, H. H., Ahn, J., Lin, X. and Park, C. (2006). Gene selection using support vector machines with nonconvex penalty. Bioinformatics, 22, pp. 88-95.

Fung, G. and Mangasarian, O. L. (2004). A feature selection newton method for support vector machine classification. Computational Optimization and Applications Journal, 28(2), pp. 185-202.

See Also

predict.penSVM, svm (in package e1071)

Examples


my.seed<- 123

train<-sim.data(n = 200, ng = 100, nsg = 10, corr=FALSE, seed=my.seed )
str(train)

# train SCAD SVM ####################
# define set values of tuning parameter lambda1 for SCAD 
lambda1.scad <- c (seq(0.01 ,0.05, .01),  seq(0.1,0.5, 0.2), 1 ) 
# for presentation, don't check all lambdas: time consuming!
lambda1.scad<-lambda1.scad[2:3]
# 
# train SCAD SVM
fit.scad<- svm.fs(x=t(train$x),y=train$y, fs.method="scad", cross.outer= 0, lambda1.set=lambda1.scad, seed=my.seed)
        
        
# train 1NORM SVM       ################        
# define set values of tuning parameter lambda1 for 1norm
epsi.set<-vector(); for (num in (1:9)) epsi.set<-sort(c(epsi.set, c(num*10^seq(-5, -1, 1 ))) )
# for presentation, don't check all lambdas: time consuming!
lambda1.1norm <-        epsi.set[c(3,5)] # 2 params

# train 1norm SVM
# time consuming: for presentation, use only the first 100 samples
## DON'T RUN : fit.1norm<- svm.fs(x=t(train$x),y=train$y, fs.method="1norm", cross.outer= 0, lambda1.set=lambda1.1norm, seed=my.seed)
fit.1norm<- svm.fs(x=t(train$x)[1:100,],y=train$y[1:100], fs.method="1norm", cross.outer= 0, lambda1.set=lambda1.1norm, seed=my.seed)
        
    


[Package penalizedSVM version 1.0 Index]