GSA.func {GSA} | R Documentation |
Determines the significance of pre-defined sets of genes with respect to an outcome variable, such as a group indicator, quantitative variable or survival time. This is the basic function called by GSA.
GSA.func(x, y, genesets, genenames, geneset.names=NULL, method=c("maxmean","mean","absmean"), resp.type=c("Quantitative","Two class unpaired","Survival","Multiclass","Two class paired"), censoring.status=NULL, first.time=TRUE, return.gene.ind=TRUE, ngenes=NULL, gs.mat=NULL, gs.ind=NULL, catalog=NULL, catalog.unique=NULL, s0=NULL, s0.perc=NULL, minsize=15, maxsize=500, restand=TRUE)
x |
Data x: p by n matrix of features, one observation per column (missing values allowed) |
y |
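Vector of outcome values, one per column of x; the coding depends on resp.type (for example, values 1,2 for a two class unpaired problem) |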
genesets |
Gene set collection (a list) |
genenames |
Vector of gene names in the expression dataset |
geneset.names |
Optional vector of gene set names |
method |
Method for summarizing a gene set: "maxmean" (default), "mean" or "absmean" |
resp.type |
Problem type: "Quantitative" for a continuous parameter; "Two class unpaired"; "Survival" for censored survival outcome; "Multiclass" for more than 2 groups; "Two class paired" for paired outcomes, coded -1,1 (first pair), -2,2 (second pair), etc. |
censoring.status |
Vector of censoring status values for survival problems: 1 means death or failure, 0 means censored |
first.time |
internal use |
return.gene.ind |
internal use |
ngenes |
internal use |
gs.mat |
internal use |
gs.ind |
internal use |
catalog |
internal use |
catalog.unique |
internal use |
s0 |
Exchangeability factor for the denominator of the test statistic; default is automatic choice |
s0.perc |
Percentile of standard deviation values to use for s0; default is automatic choice. -1 means s0=0 (different from s0.perc=0, which means s0 = zeroth percentile of the standard deviation values = minimum of the sd values) |
minsize |
Minimum number of genes in genesets to be considered |
maxsize |
Maximum number of genes in genesets to be considered |
restand |
Should restandardization be done? Default TRUE |
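The three method options can be illustrated with a minimal base R sketch. It assumes the maxmean definition from the Efron and Tibshirani report cited on this page (average the positive parts and the negative parts of the gene scores separately, then keep whichever is larger in magnitude). The helper names are hypothetical and are not part of the GSA package, and the sketch ignores s0 and restandardization, so it will not reproduce the scores returned by GSA.func.

```r
# Illustrative only: set-level summaries of per-gene scores z for one gene set.
# These helper names are hypothetical and are not exported by the GSA package.
set_mean <- function(z) mean(z)

set_absmean <- function(z) mean(abs(z))

set_maxmean <- function(z) {
  pos <- mean(pmax(z, 0))   # average positive part, taken over all genes in the set
  neg <- mean(pmax(-z, 0))  # average negative part, taken over all genes in the set
  if (pos >= neg) pos else -neg
}

z <- c(2.0, 1.0, -0.5, 0.0)  # toy per-gene scores for a single gene set
set_mean(z)     # 0.625
set_absmean(z)  # 0.875
set_maxmean(z)  # 0.75
```

In this sketch, unlike absmean, the maxmean summary keeps a sign, so a set dominated by negative gene scores gets a negative set score.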
Carries out a gene set analysis, computing the gene set scores. This function does not do any permutations for estimation of false discovery rates. GSA calls this function to estimate FDRs.
A list with components
scores |
Gene set scores for each gene set |
norm.scores |
Gene set scores transformed by the inverse Gaussian cdf |
mean |
Means of gene expression values for each sample |
sd |
Standard deviation of gene expression values for each sample |
gene.ind |
List indicating which genes in each positive gene set had positive individual scores, and similarly for negative gene sets |
geneset.names |
Names of the gene sets |
nperms |
Number of permutations used |
gene.scores |
Individual gene scores (eg t-statistics for two class problem) |
s0 |
Computed exchangeability factor |
s0.perc |
Computed percentile of standard deviation values |
stand.info |
Information computed for use in the restandardization process |
method |
Method used (from call to GSA.func) |
call |
The call to GSA |
Robert Tibshirani
Efron, B. and Tibshirani, R. On testing the significance of sets of genes. Stanford tech report, 2006. http://www-stat.stanford.edu/~tibs/ftp/GSA.pdf
library(GSA)

######### two class unpaired comparison
# y must take values 1,2

set.seed(100)
x <- matrix(rnorm(1000*20), ncol=20)
dd <- sample(1:1000, size=100)

# shift 100 randomly chosen genes in the second group of samples
u <- matrix(2*rnorm(100), ncol=10, nrow=100)
x[dd,11:20] <- x[dd,11:20] + u
y <- c(rep(1,10), rep(2,10))

genenames <- paste("g", 1:1000, sep="")

# create some random gene sets
genesets <- vector("list", 50)
for (i in 1:50) {
  genesets[[i]] <- paste("g", sample(1:1000, size=30), sep="")
}
geneset.names <- paste("set", as.character(1:50), sep="")

GSA.func.obj <- GSA.func(x, y, genenames=genenames, genesets=genesets,
                         resp.type="Two class unpaired")

# to use a "real" gene set collection, we read it in from a gmt file:
#
# geneset.obj <- GSA.read.gmt("file.gmt")
#
# where file.gmt is a gene set collection from the GSEA collection,
# or the website http://www-stat.stanford.edu/~tibs/GSA, or one
# that you have created yourself. Then
# GSA.func.obj <- GSA.func(x, y, genenames=genenames,
#                          genesets=geneset.obj$genesets,
#                          resp.type="Two class unpaired")
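Once GSA.func has returned, the scores component can be used to see which gene sets stand out. The following is a minimal sketch under stated assumptions: rank_gene_sets is a hypothetical helper, not a GSA function, and it assumes scores holds one number per gene set in the same order as the set names. A stand-in list is used so the sketch runs without the package.

```r
# Hypothetical helper (not part of the GSA package): list the n gene sets with
# the largest absolute score from a result list that has a `scores` component.
rank_gene_sets <- function(obj, set.names = NULL, n = 5) {
  scores <- as.numeric(obj$scores)
  if (is.null(set.names)) {
    set.names <- paste("set", seq_along(scores), sep = "")
  }
  # order by magnitude; na.last = NA drops any missing scores
  ord <- order(abs(scores), decreasing = TRUE, na.last = NA)
  ord <- head(ord, n)
  data.frame(geneset = set.names[ord], score = scores[ord],
             stringsAsFactors = FALSE)
}

# Stand-in for a GSA.func result, so this runs without the GSA package
mock.obj <- list(scores = c(0.2, -1.5, NA, 0.9))
rank_gene_sets(mock.obj, n = 2)
```

With the objects from the example, the equivalent call would be rank_gene_sets(GSA.func.obj, geneset.names). Because GSA.func does no permutations, this ranking carries no p-values or FDR estimates.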