TopGenes {SharedHT2} | R Documentation |
TopGenes
Creates a genelist sorted on the values of a chosen statistic.
TopGenes(obj, by = "EB", ref = 1, FDR = 0.05, allsig = FALSE, n.g = 20, browse = FALSE, search.url = genecards, path = "", file = "")
obj |
An object of class fit.n.data returned by EB.Anova |
by |
Specify "EB" or "naive" to sort on the empirical Bayes or the naive variant of the statistic, respectively. Defaults to "EB". |
ref |
To include group-to-group fold changes in the table, specify ref=k,
where k is the index of the group to use as the reference.
Defaults to 1. |
FDR |
The false discovery rate (FDR) to use in the Benjamini-Hochberg (BH) stepdown procedure. Defaults to 0.05. |
allsig |
Set to TRUE if you want to see only the genes
meeting the BH criterion. Defaults to FALSE. |
n.g |
If allsig is FALSE, the number of top genes
that you want to see. Defaults to 20. |
browse |
Set to TRUE if you want the results displayed in the HTML
browser with gene identifiers linked to the GeneCards database at the
Weizmann Institute. Defaults to FALSE. |
search.url |
A URL that searches an online database when a gene identifier is appended to its end. Defaults to genecards, an object included in this package, for searching the GeneCards database at the Weizmann Institute. |
file |
If browse is set to TRUE and you want to
save the HTML file, supply a file name, e.g. file="foobgwb.html". |
path |
If you specify a file name above, you may optionally
specify a path as well. In the Unix implementation, a file name given
without a path is written to the current working directory.
This is not yet supported in Windows, where the path argument
must be used to specify the directory explicitly. |
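As a rough illustration of how a search.url value is meant to work (the gene identifier is appended to the end of the URL), here is a minimal sketch in base R. The URL prefix and the gene identifiers below are placeholders made up for this example; they are not the value of the genecards object shipped with the package.

```r
# Placeholder URL prefix (an assumption; NOT the package's genecards value).
search.url <- "https://example.org/search?q="

# Hypothetical gene identifiers, as they might appear in the row names.
gene.ids <- c("1007_s_at", "1053_at")

# A link is formed by appending each identifier to the end of the URL.
links <- paste0(search.url, gene.ids)
print(links)
```

Any URL that yields a valid search when an identifier is appended in this way should be usable in place of the default.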
A sorted genelist in the form of an n.g by 5+d matrix, where d is the number of groups. The matrix contains the following columns:
RowNum |
Row number from the original unsorted data frame. Useful when you
happen to know that the first 100 genes are true positives and the rest are not
(as is the case with the supplied dataset, SimAffyDat ). |
GeneId |
Taken from the row component of dimnames in the original
data frame. As such, it is very useful to name your rows using the Affymetrix gene
identifiers, as these are searchable in the GeneCards database. |
'NAME1' |
Group mean corresponding to the group named NAME1 |
... |
|
'NAMEd' |
Group mean corresponding to the group named NAMEd |
'TYPE'.stat |
The statistic used in the sort. This is determined
by user choices at two levels. First, when computations are performed inside
the call to EB.Anova , the specification of Var.Struct , which
defaults to "general" with alternate value "simple", determines whether the
multivariate test, i.e. the Hotelling T-squared (HT2), or the univariate F
test (UT2) is computed. In both cases, both the empirical Bayes and standard
variants are computed. Next, after the computation is completed, when the user
requests the sorted genelist, the specification of the argument by , either
"EB" (default) or "naive", determines which of the two computed statistics is
used to perform the sort. Thus TYPE assumes one of four values:
ShHT2 , HT2 , ShUT2 , or UT2 . The split between the first two and the
last two is on the value of Var.Struct used in the EB.Anova
computation, while the split within each pair is on the by
argument passed to TopGenes . |
'TYPE'.p-val |
The corresponding p-value under the corresponding model. See the manuscript sharedHT2.pdf in the ./doc directory. |
FDR.stepdown='FDR' |
The BH criterion values computed as
FDR * rank/Ngenes |
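The naming of the last three columns and the BH criterion values can be sketched in base R as follows. This is only an illustration under stated assumptions: stat.type is a hypothetical helper, not a function in SharedHT2; the p-values are made up; and the selection step shown is the standard Benjamini-Hochberg rule as implemented by p.adjust in base R, which may differ in detail from what TopGenes does internally.

```r
# Hypothetical helper (not part of SharedHT2): the TYPE prefix implied by
# Var.Struct (set in EB.Anova) and by (set in TopGenes).
stat.type <- function(Var.Struct = c("general", "simple"),
                      by = c("EB", "naive")) {
  Var.Struct <- match.arg(Var.Struct)
  by <- match.arg(by)
  base <- if (Var.Struct == "general") "HT2" else "UT2"
  if (by == "EB") paste0("Sh", base) else base
}
type <- stat.type("general", "EB")   # "ShHT2"

# Made-up p-values for six genes (illustrative only, not from SimAffyDat).
pval <- c(g1 = 0.001, g2 = 0.20, g3 = 0.012, g4 = 0.04, g5 = 0.03, g6 = 0.50)
FDR <- 0.05
Ngenes <- length(pval)

# Rank 1 is the smallest p-value.
rnk <- rank(pval, ties.method = "first")

# BH criterion value for each gene: FDR * rank / Ngenes.
bh.crit <- FDR * rnk / Ngenes

# Standard BH rule: find the largest rank whose p-value is at or below its
# criterion value, then select every gene up to that rank.
passes <- pval <= bh.crit
k <- if (any(passes)) max(rnk[passes]) else 0
selected <- names(pval)[rnk <= k]

# Column names following the pattern described above.
col.names <- c(paste0(type, ".stat"), paste0(type, ".p-val"),
               paste0("FDR.stepdown=", FDR))
print(col.names)
print(selected)
```

With these made-up p-values, g1 and g3 are selected, which agrees with p.adjust(pval, method = "BH") <= FDR.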
The print method for class fit.n.data contains a call to TopGenes, so that calls to TopGenes and to print work almost the same, except that the latter produces a model fit summary as well.
Grant Izmirlian izmirlian@nih.gov
EB.Anova, EBfit, SimAffyDat, TopGenes, SimNorm.IG, SimMVN.IW, SimMVN.mxIW, SimOneNorm.IG, SimOneMVN.IW, SimOneMVN.mxIW
# The included example dataset is a simulated Affymetrix oligonucleotide
# array experiment. Type ?SimAffyDat for details.
data(SimAffyDat)

# Fit the Wishart/Inverse Wishart empirical Bayes model and derive per gene
# Shared Variance Hotelling T-Squared (ShHT2) statistics.
fit.SimAffyDat <- EB.Anova(data=SimAffyDat, labels=c("log2.grp" %,% (1:2)),
                           H0="zero.means", Var.Struct = "general")

# Top 20 genes (sorted by decreasing ShHT2 statistic) and model summary
fit.SimAffyDat

# Same screen output & opens html browser with genelist linked to GeneCards database.
TopGenes(fit.SimAffyDat, browse = TRUE)

# Only the genes selected by the Benjamini-Hochberg procedure at FDR=0.05
TopGenes(fit.SimAffyDat, FDR=0.05, allsig=TRUE)

# Just the top 35 genes
TopGenes(fit.SimAffyDat, n.g = 35)

# Try the Var.Struct="simple" option:
fitSV.SimAffyDat <- update(EBfit(fit.SimAffyDat), Var.Struct = "simple")

# Now try TopGenes using the univariate statistic:
TopGenes(fitSV.SimAffyDat, FDR=0.05, allsig=TRUE)