TopGenes {SharedHT2}    R Documentation

Gene lists linked to Gene Cards

Description

TopGenes creates a genelist sorted on values of a chosen statistic.

Usage

  TopGenes(obj, by = "EB", ref = 1, FDR = 0.05, allsig = FALSE, n.g = 20, 
           browse = FALSE, search.url = genecards, path = "", file = "") 

Arguments

obj An object of class fit.n.data returned by EB.Anova
by Either "EB" or "naive": whether to use the empirical Bayes or the naive variant of the statistic. Defaults to "EB".
ref If you would like to see group to group fold changes included in the table then specify ref=k where k is the index of the group considered referent.
FDR The false discovery rate (FDR) to use in the Benjamini-Hochberg (BH) stepdown procedure.
allsig Set to TRUE if you only want to see the genes meeting the BH criterion. Defaults to FALSE
n.g If allsig is FALSE, you must specify the number of top genes that you want to see.
browse Set to TRUE if you want the results displayed in the HTML browser with gene identifiers linked to the GeneCards database at the Weizmann Institute. Defaults to FALSE.
search.url A URL prefix that searches an online database once a gene identifier is appended to its end. Defaults to 'genecards', included in this package, for searching the GeneCards database at the Weizmann Institute.
file If browse is set to TRUE and you want to save the HTML file, supply a file name, e.g. file="foobgwb.html".
path If you specify a file name above, you may optionally specify a path name. In the Unix implementation, specifying a file name without a path writes to the current working directory. This feature is not yet supported in Windows; use the path argument to specify a directory explicitly.
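
The role of search.url can be illustrated with a minimal base R sketch. The URL prefix and the gene identifiers below are placeholders chosen for illustration; the actual value of 'genecards' shipped with the package is not reproduced here, and how TopGenes builds its HTML internally is an assumption.

```r
# Hypothetical URL prefix (NOT the package's 'genecards' value).
search.url <- "https://example.org/search?gene="

# Illustrative Affymetrix-style identifiers, as they might appear in GeneId.
gene.ids <- c("1007_s_at", "1053_at")

# Appending each identifier to the prefix yields one search link per gene.
links <- paste0(search.url, gene.ids)

# Wrapping each link in an anchor tag gives the kind of clickable
# identifier a browser view could display.
hrefs <- sprintf('<a href="%s">%s</a>', links, gene.ids)
print(hrefs)
```

Any database whose search page accepts the identifier at the end of its URL could be substituted for the placeholder prefix in the same way.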

Value

A sorted genelist in the form of an n.g by 5+d matrix, where d is the number of groups. The matrix contains the following columns:

RowNum Row number from the original unsorted data frame. Useful when you happen to know that the first 100 genes are true positives and the rest are not (as is the case with the supplied dataset, SimAffyDat).
GeneId Taken from the row component of dimnames in the original data frame. As such, it is very useful to name your rows using the Affymetrix gene identifiers, as these are searchable in the GeneCards database.
'NAME1' Group mean corresponding to the group named NAME1
...
'NAMEd' Group mean corresponding to the group named NAMEd
'TYPE'.stat The statistic used in the sort, determined by user choices at two levels. First, when computations are performed inside the call to EB.Anova, the specification of Var.Struct, which defaults to "general" with alternate value "simple", determines whether the multivariate test, i.e. the Hotelling T-squared (HT2), or the univariate F test (UT2) is computed. In both cases, both the empirical Bayes and standard variants are computed. Next, after the computation is completed and the user requests the sorted genelist, the by argument, either "EB" (default) or "naive", determines which of the two computed statistics is used to perform the sort. Thus TYPE assumes one of four values: ShHT2, HT2, ShUT2, or UT2. The split between the first two and the last two follows the value of Var.Struct used in the EB.Anova computation, while the split within each pair follows the by argument passed to TopGenes.
'TYPE'.p-val The p-value of that statistic under the corresponding model. See the manuscript sharedHT2.pdf in the ./doc directory.
FDR.stepdown='FDR' The BH criterion values computed as FDR * rank/Ngenes
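
Two details of the columns above can be sketched in base R: how the TYPE label follows from Var.Struct and by, and how the BH criterion values FDR * rank/Ngenes are formed. The helper names stat_type and bh_criterion and the p-values are hypothetical, introduced only for illustration; they are not functions of this package, and the exact rule TopGenes uses to turn the criterion into a selection is not reproduced here.

```r
# Hypothetical helper: map the two user choices onto the TYPE label.
stat_type <- function(Var.Struct = c("general", "simple"),
                      by = c("EB", "naive")) {
  Var.Struct <- match.arg(Var.Struct)
  by <- match.arg(by)
  # "general" gives the multivariate HT2, "simple" the univariate UT2.
  base <- if (Var.Struct == "general") "HT2" else "UT2"
  # The empirical Bayes variant carries the "Sh" prefix.
  if (by == "EB") paste0("Sh", base) else base
}

# Hypothetical helper: BH criterion values, FDR * rank / Ngenes,
# where rank 1 belongs to the smallest p-value.
bh_criterion <- function(pval, FDR = 0.05) {
  FDR * rank(pval, ties.method = "first") / length(pval)
}

type <- stat_type("general", "EB")
stat.col <- paste0(type, ".stat")        # name of the statistic column

pval <- c(0.001, 0.20, 0.012, 0.04, 0.9) # made-up p-values, 5 genes
crit <- bh_criterion(pval, FDR = 0.05)

# Side-by-side view of each p-value and its criterion value.
print(data.frame(pval = pval, crit = crit, below = pval <= crit))
```

With five genes and FDR = 0.05, the criterion grows in steps of 0.01 from the smallest to the largest p-value, so a gene is flagged here only when its p-value does not exceed its own rank-scaled threshold.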

Note

The print method for class fit.n.data contains a call to TopGenes so that calls to TopGenes and to print work almost the same, except the latter produces a model fit summary as well.

Author(s)

Grant Izmirlian izmirlian@nih.gov

See Also

EB.Anova, EBfit, SimAffyDat, SimW.IW, Simnu.mix

Examples


# The included example dataset is a simulated Affymetrix oligonucleotide
# array experiment. Type ?SimAffyDat for details.

  data(SimAffyDat)

# Fit the Wishart/Inverse Wishart empirical Bayes model and derive per gene
# Shared Variance Hotelling T-Squared (ShHT2) statistics.

  fit.SimAffyDat <- EB.Anova(data=SimAffyDat, labels=c("log2.grp" %,% (1:2)),
                             H0="zero.means", Var.Struct = "general")

# Top 20 genes (sorted by decreasing ShHT2 statistic) and model summary

  fit.SimAffyDat

# Same screen output & opens html browser with genelist linked to GeneCards database.

  TopGenes(fit.SimAffyDat, browse = TRUE)

# Only the genes selected by the Benjamini-Hochberg procedure at FDR=0.05

  TopGenes(fit.SimAffyDat, FDR=0.05, allsig=TRUE)

# Just the top 35 genes

  TopGenes(fit.SimAffyDat, n.g = 35)

# Try the Var.Struct="simple" option:

  fitSV.SimAffyDat <- update(EBfit(fit.SimAffyDat), Var.Struct = "simple")

# Now try TopGenes using the univariate statistic:

  TopGenes(fitSV.SimAffyDat, FDR=0.05, allsig=TRUE)


[Package SharedHT2 version 1.3 Index]