lemma {lemma} | R Documentation |
This program implements the EM algorithm for the LEMMA model. LEMMA fits a linear mixed-model for normalized microarray data analysis. See the reference to the LEMMA paper, which contains the underlying model and the theory. For a short summary of the model, see
http://www.stat.cornell.edu/lemma/docs/LEMMAsummary.pdf.
This version supports two treatment groups and either a two-way classification (null and nonnull genes, as in the LEMMA paper), or a three-way classification: null genes, for which statistically there is no difference in expression between the two treatment groups; nonnull group 1 - genes that are significantly more expressed in treatment group 1 than in treatment group 2; and nonnull group 2 - genes that are significantly more expressed in treatment group 2 than in treatment group 1.
The program runs on both Windows and Linux.
lemma(dataframe, locfdrcutoff=0.2, fdrcutoff=0.2, mgq=1, titletext="", outdir="OUT", topgenes="nonnull", tol=1e-6, maxIts=50000, modes=3, plots=TRUE)
dataframe |
The data frame file containing the normalized data (the first column contains the gene IDs, the second column contains the gene names, the next n1 columns contain the normalized data for treatment group 1 (Y1), and the last n2 columns contain the normalized data for treatment group 2 (Y2). |
locfdrcutoff |
the local fdr cutoff value for detecting nonnull genes. Default=0.2. |
fdrcutoff |
the level for the FDR procedure for detecting nonnull genes. Default=0.2. |
mgq |
The quantile used for eliminating genes with extreme m_g values. Default=1. |
titletext |
The title to be used in the output files. |
outdir |
the output directory (use / as a directory separator, even on Windows). Default=OUT. |
topgenes |
the number of genes to print (sorted in ascending order by local-fdr or adjusted p-value). Use `all' to print all the genes, or `nonnull' to print only the genes that are declared nonnull with the given local fdr or FDR cut-off values. Default=`nonnull'. |
tol |
The tolerance level for determining the convergence of the EM algorithm. Default=1e-6. |
maxIts |
The maximum number of EM iterations. Default=50000. |
modes |
the number of assumed (null + nonnull) groups. Can be either 2 or 3 (default=3). |
plots |
logical - if TRUE, produce the diagnostics plots in ps and pdf formats (default=TRUE). |
The input should consist of a data frame with G rows, and have the following structure:
In this version n1 and n2 do not have to be the same, but all the rows in Y1 have to have n1 elements, and all the rows in Y2 have to have n2 elements.
All of the parameter estimates, plots, and gene lists will be saved under the outdir directory. In particular, this directory will contain the following files:
log.txt
resultsRR.txt
resultsFDR.txt
AllData.RData
Note:
## Not run: lemma(apoai,titletext="APO-AI, Callow et al (2000)",outdir="OUT/apoai", plots=F) lemmaPlots("OUT/apoai",mgq=0.99, titletext="APO-AI (Callow et al., 2000)") lemma(simdata,titletext="Simulated data",outdir="OUT/simdata") # Similarly, if the user wants to use the 2-way classification: lemma(apoai,titletext="APO-AI, Callow et al (2000)",outdir="OUT/apoai", modes=2, plots=F) lemmaPlots("OUT/apoai",mgq=0.99,titletext="APO-AI (Callow et al., 2000)", modes=2) ## End(Not run)