regularizedt {st}    R Documentation

Various (Regularized) t Statistics

Description

These functions provide a simple interface to a variety of (regularized) t statistics that are commonly used in the analysis of high-dimensional case-control studies.

Usage

studentt.stat(X, L)
studentt.fun(L)
diffmean.stat(X, L)
diffmean.fun(L)
efront.stat(X, L, verbose=TRUE)
efront.fun(L, verbose=TRUE)
sam.stat(X, L)
sam.fun(L)
samL1.stat(X, L, method=c("lowess", "cor"), plot=FALSE, verbose=TRUE)
samL1.fun(L, method=c("lowess", "cor"), plot=FALSE, verbose=TRUE)
modt.stat(X, L)
modt.fun(L)

Arguments

X        data matrix. Note that the columns correspond to variables ("genes") and the rows to samples.
L        group indicator vector. Samples belonging to the first group are assigned a 1, and those belonging to the second group a 2.
method   determines how the smoothing parameter is estimated, either "lowess" or "cor" (applies only to the improved SAM statistic samL1).
plot     if TRUE, output a diagnostic plot (applies only to the improved SAM statistic samL1).
verbose  if TRUE, print some (more or less useful) information during computation.

Details

studentt.* computes the standard equal variance t statistic.
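
For orientation, the equal variance t statistic can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name pooled_t is hypothetical, and the sign convention (group 1 minus group 2) is an assumption.

```r
# Hypothetical helper, not part of the st package: equal variance
# two-sample t statistic for every column (variable) of X.
pooled_t <- function(X, L) {
  X1 <- X[L == 1, , drop = FALSE]  # samples in group 1
  X2 <- X[L == 2, , drop = FALSE]  # samples in group 2
  n1 <- nrow(X1)
  n2 <- nrow(X2)

  # pooled variance per variable
  v <- ((n1 - 1) * apply(X1, 2, var) + (n2 - 1) * apply(X2, 2, var)) /
    (n1 + n2 - 2)

  # difference of group means divided by its standard error
  (colMeans(X1) - colMeans(X2)) / sqrt(v * (1 / n1 + 1 / n2))
}

set.seed(1)
X <- matrix(rnorm(6 * 4), nrow = 6)  # 6 samples (rows), 4 variables (columns)
L <- c(1, 1, 1, 2, 2, 2)
pooled_t(X, L)
```

For a single variable the result should agree with t.test(..., var.equal=TRUE)$statistic from base R.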

diffmean.* computes the difference of means (i.e. the fold-change for log-transformed data).
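
As a rough illustration of the difference of means, the base R sketch below computes it per variable. The helper name mean_diff and the sign convention (group 1 minus group 2) are assumptions made for this example only.

```r
# Hypothetical helper, not part of the st package: difference of
# group means for every column (variable) of X.
mean_diff <- function(X, L) {
  colMeans(X[L == 1, , drop = FALSE]) - colMeans(X[L == 2, , drop = FALSE])
}

# 4 samples (rows), 2 variables (columns)
X <- rbind(c(1, 10),
           c(3, 10),
           c(5, 12),
           c(7, 12))
L <- c(1, 1, 2, 2)
mean_diff(X, L)  # -4 -2
```

If the data are on the log2 scale, a difference of 1 corresponds to a two-fold change.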

efront.* computes the t statistic using the 90 % rule of Efron et al. (2001).
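
The sketch below shows one possible reading of the 90% rule, which this page does not spell out: the 90th percentile of the per-variable standard errors is added to every denominator. That reading, the helper name efron_t_sketch, and the sign convention are all assumptions; the package code may differ in detail.

```r
# Hypothetical sketch, not the st package code: t statistic whose
# denominator is inflated by the 90th percentile of all standard errors.
efron_t_sketch <- function(X, L) {
  X1 <- X[L == 1, , drop = FALSE]
  X2 <- X[L == 2, , drop = FALSE]
  n1 <- nrow(X1)
  n2 <- nrow(X2)

  # pooled variance and standard error of the mean difference, per variable
  v <- ((n1 - 1) * apply(X1, 2, var) + (n2 - 1) * apply(X2, 2, var)) /
    (n1 + n2 - 2)
  se <- sqrt(v * (1 / n1 + 1 / n2))

  # assumed regularization constant: 90th percentile of the standard errors
  a0 <- unname(quantile(se, probs = 0.9))

  (colMeans(X1) - colMeans(X2)) / (se + a0)
}

set.seed(1)
X <- matrix(rnorm(6 * 50), nrow = 6)  # 6 samples, 50 variables
L <- c(1, 1, 1, 2, 2, 2)
score <- efron_t_sketch(X, L)
head(order(abs(score), decreasing = TRUE))
```

Because the added constant is non-negative, each regularized score is no larger in absolute value than the corresponding unregularized t statistic, which damps variables whose variance estimate happens to be very small.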

sam.* computes the SAM t statistic of Tusher et al. (2001). Note that this requires the additional installation of the "samr" package.

samL1.* computes the improved SAM t statistic of Wu (2005). Note that part of the code in this function is based on the R code provided by B. Wu.

modt.* computes the moderated t statistic of Smyth (2004). Note that this requires the additional installation of the "limma" package.

All of the above statistics are compared with each other (and with the shrinkage t statistic) in Opgen-Rhein and Strimmer (2006).

Value

The *.stat functions directly return the respective statistic for each variable.
The corresponding *.fun functions return a function that produces the respective statistics when applied to a data matrix (this is very useful for simulations).
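
The pairing of *.stat and *.fun can be pictured as a function factory. The base R sketch below is a hypothetical analogue (diffmean_factory is not a package function) that fixes the group labels once and then applies the resulting function to several simulated data matrices.

```r
# Hypothetical factory mirroring the *.fun idea: fix L once,
# get back a function of the data matrix only.
diffmean_factory <- function(L) {
  force(L)  # evaluate L now so the closure keeps this value
  function(X) {
    colMeans(X[L == 1, , drop = FALSE]) - colMeans(X[L == 2, , drop = FALSE])
  }
}

L <- c(1, 1, 1, 2, 2, 2)
stat <- diffmean_factory(L)

# apply the same statistic to three simulated 6 x 5 data matrices
set.seed(42)
sims <- replicate(3, stat(matrix(rnorm(6 * 5), nrow = 6)))
dim(sims)  # 5 3
```

Each column of sims holds the statistics from one simulated data set, which is the kind of repeated evaluation the *.fun variants are meant to simplify.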

Author(s)

Rainer Opgen-Rhein (http://opgen-rhein.de) and Korbinian Strimmer (http://strimmerlab.org).

References

Opgen-Rhein, R., and K. Strimmer. 2006. Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach.

A preprint is available at http://strimmerlab.org/publications/shrinkt2006.pdf.

See Also

shrinkt.stat.

Examples

# load st library 
library("st")

# load Choe et al. (2005) data
data(choedata)
X <- choe2.mat
dim(X) # 6 11475  
L <- choe2.L
L

# Student t statistic
score <- studentt.stat(X, L)
order(abs(score), decreasing=TRUE)[1:10]
# [1] 11068   724  9990 11387 11310  9985  9996 11046    43    50

# difference of means / fold change statistic
score <- diffmean.stat(X, L)
order(abs(score), decreasing=TRUE)[1:10]
# [1]  4790  6620  1022 10979   970    35  2693  5762  5885     2

# Efron t statistic (90% rule)
score <- efront.stat(X, L)
order(abs(score), decreasing=TRUE)[1:10]
# [1]  4790 10979 11068  1022    50   724  5762    43 10936  9939

# SAM statistic
# (requires "samr" package)
#score <- sam.stat(X, L)
#order(abs(score), decreasing=TRUE)[1:10]
# [1]  4790 10979  1022  5762    35   970    50 11068 10905  2693

# improved SAM statistic
score <- samL1.stat(X, L)
order(abs(score), decreasing=TRUE)[1:10]
# [1]  1  2  3  4  5  6  7  8  9 10
# here all scores are zero!

# moderated t statistic
# (requires "limma" package)
#score <- modt.stat(X, L)
#order(abs(score), decreasing=TRUE)[1:10]
# [1]  4790 10979  1022  5762    35    50 11068   970 10905    43

# shrinkage t statistic
score <- shrinkt.stat(X, L)
order(abs(score), decreasing=TRUE)[1:10]
# [1] 10979 11068    50  1022   724  5762    43  4790 10936  9939

[Package st version 1.0.0 Index]