shrinkcat.stat {st}    R Documentation

Correlation-Adjusted t Score

Description

shrinkcat.stat and shrinkcat.fun compute a shrinkage estimate of the "correlation-adjusted t score" of Zuber and Strimmer (2009).

Usage

shrinkcat.stat(X, L, group.thresh = 1, verbose=TRUE)
shrinkcat.fun(L, group.thresh = 1, verbose=TRUE)

Arguments

X             data matrix. Columns correspond to variables ("genes") and rows to samples.
L             vector of class labels for the two groups.
group.thresh  threshold controlling the grouping of features. If group.thresh equals 1 (the default), no grouping occurs.
verbose       if TRUE, print status information during computation.

Details

The cat ("correlation-adjusted t") score is the product of the square root of the inverse correlation matrix with a vector of t scores. The cat score thus describes the contribution of each individual feature in separating the two groups, after removing the effect of all other features.

In Zuber and Strimmer (2009) it is shown that the cat score is a natural criterion to rank features in the presence of correlation. If there is no correlation, the cat score reduces to the usual t score (hence in this case the estimate from shrinkcat.stat equals that from shrinkt.stat).
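
The definition above, and the fact that the cat score reduces to the t score without correlation, can be illustrated with a minimal base R sketch. The helpers inv_sqrt and cat_sketch are hypothetical names, not part of the st package, and the sketch takes a known correlation matrix and given t scores as input rather than the shrinkage estimates that shrinkcat.stat computes.

```r
# Hypothetical helper (not part of st): inverse square root of a symmetric
# positive definite matrix, computed from its eigendecomposition.
inv_sqrt <- function(R) {
  e <- eigen(R, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}

# Hypothetical helper: cat score = R^(-1/2) %*% t
cat_sketch <- function(tscore, R) {
  drop(inv_sqrt(R) %*% tscore)
}

# Illustrative t scores for three features (made-up numbers)
tscore <- c(2.0, -1.0, 0.5)

# No correlation: the cat scores equal the t scores
cat_sketch(tscore, diag(3))

# Features 1 and 2 are correlated, feature 3 is uncorrelated with both,
# so only the first two scores are adjusted
R <- matrix(c(1.0, 0.8, 0.0,
              0.8, 1.0, 0.0,
              0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
cat_sketch(tscore, R)
```

In this sketch the sum of the squared cat scores equals t' R^(-1) t, because the inverse square root matrix multiplied by itself gives the inverse correlation matrix.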

Additionally, the option group.thresh specifies a correlation neighborhood around each feature, determined by absolute empirical correlation. Within this neighborhood the grouped cat score is computed. This score is a variant of the Hotelling T^2 statistic and summarizes the total contribution of the whole feature set to distinguishing the two groups. If the neighborhood contains only a single feature, the grouped cat score is identical to the cat score.
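
As a rough illustration of how a threshold on absolute correlation defines a neighborhood, consider the base R sketch below. The helper neighborhood is a hypothetical name, and the rule used (absolute correlation greater than or equal to the threshold) is an assumption chosen to be consistent with the statement that group.thresh = 1 produces no grouping. The exact rule used inside shrinkcat.stat may differ.

```r
# Hypothetical helper (not part of st): indices of all features whose
# absolute correlation with feature j is at least thresh.
# ASSUMPTION: the ">=" rule is illustrative, not taken from the package source.
neighborhood <- function(R, j, thresh) {
  which(abs(R[j, ]) >= thresh)
}

# Made-up correlation matrix for three features
R <- matrix(c( 1.0,  0.8,  0.1,
               0.8,  1.0, -0.3,
               0.1, -0.3,  1.0), nrow = 3, byrow = TRUE)

neighborhood(R, 1, 1)    # threshold 1: only feature 1 itself, i.e. no grouping
neighborhood(R, 1, 0.5)  # threshold 0.5: features 1 and 2 form a group
```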

Value

shrinkcat.stat returns a vector containing a shrinkage estimate of the "cat score" for each variable/gene.
shrinkcat.fun returns a function that computes the cat score when applied to a data matrix (this is useful for simulations, where the same labels are reused with many data matrices).
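
The pattern of fixing the labels once and reusing the resulting function across many data matrices can be sketched in base R as follows. The factory make_diffmean_fun is a hypothetical stand-in that computes a plain difference of group means, not the cat score. It only mirrors the calling convention described above, in which the labels are supplied first and the data matrix later.

```r
# Hypothetical stand-in for a *.fun constructor: captures the labels L once
# and returns a function of the data matrix X only.
make_diffmean_fun <- function(L) {
  L <- factor(L)
  stopifnot(nlevels(L) == 2)
  function(X) {
    colMeans(X[L == levels(L)[1], , drop = FALSE]) -
      colMeans(X[L == levels(L)[2], , drop = FALSE])
  }
}

set.seed(1)
L <- rep(c("a", "b"), each = 5)   # 10 samples, two groups
stat_fun <- make_diffmean_fun(L)

# Apply the same scoring function to three simulated 10 x 4 data matrices
sims <- replicate(3, stat_fun(matrix(rnorm(10 * 4), nrow = 10)))
dim(sims)   # one row per feature, one column per simulated data set
```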

Author(s)

Verena Zuber and Korbinian Strimmer (http://strimmerlab.org).

References

Zuber, V., and K. Strimmer. 2009. Gene ranking and biomarker discovery under correlation. See http://arxiv.org/abs/0902.0751 for publication details.

See Also

shrinkt.stat, cst.stat, lait.stat.

Examples

# load st library 
library("st")

# prostate data set
data(singh2002)
X = singh2002$x
L = singh2002$y

dim(X)      # 102 6033 
length(L)   # 102

# shrinkage cat statistic
## Not run: 
score = shrinkcat.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# 610  364 1720 3647 3375  332 3282 3991 1557  914

# compute q-values and local false discovery rates
library("fdrtool")
fdr.out = fdrtool(as.vector(score))
sum(fdr.out$qval < 0.05)
sum(fdr.out$lfdr < 0.2)
## End(Not run)

# compared with:

# shrinkage t statistic 
score = shrinkt.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# 610 1720 3940  914  364  332 3647 4331  579 1068

# Student t statistic
score = studentt.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# 610 1720  364  332  914 3940 4546 1068  579 4331

# difference of means ("Fold Change")
score = diffmean.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# 735  610  694  298  698  292  739 3940  702  721

[Package st version 1.1.3 Index]