score {supclust}    R Documentation
For a set of n observations grouped into two classes (for
example, n expression values of a gene), the score
function measures how well the two classes are separated. It can be
interpreted as counting, for each observation with response zero, the
number of observations in response class one that have a smaller value,
and summing these counts.
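The counting interpretation above can be sketched in a few lines of plain R. This is only an illustration of the description, not the package's implementation: `score_sketch` is a hypothetical name, and the use of a strict comparison for tied values is an assumption, since the text does not say how ties are handled.

```r
# Hypothetical helper, not part of supclust: a direct transcription of the
# counting interpretation. Ties are ignored (strict "<"), which is an
# assumption rather than documented behaviour.
score_sketch <- function(x, resp) {
  stopifnot(length(x) == length(resp), all(resp %in% c(0, 1)))
  x0 <- x[resp == 0]  # values of the cases with response zero
  x1 <- x[resp == 1]  # values of the cases with response one
  # For each class-0 value, count the class-1 values that are strictly smaller
  sum(vapply(x0, function(v) sum(x1 < v), integer(1)))
}

x    <- c(1.2, 3.4, 2.2, 0.5, 4.1)
resp <- c(0,   1,   0,   1,   1)
score_sketch(x, resp)  # 2: only 0.5 lies below each of 1.2 and 2.2
```

Up to the handling of ties, this count corresponds to a Mann-Whitney-type statistic for two samples.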
score(x, resp)
x       Numeric vector of length n, for example containing gene or cluster expression values of n different cases.
resp    Numeric vector of length n containing the "binary" class labels of the cases. Must be coded as 0 and 1.
A numeric value, the score. The minimal score is zero; the maximal
score is the product of the number of samples in class 0 and the number
of samples in class 1. Values near the minimal or maximal score indicate
good separation, whereas intermediate scores indicate poor separation.
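As a small illustration of these bounds, the sketch below evaluates the counting interpretation on three toy vectors. `count_score` is a hypothetical stand-in written for this example, not the package function, and it assumes there are no tied values.

```r
# Hypothetical one-line transcription of the counting interpretation:
# number of (class-0, class-1) pairs in which the class-0 value is larger.
count_score <- function(x, resp) sum(outer(x[resp == 0], x[resp == 1], ">"))

resp <- c(0, 0, 0, 1, 1)             # 3 cases in class 0, 2 in class 1

count_score(c(1, 2, 3, 4, 5), resp)  # 0: class 0 entirely below class 1
count_score(c(4, 5, 6, 1, 2), resp)  # 6 = 3 * 2: class 0 entirely above class 1
count_score(c(1, 4, 6, 3, 5), resp)  # 3: classes interleaved
```

Both extremes correspond to perfectly separated classes and differ only in which class has the larger values; the interleaved case falls midway between them.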
Marcel Dettling, dettling@stat.math.ethz.ch
Marcel Dettling (2002). Supervised Clustering of Genes. See http://stat.ethz.ch/~dettling/supercluster.html
Marcel Dettling and Peter Bühlmann (2002). Supervised Clustering of Genes. Genome Biology, 3(12): research0069.1-0069.15.
wilma; margin is the second statistic that is used there.
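## Load the leukemia data shipped with the package
data(leukemia, package="supclust")
op <- par(mfrow=c(1,3))

## Gene 69: plot expression against class label, with the score in the title
plot(leukemia.x[,69], leukemia.y)
title(paste("Score = ", score(leukemia.x[,69], leukemia.y)))

## Sign-flipping is very important
## Gene 161 before sign-flipping
plot(leukemia.x[,161], leukemia.y)
title(paste("Score = ", score(leukemia.x[,161], leukemia.y)))

## The same gene after sign-flipping
x <- sign.flip(leukemia.x, leukemia.y)$flipped.matrix
plot(x[,161], leukemia.y)
title(paste("Score = ", score(x[,161], leukemia.y)))

## Restore the previous graphics settings
par(op)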