alternative.probable {HighProbability} | R Documentation |
alternative.probable determines which alternative hypotheses have sufficiently high probability of truth for acceptance.
alternative.beneficial determines which alternative hypotheses should be accepted according to a decision-theoretic approach.
alternative.probable(p.values, min.probability = 0.5, marginal.probability = NULL, max.iteration = 10, tolerance = get.marginal.probability.tolerance(), plot.relative.gain = FALSE, call.browser = FALSE)
alternative.beneficial(p.values, cost.to.benefit = 1, marginal.probability = NULL, max.iteration = 10, tolerance = get.marginal.probability.tolerance(), plot.relative.gain = FALSE)
p.values |
a vector of p-values that have not been corrected for multiple comparisons. For example, p-values may be calculated from wilcox.test or cor.test for two groups, or from lm for multiple groups. Alternatively, permutation-based p-values (achieved significance levels) may be calculated using sample. |
min.probability |
the lowest posterior probability that an alternative hypothesis must have for it to be considered true; the default is 0.5. This probability is conditional on the p-value and thus on the test statistic used to generate the p-value. |
cost.to.benefit |
the ratio of the cost of accepting a false alternative hypothesis to the benefit of accepting a true alternative hypothesis. For example, in a microarray study, one may specify the expense of the follow-up studies needed to investigate a gene that only seems to be differentially expressed, divided by the economic or other benefit of finding a gene that really is differentially expressed. |
marginal.probability |
a known or estimated lower bound on the proportion of p-values that correspond to true alternative hypotheses. By default (NULL), marginal.probability is estimated and printed. Supplying a value, such as 0, skips the estimation and suppresses the printing. |
max.iteration |
the maximum number of iterations to perform if the estimates of the marginal probability do not converge. |
tolerance |
the difference in proportion estimates that defines convergence. The default value is 0.005. |
plot.relative.gain |
If TRUE, the relative desirability will be plotted as a function of the significance level. |
call.browser |
if TRUE, the debugging facilities are used. |
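The description of p.values above mentions permutation-based p-values calculated using sample. A minimal sketch of one such calculation for a single variable follows. It is not part of HighProbability: the helper name permutation.p.value, the default number of permutations, and the choice of the absolute difference in group means as the test statistic are all assumptions made for illustration.

```r
# Hypothetical helper (not part of HighProbability): two-sided permutation
# p-value for the difference in means between two groups.
permutation.p.value <- function(x, y, n.permutations = 999) {
  pooled <- c(x, y)
  first.group <- seq_len(length(x))
  observed <- abs(mean(x) - mean(y))
  permuted <- replicate(n.permutations, {
    shuffled <- sample(pooled)  # random relabeling of the pooled observations
    abs(mean(shuffled[first.group]) - mean(shuffled[-first.group]))
  })
  # Count the observed statistic among the permutations so that the
  # p-value is never exactly zero.
  (sum(permuted >= observed) + 1) / (n.permutations + 1)
}

set.seed(1)
x <- rnorm(5, mean = 2)  # group 1, simulated with a shifted mean
y <- rnorm(5, mean = 0)  # group 2
p <- permutation.p.value(x, y)
```

Applying such a helper to each row of a data matrix would give an uncorrected vector of the kind expected for p.values.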
See the following references for details.
Each of these functions returns a logical vector with the same length as p.values. If an element in the vector is TRUE, then the corresponding p-value is low enough to warrant considering its alternative hypothesis true. The vector returned indicates which null hypotheses are considered true and which are considered false, based either on belief (for alternative.probable) or on a cost/benefit decision analysis (for alternative.beneficial).
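As a rough illustration of how the belief-based and cost/benefit criteria can relate, consider a single hypothesis with posterior probability P of being a true alternative. If accepting it yields a benefit when it is true and incurs a cost when it is false, the expected gain, benefit * P - cost * (1 - P), is positive exactly when P exceeds cost.to.benefit / (1 + cost.to.benefit). The sketch below computes that break-even probability. The helper name threshold.probability is hypothetical, and this per-hypothesis reasoning is an assumption made for illustration, not a description of how the package computes its result.

```r
# Hypothetical helper (not part of HighProbability): posterior probability
# above which accepting one alternative hypothesis has positive expected gain,
# assuming gain = benefit * P - cost * (1 - P).
threshold.probability <- function(cost.to.benefit) {
  stopifnot(all(cost.to.benefit >= 0))
  cost.to.benefit / (1 + cost.to.benefit)
}

thresholds <- threshold.probability(c(0.25, 1, 9))  # 0.2, 0.5, 0.9
```

Under this assumption, a cost.to.benefit of 1 corresponds to a probability of 0.5, the default min.probability, which is consistent with the comparison of is.beneficial and is.probable in the example.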
David R. Bickel (DavidBickel.66846716@bloglines.com, http://www.davidbickel.com), Zahra Montazeri (zahra@math.carleton.ca)
Bickel, D. R. (2004) "Error-Rate and Decision-Theoretic Methods of Multiple Testing: Which Genes Have High Objective Probabilities of Differential Expression?," Statistical Applications in Genetics and Molecular Biology 3: Iss. 1, Article 8. Available on-line at http://www.bepress.com/sagmb/vol3/iss1/art8
Bickel, D. R. (2004) "HighProbability determines which alternative hypotheses are highly probable: Genomic applications include detection of differential gene expression," arXiv.org e-print ID q-bio.QM/0402049. Available on-line at http://arxiv.org/abs/q-bio.QM/0402049
marginal.probability, t.test, wilcox.test, cor.test, lm, sample
library(HighProbability)

n.variables <- 10000  # This could be the number of genes on a microarray.
n.individuals <- 5    # This could be the number of microarrays per group.
n.effects <- 1000     # This is the number of alternative hypotheses that are true, e.g., number of genes differentially expressed.

# Observed data, e.g., logarithms of gene expression ratios, for group 1.
# The first n.effects rows have a shifted mean; the remaining rows do not.
x1 <- matrix(c(rnorm(n.effects * n.individuals, mean = 2, sd = 1),
               rnorm((n.variables - n.effects) * n.individuals, mean = 0, sd = 1)),
             nrow = n.variables, byrow = TRUE)
# The same for group 2, with no shifted rows.
x2 <- matrix(rnorm(n.variables * n.individuals, mean = 0, sd = 1),
             nrow = n.variables, byrow = TRUE)

# One uncorrected p-value per variable.
p.values <- numeric(n.variables)
for(i in 1:n.variables)
  p.values[i] <- t.test(x1[i, ], x2[i, ])$p.value

true.rows <- 1:n.effects                  # rows simulated with a real effect
null.rows <- (n.effects + 1):n.variables  # rows simulated without an effect

# Selects which alternative hypotheses are probably true, e.g., which genes are probably differentially expressed.
is.probable <- alternative.probable(p.values)
# Numbers of true and false calls of differential expression.
c(sum(is.probable[true.rows]), sum(is.probable[null.rows]))

# Require a probability of at least 0.90 for each accepted alternative hypothesis.
is.probable.90 <- alternative.probable(p.values, min.probability = .90)
# Smaller numbers of true and false calls of differential expression.
c(sum(is.probable.90[true.rows]), sum(is.probable.90[null.rows]))

# Check whether the cost/benefit criterion selects the same hypotheses.
is.beneficial <- alternative.beneficial(p.values, cost.to.benefit = 1)
all.equal(is.beneficial, is.probable)