alternative.probable {HighProbability}    R Documentation

Alternative hypotheses accepted by empirical Bayes analysis

Description

alternative.probable determines which alternative hypotheses have sufficiently high probability of truth for acceptance.

alternative.beneficial determines which alternative hypotheses should be accepted according to a decision-theoretic approach.

Usage

alternative.probable(p.values, min.probability=0.5, marginal.probability = NULL, max.iteration=10, tolerance=get.marginal.probability.tolerance(), plot.relative.gain = FALSE, call.browser=FALSE)
alternative.beneficial(p.values, cost.to.benefit=1, marginal.probability = NULL, max.iteration=10, tolerance=get.marginal.probability.tolerance(), plot.relative.gain = FALSE)

Arguments

p.values a vector of p-values that have not been corrected for multiple comparisons. For example, p-values may be calculated from wilcox.test or cor.test for two groups, or from lm for multiple groups. Alternatively, permutation-based p-values (achieved significance levels) may be calculated using sample.
min.probability the lowest posterior probability that an alternative hypothesis must have for it to be accepted as true. This probability is conditional on the p-value and thus on the test statistic used to generate the p-value.
cost.to.benefit the ratio of the cost of accepting a false alternative hypothesis to the benefit of accepting a true alternative hypothesis. For example, in a microarray study, one may specify the expense of follow-up studies needed to investigate a gene that only seems to be differentially expressed, divided by the economic or other benefit of finding a gene that really is differentially expressed.
marginal.probability a known or estimated lower bound on the proportion of p-values that correspond to true alternative hypotheses. The default action is to estimate and print marginal.probability. Printing may be suppressed by supplying 0 or some other value.
max.iteration the maximum number of iterations to perform if the estimates of the marginal probability do not converge.
tolerance the difference in proportion estimates that defines convergence; the default value is 0.005.
plot.relative.gain If TRUE, the relative desirability will be plotted as a function of the significance level.
call.browser If TRUE, the debugging facilities are used.
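
The description of p.values mentions permutation-based p-values computed with sample. The following is a minimal sketch of one way to do that in base R for two groups. The helper permutation.p.value and its argument n.permutations are hypothetical names introduced here for illustration; they are not part of HighProbability.

```r
# Hypothetical helper (not part of HighProbability): two-sided permutation
# p-value for a difference in group means, using base R's sample().
permutation.p.value <- function(x, y, n.permutations = 999) {
  pooled <- c(x, y)
  n.x <- length(x)
  observed <- abs(mean(x) - mean(y))
  permuted <- replicate(n.permutations, {
    shuffled <- sample(pooled)  # random relabelling of the pooled observations
    abs(mean(shuffled[1:n.x]) - mean(shuffled[-(1:n.x)]))
  })
  # Add 1 to numerator and denominator so the p-value is never exactly 0.
  (sum(permuted >= observed) + 1) / (n.permutations + 1)
}

set.seed(1)
group1 <- rnorm(5, mean = 2)
group2 <- rnorm(5, mean = 0)
p <- permutation.p.value(group1, group2)
```

Applying such a helper to each row of the data would give an uncorrected vector of p-values of the kind that p.values expects.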

Details

See the references listed under References for details.
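
As a rough illustration of how the two criteria can relate, suppose an alternative hypothesis with posterior probability P is accepted whenever its expected benefit exceeds its expected cost, that is, whenever P * benefit > (1 - P) * cost. Under that assumption, which is a simplification for illustration and not a description of the package's internals, the rule is equivalent to P > r / (1 + r), where r is cost.to.benefit. The helper probability.threshold below is hypothetical.

```r
# Hypothetical helper, not a HighProbability function: the posterior
# probability above which accepting an alternative hypothesis has positive
# expected gain under the simple rule P * benefit > (1 - P) * cost.
probability.threshold <- function(cost.to.benefit) {
  stopifnot(all(cost.to.benefit >= 0))
  cost.to.benefit / (1 + cost.to.benefit)
}

# Thresholds for cost-to-benefit ratios of 1/4, 1, and 9.
thresholds <- probability.threshold(c(0.25, 1, 9))
```

Under this simplified rule, a ratio of 1 corresponds to a threshold of 0.5, which matches the pairing of cost.to.benefit = 1 with the default min.probability = 0.5 that the Examples compare.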

Value

Each of these functions returns a logical vector with the same length as p.values. If an element in the vector is TRUE, then the corresponding p-value is low enough to warrant considering its alternative hypothesis true. The vector returned indicates which null hypotheses are considered true and which are considered false, based either on belief (for alternative.probable) or on a cost/benefit decision analysis (for alternative.beneficial).
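
Because the result is an ordinary logical vector aligned with p.values, it can be used directly for counting and subsetting. The sketch below uses made-up values for both vectors so that it runs without the package; in practice, accepted would be the value returned by alternative.probable or alternative.beneficial.

```r
# Illustrative values only; 'accepted' stands in for the logical vector
# returned by alternative.probable(p.values) or alternative.beneficial(p.values).
p.values <- c(0.0004, 0.21, 0.03, 0.76, 0.0011)
accepted <- c(TRUE, FALSE, FALSE, FALSE, TRUE)

stopifnot(length(accepted) == length(p.values))

accepted.indices <- which(accepted)        # positions of accepted alternatives
n.accepted <- sum(accepted)                # how many alternatives were accepted
accepted.p.values <- p.values[accepted]    # p-values of the accepted alternatives
```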

Author(s)

David R. Bickel (DavidBickel.66846716@bloglines.com, http://www.davidbickel.com), Zahra Montazeri (zahra@math.carleton.ca)

References

Bickel, D. R. (2004) "Error-Rate and Decision-Theoretic Methods of Multiple Testing: Which Genes Have High Objective Probabilities of Differential Expression?," Statistical Applications in Genetics and Molecular Biology 3: Iss. 1, Article 8. Available on-line at http://www.bepress.com/sagmb/vol3/iss1/art8

Bickel, D. R. (2004) "HighProbability determines which alternative hypotheses are highly probable: Genomic applications include detection of differential gene expression," arXiv.org e-print ID q-bio.QM/0402049. Available on-line at http://arxiv.org/abs/q-bio.QM/0402049

See Also

marginal.probability, t.test, wilcox.test, cor.test, lm, sample

Examples

n.variables <- 10000

# This could be the number of genes on a microarray.

n.individuals <- 5

# This could be the number of microarrays per group.

n.effects <- 1000 

# This is the number of alternative hypotheses that are true, e.g., number of genes differentially expressed.

x1 <- matrix(c(rnorm(n.effects * n.individuals, mean = 2, sd = 1), rnorm((n.variables - n.effects) * n.individuals, mean = 0, sd = 1)), nrow = n.variables, byrow = TRUE) 

# Observed data, e.g., logarithms of gene expression ratios, for group 1.

x2 <- matrix(rnorm(n.variables * n.individuals, mean = 0, sd = 1), nrow = n.variables, byrow = TRUE) 

# The same for group 2.

p.values <- numeric(n.variables)

for(i in 1:n.variables) p.values[i] <- t.test(x1[i, ], x2[i, ])$p.value

is.probable <- alternative.probable(p.values) 

# Selects which alternative hypotheses are probably true, e.g., which genes are probably differentially expressed.

c(sum(is.probable[1:n.effects]), sum(is.probable[(n.effects + 1):n.variables]))

# Numbers of true and false calls of differential expression.

is.probable.90 <- alternative.probable(p.values, min.probability = .90) 

# Requires a posterior probability of at least 90% before an alternative hypothesis is accepted.

c(sum(is.probable.90[1:n.effects]), sum(is.probable.90[(n.effects + 1):n.variables]))

# Smaller numbers of true and false calls of differential expression.

is.beneficial <- alternative.beneficial(p.values, cost.to.benefit = 1)

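all.equal(is.beneficial, is.probable)

# Checks whether the decision-theoretic selection with cost.to.benefit = 1 agrees with the default selection from alternative.probable.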

[Package HighProbability version 2.1 Index]