ggm.test.edges {GeneNet}    R Documentation

Graphical Gaussian Models: Assess Significance of Edges (and Directions)

Description

ggm.test.edges returns a data frame listing all potential edges in decreasing order of the absolute value of the partial correlation associated with each edge. If fdr=TRUE, the p-values, q-values and posterior probabilities (= 1 - local fdr) for each potential edge are computed as well.

network.test.edges is the same function as ggm.test.edges.

extract.network returns a data frame with a subset of significant edges.

Usage

ggm.test.edges(r.mat, fdr=TRUE, direct=FALSE, plot=TRUE, ...)
network.test.edges(r.mat, fdr=TRUE, direct=FALSE, plot=TRUE, ...)
extract.network(network.all, method.ggm=c("prob", "qval","number"), 
      cutoff.ggm=0.8, method.dir=c("prob","qval","number", "all"), 
      cutoff.dir=0.8, verbose=TRUE)

Arguments

r.mat matrix of partial correlations
fdr estimate q-values and local fdr
direct compute additional statistics for obtaining a partially directed network
plot plot density and distribution function and (local) fdr values
... parameters passed on to fdrtool
network.all list with partial correlations and fdr values for all potential edges (i.e. the output of network.test.edges)
method.ggm determines which criterion is used to select significant partial correlations (default: prob)
cutoff.ggm default cutoff for significant partial correlations
method.dir determines which criterion is used to select significant directions (default: prob)
cutoff.dir default cutoff for significant directions
verbose print information on the number of significant edges etc.

Details

To assess the significance of edges in the GGM, a mixture model is fitted to the partial correlations using fdrtool. This yields (i) two-sided p-values for the test of non-zero correlation, (ii) the corresponding posterior probabilities (= 1 - local fdr), and (iii) tail area-based q-values. See Schäfer and Strimmer (2005) for details.
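The relationship between partial correlations, p-values and q-values can be illustrated with a minimal base R sketch. It is not the computation performed by fdrtool: the degrees of freedom kappa are fixed by hand here (fdrtool estimates the null model from the data), Benjamini-Hochberg adjusted p-values stand in for the tail area-based q-values, and the local fdr is not computed at all. The input values and the names pc, kappa and toy are made up for illustration.

```r
# Toy partial correlations (made-up values, not from any data set)
pc <- c(0.62, -0.48, 0.15, -0.07, 0.03, 0.01)

# Assumed degrees of freedom of the null distribution (hypothetical value;
# fdrtool estimates this parameter from the data)
kappa <- 20

# Two-sided p-values: under the null of zero correlation,
# r * sqrt((kappa - 1) / (1 - r^2)) follows a t distribution
# with kappa - 1 degrees of freedom
t.stat <- pc * sqrt((kappa - 1) / (1 - pc^2))
pval <- 2 * pt(-abs(t.stat), df = kappa - 1)

# Benjamini-Hochberg adjusted p-values as a simple stand-in for q-values
qval <- p.adjust(pval, method = "BH")

# Sort from strongest to weakest absolute correlation
toy <- data.frame(pcor = pc, pval = pval, qval = qval)
toy <- toy[order(-abs(toy$pcor)), ]
print(toy)
```

Because the p-value decreases as the absolute correlation grows, sorting by absolute correlation also sorts the rows by p-value.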

To determine putative directions on this GGM, log-ratios of standardized partial variances are estimated, and subsequently the corresponding (local) fdr values are computed; see Opgen-Rhein and Strimmer (2007).
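As a rough illustration of the quantity involved, the standardized partial variance of a variable (its partial variance divided by its variance) equals the reciprocal of the corresponding diagonal element of the inverse correlation matrix. The sketch below computes it in base R from a plain sample correlation matrix on simulated data. This is an assumption made for brevity: GeneNet is designed for settings where such a plain estimate may not be invertible, and the order of the two nodes in the log-ratio (and hence its sign) is also an assumption, not taken from the package.

```r
set.seed(1)

# Toy data: 50 observations of 4 variables (simulated)
x <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
x[, 2] <- x[, 1] + 0.5 * x[, 2]   # make variable 2 depend on variable 1
colnames(x) <- paste0("g", 1:4)

# Standardized partial variance: partial variance divided by variance,
# i.e. 1 / diagonal of the inverse correlation matrix
P <- cor(x)
spv <- 1 / diag(solve(P))
names(spv) <- colnames(x)

# Log-ratio for one pair of nodes (node order chosen arbitrarily here)
log.spvar.12 <- log(spv[["g2"]] / spv[["g1"]])

print(spv)
print(log.spvar.12)
```

Each standardized partial variance lies between 0 and 1; values near 1 indicate a variable that is hardly explained by the others.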

Value

ggm.test.edges and network.test.edges return a sorted data frame with the following columns:

pcor partial correlation (from r.mat)
node1 first node connected to edge
node2 second node connected to edge
pval p-value
qval q-value
prob probability that edge is nonzero (= 1 - local fdr)
log.spvar log ratio of standardized partial variance (determines direction)
pval.dir p-value (directions)
qval.dir q-value (directions)
prob.dir 1-local fdr (directions)


Each row in the data frame corresponds to one edge, and the rows are sorted according to the absolute strength of the correlation (from strongest to weakest).
extract.network processes the above data frame containing all potential edges, and returns a data frame with a subset of edges. If applicable, an additional last column (11) contains information on the directionality of an edge.
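The three selection criteria named by method.ggm can be pictured with the following base R sketch. It operates on a made-up table with the column names listed above; select.edges is a hypothetical helper and not part of GeneNet, and the exact comparison rules (strict inequalities, "number" meaning the first rows of the sorted table) are assumptions rather than a description of what extract.network does internally.

```r
# Toy table with some of the columns described above (made-up values),
# already sorted from strongest to weakest absolute correlation
edges <- data.frame(
  pcor  = c(0.62, -0.48, 0.15, -0.07),
  node1 = c(1, 2, 1, 3),
  node2 = c(2, 4, 3, 4),
  qval  = c(0.001, 0.02, 0.40, 0.85),
  prob  = c(0.99, 0.93, 0.35, 0.05)
)

# Hypothetical helper, not part of GeneNet
select.edges <- function(edges, method = c("prob", "qval", "number"), cutoff) {
  method <- match.arg(method)
  keep <- switch(method,
    prob   = edges$prob > cutoff,             # posterior probability above cutoff
    qval   = edges$qval < cutoff,             # q-value below cutoff
    number = seq_len(nrow(edges)) <= cutoff   # first `cutoff` rows of the table
  )
  edges[keep, , drop = FALSE]
}

print(select.edges(edges, "prob", 0.9))
print(select.edges(edges, "qval", 0.05))
print(select.edges(edges, "number", 3))
```

With "prob" and "qval" the cutoff is a threshold on a column, whereas with "number" it is a count of edges to keep.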

Author(s)

Rainer Opgen-Rhein, Juliane Schäfer, Korbinian Strimmer (http://strimmerlab.org).

References

Schäfer, J., and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754-764.

Opgen-Rhein, R., and Strimmer, K. (2007). From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Syst. Biol. 1:37.

See Also

cor0.test, fdr.control, ggm.estimate.pcor.

Examples

# load GeneNet library
library("GeneNet")
 
# ecoli data 
data(ecoli)

# estimate partial correlation matrix 
inferred.pcor <- ggm.estimate.pcor(ecoli)

# p-values, q-values and posterior probabilities for each potential edge
test.results <- ggm.test.edges(inferred.pcor)

# show best 20 edges (strongest correlation)
test.results[1:20,]

# extract network containing edges with prob > 0.9 (i.e. local fdr < 0.1)
net <- extract.network(test.results, cutoff.ggm=0.9)
net

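# how many are significant based on FDR cutoff Q=0.05 ?
# (rows are sorted by strength, so the significant edges come first;
#  seq_len also behaves correctly when no edge is significant)
num.significant.1 <- sum(test.results$qval <= 0.05)
test.results[seq_len(num.significant.1),]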

# how many are significant based on "local fdr" cutoff (prob > 0.9) ?
num.significant.2 <- sum(test.results$prob > 0.9)
test.results[test.results$prob > 0.9,]

# parameters of the mixture distribution used to compute p-values etc.
# (the result is not called "c", to avoid masking the base function c())
fdr.out <- fdrtool(sm2vec(inferred.pcor), statistic="correlation")
fdr.out$param


[Package GeneNet version 1.2.3 Index]