pcAlgo {pcalg}                R Documentation

PC-Algorithm: Estimate the Underlying Graph (Skeleton) of a DAG

Description

Estimate the underlying undirected graph (the “ugraph” or “skeleton”, i.e. the structure) of a DAG (Directed Acyclic Graph) from data using the PC-algorithm.

Usage

pcAlgo(dm, alpha, corMethod = "standard", verbose = FALSE)

Arguments

dm data matrix; rows correspond to samples, columns correspond to nodes.
alpha significance level for the individual partial correlation tests.
corMethod a character string specifying the method for (partial) correlation estimation: "standard" for standard correlation estimation, or "Qn" or "ogkQn" for robust estimation (based on the Qn scale estimator without and with OGK, respectively).
verbose logical indicating whether some intermediate output should be shown (WARNING: this decreases the performance dramatically!)

Details

The algorithm starts with a complete undirected graph. In a first sweep, an edge ij is kept only if H_0: Cor(X_i,X_j) = 0 can be rejected at significance level alpha. All ordered pairs ij of nodes of the resulting graph are then swept again: an edge ij is kept only if H_0: Cor(X_i,X_j|X_k) = 0 can be rejected for all neighbours k of i (other than j) in the current graph. The remaining edges are then swept again. This time, an ordered pair (edge) ij is kept only if H_0: Cor(X_i,X_j|X_a,X_b) = 0 can be rejected for all subsets (a,b) of size two of the neighbours of i in the remaining graph. In the next step, the remaining edges are tested using all subsets of size three, then of size four, and so on. The algorithm stops when the largest neighbourhood is smaller than the size of the conditioning sets.
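The sweeps described above can be sketched in a few lines of base R. This is only an illustrative sketch, not the pcalg implementation: the names skeleton_sketch and rejected are hypothetical, only Pearson correlations are used, and the test is assumed to be the usual two-sided Fisher z-test on partial correlations obtained by matrix inversion.

```r
## Hypothetical sketch (NOT the pcalg source) of the sweeps described above.
## Returns a symmetric logical adjacency matrix instead of a "graph" object.
skeleton_sketch <- function(dm, alpha) {
  n <- nrow(dm)
  p <- ncol(dm)
  C <- cor(dm)                       # assumption: "standard" correlations only

  ## TRUE if H_0: Cor(X_i, X_j | X_S) = 0 is rejected at level alpha
  rejected <- function(i, j, S) {
    idx <- c(i, j, S)
    P <- solve(C[idx, idx])          # inverse of the correlation sub-matrix
    r <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
    z <- 0.5 * log((1 + r) / (1 - r))
    sqrt(n - length(S) - 3) * abs(z) > qnorm(1 - alpha / 2)
  }

  G <- matrix(TRUE, p, p)            # start with the complete undirected graph
  diag(G) <- FALSE
  ord <- 0                           # current size of the conditioning sets
  repeat {
    for (i in seq_len(p)) {
      for (j in seq_len(p)) {
        if (!G[i, j]) next
        nbrs <- setdiff(which(G[i, ]), j)   # neighbours of i other than j
        if (length(nbrs) < ord) next
        if (ord == 0) {
          sets <- list(integer(0))          # marginal test, empty conditioning set
        } else {
          ## combine positions, then map back to node indices
          sets <- combn(length(nbrs), ord, FUN = function(ix) nbrs[ix],
                        simplify = FALSE)
        }
        for (S in sets) {
          if (!rejected(i, j, S)) {         # one non-rejection removes the edge
            G[i, j] <- G[j, i] <- FALSE
            break
          }
        }
      }
    }
    ord <- ord + 1
    ## stop once no node has enough neighbours for a set of size 'ord'
    if (max(rowSums(G)) - 1 < ord) break
  }
  G
}
```

For a data matrix dm with samples in rows, skeleton_sketch(dm, alpha = 0.05) returns TRUE in position [i, j] exactly when the edge ij survives all sweeps.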

The partial correlations are computed recursively or via matrix inversion from the correlation matrix, which is computed by the specified method (corMethod). The partial correlation tests are based on Fisher's z-transformation. For more details on the methods for computing the correlations see mcor.
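As a rough illustration of the matrix-inversion route and of a test based on Fisher's z-transformation, consider the following sketch. The helper names pcor_inv and fisher_z_reject are hypothetical and not part of pcalg, and the exact form of the test used inside pcAlgo may differ; the version shown is the common two-sided test with sqrt(n - |k| - 3) scaling.

```r
## Hypothetical helpers (NOT part of pcalg).

## Partial correlation of variables i and j given the index set k,
## obtained by inverting the relevant block of the correlation matrix C.
pcor_inv <- function(C, i, j, k = integer(0)) {
  idx <- c(i, j, k)
  P <- solve(C[idx, idx])
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

## Two-sided test based on Fisher's z-transformation.
## r: (partial) correlation, n: sample size, k_size: size of conditioning set.
## Returns TRUE if H_0: (partial) correlation = 0 is rejected at level alpha.
fisher_z_reject <- function(r, n, k_size, alpha) {
  z <- 0.5 * log((1 + r) / (1 - r))
  sqrt(n - k_size - 3) * abs(z) > qnorm(1 - alpha / 2)
}

## Usage: a 3 x 3 correlation matrix with all off-diagonal entries 0.5
C <- matrix(0.5, 3, 3)
diag(C) <- 1
r <- pcor_inv(C, 1, 2, 3)        # partial correlation of 1 and 2 given 3
fisher_z_reject(r, n = 100, k_size = 1, alpha = 0.05)
```

With an empty conditioning set, pcor_inv reduces to the ordinary correlation C[i, j].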

Value

An undirected graph without weights (an object of class "graph", see graph-class from the package graph) as an estimate of the skeleton of the underlying DAG.

Author(s)

Markus Kalisch (kalisch@stat.math.ethz.ch) and Martin Maechler.

References

P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.

M. Kalisch and P. Bühlmann (2005) Estimating high-dimensional directed acyclic graphs with the PC-algorithm; Research Report Nr. 130, ETH Zurich (http://stat.ethz.ch/research/research_reports/2005)

See Also

randomDAG for generating a random DAG; rmvDAG for generating data according to a DAG; compareGraphs for comparing undirected graphs in terms of TPR, FPR and TDR. Further, randomGraph (in package graph) for other random graph models.

Examples

p <- 10
## generate and draw random DAG :
set.seed(101)
class(myDAG <- randomDAG(p, prob = 0.2))
plot(myDAG, main = "randomDAG(10, prob = 0.2)")

## generate 1000 samples of DAG using standard normal error distribution
n <- 1000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## estimate skeleton given data
res <- pcAlgo(d.mat, alpha = 0.05, corMethod = "standard")
res
plot(res)# << using the plot() method for 'pcAlgo' objects!
str(res, max = 2)
(c.g <- compareGraphs(myDAG, res@graph))
## plot the original DAG and the estimated skeleton :
op <- par(mfrow=c(2,1))
plot(myDAG, main = "original (random)DAG")
plot(res@graph,
     main = "estimated skeleton from pcAlgo(<simulated, n = 1000>)")
par(op)

## generate data containing severe outliers
d.mixmat <- rmvDAG(n, myDAG, errDist = "mix", mix=0.3)
## Compute "classical" and robust estimate of skeleton :
pcC <- pcAlgo(d.mixmat, alpha = 0.01, corMethod = "standard")
pcR <- pcAlgo(d.mixmat, alpha = 0.01, corMethod = "Qn")
str(pcR, max = 2)
(c.Cg <- compareGraphs(myDAG, pcC@graph))
(c.Rg <- compareGraphs(myDAG, pcR@graph))#-> (.201 0 1) much better
op <- par(mfrow=c(3,1))
  plot(myDAG, main = "original (random)DAG")
  plot(pcC)
  plot(pcR)
par(op)

[Package pcalg version 0.1-3 Index]