pcAlgo {pcalg} | R Documentation |
Estimate the underlying graph (ugraph
or
“skeleton” (structure) of a DAG (Directed Acyclic
Graph) from data using the PC-algorithm. Alternatively, one can
directly infer the equivalence class of the underlying structure,
which is uniquely represented by its CPDAG (Completed
Partially Directed Acyclic Graph).
pcAlgo(dm, alpha, corMethod = "standard", verbose = FALSE, directed = FALSE)
dm |
data matrix; rows correspond to samples, cols correspond to nodes. |
alpha |
significance level for the individual partial correlation tests. |
corMethod |
a character string speciyfing the method for (partial) correlation estimation. "standard", "Qn" or "ogkQn" for standard and robust (based on the Qn scale estimator without and with OGK) correlation estimation |
verbose |
Indicates whether some intermediate output should be shown (WARNING: This decreases the performance dramatically!) |
directed |
If FALSE , the underlying skeleton is computed;
if TRUE , the underlying CPDAG is computed |
The algorithm starts with a complete undirected graph. In a first
sweep, an edge ij is kept only if H_0: Cor(X_i,X_j) = 0 can be
rejected on significance level alpha
. All ordered pairs ij of
nodes of the
resulting graph are then swept again. An edge ij is kept only if
H_0: Cor(X_i,X_j|X_k)=0 can be rejected for all neighbours k of
i in the current graph. Again, the remaining egdes are swept. This
time, an ordered pair (edge) ij is
kept only if H_0: Cor(X_i,X_j|X_a,X_b)=0 can be rejected for all
subsets of size two (a,b) of the neighbours of i in the
remaining graph. In the next
step, the remaining edges are tested using all subsets of size three,
then of size four and so on. The algorithm stops when the largest
neighbourhood is smaller than the size of the conditioning sets.
The partial correlations are
computed recursively or via matrix inversion from the correlation matrix,
which are computed by the
specified method (corMethod
). The partial correlation tests
are based on Fisher's z-transformation. For more details on the
methods for computing the correlations see mcor
.
An object of class
"pcAlgo"
(see
pcAlog-class
) containing an undirected graph
(object of class
"graph"
, see
graph-class
from the package graph)
(without weigths) as estimate of the skeleton or the CPDAG of the
underlying DAG.
Markus Kalisch (kalisch@stat.math.ethz.ch) and Martin Maechler.
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
Kalisch M. and P. B"uhlmann (2007) Estimating high-dimensional directed acyclic graphs with the PC-algorithm; JMLR, Vol. 8, 613-636, 2007.
randomDAG
for generating a random DAG;
rmvDAG
for generating data according to a DAG;
compareGraphs
for comparing undirected graphs in terms of
TPR, FPR and TDR. Further, randomGraph
(in
package graph) for other random graph
models. udag2cpdag
for converting the skeleton to a CPDAG.
p <- 10 ## generate and draw random DAG : set.seed(101) class(myDAG <- randomDAG(p, prob = 0.2)) plot(myDAG, main = "randomDAG(10, prob = 0.2)") ## generate 1000 samples of DAG using standard normal error distribution n <- 1000 d.mat <- rmvDAG(n, myDAG, errDist = "normal") ## estimate skeleton and CPDAG of given data resU <- pcAlgo(d.mat, alpha = 0.05, corMethod = "standard") resU plot(resU,zvalue.lwd=TRUE)# << using the plot() method for 'pcAlgo' objects! str(resU, max = 2) (c.g <- compareGraphs(myDAG, resU@graph)) ## CPDAG resD <- pcAlgo(d.mat, alpha = 0.05, corMethod = "standard",directed=TRUE) plot(resD,zvalue.lwd=TRUE) ## plot the original DAG, the estimated skeleton and the estimated CPDAG: op <- par(mfrow=c(3,1)) plot(myDAG, main = "original (random)DAG") plot(resU@graph, main = "estimated skeleton from pcAlgo(<simulated, n = 1000>)") plot(resD@graph,main="estimated CPDAG from pcAlgo(<simulated, n = 1000>)") par(op) ## generate data containing severe outliers d.mixmat <- rmvDAG(n, myDAG, errDist = "mix", mix=0.3) ## Compute "classical" and robust estimate of skeleton : pcC <- pcAlgo(d.mixmat, 0.01, corMeth = "standard") pcR <- pcAlgo(d.mixmat, 0.01, corMeth = "Qn") str(pcR, max = 2) (c.Cg <- compareGraphs(myDAG, pcC@graph)) (c.Rg <- compareGraphs(myDAG, pcR@graph))#-> (.201 0 1) much better op <- par(mfrow=c(3,1)) plot(myDAG, main = "original (random)DAG") plot(pcC) plot(pcR,zvalue.lwd=TRUE,lwd.max=7) par(op)