pcAlgo {pcalg} | R Documentation |
Estimate the underlying graph (“ugraph”) or “skeleton” (structure) of a DAG (Directed Acyclic
Graph) from data using the PC-algorithm.
pcAlgo(dm, alpha, corMethod = "standard", verbose = FALSE)
dm |
data matrix; rows correspond to samples, cols correspond to nodes. |
alpha |
significance level for the individual partial correlation tests. |
corMethod |
a character string specifying the method for (partial) correlation estimation: "standard", "Qn" or "ogkQn", for standard and robust (based on the Qn scale estimator, without and with OGK) correlation estimation. |
verbose |
logical indicating whether some intermediate output should be shown (WARNING: this decreases performance dramatically!) |
The algorithm starts with a complete undirected graph. In a first
sweep, an edge ij is kept only if H_0: Cor(X_i,X_j) = 0 can be
rejected at significance level alpha. All ordered pairs ij of
nodes of the
resulting graph are then swept again. An edge ij is kept only if
H_0: Cor(X_i,X_j|X_k)=0 can be rejected for all neighbours k of
i in the current graph. Again, the remaining edges are swept. This
time, an ordered pair (edge) ij is
kept only if H_0: Cor(X_i,X_j|X_a,X_b)=0 can be rejected for all
subsets of size two (a,b) of the neighbours of i in the
remaining graph. In the next
step, the remaining edges are tested using all subsets of size three,
then of size four and so on. The algorithm stops when the largest
neighbourhood is smaller than the size of the conditioning sets.
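The sweeps described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by pcAlgo: the function name skeleton_sketch and its helpers are hypothetical, the input is a correlation matrix C with sample size n rather than a data matrix, partial correlations are obtained by matrix inversion, and the test is the usual two-sided Fisher z-test.

```r
## Sketch only: illustrates the sweeps described above.
## `skeleton_sketch`, `pcor` and `rejected` are hypothetical names.
## C: correlation matrix, n: sample size, alpha: significance level.
skeleton_sketch <- function(C, n, alpha) {
  p <- ncol(C)
  G <- matrix(TRUE, p, p)           # start from the complete undirected graph
  diag(G) <- FALSE

  ## partial correlation of i and j given the set k, via matrix inversion
  pcor <- function(i, j, k) {
    if (length(k) == 0) return(C[i, j])
    P <- solve(C[c(i, j, k), c(i, j, k)])
    -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
  }
  ## Fisher z-test: TRUE if H_0 (zero partial correlation) is rejected
  rejected <- function(r, q) {
    sqrt(n - q - 3) * abs(atanh(r)) > qnorm(1 - alpha / 2)
  }

  ord <- 0                          # current size of the conditioning sets
  repeat {
    ## stop once no node has `ord` neighbours besides the tested partner
    if (max(rowSums(G)) - 1 < ord) break
    for (i in seq_len(p)) for (j in seq_len(p)) {
      if (!G[i, j]) next
      nb <- setdiff(which(G[i, ]), j)       # neighbours of i other than j
      if (length(nb) < ord) next
      ## index-based combn() avoids combn(x, m) expanding a single integer x
      sets <- if (ord == 0) list(integer(0)) else
        lapply(combn(length(nb), ord, simplify = FALSE), function(s) nb[s])
      for (k in sets) {
        if (!rejected(pcor(i, j, k), length(k))) {
          G[i, j] <- G[j, i] <- FALSE       # H_0 not rejected: drop the edge
          break
        }
      }
    }
    ord <- ord + 1
  }
  G                                 # logical adjacency matrix of the skeleton
}

## Population correlation matrix of a chain X1 - X2 - X3:
## Cor(X1, X3 | X2) = 0, so the edge 1-3 is dropped at conditioning-set size one.
C <- matrix(c(1, .5, .25,  .5, 1, .5,  .25, .5, 1), 3, 3)
skeleton_sketch(C, n = 1000, alpha = 0.05)
```

The sketch updates the graph while a sweep is still running, so the neighbourhoods used for later pairs already reflect earlier deletions; whether pcAlgo does the same is not stated here.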
The partial correlations are
computed recursively or via matrix inversion from the correlation matrix,
which is computed by the
specified method (corMethod). The partial correlation tests
are based on Fisher's z-transformation. For more details on the
methods for computing the correlations see mcor.
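As a rough illustration of the two computation routes and of the test, the following base-R sketch derives a partial correlation from a correlation matrix both recursively and by matrix inversion, and then applies a Fisher z-test. The function names pcor_rec, pcor_inv and fisher_z_reject are hypothetical, and the test statistic shown is the standard textbook form; it is an assumption that pcAlgo uses exactly this form internally.

```r
## Sketch only; all function names are hypothetical.
## C: correlation matrix; i, j: variables; k: conditioning set (indices).

## Recursive route: remove one conditioning variable h at a time.
pcor_rec <- function(C, i, j, k = integer(0)) {
  if (length(k) == 0) return(C[i, j])
  h <- k[1]
  s <- k[-1]
  r_ij <- pcor_rec(C, i, j, s)
  r_ih <- pcor_rec(C, i, h, s)
  r_jh <- pcor_rec(C, j, h, s)
  (r_ij - r_ih * r_jh) / sqrt((1 - r_ih^2) * (1 - r_jh^2))
}

## Matrix-inversion route: invert the sub-matrix for (i, j, k).
pcor_inv <- function(C, i, j, k = integer(0)) {
  if (length(k) == 0) return(C[i, j])
  P <- solve(C[c(i, j, k), c(i, j, k)])
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

## Fisher z-test of H_0: partial correlation = 0.
## n: sample size, q: size of the conditioning set.
## Returns TRUE if H_0 is rejected at level alpha.
fisher_z_reject <- function(r, n, q, alpha) {
  z <- 0.5 * log((1 + r) / (1 - r))       # Fisher's z-transformation
  sqrt(n - q - 3) * abs(z) > qnorm(1 - alpha / 2)
}

## Example: correlation matrix with entries 0.5^|i - j| on four variables
C <- 0.5^abs(outer(1:4, 1:4, "-"))
r1 <- pcor_rec(C, 1, 2, c(3, 4))
r2 <- pcor_inv(C, 1, 2, c(3, 4))
all.equal(r1, r2)                          # both routes should agree
fisher_z_reject(r1, n = 1000, q = 2, alpha = 0.05)
```

The recursive route needs only scalar arithmetic, while the inversion route handles a conditioning set of any size in a single step.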
An undirected graph (object of class "graph", see graph-class
from the package graph)
(without weights) as estimate of the skeleton of the underlying DAG.
Markus Kalisch (kalisch@stat.math.ethz.ch) and Martin Maechler.
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
M. Kalisch and P. Bühlmann (2005) Estimating high-dimensional directed acyclic graphs with the PC-algorithm; Research Report Nr. 130, ETH Zurich (http://stat.ethz.ch/research/research_reports/2005)
randomDAG for generating a random DAG;
rmvDAG for generating data according to a DAG;
compareGraphs for comparing undirected graphs in terms of
TPR, FPR and TDR. Further, randomGraph (in package graph) for other random graph models.
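To make the three rates concrete, here is a small base-R sketch that computes them from two adjacency matrices. It assumes the usual definitions (TPR: correctly found edges over true edges; FPR: incorrectly found edges over true gaps; TDR: correctly found edges over found edges). The function compare_adj is a hypothetical helper, not the implementation of compareGraphs, which works on graph objects.

```r
## Sketch only; `compare_adj` is a hypothetical helper.
## true_adj, est_adj: symmetric adjacency matrices of undirected graphs.
compare_adj <- function(true_adj, est_adj) {
  ut <- upper.tri(true_adj)          # count each undirected pair once
  t_edge <- true_adj[ut] != 0        # pairs that are edges in the true graph
  e_edge <- est_adj[ut] != 0         # pairs that are edges in the estimate
  c(tpr = sum(t_edge & e_edge) / sum(t_edge),
    fpr = sum(!t_edge & e_edge) / sum(!t_edge),
    tdr = sum(t_edge & e_edge) / sum(e_edge))
}

## True graph: edges 1-2 and 2-3; estimate: edges 1-2 and 1-3
true_adj <- matrix(0, 3, 3)
true_adj[1, 2] <- true_adj[2, 1] <- 1
true_adj[2, 3] <- true_adj[3, 2] <- 1
est_adj <- matrix(0, 3, 3)
est_adj[1, 2] <- est_adj[2, 1] <- 1
est_adj[1, 3] <- est_adj[3, 1] <- 1
compare_adj(true_adj, est_adj)       # tpr = 0.5, fpr = 1, tdr = 0.5
```

The sketch returns NaN for a rate whose denominator is zero (for example, TDR when the estimate has no edges).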
p <- 10
## generate and draw random DAG :
set.seed(101)
class(myDAG <- randomDAG(p, prob = 0.2))
plot(myDAG, main = "randomDAG(10, prob = 0.2)")

## generate 1000 samples of DAG using standard normal error distribution
n <- 1000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## estimate skeleton given data
res <- pcAlgo(d.mat, alpha = 0.05, corMethod = "standard")
res
plot(res)# << using the plot() method for 'pcAlgo' objects!
str(res, max = 2)
(c.g <- compareGraphs(myDAG, res@graph))
## plot the original DAG and the estimated skeleton :
op <- par(mfrow=c(2,1))
plot(myDAG, main = "original (random)DAG")
plot(res@graph, main = "estimated skeleton from pcAlgo(<simulated, n = 1000>)")
par(op)

## generate data containing severe outliers
d.mixmat <- rmvDAG(n, myDAG, errDist = "mix", mix=0.3)
## Compute "classical" and robust estimate of skeleton :
pcC <- pcAlgo(d.mixmat, 0.01, corMeth = "standard")
pcR <- pcAlgo(d.mixmat, 0.01, corMeth = "Qn")
str(pcR, max = 2)
(c.Cg <- compareGraphs(myDAG, pcC@graph))
(c.Rg <- compareGraphs(myDAG, pcR@graph))#-> (.201 0 1) much better
op <- par(mfrow=c(3,1))
plot(myDAG, main = "original (random)DAG")
plot(pcC)
plot(pcR)
par(op)