bnlearn-package {bnlearn}    R Documentation
Bayesian network learning via constraint-based algorithms.
Package: bnlearn
Type: Package
Version: 0.4
Date: 2007-11-10
License: GPLv2 or later
Description

This package implements some constraint-based algorithms for learning the structure of Bayesian networks. Also known as conditional independence learners, they are all optimized derivatives of the Inductive Causation algorithm (Verma and Pearl, 1991).
Details

These algorithms differ in the way they detect the Markov blankets of the variables, which in turn are used to compute the structure of the Bayesian network. Proofs of correctness can be found in the respective papers. The available algorithms are:
* Grow-Shrink (gs): based on the Grow-Shrink Markov Blanket algorithm, a forward selection technique for Markov blanket detection.
* Incremental Association (iamb): based on the Markov blanket detection algorithm of the same name, which uses stepwise selection.
* Fast Incremental Association (fast.iamb): a variant of IAMB which uses speculative forward selection to reduce the number of conditional independence tests.
* Interleaved Incremental Association (inter.iamb): another variant of IAMB which interleaves backward selection with forward selection to avoid false positives (i.e. nodes erroneously included in the Markov blanket).
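All four algorithms share the same calling convention; as a minimal sketch, using the learning.test data set from the Examples below and assuming each algorithm is exposed as a function named after its label:

library(bnlearn)
data(learning.test)
# each algorithm is exposed as a function named after its label.
res.gs    = gs(learning.test)
res.iamb  = iamb(learning.test)
res.fiamb = fast.iamb(learning.test)
res.iiamb = inter.iamb(learning.test)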
This package includes three implementations of each algorithm:

* an optimized implementation, used when the optimized parameter is set to TRUE.
* an unoptimized implementation, used when the optimized parameter is set to FALSE.
* a parallel implementation, which requires a running cluster set up with the makeCluster function from the snow package. To use it, pass the cluster object to the various functions via the cluster parameter; the optimized parameter will then be ignored.
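A minimal sketch of the parallel usage described above, assuming a local socket cluster created with snow's makeCluster:

library(bnlearn)
library(snow)
data(learning.test)
# set up a two-worker socket cluster on the local machine.
cl = makeCluster(2, type = "SOCK")
# pass the cluster object via the cluster parameter; the
# optimized parameter is ignored in this case.
res = gs(learning.test, cluster = cl)
stopCluster(cl)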
Their computational complexity, measured in the number of conditional independence tests, is usually O(N^2) (O(N^4) in the worst case scenario), where N is the number of variables. The execution time also scales linearly with the size of the data set.
In practice, the conditional independence tests used by the constraint-based algorithms are statistical tests on the data set. The available tests (and their respective labels) are:
* mutual information (mi): an information-theoretic distance measure. It is proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models.
* fast mutual information (fmi): a variant of the mutual information which is set to zero when there are fewer than five data points per parameter.
* Mantel-Haenszel (mh): a stratified independence test, included for testing purposes only. See mantelhaen.test in package stats.
* linear correlation (cor): the linear correlation coefficient.
* Fisher's Z (zf): a transformation of the linear correlation with an asymptotic normal distribution. Used by commercial software (such as TETRAD II) for the PC algorithm (an R implementation is available in the pcalg package on CRAN).
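A brief sketch of switching tests, assuming the label is passed through a test argument of the learning functions (the argument name is an assumption, not stated above):

library(bnlearn)
data(learning.test)
# learn the same structure with two different independence tests;
# the test argument is assumed to accept the labels listed above.
res.mi  = gs(learning.test, test = "mi")
res.fmi = gs(learning.test, test = "fmi")
# the label of the test actually used is stored in the bn object.
res.mi$learning$test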
All the algorithms support arc whitelisting and blacklisting. Any arc that is both whitelisted and blacklisted is assumed to be whitelisted, and is thus removed from the blacklist.
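A small sketch of this precedence rule, using a deliberately overlapping whitelist and blacklist on the learning.test data set from the Examples below:

library(bnlearn)
data(learning.test)
# the arc E -> F is both whitelisted and blacklisted; per the
# rule above it is treated as whitelisted, and is dropped from
# the blacklist.
wl = data.frame(from = "E", to = "F")
bl = data.frame(from = "E", to = "F")
res = gs(learning.test, whitelist = wl, blacklist = bl)
# the sanitized copies stored in the bn object reflect this.
res$learning$whitelist
res$learning$blacklist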
Optimized implementations of the learning algorithms often hide learning errors, usually in the Markov blanket detection step, due to the use of backtracking. In the unoptimized implementations, on the other hand, the Markov blanket and neighbourhood detection of each node is completely independent of the rest of the learning process. It may therefore happen that a Markov blanket or a neighbourhood is not symmetric (e.g. A is in the Markov blanket of B, but B is not in the Markov blanket of A), or that some arc directions conflict with each other.
The strict parameter enables some measure of error correction, which may help to retrieve a good model even when the learning process would otherwise fail:

* if strict is set to TRUE, every error stops the learning process and results in an error message.
* if strict is set to FALSE, the learning process continues and the errors are fixed on the fly.
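A minimal sketch contrasting the two settings, mirroring the calls in the Examples below:

library(bnlearn)
data(learning.test)
# strict = TRUE: any inconsistency stops learning with an error;
# try() keeps the sketch running even if one is raised.
try(gs(learning.test, optimized = FALSE, strict = TRUE))
# strict = FALSE: errors are fixed on the fly instead.
res = gs(learning.test, optimized = FALSE, strict = FALSE)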
Value

An object of class bn is a list containing at least the following components:

* learning: a list containing some information about the results of the learning algorithm. It is never changed afterward. Its elements are:
  * nodes: a list. Each element is named after a node and contains the following elements:
    * mb: the Markov blanket of the node (an array of character strings).
    * nbr: the neighbourhood of the node (an array of character strings).
  * arcs: the arcs of the Bayesian network (a two-column matrix, whose columns are labeled from and to).
  * whitelist: a sanitized copy of the whitelist parameter (a two-column matrix, whose columns are labeled from and to).
  * blacklist: a sanitized copy of the blacklist parameter (a two-column matrix, whose columns are labeled from and to).
  * test: the label of the conditional independence test used by the learning algorithm (a character string).
  * alpha: the target nominal type I error rate (a numerical value).
  * ntests: the number of conditional independence tests used in the learning (an integer value).
  * algo: the label of the learning algorithm used (a character string).
* nodes: a list. Each element is named after a node and contains the following elements:
  * mb: the Markov blanket of the node (an array of character strings).
  * nbr: the neighbourhood of the node (an array of character strings).
  * parents: the parents of the node (an array of character strings).
  * children: the children of the node (an array of character strings).
* arcs: the arcs of the Bayesian network (a two-column matrix, whose columns are labeled from and to).
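A short sketch of accessing these components (node names are those of the learning.test data set used in the Examples below):

library(bnlearn)
data(learning.test)
res = gs(learning.test)
# Markov blanket, neighbourhood, parents and children of node A.
res$nodes$A$mb
res$nodes$A$nbr
res$nodes$A$parents
res$nodes$A$children
# the arcs of the network, as a two-column from/to matrix.
res$arcs
# metadata stored by the learning algorithm.
res$learning$algo
res$learning$ntests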
Author(s)

Marco Scutari

Maintainer: Marco Scutari <marco.scutari@gmail.com>
References

A. Agresti. Categorical Data Analysis. John Wiley & Sons, Inc., 2002.
D. Margaritis. Learning Bayesian Network Model Structure from Data. PhD thesis, School of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, May 2003. Available as Technical Report CMU-CS-03-153.
J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., 1988.
I. Tsamardinos, C. F. Aliferis, and A. Statnikov. Algorithms for large scale Markov blanket discovery. In Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference, pages 376-381. AAAI Press, 2003.
S. Yaramakala and D. Margaritis. Speculative Markov Blanket Discovery for Optimal Feature Selection. In Proceedings of the Fifth IEEE International Conference on Data Mining, pages 809-812. IEEE Computer Society, 2005.
Examples

library(bnlearn)
data(learning.test)

## Simple learning
# first try the Grow-Shrink algorithm.
res = gs(learning.test)
# plot the network structure.
plot(res)
# now try the Incremental Association algorithm.
res2 = iamb(learning.test)
# plot the new network structure.
plot(res2)
# the network structures seem to be identical, don't they?
compare(res, res2)
# [1] TRUE
# how many tests did each of the two algorithms use?
res$learning$ntests
# [1] 43
res2$learning$ntests
# [1] 50
# and the unoptimized implementations of these algorithms?
## Not run: gs(learning.test, optimized = FALSE)$learning$ntests
# [1] 90
## Not run: iamb(learning.test, optimized = FALSE)$learning$ntests
# [1] 116

## Blacklist and whitelist use
# the arc B - F should not be there?
blacklist = data.frame(from = c("B", "F"), to = c("F", "B"))
blacklist
#   from to
# 1    B  F
# 2    F  B
res3 = gs(learning.test, blacklist = blacklist)
plot(res3)
# force E - F direction (E -> F).
whitelist = data.frame(from = c("E"), to = c("F"))
whitelist
#   from to
# 1    E  F
res4 = gs(learning.test, whitelist = whitelist)
plot(res4)
# use both blacklist and whitelist.
res5 = gs(learning.test, whitelist = whitelist, blacklist = blacklist)
plot(res5)

## Debugging
# use the debugging mode to see the learning algorithm in action.
res = gs(learning.test, debug = TRUE)
# log the learning process for future reference.
sink(file = "learning-log.txt")
res = gs(learning.test, debug = TRUE)
sink()
# if something seems wrong, try the unoptimized version
# in strict mode (inconsistencies trigger errors):
## Not run: res = gs(learning.test, optimized = FALSE, strict = TRUE, debug = TRUE)
# or disable strict mode to try to fix errors on the fly:
## Not run: res = gs(learning.test, optimized = FALSE, strict = FALSE, debug = TRUE)