InferEdges {simone}    R Documentation

Edge Inference

Description

Estimate the inverse covariance matrix from an i.i.d. sample of size n drawn from a multivariate normal random vector.

Usage

  InferEdges(data, penalty, method="glasso", ...)

Arguments

data An n x p data matrix containing an i.i.d. sample of size n drawn from a multivariate normal random vector of size p.
penalty Penalty to use. Can be a numerical matrix of size p x p or a scalar value. If NULL, a conservative default penalty is computed, which leads to a very sparse graph.
method A string that defines the method to use for the estimation of the inverse covariance matrix: either "glasso", "regressionAND" or "regressionOR". Default is "glasso".
... Additional arguments; see Details.

Details

InferEdges is a wrapper around our C implementation of several algorithms for estimating inverse covariance matrices. Because the nonzero off-diagonal entries of an inverse covariance (precision) matrix correspond to the edges of the associated Gaussian graphical model, estimating the matrix amounts to inferring the edges, hence the name InferEdges. The implemented inference algorithms are:

The "glasso" method,
which solves an l1-penalized likelihood problem (Banerjee et al., 2008) using the GLasso approach (see Friedman et al., 2008).
The "regressionAND" method,
which solves p independent l1-penalized regressions with an AND rule for symmetrization (see Meinshausen and Bühlmann, 2006).
The "regressionOR" method,
which solves p independent l1-penalized regressions with an OR rule for symmetrization (see Meinshausen and Bühlmann, 2006).
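
The two symmetrization rules can be illustrated in base R. This is only a sketch of the rules as described by Meinshausen and Bühlmann, not the package's C code; the coefficient matrix B and its values are hypothetical.

```r
## Toy matrix of regression coefficients: row i holds the coefficients
## obtained when regressing variable i on the others (hypothetical values).
B <- matrix(c(0.0, 0.5, 0.0,
              0.3, 0.0, 0.0,
              0.0, 0.4, 0.0), nrow = 3, byrow = TRUE)

## Support of the coefficients: TRUE where variable j was selected for i.
S <- B != 0

## AND rule: keep edge i-j only if j is selected for i AND i is selected for j.
adj.and <- S & t(S)

## OR rule: keep edge i-j if either of the two regressions selected the other variable.
adj.or <- S | t(S)

print(adj.and * 1)
print(adj.or * 1)
```

Here the AND rule keeps only the edge 1-2, while the OR rule also keeps the edge 2-3, which was selected in one direction only. The AND rule therefore never yields more edges than the OR rule.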

The penalty term can be a scalar or a matrix. In the latter case, the penalty is applied entry-wise to the inverse covariance matrix estimator, so that each entry is penalized differently.
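
As an illustration, a matrix penalty might be built as follows. This is a minimal sketch: the size p, the choice of which entry to relax, and the symmetry of the matrix are assumptions made for the example, and the InferEdges call is shown as a comment because it requires the package and a data set.

```r
p <- 4
rho <- 0.18                      # baseline penalty (same value as in Examples)

## Start from a constant p x p penalty matrix.
Rho <- matrix(rho, nrow = p, ncol = p)

## Hypothetical prior knowledge: variables 1 and 2 are likely connected,
## so their entry is penalized less. The matrix is kept symmetric.
Rho[1, 2] <- Rho[2, 1] <- rho / 2

print(Rho)

## With the package loaded, the matrix would be passed in place of the scalar:
## res <- InferEdges(X$data, Rho)
```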

Additional arguments are :

Sigma.hat
p x p matrix. Starting point of the algorithm. If NULL, use S_n + diag(penalty), where S_n is the empirical covariance matrix. Default NULL.
eps
Scalar. Convergence threshold for the algorithms. Default 1e-12.
maxIt
Scalar. Maximum number of iterations for the block-wise coordinate descent algorithm. Default 1e4.
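
The default starting point S_n + diag(penalty) can be reproduced in base R as a rough sketch. Two points are assumptions here: cov() normalizes by n - 1, whereas this page does not say which normalization the package uses, and for a matrix penalty diag(penalty) is read as "keep only its diagonal". The toy data are simulated for illustration.

```r
set.seed(1)
n <- 50
p <- 4
X <- matrix(rnorm(n * p), nrow = n, ncol = p)   # toy Gaussian sample

## Empirical covariance matrix. cov() divides by n - 1; the normalization
## used by the package is not stated on this page (assumption).
S.n <- cov(X)

## Scalar penalty: add rho to every diagonal entry.
## Note that diag(rho) alone would not build a p x p matrix in R.
rho <- 0.18
Sigma0.scalar <- S.n + diag(rho, p)

## Matrix penalty: keep only the diagonal of the penalty matrix
## (one possible reading of diag(penalty), labeled as an assumption).
Rho <- matrix(rho, p, p)
Sigma0.matrix <- S.n + diag(diag(Rho))
```

Adding a positive amount to the diagonal of S_n keeps the starting point positive definite even when p is larger than n.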

Value

Returns a list with the following two components:

Sigma.hat The p x p estimated covariance matrix.
K.hat The p x p estimated inverse covariance (or precision) matrix.


Note that Sigma.hat is NULL for "regressionOR" and "regressionAND", since these methods estimate only the precision matrix K.hat.
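
To read the inferred edges off K.hat, one possible approach is to look at its nonzero off-diagonal entries. The sketch below uses a hypothetical toy matrix rather than actual package output and assumes that absent edges are stored as exact zeros; with real output, a small tolerance may be preferable.

```r
## Toy estimated precision matrix (hypothetical values, not package output).
K.hat <- matrix(c( 2.0, -0.6,  0.0,
                  -0.6,  1.5,  0.3,
                   0.0,  0.3,  1.0), nrow = 3, byrow = TRUE)

## Adjacency matrix: an edge i-j wherever the off-diagonal entry is nonzero.
adj <- K.hat != 0
diag(adj) <- FALSE

## Edge list, each undirected edge reported once (i < j).
edges <- which(adj & upper.tri(adj), arr.ind = TRUE)
print(edges)
```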

Author(s)

J. Chiquet, based upon earlier work of J. Friedman and R. Tibshirani

References

Banerjee, O., El Ghaoui, L. and d'Aspremont, A. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data, Jour. Mach. Learn. Res., 9, p. 485–516, 2008.

Friedman, J., Hastie, T. and Tibshirani, R. Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9(3), p. 432–441, 2008.

Meinshausen, N. and Bühlmann, P. High-dimensional graphs and variable selection with the lasso, Ann. Statist., 34(3), p. 1436–1462, 2006.

See Also

SimDataAffiliation, Gplot, Mplot

Examples

  library(simone)

  ## Generating a graph with an associated Gaussian sample
  p <- 100
  n <- 200
  proba.in  <- 0.15
  proba.out <- 0.005
  alpha <- c(.6,.4)
  X <- SimDataAffiliation(p, n, proba.in, proba.out, alpha)

  ## Network inference
  rho <- 0.18
  res <- InferEdges(X$data, rho)

  ## Results, plotting and comparison
  par(mfrow=c(2,2))
  g <- Gplot(X$K.theo, X$cl.theo, main="Theoretical graph")
  Mplot(X$K.theo, X$cl.theo, main="Theoretical Mplot")
  Gplot(res$K.hat, coord=g, main="GLasso Inference")
  Mplot(res$K.hat, X$cl.theo, main="Inferred Mplot")

[Package simone version 0.1-2 Index]