ergmm {latentnetHRT} | R Documentation |
ergmm is used to fit latent space and latent space cluster random network models,
as described in Hoff, Raftery and Handcock (2002) and
Handcock, Raftery and Tantrum (2005).
ergmm produces likelihood-based inference. Approximate maximum likelihood estimators are computed, and Bayesian inference is implemented via an MCMC algorithm.
ergmm(formula, theta0=NULL, burnin=1000, MCMCsamplesize=1000, interval=10,
      latent.control=list(maxit=40, penalty.sigma=c(10,0.5), MLEonly=FALSE),
      returnMCMCstats=TRUE, randseed=NULL, verbose=FALSE, ...)
formula |
An R formula object, of the form
y ~ <term 1> + <term 2> ...,
where y is a network object or a matrix that can be coerced to a
network object, and <term 1>, <term 2>, etc., are each
terms chosen from the list documented in terms.ergmm.
To create a network object in R, use the network function.
For a description of the possible terms, see terms.ergmm.
|
theta0 |
The initial parameter value used to find the MLE. The default is based on multidimensional scaling fit to the positions. |
burnin |
The number of proposals before any MCMC sampling is done. |
MCMCsamplesize |
The number of posterior samples to draw. |
interval |
The number of proposal steps between sampled statistics. |
latent.control |
Control variables for the latent space algorithm. These are used only if
a latent term is included in the model. maxit
sets the maximum number of iterations to use in the
quasi-Newton-Raphson algorithm that maximizes the MCMC likelihood.
MLEonly is a logical flag; if TRUE, only the MLE is computed and the
Bayesian inference based on the MCMC algorithm is skipped.
penalty.sigma is the penalty on the norm of the
latent distances to use in the penalized log-likelihood. The multiplier is
1/penalty.sigma[1]^2, so smaller values impose greater penalties.
The second component is the multiplier on the component effects; values
less than 1 reduce the repulsion between components.
The penalty is used only in the MLE of positions and not in the MCMC
log-likelihood. It can be interpreted as a surrogate for
a prior distribution and expresses the belief that the latent distances are
not too large on the log-odds scale.
|
returnMCMCstats |
If this is TRUE, the matrix of change
statistics from the MCMC run is returned as component sample.
This matrix is an object of class mcmc and can be
used directly with the coda package to assess MCMC convergence. |
randseed |
Integer seed for the random number generator.
The default is sample(10000000, size=1). |
verbose |
If this is TRUE, more information is printed as
the program runs, including (currently) some goodness-of-fit
statistics. |
... |
Additional arguments, to be passed to lower-level functions in the future. |
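As a rough illustration of how the sampling arguments and penalty.sigma translate into numbers, the following base R sketch works through the arithmetic for the default values. It assumes that the chain makes interval proposals for each retained sample after the burn-in, which is a reading of the argument descriptions above and has not been checked against the package source. The names total.proposals and penalty.multiplier are illustrative and are not part of the package.

```r
# Default values taken from the usage line above.
burnin         <- 1000
MCMCsamplesize <- 1000
interval       <- 10
penalty.sigma  <- c(10, 0.5)

# Assumption: 'interval' proposals are made per retained sample, so the
# total number of proposals is the burn-in plus the thinned sampling phase.
total.proposals <- burnin + MCMCsamplesize * interval

# Multiplier on the norm of the latent distances in the penalized
# log-likelihood, as stated for latent.control: 1 / penalty.sigma[1]^2.
penalty.multiplier <- 1 / penalty.sigma[1]^2

total.proposals     # 11000
penalty.multiplier  # 0.01
```

Under this reading, halving penalty.sigma[1] to 5 would raise the multiplier to 0.04, that is, a fourfold stronger penalty.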
ergmm returns an object of class ergmm that is a list.
Fits including a latentcluster term will have at least the following
components, and fits including a latent term will have at least
the components up to and including network.
coef |
The maximum likelihood estimate of the p-vector of coefficients for the model parameters (excluding the latent positions and cluster parameters). By default this is just the intercept with p=1. |
coef.names |
A p vector of the coefficient names. |
Beta |
The MCMCsamplesize times p matrix of coefficients for the model
parameters corresponding to each of the posterior samples. By default this is
the intercept only. |
Z |
The MCMCsamplesize times k matrix of (Procrustified)
posterior positions, where MCMCsamplesize is the
sample size and k is the number of dimensions of the latent space. |
Z.mkl |
The network.size(g) times k matrix of
minimum Kullback-Leibler positions for each of the nodes. |
Z.pmean |
The network.size(g) times k matrix of
posterior mean positions for each of the nodes. |
Z.pmode |
The network.size(g) times k matrix of
posterior modal positions for each of the nodes. |
Z.mle |
The network.size(g) times k matrix of
MLE positions for each of the nodes. |
beta.mkl |
The p vector of coefficients for the model parameters based on the minimum Kullback-Leibler positions for each of the nodes. |
samplesize |
The number of MCMC samples drawn from the posterior. |
sample |
The MCMCsamplesize times (p+2+k) matrix of network statistics,
where MCMCsamplesize is the
sample size and p is the number of network covariates specified in the
model via the latentcov terms (usually 0). The columns are:
"mcmc.loglikelihood", the log-likelihood value;
"density", the constant term in the latent model;
the p covariates;
"Z 1", "Z 2", ..., "Z k", the k-dimensional
position of the first node. The values are recorded for each sample drawn.
This is primarily used for MCMC diagnostics to assess convergence. |
iterations |
The number of Newton-Raphson iterations required before convergence. |
interval |
The number of proposals between sampled statistics. |
null.deviance |
The deviance for the null model, comparable with
-2 loglikelihood. The null model will include the
intercept if there is one in the model, but not the latent variables or latent
clusters. |
mcmc.loglikelihood |
The log-likelihood values corresponding to each of the posterior samples. |
loglikelihood |
The log-likelihood for the MLE of positions (and based on the final fits to the other parameters). |
mle.lik |
The log-likelihood for the initial MLE fit of positions. |
hessian |
The Hessian matrix of the approximated loglikelihood function, evaluated at the maximizer. This matrix may be inverted to give an approximate covariance matrix for the MLE of the parameters. |
formula |
The original formula entered into the ergmm function. |
latent |
A flag to indicate that this is a fit of a latent variable model.
This is always TRUE for ergmm fits
and is included for consistency with the statnet
package. |
cluster |
A flag to indicate that this is a fit of a latent cluster model.
This is always TRUE for ergmm fits if a latentcluster
term is in the model
and is included for consistency with the statnet
package. |
network |
The modeled network as a network object. |
BIC |
A Bayesian Information Criterion approximation for the model. This is the approximation based on the fully Bayesian estimation method in Section 3.2 of Handcock, Raftery and Tantrum (2005). The formula for the approximation is given at the end of Section 4 in that paper. See the references for details. |
class |
The vector of posterior modal classes for each node. |
Ki |
The MCMCsamplesize times network.size(g)
matrix of posterior draws of the classes, where MCMCsamplesize is the
sample size and network.size(g) is the number of nodes in the
network. |
Ki.mle |
The network.size(g) vector of maximum likelihood classes for each node. |
logl.lr |
The log-likelihood for the latent space component of the model. |
logl.mbc |
The log-likelihood for the model-based clustering component of the model. |
mu |
The ngroups times k times MCMCsamplesize array of posterior draws of the mean positions
of the class, where MCMCsamplesize is the
sample size and ngroups is the number of classes. |
mu.mle |
The ngroups times k matrix of
maximum likelihood mean positions for each class. |
ngroups |
The number of classes or clusters. |
qig |
The network.size(g) times ngroups
matrix of posterior probabilities of class membership for each of the nodes. |
Sigma |
The MCMCsamplesize times ngroups
array of posterior draws of the variances of the positions
of the class, where MCMCsamplesize is the
sample size and ngroups is the number of classes. |
Sigma.mle |
The maximum likelihood variances of the positions for each class. |
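The description of the hessian component says the matrix may be inverted to give an approximate covariance matrix. A minimal base R sketch of that step follows, using a made-up 2 by 2 matrix rather than output from a real fit. It assumes the stored matrix is the Hessian of the log-likelihood itself (negative definite at the maximizer), so the sign is flipped before inverting. If the package instead stores the negative Hessian, the sign change should be dropped. The names H, cov.approx and se.approx are illustrative.

```r
# Hypothetical 2 x 2 Hessian of a log-likelihood at its maximizer
# (made-up numbers, not output from ergmm).
H <- matrix(c(-4,  1,
               1, -2), nrow = 2, byrow = TRUE)

# Assumption: H is the Hessian of the log-likelihood, hence negative
# definite, so the observed information matrix is -H.
cov.approx <- solve(-H)

# Approximate standard errors are the square roots of the diagonal.
se.approx <- sqrt(diag(cov.approx))

cov.approx  # equals matrix(c(2, 1, 1, 4), 2) / 7
```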
Note that the function summary.ergmm returns a concise summary of the
relevant parts of the ergmm object.
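Individual components can also be inspected by hand. As a hedged illustration of how a matrix shaped like qig could be reduced to a vector shaped like class, the base R sketch below picks, for each node, the class with the largest membership probability. The probabilities are made up, and the assumption that the modal class is the row-wise maximum of qig has not been verified against the package; the name modal.class is illustrative.

```r
# Hypothetical posterior membership probabilities for 4 nodes and
# 3 classes (each row sums to 1); made-up values, not ergmm output.
qig <- matrix(c(0.7, 0.2, 0.1,
                0.1, 0.8, 0.1,
                0.3, 0.3, 0.4,
                0.5, 0.4, 0.1), ncol = 3, byrow = TRUE)

# Assumption: the modal class of a node is the column holding its
# largest membership probability.
modal.class <- apply(qig, 1, which.max)

modal.class  # 1 2 3 1
```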
Peter D. Hoff, Adrian E. Raftery and Mark S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, Dec 2002, Vol.97, Iss. 460; pg. 1090-1098.
Mark S. Handcock, Adrian E. Raftery and Jeremy Tantrum. Model-Based Clustering for Social Networks. Working Paper Number 46, Center for Statistics and the Social Sciences, University of Washington, April 2005.
network, set.vertex.attributes, set.network.attributes, summary.ergmm
#
# See http://statnetproject.org/latentnetHRT
# for more examples
#
# For an explanation and examples of creating 'network' objects
# see the required 'network' package.
#
# Use 'data(package = "latentnetHRT")' to list the data sets in a
#
data(package="latentnetHRT")
#
# Using Sampson's Monk data, let's fit a
# simple latent position model
#
data(sampson)
#
# Get the group labels
#
group <- get.vertex.attribute(samplike,"group")
samp.labs <- substr(group,1,1)
#
samp.fit <- ergmm(samplike ~ latent(k=2), burnin=10000,
                  MCMCsamplesize=2000, interval=30)
#
# See if we have convergence in the MCMC
mcmc.diagnostics(samp.fit)
#
# Plot the fit
#
plot(samp.fit,label=samp.labs, vertex.col="group")
#
# Using Sampson's Monk data, let's fit a latent clustering model
#
## Not run: 
samp.fit <- ergmm(samplike ~ latentcluster(k=2, ngroups=3), burnin=10000,
                  MCMCsamplesize=2000, interval=30)
#
# See if we have convergence in the MCMC
mcmc.diagnostics(samp.fit)
#
# Let's look at the goodness of fit:
#
plot(samp.fit,label=samp.labs, vertex.col="group")
plot(samp.fit,pie=TRUE,label=samp.labs)
plot(samp.fit,density=c(2,2))
plot(samp.fit,contours=5,contour.color="red")
plot(samp.fit,density=TRUE,drawarrows=TRUE)
#
# Add contours
#
ergmm.add.contours(samp.fit,nlevels=8,lwd=2)
points(samp.fit$Z.mkl,pch=19,col=samp.fit$class)
#
# Try a covariate on the group
#
samegroup <- outer(group, group, "==")
diag(samegroup) <- 0
samp.fit <- ergmm(samplike ~ latentcov(samegroup) + latent(k=2))
summary(samp.fit)
## End(Not run)
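The last example builds a dyadic covariate with outer and diag before passing it to latentcov. To show what that matrix looks like without loading the package or the Sampson data, the base R sketch below repeats the two construction steps on a small made-up label vector. The five labels are hypothetical and are not the actual group attribute of samplike.

```r
# Hypothetical group labels for five nodes (illustrative only).
group <- c("Loyal", "Loyal", "Turks", "Outcasts", "Turks")

# TRUE wherever two nodes share a group label.
samegroup <- outer(group, group, "==")

# Zero the diagonal so a node is not matched with itself.
# Assigning 0 coerces the logical matrix to numeric 0/1.
diag(samegroup) <- 0

samegroup
```

The result is a symmetric 0/1 matrix with a 1 for each ordered pair of distinct nodes in the same group, here the pairs (1,2), (2,1), (3,5) and (5,3).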