ergm-terms {ergm}                                        R Documentation
The function ergm is used to fit linear exponential random graph models, in
which the probability of a given network, y, on a set of nodes is
exp{theta * g(y)}/c(theta), where g(y) is a vector of network statistics for
y, theta is a parameter vector of the same length and c(theta) is the
normalizing constant for the distribution. The network statistics g(y) are
entered as terms in the function call to ergm.
This page describes the possible terms (and hence network statistics).
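As a small illustration of the probability formula above, the following base-R sketch enumerates every undirected network on three nodes and evaluates exp{theta * g(y)}/c(theta) by brute force. It assumes a single statistic, g(y) = number of edges, and an arbitrary value theta = -0.5; neither choice comes from this page, and real models are fit with ergm rather than by enumeration.

```r
# Brute-force ERGM probabilities on 3 nodes.
# Illustrative assumptions: g(y) = edge count, theta = -0.5.
theta <- -0.5

# Each row is one network: 0/1 indicators for the dyads (1,2), (1,3), (2,3).
graphs <- expand.grid(e12 = 0:1, e13 = 0:1, e23 = 0:1)

g <- rowSums(graphs)         # network statistic g(y) for every network
weights <- exp(theta * g)    # unnormalized weights exp{theta * g(y)}
c_theta <- sum(weights)      # normalizing constant c(theta)
probs <- weights / c_theta   # probability of each network

# With only an edge-count statistic the dyads are independent,
# so c(theta) factorizes into one factor per dyad.
stopifnot(isTRUE(all.equal(c_theta, (1 + exp(theta))^3)))
print(cbind(graphs, g = g, prob = round(probs, 4)))
```

The enumeration is only feasible for tiny networks, since the number of possible networks grows as 2 to the number of dyads.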
Terms to ergm are specified by a formula that represents the network and its
network statistics. The formula is an R formula object of the form
y ~ <term 1> + <term 2> ...,
where y is a network object or a matrix that can be coerced to a network
object, and <term 1>, <term 2>, etc., are each terms chosen from the list
given below.
To create a network object in R, use the network function, then add nodal
attributes to it using the %v% operator if necessary.
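A minimal sketch of these two steps is shown here. It assumes the network package is installed; the adjacency matrix and the attribute name "Grade" are made up for illustration.

```r
library(network)  # provides network() and the %v% operator

# Hypothetical 4-node undirected network given as an adjacency matrix.
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0), nrow = 4, byrow = TRUE)
net <- network(adj, directed = FALSE)

# Attach a made-up nodal attribute called "Grade", then read it back.
net %v% "Grade" <- c(9, 9, 10, 11)
grades <- net %v% "Grade"
print(grades)
```

The resulting object net could then appear on the left-hand side of a model formula.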
The ergm function allows the user to explore a large number of potential
models for their network data. What follows is a list of the model terms
currently available in the program, with a brief description of each.
In the formula for the model, the model terms are various function-like
calls, some of which require arguments, separated by + signs.
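The structure of such a formula can be inspected in base R without fitting anything, as in the sketch below. The name y is a placeholder for a network object and "Grade" is a made-up vertex attribute; the terms themselves (edges, kstar, nodematch) are taken from the list on this page.

```r
# A model formula is an ordinary R formula object; writing it down does not
# evaluate the terms. `y` is a placeholder, "Grade" a hypothetical attribute.
f <- y ~ edges + kstar(2) + nodematch("Grade", diff = TRUE)

response <- deparse(f[[2]])                    # left-hand side: the network
term_labels <- attr(terms(f), "term.labels")   # right-hand side: model terms

print(response)
print(term_labels)
```

Each element of term_labels corresponds to one function-like call between the + signs.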
Additional terms can be coded up by users via the statnetuserterms package.
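To give a feel for what the terms on this page compute, here is a base-R sketch that evaluates a few of the statistics by hand on a small undirected network: edges, density, absdiff and nodematch (with diff=FALSE). The adjacency matrix and the "Grade" values are made up for illustration; in practice ergm computes these statistics from the model formula.

```r
# Made-up undirected network on 4 nodes and a quantitative nodal attribute.
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0), nrow = 4, byrow = TRUE)
grade <- c(9, 9, 10, 11)
n <- nrow(adj)

# Edge list restricted to i < j, so each undirected edge appears once.
el <- which(adj == 1 & upper.tri(adj), arr.ind = TRUE)
i <- el[, "row"]
j <- el[, "col"]

edges_stat     <- nrow(el)                         # edges
density_stat   <- edges_stat / (n * (n - 1) / 2)   # density (undirected case)
absdiff_stat   <- sum(abs(grade[i] - grade[j])^1)  # absdiff("Grade", pow=1)
nodematch_stat <- sum(grade[i] == grade[j])        # nodematch("Grade")

print(c(edges = edges_stat, density = density_stat,
        absdiff = absdiff_stat, nodematch = nodematch_stat))
```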
The terms currently available are:
absdiff(attrname, pow=1)
The attrname argument is a character string giving the name of a
quantitative attribute in the network's vertex attribute list. This term adds
one network statistic to the model equaling the sum of
abs(attrname[i]-attrname[j])^pow for all edges (i,j) in the network.

absdiffcat(attrname, base=NULL)
The attrname argument is a character string giving the name of a
quantitative attribute in the network's vertex attribute list. This term adds
one statistic for every possible nonzero distinct value of
abs(attrname[i]-attrname[j]) in the network; the value of each such statistic
is the number of edges in the network with the corresponding absolute
difference. The optional base argument is a vector indicating which nonzero
differences, in order from smallest to largest, should be omitted from the
model (i.e., treated like the zero-difference category). The base argument,
if used, should contain indices, not differences themselves. For instance, if
the possible values of abs(attrname[i]-attrname[j]) are 0, 0.5, 3, 3.5, and
10, then to omit 0.5 and 10 one should set base=c(1, 4). Note that this term
should generally be used only when the quantitative attribute has a limited
number of possible values; an example is the "Grade" attribute of the
faux.mesa.high or faux.magnolia.high datasets.

altkstar(lambda, fixed=FALSE)
lambda. This is the version given in Snijders et al. (2006). The gwdegree and
altkstar terms produce mathematically equivalent models, as long as they are
used together with the edges (or kstar(1)) term, yet the interpretation of
the gwdegree parameters is slightly more straightforward than the
interpretation of the altkstar parameters. For this reason, we recommend the
use of gwdegree instead of altkstar. See Section 3 and especially equation
(13) of Hunter (2007) for details. The optional argument fixed indicates
whether the scale parameter lambda is fixed at the given value or is to be
fit as a curved exponential family model (see Hunter and Handcock, 2006).
The default is FALSE, which means the scale parameter is not fixed and thus
the model is a CEF model. This term can only be used with undirected
networks.

asymmetric(attrname=NULL, diff=FALSE, keep=NULL)
If the attrname argument is used, only asymmetric pairs that match on the
named vertex attribute are counted. The optional modifiers diff and keep are
used in the same way as for the nodematch term; refer to that term for
details and an example.

b1concurrent(by=NULL)
The by argument is a character string giving the name of an attribute in the
network's vertex attribute list; it functions just like the by argument of
the b1degree term. This term can only be used with undirected bipartite
networks.

b1degree(d, by=NULL)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of nodes of degree d[i] in the first mode of a bipartite network,
i.e. with exactly d[i] edges. The first mode of a bipartite network object is
sometimes known as the "actor" mode. The optional argument by is a character
string giving the name of an attribute in the network's vertex attribute
list. If this is specified then each node's degree is tabulated only with
other nodes having the same value of the by attribute. This term can only be
used with undirected bipartite networks.

b1factor(attrname, base=1)
The attrname argument is a character string giving the name of a categorical
attribute in the network's vertex attribute list. This term adds multiple
network statistics to the model, one for each of (a subset of) the unique
values of the attrname attribute. Each of these statistics gives the number
of times a node with that attribute in the first mode of the network appears
in an edge. The first mode of a bipartite network object is sometimes known
as the "actor" mode. To include all attribute values is usually not a good
idea, because the sum of all such statistics equals the number of edges and
hence a linear dependency would arise in any model also including edges.
Thus, the base argument tells which value(s) (numbered in order according to
the sort function) should be omitted. The default value, base=1, means that
the smallest (i.e., first in sorted order) attribute value is omitted. For
example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and
“pear”, then to add just two terms, one for “apple” and one for “pear”, set
“banana” and “orange” to the base (remember to sort the values first) by
using b1factor("fruit", base=2:3). This term can only be used with undirected
bipartite networks.

b1star(k, attrname=NULL)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k. The ith such statistic counts
the number of distinct k[i]-stars whose center node is in the first mode of
the network. The first mode of a bipartite network object is sometimes known
as the "actor" mode. A k-star is defined to be a center node N and a set of k
different nodes {O_1, ..., O_k} such that the ties {N, O_i} exist for
i=1, ..., k. The optional argument attrname is a character string giving the
name of an attribute in the network's vertex attribute list. If this is
specified then the count is over the number of k-stars (with center node in
the first mode) where all nodes have the same value of the attribute. This
term can only be used for undirected bipartite networks. Note that b1star(1)
is equal to b2star(1) and to edges.

b1starmix(k, attrname, base=NULL, diff=TRUE)
attrname. However, the b1 node (in some contexts, the actor) at the center of
the k-star does NOT have to have the same value as the b2 nodes; indeed, the
values taken by the b1 nodes may be completely distinct from those of the b2
nodes, which allows for the use of this term in cases where there are two
separate nodal attributes, one for the b1 nodes and another for the b2 nodes
(in this case, however, these two attributes should be combined to form a
single nodal attribute called attrname). A different statistic is created for
each value of attrname seen in a b1 node, even if no k-stars are observed
with this value. Whether a different statistic is created for each value seen
in a b2 node depends on the value of the diff argument: when diff=TRUE, the
default, a different statistic is created for each value and thus the
behavior of this term is reminiscent of the nodemix term, from which it takes
its name; when diff=FALSE, all homophilous k-stars are counted together,
though these k-stars are still categorized according to the value of the
central b1 node. The base argument may be used to control which of the
possible terms are left out of the model: by default, all terms are included,
but if base is set to a vector of indices then the corresponding terms (in
the order they would be created when base=NULL) are left out.

b1twostar(b1attrname, b2attrname, base=NULL)
The b1attrname argument is required; if b2attrname is not passed, it is
assumed to be the same as b1attrname. Assuming that there are n_1 values of
b1attrname among the b1 nodes and n_2 values of b2attrname among the b2
nodes, the total number of distinct categories of two-stars according to
these two attributes is n_1(n_2)(n_2+1)/2. This model term creates a distinct
statistic counting each of these categories. The base argument may be used to
leave some of these categories out; when passed as a vector of integer
indices (in the order the statistics would be created when base=NULL), the
corresponding terms will be left out.

b2concurrent(by=NULL)
The by argument is a character string giving the name of an attribute in the
network's vertex attribute list; it functions just like the by argument of
the b2degree term. This term can only be used with undirected bipartite
networks.

b2degree(d, by=NULL)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of nodes of degree d[i] in the second mode of a bipartite network,
i.e. with exactly d[i] edges. The second mode of a bipartite network object
is sometimes known as the "event" mode. The optional argument by is a
character string giving the name of an attribute in the network's vertex
attribute list. If this is specified then each node's degree is tabulated
only with other nodes having the same value of the by attribute. This term
can only be used with undirected bipartite networks.

b2factor(attrname, base=1)
The attrname argument is a character string giving the name of a categorical
attribute in the network's vertex attribute list. This term adds multiple
network statistics to the model, one for each of (a subset of) the unique
values of the attrname attribute. Each of these statistics gives the number
of times a node with that attribute in the second mode of the network appears
in an edge. The second mode of a bipartite network object is sometimes known
as the "event" mode. To include all attribute values is usually not a good
idea, because the sum of all such statistics equals the number of edges and
hence a linear dependency would arise in any model also including edges.
Thus, the base argument tells which value(s) (numbered in order according to
the sort function) should be omitted. The default value, base=1, means that
the smallest (i.e., first in sorted order) attribute value is omitted. For
example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and
“pear”, then to add just two terms, one for “apple” and one for “pear”, set
“banana” and “orange” to the base (remember to sort the values first) by
using b2factor("fruit", base=2:3). This term can only be used with undirected
bipartite networks.

b2star(k, attrname=NULL)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k. The ith such statistic counts
the number of distinct k[i]-stars whose center node is in the second mode of
the network. The second mode of a bipartite network object is sometimes known
as the "event" mode. A k-star is defined to be a center node N and a set of k
different nodes {O_1, ..., O_k} such that the ties {N, O_i} exist for
i=1, ..., k. The optional argument attrname is a character string giving the
name of an attribute in the network's vertex attribute list. If this is
specified then the count is over the number of k-stars (with center node in
the second mode) where all nodes have the same value of the attribute. This
term can only be used for undirected bipartite networks. Note that b2star(1)
is equal to b1star(1) and to edges.

b2starmix(k, attrname, base=NULL, diff=TRUE)
This term is the same as b1starmix except that the roles of b1 and b2 are
reversed.

b2twostar(b1attrname, b2attrname, base=NULL)
This term is the same as b1twostar except that the roles of b1 and b2 are
reversed.

balance
102 or 300 in the categorization of Davis and Leinhardt (1972). For details
on the 16 possible triad types, see ?triad.classify in the sna package. For
an undirected network, the balanced triads are those with an even number of
ties (i.e., 0 and 2).

concurrent(by=NULL)
The by argument is a character string giving the name of an attribute in the
network's vertex attribute list; it functions just like the by argument of
the degree term. This term can only be used with undirected networks.

ctriple(attrname=NULL)
triangle is equal to ttriple+ctriple, so at most two of these three terms can
be in a model. The optional argument attrname is a character string giving
the name of an attribute in the network's vertex attribute list. If this is
specified then the count is over the number of cyclic triples where all three
nodes have the same value of the attribute. This term can only be used with
directed networks.

cycle(k)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k; the ith such statistic equals
the number of cycles in the network with length exactly k[i]. The cycle
statistic applies to both directed and undirected networks. For directed
networks, it counts directed cycles of length k, as opposed to undirected
cycles in the undirected case. The directed cycle terms of lengths 2 and 3
are equivalent to mutual and ctriple (respectively). The undirected cycle
term of length 3 is equivalent to triangle, and there is no undirected cycle
term of length 2.

degree(d, by=NULL)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of nodes in the network of degree d[i], i.e. with exactly d[i]
edges. The optional argument by is a character string giving the name of an
attribute in the network's vertex attribute list. If this is specified then
each node's degree is tabulated only with other nodes having the same value
of the by attribute. This term can only be used with undirected networks; for
directed networks see idegree and odegree.

density
For undirected networks, density equals kstar(1) or edges divided by
n(n-1)/2; for directed networks, density equals edges or istar(1) or ostar(1)
divided by n(n-1).

dsp(d)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of dyads in the network with exactly d[i] shared partners. This
term can be used with directed and undirected networks. For directed networks
the count is over homogeneous shared partners only (i.e., only partners on a
directed two-path connecting the nodes in the dyad).

dyadcov(x, attrname=NULL)
If the network is directed, x is either a (symmetric) matrix of covariates,
one for each possible dyad (i,j), or an undirected network; if the latter,
the optional argument attrname provides the name of the quantitative edge
attribute to use for covariate values (in this case, missing edges in x are
assigned a covariate value of zero). This term adds three statistics to the
model, each equal to the sum of the covariate values for all dyads occupying
one of the three possible non-empty dyad states (mutual, upper-triangular
asymmetric, and lower-triangular asymmetric dyads, respectively), with the
empty or null state serving as a reference category. If the network is
undirected, x is either a matrix of edgewise covariates, or a network; if the
latter, the optional argument attrname provides the name of the edge
attribute to use for edge values. In the undirected case this term adds one
statistic to the model, equal to the sum of the covariate values for each
edge appearing in the network. The edgecov and dyadcov terms are equivalent
for undirected networks.

edgecov(x, attrname=NULL)
The x argument is either a square matrix of covariates, one for each possible
edge in the network, or a network; if the latter, the optional argument
attrname provides the name of the quantitative edge attribute to use for
covariate values (in this case, missing edges in x are assigned a covariate
value of zero). This term adds one statistic to the model, equal to the sum
of the covariate values for each edge appearing in the network. The edgecov
term applies to both directed and undirected networks. For undirected
networks the covariates are also assumed to be undirected. The edgecov and
dyadcov terms are equivalent for undirected networks.

edges
For undirected networks, edges is equal to kstar(1); for directed networks,
edges is equal to both ostar(1) and istar(1).

esp(d)
This term is like the dsp term, except that it adds one network statistic to
the model for each element in d, where the ith such statistic equals the
number of edges (rather than dyads) in the network with exactly d[i] shared
partners. This term can be used with directed and undirected networks. For
directed networks the count is over homogeneous shared partners only (i.e.,
only partners on a directed two-path connecting the nodes in the edge and in
the same direction).

gwb1degree(decay, fixed=FALSE)
decay, for nodes in the first mode of a bipartite network. The first mode of
a bipartite network object is sometimes known as the "actor" mode. This
statistic is based on the version given as equation (14) in Hunter (2007).
See the "Remark" in section 3 of that paper to see why it is used rather than
the version given in Snijders et al. (2006). The optional argument fixed
indicates whether the decay parameter is fixed at the given value or is to be
fit as a curved exponential family model (see Hunter and Handcock, 2006). The
default is FALSE, which means the decay parameter is not fixed and thus the
model is a CEF model. This term can only be used with undirected bipartite
networks.

gwb2degree(decay, fixed=FALSE)
decay, for nodes in the second mode of a bipartite network. The second mode
of a bipartite network object is sometimes known as the "event" mode. This
statistic is based on the version given as equation (14) in Hunter (2007).
See the "Remark" in section 3 of that paper to see why it is used rather than
the version given in Snijders et al. (2006). The optional argument fixed
indicates whether the decay parameter is fixed at the given value or is to be
fit as a curved exponential family model (see Hunter and Handcock, 2006). The
default is FALSE, which means the decay parameter is not fixed and thus the
model is a CEF model. This term can only be used with undirected bipartite
networks.

gwdegree(decay, fixed=FALSE)
decay. This is the version given as equation (14) in Hunter (2007). See the
"Remark" in section 3 of that paper to see why it is used rather than the
version given in Snijders et al. (2006). The optional argument fixed
indicates whether the decay parameter is fixed at the given value or is to be
fit as a curved exponential family model (see Hunter and Handcock, 2006). The
default is FALSE, which means the decay parameter is not fixed and thus the
model is a CEF model. This term can only be used with undirected networks.

gwdsp(alpha, fixed=FALSE)
alpha > 0. The optional argument fixed indicates whether the parameter alpha
is fixed at the given value or is to be fit as a curved exponential family
model (see Hunter and Handcock, 2006). The default is FALSE, which means the
parameter is not fixed and thus the model is a CEF model. This term can be
used with directed and undirected networks. For directed networks the count
is over homogeneous shared partners only (i.e., only partners on a directed
two-path connecting the nodes in the dyad).

gwesp(alpha, fixed=FALSE)
This term is like gwdsp, except that it adds a statistic equal to the
geometrically weighted edgewise (not dyadwise) shared partner distribution
with weight parameter alpha. The optional argument fixed indicates whether
the parameter alpha is fixed at the given value or is to be fit as a curved
exponential family model (see Hunter and Handcock, 2006). The default is
FALSE, which means the parameter is not fixed and thus the model is a CEF
model. This term can be used with directed and undirected networks. For
directed networks the geometric weighting is over homogeneous shared partners
only (i.e., only partners on a directed two-path connecting the nodes in the
edge and in the same direction).

gwidegree(decay, fixed=FALSE)
decay. The optional argument fixed indicates whether the decay parameter is
fixed at the given value or is to be fit as a curved exponential family model
(see Hunter and Handcock, 2006). The default is FALSE, which means the decay
parameter is not fixed and thus the model is a CEF model. This term can only
be used with directed networks.

gwnsp(alpha, fixed=FALSE)
This term is like gwesp and gwdsp, except that it adds a statistic equal to
the geometrically weighted nonedgewise (that is, over dyads that do not have
an edge) shared partner distribution with weight parameter alpha. The
optional argument fixed indicates whether the parameter alpha is fixed at the
given value or is to be fit as a curved exponential family model (see Hunter
and Handcock, 2006). The default is FALSE, which means the parameter is not
fixed and thus the model is a CEF model. This term can be used with directed
and undirected networks. For directed networks the geometric weighting is
over homogeneous shared partners only (i.e., only partners on a directed
two-path connecting the nodes in the non-edge and in the same direction).

gwodegree(decay, fixed=FALSE)
decay. The optional argument fixed indicates whether the decay parameter is
fixed at the given value or is to be fit as a curved exponential family model
(see Hunter and Handcock, 2006). The default is FALSE, which means the decay
parameter is not fixed and thus the model is a CEF model. This term can only
be used with directed networks.

hamming(x, cov, attrname=NULL)
x. (If no argument is given, x is taken to be the observed network, i.e., the
network on the left side of the ~ in the formula that defines the ERGM.)
Unweighted Hamming distance is defined as the total number of pairs (i,j)
(ordered or unordered, depending on whether the network is directed or
undirected) on which the two networks differ. If the optional argument cov is
specified, then the weighted Hamming distance is computed instead, where each
pair (i,j) contributes a pre-specified weight toward the distance when the
two networks differ on that pair. The argument cov is either a matrix of
edgewise weights or a network; if the latter, the optional argument attrname
provides the name of the edge attribute to use for weight values.

hammingmix(attrname, x, base=0)
x. The ordering of the attribute values is alphabetical. The base argument
gives the indices of the statistics to be omitted from the tabulation. For
example, base=2 will omit the second statistic, making it the de facto
reference category. This term can only be used with directed networks.

idegree(d, by=NULL)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of nodes in the network of in-degree d[i], i.e. the number of
nodes with exactly d[i] in-edges. The optional argument by is a character
string giving the name of an attribute in the network's vertex attribute
list. If this is specified then each node's degree is tabulated only with
other nodes having the same value of the by attribute. This term can only be
used with directed networks; for undirected networks see degree.

intransitive
111D, 201, 111U, 021C, or 030C in the categorization of Davis and Leinhardt
(1972). For details on the 16 possible triad types, see triad.classify in the
sna package. Note the distinction from the ctriple term. This term can only
be used with directed networks.

isolates

istar(k, attrname=NULL)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k. The ith such statistic counts
the number of distinct k[i]-instars in the network, where a k-instar is
defined to be a node N and a set of k different nodes {O_1, ..., O_k} such
that the ties (O_j, N) exist for j=1, ..., k. The optional argument attrname
is a character string giving the name of an attribute in the network's vertex
attribute list. If this is specified then the count is over the number of
k-instars where all nodes have the same value of the attribute. This term can
only be used for directed networks; for undirected networks see kstar. Note
that istar(1) is equal to both ostar(1) and edges.

kstar(k, attrname=NULL)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k. The ith such statistic counts
the number of distinct k[i]-stars in the network, where a k-star is defined
to be a node N and a set of k different nodes {O_1, ..., O_k} such that the
ties {N, O_i} exist for i=1, ..., k. The optional argument attrname is a
character string giving the name of an attribute in the network's vertex
attribute list. If this is specified then the count is over the number of
k-stars where all nodes have the same value of the attribute. This term can
only be used for undirected networks; for directed networks, see istar,
ostar, twopath and m2star.
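Because a k-star is a center node together with any k of its neighbors, the k-star count for k >= 2 can be read off the degree sequence: a node of degree d is the center of choose(d, k) distinct k-stars. The base-R sketch below applies this to a made-up undirected adjacency matrix. It ignores the optional attrname argument and deliberately excludes k = 1, where summing over centers would count every edge from both ends.

```r
# Made-up undirected network on 4 nodes (symmetric 0/1 adjacency matrix).
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0), nrow = 4, byrow = TRUE)

deg <- rowSums(adj)   # degree of each node

# Number of k-stars for k >= 2: choose k neighbors around each center node.
kstar_count <- function(k) {
  stopifnot(k >= 2)
  sum(choose(deg, k))
}

two_stars   <- kstar_count(2)
three_stars <- kstar_count(3)
print(c(kstar2 = two_stars, kstar3 = three_stars))
```

Here the degrees are 2, 2, 3 and 1, so the only node that can be the center of a 3-star is the node of degree 3.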
Note that kstar(1) is equal to edges.

localtriangle(x)
The x argument is a network or an adjacency matrix that specifies whether the
two nodes are in the same neighborhood. Note that triangle, with or without
an argument, is a special case of localtriangle.

m2star
kstar(2). See also twopath.

match(attrname, diff=FALSE, keep=NULL)
nodematch(attrname, diff=FALSE).

meandeg
edges and density.

mutual(attrname=NULL, diff=FALSE, keep=NULL)
If the attrname argument is used, only mutual pairs that match on the named
vertex attribute are counted. The optional modifiers diff and keep are used
in the same way as for the nodematch term; refer to that term for details and
an example.

nearsimmelian

nodecov(attrname)
The attrname argument is a character string giving the name of a numeric (not
categorical) attribute in the network's vertex attribute list. This term adds
a single network statistic to the model equaling the sum of attrname(i) and
attrname(j) for all edges (i,j) in the network. For categorical attributes,
see nodefactor. Note that for directed networks, nodecov equals nodeicov plus
nodeocov.

nodefactor(attrname, base=1)
The attrname argument is a character vector giving one or more names of
categorical attributes in the network's vertex attribute list. This term adds
multiple network statistics to the model, one for each of (a subset of) the
unique values of the attrname attribute (or each combination of the
attributes given). Each of these statistics gives the number of times a node
with that attribute or those attributes appears in an edge in the network. In
particular, for edges whose endpoints both have the same attribute values,
this value is counted twice. To include all attribute values is usually not a
good idea (though this may be accomplished if desired by setting base=0),
because the sum of all such statistics equals twice the number of edges and
hence a linear dependency would arise in any model also including edges.
Thus, the base argument tells which value(s) (numbered in order according to
the sort function) should be omitted. The default value, base=1, means that
the smallest (i.e., first in sorted order) attribute value is omitted. For
example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and
“pear”, then to add just two terms, one for “apple” and one for “pear”, set
“banana” and “orange” to the base (remember to sort the values first) by
using nodefactor("fruit", base=2:3). For an analogous term for quantitative
vertex attributes, see nodecov.

nodeicov(attrname)
The attrname argument is a character string giving the name of a numeric (not
categorical) attribute in the network's vertex attribute list. This term adds
a single network statistic to the model equaling the total value of
attrname(j) for all edges (i,j) in the network. This term may only be used
with directed networks. For categorical attributes, see nodeifactor.

nodeifactor(attrname, base=1)
The attrname argument is a character vector giving one or more names of
categorical attributes in the network's vertex attribute list. This term adds
multiple network statistics to the model, one for each of (a subset of) the
unique values of the attrname attribute (or each combination of the
attributes given). Each of these statistics gives the number of times a node
with that attribute or those attributes appears as the terminal node of a
directed tie. To include all attribute values is usually not a good idea
(though this may be accomplished if desired by setting base=0), because the
sum of all such statistics equals the number of edges and hence a linear
dependency would arise in any model also including edges. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1, means that the
smallest (i.e., first in sorted order) attribute value is omitted. For
example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and
“pear”, then to add just two terms, one for “apple” and one for “pear”, set
“banana” and “orange” to the base (remember to sort the values first) by
using nodeifactor("fruit", base=2:3). For an analogous term for quantitative
vertex attributes, see nodeicov.

nodematch(attrname, diff=FALSE, keep=NULL)
The attrname argument is a character vector giving one or more names of
attributes in the network's vertex attribute list. When diff=FALSE, this term
adds one network statistic to the model, which counts the number of edges
(i,j) for which attrname(i)==attrname(j). (When multiple names are given, the
statistic counts only those on which all the named attributes match.) When
diff=TRUE, p network statistics are added to the model, where p is the number
of unique values of the attrname attribute. The kth such statistic counts the
number of edges (i,j) for which attrname(i) == attrname(j) == value(k), where
value(k) is the kth smallest unique value of the attrname attribute. If set
to non-NULL, the optional keep argument should be a vector of integers giving
the values of k that should be considered for matches; other values are
ignored (this works for both diff=FALSE and diff=TRUE). For instance, to add
two statistics, counting the matches for just the 2nd and 4th categories, use
nodematch with diff=TRUE and keep=c(2,4).

nodemix(attrname, base=NULL)
The attrname argument is a character vector giving the names of categorical
attributes in the network's vertex attribute list. By default, this term adds
one network statistic to the model for each possible pairing of attribute
values. The statistic equals the number of edges in the network in which the
nodes have that pairing of values. (When multiple names are given, a
statistic is added for each combination of attribute values for those names.)
In other words, this term produces one statistic for every entry in the
mixing matrix for the attribute(s). The ordering of the attribute values is
alphabetical (for nominal categories) or numerical (for ordered categories).
The optional base argument is a vector of integers corresponding to the
pairings that should not be included. If base contains only negative
integers, then these integers correspond to the only pairings that should be
included. By default (i.e., with base=NULL or base=0), all pairings are
included.

nodeocov(attrname)
The attrname argument is a character string giving the name of a numeric (not
categorical) attribute in the network's vertex attribute list. This term adds
a single network statistic to the model equaling the total value of
attrname(i) for all edges (i,j) in the network. This term may only be used
with directed networks. For categorical attributes, see nodeofactor.

nodeofactor(attrname, base=1)
The attrname argument is a character string giving one or more names of
categorical attributes in the network's vertex attribute list. This term adds
multiple network statistics to the model, one for each of (a subset of) the
unique values of the attrname attribute (or each combination of the attributes
given). Each of these statistics gives the number of times a node with that
attribute or those attributes appears as the node of origin of a directed tie.
Including all attribute values is usually not a good idea (though this may be
accomplished if desired by setting base=0), because the sum of all such
statistics equals the number of edges and hence a linear dependency would
arise in any model also including edges. Thus, the base argument tells which
value(s) (numbered in order according to the sort function) should be omitted.
The default value, base=1, means that the smallest (i.e., first in sorted
order) attribute value is omitted. For example, if the “fruit” factor has
levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms,
one for “apple” and one for “pear”, set “banana” and “orange” to the base
(remember to sort the values first) by using nodeofactor("fruit", base=2:3).
For an analogous term for quantitative vertex attributes, see nodeocov.
nsp(d)
dsp and esp terms, except this term adds one network statistic to the model
for each element in d where the ith such statistic equals the number of
non-edges (that is, dyads that do not have an edge) in the network with
exactly d[i] shared partners. This term can be used with directed and
undirected networks. For directed networks the count is over homogeneous
shared partners only (i.e., only partners on a directed two-path connecting
the nodes in the non-edge and in the same direction).
odegree(d, by=NULL)
The d argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in d; the ith such statistic equals
the number of nodes in the network of out-degree d[i], i.e. the number of
nodes with exactly d[i] out-edges. The optional argument by is a character
string giving the name of an attribute in the network's vertex attribute list.
If this is specified then each node's degree is tabulated only with other
nodes having the same value of the by attribute.
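As an illustration of the basic statistic (without the by argument), the following base-R sketch counts nodes by out-degree directly from an adjacency matrix. The toy matrix A and the helper name odegree_counts are made up for this example and are not part of ergm; the sketch only mirrors the definition above and does not call the package.

```r
# Toy directed network on 4 nodes; A[i, j] == 1 means a tie from i to j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 3] <- 1; A[2, 3] <- 1; A[4, 1] <- 1

# Hypothetical helper: for each d[i], count the nodes with exactly d[i] out-edges.
odegree_counts <- function(A, d) {
  outdeg <- rowSums(A)                       # out-degree of every node
  counts <- vapply(d, function(k) sum(outdeg == k), numeric(1))
  names(counts) <- paste0("odegree", d)
  counts
}

odegree_counts(A, d = 0:2)
```

Here the out-degrees are 2, 1, 0 and 1, so the three statistics for d = 0:2 are 1, 2 and 1.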
This term can only be used with directed networks; for undirected networks
see degree.
ostar(k, attrname=NULL)
The k argument is a vector of distinct integers. This term adds one network
statistic to the model for each element in k. The ith such statistic counts
the number of distinct k[i]-outstars in the network, where a k-outstar is
defined to be a node N and a set of k different nodes {O_1, ..., O_k} such
that the ties (N,O_j) exist for j=1, ..., k. The optional argument attrname
is a character string giving the name of an attribute in the network's vertex
attribute list. If this is specified then the count is the number of
k-outstars where all nodes have the same value of the attribute. This term can
only be used with directed networks; for undirected networks see kstar. Note
that ostar(1) is equal to both istar(1) and edges.
receiver(base=1)
edges, but its coefficient can be computed as the negative of the sum of the
coefficients of all the other actors. That is, the average coefficient is
zero, following the Holland-Leinhardt parametrization of the p_1 model
(Holland and Leinhardt, 1981). The base argument allows the user to determine
which nodes' statistics should be omitted. The base argument can also be a
vector of negative indices, to specify which should be added instead of
deleted, and base=0 specifies that all statistics should be included. This
term can only be used with directed networks. For undirected networks, see
sociality.
sender(base=1)
edges, but its coefficient can be computed as the negative of the sum of the
coefficients of all the other actors. That is, the average coefficient is
zero, following the Holland-Leinhardt parametrization of the p_1 model
(Holland and Leinhardt, 1981). The base argument allows the user to determine
which nodes' statistics should be omitted. The base argument can also be a
vector of negative indices, to specify which should be added instead of
deleted, and base=0 specifies that all statistics should be included. This
term can only be used with directed networks. For undirected networks, see
sociality.
simmelian
simmelianties
sociality(attrname=NULL, base=1)
attrname argument is a character string giving the name of an attribute in
the network's vertex attribute list that takes categorical values. If
provided, this term only counts ties between nodes with the same value of the
attribute (an actor-specific version of the nodematch term). This term can
only be used with undirected networks. For directed networks, see sender and
receiver. By default, base=1 means that the statistic for the first node will
be omitted, but this argument may be changed to control which statistics are
included just as for the sender and receiver terms.
threepath(keep=1:4)
keep argument), one for each of the four distinct types of directed
three-paths. If the nodes of the path are written from left to right such that
the middle edge points to the right (R), then the four types are RRR, RRL,
LRR, and LRL. That is, an RRR threepath is of the form i-->j-->k-->l, an RRL
threepath is of the form i-->j-->k<--l, etc. As in the undirected case, there
is no requirement that the nodes be distinct in a directed threepath. However,
the three edges must all be distinct. Thus, a mutual tie i<-->j does not count
as a threepath of the form i-->j-->i<--j; however, in the subnetwork
i<-->j-->k, there are two directed threepaths, one LRR (k<--j-->i-->j) and one
RRR (j-->i-->j-->k).
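The counting rule above can be checked by brute force. The sketch below is plain base R, with a made-up helper name count_threepaths and a toy adjacency matrix; it enumerates node sequences whose middle edge points right, requires the three edges to be distinct, and is meant only to mirror the verbal definition, not to reproduce how ergm computes the term internally.

```r
# Hypothetical helper (not part of ergm): brute-force count of directed
# three-paths a - b - m - d, oriented so that the middle edge is b -> m.
count_threepaths <- function(A) {
  n <- nrow(A)
  counts <- c(RRR = 0, RRL = 0, LRR = 0, LRL = 0)
  for (b in seq_len(n)) for (m in seq_len(n)) {
    if (A[b, m] == 0) next                     # middle edge b -> m must exist
    for (a in seq_len(n)) for (d in seq_len(n)) {
      for (first in c("R", "L")) for (third in c("R", "L")) {
        e1 <- if (first == "R") c(a, b) else c(b, a)   # a -> b or b -> a
        e3 <- if (third == "R") c(m, d) else c(d, m)   # m -> d or d -> m
        if (A[e1[1], e1[2]] == 0 || A[e3[1], e3[2]] == 0) next
        ids <- c(paste(e1, collapse = ">"), paste(b, m, sep = ">"),
                 paste(e3, collapse = ">"))
        if (anyDuplicated(ids) > 0) next       # the three edges must be distinct
        type <- paste0(first, "R", third)
        counts[type] <- counts[type] + 1
      }
    }
  }
  counts
}

# Subnetwork i<-->j-->k with i = 1, j = 2, k = 3 (no self-loops assumed).
A <- matrix(0, 3, 3)
A[1, 2] <- 1; A[2, 1] <- 1; A[2, 3] <- 1
count_threepaths(A)

# A lone mutual tie contains no threepath, because it has only two edges.
M <- matrix(c(0, 1, 1, 0), 2, 2)
count_threepaths(M)
```

For the three-node subnetwork the sketch finds one RRR and one LRR path and no others; for the lone mutual tie all four counts are zero.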
transitive
120D, 030T, 120U, or 300 in the categorization of Davis and Leinhardt (1972).
For details on the 16 possible triad types, see triad.classify in the sna
package. Note the distinction from the ttriple term. This term can only be
used with directed networks.
triadcensus(d)
003, 012, 102, 021D, 021U, 021C, 111D, 111U, 030T, 030C, 201, 120D, 120U,
120C, 210, and 300. Note that at least one category should be dropped;
otherwise a linear dependency will exist among the 16 statistics, since they
must sum to the total number of three-node sets. By default, the category 003,
which is the category of completely empty three-node sets, is dropped. This is
considered category zero, and the others are numbered 1 through 15 in the
order given above. By specifying a numeric vector of integers from 0 to 15 as
the d argument, the user may specify a set of terms to add other than the
default value of 1:15. Each statistic is the count of the corresponding triad
type in the network. For details on the 16 types, see ?triad.classify in the
sna package, on which this code is based. For an undirected network, the triad
census is over the four types defined by the number of ties (i.e., 0, 1, 2,
and 3), and the default is to add 1:3, which is to say that the 0 is dropped;
however, this too may be controlled by changing the d argument to a numeric
vector giving a subset of {0, 1, 2, 3}.
triangle(attrname=NULL)
triangle equals ttriple plus ctriple; thus at most two of these three terms
can be in a model. The optional argument attrname restricts the count to those
triples of nodes with equal values of the vertex attribute specified by
attrname.
tripercent(attrname=NULL)
triangle. The optional argument attrname restricts the counts (both numerator
and denominator) to those triples of nodes with equal values of the vertex
attribute specified by attrname. This term can only be used with undirected
networks; for directed networks, it is difficult to define the numerator and
denominator in a consistent and meaningful way.
ttriple(attrname=NULL)
triangle equals ttriple+ctriple for a directed network, so at most two of the
three terms can be in a model. The optional argument attrname is a character
string giving the name of an attribute in the network's vertex attribute list.
If this is specified then the count is over the number of transitive triples
where all three nodes have the same value of the attribute. This term can only
be used with directed networks.
twopath
m2star. For undirected networks a twopath is defined as a pair of edges
{i,j}, {j,k}. That is, it is an undirected path of length 2 from i to k via j,
also known as a 2-star.
ergm, network, %v%, %n%, sna, summary.ergm, print.ergm
## Not run:
ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle)
ergm(molecule ~ edges + kstar(2:3) + triangle +
       nodematch("atomic type", diff=TRUE) +
       triangle + absdiff("atomic type"))
## End(Not run)
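To complement the model calls above, the following base-R sketch hand-computes a few of the statistics described on this page (nodemix, nodeocov, nodeofactor and ostar) for a toy directed network. The adjacency matrix, the attribute vectors fruit and size, and the helper ostar_stat are all made up for illustration. Nothing here calls ergm, so the code shows the definitions rather than the package's own implementation.

```r
# Toy directed network on 4 nodes; A[i, j] == 1 means a tie from i to j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 3] <- 1; A[2, 3] <- 1; A[4, 1] <- 1

# Made-up vertex attributes (hypothetical names).
fruit <- c("pear", "apple", "apple", "pear")   # categorical
size  <- c(10, 20, 30, 40)                     # numeric

# Edge list: column "row" holds the node of origin, "col" the destination.
el   <- which(A == 1, arr.ind = TRUE)
from <- el[, "row"]
to   <- el[, "col"]

# nodemix: one count per pairing of attribute values (the mixing matrix).
lev    <- sort(unique(fruit))
mixing <- table(from = factor(fruit[from], lev), to = factor(fruit[to], lev))

# nodeocov: total of the origin node's numeric attribute over all edges.
nodeocov_stat <- sum(size[from])

# nodeofactor: how often each category is the origin of a tie; dropping the
# first sorted level mimics the default base=1.
origin_counts    <- table(factor(fruit[from], lev))
nodeofactor_stat <- origin_counts[-1]

# ostar(k): each node with out-degree m contributes choose(m, k) k-outstars.
ostar_stat <- function(A, k) sum(choose(rowSums(A), k))

mixing
nodeocov_stat
nodeofactor_stat
ostar_stat(A, 1:1)   # equals the number of edges
ostar_stat(A, 2)
```

The origin counts over all categories sum to the number of edges, which is the linear dependency that the base argument of nodeofactor is designed to avoid.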