gnm {gnm} | R Documentation |
gnm fits generalised nonlinear models using an over-parameterised representation. gnm is able to fit models incorporating multiplicative interactions as standard and can fit other types of nonlinear effects via “plug-in” functions (see details).
gnm(formula, eliminate = NULL, constrain = NULL, family = gaussian, data = NULL, subset, weights, na.action, method = "gnmFit", offset, start = NULL, control = gnmControl(...), verbose = TRUE, model = TRUE, x = FALSE, vcov = FALSE, termPredictors = FALSE, ...)
formula |
a symbolic description of the nonlinear predictor. |
eliminate |
an optional formula consisting of a single factor to be used instead of an intercept in the predictor. |
constrain |
coefficients to set to zero, specified by a numeric vector of indices, a logical vector, or "pick" to select from a Tk dialog. |
family |
a specification of the error distribution and link function
to be used in the model. This can be a character string naming
a family function, a family function, or the result of a call
to a family function. See family and
wedderburn for possibilities.
|
data |
an optional data frame containing the variables in the model.
If not found in data, the variables are taken from
environment(formula), typically the environment from which
gnm is called. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
an optional vector of weights to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain NAs. The default is first, any
na.action attribute of data; second, any
na.action setting of options; and third,
na.fail. |
method |
the method to be used: either "gnmFit" to fit the
model, "coef" to return a character vector of names for the
coefficients in the model, or "model.frame" to return the
model frame. |
offset |
this can be used to specify an a priori known component to
be added to the predictor during fitting. offset terms
can be included in the formula instead or as well, and if both
are specified their sum is used. |
start |
a vector of starting values for the parameters in the
model; if a starting value is NA, the default starting value
will be used. Starting values need not be specified for eliminated
parameters. |
control |
a list of parameters for controlling the fitting process. See
gnmControl for details. |
verbose |
logical: if TRUE progress indicators are
printed as the model is fitted, including a diagnostic error message
if the algorithm restarts. |
model |
logical: if TRUE the model frame is returned. |
x |
logical: if TRUE the local design matrix from the last
iteration is returned. |
vcov |
logical: if TRUE the variance-covariance matrix of the
model coefficients is returned. |
termPredictors |
logical: if TRUE, a matrix is returned
with a column for each term in the model, containing the additive
contribution of that term to the predictor. |
... |
further arguments passed to or from other methods. |
Models for gnm are specified by giving a symbolic description of the nonlinear predictor, of the form response ~ terms. The response is typically a numeric vector; see later in this section for alternatives. The usual symbolic language may be used to specify any linear terms; see formula for details.
gnm has the in-built capability to handle multiplicative interactions, which can be specified in the model formula using the symbolic wrapper Mult; e.g. Mult(A, B) specifies a multiplicative interaction between factors A and B. The family of multiplicative interaction models includes row-column association models for contingency tables (e.g., Agresti, 2002, Sec 9.6), log-multiplicative or UNIDIFF models (Erikson and Goldthorpe, 1992; Xie, 1992), and GAMMI models (van Eeuwijk, 1995).
Other nonlinear terms may be incorporated in the model via plug-in functions that provide the objects required by gnm to fit the desired term. Such terms are specified in the model formula using the symbolic wrapper Nonlin; e.g. Nonlin(PlugInFunction(A, B)) specifies a term to be fitted by the plug-in function PlugInFunction involving factors A and B. The gnm package includes plug-in functions for multiplicative interactions with homogeneous effects (MultHomog) and diagonal reference terms (Dref). Users may also define their own plug-in functions; see Nonlin for details.
The eliminate argument may be used to specify a single factor to be used instead of an intercept in the model of the predictor. This feature is designed for factors that are required in the model but are not of direct interest. If the factor is specified using eliminate, the effects of the factor will be estimated more efficiently. At the end of the fitting process these parameters are eliminated from the vector of coefficients. See backPain for an example.
For contingency tables, the data may be provided as an object of class "table" from which the frequencies will be extracted to use as the response. In this case, the response should be specified as Freq in the model formula. The "predictors", "fitted.values", "residuals", "prior.weights", "weights", "y" and "offset" components of the returned gnm fit will be tables with the same format as the data.
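As a minimal sketch of the table interface described above, the following fits a row-column association model to occupationalStatus, an 8 x 8 object of class "table" from the datasets package with dimensions origin and destination. The choice of dataset and model is an illustration rather than part of this help page, and the code assumes the gnm package is installed.

```r
library(gnm)  # assumes the gnm package is installed

# The nonlinear parameters get random starting values, so fix the seed
# to make the run reproducible.
set.seed(1)

# Because the data are a "table", the response is written as Freq.
# Diag() fits a separate parameter for each diagonal cell and Mult()
# adds the multiplicative origin-by-destination interaction.
rc <- gnm(Freq ~ origin + destination + Diag(origin, destination) +
              Mult(origin, destination),
          family = poisson, data = occupationalStatus, verbose = FALSE)

# gnm returns NULL if the algorithm fails, so check before using the fit.
if (!is.null(rc)) {
    print(class(rc))
    print(deviance(rc))
    print(df.residual(rc))
}
```

Setting verbose = FALSE only suppresses the progress indicators; it does not change the fit.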
For binomial models, the response may be specified as a factor in which the first level denotes failure and all other levels denote success, as a two-column matrix with the columns giving the numbers of successes and failures, or as a vector of the proportions of successes.
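The sketch below illustrates two of these response forms on a small made-up data set (the data frame d and its columns are hypothetical). It assumes the gnm package is installed and that, as with glm, group sizes are passed through the weights argument when the response is given as proportions.

```r
library(gnm)  # assumes the gnm package is installed

# Hypothetical grouped data: successes and failures at five dose levels.
d <- data.frame(dose = 1:5,
                succ = c(2, 5, 9, 14, 18),
                fail = c(18, 15, 11, 6, 2))
d$total <- d$succ + d$fail

# Form 1: two-column matrix of successes and failures.
fit_counts <- gnm(cbind(succ, fail) ~ dose, family = binomial,
                  data = d, verbose = FALSE)

# Form 2: proportions of successes, with the group sizes as weights.
fit_props <- gnm(succ / total ~ dose, family = binomial,
                 weights = total, data = d, verbose = FALSE)

# Both forms describe the same likelihood, so the estimates should agree
# up to numerical tolerance.
print(all.equal(as.numeric(coef(fit_counts)), as.numeric(coef(fit_props)),
                tolerance = 1e-6))

# With the two-column form, the stored response is the proportion of
# successes and the prior weights are the group totals.
print(as.numeric(fit_counts$y))
print(as.numeric(fit_counts$prior.weights))
```

The model here has only linear terms, which keeps the example small; the same response forms apply when the predictor contains nonlinear terms.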
The gnm fitting process consists of two stages. In the start-up iterations, any nonlinear parameters that are not specified by either the start argument or a plug-in function are updated one parameter at a time, then the linear parameters are jointly updated before the next iteration. In the main iterations, all the parameters are jointly updated. See gnmControl for more details.
By default, gnm uses an over-parameterized representation of the model that is being fitted. Only minimal identifiability constraints are imposed, so that in general a random parameterization is obtained. The parameter estimates are ordered so that those for any linear terms appear first.
getContrasts may be used to obtain estimates of specified contrasts, if these contrasts are identifiable. In particular, getContrasts may be used to estimate the contrasts between each of the first k - 1 levels of a factor and level k.
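To see why individual coefficients are arbitrary while certain contrasts are not, it may help to write out a predictor with one multiplicative term. The symbols below (alpha, beta, gamma, delta, phi, and the constants c and k) are introduced here purely for illustration and do not appear elsewhere on this page.

```latex
\begin{align*}
% Predictor with main effects and one term Mult(A, B)
\eta_{ij} &= \alpha_i + \beta_j + \gamma_i \delta_j \\
% Rescaling leaves every \eta_{ij} unchanged for any c \neq 0
\gamma_i \delta_j &= (c\,\gamma_i)\,(\delta_j / c) \\
% With an exponentiated multiplier, as in Mult(Exp(A), B)
\exp(\phi_i)\,\delta_j &= \exp(\phi_i + k)\,\bigl(e^{-k}\,\delta_j\bigr)
\end{align*}
```

In the last case the individual values of phi are determined only up to the additive constant k, so they depend on the parameterization reached, whereas a difference between two of them is unaffected by k and can be estimated.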
If appropriate constraints are known in advance, or have been determined from a gnm fit, the model may be (re-)fitted using the constrain argument to specify coefficients which should be set to zero. Constraints may only be applied to non-eliminated parameters. update provides a convenient way of re-fitting a gnm model with new constraints.
If method = "gnmFit", gnm returns NULL if the algorithm has failed and an object of class "gnm" otherwise. A "gnm" object inherits first from "glm" then "lm" and is a list containing the following components:
call |
the matched call. |
formula |
the formula supplied. |
constrain |
a logical vector, indicating any coefficients that were constrained to zero in the fitting process. |
family |
the family object used. |
prior.weights |
the case weights initially supplied. |
terms |
the terms object used. |
na.action |
the na.action attribute of the model frame. |
xlevels |
a record of the levels of the factors used in fitting. |
y |
the response used. |
offset |
the offset vector used. |
control |
the value of the control argument used. |
coefficients |
a named vector of coefficients. |
eliminate |
the number of eliminated parameters. |
predictors |
the fitted values on the link scale. |
fitted.values |
the fitted mean values, obtained by transforming the predictors by the inverse of the link function. |
deviance |
up to a constant, minus twice the maximised log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero. |
aic |
Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of parameters (so assuming that the dispersion is known). |
iter |
the number of main iterations. |
conv |
logical indicating whether the main iterations converged. |
weights |
the working weights, that is, the weights used in the last iteration. |
residuals |
the working residuals, that is, the residuals from the last iteration. |
df.residual |
the residual degrees of freedom. |
rank |
the numeric rank of the fitted model. |
The list may also contain the components model, x, vcov, or termPredictors if requested in the arguments to gnm.
If a binomial gnm model is specified by giving a two-column response, the weights returned by prior.weights are the total numbers of cases (factored by the supplied case weights) and the component y of the result is the proportion of successes.
The function summary may be used to obtain and print a summary of the results.
The generic functions formula, family, terms, coefficients, fitted.values, deviance, extractAIC, weights, residuals, df.residual, model.frame, model.matrix, vcov and termPredictors may be used to extract components from the object returned by gnm or to construct the relevant objects where necessary.
Note that the generic functions weights and residuals do not act as straightforward accessor functions for gnm objects, but return the prior weights and deviance residuals respectively, as for glm objects.
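The distinction can be checked directly by comparing what the generic functions return with the list components of the same name. This is a sketch only: the independence model for occupationalStatus (a table from the datasets package) is chosen for illustration, and the code assumes the gnm package is installed.

```r
library(gnm)  # assumes the gnm package is installed

# A simple Poisson model with linear terms only, used for illustration.
fit <- gnm(Freq ~ origin + destination, family = poisson,
           data = occupationalStatus, verbose = FALSE)

# Generic functions: prior weights and deviance residuals.
w_prior <- as.vector(weights(fit))
r_dev <- as.vector(residuals(fit))

# List components: working weights and working residuals from the
# last iteration.
w_work <- as.vector(fit$weights)
r_work <- as.vector(fit$residuals)

# Compare each generic's result with the corresponding components.
print(isTRUE(all.equal(w_prior, as.vector(fit$prior.weights))))
print(isTRUE(all.equal(w_prior, w_work)))
print(isTRUE(all.equal(r_dev, r_work)))
```

To obtain working quantities through the generics rather than by indexing the list, the type argument of weights and residuals can be used, for example residuals(fit, type = "working").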
Heather Turner, David Firth
Agresti, A (2002). Categorical Data Analysis (2nd ed.) New York: Wiley.
Cautres, B, Heath, A F and Firth, D (1998). Class, religion and vote in Britain and France. La Lettre de la Maison Francaise 8.
Erikson, R and Goldthorpe, J H (1992). The Constant Flux. Oxford: Clarendon Press.
van Eeuwijk, F A (1995). Multiplicative interaction in generalized linear models. Biometrics 51, 1017-1032.
Xie, Y (1992). The log-multiplicative layer effect model for comparing mobility tables. American Sociological Review 57, 380-395.
formula for the symbolic language used to specify formulae.
Diag and Symm for specifying special types of interaction.
Mult, Nonlin, Dref and MultHomog for incorporating nonlinear terms in the formula argument to gnm.
residuals.glm and the generic functions coef, fitted, etc. for extracting components from gnm objects.
getContrasts to estimate (identifiable) contrasts from a gnm model.
### Analysis of a 4-way contingency table
set.seed(1)
data(cautres)
print(cautres)

## Fit a "double UNIDIFF" model with the religion-vote and class-vote
## interactions both modulated by nonnegative election-specific
## multipliers.
doubleUnidiff <- gnm(Freq ~ election:vote + election:class:religion
                     + Mult(Exp(election - 1), religion:vote - 1)
                     + Mult(Exp(election - 1), class:vote - 1),
                     family = poisson, data = cautres)

## Examine the multipliers of the class-vote log odds ratios
coefs.of.interest <- grep("Mult2.*election", names(coef(doubleUnidiff)))
coef(doubleUnidiff)[coefs.of.interest]
## Mult2.Factor1.election1 Mult2.Factor1.election2
##              -0.5724370               0.1092972
## Mult2.Factor1.election3 Mult2.Factor1.election4
##              -0.1230682              -0.2105843

## Re-parameterize by setting Mult2.Factor1.election4 to zero
getContrasts(doubleUnidiff, coefs.of.interest)
##                           estimate        se
## Mult2.Factor1.election1 -0.3618399 0.2534762
## Mult2.Factor1.election2  0.3198951 0.1320034
## Mult2.Factor1.election3  0.0875308 0.1446842
## Mult2.Factor1.election4  0.0000000 0.0000000

## Same thing but with election 1 as reference category:
getContrasts(doubleUnidiff, rev(coefs.of.interest))
##                          estimate        se
## Mult2.Factor1.election4 0.3618399 0.2534746
## Mult2.Factor1.election3 0.4493707 0.2473524
## Mult2.Factor1.election2 0.6817351 0.2401645
## Mult2.Factor1.election1 0.0000000 0.0000000

## Re-fit model with Mult2.Factor1.election1 set to zero
doubleUnidiffConstrained <-
    update(doubleUnidiff, constrain = coefs.of.interest[1])

## Examine the multipliers of the class-vote log odds ratios
coef(doubleUnidiffConstrained)[coefs.of.interest]
## ...as using 'getContrasts' (to 4 d.p.).