lm.disp {dispmod} | R Documentation |
This function estimates normal dispersion regression models.
lm.disp(formula, var.formula, data = list(), maxit = 30, epsilon = glm.control()$epsilon, subset, na.action = na.omit, contrasts = NULL, offset = NULL)
formula |
a symbolic description of the mean function of the model to be fit. For the details of model formula specification see help(lm) and help(formula) . |
var.formula |
a symbolic description of the variance function of the model to be fit. This must be a one-sided formula; if omitted the same terms used for the mean function are used. For the details of model formula specification see help(lm) and help(formula) . |
data |
an optional data frame containing the variables in the model. By default the variables are taken from `environment(formula)', typically the environment from which the function is called. |
maxit |
integer giving the maximal number of iterations for the model fitting procedure. |
epsilon |
positive convergence tolerance epsilon; the procedure converge when |dev - devold| < epsilon. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data contain `NA's. The default is set by the `na.action' setting of `options', and is `na.fail' if that is unset. The default is `na.omit'. |
contrasts |
an optional list. See the `contrasts.arg' of `model.matrix.default'. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. An `offset' term can be included in the formula instead or as well, and if both are specified their sum is used. |
Normal dispersion models allow to model variance heterogeneity in normal regression analysis using a log-linear model for the variance.
Suppose a response y is modelled as a function of a set of p predictors x through the linear model
y_i = β'x_i + e_i
where e_i ~ N(0, σ^2) under homogeneity. Variance heterogeneity is expressed as
Var(e_i) = σ^2 = exp(λ'z_i)
where z_i may contain some or all the variables in x_i and other variables not included in x_i; z_i is however assumed to contain a constant term. This model can be re-expressed also as
E(y|x) = β'x
Var(y|x) = exp(λ'z)
and is fitted by maximum likelihood following the algorithm described in Aitkin (1987).
`lm.dispmod' returns an object of `class' `"dispmod"'.
The function `summary' is used to obtain and print a summary of the results.
An object of class `"lm.dispmod"' is a list containing the following components:
call |
the matched call. |
mean |
an object of class `"glm"' giving the fitted model for the mean function. |
var |
an object of class `"glm"' giving the fitted model for the variance function. |
initial.deviance |
the value of the deviance at the beginning of the iterative procedure, i.e. assuming constant variance. |
deviance |
the value of the deviance at the end of the iterative procedure. |
Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)
Luca Scrucca, luca@stat.unipg.it
Aitkin, M. (1987), Modelling variance heterogeneity in normal regression models using GLIM, Applied Statistics, 36, 332–339.
lm
, glm
, glm.binomial.disp
, glm.poisson.disp
, ncv.test
(in the car
library).
data(minitab) attach(minitab) y <- V^(1/3) summary(mod <- lm(y ~ H + D)) summary(mod.disp1 <- lm.disp(y ~ H + D)) summary(mod.disp2 <- lm.disp(y ~ H + D, ~ H)) # Likelihood ratio test deviances <- c(mod.disp1$initial.deviance, mod.disp2$deviance, mod.disp1$deviance) lrt <- c(NA, abs(diff(deviances))) cbind(deviances, lrt, p.value=1-pchisq(lrt, 1)) # quadratic dispersion model on D (as discussed by Aitkin) summary(mod.disp4 <- lm.disp(y ~ H + D, ~ D + I(D^2))) r <- mod$residuals plot(D, log(r^2)) phi.est <- mod.disp4$var$fitted.values lines(D, log(phi.est))