fwdglm {forward} | R Documentation |
This function applies the forward search approach to robust analysis in generalized linear models.
fwdglm(formula, family, data, weights, na.action, contrasts = NULL, bsb = NULL, balanced = TRUE, maxit = 50, epsilon = 1e-06, nsamp = 100, trace = TRUE)
formula |
a symbolic description of the model to be fit. The details of the model are the same as for glm. |
family |
a description of the error distribution and link function to be used in the model. See `family' for details. |
data |
an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called. |
weights |
an optional vector of weights to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data contain `NA's. The default is set by the `na.action' setting of `options', and is `na.fail' if that is unset. The "factory-fresh" default is `na.omit'. |
contrasts |
an optional list. See the `contrasts.arg' of `model.matrix.default'. |
bsb |
an optional vector specifying a starting subset of observations to be used in the forward search. By default the "best" starting subset is chosen using the function lmsglm with control arguments provided by `nsamp'. |
balanced |
logical; for a binary response, if TRUE the proportion of successes observed in the full dataset is kept approximately balanced in the subsets used during the forward search. |
maxit |
integer giving the maximal number of IWLS iterations. See glm.control for details. |
epsilon |
positive convergence tolerance epsilon. See glm.control for details. |
nsamp |
the initial subset for the forward search in generalized linear models is found by the function lmsglm. This argument controls how many subsets are used in the robust fitting procedure. The choices are the number of samples (100 by default) or `"all"'. Note that the algorithm tries to find `nsamp' good subsets, up to a maximum of 2*`nsamp' subsets. |
trace |
logical, if TRUE a message is printed for every ten iterations completed during the forward search. |
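The general idea of a forward search can be sketched in base R. This is a simplified illustration under stated assumptions, not the code used by fwdglm: the data are simulated, the starting subset is arbitrary (fwdglm would take it from `bsb' or lmsglm), the names `dat', `m0', `subset_idx' and `coef_path' are hypothetical, and balancing, prior weights and the other diagnostics are ignored. At each step the model is fitted to the current subset of m units, and the next subset is assumed to consist of the m+1 units with the smallest squared deviance residuals under that fit.

```r
# Simulated Poisson data; purely hypothetical, not the 'cellular' data
set.seed(1)
n <- 40
dat <- data.frame(x = runif(n))
dat$y <- rpois(n, lambda = exp(0.5 + dat$x))

fam <- poisson(link = "log")
m0 <- 10                     # size of the arbitrary starting subset (assumption)
subset_idx <- seq_len(m0)    # fwdglm would obtain this from bsb or lmsglm

n_steps <- n - m0 + 1
coef_path <- matrix(NA_real_, nrow = n_steps, ncol = 2,
                    dimnames = list(as.character(m0:n), c("(Intercept)", "x")))

for (step in seq_len(n_steps)) {
  # Fit the GLM on the current subset only
  fit <- glm(y ~ x, family = fam, data = dat[subset_idx, ])
  coef_path[step, ] <- coef(fit)

  # Squared deviance residuals of ALL units under the subset fit
  mu <- predict(fit, newdata = dat, type = "response")
  d2 <- fam$dev.resids(dat$y, mu, rep(1, n))

  # Next subset: the (m + 1) units with the smallest squared residuals
  m <- length(subset_idx)
  if (m < n) subset_idx <- order(d2)[seq_len(m + 1)]
}

# One row of coefficients per subset size, from m0 up to n
head(coef_path)
```

Plotting the columns of `coef_path' against subset size gives a rough analogue of the forward plots of coefficients; in this sketch the final row coincides with the fit on all n units.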
The function returns an object of class `"fwdglm"' with the following components:
call |
the matched call. |
Residuals |
an (n x (n-p+1)) matrix of residuals. |
Unit |
a matrix of units added (to a maximum of 5 units) at each step. |
included |
a list with each element containing a vector of units included at each step of the forward search. |
Coefficients |
a ((n-p+1) x p) matrix of coefficients. |
tStatistics |
a ((n-p+1) x p) matrix of t statistics for the coefficients, i.e. coef.est/SE(coef.est). |
Leverage |
an (n x (n-p+1)) matrix of leverage values. |
MaxRes |
a ((n-p) x 2) matrix of maximum deviance residuals in the best subsets and m-th deviance residuals. |
MinDelRes |
a ((n-p-1) x 2) matrix of minimum deviance residuals out of best subsets and (m+1)-th deviance residuals. |
ScoreTest |
a ((n-p) x 1) matrix of score test statistics for a goodness of link test. |
Likelihood |
a ((n-p) x 4) matrix with columns containing: deviance, residual deviance, pseudo R^2 (computed as 1-deviance/null.deviance), dispersion parameter (computed as sum(pearson.residuals^2)/(m - p)). |
CookDist |
a ((n-p) x 1) matrix of forward Cook's distances. |
ModCookDist |
a ((n-p) x 5) matrix of forward modified Cook's distances for the units (to a maximum of 5 units) included at each step. |
Weights |
an (n x (n-p)) matrix of weights used at each step of the forward search. |
inibsb |
a vector giving the best starting subset chosen by lmsglm. |
binary.response |
logical, equal to TRUE if binary response. |
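The formulas quoted for `tStatistics' and `Likelihood' can be reproduced for a single fit with base R. The following is a minimal sketch for one glm fit on all units, not for the full forward search; it uses the built-in `InsectSprays' data rather than anything from the forward package, and the names `t_stat', `pseudo_r2' and `dispersion' are illustrative only.

```r
# Poisson regression on a built-in dataset (datasets::InsectSprays)
fit <- glm(count ~ spray, data = InsectSprays, family = poisson(link = "log"))

m <- nobs(fit)            # number of units used in the fit
p <- length(coef(fit))    # number of estimated coefficients

# t statistics: coef.est / SE(coef.est)
se <- sqrt(diag(vcov(fit)))
t_stat <- coef(fit) / se

# pseudo R^2: 1 - deviance / null.deviance
pseudo_r2 <- 1 - fit$deviance / fit$null.deviance

# dispersion parameter: sum(pearson.residuals^2) / (m - p)
dispersion <- sum(residuals(fit, type = "pearson")^2) / (m - p)

print(t_stat)
print(c(pseudo_r2 = pseudo_r2, dispersion = dispersion))
```

In the object returned by fwdglm these quantities are stored once per step of the search, which is why the corresponding components are matrices with one row per subset size.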
Originally written for S-Plus by:
Kjell Konis kkonis@insightful.com and Marco Riani mriani@unipr.it
Ported to R by Luca Scrucca luca@stat.unipg.it
Atkinson, A.C. and Riani, M. (2000), Robust Diagnostic Regression Analysis, First Edition. New York: Springer, Chapter 6.
summary.fwdglm, plot.fwdglm, fwdlm, fwdsco.
data(cellular)
cellular$TNF <- as.factor(cellular$TNF)
cellular$IFN <- as.factor(cellular$IFN)
mod <- fwdglm(y ~ TNF + IFN, data=cellular, family=poisson(log), nsamp=200)
summary(mod)
## Not run: plot(mod)
plot(mod, 1)
plot(mod, 5)
plot(mod, 6, ylim=c(-3, 20))
plot(mod, 7)
plot(mod, 8)