modavg {AICcmodavg}R Documentation

Compute Model-averaged Parameter Estimate (Multimodel Inference)

Description

This function model-averages the estimate of a parameter of interest among a set of candidate models, computes the unconditional standard error and unconditional confidence intervals as described in Buckland et al. (1997) and Burnham and Anderson (2002).

Usage

modavg(cand.set, parm, modnames, c.hat = 1, gamdisp = NULL,
conf.level = 0.95, second.ord = TRUE, nobs = NULL, exclude = NULL,
warn = TRUE, uncond.se = "revised")

modavg.glm(cand.set, parm, modnames, c.hat = 1, gamdisp = NULL,
conf.level = 0.95, second.ord = TRUE, nobs = NULL, exclude = NULL,
warn = TRUE, uncond.se = "revised")

modavg.lme(cand.set, parm, modnames, conf.level = 0.95,
second.ord = TRUE, nobs = NULL, exclude = NULL, warn = TRUE,
uncond.se = "revised")

modavg.mult(cand.set, parm, modnames, c.hat = 1, conf.level = 0.95,
second.ord = TRUE, nobs = NULL, exclude = NULL, warn = TRUE,
uncond.se = "revised")

modavg.polr(cand.set, parm, modnames, conf.level = 0.95,
second.ord = TRUE, nobs = NULL, exclude = NULL, warn = TRUE,
uncond.se = "revised")

Arguments

cand.set a list storing each of the models in the candidate model set.
parm the parameter of interest, enclosed between quotes, for which a model-averaged estimate is required. For categorical variables, the level for which an estimate is needed must be included (see 'Details' below).
modnames a character vector of model names to facilitate the identification of each model in the model selection table.
c.hat value of overdispersion parameter (i.e., variance inflation factor) such as that obtained from 'c_hat'. Note that values of c.hat different from 1 are only appropriate either for binomial GLM's with trials > 1 (i.e., success/trial or cbind(success, failure) syntax) or for Poisson GLM's. If c.hat > 1, 'aictab' will return the quasi-likelihood analogue of the information criteria requested and multiply the variance-covariance matrix of the estimates by this value (i.e., SE's are multiplied by sqrt(c.hat)).
gamdisp if gamma GLM is used, the dispersion parameter should be specified here to apply the same value to each model.
conf.level the confidence level requested for the computation of unconditional confidence intervals.
second.ord logical. If TRUE, the function returns the second-order Akaike information criterion (i.e., AICc).
nobs this argument allows to specify a numeric value other than total sample size to compute the AICc. This is relevant only for linear mixed models where sample size is not straightforward. In such cases, one might use total number of observations or number of independent clusters as the value of 'nobs'.
exclude this argument excludes models based on the terms specified from the computation of a model-averaged estimate on 'parm'. The 'exclude' argument is set to NULL by default and does not exclude any models other than those without the 'parm'. When 'parm' is a main effect but is also involved in interactions/polynomial terms in some models, one should specify the interaction/polynomial terms as a list to exclude models with these terms from the computation of model-averaged estimate of the main effect (e.g., exclude = list("sex:mass", "mass2")). See 'Details' and 'Examples' below.
warn logical. If TRUE, modavg( ) performs a check and isssues a warning when the value in 'parm' occurs more than once in any given model. This is a check for potential interaction/polynomial terms in the model when such terms are constructed with the usual operators (e.g., I( ) for polynomial terms, ':' for interaction terms).
uncond.se either, "old", or "revised", specifying the equation used to compute the unconditional standard error of a model-averaged estimate. With uncond.se = "old", computations are based on equation 4.9 of Burnham and Anderson (2002), which was the former way to compute unconditional standard errors. With uncond.se = "revised", equation 6.12 of Burnham and Anderson (2002) is used. Anderson (2008, p. 111) recommends use of the revised version for the computation of unconditional standard errors and it is now the default. Note that versions of package AICcmodavg < 1.04 used the old method to compute unconditional standard errors.

Details

The parameter must be specified as it appears in the model output. For factors, one must specify the name of the variable and the level of interest. To avoid problems, one should specify interaction terms consistently for all models: e.g., either a:b or b:a for all models, but not a mixture of both.

Care must be taken when some models include interaction or polynomial terms, because main effect terms do not have the same interpretation when an interaction/polynomial term appears in the same model. In such cases, one should exclude models containing interaction terms where the main effect is involved with the 'exclude' argument of modavg( ). Note that modavg( ) checks for potential cases of multiple instances of a variable appearing more than once in a given model (presumably in an interaction) and issues a warning. To correctly compute the model-averaged estimate of a main effect involved in interaction/polynomial terms, specify the terms(s) that should not appear in the same model with the 'exclude' argument. This will effectively exclude models from the computation of the model-averaged estimate.

When 'warn' = TRUE, modavg( ) looks for matches among the labels of the estimates with identical( ). It then, compares the results to partial matches with regexpr( ), and issues a warning whenever they are different. As a result, modavg( ) may issue a warning when some variables or levels of categorical variables have nested names (e.g., treat, treat10; L, TL). When this warning is only due to the presence of similarly named variables in the models (and NOT due to interaction terms), you can suppress this warning by setting 'warn' = FALSE.

'modavg' is a function that calls 'modavg.glm', 'modavg.lme', 'modavg.mult', or 'modavg.polr' depending on the class of the object. The current function is implemented for a list containing objects of 'lm','glm', 'lme', 'multinom', and 'polr' classes.

Value

'modavg', 'modavg.glm', 'modavg.lme', 'modavg.mult', 'modavg.polr' and create an object of class 'modavg' with the following components:

Parameter the parameter for which a model-averaged estimate was obtained
Mod.avg.table the reduced model selection table based on models including the parameter of interest
Mod.avg.beta the model-averaged estimate based on all models including the parameter of interest (see 'Details' above regarding the exclusion of models where parameter of interest is involved in an interaction)
Uncond.SE the unconditional standard error for the model-averaged estimate (as opposed to the conditional SE based on a single model)
Conf.level the confidence level used to compute the confidence interval
Lower.CL the lower confidence limit
Upper.CL the upper confidence limit

Author(s)

Marc J. Mazerolle

References

Anderson, D. R. (2008) Model-based Inference in the Life Sciences: a primer on evidence. Springer: New York.

Buckland, S. T., Burnham, K. P., Augustin, N. H. (1997) Model selection: an integral part of inference. Biometrics 53, 603–618.

Burnham, K. P., Anderson, D. R. (2002) Model Selection and Multimodel Inference: a practical information-theoretic approach. Second edition. Springer: New York.

Burnham, K. P., Anderson, D. R. (2004) Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods and Research 33, 261–304.

Mazerolle, M. J. (2006) Improving data analysis in herpetology: using Akaike's Information Criterion (AIC) to assess the strength of biological hypotheses. Amphibia-Reptilia 27, 169–180.

See Also

AICc, aictab, c_hat, importance, confset, evidence, modavgpred

Examples

##anuran larvae example from Mazerolle (2006)
data(min.trap)
##assign "UPLAND" as the reference level as in Mazerolle (2006)          
min.trap$Type <- relevel(min.trap$Type, ref = "UPLAND") 

##set up candidate models          
Cand.mod <- list()
##global model          
Cand.mod[[1]] <- glm(Num_anura ~ Type + log.Perimeter + Num_ranatra,
family = poisson, offset = log(Effort), data = min.trap) 
Cand.mod[[2]] <- glm(Num_anura ~ Type + log.Perimeter, family = poisson,
offset = log(Effort), data = min.trap) 
Cand.mod[[3]] <- glm(Num_anura ~ Type + Num_ranatra, family = poisson,
offset = log(Effort), data = min.trap) 
Cand.mod[[4]] <- glm(Num_anura ~ Type, family = poisson,
offset = log(Effort), data = min.trap) 
Cand.mod[[5]] <- glm(Num_anura ~ log.Perimeter + Num_ranatra,
family = poisson, offset = log(Effort), data = min.trap) 
Cand.mod[[6]] <- glm(Num_anura ~ log.Perimeter, family = poisson,
offset = log(Effort), data = min.trap) 
Cand.mod[[7]] <- glm(Num_anura ~ Num_ranatra, family = poisson,
offset = log(Effort), data = min.trap) 
Cand.mod[[8]] <- glm(Num_anura ~ 1, family = poisson,
offset = log(Effort), data = min.trap) 
          
##check c-hat for global model
c_hat(Cand.mod[[1]]) #uses Pearson's chi-square/df
##note the very low overdispersion: in this case, the analysis could be
##conducted without correcting for c-hat as its value is reasonably close
##to 1  

##assign names to each model
Modnames <- c("type + logperim + invertpred", "type + logperim", "type +
invertpred", "type", "logperim + invertpred", "logperim", "invertpred",
"intercept only") 

##compute model-averaged estimate of TypeBOG
modavg(parm = "TypeBOG", cand.set = Cand.mod, modnames = Modnames)
##round to 4 digits after decimal point
print(modavg(parm = "TypeBOG", cand.set = Cand.mod, modnames =
Modnames), digits = 4)

##compute residual deviance as in Mazerolle (2006)          
Cand.mod[[1]]$deviance/Cand.mod[[1]]$df.residual

##compute model-averaged estimate of TypeBOG as in Table 4 of
##Mazerolle (2006) 
modavg(parm = "TypeBOG", cand.set = Cand.mod, modnames = Modnames,
c.hat = 1.11) 

##example with similarly-named variables and interaction terms
set.seed(seed = 4)
resp <- rnorm(n = 40, mean = 3, sd = 1)
size <- rep(c("small", "medsmall", "high", "medhigh"), times = 10)
set.seed(seed = 4)
mass <- rnorm(n = 40, mean = 2, sd = 0.1)
mass2 <- mass^2
age <- rpois(n = 40, lambda = 3.2)
agecorr <- rpois(n = 40, lambda = 2) 
sizecat <- rep(c("a", "ab"), times = 20)
data1 <- data.frame(resp = resp, size = size, sizecat = sizecat,
mass = mass, mass2 = mass2, age = age, agecorr = agecorr)

##set up models in list
Cand <- list()
Cand[[1]] <- lm(resp ~ size + agecorr, data = data1)
Cand[[2]] <- lm(resp ~ size + mass + agecorr, data = data1)
Cand[[3]] <- lm(resp ~ age + mass, data = data1)
Cand[[4]] <- lm(resp ~ age + mass + mass2, data = data1)
Cand[[5]] <- lm(resp ~ mass + mass2 + size, data = data1)
Cand[[6]] <- lm(resp ~ mass + mass2 + sizecat, data = data1)
Cand[[7]] <- lm(resp ~ mass + mass2 + sizecat, data = data1)
Cand[[8]] <- lm(resp ~ sizecat, data = data1)
Cand[[9]] <- lm(resp ~ sizecat + mass + sizecat:mass, data = data1)
Cand[[10]] <- lm(resp ~ agecorr + sizecat + mass + sizecat:mass,
data = data1) 

Modnames<-NULL
for (i in 1:length(Cand)) {
Modnames[i]<-paste("mod", i, sep="")
}

aictab(cand.set = Cand, modnames = Modnames, sort = TRUE) #correct

##as expected, issues warning as mass occurs sometimes with "mass2" or
##"sizecatab:mass" in some of the models
## Not run: modavg(cand.set = Cand, parm = "mass", modnames = Modnames)

##no warning issued, because "age" and "agecorr" never appear in same model
modavg(cand.set = Cand, parm = "age", modnames = Modnames)

##as expected, issues warning because warn=FALSE, but it is a very bad
##idea in this example since "mass" occurs with "mass2" and "sizecat:mass"
##in some of the models - results are INCORRECT
## Not run: 
modavg(cand.set = Cand, parm = "mass", modnames = Modnames,
warn = FALSE)
## End(Not run)

##correctly excludes models with quadratic term and interaction term
##results are CORRECT
modavg(cand.set = Cand, parm = "mass", modnames = Modnames,
exclude = list("mass2", "sizecat:mass")) 

##correctly computes model-averaged estimate because no other parameter
##occurs simultaneously in any of the models
modavg(cand.set = Cand, parm = "sizesmall", modnames = Modnames) #correct

##as expected, issues a warning because "sizecatab" occurs sometimes in
##an interaction in some models
## Not run: 
modavg(cand.set = Cand, parm = "sizecatab",
modnames = Modnames) 
## End(Not run)

##exclude models with "sizecat:mass" interaction - results are CORRECT
modavg(cand.set = Cand, parm = "sizecatab", modnames = Modnames,
exclude = list("sizecat:mass")) 

[Package AICcmodavg version 1.05 Index]