predict.aster {aster}R Documentation

Predict Method for Aster Model Fits

Description

Obtains predictions and optionally estimates standard errors of those predictions from a fitted Aster model object.

Usage

## S3 method for class 'aster':
predict(object, x, root, modmat, amat,
    parm.type = c("mean.value", "canonical"),
    model.type = c("unconditional", "conditional"),
    se.fit = FALSE, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), newcoef = NULL, ...)

## S3 method for class 'aster.formula':
predict(object, newdata, varvar, idvar, root, amat,
    parm.type = c("mean.value", "canonical"),
    model.type = c("unconditional", "conditional"),
    se.fit = FALSE, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), newcoef = NULL, ...)

Arguments

object a fitted object of class inheriting from "aster" or "aster.formula".
modmat a model matrix to use instead of object$modmat. Must have the same structure (three-dimensional array, first index runs over individuals, second over nodes of the graphical model, third over covariates. Must have the same second and third dimensions as object$modmat. The second and third components of dimnames(modmat) and dimnames(object$modmat) must also be the same.
May be missing, in which case object$modmat is used.
predict.aster.formula constructs such a modmat from object$formula, the data frame newdata, and the variables in the environment of the formula. When newdata is missing, then object$modmat is used.
x response. Ignored and may be missing unless parm.type == "mean.value" && model.type == "conditional". Even then may be missing when modmat is missing, in which case object$x is used. A matrix whose first and second dimensions and the corresponding dimnames agrees with those of modmat and object$modmat.
predict.aster.formula constructs such an x from the response variable name in object$formula, the data frame newdata, and the variables in the environment of the formula. When newdata is missing, then object$x is used.
root root data. Ignored and may be missing unless parm.type == "mean.value". Even then may be missing when modmat is missing, in which case object$root is used. A matrix of the same form as x.
predict.aster.formula looks up the variable supplied as the argument root in the data frame newdata or in the variables in the environment of the formula and makes it a matrix of the same form as x. When newdata is missing, then object$root is used.
amat if zeta is the requested prediction (mean value or canonical, unconditional or conditional, depending on parm.type and model.type), then we predict the linear function t(amat) %*% zeta. May be missing, in which case the identity linear function is used.
For predict.aster, a three-dimensional array with dim(amat)[1:2] == dim(modmat)[1:2].
For predict.aster.formula, a three-dimensional array of the same dimensions as required for predict.aster (even though modmat is not provided). First dimension is number of individuals in newdata, if provided, otherwise number of individuals in object$data. Second dimension is number of variables (length{object$pred}).
parm.type the type of parameter to predict. The default is mean value parameters (the opposite of the default for predict.glm), the expected value of a linear function of the response under the MLE probability model (also called the MLE of the mean value parameter). The expectation is unconditional or conditional depending on parm.type.
The alternative "canonical" is the value of a linear function of the MLE of canonical parameters under the MLE probability model. The canonical parameter is unconditional or conditional depending on parm.type.
The value of this argument can be abbreviated.
model.type the type of model in which to predict. The default is "unconditional" in which case the parameters (either mean value or canonical, depending on the value of parm.type) are those of an unconditional model. The alternative is "conditional" in which case the parameters are those of a conditional model.
The value of this argument can be abbreviated.
se.fit logical switch indicating if standard errors are required.
info the type of Fisher information use to compute standard errors.
info.tol tolerance for eigenvalues of Fisher information. If eval is the vector of eigenvalues of the information matrix, then eval < cond.tol * max(eval) are considered zero. Hence the corresponding eigenvectors are directions of constancy or recession of the log likelihood.
newdata optionally, a data frame in which to look for variables with which to predict. If omitted, see modmat above. See also details section below.
varvar a variable of length nrow(newdata), typically a variable in newdata that is a factor whose levels are character strings treated as variable names. The number of variable names is nnode. Must be of the form rep(vars, each = nind) where vars is a vector of variable names. Not used if newdata is missing.
idvar a variable of length nrow(newdata), typically a variable in newdata that indexes individuals. The number of individuals is nind. Must be of the form rep(inds, times = nnode) where inds is a vector of labels for individuals. Not used if newdata is missing.
newcoef if not NULL, a variable of length object$coefficients and used in its place when one wants predictions at other than the fitted coefficient values.
... further arguments passed to or from other methods.

Details

Note that model.type need have nothing to do with the type of the fitted aster model, which is object$type.

Whether the fitted model is conditional or unconditional, one typically wants unconditional mean value parameters, because conditional mean value parameters for hypothetical individuals depend on the hypothetical data x, which usually makes no scientific sense.

If one does ask for conditional mean value parameters, generally the “data” should satisfy all(x == 1) and all(root == 1), so that the mean value parameters are “per unit of predecessor variable”, that is we “predict” psi''(theta[i, j]) rather than this multiplied by x[i, p(j)], where p(j) is the mathematical function defined by the R expression pred[j].

Similarly, if object$type == "conditional", then the conditional canonical parameters are a linear function of the regression coefficients theta = M beta, where M is the model matrix, but one can predict either theta or the unconditional canonical parameters phi, as selected by model.type. Similarly, if object$type == "unconditional", so phi = M beta, one can predict either theta or phi as selected by model.type.

The specification of the prediction model is confusing because there are so many possibilities. First the “usual” case. The fit was done using a formula, found in object$formula. A data frame newdata that has the same variables as object$data, the data frame used in the fit, but may have different rows (representing hypothetical individuals) is supplied. But newdata must specify all nodes of the graphical model for each (hypothetical, new) individual, just like object$data did for real observed individuals. Hence newdata is typically constructed using reshape. See also the details section of aster.

In this “usual” case we need varvar and idvar to tell us what rows of newdata correspond to which individuals and nodes (the same role they played in the original fit by aster). If we are predicting canonical parameters, then we do not need root or x. If we are predicting unconditional mean value parameters, then we also need root but not x. If we are predicting conditional mean value parameters, then we also need both root and x. In the “usual” case, these are found in newdata and not supplied as arguments to predict. Moreover, x is not named "x" but is the response in out$formula.

The next case, predict(object) with no other arguments, is often used with linear models (predict.lm), but we expect will be little used for aster models. As for linear models, this “predicts” the observed data. In this case modmat, x, and root are found in object and nothing is supplied as an argument to predict.aster, except perhaps amat if one wants a function of predictions for the observed data.

The final case, also perhaps little used, is a fail-safe mode for problems in which the R formula language just cannot be bludgeoned into doing what you want. This is the same reason aster.default exists. Then a model matrix can be constructed “by hand”, and the function predict.aster is used instead of predict.aster.formula.

Note that it is possible to use a “constructed by hand” model matrix even if object was produced by aster.formula. Simply explicitly call predict.aster rather than predict to override the R method dispatch (which would call predict.aster.formula in this case).

Value

If se.fit = FALSE, a vector of predictions. If se.fit = TRUE, a list with components

fit Predictions
se.fit Estimated standard errors

Examples

### see package vignette for explanation ###
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grep("hdct", as.character(redata$varb))
hdct <- is.element(seq(along = redata$varb), hdct)
redata <- data.frame(redata, hdct = as.integer(hdct))
aout4 <- aster(resp ~ varb + nsloc + ewloc + pop * hdct - pop,
    pred, fam, varb, id, root, data = redata)
newdata <- data.frame(pop = levels(echinacea$pop))
for (v in vars)
    newdata[[v]] <- 1
newdata$root <- 1
newdata$ewloc <- 0
newdata$nsloc <- 0
renewdata <- reshape(newdata, varying = list(vars),
     direction = "long", timevar = "varb", times = as.factor(vars),
     v.names = "resp")
hdct <- grep("hdct", as.character(renewdata$varb))
hdct <- is.element(seq(along = renewdata$varb), hdct)
renewdata <- data.frame(renewdata, hdct = as.integer(hdct))
nind <- nrow(newdata)
nnode <- length(vars)
amat <- array(0, c(nind, nnode, nind))
for (i in 1:nind)
    amat[i , grep("hdct", vars), i] <- 1
foo <- predict(aout4, varvar = varb, idvar = id, root = root,
    newdata = renewdata, se.fit = TRUE, amat = amat)
bar <- cbind(foo$fit, foo$se.fit)
dimnames(bar) <- list(as.character(newdata$pop), c("Estimate", "Std. Error"))
print(bar)

[Package aster version 0.7-7 Index]