predict.aster {aster} | R Documentation |
Obtains predictions and optionally estimates standard errors of those predictions from a fitted Aster model object.
## S3 method for class 'aster': predict(object, x, root, modmat, amat, parm.type = c("mean.value", "canonical"), model.type = c("unconditional", "conditional"), se.fit = FALSE, info = c("expected", "observed"), info.tol = sqrt(.Machine$double.eps), newcoef = NULL, ...) ## S3 method for class 'aster.formula': predict(object, newdata, varvar, idvar, root, amat, parm.type = c("mean.value", "canonical"), model.type = c("unconditional", "conditional"), se.fit = FALSE, info = c("expected", "observed"), info.tol = sqrt(.Machine$double.eps), newcoef = NULL, ...)
object |
a fitted object of class inheriting from "aster"
or "aster.formula" . |
modmat |
a model matrix to use instead of object$modmat .
Must have the same structure (three-dimensional array, first index runs
over individuals, second over nodes of the graphical model, third over
covariates. Must have the same second and third dimensions as
object$modmat . The second and third components of
dimnames(modmat) and dimnames(object$modmat) must also be
the same.
May be missing, in which case object$modmat is used.
predict.aster.formula constructs such a modmat from
object$formula , the data frame newdata , and the variables
in the environment of the formula. When newdata is missing, then
object$modmat is used. |
x |
response. Ignored and may be missing unless
parm.type == "mean.value" && model.type == "conditional" .
Even then may be missing when modmat is missing,
in which case object$x is used. A matrix whose first and
second dimensions and the corresponding dimnames agrees with
those of modmat and object$modmat .
predict.aster.formula constructs such an x from
the response variable name in object$formula ,
the data frame newdata ,
and the variables in the environment of the formula. When newdata
is missing, then object$x is used. |
root |
root data. Ignored and may be missing unless
parm.type == "mean.value" .
Even then may be missing when modmat is missing,
in which case object$root is used. A matrix of the
same form as x .
predict.aster.formula looks up the variable supplied as
the argument root in the data frame newdata or in
the variables in the environment of the formula and makes it a matrix
of the same form as x . When newdata
is missing, then object$root is used. |
amat |
if zeta is the requested prediction (mean value
or canonical, unconditional or conditional, depending on parm.type
and model.type ), then we predict the linear function
t(amat) %*% zeta . May be missing, in which case the identity
linear function is used.
For predict.aster , a three-dimensional
array with dim(amat)[1:2] == dim(modmat)[1:2] .
For predict.aster.formula , a three-dimensional array
of the same dimensions as required for predict.aster
(even though modmat is not provided). First dimension
is number of individuals in newdata , if provided, otherwise
number of individuals in object$data . Second dimension
is number of variables (length{object$pred}) .
|
parm.type |
the type of parameter to predict. The default is
mean value parameters (the opposite of the default
for predict.glm ), the expected value of a linear function
of the response under the MLE probability model (also called the
MLE of the mean value parameter). The expectation is unconditional
or conditional depending on parm.type .
The alternative "canonical" is the value of a linear function
of the MLE of canonical parameters under the MLE probability model.
The canonical parameter is unconditional
or conditional depending on parm.type .
The value of this argument can be abbreviated. |
model.type |
the type of model in which to predict. The default is
"unconditional" in which case the parameters (either mean value
or canonical, depending on the value of parm.type ) are those
of an unconditional model.
The alternative is "conditional" in which case the parameters
are those of a conditional model.
The value of this argument can be abbreviated. |
se.fit |
logical switch indicating if standard errors are required. |
info |
the type of Fisher information use to compute standard errors. |
info.tol |
tolerance for eigenvalues of Fisher information.
If eval is the vector of eigenvalues of the information matrix,
then eval < cond.tol * max(eval) are considered zero. Hence the
corresponding eigenvectors are directions of constancy or recession of
the log likelihood. |
newdata |
optionally, a data frame in which to look for variables with
which to predict. If omitted, see modmat above. See also details
section below. |
varvar |
a variable of length nrow(newdata) , typically a
variable in newdata
that is a factor whose levels are character strings
treated as variable names. The number of variable names is nnode .
Must be of the form rep(vars, each = nind) where vars is
a vector of variable names. Not used if newdata is missing. |
idvar |
a variable of length nrow(newdata) , typically a
variable in newdata
that indexes individuals. The number
of individuals is nind .
Must be of the form rep(inds, times = nnode) where inds is
a vector of labels for individuals. Not used if newdata is missing. |
newcoef |
if not NULL ,
a variable of length object$coefficients and used
in its place when one wants predictions at other than the fitted
coefficient values. |
... |
further arguments passed to or from other methods. |
Note that model.type
need have nothing to do with the type
of the fitted aster model, which is object$type
.
Whether the
fitted model is conditional or unconditional, one typically wants
unconditional mean value parameters, because conditional mean
value parameters for hypothetical individuals depend on the hypothetical
data x
, which usually makes no scientific sense.
If one does ask for conditional mean value parameters, generally
the “data” should satisfy all(x == 1)
and all(root == 1)
,
so that the mean value parameters
are “per unit of predecessor variable”, that is we “predict”
psi''(theta[i, j]) rather than this multiplied
by x[i, p(j)], where p(j) is the mathematical
function defined by the R expression pred[j]
.
Similarly, if object$type == "conditional"
, then the conditional
canonical parameters are a linear function of the regression coefficients
theta = M beta, where M is the model matrix,
but one can predict either theta or the unconditional
canonical parameters phi,
as selected by model.type
.
Similarly, if object$type == "unconditional"
,
so phi = M beta, one can predict either
theta or phi
as selected by model.type
.
The specification of the prediction model is confusing because there
are so many possibilities. First the “usual” case.
The fit was done using a formula, found in object$formula
.
A data frame newdata
that has the same variables as object$data
,
the data frame used in the fit, but may have different rows (representing
hypothetical individuals) is supplied.
But newdata
must specify all nodes
of the graphical model for each (hypothetical, new) individual,
just like object$data
did for real observed individuals.
Hence newdata
is typically constructed using reshape
.
See also the details section of aster
.
In this “usual” case we need varvar
and idvar
to
tell us what rows of newdata
correspond to which individuals and
nodes (the same role they played in the original fit by aster
).
If we are predicting canonical parameters, then we do not need root
or
x
.
If we are predicting unconditional mean value parameters, then
we also need root
but not x
.
If we are predicting conditional mean value parameters, then
we also need both root
and x
.
In the “usual” case, these are found in newdata
and
not supplied as arguments to predict
. Moreover, x
is not named "x"
but is the response in out$formula
.
The next case, predict(object)
with no other arguments,
is often used with linear models (predict.lm
),
but we expect will be little used for aster models. As for linear
models, this “predicts” the observed data. In this case
modmat
, x
, and root
are found in object
and nothing is supplied as an argument to predict.aster
, except
perhaps amat
if one wants a function of predictions for the observed
data.
The final case, also perhaps little used, is a fail-safe mode for problems
in which the R formula language just cannot be bludgeoned into doing what
you want. This is the same reason aster.default
exists.
Then a model matrix can be constructed “by hand”, and the function
predict.aster
is used instead of predict.aster.formula
.
Note that it is possible to use a “constructed by hand”
model matrix even if object
was produced by aster.formula
.
Simply explicitly call predict.aster
rather than predict
to override the R method dispatch (which would
call predict.aster.formula
in this case).
If se.fit = FALSE
, a vector of predictions.
If se.fit = TRUE
, a list with components
fit |
Predictions |
se.fit |
Estimated standard errors |
### see package vignette for explanation ### library(aster) data(echinacea) vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04", "hdct02", "hdct03", "hdct04") redata <- reshape(echinacea, varying = list(vars), direction = "long", timevar = "varb", times = as.factor(vars), v.names = "resp") redata <- data.frame(redata, root = 1) pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6) fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3) hdct <- grep("hdct", as.character(redata$varb)) hdct <- is.element(seq(along = redata$varb), hdct) redata <- data.frame(redata, hdct = as.integer(hdct)) aout4 <- aster(resp ~ varb + nsloc + ewloc + pop * hdct - pop, pred, fam, varb, id, root, data = redata) newdata <- data.frame(pop = levels(echinacea$pop)) for (v in vars) newdata[[v]] <- 1 newdata$root <- 1 newdata$ewloc <- 0 newdata$nsloc <- 0 renewdata <- reshape(newdata, varying = list(vars), direction = "long", timevar = "varb", times = as.factor(vars), v.names = "resp") hdct <- grep("hdct", as.character(renewdata$varb)) hdct <- is.element(seq(along = renewdata$varb), hdct) renewdata <- data.frame(renewdata, hdct = as.integer(hdct)) nind <- nrow(newdata) nnode <- length(vars) amat <- array(0, c(nind, nnode, nind)) for (i in 1:nind) amat[i , grep("hdct", vars), i] <- 1 foo <- predict(aout4, varvar = varb, idvar = id, root = root, newdata = renewdata, se.fit = TRUE, amat = amat) bar <- cbind(foo$fit, foo$se.fit) dimnames(bar) <- list(as.character(newdata$pop), c("Estimate", "Std. Error")) print(bar)