ecme {pan}R Documentation

ECME algorithm for general linear mixed model, as described by Schafer (1997)

Description

Performs maximum-likelihood estimation for generalized linear mixed models. The model, which is typically applied to longitudinal or clustered responses, is

yi = Xi%*%beta + Zi%*%bi + ei , i=1,...,m,

where

yi = (ni x 1) response vector for subject or cluster i;

Xi = (ni x p) matrix of covariates;

Zi = (ni x q) matrix of covariates;

beta = (p x 1) vector of coefficients common to the population (fixed effects);

bi = (q x 1) vector of coefficients specific to subject or cluster i (random effects); and

ei = (ni x 1) vector of residual errors.

The vector bi is assumed to be normally distributed with mean zero and unstructured covariance matrix psi,

bi ~ N(0,psi) independently for i=1,...,m.

The residual vector ei is assumed to be

ei ~ N(0,sigma2*Vi)

where Vi is a known (ni x ni) matrix. In most applications, Vi is the identity matrix.

Usage

ecme(y, subj, occ, pred, xcol, zcol=NULL, vmax, start, 
     maxits=1000, eps=0.0001, random.effects=F)

Arguments

y vector of responses. This is simply the individual yi vectors stacked upon one another. Each element of y represents the observed response for a particular subject-occasion, or for a particular unit within a cluster.
subj vector of same length as y, giving the subject (or cluster) indicators i for the elements of y. For example, suppose that y is in fact c(y1,y2,y3,y4) where length(y1)=2, length(y2)=3, length(y3)=2, and length(y4)=7. Then subj should be c(1,1,2,2,2,3,3,4,4,4,4,4,4,4).
occ vector of same length as y indicating the "occasions" for the elements of y. In a longitudinal dataset where each individual is measured on at most nmax distinct occasions, each element of y corresponds to one subject-occasion, and the elements of of occ should be coded as 1,2,...,nmax to indicate these occasion labels. (You should label the occasions as 1,2,...,nmax even if they are not equally spaced in time; the actual times of measurement will be incorporated into the matrix "pred" below.) In a clustered dataset, the elements of occ label the units within each cluster i, using the labels 1,2,...,ni.
pred matrix of covariates used to predict y. The number of rows should be length(y). The first column will typically be constant (one), and the remaining columns correspond to other variables appearing in Xi and Zi.
xcol vector of integers indicating which columns of pred will be used in Xi. That is, pred[,xcol] is the Xi matrices (stacked upon one another.
zcol vector of integers indicating which columns of pred will be used in Zi. That is, pred[,zcol] is the Zi matrices (stacked upon one another). If zcol=NULL then the model is assumed to have no random effects; in that case the parameters are estimated noniteratively by generalized least squares.
vmax optional matrix of dimension c(max(occ),max(occ)) from which the Vi matrices will be extracted. In a longitudinal dataset, vmax would represent the Vi matrix for an individual with responses at all possible occasions 1,2,...,nmax=max(occ); for individuals with responses at only a subset of these occasions, the Vi will be obtained by extracting the rows and columns of vmax for those occasions. If no vmax is specified by the user, an identity matrix is used. In most applications of this model one will want to have Vi = identity, so most of the time this argument can be omitted.
start optional starting values of the parameters. If this argument is not given then ecme() chooses its own starting values. This argument should be a list of three elements named "beta", "psi", and "sigma2". Note that "beta" should be a vector of the same length as "xcol", "psi" should be a matrix of dimension c(length(zcol),length(zcol)), and "sigma2" should be a scalar. This argument has no effect if zcol=NULL.
maxits maximum number of cycles of ECME to be performed. The algorithm runs to convergence or until "maxits" iterations, whichever comes first.
eps convergence criterion. The algorithm is considered to have converged if the relative differences in all parameters from one iteration to the next are less than eps–that is, if all(abs(new-old)<eps*abs(old)).
random.effects if TRUE, returns empirical Bayes estimates of all the random effects bi (i=1,2,...,m) and their estimated covariance matrices.

Value

a list containing estimates of beta, sigma2, psi, an estimated covariance matrix for beta, the number of iterations actually performed, an indicator of whether the algorithm converged, and a vector of loglikelihood values at each iteration. If random.effects=T, also returns a matrix of estimated random effects (bhat) for individuals and an array of corresponding covariance matrices.

beta vector of same length as "xcol" containing estimated fixed effects.
sigma2 estimate of error variance sigma2.
psi matrix of dimension c(length(zcol),length(zcol)) containing the estimated covariance matrix psi.
converged T if the algorithm converged, F if it did not
iter number of iterations actually performed. Will be equal to "maxits" if converged=F.
loglik vector of length "iter" reporting the value of the loglikelihood at each iteration.
cov.beta matrix of dimension c(length(xcol),length(xcol)) containing estimated variances and covariances for elements of "beta".
bhat if random.effects=T, a matrix with length(zcol) rows and m columns, where bhat[,i] is an empirical Bayes estimate of bi.
cov.b if random.effects=T, an array of dimension length(zcol) by length(zcol) by m, where cov.b[,,i] is an empirical Bayes estimate of the covariance matrix associated with bi.

References

Schafer, J.L. (1997) Imputation of missing covariates under a multivariate linear mixed model. Technical report, Dept. of Statistics, The Pennsylvania State University.

Examples

## Not run: 
For a detailed example, see the file "ecmeex.R" distributed
with this function.
## End(Not run)

[Package pan version 0.2-6 Index]