icfit {interval} | R Documentation |
This function calculates the non-parametric maximum likelihood estimate (NPMLE) of the distribution from interval censored data using the self-consistent estimator, so the associated survival distribution generalizes the Kaplan-Meier estimate to interval censored data. Formulas using Surv are allowed, similar to survfit.
## S3 method for class 'formula':
icfit(formula, data, ...)
## Default S3 method:
icfit(L, R, initfit = NULL, control = icfitControl(), Lin = NULL, Rin = NULL, ...)
L |
numeric vector of left endpoints of censoring interval |
R |
numeric vector of right endpoints of censoring interval |
initfit |
an object of class icfit or icsurv, used for the initial estimate (see details) |
control |
list of arguments for controlling the algorithm (see icfitControl) |
Lin |
logical vector, should L be included in the interval? (see details) |
Rin |
logical vector, should R be included in the interval? (see details) |
formula |
a formula whose response is a numeric vector (which assumes no censoring) or a Surv object; the right side of the formula may be 1 or a factor (which produces separate fits for each level) |
data |
an optional matrix or data frame containing the variables in formula. By default the variables are taken from environment(formula). |
... |
values passed to other functions |
The icfit function fits the nonparametric maximum likelihood estimate (NPMLE) of the distribution function for interval censored data. In the default case (when Lin=Rin=NULL) we assume there are n (n=length(L)) failure times, and the ith one is in the interval between L[i] and R[i]. The default is not to include L[i] in the interval unless L[i]=R[i], and to include R[i] in the interval unless R[i]=Inf. When Lin and Rin are not NULL they describe whether to include L and R in the associated interval. If either Lin or Rin is length 1 then it is repeated n times, otherwise they should be logicals of length n.
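The default inclusion rule can be written out in a few lines of base R. This is only a sketch: default_inclusion is a hypothetical helper, not a function of the package, and it simply restates the rule above (L[i] included only when L[i]=R[i], R[i] included unless R[i]=Inf).

```r
# Hypothetical helper (not part of the package): restates the default
# endpoint-inclusion rule used when Lin = Rin = NULL.
default_inclusion <- function(L, R) {
  stopifnot(length(L) == length(R), all(L <= R))
  data.frame(
    L = L,
    R = R,
    Lin = L == R,    # L[i] is included only for exact observations
    Rin = R != Inf   # R[i] is included unless it is Inf
  )
}

# Four observations: two interval censored, one exact, one right censored
ends <- default_inclusion(L = c(0, 2, 5, 3), R = c(3, 5, 5, Inf))
ends
```

Under this rule the four intervals are (0,3], (2,5], [5,5] and (3,Inf).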
The algorithm is basically an EM algorithm applied to interval censored data (see Turnbull, 1976); however, a primary reduction is performed first (see Aragon and Eberly, 1992). Convergence is declared when the maximum reduced gradient is less than epsilon (see icfitControl) and the Kuhn-Tucker conditions are approximately met (see Gentleman and Geyer, 1994); otherwise a warning results. There are other, faster algorithms (for example, see EMICM in the package Icens).
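To make the self-consistent (EM) step concrete, here is a minimal base R sketch. It is not the code used by icfit: the function name em_npmle is hypothetical, there is no primary reduction, and it stops when the change in the probabilities is small rather than checking the reduced gradient and the Kuhn-Tucker conditions. It assumes an n by k indicator matrix A in which A[i, j] is 1 when the jth candidate interval lies inside the ith censoring interval.

```r
# Hypothetical sketch of the self-consistent (EM) iteration; this is a
# simplification, not the package's implementation.
em_npmle <- function(A, tol = 1e-8, maxit = 10000) {
  n <- nrow(A)
  k <- ncol(A)
  p <- rep(1 / k, k)                      # start from equal masses
  for (it in seq_len(maxit)) {
    denom <- as.vector(A %*% p)           # probability of each observed interval
    p_new <- p * colSums(A / denom) / n   # E step and M step in one update
    done <- max(abs(p_new - p)) < tol     # simplified stopping rule (assumption)
    p <- p_new
    if (done) break
  }
  list(pf = p, numit = it)
}

# Two observations known to be in interval 1, one in interval 2,
# and one that could be in either.
A <- rbind(c(1, 0), c(1, 0), c(0, 1), c(1, 1))
fit <- em_npmle(A)
fit$pf
```

For this small example the likelihood is proportional to p1^2 * p2, so the iteration should settle near p1 = 2/3 and p2 = 1/3.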
The output is of class icfit, which is identical to the icsurv class of the Icens package when there is only one group for which a distribution is needed. Following that class, there is an intmap element which gives the bounds about which each drop in the NPMLE survival function can occur. Since the classes icfit and icsurv are so closely related, one can directly use initial (and faster) fits from the Icens package as input in initfit. Note that when using a non-null initfit, the Lin and Rin values of the initial fit are ignored. The advantage of the icfit function is that it allows a call similar to that used in survfit of the survival package, so that different groups may be plotted at the same time with similar calls.
An icfit object prints as a list (see value below): the print method prints the output as a list, except that it suppresses printing of the A matrix. The summary method prints the distribution (i.e., the probabilities and the intervals where those probability masses are known to reside) for each group in the icfit object. There is also a plot method; see plot.icfit.
An object of class icfit (same as the icsurv class, see details).
A list with elements:
A |
the n by k matrix of indicator functions; NULL if there is more than one stratum; not printed by default |
strata |
a named numeric vector giving the number of observations in each stratum; if there is only one stratum, its element is named NPMLE |
error |
this is max(d + u - n), see Gentleman and Geyer, 1994 |
numit |
number of iterations |
pf |
vector of estimated probabilities of the distribution |
intmap |
2 by k matrix, where the ith column defines an interval corresponding to the probability, pf[i] |
converge |
a logical, TRUE if normal convergence |
message |
character text message about convergence |
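As an illustration of how the pf and intmap elements fit together, the sketch below turns them into the values of the survival function after each interval. The numbers are made up for illustration and are not output from icfit; it also assumes that the first row of intmap holds the left bounds and the second row the right bounds.

```r
# Made-up values shaped like the pf and intmap elements (not real output)
pf <- c(0.2, 0.5, 0.3)
intmap <- rbind(c(1, 4, 7),    # assumed: left bounds of each interval
                c(2, 5, 9))    # assumed: right bounds of each interval

# The survival function drops by pf[i] somewhere within the ith interval,
# so its value just after the ith right bound is 1 - sum(pf[1:i]).
surv_after <- 1 - cumsum(pf)

npmle <- data.frame(left = intmap[1, ], right = intmap[2, ],
                    prob = pf, surv = surv_after)
npmle
```

Between the right bound of one interval and the left bound of the next the survival estimate is constant; within an interval only the size of the drop is determined, not its location.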
Michael P. Fay
Aragon, J. and Eberly, D. (1992). On convergence of convex minorant algorithms for distribution estimation with interval-censored data. J. of Computational and Graphical Statistics, 1, 129-140.
Gentleman, R. and Geyer, C.J. (1994). Maximum likelihood for interval censored data: consistency and computation. Biometrika, 81, 618-623.
Turnbull, B.W. (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Statist. Soc. B, 38, 290-295.
data(bcos)
icout <- icfit(Surv(left, right, type = "interval2") ~ treatment, data = bcos)
plot(icout)
## can pick out just one group
plot(icout[1])