fitdistcens {fitdistrplus}R Documentation

Fitting of univariate distributions to censored data

Description

Fits a univariate distribution to censored data by maximum likelihood.

Usage

fitdistcens(censdata, distr, start,...)
## S3 method for class 'fitdistcens':
print(x,...)
## S3 method for class 'fitdistcens':
plot(x,...)
## S3 method for class 'fitdistcens':
summary(object,...)

Arguments

censdata A dataframe of two columns respectively named left and right, describing each observed value as an interval. The left column contains either NA for left censored observations, the left bound of the interval for interval censored observations, or the observed value for non-censored observations. The right column contains either NA for right censored observations, the right bound of the interval for interval censored observations, or the observed value for non-censored observations.
distr A character string "name" naming a distribution, for which the corresponding density function dname and the corresponding distribution function pname must be defined, or directly the density function.
start A named list giving the initial values of parameters of the named distribution. This argument may be omitted for some distributions for which reasonable starting values are computed (see details).
x an object of class 'fitdistcens'.
object an object of class 'fitdistcens'.
... further arguments to be passed to generic functions, or to the function "mledist" in order to control the optimization method.

Details

Maximum likelihood estimations of the distribution parameters are computed using the function mledist. By default direct optimization of the log-likelihood is performed using optim, with the "Nelder-Mead" method for distributions characterized by more than one parameter and the "BFGS" method for distributions characterized by only one parameter. The method used in optim may be chosen or another optimization method may be chosen using ... argument (see mledist for details). For the following named distributions, reasonable starting values will be computed if start is omitted : "norm", "lnorm", "exp" and "pois", "cauchy", "gamma", "logis", "nbinom" (parametrized by mu and size), "geom", "beta" and "weibull". Note that these starting values may not be good enough if the fit is poor. The function is not able to fit a uniform distribution. With the parameter estimates, the function returns the log-likelihood and the standard errors of the estimates calculated from the Hessian at the solution found by optim or by the user-supplied function passed to mledist.

The plot of an object of class "fitdistcens" returned by fitdistcens uses the function plotdistcens.

Value

fitdistcens returns an object of class 'fitdistcens', a list with following components,

estimate the parameter estimates
sd the estimated standard errors
cor the estimated correlation matrix
loglik the log-likelihood
aic the Akaike information criterion
bic the the so-called BIC or SBC (Schwarz Bayesian criterion)
censdata the censored dataset
distname the name of the distribution

Author(s)

Marie-Laure Delignette-Muller ml.delignette@vet-lyon.fr

References

Venables WN and Ripley BD (2002) Modern applied statistics with S. Springer, New York, pp. 435-446.

See Also

plotdistcens, optim, mledist and fitdist.

Examples

# (1) basic fit of a normal distribution on censored data
#

d1<-data.frame(
left=c(1.73,1.51,0.77,1.96,1.96,-1.4,-1.4,NA,-0.11,0.55,0.41,
    2.56,NA,-0.53,0.63,-1.4,-1.4,-1.4,NA,0.13),
right=c(1.73,1.51,0.77,1.96,1.96,0,-0.7,-1.4,-0.11,0.55,0.41,
    2.56,-1.4,-0.53,0.63,0,-0.7,NA,-1.4,0.13))
f1n<-fitdistcens(d1, "norm")
f1n
summary(f1n)
plot(f1n,rightNA=3)

# (2) defining your own distribution functions, here for the Gumbel distribution
# for other distributions, see the CRAN task view dedicated to probability distributions

dgumbel <- function(x,a,b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q,a,b) exp(-exp((a-q)/b))
qgumbel <- function(p,a,b) a-b*log(-log(p))
f1g<-fitdistcens(d1,"gumbel",start=list(a=0,b=2))
summary(f1g)
plot(f1g,rightNA=3)

# (3) comparison of fits of various distributions
# 

d3<-data.frame(left=10^(d1$left),right=10^(d1$right))
f3w<-fitdistcens(d3,"weibull")
summary(f3w)
plot(f3w,leftNA=0)
f3l<-fitdistcens(d3,"lnorm")
summary(f3l)
plot(f3l,leftNA=0)
f3e<-fitdistcens(d3,"exp")
summary(f3e)
plot(f3e,leftNA=0)

# (4) how to change the optimisation method?
#

fitdistcens(d3,"gamma",optim.method="Nelder-Mead")
fitdistcens(d3,"gamma",optim.method="BFGS") 
fitdistcens(d3,"gamma",optim.method="SANN") 
fitdistcens(d3,"gamma",optim.method="L-BFGS-B",lower=c(0,0)) 

# (5) custom optimisation function - example with the genetic algorithm
#
## Not run: 

    #wrap genoud function rgenoud package
    mygenoud <- function(fn, par, ...) 
    {
        require(rgenoud)
        res <- genoud(fn, starting.values=par, ...)        
        standardres <- c(res, convergence=0)
            
        return(standardres)
    }

    # call fitdistcens with a 'custom' optimization function
    fit.with.genoud<-fitdistcens(d3, "gamma", custom.optim=mygenoud, nvars=2,    
        Domains=cbind(c(0,0), c(10, 10)), boundary.enforcement=1, 
        print.level=1, hessian=TRUE)

    summary(fit.with.genoud)
## End(Not run)


[Package fitdistrplus version 0.1-2 Index]