impute.rsf {randomSurvivalForest}    R Documentation

Random Survival Forest Impute Only Mode

Description

Imputation-only mode for right-censored survival and competing risk data using Random Survival Forests (RSF) (Ishwaran, Kogalur, Blackstone and Lauer, 2008). A random forest (Breiman, 2001) of survival trees is grown and used to impute missing data. No ensemble estimates or error rates are calculated.

Usage

    impute.rsf(formula,
      data = NULL,
      ntree = 1000,
      mtry = NULL,
      nodesize = NULL,
      splitrule = NULL,
      nsplit = 0,
      big.data = FALSE,
      nimpute = 1,
      predictorWt = NULL,
      seed = NULL,
      do.trace = FALSE,
      ...)

Arguments

formula A symbolic description of the model to be fit.
data Data frame containing the data to be imputed.
ntree Number of trees to grow.
mtry Number of variables randomly sampled at each split.
nodesize Minimum terminal node size.
splitrule Splitting rule used to grow trees.
nsplit Non-negative integer value used to specify random splitting.
big.data Set this value to TRUE for large data.
nimpute Number of iterations of the missing data algorithm.
predictorWt Weights used when selecting variables to split on.
seed Seed for random number generator.
do.trace Should trace output be enabled?
... Further arguments passed to or from other methods.

Details

Grows an RSF and uses it to impute missing data. All other calculations, such as ensemble estimates and error rates, are turned off. Use this function if your only interest is imputing the data.

All options are the same as for rsf; consult the rsf help file for details.

Value

Invisibly, the data frame containing the original data with the imputed data overlaid.

Author(s)

Hemant Ishwaran hemant.ishwaran@gmail.com and Udaya B. Kogalur kogalurshear@gmail.com

See Also

rsf.

Examples

## Not run: 
#------------------------------------------------------------
# Example 1:  Primary biliary cirrhosis (PBC) data
# Impute missing values with a forest grown using nsplit = 3

data(pbc, package = "randomSurvivalForest")
imputed.data <- impute.rsf(Survrsf(days, status) ~ ., data = pbc, nsplit = 3)
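
#------------------------------------------------------------
# Example 2 (illustrative sketch, not from the package authors):
# checking the result of the imputation.
# Assumes that pbc contains missing values and that the returned
# data frame keeps the column names of the input. count.missing
# is a hypothetical helper defined here, not part of the package.

count.missing <- function(df) colSums(is.na(df))

data(pbc, package = "randomSurvivalForest")
missing.before <- count.missing(pbc)

# The result is returned invisibly, so assign it to keep it.
# nimpute = 3 requests three iterations of the missing data algorithm.
imputed.data <- impute.rsf(Survrsf(days, status) ~ ., data = pbc,
                           nsplit = 3, nimpute = 3)
missing.after <- count.missing(imputed.data)

# Missing-value counts per variable, before and after imputation.
shared <- intersect(names(missing.before), names(missing.after))
print(cbind(before = missing.before[shared], after = missing.after[shared]))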
## End(Not run)

[Package randomSurvivalForest version 3.6.1 Index]