coxpath {glmpath}    R Documentation

Fits the entire L1 regularization path for the Cox proportional hazards model

Description

This algorithm uses the predictor-corrector method to compute the entire regularization path for the Cox proportional hazards model with an L1 penalty.

Usage

  coxpath(data, nopenalty.subset = NULL, method = c("breslow", "efron"),
          lambda2 = 1e-5, max.steps = 10*min(n, m), max.norm = 100*m,
          min.lambda = (if (m >= n) 1e-3 else 0), max.vars = Inf,
          max.arclength = Inf, frac.arclength = 1, add.newvars = 1,
          bshoot.threshold = 0.1, relax.lambda = 1e-7,
          approx.Gram = FALSE, standardize = TRUE,
          function.precision = 3e-13, eps = .Machine$double.eps,
          trace = FALSE)

Arguments

data a list with components x: a matrix of features, time: the survival times, and status: the censoring status, 1 if died and 0 if censored.
nopenalty.subset a set of indices for the predictors that are not subject to the L1 penalty
method approximation method for tied survival times. Approximations derived by Breslow (1974) and Efron (1977) are available. Default is "breslow".
lambda2 regularization parameter for the L2 norm of the coefficients. Default is 1e-5.
max.steps an optional bound for the number of steps to be taken. Default is 10 * min{nrow(x), ncol(x)}.
max.norm an optional bound for the L1 norm of the coefficients. Default is 100 * ncol(x).
min.lambda an optional (lower) bound for the size of λ. When ncol(x) is relatively large, the coefficient estimates are prone to numerical precision errors at extremely small λ. In such cases, early stopping is recommended. Default is 0 for ncol(x) < nrow(x) cases and 1e-3 otherwise.
max.vars an optional bound for the number of active variables. Default is Inf.
max.arclength an optional bound for arc length (L1 norm) of a step. If max.arclength is extremely small, an exact nonlinear path is produced. Default is Inf.
frac.arclength Under the default setting, the next step size is computed so that the active set changes right at the next value of lambda. When frac.arclength is set to a fraction between 0 and 1, the step size is scaled down by that factor in arc length. For example, if frac.arclength = 0.2, the step length is adjusted so that the active set would change after five smaller steps. Either max.arclength or frac.arclength can be used to force the path to be more accurate. Default is 1.
add.newvars the number of candidate variables (currently not in the active set) that are used in the corrector step as potential active variables. Default is 1.
bshoot.threshold If the absolute value of a coefficient exceeds bshoot.threshold at the first corrector step in which it becomes nonzero, λ is considered to have been decreased too far and is increased again; i.e., a backward distance in λ that makes the coefficient zero is computed. Default is 0.1.
relax.lambda A variable joins the active set if |l'(β)| > λ*(1-relax.lambda). Default is 1e-7. If no variable joins the active set even after many (>20) steps, the user should increase relax.lambda to 1e-6 or 1e-5, but not more than that. This adjustment is sometimes needed because of numerical precision and error propagation problems. In general, the paths are less accurate with a relaxed lambda.
approx.Gram If TRUE, an approximated Gram matrix is used in predictor steps; each step requires fewer computations, but the total number of steps usually increases. This might be useful when the number of features is large.
standardize If TRUE, predictors are standardized to have unit variance.
function.precision the function precision parameter used in the internal solver. Default is 3e-13. With a relaxed (larger) function precision, the algorithm is faster but less accurate.
eps an effective zero
trace If TRUE, the algorithm prints out its progress.
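
As a hedged illustration of the data argument and of the defaults listed in Usage, the sketch below builds a list in the described layout from simulated values. The simulated data and the names sim.data, default.max.steps, default.max.norm and default.min.lambda are assumptions made for this example only; the call to coxpath is attempted only if the glmpath package is installed.

```r
# Simulated survival data; every name here is illustrative, not part of glmpath.
set.seed(1)
n <- 50                       # number of observations, nrow(x)
m <- 5                        # number of features, ncol(x)
x <- matrix(rnorm(n * m), nrow = n, ncol = m,
            dimnames = list(NULL, paste0("x", 1:m)))
time <- rexp(n, rate = exp(0.5 * x[, 1]))   # survival times depending on x1
status <- rbinom(n, size = 1, prob = 0.7)   # 1 = died, 0 = censored

# The list layout described for the data argument
sim.data <- list(x = x, time = time, status = status)

# Defaults implied by the Usage section for this n and m
default.max.steps  <- 10 * min(n, m)
default.max.norm   <- 100 * m
default.min.lambda <- if (m >= n) 1e-3 else 0

# Fit only if glmpath is available. frac.arclength = 0.2 requests about
# five smaller steps between changes of the active set.
if (requireNamespace("glmpath", quietly = TRUE)) {
  fit <- glmpath::coxpath(sim.data, method = "efron", frac.arclength = 0.2)
}
```

Because m < n here, the default min.lambda is 0; with more features than observations it would be 1e-3.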

Details

This algorithm implements the predictor-corrector method to determine the entire path of the coefficient estimates as the amount of regularization varies; it computes a series of solution sets, each time estimating the coefficients with less regularization, based on the previous estimate. The coefficients are estimated with no error at the knots, and the values are connected, thereby making the paths piecewise linear.
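
To make the role of l'(β) concrete, here is a minimal sketch of the log-partial-likelihood with the Breslow approximation and its gradient, written in base R. It is not the package's internal code: the helper names breslow_loglik and breslow_gradient and the toy data are hypothetical, and the sketch ignores standardization, lambda2 and nopenalty.subset. Under those assumptions, the largest absolute gradient component at β = 0 marks the value of λ at which the first variable can enter the path.

```r
# Breslow log-partial-likelihood for a list with components x, time, status.
breslow_loglik <- function(beta, data) {
  eta <- drop(data$x %*% beta)
  events <- which(data$status == 1)
  sum(vapply(events, function(i) {
    at.risk <- data$time >= data$time[i]      # risk set at the i-th event time
    eta[i] - log(sum(exp(eta[at.risk])))
  }, numeric(1)))
}

# Gradient l'(beta): observed covariates minus their risk-set weighted mean.
breslow_gradient <- function(beta, data) {
  w <- exp(drop(data$x %*% beta))
  grad <- numeric(ncol(data$x))
  for (i in which(data$status == 1)) {
    at.risk <- data$time >= data$time[i]
    xbar <- colSums(data$x[at.risk, , drop = FALSE] * w[at.risk]) / sum(w[at.risk])
    grad <- grad + data$x[i, ] - xbar
  }
  grad
}

# Toy data, simulated for illustration only
set.seed(2)
toy <- list(x = matrix(rnorm(40), nrow = 20, ncol = 2),
            time = rexp(20), status = rep(c(1, 0, 1, 1), 5))

g0 <- breslow_gradient(c(0, 0), toy)
lambda.start <- max(abs(g0))   # above this, all penalized coefficients stay at zero

# The joining rule from the relax.lambda argument, applied at beta = 0
relax.lambda <- 1e-7
lambda <- 0.9 * lambda.start
joins <- abs(g0) > lambda * (1 - relax.lambda)
```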

We thank Michael Saunders of SOL, Stanford University for providing the solver used for the convex optimization in corrector steps of coxpath.

Value

A coxpath object is returned.

lambda vector of λ values for which the exact coefficients are computed
lambda2 λ_2 used
step.length vector of step lengths in λ
corr matrix of l'(β) values (derivatives of the log-partial-likelihood)
new.df vector of degrees of freedom (to be used in the plot function)
df vector of degrees of freedom at each step
loglik vector of log-partial-likelihood computed at each step
aic vector of AIC values
bic vector of BIC values
b.predictor matrix of coefficient estimates from the predictor steps
b.corrector matrix of coefficient estimates from the corrector steps
new.A vector of boolean values indicating the steps at which the active set changed (to be used in the plot/predict functions)
actions actions taken at each step
meanx means of the columns of x
sdx standard deviations of the columns of x
xnames column names of x
method method used
nopenalty.subset nopenalty.subset used
standardize TRUE if the predictors were standardized before fitting
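
The components above can be combined to pick one point on the path, for example the step with the smallest AIC or BIC. The helper select_step below is a hypothetical sketch, not a glmpath function. It assumes that the rows of b.corrector correspond to steps, in the same order as lambda, df, aic and bic. The mock object uses made-up numbers purely to show the shape of the result.

```r
# Hypothetical helper: choose the step minimizing AIC or BIC from a
# coxpath-like list (assumes one row of b.corrector per step).
select_step <- function(fit, criterion = c("aic", "bic")) {
  criterion <- match.arg(criterion)
  best <- which.min(fit[[criterion]])
  list(step = best,
       lambda = fit$lambda[best],
       df = fit$df[best],
       coef = fit$b.corrector[best, ])
}

# Mock object with made-up values, standing in for a fitted path
mock <- list(lambda = c(3, 2, 1),
             df = c(0, 1, 2),
             aic = c(120, 112, 115),
             bic = c(120, 114, 119),
             b.corrector = rbind(c(0, 0), c(0.4, 0), c(0.6, -0.2)))

chosen <- select_step(mock, "aic")
```

With a real fit, the same call would be select_step(fit, "bic"), where fit is the object returned by coxpath.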

Author(s)

Mee Young Park and Trevor Hastie

References

Mee Young Park and Trevor Hastie (2007) L1 regularization path algorithm for generalized linear models. J. R. Statist. Soc. B, 69, 659-677.

See Also

cv.coxpath, plot.coxpath, predict.coxpath, summary.coxpath

Examples

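library(glmpath)
data(lung.data)
# Path with the default Breslow approximation for tied survival times
fit.a <- coxpath(lung.data)
# Path with the Efron approximation for tied survival times
fit.b <- coxpath(lung.data, method="efron")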

[Package glmpath version 0.94 Index]