grpreg {grpreg}    R Documentation

Fit a group penalized regression path

Description

Fit paths for group lasso, group bridge, or group MCP at a grid of values of the regularization parameter lambda. Fits linear and logistic regression models.

Usage

grpreg(Data, penalty, lambda=NULL, n.lambda=100, lambda.min=NULL,
       lambda.max=NULL, lambda2=.001, eps=.005, max.iter=100,
       verbose=FALSE, monitor=NULL, warn.conv=TRUE, ...)

Arguments

Data A list containing: y, the response vector; X, the design matrix; family, either "gaussian" or "binomial", depending on the response; and group, a vector of consecutive integers describing the grouping of the coefficients (see example below). The design matrix should not contain an intercept; grpreg standardizes the data and includes an intercept by default.
penalty The penalty to be applied to the model. One of "gLasso" for group lasso, "gBridge" for group bridge, or "gMCP" for group MCP.
lambda A user supplied sequence of lambda values. Typically, this is left unspecified, and the function automatically computes a vector of lambda values that ranges uniformly on the log scale from lambda.min to lambda.max.
n.lambda The number of lambda values. Default is 100.
lambda.min The smallest value for lambda, as a fraction of lambda.max. Default is .001 if the number of observations is larger than the number of covariates and .05 otherwise.
lambda.max The largest value for lambda. Default is to have the algorithm calculate the smallest value for which all penalized coefficients are 0. This can be done exactly for group lasso and group MCP, but only guessed at for group bridge.
lambda2 By default, a small L2 penalty is included alongside the group penalty. lambda2 controls the magnitude of this penalty, as a fraction of lambda.
eps Convergence threshold. The algorithm iterates until the relative change in any coefficient is less than eps. Default is .005. See details.
max.iter Maximum number of iterations. Default is 100. See details.
warn.conv Should the function give a warning if it fails to converge? Default is TRUE. See details.
verbose Get a running update on what the algorithm is doing. Default is FALSE.
monitor Monitor the iterations of a vector of covariates. If set to a numeric vector, for example c(3,5), the algorithm will display iterates of the third and fifth covariates as it progresses.
... Other parameters specific to one of the penalized methods. These include delta, the amount by which the group lasso penalty is bounded away from 0 (defaults to 0.0005); a, the tuning parameter of the group MCP penalty (defaults to 3 for linear regression and 30 for logistic regression); and gamma, the tuning parameter of the group bridge penalty (defaults to 1/2).
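
Two of the argument conventions above can be sketched in base R: the log-uniform grid of lambda values and the requirement that group be a vector of consecutive integers. This is only an illustration, not grpreg's internal code. The helper names lambda_grid, default_lambda_min and check_group are hypothetical, the decreasing order of the grid is an assumption, and the values of n, p and lambda.max in the usage lines are made up.

```r
# Illustrative sketch in base R; not grpreg's internal code.

# Grid of n.lambda values, uniform on the log scale. lambda.min is a
# fraction of lambda.max, so the smallest value is lambda.min * lambda.max.
# ASSUMPTION: the grid runs from largest to smallest.
lambda_grid <- function(lambda.max, lambda.min, n.lambda = 100) {
  exp(seq(log(lambda.max), log(lambda.min * lambda.max), length.out = n.lambda))
}

# Documented default for lambda.min: .001 if there are more observations
# than covariates, .05 otherwise.
default_lambda_min <- function(n, p) if (n > p) 0.001 else 0.05

# Hypothetical helper: TRUE if 'group' has one entry per column of X and
# its labels are the consecutive integers 1, 2, ..., J.
check_group <- function(group, p) {
  length(group) == p &&
    all(group == round(group)) &&
    identical(sort(unique(as.integer(group))), seq_len(max(group)))
}

# Usage with made-up sizes (n = 200 observations, p = 16 covariates)
lam <- lambda_grid(lambda.max = 0.08,
                   lambda.min = default_lambda_min(n = 200, p = 16),
                   n.lambda = 5)
print(lam)
print(check_group(c(1,1,1,2,2,2,3,3,4,5,5,6,7,8,8,8), p = 16))
```

A grid built this way has a constant ratio between neighbouring lambda values, which is what "uniform on the log scale" means.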

Details

The sequence of models indexed by lambda is fit using a locally approximated coordinate descent algorithm. For logistic regression models, some care is taken to avoid overfitting to unstable models. The algorithm may exit early and give unstable results in this setting. The objective function is defined to be

1/(2*n) * RSS + penalty

for "gaussian" and

-1/n * loglik + penalty

for "binomial", where the likelihood is from a traditional generalized linear model for the log-odds of an event.
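
The two loss terms can be written out in a few lines of base R. This is a sketch for intuition only: the function names are hypothetical, standardization of the data is ignored, and the penalty shown (an unweighted sum of group Euclidean norms scaled by lambda) is an assumed stand-in, not necessarily the exact form grpreg uses for any of its penalties.

```r
# Illustrative sketch in base R; grpreg's internal computations may differ.

# Gaussian loss: residual sum of squares divided by 2n.
loss_gaussian <- function(y, X, beta0, beta) {
  r <- y - beta0 - drop(X %*% beta)
  sum(r^2) / (2 * length(y))
}

# Binomial loss: negative log-likelihood divided by n, for a logistic
# model of the log-odds (y coded 0/1).
loss_binomial <- function(y, X, beta0, beta) {
  eta <- beta0 + drop(X %*% beta)
  -sum(y * eta - log1p(exp(eta))) / length(y)
}

# ASSUMPTION: a plain group lasso penalty, lambda times the sum of the
# Euclidean norms of each group's coefficients (no group-size weights).
group_lasso_penalty <- function(beta, group, lambda) {
  lambda * sum(tapply(beta, group, function(b) sqrt(sum(b^2))))
}

# Usage on simulated data
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)
y <- rnorm(10)
group <- c(1, 1, 2, 2)
beta <- c(0.5, -0.5, 0, 0)
obj <- loss_gaussian(y, X, 0, beta) + group_lasso_penalty(beta, group, lambda = 0.1)
print(obj)
```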

This algorithm is stable and rapidly converges to values close to the solution. However, it also displays a linear rate of convergence (Newton-Raphson algorithms, in contrast, display quadratic rates of convergence), meaning that it may take a large number of iterations to reach accuracy out to many decimal places. Furthermore, some areas of the regularization path may contain models that are nonidentifiable or nearly singular. Thus, the algorithm may fail to satisfy convergence criteria at certain points while yielding accurate solutions over the region of interest. By default, the user is warned whenever convergence criteria are not met; these warnings are often distracting and can be turned off by setting warn.conv=FALSE (convergence can always be checked later by inspecting the value of iter). If models are not converging, consider increasing eps, increasing n.lambda, or increasing lambda.min before increasing max.iter.
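
The convergence rule and the after-the-fact check on iter can be sketched as follows. Both helpers are hypothetical. The exact definition of "relative change" is not given in this page, so the formula below is an assumption, as is the idea that a lambda value which failed to converge reports iter equal to max.iter. The iter vector in the usage lines is made up.

```r
# Illustrative sketch in base R; not grpreg's internal code.

# ASSUMPTION: relative change is |new - old| / |old|, with a tiny floor on
# the denominator so that coefficients equal to zero do not divide by zero.
# Converged when every coefficient changes by less than eps.
rel_change_converged <- function(beta_new, beta_old, eps = 0.005) {
  denom <- pmax(abs(beta_old), .Machine$double.eps)
  all(abs(beta_new - beta_old) / denom < eps)
}

# ASSUMPTION: a lambda value that hit the iteration cap did not converge.
# Returns the positions along the path where that happened.
not_converged <- function(iter, max.iter = 100) which(iter >= max.iter)

# Usage with made-up numbers
print(rel_change_converged(c(1.001, 0), c(1, 0)))  # change of 0.1 percent
print(rel_change_converged(c(1.1, 0), c(1, 0)))    # change of 10 percent
iter <- c(12, 30, 100, 45)                         # hypothetical fit$iter
print(not_converged(iter))
```

A check of this kind shows whether failures are isolated to a few lambda values or spread across the path, which helps in deciding whether eps, n.lambda or lambda.min needs adjusting.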

Value

An object with S3 class "grpreg" containing:

beta The fitted matrix of coefficients. The number of rows is equal to the number of coefficients, and the number of columns is equal to n.lambda.
beta.std Same as beta, only on the standardized scale.
lambda The sequence of lambda values in the path.
penalty The user-supplied value of penalty.
df A vector of length n.lambda containing estimates of the effective number of model parameters at all the points along the regularization path. For details on how this is calculated, see reference.
iter A vector of length n.lambda containing the number of iterations until convergence at each value of lambda.
par Processed list of parameters used by grpreg during fitting.
Data Processed list containing data used by grpreg during fitting.
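
As a sketch of how the beta and lambda components might be used, the code below counts the groups with at least one nonzero coefficient at each lambda and extracts the coefficients nearest a target lambda. The object fit here is a hand-made stand-in with made-up numbers, not output from grpreg. It is also an assumption that beta has no intercept row; if the first row holds the intercept, drop it before grouping.

```r
# Illustrative sketch in base R using a hypothetical stand-in object.
# Rows of beta are coefficients, columns are lambda values.
fit <- list(
  beta = matrix(c(0,   0,   0,   0,
                  0.4, 0.1, 0,   0,
                  0.6, 0.3, 0.2, 0), nrow = 4),
  lambda = c(0.08, 0.04, 0.02)
)
group <- c(1, 1, 2, 2)

# Number of groups with at least one nonzero coefficient at each lambda.
active_groups <- apply(fit$beta, 2, function(b) sum(tapply(b != 0, group, any)))
print(active_groups)

# Coefficients at the lambda value closest to a chosen target.
target <- 0.05
closest <- which.min(abs(fit$lambda - target))
print(fit$beta[, closest])
```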

Author(s)

Patrick Breheny <patrick-breheny@uiowa.edu>

References

Breheny, P. and Huang, J. (2008) Penalized Methods for Bi-level variable selection. Tech report No. 393, Department of Statistics and Actuarial Science, University of Iowa. http://www.stat.uiowa.edu/techrep/tr393.pdf

See Also

plot and select methods.

Examples

data(birthwt.grpreg)
Data.gaussian <- list(y=birthwt.grpreg$bwt,
                      X=as.matrix(birthwt.grpreg[,c(-1,-2)]),
                      family="gaussian",
                      group=c(1,1,1,2,2,2,3,3,4,5,5,6,7,8,8,8))
Data.binomial <- list(y=birthwt.grpreg$low,
                      X=as.matrix(birthwt.grpreg[,c(-1,-2)]),
                      family="binomial",
                      group=c(1,1,1,2,2,2,3,3,4,5,5,6,7,8,8,8))

fit1.gLasso <- grpreg(Data.gaussian,"gLasso")
fit1.gBridge <- grpreg(Data.gaussian,"gBridge",lambda.max=0.08)
fit1.gMCP <- grpreg(Data.gaussian,"gMCP")

fit2.gLasso <- grpreg(Data.binomial,"gLasso")
## An example of an unimportant failure to converge.
## Note that the plot looks fine, and that the algorithm fails to
## converge for only one value of lambda.
fit2.gBridge <- grpreg(Data.binomial,"gBridge",lambda.max=0.06)
fit2.gBridge$iter
plot(fit2.gBridge)
fit2.gMCP <- grpreg(Data.binomial,"gMCP")

select(fit2.gLasso)

[Package grpreg version 1.0 Index]