glmboost {mboost}                                        R Documentation

Description:

Gradient boosting for optimizing arbitrary loss functions, where component-wise linear models are utilized as base learners.
Usage:

## S3 method for class 'formula':
glmboost(formula, data = list(), weights = NULL,
         contrasts.arg = NULL, na.action = na.pass, ...)

## S3 method for class 'matrix':
glmboost(x, y, weights = NULL, ...)

glmboost_fit(object, family = GaussReg(),
             control = boost_control(), weights = NULL)

## S3 method for class 'glmboost':
plot(x, main = deparse(x$call), col = NULL, ...)
Arguments:

formula
    a symbolic description of the model to be fit.

data
    a data frame containing the variables in the model.

weights
    an optional vector of weights to be used in the fitting process.

contrasts.arg
    a list whose entries are contrasts suitable for input to the
    contrasts replacement function and whose names are the names of
    columns of data containing factors; see model.matrix.default.

na.action
    a function which indicates what should happen when the data
    contain NAs.

x
    design matrix, or an object of class glmboost for plotting.

y
    vector of responses.

object
    an object of class boost_data; see boost_dpp.

family
    an object of class boost_family implementing the negative gradient
    corresponding to the loss function to be minimized. By default,
    squared error loss for continuous responses is used.

control
    an object of class boost_control.

main
    a title for the plot.

col
    (a vector of) colors for the lines representing the coefficient
    paths.

...
    additional arguments passed to callees.
Details:

A (generalized) linear model is fitted using a boosting algorithm based on component-wise univariate linear models. The fit, i.e., the regression coefficients, can be interpreted in the usual way. The methodology is described in Bühlmann and Yu (2003), Bühlmann (2006), and Bühlmann and Hothorn (2007). In particular, after sufficiently many boosting iterations the coefficients essentially coincide with those of an ordinary least-squares fit, as sketched below.
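A quick illustration, drawn from the Examples section below, comparing the boosted coefficients against lm():

    cars.gb <- glmboost(dist ~ speed, data = cars,
                        control = boost_control(mstop = 5000))
    coef(cars.gb) + c(cars.gb$offset, 0)  # add the offset back to the intercept
    coef(lm(dist ~ speed, data = cars))   # should essentially coincide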
The function glmboost_fit provides access to the fitting procedure without data pre-processing, e.g., for cross-validation; see the sketch below.
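A minimal sketch of this interface, assuming that boost_dpp() accepts the usual formula/data pair and returns the boost_data object expected by glmboost_fit (see the Arguments section; the exact boost_dpp arguments are an assumption):

    dpp <- boost_dpp(dist ~ speed, data = cars)   # pre-process the data once
    fit <- glmboost_fit(dpp, family = GaussReg(),
                        control = boost_control(mstop = 100))
    coef(fit)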
Value:

An object of class glmboost, for which print, coef, AIC and predict methods are available.
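For example, the AIC method can be used to choose a stopping iteration along the boosting path (a sketch; the mstop() extractor applied to the returned AIC object is an assumption):

    aic <- AIC(cars.gb)  # AIC evaluated along the boosting path
    aic
    mstop(aic)           # iteration with minimal AIC (assumed extractor)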
For inputs with long variable names, you may want to enlarge par("mai") before calling the plot method of glmboost objects, which visualizes the coefficient paths.
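For instance, mirroring the last lines of the Examples section:

    par(mai = par("mai") * c(1, 1, 1, 2.5))  # widen the right margin for labels
    plot(cars.gb)                            # coefficient paths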
References:

Peter Bühlmann and Bin Yu (2003), Boosting with the L2 loss: regression and classification. Journal of the American Statistical Association, 98, 324–339.

Peter Bühlmann (2006), Boosting for high-dimensional linear models. The Annals of Statistics, 34(2), 559–583.

Peter Bühlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.
Examples:

### a simple two-dimensional example: cars data
cars.gb <- glmboost(dist ~ speed, data = cars,
                    control = boost_control(mstop = 5000))
cars.gb

### coefficients should coincide
coef(cars.gb) + c(cars.gb$offset, 0)
coef(lm(dist ~ speed, data = cars))

### plot fit
layout(matrix(1:2, ncol = 2))
plot(dist ~ speed, data = cars)
lines(cars$speed, predict(cars.gb), col = "red")

### alternative loss function: absolute loss
cars.gbl <- glmboost(dist ~ speed, data = cars,
                     control = boost_control(mstop = 5000),
                     family = Laplace())
cars.gbl
coef(cars.gbl) + c(cars.gbl$offset, 0)
lines(cars$speed, predict(cars.gbl), col = "green")

### Huber loss with adaptive choice of delta
cars.gbh <- glmboost(dist ~ speed, data = cars,
                     control = boost_control(mstop = 5000),
                     family = Huber())
lines(cars$speed, predict(cars.gbh), col = "blue")
legend("topleft", col = c("red", "green", "blue"), lty = 1,
       legend = c("Gaussian", "Laplace", "Huber"), bty = "n")

### plot coefficient path of glmboost
par(mai = par("mai") * c(1, 1, 1, 2.5))
plot(cars.gb)