blackboost {mboost}    R Documentation
Description

Gradient boosting for optimizing arbitrary loss functions, where regression trees are utilized as base learners.
Usage

## S3 method for class 'formula':
blackboost(formula, data = list(), weights = NULL, ...)

## S3 method for class 'matrix':
blackboost(x, y, weights = NULL, ...)

blackboost_fit(object,
               tree_controls = ctree_control(teststat = "max",
                                             testtype = "Teststatistic",
                                             mincriterion = 0,
                                             maxdepth = 2),
               fitmem = ctree_memory(object, TRUE),
               family = GaussReg(),
               control = boost_control(),
               weights = NULL)
Arguments

formula        a symbolic description of the model to be fit.

data           a data frame containing the variables in the model.

weights        an optional vector of weights to be used in the fitting
               process.

x              design matrix.

y              vector of responses.

object         an object of class boost_data, see boost_dpp.

tree_controls  an object of class TreeControl, which can be obtained
               using ctree_control. Defines hyper-parameters for the
               trees which are used as base learners. It is wise to
               make sure you understand the consequences of altering
               any of its arguments (see the sketch following this
               table).

fitmem         an object of class TreeFitMemory.

family         an object of class boost_family, implementing the
               negative gradient corresponding to the loss function to
               be optimized. By default, squared error loss for
               continuous responses is used.

control        an object of class boost_control which defines the
               hyper-parameters of the boosting algorithm.

...            additional arguments passed to callees.
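To illustrate how these control arguments fit together, here is a minimal sketch that passes customized tree and boosting controls through the formula interface (via ...); the specific values maxdepth = 3 and mstop = 100 are arbitrary illustrative choices, not recommendations.

## a minimal sketch: slightly deeper base-learner trees and a longer
## boosting run than the defaults; the values are illustrative only
tc <- ctree_control(teststat = "max", testtype = "Teststatistic",
                    mincriterion = 0, maxdepth = 3)
mod <- blackboost(dist ~ speed, data = cars,
                  tree_controls = tc,
                  control = boost_control(mstop = 100))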
Details

This function implements the `classical' gradient boosting algorithm, utilizing regression trees as base learners. Essentially, the same algorithm is implemented in package gbm. The main difference is that arbitrary loss functions to be optimized can be specified via the family argument to blackboost, whereas gbm uses hard-coded loss functions. Moreover, the base learners (conditional inference trees, see ctree) are a little bit more flexible.
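As an illustration of the family argument, the following sketch swaps the default squared error loss for absolute error loss; it assumes a Laplace() family constructor is available alongside the default GaussReg().

## a sketch, assuming the package provides a Laplace() family for
## absolute error (L1) loss in addition to the default GaussReg()
cars.l1 <- blackboost(dist ~ speed, data = cars,
                      family = Laplace(),
                      control = boost_control(mstop = 50))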
The regression fit is a black-box prediction machine and thus hardly interpretable.

Usually, the formula-based interface blackboost should be used. When necessary (for example, for cross-validation), the function blackboost_fit, operating on objects of class boost_data, is a faster alternative.
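The following sketches the lower-level blackboost_fit route; it assumes boost_dpp() accepts a formula and a data frame and returns the boost_data object that blackboost_fit() expects, which may differ in detail from the actual interface.

## a sketch of the lower-level interface, e.g. for repeated fits during
## cross-validation; boost_dpp() is assumed to take formula and data
cars.dpp <- boost_dpp(dist ~ speed, data = cars)
cars.fit <- blackboost_fit(cars.dpp,
                           control = boost_control(mstop = 50))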
Value

An object of class blackboost with print and predict methods being available.
References

Jerome H. Friedman (2001), Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29, 1189–1232.
Greg Ridgeway (1999), The state of boosting. Computing Science and Statistics, 31, 172–181.
Peter Bühlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.
Torsten Hothorn, Kurt Hornik and Achim Zeileis (2006). Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics, 15(3), 651–674.
Examples

### a simple two-dimensional example: cars data
cars.gb <- blackboost(dist ~ speed, data = cars,
                      control = boost_control(mstop = 50))
cars.gb

### plot fit
plot(dist ~ speed, data = cars)
lines(cars$speed, predict(cars.gb), col = "red")
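The predict method can also be applied to new observations; the sketch below assumes it accepts a newdata data frame, as is conventional for formula-based fits.

### a sketch: predictions for new speeds, assuming predict() accepts a
### 'newdata' argument in the usual way
nd <- data.frame(speed = c(5, 15, 25))
predict(cars.gb, newdata = nd)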