baselearners {mboost} | R Documentation |
Base learners to be utilized in the formula specification of gamboost().
bols(x, z = NULL, xname = NULL, zname = NULL)
bbs(x, z = NULL, df = 4, knots = NULL, degree = 3, differences = 2,
    center = FALSE, xname = NULL, zname = NULL)
bns(x, z = NULL, df = 4, knots = NULL, differences = 2,
    xname = NULL, zname = NULL)
bss(x, df = 4, xname = NULL)
bspatial(x, y, z = NULL, df = 5, xknots = NULL, yknots = NULL,
         degree = 3, differences = 2, center = FALSE,
         xname = NULL, yname = NULL, zname = NULL)
brandom(x, z = NULL, df = 4, xname = NULL, zname = NULL)
x |
a vector containing data, either numeric or a factor. |
y |
a vector containing numeric data |
z |
an optional vector containing numeric data. |
xname |
an optional string indicating the name of the variable whose
data values are given by the vector x. |
yname |
an optional string indicating the name of the variable whose
data values are given by the vector y. |
zname |
an optional string indicating the name of the variable whose
data values are given by the vector z. |
df |
trace of the hat matrix for the base learner, which defines the base learner's
complexity. Low values of df correspond to a large amount of smoothing and
thus to "weaker" base learners. Certain restrictions apply to the
specification of df, since most of the base learners rely on penalisation
approaches with a non-trivial null space. For example, for P-splines fitted with
bbs, df has to be larger than the order of differences employed in
the construction of the penalty term. However, with option center=TRUE,
the effect is centered around its unpenalized part and therefore any positive number
is admissible for df. |
knots |
either the number of (equidistant) interior knots to be used for
the regression spline fit or a vector including the positions of the interior
knots. If knots=NULL, the interior knots are chosen to be equidistant,
where the number of interior knots is defined in the same way as in
smooth.spline. |
xknots |
knots in x-direction when fitting a bivariate surface
with bspatial. See knots for details. |
yknots |
knots in y-direction when fitting a bivariate surface
with bspatial. See knots for details. |
degree |
degree of the regression spline. |
differences |
natural number between 1 and 3. If differences = k,
k-th-order differences are used as a penalty. |
center |
If center=TRUE, the corresponding effect is
re-parameterized such that the unpenalized part of the fit is subtracted and
only the deviation effect is fitted. The unpenalized, parametric part then has
to be included in separate base learners using bols (see the examples below). |
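The role of df as the trace of the hat matrix can be illustrated outside of mboost. The following is a minimal base R sketch, not mboost's internal code: it assumes a cubic B-spline basis on 20 equidistant interior knots with a second-order difference penalty (the setting described for bbs), and the helpers hat_trace and lambda_for_df are hypothetical names. It shows that the trace falls towards the order of differences as the smoothing parameter lambda grows, which is why df has to exceed that order unless center=TRUE is used.

```r
library(splines)  # part of the standard R distribution

set.seed(1)
x <- runif(100)

# Cubic B-spline basis on equidistant knots over [0, 1] (illustrative choice)
degree <- 3
n_interior <- 20
h <- 1 / (n_interior + 1)
knots <- seq(-degree * h, 1 + degree * h,
             length.out = n_interior + 2 * degree + 2)
B <- splineDesign(knots, x, ord = degree + 1, outer.ok = TRUE)

# Second-order difference penalty on the basis coefficients
D <- diff(diag(ncol(B)), differences = 2)
K <- crossprod(D)

# Hypothetical helper: trace of the hat matrix for smoothing parameter lambda
hat_trace <- function(lambda) {
  H <- B %*% solve(crossprod(B) + lambda * K, t(B))
  sum(diag(H))
}

# The trace decreases in lambda and approaches 2, the order of differences
sapply(c(0.01, 1, 100, 1e6), hat_trace)

# Hypothetical helper: smoothing parameter that yields a requested df
lambda_for_df <- function(df) {
  exp(uniroot(function(l) hat_trace(exp(l)) - df,
              interval = c(-5, 15), tol = 1e-10)$root)
}
lambda4 <- lambda_for_df(4)
hat_trace(lambda4)
```

Searching on the log scale keeps lambda positive and copes with the wide range of plausible values.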
bols refers to linear base learners (ordinary least squares fit), while
bbs, bns, and bss refer to penalized regression splines,
penalized natural splines, and smoothing splines, respectively. bspatial
fits bivariate surfaces and brandom defines random effects base learners.
In combination with option z, all base learners can be turned into varying
coefficient terms.
With bbs, the P-spline approach of Eilers and Marx (1996) is used.
bns uses the same penalty and interior knots as bbs,
but operates with a constrained natural spline basis instead of an
unconstrained B-spline basis. P-splines use a squared k-th-order difference
penalty which can be interpreted as an approximation of the integrated squared
k-th derivative of the spline. This approximation is only valid
if the knots are equidistant, so it is not recommended to use non-equidistant
knots for bbs and bns. bss refers to a smoothing spline
based on the smooth.spline function.
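The difference penalty described above can be written down in a few lines of base R. This is a sketch for illustration only, not mboost's internal code, and the coefficient vectors are made up. It builds the k-th-order difference matrix for k = 2 and checks that coefficient sequences that are linear in their index are not penalized, which is the non-trivial null space behind the restriction on df.

```r
p <- 10  # number of basis coefficients (illustrative)
k <- 2   # order of differences

# (p - k) x p matrix of k-th-order differences and the penalty matrix K = D'D
D <- diff(diag(p), differences = k)
K <- crossprod(D)

# Squared difference penalty for a coefficient vector beta
penalty <- function(beta) sum((D %*% beta)^2)

beta_linear <- 1 + 0.5 * seq_len(p)  # linear in the index: unpenalized
beta_quad   <- seq_len(p)^2          # quadratic: second differences all equal 2

penalty(beta_linear)  # 0
penalty(beta_quad)    # 32, i.e. (p - k) * 2^2

# The null space of K has dimension k
sum(eigen(K, symmetric = TRUE)$values < 1e-8)  # 2
```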
bspatial implements bivariate tensor product P-splines for the estimation
of either spatial effects (if x and y correspond to coordinates)
or interaction surfaces. The penalty term is constructed based on bivariate extensions
of the univariate penalties in x and y directions, see Kneib, Hothorn
and Tutz (2007) for details.
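One common way to extend a univariate difference penalty to two dimensions is to sum Kronecker products of the marginal penalties with identity matrices. The base R sketch below assumes that construction for illustration; it is not taken from mboost's code, so the cited report remains the reference for the definition actually used by bspatial. With second-order differences the unpenalized part has dimension four (constant, linear in x, linear in y, and their product).

```r
px <- 6; py <- 5  # number of marginal basis functions (illustrative)

# Marginal second-order difference penalties
Kx <- crossprod(diff(diag(px), differences = 2))
Ky <- crossprod(diff(diag(py), differences = 2))

# Assumed bivariate penalty: penalize differences along x and along y
K <- kronecker(Kx, diag(py)) + kronecker(diag(px), Ky)

# Coefficient surface that is the product of two linear trends;
# c(t(M)) orders the coefficients with the y index running fastest,
# matching the Kronecker structure above
M <- outer(seq_len(px), seq_len(py))
beta <- c(t(M))
drop(t(beta) %*% K %*% beta)  # 0: the product of linear trends is unpenalized

# Dimension of the unpenalized null space
sum(eigen(K, symmetric = TRUE)$values < 1e-8)  # 4
```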
brandom specifies a random effects base learner based on a factor variable
x that defines the grouping structure of the data set. For each level of
x, a separate random intercept is fitted, where the random effects variance is
governed by the specification of the degrees of freedom df.
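How df can govern a random intercept is easiest to see in the usual penalized view of random effects. The base R sketch below assumes that a random intercept is fitted as a ridge-penalized dummy coding of the grouping factor, with the hat matrix trace playing the role of df. This is an assumption made for illustration, not a description of mboost's internals, and ridge_df is a hypothetical helper.

```r
id <- factor(rep(1:10, each = 5))  # 10 groups with 5 observations each

# One indicator column per level of the grouping factor
Z <- model.matrix(~ id - 1)

# Hypothetical helper: trace of the ridge hat matrix Z (Z'Z + lambda I)^{-1} Z'
ridge_df <- function(lambda) {
  H <- Z %*% solve(crossprod(Z) + lambda * diag(ncol(Z)), t(Z))
  sum(diag(H))
}

ridge_df(0)    # 10: one free intercept per group
# With balanced groups the trace is 10 * 5 / (5 + lambda),
# so df = 4 corresponds to lambda = 7.5
ridge_df(7.5)  # 4
```

In this view a larger lambda shrinks the group intercepts more strongly towards zero, which corresponds to a smaller random effects variance and a smaller df.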
For all base learners except bols, the amount of smoothing is
determined by the trace of the hat matrix, as indicated by df. If z
is specified as an additional argument, a varying coefficients term is estimated,
where z is the interaction variable and the effect modifier is given by
either x or x and y. If only x is specified and one of the
nonparametric base learners bbs, bns or bss is used, this
corresponds to the classical situation of varying coefficients, where the
effect of z varies over the domain of x. In the case of bspatial as
base learner, the effect of z varies with respect to both
x and y, i.e. an interaction surface between x and
y is specified as effect modifier. For brandom, the
specification of z leads to the estimation of random slopes for covariate z
with the grouping structure defined by factor x, instead of a simple random intercept.
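In design-matrix terms, a varying coefficient term can be obtained by multiplying each row of the effect modifier's basis by the corresponding value of z. The base R sketch below illustrates the idea under that assumption. It uses splines::bs and model.matrix as stand-ins for the bases of the base learners and is not mboost's internal code.

```r
library(splines)  # part of the standard R distribution

set.seed(1)
n <- 50
x  <- runif(n)                     # effect modifier
z  <- runif(n)                     # interaction variable
id <- factor(rep(1:10, each = 5))  # grouping factor

# Smooth effect modifier: B-spline basis in x, each row scaled by z,
# so the fitted term has the form z * f(x)
B <- bs(x, df = 6, intercept = TRUE)
B_vc <- unclass(B) * z

# Random slopes: indicator columns for the groups, each row scaled by z,
# so each group gets its own slope for z
Z <- model.matrix(~ id - 1)
Z_slope <- Z * z

# Both unscaled bases have rows summing to 1, so the scaled rows sum to z
all.equal(unname(rowSums(B_vc)), z)     # TRUE
all.equal(unname(rowSums(Z_slope)), z)  # TRUE
```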
For bbs and bspatial, option center requests that the
fitted effect is centered around its parametric, unpenalized part. For
example, with a second-order difference penalty, a linear effect of x
remains unpenalized by bbs and therefore the degrees of freedom for the base learner
have to be larger than 2. To avoid this restriction, option center=TRUE
subtracts the unpenalized linear effect from the fit, so that any
positive number can be specified as df. Note that in this case the linear effect
x should generally be specified as an additional base learner
bols(x). For bspatial and, for example, second-order
differences, a linear effect of x (bols(x)), a linear effect of
y (bols(y)), and their interaction (bols(x*y)) are
subtracted from the effect and have to be added separately to the model
equation. More details on centering can be found in Kneib, Hothorn and Tutz
(2007) and Fahrmeir, Kneib and Lang (2004).
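The effect of centering on the admissible range of df can be sketched in base R. The code below assumes one standard decomposition of the P-spline coefficients into an unpenalized linear part and a penalized deviation, beta = X_unp gamma + X_pen delta with X_pen = D'(DD')^{-1}. It follows the general idea of the cited references but is an illustration, not mboost's internal code, and all object names are hypothetical. After the reparameterization the penalty on delta is the identity matrix, so the hat matrix trace of the deviation can be made arbitrarily small.

```r
library(splines)  # part of the standard R distribution

set.seed(1)
x <- runif(100)

# Cubic B-spline basis on 10 equidistant interior knots (illustrative choice)
degree <- 3
n_interior <- 10
h <- 1 / (n_interior + 1)
knots <- seq(-degree * h, 1 + degree * h,
             length.out = n_interior + 2 * degree + 2)
B <- splineDesign(knots, x, ord = degree + 1, outer.ok = TRUE)
p <- ncol(B)

# Second-order difference matrix and the assumed deviation basis
D <- diff(diag(p), differences = 2)
X_pen <- t(D) %*% solve(tcrossprod(D))
B_dev <- B %*% X_pen  # design matrix of the centered (deviation) effect

# In the new parameterization the penalty matrix is the identity
all.equal(t(X_pen) %*% crossprod(D) %*% X_pen, diag(p - 2))  # TRUE

# Trace of the hat matrix of the centered effect; it tends to 0 as lambda grows
dev_trace <- function(lambda) {
  H <- B_dev %*% solve(crossprod(B_dev) + lambda * diag(p - 2), t(B_dev))
  sum(diag(H))
}

# df = 1 is reachable here, whereas the uncentered second-order
# difference penalty keeps the trace above 2
lambda1 <- exp(uniroot(function(l) dev_trace(exp(l)) - 1,
                       interval = c(-10, 30), tol = 1e-10)$root)
dev_trace(lambda1)
```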
Either a matrix (in case of an ordinary least squares fit) or an object of
class basis (in case of a regression or smoothing spline fit) with a
dpp function as an additional attribute. The call of dpp returns
an object of class basisdpp.
Paul H. C. Eilers and Brian D. Marx (1996), Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), 89-121.
Ludwig Fahrmeir, Thomas Kneib and Stefan Lang (2004), Penalized structured additive regression for space-time data: a Bayesian perspective. Statistica Sinica, 14, 731-761.
Thomas Kneib, Torsten Hothorn and Gerhard Tutz (2007), Variable selection and model choice in geoadditive regression models. Technical Report No. 3, Institut fuer Statistik, LMU Muenchen. http://epub.ub.uni-muenchen.de/2063/
x1 <- rnorm(100)
x2 <- rnorm(100) + 0.25*x1
x3 <- as.factor(sample(0:1, 100, replace = TRUE))
y <- 3*sin(x1) + x2^2 + rnorm(100)
knots.x2 <- quantile(x2, c(0.25,0.5,0.75))

spline1 <- bbs(x1,knots=20,df=4)
attributes(spline1)

spline2 <- bns(x2,knots=knots.x2,df=5)
attributes(spline2)

olsfit <- bols(x3)
attributes(olsfit)

form1 <- y ~ bbs(x1,knots=20,df=4) + bns(x2,knots=knots.x2,df=5)

# example for bspatial
x1 <- runif(250,-pi,pi)
x2 <- runif(250,-pi,pi)
y <- sin(x1)*sin(x2) + rnorm(250, sd = 0.4)

spline3 <- bspatial(x1, x2, xknots=12, yknots=12)
attributes(spline3)

form2 <- y ~ bspatial(x1, x2, xknots=12, yknots=12)

# decompose spatial effect into parametric part and deviation with 1 df
form2 <- y ~ bols(x1) + bols(x2) + bols(x1*x2) +
    bspatial(x1, x2, xknots=12, yknots=12, center = TRUE, df=1)

# random intercept
id <- factor(rep(1:10, each=5))
raneff <- brandom(id)
attributes(raneff)

# random slope
z <- runif(50)
raneff <- brandom(id, z=z)
attributes(raneff)