segmented {segmented}R Documentation

Segmented relationships in regression models

Description

Fits regression models with segmented relationships between the response and one or more explanatory variables. Break-point estimates are provided.

Usage

segmented(obj, Z, psi, W, it.max=20, toll=0.0001, 
    visual=FALSE, last=TRUE, ...)

Arguments

obj a standard `linear' model of class "lm" or "glm".
Z a vector or a matrix meaning the (continuous) explanatory variable(s) having segmented relationships with the response of obj.
psi starting values for the break-point(s).
W an optional categorical variable specifying levels in which a segmented relationship is assumed.
it.max the maximum number of iterations allowed.
toll the tolerance value used to stop the algorithm.
visual logical value indicating whether the process fitting should be printed at each iteration.
last logical value indicating whether only the last object should be returned.
... further arguments passed to lm or glm when the algorithm stops.

Details

Given a linear regression model (of class "lm" or "glm"), segmented estimates a new model having a broken-line relationship with a specified variable Z. Such a segmented or broken-line relationship is defined by the slope parameters and the break-points where the linear relation changes. Initial guesses for the break-points must be specified in the psi argument. The model is estimated simultaneously yielding point estimates and relevant (Wald-based) standard errors of all the model parameters, including the break-points.

Value

The returned object depends on the last argument. If last=TRUE, the default, segmented returns an object of class "segmented" which inherits from the class "lm" or "glm" depending on the class of obj. If last=FALSE the returned object is just a list in which the ith component is the model fitted at the ith approximation. Therefore the last component is the fitted model when the algorithm converges. The function summary (i.e., summary.segmented) can be used to obtain or print a summary of the results.
If Z is a vector and W is missing, psi specifies the number of break-points to be estimated with respect to the single variable Z; if Z is a vector and W is provided, then a segmented relationship within each level of W is assumed: in this case length(psi) has to equal the number of levels of W and the ith component of psi is assumed as starting value of the break-point inside the ith group. If Z is a matrix, W has to be missing and the ith component of psi is the starting value of the break-point for the variable Z[,i].
An object of class "segmented" is a list containing the components of the original object obj with additionally the followings:

psi Estimated break-points and relevant standard errors
it The number of iterations employed
epsilon difference in the objective function when the algorithm stops

Warning

It is well-known that the log-likelihood function for the break-point may be not concave, especially for poor clear-cut kink-relationships. In these circumstances the initial guess for the break-point, i.e. the psi argument, must be provided with care. For instance visual inspection of a, possibly smoothed, scatter-plot is usually a good way to obtain some idea on breakpoint location.

Note

  1. The algorithm will start if the it.max argument is greater than zero. If it.max=0 segmented will estimate a new linear model with break-point(s) fixed at psi.
  2. In the returned object, the name of the difference-in-slopes parameter is labelled with `U.name_of_variable'.
  3. Currently the functions can not handle missing values, thus if obj has been created with argument na.action=na.omit, segmented will not work.
  4. Methods specific to the class "segmented" are

    Others are inherited from the class "lm" or "glm" depending on the class of obj.

  5. Currently multiple break-points, namely a relationships with multiple changes, are allowed only if a single variable has to be analyzed, that is if Z is a vector and W is missing.

Author(s)

Vito M. R. Muggeo, vito.muggeo@giustizia.it

References

Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055–3071.

See Also

lm, glm

Examples

## A linear model with a segmented relationship in each level of 
## a categorical variable
##
## Simulate data
x<-1:100
g<-rep(0:1,c(100,100))
set.seed(15)
y1<- 2+1.5*pmax(x-60,0)+rnorm(100,0,5)
y2<- 2+0.6*pmax(x-60,0)+rnorm(100,0,5)
dati<-data.frame(xx=rep(x,2),yy=c(y1,y2),g=factor(g))
rm(x,g,y1,y2)
## Have a look at the plot
## Not run: plot(dati$xx,dati$yy) 
## Not run: points(dati$xx[dati$g==0],dati$yy[dati$g==0],col=2)
## Fit the linear model
obj<-lm(yy~0+g+xx:g,data=dati) 
## Fit segmented models
## Model I: ignore the stratification, assume an equal difference-in-slopes 
## parameter between groups
ogg0<-segmented(obj,Z=xx,psi=60)
## Model II: now stratificate by g. Here note that psi[i] refers to the 
## segmented relationship in the ith level of W
ogg1<-segmented(obj,Z=xx,W=g,psi=c(50,70),visual=TRUE)
## Results....
ogg0
summary(ogg1)
## Have a look at the fitted models
## model I
## Not run: points(dati$xx[dati$g==0],ogg0$fitted[dati$g==0],pch=3,col=2)
## Not run: points(dati$xx[dati$g==1],ogg0$fitted[dati$g==1],pch=3,col=1)
## model II
## Not run: points(dati$xx[dati$g==0],ogg1$fitted[dati$g==0],pch=20,col=2)
## Not run: points(dati$xx[dati$g==1],ogg1$fitted[dati$g==1],pch=20,col=1)
## Not run: legend(5,60,legend=c("model I","model II"),pch=c(3,20))

[Package segmented version 0.1-4 Index]