ltsReg {robustbase} | R Documentation |
Carries out least trimmed squares (LTS) regression.
ltsReg(x, ...)

## S3 method for class 'formula':
ltsReg(formula, data, ...,
       model = TRUE, x.ret = FALSE, y.ret = FALSE)

## Default S3 method:
ltsReg(x, y, intercept = TRUE, alpha = NULL, nsamp = 500,
       adjust = FALSE, mcd = TRUE, qr.out = FALSE, yname = NULL,
       seed = 0, use.correction = TRUE, control, ...)
formula |
a formula of the form y ~ x1 + x2 + ... . |
data |
data frame from which variables specified in
formula are to be taken. |
model, x.ret, y.ret |
logicals indicating whether the
model frame, the model matrix and the response are to be returned,
respectively. |
x |
a matrix or data frame containing the explanatory variables. |
y |
the response: a vector whose length equals the number of rows of x. |
intercept |
if TRUE, a model with a constant term will be
estimated; otherwise no constant term will be included. Default is
intercept = TRUE. |
alpha |
the fraction of squared residuals whose sum will be
minimized. Its default value is 0.5 (used when alpha = NULL). In
general, alpha must be a value between 0.5 and 1. |
nsamp |
number of subsets used for initial estimates, or one of the
strings "best" or "exact". Default is nsamp = 500. For
nsamp = "best", exhaustive enumeration is done, as long as the
number of trials does not exceed 5000. For nsamp = "exact",
exhaustive enumeration will be attempted however many samples are needed.
In this case a warning message will be displayed saying that the
computation can take a very long time. |
adjust |
whether to perform intercept adjustment at each step.
Since this can be time consuming, the default is adjust = FALSE . |
mcd |
whether to compute robust distances using Fast-MCD. |
qr.out |
whether to return the QR decomposition (see
qr); defaults to FALSE. |
yname |
the name of the dependent variable. Default is yname = NULL |
seed |
starting value for random generator. Default is seed = 0 |
use.correction |
whether to use finite sample correction factors.
Default is use.correction=TRUE |
control |
a list with estimation options, the same as those provided in the function specification. If the control object is supplied, the parameters from it will be used. If parameters are also passed in the invocation statement, they will override the corresponding elements of the control object. |
... |
arguments passed to or from other methods. |
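As a small illustration of the alpha argument described above, the following sketch fits the stackloss data twice and compares the number of observations that determine the estimator (the quan component). It assumes the robustbase package is installed; the object names fit_half and fit_75 are made up for this example.

```r
library(robustbase)

data(stackloss)

# Default alpha: roughly half of the observations determine the fit.
fit_half <- ltsReg(stack.loss ~ ., data = stackloss)

# Larger alpha: more squared residuals enter the trimmed sum.
fit_75 <- ltsReg(stack.loss ~ ., data = stackloss, alpha = 0.75)

# Compare the resulting subset sizes h (component 'quan').
c(default = fit_half$quan, alpha_0.75 = fit_75$quan)
```

A larger alpha trims fewer observations, which typically gains efficiency at the cost of a lower breakdown point.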
The LTS regression method minimizes the sum of the h smallest squared
residuals, where h must be at least half the number of
observations. The default value of h is roughly 0.5n, where n is the
total number of observations, but the user may choose any value
between n/2 and n. The LTS estimate of the error scale is given
by the minimum of the objective function multiplied by a consistency
factor and a finite sample correction factor; see Pison et al. (2002)
for details. The rescaling factors for the raw and final estimates are
also returned, in the vectors raw.cnp2 and cnp2 respectively, each of
length 2. The finite sample corrections can be suppressed
by setting use.correction=FALSE. The computations are performed
using the Fast LTS algorithm proposed by Rousseeuw and Van Driessen (1999).
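The objective described above can be sketched in a few lines of base R. This is only an illustration of the criterion, not the Fast LTS algorithm and not code from ltsReg; the helper lts_objective, the toy data and the choice h = 6 are assumptions made for this example.

```r
# Sum of the h smallest squared residuals for a candidate coefficient vector.
# 'lts_objective' is a hypothetical helper, not a robustbase function.
lts_objective <- function(beta, X, y, h) {
  r2 <- as.vector(y - X %*% beta)^2   # squared residuals of all n observations
  sum(sort(r2)[seq_len(h)])           # keep only the h smallest
}

# Toy data: y = 1 + 2x exactly, except for one gross outlier.
x <- 1:10
y <- 1 + 2 * x
y[10] <- 100
X <- cbind(1, x)                      # design matrix with a constant term

h <- 6                                # illustrative choice with n/2 <= h <= n

# The true line fits nine points exactly; the outlier is trimmed away.
obj_true <- lts_objective(c(1, 2), X, y, h)

# The ordinary least squares fit is pulled towards the outlier.
obj_ols <- lts_objective(coef(lm(y ~ x)), X, y, h)

c(true_line = obj_true, least_squares = obj_ols)
```

Because only the h smallest squared residuals are summed, the line through the nine clean points attains a smaller objective value than the least squares fit.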
As always, the formula interface has an implied intercept term which can be
removed either by y ~ x - 1 or y ~ 0 + x. See formula for more details.
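The effect of the two spellings on the design matrix can be checked with model.matrix from base R. The data frame d below is made up for this illustration.

```r
# Small made-up data set, used only to inspect the design matrices.
d <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

with_int <- model.matrix(y ~ x,     data = d)  # implied intercept column plus x
no_int_a <- model.matrix(y ~ x - 1, data = d)  # intercept removed
no_int_b <- model.matrix(y ~ 0 + x, data = d)  # same model, alternative spelling

colnames(with_int)
colnames(no_int_a)
colnames(no_int_b)
```

Both ways of dropping the intercept lead to the same single-column design matrix.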
The function ltsReg returns an object of class "lts".
The function summary is used to obtain and print
a summary table of the results.
The generic accessor functions coefficients, fitted.values and residuals
extract various useful features of the value returned by ltsReg.
An object of class lts is a list containing at
least the following components:
crit |
the value of the objective function of the LTS regression method, i.e., the sum of the h smallest squared raw residuals. |
coefficients |
vector of coefficient estimates (including the intercept by default when
intercept=TRUE), obtained after reweighting. |
best |
the best subset found and used for computing the raw estimates. The
size of best is equal to quan. |
fitted.values |
vector like y containing the fitted values
of the response after reweighting. |
residuals |
vector like y containing the residuals from
the weighted least squares regression. |
scale |
scale estimate of the reweighted residuals. |
alpha |
same as the input parameter alpha. |
quan |
the number h of observations which have determined the least trimmed squares estimator. |
intercept |
same as the input parameter intercept. |
cnp2 |
a vector of length two containing the consistency correction factor and the finite sample correction factor of the final estimate of the error scale. |
raw.coefficients |
vector of raw coefficient estimates (including
the intercept, when intercept=TRUE). |
raw.scale |
scale estimate of the raw residuals. |
raw.resid |
vector like y containing the raw residuals
from the regression. |
raw.cnp2 |
a vector of length two containing the consistency correction factor and the finite sample correction factor of the raw estimate of the error scale. |
lts.wt |
vector like y containing weights that can be used in a weighted least squares. These weights are 1 for points with reasonably small raw residuals, and 0 for points with large raw residuals. |
method |
character string naming the method (Least Trimmed Squares). |
X |
the input data as a matrix. |
Y |
the response variable as a vector. |
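A short sketch of how the components listed above might be inspected on a fitted object. It assumes the robustbase package is installed and uses the stackloss data; the object name fit is arbitrary.

```r
library(robustbase)

data(stackloss)
fit <- ltsReg(stack.loss ~ ., data = stackloss)

class(fit)                      # includes "lts"
coef(fit)                       # coefficient estimates after reweighting
fit$raw.coefficients            # raw estimates, before reweighting
length(fit$best) == fit$quan    # size of the best subset equals quan
which(fit$lts.wt == 0)          # observations with large raw residuals
head(residuals(fit))            # residuals from the weighted least squares fit

summary(fit)                    # summary table of the results
```

Comparing coef(fit) with fit$raw.coefficients shows how much the reweighting step changes the raw LTS estimates.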
Valentin Todorov valentin.todorov@chello.at, based on work written for S-plus by Peter Rousseeuw and Katrien van Driessen of the University of Antwerp.
Peter J. Rousseeuw (1984), Least Median of Squares Regression. Journal of the American Statistical Association 79, 871–881.
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
G. Pison, S. Van Aelst and G. Willems (2002) Small Sample Corrections for LTS and MCD. Metrika 55, 111–123.
covMcd; summary.lts for summaries.
The generic functions coef, residuals, fitted.
data(heart)
## Default method works with 'x'-matrix and y-var:
heart.x <- data.matrix(heart[, 1:2]) # the X-variables
heart.y <- heart[, "clength"]
ltsReg(heart.x, heart.y)

data(stackloss)
ltsReg(stack.loss ~ ., data = stackloss)