BACON {robustX}    R Documentation

BACON for Regression or Multivariate Covariance Estimation

Description

BACON, short for ‘Blocked Adaptive Computationally-Efficient Outlier Nominators’, is a somewhat robust set of algorithms, implemented here for regression and for multivariate covariance estimation.

BACON() always applies the multivariate (covariance estimation) algorithm, using mvBACON(x); when y is not NULL, it adds a regression iteration phase, using the auxiliary .lmBACON() function.
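
The multivariate phase can be pictured as a loop that repeatedly refits a mean and covariance on a "basic subset" and then re-selects all observations that lie close to that fit. The base R sketch below illustrates this idea only; it is not the mvBACON() code. The function name bacon_sketch, the simulated data, and the exact cutoff (one reading of the correction factor in Billor et al., 2000, including how alpha enters) are assumptions made for illustration.

```r
## Simplified sketch of a BACON-style subset-growing loop (multivariate case).
## NOT the robustX implementation; the cutoff formula is an assumption.
bacon_sketch <- function(x, collect = 4, alpha = 0.95, maxsteps = 100) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  m <- min(collect * p, floor(n * 0.5))   # size of the initial basic subset
  h <- floor((n + p + 1) / 2)
  ## Assumed cutoff: corrected chi-square quantile; r = current subset size
  cutoff <- function(r) {
    c_np <- 1 + (p + 1) / (n - p) + 2 / (n - 1 - 3 * p)
    c_hr <- max(0, (h - r) / (h + r))
    (c_np + c_hr) * sqrt(qchisq(alpha / n, df = p, lower.tail = FALSE))
  }
  ## Initial basic subset: the m points with smallest Mahalanobis distance
  d <- sqrt(mahalanobis(x, colMeans(x), cov(x)))
  subset <- rank(d, ties.method = "first") <= m
  for (step in seq_len(maxsteps)) {
    xs <- x[subset, , drop = FALSE]
    ## Distances of ALL points, based on the current basic subset only
    d <- sqrt(mahalanobis(x, colMeans(xs), cov(xs)))
    new_subset <- d < cutoff(sum(subset))
    if (identical(new_subset, subset)) break   # subset is stable: stop
    subset <- new_subset
  }
  list(subset = subset, dis = d, steps = step)
}

set.seed(1)
x <- rbind(matrix(rnorm(200), ncol = 2),            # 100 regular points
           matrix(rnorm(10, mean = 10), ncol = 2))  # 5 planted outliers
res <- bacon_sketch(x)
which(!res$subset)   # observations nominated as potential outliers
```

The maxsteps guard plays the same role as the maxsteps argument above: it stops the loop even if the subset never stabilizes.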

Usage

BACON(x, y = NULL, intercept = TRUE,
      m = min(collect * p, n * 0.5),
      init.sel = c("Mahalanobis", "dUniMedian", "random", "manual"),
      man.sel, init.fraction = 0, collect = 4,
      alpha = 0.95, maxsteps = 100, verbose = TRUE)

## *Auxiliary* function:
.lmBACON(x, y, intercept = TRUE,
         init.dis, init.fraction = 0, collect = 4,
         alpha = 0.95, maxsteps = 100, verbose = TRUE)

Arguments

x a multivariate matrix of dimension [n x p] considered as containing no missing values.
y the response (n vector) in the case of regression, or NULL for the multivariate case.
intercept logical indicating if an intercept has to be used for the regression.
m integer in 1:n specifying the size of the initial basic subset; used only when init.sel is not "manual"; see mvBACON.
init.sel character string, specifying the initial selection mode; see mvBACON.
man.sel only when init.sel == "manual", the indices of observations determining the initial basic subset (and m <- length(man.sel)).
init.dis the distances of the x matrix used for the initial subset determined by mvBACON.
init.fraction if this parameter is > 0, the tedious steps of selecting the initial subset are skipped and an initial subset of size n * init.fraction is chosen (the observations with the smallest dis).
collect numeric factor chosen by the user to define the size of the initial subset (p * collect).
alpha significance level.
maxsteps the maximal number of iteration steps (to prevent infinite loops).
verbose logical indicating if messages are printed which trace progress of the algorithm.
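
As a worked example of the size arguments, the short base R snippet below evaluates the default for m given in Usage and the size implied by init.fraction. The values of n, p, and init.fraction are hypothetical, and rounding n * init.fraction down with floor() is an assumption, since the rounding rule is not stated here.

```r
## Hypothetical dimensions, chosen only for illustration
n <- 100; p <- 3
collect <- 4

## Default from the Usage section: m = min(collect * p, n * 0.5)
m_default <- min(collect * p, n * 0.5)

## For a small sample the n * 0.5 bound becomes the active one
m_small <- min(collect * p, 20 * 0.5)

## With init.fraction > 0 the initial subset has size n * init.fraction;
## floor() is an assumption made for this sketch
init.fraction <- 0.25
m_fraction <- floor(n * init.fraction)

c(m_default = m_default, m_small = m_small, m_fraction = m_fraction)
```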

Details

init.sel: the initial selection mode; implemented modes are:

"Mah" -> based on Mahalanobis distance (default)
"dis" -> based on the distances from the medians
"ran" -> based on a random selection
"man" -> based on manual selection; in this case the vector 'man.sel', which contains the indices of the selected observations, must be given.

"Mah" and "dis" are proposed by Hadi, while "ran" and "man" were implemented in order to study the behaviour of BACON.
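
The four selection modes can be mimicked in a few lines of base R. This is a sketch of the ideas only, not the code used by mvBACON(); the data matrix, the subset size m, and all variable names are hypothetical, and any extra scaling that the package may apply to the median-based distances is left out.

```r
set.seed(42)
x <- matrix(rnorm(60), ncol = 3)   # hypothetical 20 x 3 data matrix
m <- 8                             # hypothetical initial subset size

## "Mah": distances from the mean, scaled by the covariance matrix
d_mah <- sqrt(mahalanobis(x, center = colMeans(x), cov = cov(x)))
sel_mah <- order(d_mah)[seq_len(m)]

## "dis": Euclidean distances from the coordinate-wise medians
med <- apply(x, 2, median)
d_med <- sqrt(rowSums(sweep(x, 2, med)^2))
sel_med <- order(d_med)[seq_len(m)]

## "ran": a random subset of size m
sel_ran <- sample(nrow(x), m)

## "man": indices supplied by the user (stands in for 'man.sel')
sel_man <- c(1, 2, 5, 7, 11, 13, 17, 19)
```

In each case the result is a set of m row indices that would serve as the initial basic subset.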

Value

basically a list with components

subset the observation indices (in 1:n) denoting the subset of “good” observations.
tis ............
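
A typical use of the subset component is to separate the nominated outliers from the rest and refit on the "good" observations. The sketch below uses the built-in cars data and a made-up result list in place of real BACON() output; it accepts subset either as indices or as a logical vector, which is a defensive assumption rather than a statement about the returned type.

```r
## Made-up stand-in for a BACON() result; NOT real output
n <- nrow(cars)
res <- list(subset = setdiff(seq_len(n), c(49, 50)))

## Accept either an index vector or a logical vector of length n
good <- if (is.logical(res$subset)) which(res$subset) else res$subset
flagged <- setdiff(seq_len(n), good)

## Refit ordinary least squares on the "good" observations only
fit_good <- lm(dist ~ speed, data = cars[good, ])
coef(fit_good)
```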

Author(s)

Ueli Oetliker, Swiss Federal Statistical Office, for S-plus 5.1; 25.05.2001; modified six times till 17.6.2001.

Port to R, testing, etc., by Martin Maechler.

References

Billor, N., Hadi, A. S., and Velleman, P. F. (2000). BACON: Blocked Adaptive Computationally-Efficient Outlier Nominators. Computational Statistics and Data Analysis 34, 279–298.

See Also

mvBACON, the multivariate version of the BACON algorithm.

Examples

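library(robustX)     # provides BACON()
library(robustbase)  # provides lmrob() and the starsCYG data

data(starsCYG, package = "robustbase")
## Plot simple data and fitted lines
plot(starsCYG)
## Ordinary least squares fit, for comparison
lmST <- lm(log.light ~ log.Te, data = starsCYG)
## BACON regression; the outer parentheses print the result
(B.ST <- with(starsCYG, BACON(x = log.Te, y = log.light)))
## Robust regression fit from robustbase
(RlmST <- lmrob(log.light ~ log.Te, data = starsCYG))
abline(lmST, col = "red")    # least squares line
abline(RlmST, col = "blue")  # robust line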


[Package robustX version 1.1-2 Index]