dw.filter {robfilter} | R Documentation |
Procedures for robust (online) extraction of low frequency components (the signal) from a univariate time series based on a moving window technique using two nested time windows in each step.
dw.filter(y, outer.width, inner.width, method = "all", scale = "MAD", d = 2, minNonNAs = 5, online = FALSE, extrapolate = TRUE)
y |
a numeric vector or (univariate) time series object. |
outer.width |
a positive integer specifying the window width
of the outer window used for determining the final estimate. If online=FALSE (see below) this needs to be an odd integer. |
inner.width |
a positive integer (not larger than outer.width)
specifying the window width of the inner window used for determining
the initial estimate and trimming features. If online=FALSE (see below) this needs to be an odd integer. |
method |
a (vector of) character string(s) containing the method(s) to be used for the estimation
of the signal level. It is possible to specify any combination of "MED", "RM", "MTM", "TRM",
"MRM", "DWRM", "DWMTM", "DWTRM", "DWMRM" and
"all" (for all of the above). Default is method="all".
For a detailed description see the section ‘Methods’ below. |
scale |
a character string specifying the method to be used for robust estimation of the local
variability (within one time window). Possible values are "MAD", "SN" and "QN"; the default is scale="MAD". |
d |
a positive integer defining the factor by which the current scale estimate is multiplied to
determine the trimming boundaries for outlier detection. Observations deviating more than d*σ_t from the current level approximation μ_t are replaced by μ_t, where σ_t denotes the current scale estimate. Default is d = 2, meaning a 2σ rule for outlier
detection. |
minNonNAs |
a positive integer defining the minimum number
of non-missing observations within each window which is required
for a ‘sensible’ estimation. Default: if windows contain
fewer than minNonNAs = 5 observations, NAs are returned. |
online |
a logical indicating whether the current level and
scale estimates are evaluated at the most recent time
within each (inner and outer) window (TRUE) or centred within
the windows (FALSE). Setting online=FALSE requires odd
inner.width and outer.width. Default is online=FALSE. |
extrapolate |
a logical indicating whether the level
estimations should be extrapolated to the edges of the time series. If online=FALSE the extrapolation consists of the
fitted values within the first half of the first window and the
last half of the last window; if online=TRUE the
extrapolation consists of all fitted values within the first
time window. |
dw.filter is suitable for extracting low frequency components (the signal)
from a time series which may be contaminated with outliers and can contain
level shifts. For this, moving window techniques are applied.
A short inner window of length inner.width is used in each step for
calculating an initial level estimate (by using either the median or a
robust regression fit) and a robust estimate of the local standard
deviation. Observations deviating strongly from this initial fit are
trimmed from an outer time window of length outer.width, and the signal
level is estimated from the remaining observations (by using either a
location or regression estimator). Values specified in method determine
which combination of estimation methods should be applied to the inner
and outer window (see section ‘Methods’ below).
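The two-step idea described above can be sketched for a single time point in base R. This is only an illustrative sketch, not the package's implementation: the helper name dw_step_sketch is hypothetical, it mimics a DWMTM-like step (median and MAD in the inner window, mean in the outer window), it assumes centred windows (online=FALSE) without missing values, and it drops the flagged observations before averaging.

```r
# Hypothetical helper (not part of robfilter): one centred double-window
# step at time index t, in the spirit of DWMTM.
dw_step_sketch <- function(y, t, outer.width = 31, inner.width = 11, d = 2) {
  h.in  <- (inner.width - 1) / 2            # half-width of the inner window
  h.out <- (outer.width - 1) / 2            # half-width of the outer window
  inner <- y[(t - h.in):(t + h.in)]
  outer <- y[(t - h.out):(t + h.out)]
  mu    <- median(inner)                    # initial level estimate
  sigma <- mad(inner)                       # robust local scale estimate
  keep  <- abs(outer - mu) <= d * sigma     # trimming boundaries mu +/- d * sigma
  mean(outer[keep])                         # final level from remaining observations
}

set.seed(1)
y <- c(rnorm(100), rnorm(100, mean = 10))   # level shift after t = 100
y[50] <- 50                                 # one gross outlier
dw_step_sketch(y, t = 50)                   # stays close to the true level 0
```

Because the outlier at t = 50 lies far outside the trimming boundaries, it does not enter the final mean, whereas an untrimmed mean over the same outer window would be pulled upwards.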
The applied method should be chosen based on an a-priori guess of the
underlying signal and the data quality: location based methods
(MED / MTM) are recommended in case of a locally (piecewise) constant
signal, regression based approaches (RM / DWRM / TRM / MRM) in case of
locally linear, monotone trends.
Since no big differences have been reported between TRM and MRM, the
quicker and somewhat more efficient TRM option might be preferred.
DWRM is the quickest of all regression based methods and performs better
than the ordinary RM at shifts, but it is the least robust and least
efficient method.
If location based methods are used, the inner.width should be at least
twice the length of expected patches of consecutive outliers in the time
series; if regression based methods are used, the inner.width should be
at least three times that length, otherwise outlier patches can influence
the estimations strongly. To increase the efficiency of the final
estimates, outer.width can then be chosen rather large, provided that it
is smaller than the time between subsequent level shifts.
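As a small worked example of this rule of thumb, the following sketch computes the smallest admissible inner.width for a given expected outlier patch length. The helper min_inner_width is hypothetical (it is not part of robfilter); rounding up to the next odd integer is added here only because online=FALSE requires odd window widths.

```r
# Hypothetical helper: smallest inner.width suggested by the rule of thumb.
min_inner_width <- function(patch.length, type = c("location", "regression")) {
  type <- match.arg(type)
  # twice the patch length for location based methods, three times for regression
  w <- if (type == "location") 2 * patch.length else 3 * patch.length
  # round up to the next odd integer, as required when online = FALSE
  if (w %% 2 == 0) w + 1 else w
}

min_inner_width(5, "location")    # 11
min_inner_width(5, "regression")  # 15
```

With expected patches of up to five consecutive outliers, this suggests inner.width = 11 for MED or MTM type filters and inner.width = 15 for the regression based ones.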
For robust scale estimation, MAD is the classical choice; SN is a
somewhat more efficient and almost equally robust alternative, while QN
is much more efficient if the window widths are not too small, and it
performs very well at the occurrence of level shifts.
The factor d, specifying the trimming boundaries as a multiple of the
estimated scale, can be chosen similarly to classical rules for detecting
unusual observations in a Gaussian sample. Choosing d=3 instead of d=2
increases efficiency, but decreases robustness; d=2.5 might be seen as a
compromise.
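To make the trade-off concrete, the following lines compute the two-sided Gaussian tail probability beyond d standard deviations, that is, the share of clean Gaussian observations that a d σ rule would flag. This is a sketch under the idealised assumption that the level and scale are known exactly; with estimated level and scale in short windows the actual shares will differ.

```r
# Share of clean Gaussian observations outside mu +/- d * sigma
d <- c(2, 2.5, 3)
tail.prob <- round(2 * pnorm(-d), 4)
tail.prob
# 0.0455 0.0124 0.0027
```

So d = 2 discards roughly 4.6% of clean observations, d = 3 only about 0.3%, which is why the larger value is more efficient but reacts less strictly to moderate outliers.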
dw.filter returns an object of class dw.filter.
An object of class dw.filter is a list containing the following components:
level |
a data frame containing the corresponding signal level extracted by the filter(s) specified in method. |
slope |
a data frame containing the corresponding slope within each time window. |
sigma |
a data frame containing
inner.loc.sigma, inner.reg.sigma, outer.loc.sigma and outer.reg.sigma,
the scale estimated from the observations (loc) or the residuals from the Repeated Median regression (reg)
within the inner window of length inner.width or the outer window of length outer.width, respectively.
MTM uses outer.loc.sigma for trimming outliers,
MRM and TRM use outer.reg.sigma for trimming outliers,
DWMTM uses inner.loc.sigma for trimming outliers,
DWMRM and DWTRM use inner.reg.sigma for trimming outliers;
MED, RM and DWRM require no scale estimation.
The function only returns values for inner.loc.sigma, inner.reg.sigma,
outer.loc.sigma or outer.reg.sigma if any specified method
requires their estimation; otherwise NAs are returned. |
In addition, the original input time series is returned as list member y,
and the settings used for the analysis are returned as the list members
outer.width, inner.width, method, scale, d, minNonNAs, online and
extrapolate.
Application of the function plot to an object of class dw.filter returns
a plot showing the original time series with the filtered output.
The following methods are available as method for signal extraction,
whereby the prefix DW denotes the fact that different window widths are
used in the first and second step of the calculations within one window
(i.e. inner.width < outer.width), while for the methods MED, RM, MTM, TRM
and MRM the first and second step take place in a window of fixed length
outer.width.
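MED
Running median: the median is applied to the observations in the whole window with outer.width.
RM
Repeated median: repeated median regression is applied to the observations in the whole window with outer.width.
MTM, DWMTM
Modified trimmed mean filters: in a first step the median is applied to (MTM): the whole window with outer.width or (DWMTM): the inner window with inner.width; in a second step the mean is applied to the (trimmed) observations in the whole window (with outer.width).
TRM, DWTRM
Trimmed repeated median filters: in a first step repeated median regression is applied to (TRM): the whole window with outer.width or (DWTRM): the inner window with inner.width; in a second step least squares regression is applied to the (trimmed) observations in the whole window (with outer.width).
MRM, DWMRM
Modified repeated median filters: in a first step repeated median regression is applied to (MRM): the whole window with outer.width or (DWMRM): the inner window with inner.width; in a second step another repeated median regression is applied to the (trimmed) observations in the whole window (with outer.width).
DWRM
Double window repeated median filter: in a first step repeated median regression is applied to the inner window with inner.width to determine the trend (slope); in a second step the median is applied to the trend corrected observations in the whole window with outer.width (without trimming).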
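Several of the methods above rely on repeated median regression. The following base R sketch shows the standard repeated median fit (slope as the median over i of the median over j ≠ i of the pairwise slopes, level as the median of the trend corrected observations) for one window. It is an illustration only: the function name rm_fit_sketch is hypothetical, the time axis is assumed to be centred on the window midpoint, and the package's own implementation may differ in detail.

```r
# Hypothetical helper: repeated median fit within a single window.
# x defaults to time points centred on the middle of the window.
rm_fit_sketch <- function(y, x = seq_along(y) - (length(y) + 1) / 2) {
  n <- length(y)
  # for each point i: median of the slopes to all other points j != i
  inner.med <- vapply(seq_len(n), function(i) {
    median((y[-i] - y[i]) / (x[-i] - x[i]))
  }, numeric(1))
  slope <- median(inner.med)        # repeated median slope
  level <- median(y - slope * x)    # level at the window centre
  c(level = level, slope = slope)
}

y <- 3 + 0.5 * (-5:5)               # exact line: level 3, slope 0.5
y[c(2, 9)] <- c(40, -40)            # two gross outliers
rm_fit_sketch(y)                    # level 3, slope 0.5
```

Despite two of the eleven observations being gross outliers, the fit recovers the underlying line exactly, which is the property the trimming steps of TRM, MRM and their DW variants build on.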
Missing values are treated by omitting them and thus by reducing
the corresponding window width.
MED, RM, MTM, TRM and MRM require at least minNonNAs non-missing
observations in each outer window; DWRM, DWMTM, DWTRM and DWMRM require
at least minNonNAs non-missing observations in each inner window.
Otherwise NAs are returned for level, slope and sigma.
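The treatment of missing values can be illustrated with a running median in one window. This is a minimal sketch, assuming that missing values are simply dropped and that NA is returned when too few observations remain; the helper window_median_sketch is hypothetical and not part of robfilter.

```r
# Hypothetical helper: median of one window with the minNonNAs rule.
window_median_sketch <- function(window, minNonNAs = 5) {
  obs <- window[!is.na(window)]               # omit missing values
  if (length(obs) < minNonNAs) return(NA_real_)  # too few observations left
  median(obs)
}

window_median_sketch(c(1, NA, 3, NA, 2, 4, NA, 5))   # 5 non-missing values: 3
window_median_sketch(c(1, NA, NA, NA, 2, 4, NA, 5))  # 4 non-missing values: NA
```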
Roland Fried and Karen Schettlinger
Bernholt, T., Fried, R., Gather, U., Wegener, I. (2006) Modified Repeated Median Filters, Statistics and Computing 16, 177-192. (earlier version: http://www.sfb475.uni-dortmund.de/berichte/tr46-04.ps)
Schettlinger, K., Fried, R., Gather, U. (2006) Robust Filters for Intensive Care Monitoring: Beyond the Running Median, Biomedizinische Technik 51(2), 49-56.
robreg.filter, robust.filter, hybrid.filter, wrm.filter.
# Load the package providing dw.filter:
library(robfilter)
# Generate random time series:
y <- cumsum(runif(500)) - .5*(1:500)
# Add jumps:
y[200:500] <- y[200:500] + 5
y[400:500] <- y[400:500] - 7
# Add noise:
n <- sample(1:500, 30)
y[n] <- y[n] + rnorm(30)
# Filtering with all methods:
y.dw <- dw.filter(y, outer.width=31, inner.width=11, method="all")
# Plot:
plot(y.dw)
# Filtering with trimmed RM and double window TRM only:
y2.dw <- dw.filter(y, outer.width=31, inner.width=11, method=c("TRM","DWTRM"))
plot(y2.dw)