kzs {kzs} | R Documentation |
The kzs function is designed to smooth a data set of paired values (Xi, Yi), in
which the response variable, Y, is contaminated with noise.
kzs(y, x, delta, d, k = 1, edges = FALSE, plot = TRUE)
y |
a 1-dimensional vector of real values representing the response variable that is to be smoothed. |
x |
a 1-dimensional vector of real values representing the input variable. This vector must be the same length
as the response vector, y.
|
delta |
the physical range of smoothing in terms of unit values of x. The algorithm is designed to smooth only
the points that lie within this range, while leaving points outside of this range untouched.
|
d |
a positive real number denoting a scale reading along x. This value defines a uniform scale overlapping
x, on which each delta-range is based.
|
k |
an integer specifying the number of iterations kzs will execute; k may also be
interpreted as the order of smoothness (as a polynomial of degree k-1). By default, k = 1.
|
edges |
a logical indicating whether or not to display the outcome data beyond the initial range of x. By
default, edges = FALSE. Further details on this will be documented.
|
plot |
a logical indicating whether or not to produce a plot of the kzs outcome. This is TRUE
by default.
|
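The constraints stated in the argument list can be checked before calling kzs. The helper below is a hypothetical sketch and not part of the kzs package: the name check_kzs_inputs is invented for illustration, and the requirement that delta be positive is an assumption (the text states positivity only for d). Any further restrictions on delta and d are not reproduced here.

```r
# Hypothetical pre-flight check; NOT part of the kzs package.
check_kzs_inputs <- function(y, x, delta, d, k = 1) {
  stopifnot(
    is.numeric(y), is.numeric(x),
    length(y) == length(x),                           # x must be as long as y
    is.numeric(d), length(d) == 1, d > 0,             # d: positive real number
    is.numeric(delta), length(delta) == 1, delta > 0, # assumption: delta positive
    is.numeric(k), length(k) == 1, k >= 1, k == round(k)  # k: whole number of iterations
  )
  invisible(TRUE)
}

x <- seq(0, 10, by = 0.5)
y <- sin(x)
check_kzs_inputs(y, x, delta = 2, d = 0.1, k = 3)  # passes silently
```

A failed check stops with an error naming the violated condition, which is usually easier to read than an error raised deep inside a smoothing loop.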
Practical research often calls for the relation between variables Y and X as a function of the current
value X = x, namely Y(x). Usually we search for some simple function, Y(x), given a data set of pairs
(Xi, Yi). When plotted, these pairs frequently look noisy, so Y(x) should be a smooth outcome derived
from the original data, capturing significant patterns in the data while leaving out the noise. The kzs
function estimates a solution to this problem through the use of splines, a particular nonparametric
estimator of a function. Given a data set of pairs (Xi, Yi), splines estimate the smooth values of Y
from the X's. More specifically, kzs averages all values of Y for all X within the range delta around
each scale reading di along X. The kzs algorithm is designed to smooth all fast fluctuations in Y within
the delta-range in X, while leaving variation over ranges longer than delta untouched. The separation of
short scales (less than delta) from long scales (more than delta) becomes more effective as k increases,
while the effective range of separation becomes delta*sqrt(k).
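The averaging described above can be sketched in base R. This is a minimal illustration, not the kzs package's implementation: the helper names window_average and kzs_sketch are hypothetical, the window is assumed to be centred on each scale reading with total width delta, and the scale is assumed to run from min(x) to max(x) in steps of d.

```r
# Hypothetical helpers; NOT the kzs package's implementation.
# One pass: for each grid point g, average every y whose x lies within
# delta/2 of g (assumed: a window of total width delta centred on g).
window_average <- function(x, y, grid, delta) {
  vapply(grid, function(g) {
    in_window <- abs(x - g) <= delta / 2
    if (any(in_window)) mean(y[in_window]) else NA_real_
  }, numeric(1))
}

# k passes: the first pass maps the raw data onto the uniform scale,
# each later pass re-smooths the values already on that scale.
kzs_sketch <- function(y, x, delta, d, k = 1) {
  grid <- seq(from = min(x), to = max(x), by = d)
  yk <- window_average(x, y, grid, delta)
  for (i in seq_len(k - 1)) {
    keep <- !is.na(yk)
    yk <- window_average(grid[keep], yk[keep], grid, delta)
  }
  data.frame(xk = grid, yk = yk)
}

# Noisy sine observed at irregular x values
set.seed(1)
x <- sort(runif(500, min = 0, max = 10))
y <- sin(x) + rnorm(500, mean = 0, sd = 0.5)
fit <- kzs_sketch(y, x, delta = 1, d = 0.1, k = 3)
head(fit)
```

With delta = 1 and k = 3, the effective range of separation quoted in the text, delta*sqrt(k), is roughly 1.73 units of x.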
kzs returns a two-column data frame of paired values (xk, yk):
xk |
x values in increments of d |
yk |
smoothed response values resulting from k iterations of kzs |
A data set (Xi, Yi) must be provided, usually as observations that occur at certain times; kzs
is designed for the general situation, including time series data. In many applications where the input
variable, x, is time, kzs resolves the problem of missing values in time series or
irregularly observed values in longitudinal data analysis.
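To illustrate the point about irregularly observed values, the base-R sketch below averages irregular observations onto a uniform scale. It is not the kzs implementation: grid_average is a hypothetical helper, the data are simulated, and the window is assumed to be centred on each grid point with total width delta.

```r
# Hypothetical helper; NOT part of the kzs package. Averages y over a window
# of total width delta centred on each grid point (assumed window shape).
grid_average <- function(x, y, grid, delta) {
  vapply(grid, function(g) {
    near <- abs(x - g) <= delta / 2
    if (any(near)) mean(y[near]) else NA_real_
  }, numeric(1))
}

# Simulated series: linear trend plus N(0, 1) noise at times 1..200
set.seed(42)
obs_time <- 1:200
obs_value <- 0.05 * obs_time + rnorm(200, mean = 0, sd = 1)

# Delete a random 20 percent of the observations, leaving irregular time points
dropped <- sample(obs_time, size = 40)
kept_time <- obs_time[-dropped]
kept_value <- obs_value[-dropped]

# Smooth onto the full uniform scale, including the times that were deleted
grid <- seq(from = 1, to = 200, by = 1)
smoothed <- grid_average(kept_time, kept_value, grid, delta = 20)

# Count of deleted time points that still lack a smoothed value
sum(is.na(smoothed[dropped]))
```

Because each output value is an average over a window rather than an interpolation between neighbours, a grid point needs only one observation somewhere inside its window to receive an estimate.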
kzs may take time to run to completion, depending on the size of the data set and the number of
iterations specified.
For more information on the restrictions imposed on delta and d, consult kzs.params.
Derek Cyr cyr.derek@gmail.com and Igor Zurbenko igorg.zurbenko@gmail.com
"Spline Smoothing." http://economics.about.com/od/economicsglossary/g/splines.htm
# This example was created with the intent to push the limits of kzs. The
# function has a wide peak and a sharp peak; for a wide peak, you may permit
# stronger smoothing and for a sharp peak you may not (you would be
# over-smoothing). Try various values for delta and d to see how the outcome
# may vary.

# Total time t
t <- seq(from = -round(400*pi), to = round(400*pi), by = .25)

# Construct the signal over time
ts <- 0.5*sin(sqrt((2*pi*abs(t))/200))
signal <- ifelse(t < 0, -ts, ts)

# Bury the signal in noise [randomly, from N(0, 1)]
et <- rnorm(length(t), mean = 0, sd = 1)
yt <- et + signal

# Data frame of (t, yt)
pts <- data.frame(cbind(t, yt))


### EXAMPLE 1 - Apply kzs to the signal buried in noise

# Plot of the true signal
plot(signal ~ t, xlab = "t", ylab = "Signal", main = "True Signal",
     type = "l")

# Plot of signal + noise
plot(yt ~ t, ylab = "yt", main = "Signal buried in noise", type = "p")

# Apply 3 iterations of kzs
kzs(y = pts[,2], x = pts[,1], delta = 80, d = .2, k = 3, edges = FALSE,
    plot = TRUE)
lines(signal ~ t, col = "red")
title(main = "kzs(delta = 80, d = .2, k = 3, edges = FALSE)")
legend("topright", c("True signal","kzs estimate"), cex = 0.8,
       col = c("red", "black"), lty = 1, lwd = 2, bty = "n")


### EXAMPLE 2 - Irregularly observed data over time

# Cancel a random 20 percent of (t, yt) leaving irregularly observed time points
obs <- seq_along(t)
t20 <- sample(obs, size = length(obs)/5)
pts20 <- pts[-t20,]

# Plot of (t,yt) with 20 percent of the data removed
plot(pts20$yt ~ pts20$t, main = "Signal buried in noise\n20 percent of (t, yt) deleted",
     xlab = "t", ylab = "yt", type = "p")

# Apply 3 iterations of kzs
kzs(y = pts20[,2], x = pts20[,1], delta = 80, d = .2, k = 3, edges = FALSE,
    plot = TRUE)
lines(signal ~ t, col = "red")
title(main = "kzs(delta = 80, d = .2, k = 3, edges = FALSE)")
legend("topright", c("True signal","kzs estimate"), cex = 0.8,
       col = c("red", "black"), lty = 1, lwd = 2, bty = "n")