kzs {kzs} | R Documentation |
The Kolmogorov-Zurbenko Spline function uses a moving average to construct a piecewise estimator of the underlying signal in the given input data.
kzs(x, delta, h, k = 1, show.edges = FALSE)
x |
a data frame of paired values X and Y. The data frame should consist of two columns of data representing pairs (Xi, Yi), i = 1,..., n and X, Y are real values; the first column of data represents X values and the second column represents the corresponding Y values. |
delta |
the physical range of smoothing in terms of unit values of X. Restriction: delta << Xn-X1
|
h |
a scale reading of all outcomes of the algorithm. More specifically, h is the interval
width of a uniform scale covering the interval (X1 - delta/2, Xn + delta/2). Restriction: h < min(X(i+1) - X(i)) and h > 0
|
k |
the number of iterations the function will execute; k may also be interpreted as
the order of smoothness (as a polynomial of degree k-1). By default, k is set to perform
a single iteration.
|
show.edges |
a logical indicating whether or not to display the resulting data beyond the range of X
values of the user-supplied data. If FALSE, then the extended edges are suppressed. By
default, this parameter is set to FALSE.
|
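The restrictions on delta and h listed above can be checked before calling kzs. The following is a minimal sketch, not part of the kzs package: the helper name check_kzs_args and its ratio argument are hypothetical, and since the documentation gives no numeric threshold for "delta << Xn - X1", the cutoff of one tenth of the X range is purely an assumption.

```r
# Hypothetical helper (not part of the kzs package): checks the documented
# restrictions on delta and h for a two-column data frame of (X, Y) pairs.
check_kzs_args <- function(x, delta, h, ratio = 0.1) {
  xs <- sort(x[[1]])                    # first column holds the X values
  x_range <- xs[length(xs)] - xs[1]     # Xn - X1
  gaps <- diff(xs)
  min_gap <- min(gaps[gaps > 0])        # smallest spacing between distinct X values
  c(
    # "delta << Xn - X1" has no exact threshold; ratio is an assumed cutoff
    delta_small = delta <= ratio * x_range,
    h_positive  = h > 0,
    h_below_gap = h < min_gap
  )
}

# Toy data on a regular grid with spacing 0.25
pts_demo <- data.frame(t = seq(0, 100, by = 0.25), y = 0)
check_kzs_args(pts_demo, delta = 8, h = 0.2)
```

Any element that comes back FALSE points to the restriction that the chosen delta or h violates.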
The relation between variables Y and X as a function [namely, Y(x)] of a current value of
X = x is often desired as a result of practical research. Usually we search for some simple
function Y(x) when given a data set of pairs (Xi, Yi). These pairs frequently resemble a
noisy plot, and thus Y(x) is desired to be a smooth outcome from the original data to capture
important patterns in the data, while leaving out the noise. The KZS function estimates a
solution to this problem through the use of splines, which are nonparametric estimators of a
function. Given a data set of pairs (Xi, Yi), splines estimate the smooth values of Y from
the X values. The KZS function Y(x) averages all values of Yi for all Xi within the range delta around
each scale reading hi along the variable X. The KZS algorithm is designed to smooth all fast
fluctuations in Y within the delta-range in X, while leaving fluctuations over ranges longer than delta
untouched.
The separation of short scales (less than delta) from long scales (more than delta) becomes more
effective as k increases, while the effective range of separation becomes delta*sqrt(k).
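The averaging described above can be illustrated with a short sketch. This is an illustration under stated assumptions, not the implementation used by the kzs package: the name kzs_sketch is hypothetical, the window is assumed to be the closed interval of half-width delta/2 around each scale reading, and each further iteration is assumed to re-apply the same average to the previous output on the same uniform scale.

```r
# Illustrative sketch only; the kzs package's internals may differ.
kzs_sketch <- function(x, delta, h, k = 1) {
  xs <- x[[1]]
  ys <- x[[2]]
  # Uniform scale of width h covering (X1 - delta/2, Xn + delta/2)
  grid <- seq(min(xs) - delta / 2, max(xs) + delta / 2, by = h)
  for (i in seq_len(k)) {
    # Average every Y whose X lies within delta/2 of the scale reading
    sm <- vapply(grid, function(g) {
      inside <- abs(xs - g) <= delta / 2
      if (any(inside)) mean(ys[inside]) else NA_real_
    }, numeric(1))
    keep <- !is.na(sm)
    xs <- grid[keep]   # the next iteration smooths the previous output
    ys <- sm[keep]
  }
  data.frame(x = xs, y = ys)
}

# A fast +1/-1 oscillation is averaged out away from the edges
demo <- data.frame(x = 1:100, y = rep(c(1, -1), 50))
sm1 <- kzs_sketch(demo, delta = 10, h = 0.5, k = 1)

# Effective range of separation for delta = 80 and k = 3
effective_range <- 80 * sqrt(3)
```

With delta = 80 and k = 3, the values used in the examples further down, the effective range delta*sqrt(k) is roughly 138.6 units of X.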
a two-column data frame containing:
Xk |
X values resulting from execution of algorithm |
Y(Xk) |
Y values resulting from execution of algorithm |
The KZS function is designed for the general situation, including time series data. In many
applications where the variable X is time, KZS resolves the problem of missing values in
time series or irregularly observed values in longitudinal data analysis.
Derek Cyr dc896148@albany.edu and Igor Zurbenko igorg.zurbenko@gmail.com
"Spline Smoothing." http://economics.about.com/od/economicsglossary/g/splines.htm
# This example was created with the intent to push the limits of KZS. The
# function has a wide peak and a sharp peak; for a wide peak, you may permit
# stronger smoothing and for a sharp peak you may not (you would be over-
# smoothing). The key here is to find satisfying values for the parameters.

# EXAMPLE 1
t <- seq(from=-round(400*pi),to=round(400*pi),by=.25)   #Total time t
tp <- seq(from=0,to=round(400*pi),by=.25)               #Positive t (includes t=0)
tn <- seq(from=-round(400*pi),to=-.25,by=.25)           #Negative t
nobs <- 1:length(t)                                     #Sequence of obs

# True signal
signalp <- 0.5*sin(sqrt((2*pi*abs(tp))/200))            #Positive side of signal
signaln <- 0.5*sin(-sqrt((2*pi*abs(tn))/200))           #Negative side of signal
signal <- append(signaln,signalp,after=length(tn))      #Appending into one signal

# Randomly generate noise from the standard normal distribution
et <- rnorm(length(t),mean=0,sd=1)

# Add the noise to the true signal
yt <- et + signal

# Data frame of (t,yt)
pts <- data.frame(cbind(t,yt))

# Plot of the true signal
plot(signal~t,xlab='t',ylab='Signal',main='True Signal',type="l")

# Plot of yt (signal + noise)
plot(yt~t,ylab=expression(paste(Y[t])),main='Signal buried in noise',type="p")

# Apply KZS function - 3 iterations
kzs(pts,delta=80,h=.2,k=3,show.edges=FALSE)
lines(signal~t,col="red")
title(main="KZS(delta=80, h=0.20, k=3, show.edges=false)")
legend("topright", c("True signal","KZS estimate"), cex=0.8,
       col=c("red","black"), lty=1:1, lwd=2, bty="n")

# EXAMPLE 2 - Rerun KZS on the same function after removing 20% of the data
#             points. This provides an opportunity to create a random scale
#             along the variable X.

# Generate and remove a random 20% of t
t20 <- sample(nobs,size=length(nobs)/5)                 #Random 20% of (t,yt)
pts20 <- pts[-t20,]                                     #Remove the 20%

# Plot of (t,yt) with 20% removal
plot(pts20$yt~pts20$t,xlab='t',ylab=expression(paste(Y[t])),
     main='Signal buried in noise - 20% removal',type="p")

# Apply KZS function - 3 iterations
kzs(pts20,delta=80,h=.20,k=3,show.edges=FALSE)
lines(signal~t,col="red")
title(main="KZS(delta=80, h=0.20, k=3, show.edges=false) - 20% removal")
legend("topright", c("True signal","KZS estimate"), cex=0.8,
       col=c("red","black"), lty=1:1, lwd=2, bty="n")