msPeakCWT {msProcess} | R Documentation |
This function isolates peaks of interest in the input mass spectrum by way of a continuous wavelet transform (CWT). The method is scale-selective, i.e., the user can specify the scale range of interest in defining peaks. The basic algorithm works as follows:
y
is calculated
over a broad range of scales.m/z
value at the smallest scale identifies the peak location
in the original input spectrum.wavCWTPeaks
function for details.)
In addition to peak identification, this function also calculates an estimate of the Holder exponent
associated with each peak. Qualitatively speaking, the magnitude (of the modulus) of the Holder exponent is proportional
to the the sharpness of the corresponding peak. See the holderSpectrum
function for more
information. The Holder exponents are packed into the output data.frame
along with other peak-related information.
msPeakCWT(x, y, n.scale = 100, snr.min = 3, scale.min = 4, length.min = 10, noise.span = NULL, noise.fun = "quantile", noise.min = NULL, n.octave.min = 1, tolerance = 0, holder = TRUE, process = "msPeakCWT")
x |
A numeric vector representing the m/z values of a spectrum. |
y |
A numeric vector representing the intensity values of the spectrum. |
length.min |
The minimum number of points along a CWT tree branch and within the specified scale.range
needed in order for that branches peak to be considered a peak candidate. See the wavCWTPeaks function for more details.
Default: 10 . |
n.octave.min |
A pruning factor for excluding non-persistent branches. If
a branch of connected extrema does not span this number of octaves, it is
excluded from the tree. See the wavCWTTree function for more details. Default: 1. |
n.scale |
The (maximum) number of logarithmically distributed scales over which to evaluate the CWT.
For a uniformly sampled time series (x ), the supported range of CWT scales is deltat(x) * c(1, length(x)) .
where deltat(x) is the sampling interval.
However, a mass spectrum is generally viewed as a non-uniformly sampled time series because the
difference in successive m/z values is non-constant due to the quadratic mapping of the TOF values to the m/z domain.
For the purpose of peak detection, however, the sampling interval can be set to unity. Thus, the possible range of CWT scales
is 1 to length(x) and the scale values are integers. See the wavCWT function for more details on CWT scales.
Default: 100 . |
holder |
a logical value. If TRUE, the holder exponents corresponding to the peaks are also calculated. |
noise.fun |
A character string defining the function to apply to the local noise estimates in order to sumarize
and quantify the local noise level into a scalar value. See the DETAILS section for more information. Supported values
are
where x is a vector of smallest-scale CWT coefficients whose time indices are near that of the branch termination time.
See the wavCWTPeaks function for more details. Default: "quantile" . |
noise.min |
The minimum allowed estimated local noise level.
Values must be between 0 and 1 inclusive. Default NULL corresponds to 0.05.
It is converted to raw scale before passed to wavCWTPeaks :
quantile(attr(x,"noise"), prob=0.05) ,
where x is the input wavCWTTree object.
|
noise.span |
The span in time surrounding each branche's temrination point to use in forming
local noise estimates and (ultimately) peak SNR estimates. See the wavCWTPeaks function for more details.
Default: NULL,max(0.01 * diff(range(times)), 5*sampling.interval) ,
where times and sampling.interval are attributes of the input wavCWTTree object. |
process |
A character string denoting the name of the
process to register with the (embedded) event history object of the input
after processing the input data.
Default: "msPeakCWT" . |
scale.min |
The minimum allowable value of the scale associated with a CWT peak.
See the scale.range argument of the wavCWTPeaks function for more details.
Default: 4 . |
snr.min |
The minimum allowed peak signal-to-noise ratio. See the wavCWTPeaks function for more details. Default: 3 . |
tolerance |
A tolerance vector used to find CWT extrema.
This vector must be as long as there are scales in the CWT such that
the jth element defines the
tolerance to use in finding modulus maxima at
the jth scale of the CWT. If not, the last value
is replicated appropriately.
See the wavCWTTree function for more details. Default: 0. |
A data.frame with 11 columns: peak class location, left bound, right bound and
peak span in both clock tick
("tick.loc"
, "tick.left"
, "tick.right"
, "tick.span"
)
and mass measure
("mass.loc"
, "mass.left"
, "mass.right"
, "mass.span"
),
and peak signal-to-noise ratio and intensity
("snr"
, "intensity"
). The final column is "holder"
, representing
the estimated Holder exponent asscoiated with each peak.
Since noise.local
is NULL
, "snr"
is the same as ("intensity"
).
Pan Du, Warren A. Kibbe, and Simon M. Lin, ``Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching", Bioinformatics, 22, 2059–2065 (2006).
D. B. Percival and A. T. Walden, Wavelet Methods for Time Series Analysis, Cambridge University Press, 2000.
wavCWT
, wavCWTTree
, msPeak
, msPeakInfo
.
if (!exists("qcset")) data("qcset", package="msProcess") ## extract a subset of a single spectrum mz <- qcset$mz imz <- mz > 3000 & mz < 5000 z <- as.vector(qcset[imz, 1]$intensity) mz <- mz[imz] plot(mz, z, cex=0.5, ylab="Intensity", xlab="m/z", type="l") ## estimate the peak locations using various ## minimum CWT scales and overlay the plot ## locations scale.min <- c(2,4,8) col <- c("green","blue","red") for (i in seq(along=scale.min)){ p <- msPeakCWT(mz, z, scale.min=scale.min[i]) ipeak <- p[["tick.loc"]] points(mz[ipeak], z[ipeak], cex=i, col=col[i], pch=1) } ## add a legend if (is.R()){ legend(3000,10000,pch=1,col=col,legend=paste("scale.min =", scale.min)) } else { legend(3000,10000,marks=1,col=col,legend=paste("scale.min =", scale.min)) } ## plot the Holder exponents corresponding to each ## peak hx <- p[["mass.loc"]] hy <- abs(p[["holder"]]) plot(hx, hy, type="h", xlim=range(mz), lty="dashed", col="blue", xlab="m/z", ylab="|Holder exponent|") points(hx, hy, col="red", cex=1.2) lines(mz, rescale(z,c(0,0.5)), lwd=3) abline(h=0, lty="dotted") legend(3000,1,col=c("blue","black"), lwd=c(1,3), lty=c("dashed","solid"), legend=c("|Holder exponent|", "Scaled spectrum"))