msc.peaks.find {caMassClass}    R Documentation

Find Peaks of Mass Spectra

Description

Find Peaks in a Batch of Protein Mass Spectra (SELDI) Data.

Usage

msc.peaks.find(X, SNR=2, span=c(81,11), zerothresh=0.9) 

Arguments

X Spectrum data, either in matrix format [nFeatures x nSamples] or in 3D array format [nFeatures x nSamples x nCopies]. Row names (rownames(X)) store the M/Z mass of each row.
SNR Signal-to-noise ratio (z-score) criterion for peak detection. Similar to the SoN variable in isPeak from the PROcess package.
span Two moving-window widths. The smaller one (span[2] in the default c(81,11)) is used for smoothing and for finding local maxima. The larger one (span[1]) is used for local variance estimation. Similar to the span and sm.span variables in isPeak from the PROcess package.
zerothresh Intensity threshold criterion for peak detection. Non-negative values in the range [0,1), like the default 0.9, are used to calculate a single threshold shared by all samples, as quantile(X,zerothresh). Negative values in the range (-1,0) are used to calculate a separate threshold for each sample i, as quantile(X[,i],-zerothresh). Similar to the zerothrsh variable in isPeak from the PROcess package.
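
The two threshold modes of zerothresh can be illustrated with a small sketch. It uses only base R and a made-up intensity matrix, and it assumes that each sample is one column of X, as the [nFeatures x nSamples] layout implies. It is an illustration of the quantile formulas above, not the package's internal code.

```r
# Synthetic intensity matrix: 200 features (rows) x 3 samples (columns).
# The data are invented for illustration only.
set.seed(1)
X <- matrix(rexp(600), nrow = 200, ncol = 3)
X[, 3] <- X[, 3] * 5                 # make the third sample much brighter

# zerothresh >= 0: a single threshold shared by all samples
zerothresh <- 0.9
global_thresh <- quantile(X, zerothresh)

# zerothresh < 0: one threshold per sample (assumed: per column of X)
zerothresh <- -0.9
per_sample_thresh <- apply(X, 2, quantile, probs = -zerothresh)

print(global_thresh)
print(per_sample_thresh)
```

A per-sample threshold may be preferable when overall intensity varies strongly between spectra, since a single global quantile would then be dominated by the brightest samples.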

Details

Peak finding is done using the following algorithm:
x = X[,j]   # spectrum of sample j (one column of X)
thresh = if(zerothresh>=0) quantile(X,zerothresh) else quantile(x,-zerothresh)
sig = runmean(x, span[2])
rMax = runmax (x, span[2])
rAvr = runmed (x, span[1])
rStd = runmad (x, span[1], center=rAvr)
peak = (rMax == x) & (sig > thresh) & (sig-rAvr > SNR*rStd)

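This means that a point has to meet all of the following criteria to be classified as a peak: it is a local maximum within a window of span[2] points (rMax == x); the smoothed signal is above the intensity threshold (sig > thresh); and the smoothed signal exceeds the local median by more than SNR times the local median absolute deviation (sig-rAvr > SNR*rStd).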
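
The algorithm above can be sketched for a single synthetic spectrum using base R only. This is a minimal illustration, not the package implementation: the helper win is hypothetical, the running statistics are computed with plain sapply loops instead of the fast running-window functions, windows are simply truncated at the ends of the spectrum (edge handling in the real code may differ), and the factor 1.4826 is the usual MAD scaling constant.

```r
# Indices of a centered window of k points around i, truncated at the edges.
# 'win' is a hypothetical helper introduced only for this sketch.
win <- function(i, k, n) max(1, i - k %/% 2):min(n, i + k %/% 2)

# Synthetic spectrum: noisy baseline plus two Gaussian-shaped peaks.
set.seed(42)
n <- 500
x <- abs(rnorm(n, sd = 0.2))
x <- x + 5 * exp(-((1:n - 150) / 3)^2)   # peak near index 150
x <- x + 8 * exp(-((1:n - 350) / 3)^2)   # peak near index 350

SNR <- 2; span <- c(81, 11); zerothresh <- 0.9
idx <- seq_len(n)

thresh <- quantile(x, zerothresh)                                   # intensity threshold
sig  <- sapply(idx, function(i) mean(x[win(i, span[2], n)]))        # smoothed signal
rMax <- sapply(idx, function(i) max(x[win(i, span[2], n)]))         # running maximum
rAvr <- sapply(idx, function(i) median(x[win(i, span[1], n)]))      # running median
rStd <- sapply(idx, function(i)                                     # running MAD around rAvr
  1.4826 * median(abs(x[win(i, span[1], n)] - rAvr[i])))

# A point is a peak only if all three criteria hold.
peak <- (rMax == x) & (sig > thresh) & (sig - rAvr > SNR * rStd)
print(which(peak))
```

Using the median and MAD over the wide window makes the local noise estimate robust to the peaks themselves, which occupy only a small fraction of the 81-point neighborhood.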

The function is very similar to the isPeak and getPeaks functions from the PROcess library (ver 1.3.2) written by Xiaochun Li. For example, getPeaks(X, PeakFile, SoN=SNR, span=span[1], sm.span=span[2], zerothrsh=zerothresh, area.w=0.003, ratio=0) would give results very similar to those of msc.peaks.find. The differences include: speed (msc.peaks.find uses much faster C-level code), a different use of the signal-to-noise-ratio variable, and the fact that msc.peaks.find does not perform or use area calculations.

Value

A data frame, in the same format as the data saved in peakinfofile, with five components:

Spectrum.Tag sample name of each peak
Spectrum. sample number of each peak
Peak. peak number within each sample
Intensity peak height (intensity)
Substance.Mass x-axis position, or corresponding mass of the peak measured in M/Z, extracted from the row names of the X matrix.
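
As a sketch of how such a data frame might be used, the following base-R snippet summarizes a hand-made stand-in with the five columns listed above. All values and sample names are invented for illustration; only the column names are taken from this page.

```r
# Hand-made stand-in for the data frame returned by msc.peaks.find.
# All values below are invented for illustration.
Peaks <- data.frame(
  Spectrum.Tag   = c("s1", "s1", "s1", "s2", "s2"),
  Spectrum.      = c(1, 1, 1, 2, 2),
  Peak.          = c(1, 2, 3, 1, 2),
  Intensity      = c(12.5, 40.1, 7.3, 22.0, 18.4),
  Substance.Mass = c(2010.4, 4475.9, 8602.1, 2011.0, 4478.2)
)

# Number of peaks found in each sample
peaks_per_sample <- table(Peaks$Spectrum.Tag)

# Mass (M/Z) of the most intense peak in each sample
strongest <- sapply(split(Peaks, Peaks$Spectrum.Tag),
                    function(d) d$Substance.Mass[which.max(d$Intensity)])

print(peaks_per_sample)
print(strongest)
```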

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

See Also

Examples

  # load "Data_IMAC.Rdata" file containing raw MS spectra 'X'  
  if (!file.exists("Data_IMAC.Rdata")) example("msc.project.read")
  load("Data_IMAC.Rdata")
  Peaks = msc.peaks.find(X) # Find Peaks
  cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")
  stopifnot( nrow(Peaks)==823 )
  
  # work directly with data from the input files
  directory  = system.file("Test", package = "caMassClass")
  X = msc.rawMS.read.csv(directory, "IMAC_normal_.*csv")
  Peaks = msc.peaks.find(X) # Find Peaks
  cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")
  stopifnot( nrow(Peaks)==424 )

[Package caMassClass version 1.6 Index]