msc.preprocess.run {caMassClass}    R Documentation

Preprocessing Pipeline of Protein Mass Spectra

Description

Pipeline for preprocessing protein mass spectra (SELDI) data before classification.

Usage

msc.preprocess.run ( X,
    baseline.removal = 0,
      breaks=200, qntl=0, bw=0.005,                    # bslnoff
    min.mass = 3000,                                   # msc.mass.cut
    mass.drift.adjustment = 1,
      shiftPar=0.0005,                                 # msc.mass.adjust
    peak.extraction = 0, 
     PeakFile=0, SNR=2, span=c(81,11), zerothresh=0.9, # msc.peaks.find
     BmrkFile=0, BinSize=c(0.002, 0.008), tol=0.97,    # msc.peaks.align 
     FlBmFile=0, FillType=0.9,                         # msc.biomarkers.fill
    merge.copies = 4+2+1,                              # msc.copies.merge
    verbose = TRUE) 

Arguments

X Spectrum data, either in matrix format [nFeatures x nSamples] or in 3D array format [nFeatures x nSamples x nCopies]. Row names (rownames(X)) store the M/Z mass of each row.
baseline.removal Remove baseline from each spectrum? (boolean or 0/1 integer). See function msc.baseline.subtract and bslnoff from PROcess library for other parameters that can be passed: breaks, qntl and bw.
min.mass Cutting place when removing data corresponding to low masses (m/z). See function msc.mass.cut for details.
mass.drift.adjustment Controls mass drift adjustment and scaling. If 0, then no mass adjustment or scaling is performed; otherwise, the value is passed to the msc.mass.adjust function as scalePar. Consequently, 1 means that afterwards all samples will have the same mean, and 2 means that afterwards all samples will have the same mean and median. See function msc.mass.adjust for details and for the additional parameter shiftPar that can be passed.
peak.extraction Perform peak extraction and alignment, or keep working with the raw spectra? (boolean or 0/1 integer). See functions msc.peaks.find, msc.peaks.align and msc.biomarkers.fill for other parameters that can be passed, in particular the filenames used to store intermediate results.
merge.copies If multiple copies of the data exist, should they be merged, and how? Passed to the msc.copies.merge function as the mergeType variable. See that function for more details.
verbose Boolean flag that turns debugging printouts on.
breaks parameter to be passed to bslnoff function from PROcess library by msc.baseline.subtract
qntl parameter to be passed to bslnoff function from PROcess library by msc.baseline.subtract
bw parameter to be passed to bslnoff function from PROcess library by msc.baseline.subtract
shiftPar parameter to be passed to msc.mass.adjust
PeakFile parameter to be passed to msc.peaks.find
SNR parameter to be passed to msc.peaks.find
span parameter to be passed to msc.peaks.find
zerothresh parameter to be passed to msc.peaks.find
BmrkFile parameter to be passed to msc.peaks.align
BinSize parameter to be passed to msc.peaks.align
tol parameter to be passed to msc.peaks.align
FlBmFile parameter to be passed to msc.biomarkers.fill
FillType parameter to be passed to msc.biomarkers.fill
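
As a rough illustration of the input layout described for X, the following base-R sketch builds a small synthetic data set in both the 2D and 3D formats and recovers the m/z axis from the row names. All values, sizes and variable names (mz, X3, X2) are made up for illustration; real data would come from the instrument files.

```r
# Synthetic stand-in for SELDI data: every number here is invented.
set.seed(1)
mz       <- seq(2000, 10000, length.out = 50)   # hypothetical m/z axis
nSamples <- 4
nCopies  <- 2

# 3D layout: [nFeatures x nSamples x nCopies]
X3 <- array(runif(length(mz) * nSamples * nCopies),
            dim = c(length(mz), nSamples, nCopies))
rownames(X3) <- mz    # row names carry the m/z value of each row

# 2D layout: [nFeatures x nSamples], here simply the first copy
X2 <- X3[, , 1]

# Recover the numeric m/z axis from the row names
mass <- as.numeric(rownames(X2))

dim(X3)
dim(X2)
range(mass)
```

Either X2 or X3 has the shape that the X argument expects, assuming the row names can be converted back to numbers as shown.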

Details

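Function containing several preprocessing steps that prepare protein mass spectra (SELDI) data for classification. It is a "pipeline" of operations, none of which needs class label information. Every step is optional and can be skipped. In the order of the Usage section, the steps are: baseline removal (msc.baseline.subtract), low-mass cut (msc.mass.cut), mass drift adjustment (msc.mass.adjust), peak extraction and alignment (msc.peaks.find, msc.peaks.align, msc.biomarkers.fill), and merging of copies (msc.copies.merge).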
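
The idea of a pipeline whose steps are switched on or off by arguments can be sketched in base R. The function toy_preprocess below is a hypothetical stand-in, not the caMassClass implementation: it only mimics a low-mass cut and a "same mean for all samples" scaling, and the inclusive boundary at min.mass is an assumption.

```r
# Toy stand-in for the pipeline idea; NOT the caMassClass implementation.
toy_preprocess <- function(X, min.mass = 3000, equalize.mean = TRUE) {
  mass <- as.numeric(rownames(X))
  # Step 1 (optional): drop rows below min.mass; min.mass = 0 skips the step
  if (min.mass > 0) {
    X <- X[mass >= min.mass, , drop = FALSE]
  }
  # Step 2 (optional): rescale each sample (column) to the overall mean
  if (equalize.mean) {
    target <- mean(X)
    X <- sweep(X, 2, colMeans(X) / target, "/")
  }
  X
}

# Synthetic input: 10 m/z values x 4 samples, values invented
set.seed(2)
X <- matrix(runif(40, 1, 10), nrow = 10,
            dimnames = list(as.character(seq(1000, 10000, by = 1000)), NULL))

Y <- toy_preprocess(X)
dim(Y)        # rows below 3000 have been removed
colMeans(Y)   # all columns now share the same mean

# Skipping every step returns the input unchanged
Z <- toy_preprocess(X, min.mass = 0, equalize.mean = FALSE)
```

In msc.preprocess.run the same switching is done through baseline.removal, mass.drift.adjustment, peak.extraction and merge.copies.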

Value

Returns a matrix with features as rows and samples as columns, unless merge.copies was 0, 4, or 8, in which case no merging is done and the data is returned in the same or a similar format as the input [nFeatures x nSamples x nCopies]. Row names (rownames(X)) store the M/Z mass of each row.
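
Because the result can be either a matrix or a 3D array, downstream code may want to check which layout it received. A minimal sketch, using a hypothetical helper describe_layout that is not part of caMassClass:

```r
# Hypothetical helper (not part of caMassClass): report the layout of a result
describe_layout <- function(Y) {
  d <- dim(Y)
  if (length(d) == 3) {
    sprintf("3D array: %d features x %d samples x %d copies", d[1], d[2], d[3])
  } else if (length(d) == 2) {
    sprintf("matrix: %d features x %d samples", d[1], d[2])
  } else {
    stop("unexpected layout: expected a matrix or a 3D array")
  }
}

describe_layout(matrix(0, nrow = 5, ncol = 3))
describe_layout(array(0, dim = c(5, 3, 2)))
```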

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

See Also

Examples

  # load input data 
  if (!file.exists("Data_IMAC.Rdata")) example("msc.project.read")
  load("Data_IMAC.Rdata")
  
  # run preprocess
  Y = msc.preprocess.run(X)
  cat("Size before: ", dim(X), " and after :", dim(Y), "\n")

[Package caMassClass version 1.0 Index]