msc.preprocess.run {caMassClass} | R Documentation |
Pipeline for preprocessing protein mass spectra (SELDI) data before classification.
msc.preprocess.run( X, baseline.removal = 0,
    breaks = 200, qntl = 0, bw = 0.005,                    # bslnoff
    min.mass = 3000,                                       # msc.mass.cut
    mass.drift.adjustment = 1, shiftPar = 0.0005,          # msc.mass.adjust
    peak.extraction = 0, PeakFile = 0, SNR = 2,
    span = c(81,11), zerothresh = 0.9,                     # msc.peaks.find
    BmrkFile = 0, BinSize = c(0.002, 0.008), tol = 0.97,   # msc.peaks.align
    FlBmFile = 0, FillType = 0.9,                          # msc.biomarkers.fill
    merge.copies = 4+2+1,                                  # msc.copies.merge
    verbose = TRUE)
X |
Spectrum data, either in matrix format [nFeatures x nSamples] or in
3D array format [nFeatures x nSamples x nCopies]. Row names
(rownames(X)) store the M/Z mass of each row. |
baseline.removal |
Remove baseline from each spectrum? (boolean or 0/1
integer). See function msc.baseline.subtract and bslnoff
from the PROcess library for other parameters that can be passed:
breaks, qntl and bw. |
min.mass |
Cutting place when removing data corresponding to low masses
(m/z). See function msc.mass.cut for details. |
mass.drift.adjustment |
Controls mass drift adjustment and scaling.
If 0, then no mass adjustment or scaling is performed; otherwise, the value is
passed to the msc.mass.adjust function as scalePar. Consequently,
1 means that afterwards all samples will have the same mean, and 2
means that afterwards all samples will have the same mean and median. See
function msc.mass.adjust for details and for the additional parameter
shiftPar that can be passed. |
peak.extraction |
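Perform peak extraction and alignment, or keep
working with the raw spectra? (boolean or 0/1 integer). The following
functions accept other parameters that can be passed:
msc.peaks.find: PeakFile, SNR, span, zerothresh
msc.peaks.align: BmrkFile, BinSize, tol
msc.biomarkers.fill: FlBmFile, FillType |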
merge.copies |
If multiple copies of the data exist, should they be
merged, and how? Passed to the msc.copies.merge function as the
mergeType variable. See that function for more details. |
verbose |
Boolean flag that turns debugging printouts on. |
breaks |
Parameter passed to the bslnoff function from the PROcess library by msc.baseline.subtract. |
qntl |
Parameter passed to the bslnoff function from the PROcess library by msc.baseline.subtract. |
bw |
Parameter passed to the bslnoff function from the PROcess library by msc.baseline.subtract. |
shiftPar |
parameter to be passed to msc.mass.adjust |
PeakFile |
parameter to be passed to msc.peaks.find |
SNR |
parameter to be passed to msc.peaks.find |
span |
parameter to be passed to msc.peaks.find |
zerothresh |
parameter to be passed to msc.peaks.find |
BmrkFile |
parameter to be passed to msc.peaks.align |
BinSize |
parameter to be passed to msc.peaks.align |
tol |
parameter to be passed to msc.peaks.align |
FlBmFile |
parameter to be passed to msc.biomarkers.fill |
FillType |
parameter to be passed to msc.biomarkers.fill |
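As an illustration of the two input layouts accepted for X, the following base-R sketch builds a toy matrix and a toy 3D array whose row names hold the m/z values. The data values, sample names, and copy names are made up for illustration and are not part of caMassClass.

```r
# Hypothetical toy data: 5 m/z values, 3 samples, 2 copies per sample.
mz <- c(2500, 3000, 3500, 4000, 4500)
samples <- c("s1", "s2", "s3")

# Matrix format [nFeatures x nSamples]; row names carry the m/z values.
X2 <- matrix(runif(5 * 3), nrow = 5, ncol = 3,
             dimnames = list(as.character(mz), samples))

# 3D array format [nFeatures x nSamples x nCopies].
X3 <- array(runif(5 * 3 * 2), dim = c(5, 3, 2),
            dimnames = list(as.character(mz), samples, c("copy1", "copy2")))

# Recover the numeric m/z axis from the row names.
mz_from_X <- as.numeric(rownames(X3))
```

Either object has the shape described for X; whether a given spectrum set should be stored with or without the copies dimension depends on how the data were acquired.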
Function containing several pre-processing steps that prepare protein mass spectra (SELDI) data for classification. This function is a "pipeline" performing several operations, none of which needs class label information. Every step is optional and can be skipped:
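Remove the baseline from each spectrum, using msc.baseline.subtract and bslnoff from the PROcess library.
Remove data corresponding to low masses (m/z), using msc.mass.cut.
Adjust for mass drift and scale the data, using msc.mass.adjust.
Extract and align peaks, using msc.peaks.find, msc.peaks.align and msc.biomarkers.fill.
Merge multiple copies of the data, using msc.copies.merge.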
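The ideas behind two of these steps, the low-mass cut and the mean equalization performed when mass.drift.adjustment is 1, can be sketched in base R. This is only a rough illustration under stated assumptions: it assumes that rows with m/z at or above min.mass are kept and that each sample is rescaled multiplicatively to a common mean. The actual rules used by msc.mass.cut and msc.mass.adjust may differ, so consult their help pages.

```r
# Toy spectra: rows are m/z values (stored in row names), columns are samples.
X <- matrix(c(1, 2, 3, 4,
              2, 4, 6, 8), nrow = 4,
            dimnames = list(c("2000", "3000", "4000", "5000"), c("s1", "s2")))

# Low-mass cut (assumed rule: keep rows with m/z >= min.mass).
min.mass <- 3000
keep <- as.numeric(rownames(X)) >= min.mass
Xcut <- X[keep, , drop = FALSE]

# Mean equalization (assumed rule: multiply each sample by a constant so
# that every column mean equals the overall mean of the data).
target <- mean(Xcut)
Xnorm <- sweep(Xcut, 2, target / colMeans(Xcut), `*`)

colMeans(Xnorm)  # both samples now share the same mean
```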
Returns a matrix containing features as rows and samples as columns, unless
merge.copies
was 0, 4, or 8, in which case no merging is done and the data is
returned in the same or a similar format as the input
[nFeatures x nSamples x nCopies].
Row names (rownames(X))
store the M/Z mass of each row.
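The "0, 4, or 8" rule suggests that merge.copies is a sum of flags, as the default 4+2+1 also hints. The sketch below assumes the merge mode sits in the two lowest bits, so that a value merges copies only when those bits are nonzero. The helper merges_copies is hypothetical and not part of caMassClass; the exact meaning of each flag is documented in msc.copies.merge.

```r
# Assumption: the merge mode lives in the two lowest bits of merge.copies,
# and 4 and 8 are option flags. This matches the stated 0, 4, 8 cases.
merges_copies <- function(merge.copies) {
  bitwAnd(as.integer(merge.copies), 3L) != 0L
}

# Check the three "no merging" values and the default 4+2+1.
sapply(c(0, 4, 8, 4 + 2 + 1), merges_copies)
```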
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
msc.preprocess.run or msc.msfiles.read.csv functions.
msc.baseline.subtract, bslnoff from the PROcess library, msc.mass.cut, msc.mass.adjust, msc.peaks.find, msc.peaks.align, msc.biomarkers.fill, and msc.copies.merge.
msc.classifier.test function.
# load input data
if (!file.exists("Data_IMAC.Rdata")) example("msc.project.read")
load("Data_IMAC.Rdata")

# run preprocess
Y = msc.preprocess.run(X)
cat("Size before: ", dim(X), " and after :", dim(Y), "\n")