msc.peaks.clust {caMassClass} | R Documentation
Clusters peaks from multiple protein mass spectra (SELDI) samples
msc.peaks.clust(dM, S, BinSize=c(0,sum(dM)), tol=0.97, verbose=FALSE)
S | Peak sample number, used to identify the spectrum the peak comes from.
dM | Distance between sorted peak positions (masses, m/z).
BinSize | Lower and upper bound of bin sizes, based on expected experimental variation in the mass (m/z) values. The size of any bin is measured as (R-L)/mean(R,L), where L and R are the masses (m/z values) of the left and right boundaries. All resulting bin sizes will be between BinSize[1] and BinSize[2]. The default, c(0,sum(dM)), ensures that BinSize is not being used.
tol | Gaps bigger than tol*max(gap) are assumed to be the same size as the largest gap. See details.
verbose | Boolean flag that turns debugging printouts on.
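The two numeric arguments, BinSize and tol, can be illustrated with a small sketch. It computes the relative bin size (R-L)/mean(R,L) and lists the gaps that would count as equal to the largest gap for a given tol. The helper name rel_bin_size and the masses are made up for illustration, and taking the bin boundaries to be the outermost peaks is an assumption; this is not the package's internal code.

```r
# Hypothetical helper: relative size of a bin whose left and right
# boundary masses are L and R (assumed here to be the outermost peaks)
rel_bin_size <- function(L, R) (R - L) / ((L + R) / 2)

# Made-up sorted peak masses (m/z) and the gaps between neighbours
M  <- c(1000, 1002, 1005, 1020, 1021, 1035.8)
dM <- diff(M)

# Relative size of a bin spanning the first three peaks
size <- rel_bin_size(M[1], M[3])

# Gaps that tol would treat as being as large as the largest gap
tol  <- 0.97
ties <- which(dM > tol * max(dM))
```

With these numbers the largest gap is 15, so any gap above 14.55 is treated as a tie; lowering tol widens that set of candidate split points.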
This is a low-level function used by msc.peaks.alignment and is not intended to be used directly by most users. However, it might be useful for other code developers. It clusters peaks from different samples into bins in such a way as to satisfy constraints in the following order:
bin sizes are between BinSize[1] and BinSize[2]
The output is a binary array of the same size as dM and S, in which the left boundary of each cluster bin (biomarker) is marked.
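A minimal sketch of how such a binary marker array can be read, assuming a hand-written vector bin in place of real msc.peaks.clust output: a running sum turns the left-boundary marks into a bin number for every peak.

```r
# Hand-written stand-in for the function's output (1 = left boundary of a bin)
bin <- c(1, 0, 0, 1, 0, 1)

id     <- cumsum(bin)       # bin number of each peak
starts <- which(bin == 1)   # index of the first peak in each bin
counts <- tabulate(id)      # number of peaks falling into each bin
```

Here the six peaks fall into three bins that start at peaks 1, 4 and 6.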
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
The initial version of this function started as an implementation of the algorithm described on the webpage of the Virginia Prostate Center (at Virginia Medical School) documenting their PeakMiner software. See http://www.evms.edu/vpc/seldi/peakminer.pdf
See also the msc.preprocess.run and msc.project.run pipelines, and the functions msc.peaks.find, msc.peaks.align and msc.biomarkers.fill.
# example with simple made up data (18 peaks, 3 samples)
library(caMassClass)
M = c(1,5,8,12,17,22, 3,5,7,11,14,25, 1, 5, 7,10,17,21) # peak position/mass
S = rep(1:3, each=6)                 # peak's sample number
idx = sort(M, index.return=TRUE)$ix  # sort peaks by mass
M = M[idx]                           # sorted mass
S = S[idx]                           # arrange sample numbers in the same order
bin = msc.peaks.clust(diff(M), S, verbose=TRUE)
rbind(S,M,bin)                       # show results

# use the results to align peaks into biomarkers matrix
Bmrks = matrix(NA,sum(bin),max(S))   # init feature (biomarker) matrix
bin = cumsum(bin)                    # find bin numbers for each peak in S array
for (j in seq_along(S))              # Bmrks usually store height H of each peak
  Bmrks[bin[j], S[j]] = M[j]         # but in this example it will be mass
Bmrks
stopifnot( dim(Bmrks)==c(7,3) )
stopifnot( sum(is.na(Bmrks[5,]))==2 )
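The loop that fills the biomarker matrix can also be written as a single assignment using R's matrix indexing. The sketch below is only an illustration of that idiom: the data are made up and bin is a hand-written stand-in for msc.peaks.clust output, so it runs without the package.

```r
# Made-up sorted peaks: mass M, sample number S, and hypothetical bin markers
M   <- c(1, 1, 3, 5, 5, 5)
S   <- c(1, 3, 2, 1, 2, 3)
bin <- c(1, 0, 1, 1, 0, 0)   # hand-written stand-in, not real function output

id    <- cumsum(bin)                          # bin number of each peak
Bmrks <- matrix(NA_real_, sum(bin), max(S))   # rows = bins, columns = samples
Bmrks[cbind(id, S)] <- M                      # (row, column) pairs replace the loop
Bmrks
```

Each row of cbind(id, S) names one cell, so every peak lands in the row of its bin and the column of its sample; cells with no peak stay NA.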