histcutoffs {phyloarray}    R Documentation

Histogram cutoff values

Description

Calculate cutoff values for the ratio of standard deviation to signal intensity, used to identify bad spots. A bad spot has a high sd/signal ratio. The cutoff value is based on a fivenum/boxplot analysis.

Usage

  histcutoffs(datalist, cutat=1.7)

Arguments

datalist An object of type phyloarray for which the cutoffs should be calculated.
cutat A value giving the distance of the cutoff from the box, expressed as a multiple of the length of the box in the boxplot. This is the coef argument of the boxplot.stats function.

Details

The function calculates the standard deviation/signal ratios and log-transforms them. A boxplot is then computed on the log-transformed values, with the coefficient for outliers set to cutat times the length of the box itself.
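
The calculation described above can be sketched as follows. This is a minimal illustration on simulated data, not the package's actual code: the vectors signal and sdev are made up, and taking the cutoff as the upper fence (upper hinge plus cutat times the box length), back-transformed to the ratio scale, is an assumption about how the boxplot result is used.

```r
# Simulated data (hypothetical): signal intensities and their standard deviations
set.seed(1)
signal <- rlnorm(1000, meanlog = 7, sdlog = 1)
sdev   <- signal * rlnorm(1000, meanlog = -2, sdlog = 0.4)

# Log-transformed sd/signal ratios
logratio <- log(sdev / signal)

cutat <- 1.7

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum
h <- fivenum(logratio)

# Upper fence: upper hinge + cutat * length of the box
upper_fence <- h[4] + cutat * (h[4] - h[2])

# Assumed cutoff on the original sd/signal scale
cutoff <- exp(upper_fence)

# Spots above the fence would be flagged as bad
bad <- logratio > upper_fence

# boxplot.stats() flags the same points as upper outliers when coef = cutat
st <- boxplot.stats(logratio, coef = cutat)
```

Here st$out also contains any low outliers, which are not relevant when only high sd/signal ratios are of interest.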

Concerning cutat: provided (!) that the log-transformed values are normally distributed, setting cutat to 1.0 gives a certainty of 97.7250% of no false negatives; 1.35 gives 99.3890%, 1.7 gives 99.8650%, and 2.5 gives 99.9968%. It is stressed that these values hold only for normal distributions!
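
As a rough check of these figures, the position of the upper fence can be expressed in standard deviations of a normal distribution. The sketch below assumes exact population quartiles (sample hinges from fivenum will differ slightly) and a one-sided probability; it is an illustration, not part of the package. The fences come out close to 2, 2.5, 3 and 4 standard deviations, and the resulting probabilities are close to, though not identical with, the percentages quoted above.

```r
cutat <- c(1.0, 1.35, 1.7, 2.5)

# Upper quartile of the standard normal; the box spans -q to +q, so its length is 2 * q
q <- qnorm(0.75)

# Upper fence in standard deviations: q + cutat * (2 * q)
fence_sd <- q * (1 + 2 * cutat)

# One-sided probability that a normally distributed value lies below the fence
coverage <- pnorm(fence_sd)

data.frame(cutat, fence_sd, coverage_percent = 100 * coverage)
```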

Value

The return value is the object datalist of class phyloarray, with one attribute added: "cutoff".

Author(s)

Kurt Sys (kurt.sys@advalvas.be)

See Also

Scandataraw, Phylodata, init.data, calcbackground, boxplot.stats, boxplot, fivenum

Examples

  # load the example data; the code below assumes this provides the object 'scans'
  data(Phylodata)

  # show some histograms
  hist(scans$Rsd[,1]/scans$R[,1], nclass=10000, xlim=c(0,5))
  hist(scans$Gsd[,1]/scans$G[,1], nclass=10000, xlim=c(0,1))

  # calculate cutoff values
  scans <- histcutoffs(scans, cutat=2.5)

  # the cutoff values
  attr(scans, "cutoff")

  # which gives the same as
  attributes(scans)$cutoff
