msc.rawMS.read.csv {caMassClass}    R Documentation

Read Protein Mass Spectra from CSV files

Description

Read multiple protein mass spectra (SELDI) files, listed in FileList, from a given directory and combine them into a single data structure. Files are in CSV format, possibly compressed. Data are stored as a matrix, one file per column.

Usage

  msc.rawMS.read.csv(directory=".", FileList="\\.csv", mzXML.record=FALSE)

Arguments

directory a character string giving the name of the directory where all the files can be found. Use "/" slashes in the directory name. The default corresponds to the working directory, getwd().
FileList List of files to read. The list can be given in either of the following formats:
  • single string - a regular expression (see regex) used to select the files to read, for example "\\.csv"
  • list - list of file names to be read
The last format also supports zip and gzip file compression. For example, an individual file name can take one of the following forms:
  • "dir/a.csv" - uncompressed file 'a.csv' in directory 'dir'
  • "dir/b.zip/a.csv" - file 'a.csv' within zipped file 'b.zip'
  • "dir/a.csv.gz" - gzipped individual file
mzXML.record Should an mzXML record be created to store metadata (input file names)?
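
As a rough illustration of the "single string" format, the sketch below applies regular expressions to a vector of made-up file names using base R's grepl. It assumes that file selection behaves like ordinary unanchored regular-expression matching on file names; that is an assumption of this sketch, not a statement about the internals of msc.rawMS.read.csv.

```r
# Hypothetical file names, for illustration only
candidates <- c("IMAC_normal_1.csv", "IMAC_normal_2.csv.gz",
                "notes.txt", "InputFiles.csv")

# "\\.csv" is unanchored, so it also matches names ending in ".csv.gz"
matches_default <- grepl("\\.csv", candidates)

# A more specific pattern keeps only the spectra files
selected <- candidates[grepl("IMAC_normal_.*csv", candidates)]
selected
```

Note that the backslash has to be doubled inside an R string literal so that the regular expression itself receives "\.".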

Details

All files should be in Excel's CSV format (a table in text format: one row per line, comma-delimited columns). Each file is assumed to have two columns. In the case of SELDI data, column 1 (x-axis) is mass/charge (M/Z) and column 2 (y-axis) is spectrum intensity. All files are assumed to have an identical first (M/Z) column.
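
The file layout described above can be sketched with base R alone. The example below writes two small two-column CSV files that share the same M/Z column, reads them back, and binds the intensity columns into an nFeatures x nSamples matrix. All values and file names are made up, and keeping the M/Z values as row names is a choice of this sketch; it is not the caMassClass implementation.

```r
# Illustrative sketch only: base R, not the caMassClass implementation
tmp <- tempfile("csv_sketch")            # hypothetical scratch directory
dir.create(tmp)

mz <- c(1000.5, 1001.0, 1001.5, 1002.0)  # shared M/Z axis (made-up values)
intensity <- list(a = c(1, 4, 9, 2), b = c(0, 3, 8, 5))

# Write one two-column CSV per spectrum: column 1 = M/Z, column 2 = intensity
for (name in names(intensity)) {
  write.csv(data.frame(mz = mz, intensity = intensity[[name]]),
            file = file.path(tmp, paste0(name, ".csv")), row.names = FALSE)
}

# Read the files back and check that every file has the same first column
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
spectra <- lapply(files, read.csv)
stopifnot(all(sapply(spectra, function(s) isTRUE(all.equal(s[[1]], mz)))))

# Bind the intensity columns side by side: one file per column
X_sketch <- sapply(spectra, function(s) s[[2]])
rownames(X_sketch) <- mz
colnames(X_sketch) <- basename(files)
dim(X_sketch)
```

The consistency check before binding mirrors the assumption that all files have an identical M/Z column; without it, rows of the combined matrix would not refer to the same mass/charge value.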

Value

Data structure containing all the data read from the files, in the form of a 2D matrix (nFeatures x nSamples). If mzXML.record was set to TRUE, then an mzXML record with the input file names will be attached to the returned matrix as the "mzXML" attribute.

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

See Also

Examples

  # example of mode "single string" FileList
  directory  = system.file("Test", package = "caMassClass")
  X = msc.rawMS.read.csv(directory, "IMAC_normal_.*csv")
  stopifnot ( dim(X) == c(11883, 20) ) # make sure it is what's expected
  
  # example of explicit 1D FileList
  ProjectFile = file.path(directory,"InputFiles.csv")
  FileList = read.csv(file=ProjectFile, comment.char = "")
  FileList[,3]
  X = msc.rawMS.read.csv(directory, FileList=FileList[,3], mzXML.record=TRUE)
  stopifnot ( dim(X) == c(11883, 20) ) # make sure it is what's expected
  mzXML = attr(X,"mzXML")
  strsplit(mzXML$parentFile, '\n')      # show mzXML$parentFile record
  
  # example using data provided in PROcess package 
  directory  = system.file("Test", package = "PROcess")
  X = msc.rawMS.read.csv(directory) 
  msc.baseline.subtract(X, plot=TRUE) # used here to plot results
  dim(X)

[Package caMassClass version 1.6 Index]