msc.project.read {caMassClass}    R Documentation

Read and Manage a Batch of Protein Mass Spectra

Description

Read and manage a batch of protein mass spectra (SELDI) files. The files may contain multiple spectra taken from the same sample, or spectra from multiple experiments performed on the same sample.

Usage

msc.project.read(ProjectFile, directory.out=NULL) 

Arguments

ProjectFile Path and name of a text file in Excel's CSV format storing information about a batch of Mass Spectra data files. Alternatively, a table equivalent to such a CSV file can be passed directly. See details.
directory.out Optional character vector with the name of the directory where output files will be saved. Use "/" slashes in the directory name. By default, the directory containing ProjectFile and all Mass Spectra files is used; this argument is provided in case that directory is read-only and the user has to choose a different directory.
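
The choice of directory.out described above could be automated along the following lines. This is only a sketch in base R: the helper chooseOutputDir is hypothetical (it is not part of caMassClass), and tempdir() stands in for both the input directory and the fallback location.

```r
# Hypothetical helper: returns NULL (use the default output directory) when
# the input directory is writable, otherwise a fallback path with "/" slashes.
chooseOutputDir = function(inputDir, fallback = tempdir()) {
  # file.access returns 0 on success; mode = 2 tests write permission
  if (file.access(inputDir, mode = 2) == 0) return(NULL)
  normalizePath(fallback, winslash = "/")
}

inputDir = tempdir()                        # stand-in for the project directory
directory.out = chooseOutputDir(inputDir)   # NULL here, since tempdir() is writable
```

The result could then be passed as the second argument of msc.project.read.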

Details

Function msc.project.read allows the user to manage large batches of Mass Spectra files, especially when multiple copies of each sample are present. The ProjectFile contains all the information about the project. An example format might be:

Name, Class, IMAC1, IMAC2, WCX1, WCX2
r0008, 1, Nr/imac_r0008.csv, Nr/imac_r0008(2).csv, Nr/wcx_r0008.csv, Nr/wcx_r0008(2).csv
r0012, 1, Nr/imac_r0012.csv, Nr/imac_r0012(2).csv, Nr/wcx_r0012.csv, Nr/wcx_r0012(2).csv
r0014, 1, Nr/imac_r0014.csv, Nr/imac_r0014(2).csv, Nr/wcx_r0014.csv, Nr/wcx_r0014(2).csv
r0021, 2, Ca/imac_r0021.csv, Ca/imac_r0021(2).csv, Ca/wcx_r0021.csv, Ca/wcx_r0021(2).csv
r0022, 2, Ca/imac_r0022.csv, Ca/imac_r0022(2).csv, Ca/wcx_r0022.csv, Ca/wcx_r0022(2).csv
r0024, 2, Ca/imac_r0024.csv, Ca/imac_r0024(2).csv, Ca/wcx_r0024.csv, Ca/wcx_r0024(2).csv
r0027, 2, Ca/imac_r0027.csv, Ca/imac_r0027(2).csv, Ca/wcx_r0027.csv, Ca/wcx_r0027(2).csv
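
A table with the layout shown above can also be assembled in R rather than typed by hand. The following is a minimal sketch using base R only; the sample names, the helper spectrumFile, and the output file name are assumptions made for illustration, and the spectrum files themselves are not created.

```r
# Hypothetical sample names and class labels, mirroring the layout above
SampleName = c("r0008", "r0012", "r0021", "r0022")
Class      = c(1, 1, 2, 2)
Folder     = ifelse(Class == 1, "Nr", "Ca")   # one subdirectory per class

# Hypothetical helper that builds one column of spectrum file names
spectrumFile = function(prefix, suffix = "")
  paste0(Folder, "/", prefix, "_", SampleName, suffix, ".csv")

# cbind of character and numeric vectors yields a character matrix
ProjectTable = cbind(
  Name  = SampleName,
  Class = Class,
  IMAC1 = spectrumFile("imac"),
  IMAC2 = spectrumFile("imac", "(2)"),
  WCX1  = spectrumFile("wcx"),
  WCX2  = spectrumFile("wcx", "(2)"))
print(ProjectTable)

# Save the table as a CSV project file (file name is an assumption)
ProjectFile = file.path(tempdir(), "InputFiles.csv")
write.csv(ProjectTable, ProjectFile, row.names = FALSE, quote = FALSE)
```

Either ProjectFile or ProjectTable could then serve as the first argument of msc.project.read, provided the listed spectrum files exist.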

ProjectFile always has the following format:

Files named in ProjectFile may be compressed using zip or gzip file compression. They can be stored in either CSV or mzXML file format. For example if individual file name is in the format:

Value

List of .Rdata files storing the data that was just read. Each file contains either 2D data (if only one copy of the data existed) or 3D data (if multiple copies of the data existed). Multiple files are produced if multiple experiments were performed under different conditions. In the example above, two files will be produced: Data_IMAC.Rdata and Data_WCX.Rdata.
Each file will contain the following objects: X, SampleLabels, mzXML. At the moment, metadata related to the individual scans (stored in mzXML$scan) is stored only for single-copy data.
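
One way to inspect such a file without overwriting objects already in the workspace is to load it into a separate environment. The sketch below uses base R only. So that it runs without caMassClass, it first writes a stand-in file with made-up contents; the array sizes and the NULL mzXML object are assumptions, and the dimension order (features, samples, copies) follows the example in the Examples section.

```r
# Stand-in for a file returned by msc.project.read (contents are made up)
X = array(rnorm(5 * 4 * 2), dim = c(5, 4, 2))   # features x samples x copies
SampleLabels = c(1, 1, 2, 2)
mzXML = NULL
FileName = file.path(tempdir(), "Data_IMAC.Rdata")
save(X, SampleLabels, mzXML, file = FileName)

# Load into a separate environment; load() returns the object names
env = new.env()
loaded = load(FileName, envir = env)
print(loaded)

# 3D data means multiple copies per sample, 2D data means a single copy
nCopies = if (length(dim(env$X)) == 3) dim(env$X)[3] else 1
cat("Number of copies per sample:", nCopies, "\n")
```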

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

See Also

Examples

  #================================================
  # test reading project file with only CSV files
  #================================================
  # find name of example project file
  directory = system.file("Test", package = "caMassClass") # input directory 
  ProjectFile = file.path(directory,"InputFiles.csv")      # full name 
  # read & save the project data 
  FileName1 = msc.project.read(ProjectFile, '.')
  cat("File ",FileName1," was created\n")
  # load and inspect the project data 
  load(FileName1)
  stopifnot( dim(X)==c(11883,20,2) )    # make sure it is what's expected
  strsplit(mzXML$parentFile, '\n')      # show mzXML$parentFile record
  X1 = X                                # make a copy of X for future use
  
  #================================================
  # test reading project file with mzXML files
  #================================================
  # save X in mzXML format
  msc.rawMS.write.mzXML(X, "rawMS32.mzXML",  precision="32") # save X as mzXML
  # create new project table
  ProjectTable = c(colnames(X), SampleLabels, paste("rawMS32.mzXML", 1:40, sep='/'))
  dim(ProjectTable) = c(20,4)
  colnames(ProjectTable) = c("SampleName", "Class", "Temp1", "Temp2")
  print(ProjectTable)
  # read & save the project data 
  FileName2 = msc.project.read(ProjectTable, '.')
  cat("File ",FileName2," was created\n")
  # compare results
  load(FileName2) # load data: X & SampleLabels
  stopifnot(max(abs(X-X1))<1e-5)
  file.remove(FileName2) # delete temporary files 

[Package caMassClass version 1.6 Index]