read.mzXML & write.mzXML {caMassClass}    R Documentation

Read and Write mzXML Files

Description

Read and write protein mass spectra data to/from mzXML files.

Usage

  mzXML = new.mzXML()
  mzXML = read.mzXML(filename)
  write.mzXML(mzXML, filename, precision=c('32', '64')) 

Arguments

mzXML object of class mzXML storing partially parsed mzXML data
filename character string with the name of the file (connection)
precision precision used when saving scan data: store each double (floating point) array using 32 or 64 bits
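
To make the trade-off behind the precision argument concrete, here is a minimal base R sketch of what storing doubles in 32 versus 64 bits does to the values. It is an illustration only and does not reproduce the internals of write.mzXML; the helper name roundtrip and the sample values are assumptions made up for this example.

```r
# Hypothetical m/z values with more digits than a 32-bit float can hold
x <- c(1000.123456789, 2500.987654321)

# Illustrative helper (not part of caMassClass): write a numeric vector to
# raw bytes using `size` bytes per value, then read it back
roundtrip <- function(v, size) {
  bytes <- writeBin(v, raw(), size = size, endian = "big")
  readBin(bytes, what = "double", n = length(v), size = size, endian = "big")
}

x32 <- roundtrip(x, 4)  # 32-bit storage: values are rounded
x64 <- roundtrip(x, 8)  # 64-bit storage: values are preserved exactly

err32 <- max(abs(x - x32))  # small but non-zero rounding error
err64 <- max(abs(x - x64))  # zero
```

Storing 32 bits per value halves the size of the scan data at the cost of a small rounding error, which may or may not matter depending on the mass accuracy of the instrument.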

Details

The main task of the read.mzXML and write.mzXML functions is to extract and save the scan data of mzXML files. In addition, an attempt is made to keep all other sections of the mzXML file as unparsed XML code, so the data can be extracted later or saved into new mzXML files. Those unparsed sections are stored as XML text.

Value

Function read.mzXML returns object of type mzXML, containing:

scan List of mass spectra scans. Each element of the list contains the following elements:
  • peaks - intensities or peaks of the scan
  • mass - masses (m/z) corresponding to peaks. Vectors mass and peaks have the same length.
  • num - scan number
  • parentNum - scan number of parent scan in case of recursively stored scans (msLevel>1)
  • msLevel - 1 means MS scan, 2 means MS/MS scan, etc.
  • header - XML code of the <scan> header, which might contain other useful attributes
  • maldi - optional - acquisition dependent properties of a MALDI experiment
  • scanOrigin - optional - name of parent file(?)
  • precursorMz - optional - information about the precursor ion
  • nameValue - optional - properties of the scan not included elsewhere
All optional elements contain unparsed XML code if the corresponding sections are present, or NULL otherwise. See the mzXML schema and documentation for more details.
header Stores header of <mzXML> section containing information about namespace and schema file location.
msInstrument General information about the MS instrument. Stored as XML.
parentFile Path to all the ancestor files. Stored as XML.
dataProcessing Description of any data manipulation. Stored as XML.
separation Information about the separation technique. Stored as XML.
spotting Acquisition independent properties of a MALDI experiment. Stored as XML.
indexOffset Offset of the index element. Either 0 or a vector.


Function new.mzXML returns an object with the same structure as the one returned by read.mzXML, but with all fields equal to NULL. Function write.mzXML does not return anything.
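
As an illustration of how the scan list described above can be used, the following sketch builds a one-row-per-scan summary table. To stay self-contained it uses a small hand-made list with the documented element names (num, msLevel, mass, peaks) as a stand-in for the output of read.mzXML; the values and the names mz and scan_summary are made up for this example.

```r
# Hypothetical stand-in for an object returned by read.mzXML:
# only the documented scan elements used below are filled in
mz <- list(scan = list(
  list(num = 1, msLevel = 1, mass = c(100, 200, 300), peaks = c(5, 50, 10)),
  list(num = 2, msLevel = 2, mass = c(110, 150),      peaks = c(7, 3))
))

# One row per scan: scan number, MS level, number of points,
# and the m/z value at which the intensity is highest
scan_summary <- do.call(rbind, lapply(mz$scan, function(s) {
  data.frame(num        = s$num,
             msLevel    = s$msLevel,
             nPeaks     = length(s$peaks),
             basePeakMz = s$mass[which.max(s$peaks)])
}))

scan_summary
```

With a real file, replacing mz by the result of read.mzXML(filename) should give the same kind of table, assuming every scan carries the elements listed above.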

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

References

Definition of mzXML format: http://tools.proteomecenter.org/mzXMLschema.php

Documentation of mzXML format: http://sashimi.sourceforge.net/schema_revision/mzXML_2.1/Doc/mzXML_2.1_tutorial.pdf

More Documentation of mzXML format: http://sashimi.sourceforge.net/software_glossolalia.html

ReadmzXML software: http://tools.proteomecenter.org/readmzXML.php

See Also

For reading XML files, see xmlTreeParse from the XML package.

Another R function related to the mzXML format: xcmsRaw from the xcms Bioconductor package.

Examples

  directory = system.file("Test", package = "caMassClass")
  FileName = file.path(directory,"test1.xml")
  xml = read.mzXML(FileName)
  xml
  
  # test reading/writing
  write.mzXML(xml, "temp.xml")
  xml2 = read.mzXML("temp.xml")
  file.remove("temp.xml")
  stopifnot(all(xml$scan[[1]]$peaks == xml2$scan[[1]]$peaks))
  stopifnot(xml$msInstrument == xml2$msInstrument)
  
  # extracting scan data from the output
  FileName = file.path(directory,"test2.xml")
  xml  = read.mzXML(FileName)
  plot(xml$scan[[1]]$mass, xml$scan[[1]]$peaks, type="l")
  
  # extracting data from unparsed sections
  # (xmlTreeParse, xmlRoot, xmlName and xmlAttrs come from the XML package)
  library(XML)
  tree = xmlTreeParse(xml$msInstrument, asText=TRUE, asTree=TRUE)
  x = xmlRoot(tree)
  xmlName(x)
  xmlAttrs(x[["msManufacturer"]])["value"]
  xmlAttrs(x[["software"]])

[Package caMassClass version 1.6 Index]