thermo {CHNOSZ}    R Documentation

Thermodynamic Properties and Compositional Data

Description

This data object holds the thermodynamic database of properties of species, along with operational parameters for CHNOSZ, the properties of elements, references to sources of thermodynamic and compositional data, compositions of chemical activity buffers, amino acid compositions of proteins, and miscellaneous other data taken from the literature. The thermo object also holds intermediate data used in calculations, in particular the definitions of basis species and species of interest input by the user, and the properties of water so that subsequent calculations at the same temperature-pressure conditions can be accelerated.

The thermo object is a list composed of data.frames or lists, each representing a class of data. The object is created when the package is loaded (by calling data(thermo) from within the .First.lib function), using the *.csv files in the data directory of the package. thermo is globally accessible; i.e., it is present in the user's environment. After loading CHNOSZ you may run ls() to verify that thermo is present, or type thermo to print the entire contents of the object on the screen. The various elements of the thermo object can be accessed using R's subsetting operators; for example, typing thermo$opt at the command line displays the current list of operational parameters (some of which can be altered using functions dedicated to this purpose; see e.g. nuts).
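The subsetting operators mentioned above can be tried without the package on a small stand-in list. The following is only a sketch: thermo_demo and its values are hypothetical placeholders that mimic the list-of-data.frames layout, and the real thermo object holds many more elements and columns.

```r
# Hypothetical miniature stand-in for the thermo object (assumption:
# the values below are placeholders, not data from the package).
thermo_demo <- list(
  opt = list(level = 1),
  obigt = data.frame(
    name = c("water", "Al+3"),
    state = c("liq", "aq"),
    stringsAsFactors = FALSE
  )
)

# names of the top-level elements; each one is a class of data
element_names <- names(thermo_demo)
# `$` extracts one element, here the list of operational parameters
demo_opt <- thermo_demo$opt
# `[[` extracts the same kind of element using a character string
demo_obigt <- thermo_demo[["obigt"]]
# a data.frame element can be subset further by column
demo_states <- thermo_demo$obigt$state

print(element_names)
print(demo_states)
```

With CHNOSZ loaded, the same operators apply to thermo itself, e.g. names(thermo) or thermo$obigt$state.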

To make persistent additions or changes to the thermodynamic database of your installation, including compositions of proteins, first locate the installation directory of the package. This will be different depending on your operating system and type of R installation, but is something like /usr/lib/R/library/CHNOSZ, /Volumes/Macintosh HD/Library/Frameworks/R.framework/resources/library/CHNOSZ, C:\Program Files\R\R-2.10.0\library\CHNOSZ, or C:\Users\[User Name]\Documents\R\win-library\2.10\CHNOSZ on Linux, Mac and Windows (XP and Vista) systems, respectively. To find the exact location of this directory on your system, use the command system.file(package="CHNOSZ"). Inside the data directory of the installation directory of the package are the .csv files that can be edited with a spreadsheet program. Edit and save the OBIGT.csv and/or protein.csv files as desired. The next time you start an R session, the new data will be available.
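A minimal sketch of locating the editable files from within R is given here. It assumes the directory layout described above (a data directory containing the .csv files); the variable names pkg_dir, data_dir and csv_files are illustrative only.

```r
# Installation directory of the package; system.file() returns ""
# if CHNOSZ is not installed in the current library paths.
pkg_dir <- system.file(package = "CHNOSZ")

if (nzchar(pkg_dir)) {
  # assumed layout: the .csv files are in the "data" subdirectory
  data_dir <- file.path(pkg_dir, "data")
  csv_files <- list.files(data_dir, pattern = "\\.csv$")
  print(data_dir)
  print(csv_files)
} else {
  csv_files <- character(0)
  message("CHNOSZ does not appear to be installed in this R library")
}
```

Because reinstalling or upgrading the package replaces the files in this directory, it may be prudent to keep a copy of any edited .csv file elsewhere.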

Functions are available to interactively update the thermodynamic database or definitions of buffers (mod.obigt and mod.buffer, respectively; a function named change serves as a wrapper to both of these). Changes made using these functions, as well as any interactive definitions of basis species and species of interest, are lost when the current session is closed without saving or if the thermo object is reinitialized by the command data(thermo).

Usage

data(thermo)

Format

References

Amend, J. P. and Helgeson, H. C., 1997a. Group additivity equations of state for calculating the standard molal thermodynamic properties of aqueous organic species at elevated temperatures and pressures. Geochim. Cosmochim. Acta, 61, 11-46. http://dx.doi.org/10.1016/S0016-7037(96)00306-7

Amend, J. P. and Helgeson, H. C., 1997b. Calculation of the standard molal thermodynamic properties of aqueous biomolecules at elevated temperatures and pressures. Part 1. L-alpha-amino acids. J. Chem. Soc., Faraday Trans., 93, 1927-1941. http://dx.doi.org/10.1039/a608126f

Cox, J. D., Wagman, D. D. and Medvedev, V. A., eds., 1989. CODATA Key Values for Thermodynamics. Hemisphere Publishing Corporation, New York, 271 p. http://www.worldcat.org/oclc/18559968

Dick, J. M., LaRowe, D. E. and Helgeson, H. C., 2006. Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences, 3, 311-336. http://www.biogeosciences.net/3/311/2006/bg-3-311-2006.html

Gattiker, A., Michoud, K., Rivoire, C., Auchincloss, A. H., Coudert, E., Lima, T., Kersey, P., Pagni, M., Sigrist, C. J. A., Lachaize, C., Veuthey, A.-L., Gasteiger, E. and Bairoch, A., 2003. Automatic annotation of microbial proteomes in Swiss-Prot. Comput. Biol. Chem., 27, 49-58. http://dx.doi.org/10.1016/S1476-9271(02)00094-4

Ghaemmaghami, S., Huh, W., Bower, K., Howson, R. W., Belle, A., Dephoure, N., O'Shea, E. K. and Weissman, J. S., 2003. Global analysis of protein expression in yeast. Nature, 425, 737-741. http://dx.doi.org/10.1038/nature02046

HAMAP system. HAMAP FTP directory, ftp://ftp.expasy.org/databases/hamap/, accessed on 2007-12-20.

Huh, W. K., Falvo, J. V., Gerke, L. C., Carroll, A. S., Howson, R. W., Weissman, J. S. and O'Shea, E. K., 2003. Global analysis of protein localization in budding yeast. Nature, 425, 686-691. http://dx.doi.org/10.1038/nature02026

Johnson, J. W., Oelkers, E. H. and Helgeson, H. C., 1992. SUPCRT92: A software package for calculating the standard molal thermodynamic properties of minerals, gases, aqueous species, and reactions from 1 to 5000 bar and 0 to 1000 degrees C. Comp. Geosci., 18, 899-947. http://dx.doi.org/10.1016/0098-3004(92)90029-Q

Privalov, P. L. and Makhatadze, G. I., 1990. Heat capacity of proteins. II. Partial molar heat capacity of the unfolded polypeptide chain of proteins: Protein unfolding effects. J. Mol. Biol., 213, 385-391. http://dx.doi.org/10.1016/S0022-2836(05)80198-6

Robie, R. A. and Hemingway, B. S., 1995. Thermodynamic Properties of Minerals and Related Substances at 298.15 K and 1 Bar (10^5 Pascals) Pressure and at Higher Temperatures. U. S. Geol. Surv., Bull. 2131, 461 p. http://www.worldcat.org/oclc/32590140

Roxby, R. and Tanford, C., 1971. Hydrogen ion titration curve of lysozyme in 6 M guanidine hydrochloride. Biochemistry, 10, 3348-3352. http://dx.doi.org/10.1021/bi00794a005

SGD project. Saccharomyces Genome Database, http://www.yeastgenome.org, accessed on 2008-08-04.

Shock, E. L. and Koretsky, C. M., 1995. Metal-organic complexes in geochemical processes: Estimation of standard partial molal thermodynamic properties of aqueous complexes between metal cations and monovalent organic acid ligands at high pressures and temperatures. Geochim. Cosmochim. Acta, 59, 1497-1532. http://dx.doi.org/10.1016/0016-7037(95)00058-8

Shock, E. L., Oelkers, E. H., Johnson, J. W., Sverjensky, D. A. and Helgeson, H. C., 1992. Calculation of the thermodynamic properties of aqueous species at high pressures and temperatures: Effective electrostatic radii, dissociation constants and standard partial molal properties to 1000 degrees C and 5 kbar. J. Chem. Soc. Faraday Trans., 88, 803-826. http://dx.doi.org/10.1039/FT9928800803

Shock, E. L. et al., 1998. slop98.dat (computer data file). http://geopig.asu.edu/supcrt92_data/slop98.dat, accessed on 2005-11-05.

Wagman, D. D., Evans, W. H., Parker, V. B., Schumm, R. H., Halow, I., Bailey, S. M., Churney, K. L. and Nuttall, R. L., 1982. The NBS tables of chemical thermodynamic properties. Selected values for inorganic and C1 and C2 organic substances in SI units. J. Phys. Chem. Ref. Data, 11 (supp. 2), 1-392. http://www.nist.gov/srd/PDFfiles/jpcrdS2Vol11.pdf

YeastGFP project. Yeast GFP Fusion Localization Database, http://yeastgfp.ucsf.edu, accessed on 2007-02-01. Current location: http://yeastgfp.yeastgenome.org

See Also

add.protein and add.obigt for adding data from local .csv files.

Examples

  
  ## exploring thermo$obigt
  # what physical states there are
  unique(thermo$obigt$state)
  # formulas of ten random species
  n <- nrow(thermo$obigt)
  thermo$obigt$formula[sample(n, 10)]

  ## cross-checking sources
  # the reference sources
  ref.source <- thermo$source$source
  # only take those that aren't journal abbreviations
  ref.source <- ref.source[!grepl('_',ref.source,fixed=TRUE)]
  # sources of elemental data
  element.source <- thermo$element$source
  # primary sources in thermodynamic database
  obigt.source1 <- thermo$obigt$source1
  # secondary sources; some are NA
  obigt.source2 <- 
    thermo$obigt$source2[!is.na(thermo$obigt$source2)]
  # sources of protein compositions
  protein.source <- thermo$protein$source
  # sources of stress response proteins
  stress.source <- as.character(thermo$stress[2,])
  # if the sources are all accounted for 
  # these all produce character(0)
  element.source[!(element.source %in% ref.source)]
  obigt.source1[!(obigt.source1 %in% ref.source)]
  obigt.source2[!(obigt.source2 %in% ref.source)]
  protein.source[!(protein.source %in% ref.source)]
  stress.source[!(stress.source %in% ref.source)]
  # determine if all the reference sources are cited
  my.source <- c(element.source,obigt.source1,
    obigt.source2,protein.source,stress.source)
  # this should produce character(0)
  ref.source[!(ref.source %in% my.source)]

  ## make a table of duplicated species
  name <- thermo$obigt$name
  state <- thermo$obigt$state
  source <- thermo$obigt$source1
  species <- paste(name,state)
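  # each duplicated name/state combination, listed once
  dups <- unique(species[duplicated(species)])
  id <- numeric()
  # seq_along() skips the loop if there are no duplicates
  for(i in seq_along(dups)) id <- c(id,which(species %in% dups[i]))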
  data.frame(name=name[id],state=state[id],source=source[id])

  ## accessing duplicated species
  # using info()
  i <- info("Al+3","aq")  # length = 2
  i1 <- info("Al+3")      # length = 1
  stopifnot(i1==i[1])
  thermo$opt$level <- 2
  i2 <- info("Al+3")      # length = 1
  stopifnot(i2==i[2])
  # using subcrt()
  subcrt("Al+3","aq")  # always uses the first species
  subcrt("Al+3")  # pays attention to thermo$opt$level
  # .. showing the energetic differences between the
  # duplicated species
  subcrt(i,c(-1,1))
  # using basis()
  basis(c("Al+3","H+","H2"))  
  thermo$opt$level <- 1
  basis(c("Al+3","H+","H2"))  
  # using species()
  species("Al+3","aq")  # listens to thermo$opt$level
  thermo$opt$level <- 2
  species("Al+3")  # also listens to thermo$opt$level
  species(delete=TRUE)
  species(rev(i)) # can also use the species indices
  # see also aluminum speciation example in diagram() help page

[Package CHNOSZ version 0.9 Index]