EXP {seqinr} | R Documentation |
Vectors of coefficients to compute linear forms.
Description
This dataset is used to compute linear forms on codon frequencies:
if freq is a vector of codon frequencies, then
drop(freq %*% EXP$CG3) returns, for instance, the G+C content
in third codon positions. Base order is the lexical order: a,
c, g, t (or u).
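As a minimal sketch of the idea, the following builds the 64 codons in lexical order and a hand-made coefficient vector with the same layout as the CG3 entry listed under Format. The names bases, grid, codons and cg3, and the uniform toy frequencies, are illustrative assumptions and are not taken from the dataset itself.

```r
# Enumerate the 64 codons in lexical order (a, c, g, t);
# expand.grid varies its first column fastest, so p3 (third base) cycles first.
bases <- c("a", "c", "g", "t")
grid <- expand.grid(p3 = bases, p2 = bases, p1 = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$p1, grid$p2, grid$p3)

# Hand-built stand-in for EXP$CG3 (assumption): 1 where the third base is c or g.
cg3 <- as.numeric(grid$p3 %in% c("c", "g"))

# Toy codon frequencies (assumption): uniform usage over the 64 codons.
freq <- rep(1 / 64, 64)
names(freq) <- codons

# The linear form: G+C content in third codon positions.
gc3 <- drop(freq %*% cg3)
```

With real data, freq would be the observed codon frequencies of a coding sequence, ordered in the same lexical order.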
Usage
data(EXP)
Format
List of 24 vectors of coefficients
- A
- num [1:4] 1 0 0 0
- A3
- num [1:64] 1 0 0 0 1 0 0 0 1 0 ...
- AGZ
- num [1:64] 0 0 0 0 0 0 0 0 1 0 ...
- ARG
- num [1:64] 0 0 0 0 0 0 0 0 1 0 ...
- AU3
- num [1:64] 1 0 0 1 1 0 0 1 1 0 ...
- BC
- num [1:64] 0 1 0 0 0 0 0 0 0 0 ...
- C
- num [1:4] 0 1 0 0
- C3
- num [1:64] 0 1 0 0 0 1 0 0 0 1 ...
- CAI
- num [1:64] 0.00 0.00 -1.37 -2.98 -2.58 ...
- CG
- num [1:4] 0 1 1 0
- CG1
- num [1:64] 0 0 0 0 0 0 0 0 0 0 ...
- CG12
- num [1:64] 0 0 0 0 0.5 0.5 0.5 0.5 0.5 0.5 ...
- CG2
- num [1:64] 0 0 0 0 1 1 1 1 1 1 ...
- CG3
- num [1:64] 0 1 1 0 0 1 1 0 0 1 ...
- CGN
- num [1:64] 0 0 0 0 0 0 0 0 0 0 ...
- F1
- num [1:64] 1.026 0.239 1.026 0.239 -0.097 ...
- G
- num [1:4] 0 0 1 0
- G3
- num [1:64] 0 0 1 0 0 0 1 0 0 0 ...
- KD
- num [1:64] -3.9 -3.5 -3.9 -3.5 -0.7 -0.7 -0.7 -0.7 -4.5 -0.8 ...
- Q
- num [1:64] 0 0 0 0 1 1 1 1 0 0 ...
- QA3
- num [1:64] 0 0 0 0 1 0 0 0 0 0 ...
- QC3
- num [1:64] 0 0 0 0 0 1 0 0 0 0 ...
- U
- num [1:4] 0 0 0 1
- U3
- num [1:64] 0 0 0 1 0 0 0 1 0 0 ...
Details
It is better to work directly at the amino-acid level
when computing linear forms on amino-acid frequencies, so as to have
a single coefficient vector. For instance, using EXP$KD to compute the Kyte
and Doolittle hydropathy index from codon frequencies is valid only
for the standard genetic code.
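To illustrate working at the amino-acid level, here is a minimal sketch that applies a single 20-element coefficient vector to amino-acid frequencies. The vector kd is typed in by hand from the published Kyte and Doolittle scale, and pep and aafreq describe a made-up peptide; none of these objects come from the EXP dataset.

```r
# Kyte and Doolittle hydropathy values, one per amino acid (one-letter codes).
# Hand-typed for illustration (assumption): not read from EXP.
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5, Q = -3.5, E = -3.5,
        G = -0.4, H = -3.2, I = 4.5, L = 3.8, K = -3.9, M = 1.9, F = 2.8,
        P = -1.6, S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)

# Toy peptide (assumption) and its amino-acid frequencies,
# tabulated in the same order as the names of kd.
pep <- strsplit("MKTLLVAG", "")[[1]]
aafreq <- as.numeric(table(factor(pep, levels = names(kd)))) / length(pep)

# Linear form on amino-acid frequencies: mean hydropathy of the peptide.
hydropathy <- drop(aafreq %*% kd)
```

Because no codon-to-amino-acid mapping is involved, the same coefficient vector applies whatever the genetic code.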
An alternative to drop(freq %*% EXP$CG3) is sum(freq * EXP$CG3),
but this is less efficient in terms of CPU
time. The advantage of the latter, however, is that thanks to
recycling rules you can use either sum(freq * EXP$A)
or sum(freq * EXP$A3). To do the same with the %*%
operator you have to make the recycling explicit, as in
drop(freq %*% rep(EXP$A, 16)).
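The equivalence described above can be checked with a short sketch. The vectors a and a3 are hand-built stand-ins for EXP$A and EXP$A3, following the values listed under Format, and freq is a random toy frequency vector; all three are assumptions made for illustration.

```r
# Hand-built stand-ins (assumption) for EXP$A (length 4) and EXP$A3 (length 64).
a  <- c(1, 0, 0, 0)
a3 <- rep(a, 16)

# Toy codon frequencies (assumption): random values normalised to sum to 1.
set.seed(1)
freq <- runif(64)
freq <- freq / sum(freq)

via_matrix    <- drop(freq %*% a3)          # matrix product with the long vector
via_sum_long  <- sum(freq * a3)             # elementwise product, long vector
via_sum_short <- sum(freq * a)              # a is recycled 16 times by `*`
via_rep       <- drop(freq %*% rep(a, 16))  # recycling made explicit for %*%
```

All four expressions compute the same quantity; only the elementwise form accepts the short vector as it stands.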
Source
ANALSEQ EXPFILEs for command EXP.
http://biomserv.univ-lyon1.fr/doclogi/docanals/manuel.html
References
- A
- content in A nucleotide
- A3
- content in A nucleotide in third position of codon
- AGZ
- Arg content (aga and agg codons)
- ARG
- Arg content
- AU3
- content in A and U nucleotides in third position of codon
- BC
- Good choice (Bon choix). Gouy, M., Gautier, C. (1982)
Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Research, 10(22):7055-7074.
- C
- content in C nucleotides
- C3
- content in C nucleotides in third position of codon
- CAI
- Codon adaptation index for E. coli. Sharp, P.M., Li, W.-H. (1987) The codon adaptation index -
a measure of directional synonymous codon usage bias, and its potential
applications. Nucleic Acids Research, 15:1281-1295.
- CG
- content in G + C nucleotides
- CG1
- content in G + C nucleotides in first position of codon
- CG12
- content in G + C nucleotides in first and second position of codon
- CG2
- content in G + C nucleotides in second position of codon
- CG3
- content in G + C nucleotides in third position of codon
- CGN
- content in CGA + CGC + CGG + CGU
- F1
- From Table 2 in Lobry, J.R., Gautier, C. (1994) Hydrophobicity,
expressivity and aromaticity are the major trends of amino-acid usage in
999 Escherichia coli chromosome-encoded genes. Nucleic Acids
Research, 22:3174-3180.
- G3
- content in G nucleotides in third position of codon
- KD
- Kyte, J., Doolittle, R.F. (1982) A simple method for displaying
the hydropathic character of a protein. J. Mol. Biol., 157:105-132.
- Q
- content in quartet
- QA3
- content in quartet with the A nucleotide in third position
- QC3
- content in quartet with the C nucleotide in third position
- U
- content in U nucleotide
- U3
- content in U nucleotides in third position of codon
For an overview of seqinR's functionality, please consult this vignette:
Charif, D., Lobry, J.R. (2005) SeqinR: a contributed package to the R project for statistical
computing devoted to biological sequences retrieval and analysis. Springer Verlag, Biological and Medical Physics/Biomedical Series, in preparation.
Examples
data(EXP)
[Package seqinr version 1.0-3 Index]