dist.dna {ape} | R Documentation |
These functions compute a matrix of pairwise distances from DNA sequences using a model of DNA evolution. Five models are currently available.
dist.dna(x, y = NULL, variance = FALSE, gamma = NULL, method = "Kimura", basefreq = NULL, GCcontent = NULL)
dist.dna.JukesCantor(x, y, variance = FALSE, gamma = NULL)
dist.dna.TajimaNei(x, y, variance = FALSE, basefreq = NULL)
dist.dna.Kimura(x, y, variance = FALSE, gamma = NULL)
dist.dna.Tamura(x, y, variance = FALSE, GCcontent = NULL)
dist.dna.TamuraNei(x, y, variance = FALSE, basefreq = NULL, gamma = NULL)
x |
either a vector with a single DNA sequence, or a matrix of
DNA sequences, or a list of DNA sequences (the latter can be taken
from, e.g., read.GenBank ). |
y |
a vector with a single DNA sequence. |
gamma |
a value for the gamma parameter which is possibly used to
apply a gamma correction to the distances (by default gamma =
NULL so no correction is applied). |
variance |
a logical indicating whether to compute the variances
of the distances; defaults to FALSE so the variances are not
computed. |
method |
a character string specifying the method used to compute
the distance. Currently five choices are possible: "JukesCantor",
"TajimaNei", "Kimura" (the default), "Tamura",
and "TamuraNei". |
basefreq |
the base frequencies to be used in the computations
(if applicable, i.e. if method = "TajimaNei" or "TamuraNei"). By default, the
base frequencies are computed from the whole sample of sequences. |
GCcontent |
the content in G+C to be used in the computations
(if applicable, i.e. if method = "Tamura" ). By default, this
percentage is computed from the whole sample of sequences. |
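The effect of the gamma argument can be illustrated with the Jukes-Cantor model. The sketch below uses base R only and does not call ape; jc_distance is a hypothetical helper, and the gamma-corrected expression is the usual textbook form, assumed here rather than confirmed to be what dist.dna.JukesCantor computes. Sequences are assumed to be aligned character vectors of equal length.

```r
# Hypothetical helper (not part of ape): Jukes-Cantor distance between two
# aligned sequences given as character vectors of equal length.
jc_distance <- function(s1, s2, gamma = NULL) {
  stopifnot(length(s1) == length(s2))
  p <- mean(s1 != s2)                 # proportion of differing sites
  if (is.null(gamma)) {
    -0.75 * log(1 - 4 * p / 3)        # no gamma correction
  } else {
    # gamma-corrected form, with 'gamma' taken as the shape parameter
    0.75 * gamma * ((1 - 4 * p / 3)^(-1 / gamma) - 1)
  }
}

s1 <- strsplit("acgtacgtac", "")[[1]]
s2 <- strsplit("acgtacgtta", "")[[1]]

d_plain <- jc_distance(s1, s2)             # gamma = NULL, no correction
d_gamma <- jc_distance(s1, s2, gamma = 1)  # correction with shape 1
```

In this sketch the corrected distance is larger than the uncorrected one, since rate variation among sites hides more of the substitutions that actually occurred.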
For the function dist.dna, if the argument y is specified, then it is
bound to x, and the distances between all columns of the resulting
matrix are computed; otherwise, x must be a matrix or a list. The five
other functions take two single sequences as arguments.
The function dist.dna actually calls one of the other functions,
depending on the argument method (by default "Kimura"), and passes on
only the arguments relevant to that model. For instance, specifying a
value for the option basefreq has no effect if the option method is set
to "Kimura" or "JukesCantor" (the base frequencies are assumed to be
equal to 0.25 in both models).
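As an illustration of the default model, the following sketch computes a Kimura two-parameter distance from the proportions of transitions and transversions, using the formula of Kimura (1980). It is base R only; k80_distance is a hypothetical helper, not the code of dist.dna.Kimura, and it assumes aligned lower-case sequences containing only a, c, g and t.

```r
# Hypothetical helper (not part of ape): Kimura (1980) two-parameter distance
# between two aligned sequences given as character vectors.
k80_distance <- function(s1, s2) {
  stopifnot(length(s1) == length(s2))
  purines <- c("a", "g")
  differ <- s1 != s2
  # TRUE where both bases are purines or both are pyrimidines
  same_class <- (s1 %in% purines) == (s2 %in% purines)
  P <- mean(differ & same_class)    # proportion of transitions
  Q <- mean(differ & !same_class)   # proportion of transversions
  -0.5 * log(1 - 2 * P - Q) - 0.25 * log(1 - 2 * Q)
}

s1 <- strsplit("aaaaaaaaaa", "")[[1]]
s2 <- strsplit("gaaaaaaaac", "")[[1]]   # one transition, one transversion
d_k80 <- k80_distance(s1, s2)
```

No base frequencies enter the computation, which is consistent with basefreq having no effect under this model.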
The molecular evolutionary models available through the option method
have been extensively described in the literature. A brief description
is given below; more details can be found in the References.
If the distances are computed with the function dist.dna and no base
frequencies are given (basefreq = NULL), then they are computed from
the whole set of vectors, matrix, or list given as argument. If the
distances are computed with the function dist.dna.TajimaNei and no base
frequencies are given, then they are computed from both vectors given
as argument.
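The difference between the two defaults can be sketched as follows. The helper base_freq is hypothetical and written in base R; it only illustrates that frequencies pooled over a whole sample generally differ from those pooled over a single pair of sequences, and it assumes lower-case bases (any other symbol is ignored).

```r
# Hypothetical helper (not part of ape): pooled base frequencies over one or
# more sequences, each a character vector of lower-case bases.
base_freq <- function(seqs) {
  bases <- unlist(seqs)
  counts <- table(factor(bases, levels = c("a", "c", "g", "t")))
  freq <- as.numeric(counts) / sum(counts)
  names(freq) <- names(counts)
  freq
}

seqs <- list(
  s1 = strsplit("aacg", "")[[1]],
  s2 = strsplit("aatt", "")[[1]],
  s3 = strsplit("ggcc", "")[[1]]
)

whole_sample <- base_freq(seqs)        # pooled over all three sequences
pair_only    <- base_freq(seqs[1:2])   # pooled over the first two only
```

With these toy sequences the two estimates differ, so a distance that depends on base frequencies may change according to which function is called.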
a numeric matrix, possibly with the names of the individuals (as given
by the rownames of the argument x) as colnames and rownames (if
variance = FALSE, the default), or a list of two matrices named
distances and variance, respectively (if variance = TRUE).
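The shape of the returned value can be sketched in base R. The function jc_matrix below is hypothetical and uses the Jukes-Cantor distance with its usual large-sample variance; it mirrors the structure described above (a matrix, or a list with components distances and variance) but is not the implementation used by dist.dna.

```r
# Hypothetical helper (not part of ape): pairwise Jukes-Cantor distances for a
# named list of aligned sequences (character vectors of equal length).
jc_matrix <- function(seqs, variance = FALSE) {
  n <- length(seqs)
  d <- v <- matrix(0, n, n, dimnames = list(names(seqs), names(seqs)))
  for (i in seq_len(n)) for (j in seq_len(n)) {
    if (i == j) next                     # diagonal stays at zero
    p <- mean(seqs[[i]] != seqs[[j]])    # proportion of differing sites
    L <- length(seqs[[i]])               # number of sites compared
    d[i, j] <- -0.75 * log(1 - 4 * p / 3)
    v[i, j] <- p * (1 - p) / (L * (1 - 4 * p / 3)^2)
  }
  if (variance) list(distances = d, variance = v) else d
}

seqs <- list(
  A = strsplit("acgtacgtac", "")[[1]],
  B = strsplit("acgtacgtta", "")[[1]],
  C = strsplit("acgtaagtac", "")[[1]]
)

res <- jc_matrix(seqs, variance = TRUE)
res$distances["A", "B"]   # distance between individuals A and B
res$variance["A", "B"]    # its estimated variance
```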
The models of DNA evolution available in `ape' loosely follow those available in the software MEGA (Kumar et al. 2001).
Emmanuel Paradis paradis@isem.univ-montp2.fr
Felsenstein, J. (1993) Phylip (Phylogeny Inference Package) version 3.5c. Department of Genetics, University of Washington. http://evolution.genetics.washington.edu/phylip/phylip.html
Jukes, T. H. and Cantor, C. R. (1969) Evolution of protein molecules. In Mammalian Protein Metabolism, ed. Munro, H. N., pp. 21–132, New York: Academic Press.
Kimura, M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111–120.
Kumar, S., Tamura, K., Jakobsen, I. B. and Nei, M. (2001) MEGA2: Molecular Evolutionary Genetics Analysis software. Bioinformatics, 17, 1244–1245. http://www.megasoftware.net/
Jin, L. and Nei, M. (1990) Limitations of the evolutionary parsimony method of phylogenetic analysis. Molecular Biology and Evolution, 7, 82–102.
Tajima, F. and Nei, M. (1984) Estimation of evolutionary distance between nucleotide sequences. Molecular Biology and Evolution, 1, 269–285.
Tamura, K. (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G + C-content biases. Molecular Biology and Evolution, 9, 678–687.
Tamura, K. and Nei, M. (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution, 10, 512–526.
read.GenBank, read.dna, write.dna, dist.gene, dist.phylo