simMD {popgen} R Documentation

Simulate multi-population unlinked genotype data from a Multinomial-Dirichlet model.

Description

Simulate multi-population unlinked genotype data from a Multinomial-Dirichlet model.

Usage

simMD(N, P, L, p = NULL, c.vec1, c.vec2 = 1, ac = 2, beta = 1)

Arguments

N The number of people per population.
P The number of populations.
L The number of unlinked loci.
p An (ac x L) matrix with column l containing the ac global allele frequencies for locus l. If NULL, the frequencies are generated at random from a Dirichlet distribution with parameter beta (the default, beta = 1, gives a uniform distribution).
c.vec1 A vector of length P which contains the level 1 variance parameter for each subpopulation.
c.vec2 An optional vector containing the level 2 variance parameters, one for each subpopulation within a level 1 population.
ac The number of alleles at each locus.
beta The parameter of the Dirichlet distribution used to simulate the global allele frequencies.

Details

The data are simulated from a Multinomial-Dirichlet model, parameterised as follows.

x_{ipal} = Allele of the ath chromosome from the ith person in the pth population at locus l.

α_{pl_j} = locus l type j allele frequency of the population p.

π_{l_j} = locus l type j allele frequency of the 'global' population.

c_{p} = variance parameter of population p.

ac = number of alleles at each locus.

x_{ipal} \sim \textrm{Multinom}(α_{pl_1}, ..., α_{pl_{ac}})

(α_{pl_1}, ..., α_{pl_{ac}}) \sim \textrm{Dirichlet}(π_{l_1} \frac{1 - c_{p}}{c_{p}}, ..., π_{l_{ac}} \frac{1 - c_{p}}{c_{p}})
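The name 'variance parameter' for c_{p} can be checked from the standard moments of the Dirichlet distribution. The short derivation below is a sketch added for illustration; the shorthand θ_p for the common scaling factor is introduced here and is not part of the original notation.

```latex
% Shorthand (an assumption of this note): \theta_p = (1 - c_p)/c_p,
% so the Dirichlet parameters are \pi_{l_j}\theta_p and they sum to \theta_p.
\begin{align*}
\mathrm{E}[\alpha_{pl_j}]
  &= \frac{\pi_{l_j}\theta_p}{\theta_p} = \pi_{l_j}, \\
\mathrm{Var}[\alpha_{pl_j}]
  &= \frac{\pi_{l_j}\theta_p\,(\theta_p - \pi_{l_j}\theta_p)}{\theta_p^{2}\,(\theta_p + 1)}
   = \frac{\pi_{l_j}(1 - \pi_{l_j})}{\theta_p + 1}
   = c_p\,\pi_{l_j}(1 - \pi_{l_j}),
\end{align*}
% using \theta_p + 1 = 1/c_p in the last step.
```

So each population's allele frequencies are centred on the global frequencies, and c_{p} scales their variance: values of c_{p} near 0 keep population p close to the global frequencies, while larger values let it drift further away.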

If the global allele frequencies are not specified then (π_{l_1}, ..., π_{l_{ac}}) \sim \textrm{Dirichlet}(β, ..., β)
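The generative steps above can be sketched in base R. This is a minimal illustration of the one-level model only, not the popgen source code: the function names rdirichlet1 and sim_md_sketch are hypothetical, alleles are coded 1, ..., ac as an assumption, and Dirichlet draws are obtained by normalising Gamma variates because base R has no Dirichlet sampler.

```r
# Hypothetical helper: one Dirichlet draw via normalised Gamma variates.
rdirichlet1 <- function(shape) {
  g <- rgamma(length(shape), shape = shape, rate = 1)
  g / sum(g)
}

# Hypothetical sketch of the one-level Multinomial-Dirichlet model
# (not the package implementation; c.vec2 is not handled here).
sim_md_sketch <- function(N, P, L, c.vec1, ac = 2, beta = 1) {
  stopifnot(length(c.vec1) == P, all(c.vec1 > 0), all(c.vec1 < 1))
  X <- array(NA_integer_, dim = c(N * P, 2, L))
  for (l in seq_len(L)) {
    # Global allele frequencies at locus l ~ Dirichlet(beta, ..., beta)
    pi_l <- rdirichlet1(rep(beta, ac))
    for (p in seq_len(P)) {
      cp <- c.vec1[p]
      # Population-specific frequencies ~ Dirichlet(pi * (1 - c_p) / c_p)
      alpha <- rdirichlet1(pi_l * (1 - cp) / cp)
      # Rows of population p, assuming populations are stacked in blocks of N
      rows <- (p - 1) * N + seq_len(N)
      # Two alleles per person, each drawn with probabilities alpha
      X[rows, , l] <- sample.int(ac, size = 2 * N, replace = TRUE, prob = alpha)
    }
  }
  X
}

set.seed(1)
X_sketch <- sim_md_sketch(N = 60, P = 3, L = 100, c.vec1 = c(0.1, 0.2, 0.3))
dim(X_sketch)
```

The two chromosomes of a person are drawn independently from the same population frequencies, which matches the unlinked, per-chromosome form of the model written above.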

Value

If the c.vec2 parameter is left at its default value then the data are simulated from the standard Multinomial-Dirichlet model (see above) and the result is an array with dimensions (N * P, 2, L). Entry ((p - 1) * N + i, a, l) contains the simulated allele of the ath chromosome from the ith person in the pth population at the lth locus.
If the c.vec2 vector is specified then each of the P populations has length(c.vec2) subpopulations, with c.vec2 containing the variance parameters of those subpopulations.
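As an illustration of the array layout in the default case, the sketch below pulls out the rows of one population and computes per-population allele frequencies. It assumes individuals are stored population by population in blocks of N rows and that alleles are coded 1 and 2; the array here is a random placeholder with the documented dimensions rather than output from simMD, and pop_rows is a hypothetical helper.

```r
N <- 60; P <- 3; L <- 100

# Placeholder with the documented shape (N * P, 2, L); in practice this
# would be the array returned by simMD().
set.seed(1)
X <- array(sample.int(2, N * P * 2 * L, replace = TRUE), dim = c(N * P, 2, L))

# Hypothetical helper: row indices of population p (blocks of N rows assumed).
pop_rows <- function(p, N) (p - 1) * N + seq_len(N)

# Genotypes of everyone in population 2 at locus 5 (an N x 2 matrix).
g <- X[pop_rows(2, N), , 5]

# Population label for every row of X.
pop <- rep(seq_len(P), each = N)

# Frequency of allele 1 per population and locus (a P x L matrix).
freq1 <- sapply(seq_len(L), function(l) {
  tapply(rowMeans(X[, , l] == 1), pop, mean)
})
dim(freq1)
```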

Author(s)

Jonathan Marchini

References

Nicholson et al. (2002) Assessing population differentiation and isolation from single-nucleotide polymorphism data. JRSS(B), 64, 695–715.

Marchini and Cardon (2002) Discussion of Nicholson et al. (2002). JRSS(B), 64, 740–741.

Nichols and Balding (1995) A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica, 96, 3–12.

Examples


  X <- simMD(60, 3, 100, p = NULL, c.vec1 = c(0.1, 0.2, 0.3),
             c.vec2 = 1, ac = 2, beta = 1)


[Package popgen version 0.0-4 Index]