genomic.clines {introgress}    R Documentation

Genomic Clines

Description

This function fits genomic clines to genotypic data using the method described by Gompert and Buerkle (2009a). Significance testing is included, but optional.

Usage

genomic.clines(introgress.data=NULL, hi.index=NULL, loci.data=NULL,
               sig.test=FALSE, method="permutation", n.reps=1000,
               het.cor=TRUE, loci.touse=NULL, ind.touse=NULL)

Arguments

introgress.data a list produced by prepare.data or a matrix with allele counts.
hi.index a data frame produced by est.h or a numeric vector of hybrid index estimates.
loci.data a matrix or array providing marker information.
sig.test a logical specifying whether to perform significance testing.
method method to generate null distribution; either "permutation" or "parametric".
n.reps numeric value specifying number of neutral simulations.
het.cor a logical specifying whether to correct for deviations from expected heterozygosity when conducting neutral simulations using the parametric method (ignored with permutation method).
loci.touse vector of locus names, numeric indexes, or logicals that specifies a subset of loci for analysis; if NULL, all loci are included.
ind.touse vector of individual identifications, numeric indexes, or logicals that specifies a subset of individuals for analysis; if NULL, all individuals are included.

Details

introgress.data may either be the list that is returned by the function prepare.data, or, if fixed=TRUE, introgress.data may simply be a matrix or array providing counts of the number of alleles derived from parental population 1 for each admixed individual. If introgress.data is a matrix or array, rows and columns correspond to loci and individuals, respectively.
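
As a minimal sketch of the matrix form described above, the following base R code builds a small count matrix with loci in rows and individuals in columns. The data, the locus and individual names, and the assumption of diploid co-dominant markers (counts of 0, 1, or 2) are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data: 3 loci (rows) x 4 admixed individuals (columns).
# Each entry is the number of alleles derived from parental population 1;
# counts of 0, 1, or 2 assume diploid, co-dominant markers.
count.matrix <- matrix(c(2, 1, 0, 1,
                         1, 1, 2, 0,
                         0, 2, 1, 1),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(paste0("locus", 1:3),
                                       paste0("ind", 1:4)))

# Sanity check: every entry is a valid allele count for a diploid marker
stopifnot(all(count.matrix %in% 0:2))

# Rows are loci, columns are individuals
dim(count.matrix)
```

A matrix laid out this way could then be supplied as introgress.data when the markers have fixed differences.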

loci.data is a matrix or array where each row provides information on one locus. The first column gives a unique locus name (e.g. "locus3"), and the second column specifies whether the locus is co-dominant ("C" or "c"), haploid ("H" or "h"), or dominant ("D" or "d"). These first two columns in loci.data are required. The third column, which is optional, is a numeric value specifying the linkage group for the marker. The fourth column, which is also optional, is a numeric value specifying both the linkage group and the location on the linkage group (e.g. 3.70, for a marker at 70 cM on linkage group 3). These optional columns can be used for ordering markers for the mk.image, genomic.clines, and clines.plot functions.
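
The layout described above can be sketched in base R as follows. The column names (locus, type, lg, pos) and the marker values are assumptions made for this illustration; only the column order and the meaning of each column come from the description above.

```r
# Hypothetical loci.data with the four columns described above.
# Column names are illustrative; the column order is what matters.
loci.data <- cbind(
  locus = c("locus1", "locus2", "locus3"),  # unique locus names
  type  = c("C", "D", "H"),                 # co-dominant, dominant, haploid
  lg    = c("3", "1", "3"),                 # linkage group (optional)
  pos   = c("3.70", "1.25", "3.10")         # group and position (optional)
)

# Check that the marker type column holds only the allowed codes
stopifnot(all(toupper(loci.data[, 2]) %in% c("C", "H", "D")))

# Order markers by linkage group, then by the combined position value
ord <- order(as.numeric(loci.data[, 3]), as.numeric(loci.data[, 4]))
loci.ordered <- loci.data[ord, ]
loci.ordered[, 1]
```

Because cbind is applied to character vectors, the result is a character matrix, so the numeric columns are converted with as.numeric before sorting.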

This function (genomic.clines) estimates genomic clines in genotype frequency for admixed populations using the genomic clines method described by Gompert and Buerkle (2009a). If sig.test = FALSE, genomic clines are estimated for the admixed population, but no significance testing is done. If sig.test = TRUE, the genomic cline for each locus is evaluated for deviations from neutral expectations. Either the permutation method (method = "permutation") or the parametric method (method = "parametric") described by Gompert and Buerkle (2009a) can be used to generate the neutral distribution for significance testing. The permutation method cannot be used if both co-dominant and dominant (or haploid) data are included in the analysis.
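
The restriction on the permutation method can be checked before calling genomic.clines. The sketch below is not part of the package: permutation.allowed is a hypothetical helper, and it assumes that marker types are stored in the second column of loci.data as described above.

```r
# Hypothetical helper (not part of introgress): returns FALSE when
# co-dominant markers are mixed with dominant or haploid markers,
# the case in which the permutation method cannot be used.
permutation.allowed <- function(loci.data) {
  types <- toupper(loci.data[, 2])
  has.codominant <- any(types == "C")
  has.other <- any(types %in% c("D", "H"))
  !(has.codominant && has.other)
}

# Toy loci.data matrices (locus name, marker type)
mixed.loci <- cbind(c("locus1", "locus2"), c("C", "D"))
codom.loci <- cbind(c("locus1", "locus2"), c("C", "c"))

# Pick a value for the method argument based on the marker types
method.mixed <- if (permutation.allowed(mixed.loci)) "permutation" else "parametric"
method.codom <- if (permutation.allowed(codom.loci)) "permutation" else "parametric"
```

The resulting string could be passed as the method argument together with sig.test = TRUE.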

The function will issue a warning if an invariant locus is included (all individuals have the same genotype). In this case the probability of one of the genotypes does not vary with hybrid index.
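
Invariant loci can be identified ahead of time so that they can be excluded through loci.touse. The following is a minimal sketch in base R, assuming a count matrix with loci in rows and individuals in columns and assuming that missing genotypes are coded as NA; both the data and the NA coding are hypothetical.

```r
# Hypothetical count matrix: loci in rows, individuals in columns.
# NA marks a missing genotype (an assumption made for this sketch).
count.matrix <- matrix(c(2, 1, 0, 1,
                         1, 1, 1, 1,
                         0, 2, 1, NA),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(paste0("locus", 1:3),
                                       paste0("ind", 1:4)))

# A locus is invariant when all observed genotypes are identical
is.invariant <- apply(count.matrix, 1, function(x) {
  length(unique(x[!is.na(x)])) <= 1
})

# Names of invariant loci, and numeric indexes of the loci to keep
invariant.loci <- names(which(is.invariant))
loci.touse <- which(!is.invariant)
```

Here loci.touse holds numeric indexes, one of the forms accepted by the loci.touse argument.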

See Gompert and Buerkle (2009a, 2009b) for additional details.

Value

A list with the following components:

Summary.data a matrix with the locus data including log likelihood ratios and P-values from significance testing if sig.test = TRUE.
Fitted.AA a matrix with fitted values for the population 1 homozygote for each locus (row) and individual (column).
Fitted.Aa a matrix with fitted values for inter-population heterozygotes for each locus (row) and individual (column).
Fitted.aa a matrix with fitted values for the population 2 homozygote for each locus (row) and individual (column).
Neutral.AA a matrix with upper and lower bounds for the empirical 95% confidence interval for the expected population 1 homozygote genomic clines under neutrality for each locus (row) and individual (column); these confidence intervals are based on the neutral simulations/permutations.
Neutral.Aa a matrix with upper and lower bounds for the empirical 95% confidence interval for the expected inter-population heterozygote genomic clines under neutrality for each locus (row) and individual (column); these confidence intervals are based on the neutral simulations/permutations.
Neutral.aa a matrix with upper and lower bounds for the empirical 95% confidence interval for the expected population 2 homozygote genomic clines under neutrality for each locus (row) and individual (column); these confidence intervals are based on the neutral simulations/permutations.
Count.matrix the user-supplied count.matrix.
hybrid.index the user-supplied hi.index.
Loci.data the user-supplied loci.data.

Author(s)

Zachariah Gompert zgompert@uwyo.edu, C. Alex Buerkle buerkle@uwyo.edu

References

Gompert Z. and Buerkle C. A. (2009a) A powerful regression-based method for admixture mapping of isolation across the genome of hybrids. Molecular Ecology, 18, 1207-1224.

Gompert Z. and Buerkle C. A. (2009b) introgress: a software package for mapping components of isolation in hybrids. Molecular Ecology Resources, in preparation.

See Also

prepare.data, est.h

Examples

## Not run: 
## Example 1, genomic clines analysis without significance testing, or
## with significance testing on a subset of the data

## load simulated data
## markers do not have fixed differences
data(AdmixDataSim2)
data(LociDataSim2)
data(p1DataSim2)
data(p2DataSim2)

## use prepare.data to produce introgress.data
introgress.data1<-prepare.data(admix.gen=AdmixDataSim2,
                               loci.data=LociDataSim2,
                               parental1=p1DataSim2, parental2=p2DataSim2,
                               pop.id=TRUE, ind.id=TRUE, fixed=FALSE)

## estimate hybrid index
hi.index1<-est.h(introgress.data=introgress.data1,loci.data=LociDataSim2,
                 fixed=FALSE)

## estimate genomic clines without significance testing
clines.out1<-genomic.clines(introgress.data=introgress.data1,
                            hi.index=hi.index1,
                            loci.data=LociDataSim2, sig.test=FALSE)

## for a subset of loci, estimate genomic clines with significance testing
clines.out1b<-genomic.clines(introgress.data=introgress.data1,
                             hi.index=hi.index1,
                             loci.data=LociDataSim2, sig.test=TRUE,
                             method="parametric", loci.touse=1:10)

###############################################################
## Example 2, genomic clines analysis with significance testing

## load simulated data
## markers have fixed differences, with
## alleles coded as 'P1' and 'P2'
data(AdmixDataSim1)
data(LociDataSim1)

## use prepare.data to produce introgress.data
introgress.data2<-prepare.data(admix.gen=AdmixDataSim1,
                               loci.data=LociDataSim1,
                               parental1="P1", parental2="P2",
                               pop.id=FALSE, ind.id=FALSE, fixed=TRUE)

## estimate hybrid index
hi.index2<-est.h(introgress.data=introgress.data2,
                 loci.data=LociDataSim1, fixed=TRUE, p1.allele="P1",
                 p2.allele="P2")

## estimate genomic clines and perform significance testing
## note the small number of replicates (chosen only to speed example)
clines.out2<-genomic.clines(introgress.data=introgress.data2,
                            hi.index=hi.index2, loci.data=LociDataSim1,
                            sig.test=TRUE, method="permutation",
                            n.reps=100)

write.table(clines.out2$Summary.data, file="clines.txt",
            quote=FALSE, sep=",")
## End(Not run)

[Package introgress version 1.1 Index]