mapLD {mapLD} | R Documentation |
mapLD calculates confidence intervals for Lewontin's D' (1964) and constructs haplotype blocks using the approach described by Gabriel et al. (2002).
mapLD(SNPdata, locusID.col, subjectID.col, allele.cols, WhichGene = NA, outgraph = NA)
SNPdata |
Data frame containing SNP data. At least 4 fields are
required; see arguments locusID.col, subjectID.col and
allele.cols for details. |
locusID.col |
Column name or index for a marker ID. |
subjectID.col |
Column name or index for a sample ID. |
allele.cols |
A vector of length 2 giving the column names or indices that hold the two allele values. |
WhichGene |
Gene name. Default is NA. |
outgraph |
Name of an EPS file to which the heatmap of LD is printed. If not specified, the heatmap is printed to a file called LDmap.eps under the working directory where the current R session is running. |
The EM algorithm is not used to estimate two-locus haplotype frequencies. Instead, a faster one-dimensional golden section search combined with parabolic interpolation is used to find the MLE of the two-locus haplotype frequencies (Weir, 1996). In addition, a recursive function computes the haplotype blocks according to the definition of Gabriel et al. (2002), as later modified by Wall and Pritchard (2003).
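The one-dimensional search described above can be illustrated with base R's optimize(), which also combines golden section search with successive parabolic interpolation. This is a minimal sketch, not the package's internal code: the function names loglik_pAB and estimate_ld, the 3 x 3 genotype-count layout, and the choice to fix the allele frequencies at their sample proportions are all assumptions made for illustration. Both SNPs are assumed to be biallelic and polymorphic, and random mating is assumed.

```r
# Log-likelihood of the AB haplotype frequency x (illustrative helper).
# counts: 3 x 3 matrix of genotype counts; rows = copies of allele A (0, 1, 2),
#         columns = copies of allele B (0, 1, 2).
loglik_pAB <- function(x, counts, pA, pB) {
  pAB <- x
  pAb <- pA - x
  paB <- pB - x
  pab <- 1 - pA - pB + x
  # Two-locus genotype probabilities under random mating.
  probs <- matrix(c(
    pab^2,         2 * paB * pab,                   paB^2,
    2 * pAb * pab, 2 * pAB * pab + 2 * pAb * paB,   2 * pAB * paB,
    pAb^2,         2 * pAB * pAb,                   pAB^2
  ), nrow = 3, byrow = TRUE)
  # Floor the probabilities so that log() stays finite at the boundary.
  sum(counts * log(pmax(probs, .Machine$double.xmin)))
}

# Estimate pAB by a one-dimensional search, then derive D and |D'|.
estimate_ld <- function(counts) {
  n  <- sum(counts)
  pA <- sum(rowSums(counts) * 0:2) / (2 * n)  # sample frequency of allele A
  pB <- sum(colSums(counts) * 0:2) / (2 * n)  # sample frequency of allele B
  # Admissible range for pAB given the allele frequencies.
  lower <- max(0, pA + pB - 1)
  upper <- min(pA, pB)
  fit <- optimize(loglik_pAB, interval = c(lower, upper),
                  counts = counts, pA = pA, pB = pB, maximum = TRUE)
  pAB  <- fit$maximum
  D    <- pAB - pA * pB
  # Lewontin's normalisation: the largest |D| possible given pA, pB and sign(D).
  Dmax <- if (D >= 0) min(pA * (1 - pB), (1 - pA) * pB)
          else        min(pA * pB, (1 - pA) * (1 - pB))
  list(pAB = pAB, D = D, Dprime = abs(D) / Dmax)
}

# Made-up counts for two SNPs in linkage equilibrium (pA = pB = 0.5).
equil <- matrix(c(10, 20, 10,
                  20, 40, 20,
                  10, 20, 10), nrow = 3, byrow = TRUE)
res_equil <- estimate_ld(equil)

# Made-up counts for two SNPs in complete LD (only AB and ab haplotypes).
complete <- diag(c(25, 50, 25))
res_complete <- estimate_ld(complete)
```

Because only the double heterozygote is phase-ambiguous, the likelihood depends on a single free parameter once the allele frequencies are fixed, which is why a one-dimensional search suffices here.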
A list of the following components:
LDinfo |
A data frame containing the point estimate and the corresponding 90% coverage confidence interval for each pairwise D', as well as the four two-locus haplotype frequencies. |
LDblock |
A data frame containing the head and tail SNPs for each of the haplotype blocks. |
LocusIndex |
A numeric index for each of the SNP markers under investigation. |
In addition, a pairwise LD heatmap with haplotype block boundaries
labelled is printed to a PostScript file. See argument outgraph
for details.
If the sample ID values are integers too large for R to represent
exactly, merging marker data from two loci by sample ID is error-prone.
This problem can be avoided by specifying
colClasses = 'character' when the data are read with
read.table.
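As a small illustration of the note above, the sketch below reads the same text twice, once with colClasses = 'character' and once with the defaults. The inline data, the column names, and the object names raw, snp and snp_numeric are made up for this example; a real call would pass a file name instead of the text argument.

```r
# Hypothetical genotype records whose subject IDs are far larger than
# R's integer range.
raw <- "subjectID markerID allele1 allele2
90000000000000000001 rs1 A G
90000000000000000002 rs1 A A"

# Reading every column as character keeps the IDs exactly as written.
snp <- read.table(text = raw, header = TRUE, colClasses = "character")

# With the defaults the IDs are converted to double precision numbers,
# which cannot hold this many significant digits.
snp_numeric <- read.table(text = raw, header = TRUE)

# Compare how many distinct subjects each version distinguishes.
n_ids_character <- length(unique(snp$subjectID))
n_ids_numeric   <- length(unique(snp_numeric$subjectID))
```

Comparing n_ids_character with n_ids_numeric shows whether distinct subjects have been collapsed by the numeric conversion.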
An LD map is most meaningful when the markers are ordered by
physical location on the chromosome. It is therefore recommended that
markers be sorted before the data are passed to the mapLD
function.
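One way to do this sorting is sketched below. The data frames geno and pos, their column names, and the base-pair positions are hypothetical; the sketch also assumes that ordering the rows of the data frame by position is what determines marker order, which should be checked against the resulting LocusIndex.

```r
# Hypothetical long-format genotype data: one row per subject and marker.
geno <- data.frame(
  allele1   = c("A", "C", "G", "A", "C", "G"),
  allele2   = c("G", "C", "A", "A", "G", "G"),
  subjectID = c("s1", "s1", "s1", "s2", "s2", "s2"),
  markerID  = c("rs3", "rs1", "rs2", "rs3", "rs1", "rs2"),
  stringsAsFactors = FALSE
)

# Hypothetical physical positions (base pairs) for each marker.
pos <- data.frame(
  markerID = c("rs1", "rs2", "rs3"),
  position = c(1050, 2300, 5120),
  stringsAsFactors = FALSE
)

# Attach each marker's position, then order rows by position and subject.
geno$position <- pos$position[match(geno$markerID, pos$markerID)]
geno_sorted   <- geno[order(geno$position, geno$subjectID), ]

# Marker order after sorting.
marker_order <- unique(geno_sorted$markerID)
```

The sorted data frame can then be passed as the SNPdata argument, with locusID.col, subjectID.col and allele.cols pointing at the corresponding columns.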
Peter Hu and Jared Lunceford
1. Gabriel SB, et al. (2002). The Structure of Haplotypes in the Human Genome. Science, 296(5576):2225-9.
2. Wall JD and Pritchard JK (2003). Assessing the Performance of the Haplotype Block Model of Linkage Disequilibrium. Am J Hum Genet., 73(3):502-15.
3. Lewontin RC (1964). The interaction of selection and linkage. I. General considerations: heterotic models. Genetics, 49:49-67.
4. Weir BS (1996). Genetic Data Analysis II. Sinauer, Sunderland, MA.
data(SNPdata)
getLD <- mapLD(SNPdata = SNPdata,
               locusID.col = 'markerID',
               subjectID.col = 'subjectID',
               allele.cols = 1:2,
               WhichGene = NA,
               outgraph = NA)