mapLD {mapLD}    R Documentation

Construct Haplotype Blocks

Description

mapLD calculates confidence intervals for Lewontin's D' (Lewontin, 1964) and constructs haplotype blocks using the approach described by Gabriel et al. (2002).

Usage

mapLD(SNPdata, locusID.col, subjectID.col, allele.cols, WhichGene = NA, outgraph = NA)

Arguments

SNPdata Data frame containing SNP data. At least 4 fields are required; see arguments locusID.col, subjectID.col and allele.cols for details.
locusID.col Column name or index for a marker ID.
subjectID.col Column name or index for a sample ID.
allele.cols A vector of length 2 giving the column names or indices that hold the two alleles.
WhichGene Gene name. Default is NA.
outgraph Name of an EPS file to which the heatmap of LD is printed. If not specified, the heatmap is printed to a file called LDmap.eps under the working directory where the current R session is running.

Details

The EM algorithm is not used to estimate two-locus haplotype frequencies. Instead, a faster one-dimensional golden section search combined with parabolic interpolation is used to find the MLE of the two-locus haplotype frequencies (Weir, 1996). In addition, a recursive function computes the haplotype blocks according to the definition of Gabriel et al. (2002), as later modified by Wall and Pritchard (2003).
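The one-dimensional search described above can be sketched with base R's optimize(), which combines golden section search with successive parabolic interpolation. This is only an illustration under stated assumptions, not the code used inside mapLD: the helper name estimate_pAB and the 3 x 3 genotype-count layout are hypothetical, both loci are assumed to be polymorphic, and allele frequencies are fixed at their sample values so that only the AB haplotype frequency is searched.

```r
# Hypothetical helper (not part of mapLD): ML estimate of the AB haplotype
# frequency from a 3 x 3 table of two-locus genotype counts.
# n[i, j] = number of subjects with (i - 1) copies of allele A at locus 1
#           and (j - 1) copies of allele B at locus 2.
# Assumes both loci are polymorphic (0 < pA < 1 and 0 < pB < 1).
estimate_pAB <- function(n) {
  N  <- sum(n)
  pA <- (sum(n[2, ]) + 2 * sum(n[3, ])) / (2 * N)  # sample frequency of A
  pB <- (sum(n[, 2]) + 2 * sum(n[, 3])) / (2 * N)  # sample frequency of B

  # Log-likelihood as a function of x = frequency of haplotype AB.
  loglik <- function(x) {
    pAB <- x
    pAb <- pA - x
    paB <- pB - x
    pab <- 1 - pA - pB + x
    prob <- matrix(c(
      pab^2,         2 * paB * pab,               paB^2,
      2 * pAb * pab, 2 * (pAB * pab + pAb * paB), 2 * pAB * paB,
      pAb^2,         2 * pAB * pAb,               pAB^2
    ), nrow = 3, byrow = TRUE)
    keep <- n > 0                       # skip empty cells to avoid 0 * log(0)
    sum(n[keep] * log(prob[keep]))
  }

  # Admissible range for pAB given the allele frequencies.
  lower <- max(0, pA + pB - 1)
  upper <- min(pA, pB)
  fit   <- optimize(loglik, interval = c(lower, upper), maximum = TRUE)

  pAB  <- fit$maximum
  D    <- pAB - pA * pB
  Dmax <- if (D >= 0) {
    min(pA * (1 - pB), (1 - pA) * pB)
  } else {
    min(pA * pB, (1 - pA) * (1 - pB))
  }
  list(pAB = pAB, D = D, Dprime = D / Dmax)
}

# Toy counts in which only genotypes aabb, AaBb and AABB are observed.
n   <- diag(c(25, 50, 25))
res <- estimate_pAB(n)
res$Dprime   # close to 1 for this table
```

Because only the double heterozygote has ambiguous phase, the likelihood depends on a single unknown once the allele frequencies are fixed, which is what makes a one-dimensional search sufficient here.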

Value

A list of the following components:

LDinfo A data frame containing the point estimate and the corresponding 90% confidence interval for each pairwise D', as well as the four two-locus haplotype frequencies.
LDblock A data frame containing the head and tail SNPs of each haplotype block.
LocusIndex A numeric index for each of the SNP markers under investigation.


In addition, a pairwise LD heatmap with the haplotype block boundaries labelled is printed to a PostScript file. See argument outgraph for details.

Warning

If sample IDs are integers too large for R to represent exactly, merging the marker data of two loci by sample ID is error-prone. This problem can be avoided by specifying colClasses = 'character' when the data are read with read.table.
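A small sketch of the issue, using made-up data: the column names allele1 and allele2 and the ID values are assumptions chosen only for illustration.

```r
# Two hypothetical sample IDs that differ only in the last digit.
raw <- "markerID subjectID allele1 allele2
rs1 12345678901234567891 A G
rs1 12345678901234567892 A A"

# Read every column as character so the IDs are kept exactly as written.
dat <- read.table(text = raw, header = TRUE, colClasses = "character")
dat$subjectID[1] == dat$subjectID[2]                          # FALSE: IDs stay distinct

# Converted to double precision, the same two IDs can no longer be told apart.
as.numeric(dat$subjectID[1]) == as.numeric(dat$subjectID[2])  # TRUE
```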

Note

An LD map is most meaningful when the markers are ordered by physical location on the chromosome. It is therefore recommended that markers be sorted before the data are passed to the mapLD function.
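One way to sort the data beforehand is sketched below. It assumes a hypothetical position column (base-pair coordinate) is available in the data frame, and that the row order of the input determines the marker order; neither assumption is stated by this help page.

```r
# Toy long-format data; `position`, `allele1` and `allele2` are hypothetical columns.
snp <- data.frame(
  markerID  = c("rs3", "rs3", "rs1", "rs1", "rs2", "rs2"),
  subjectID = c("s1", "s2", "s1", "s2", "s1", "s2"),
  position  = c(30500, 30500, 10200, 10200, 20100, 20100),
  allele1   = c("A", "A", "C", "T", "G", "G"),
  allele2   = c("G", "A", "C", "C", "G", "T"),
  stringsAsFactors = FALSE
)

# Order rows by physical position, then by subject within each marker.
snp_sorted <- snp[order(snp$position, snp$subjectID), ]
unique(snp_sorted$markerID)   # markers now appear in chromosomal order
```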

Author(s)

Peter Hu and Jared Lunceford

References

1. Gabriel SB, et al. (2002). The structure of haplotype blocks in the human genome. Science, 296(5576):2225-9.

2. Wall JD and Pritchard JK (2003). Assessing the performance of the haplotype block model of linkage disequilibrium. Am J Hum Genet, 73(3):502-15.

3. Lewontin RC (1964). The interaction of selection and linkage. I. General considerations; heterotic models. Genetics, 49:49-67.

4. Weir BS (1996). Genetic Data Analysis II. Sinauer, Sunderland, MA.

Examples

# Load the example data shipped with the package and build the LD map
data(SNPdata)
getLD <- mapLD(SNPdata = SNPdata,
               locusID.col = 'markerID',
               subjectID.col = 'subjectID',
               allele.cols = 1:2,
               WhichGene = NA,
               outgraph = NA)

[Package mapLD version 1.0-1 Index]