haplosim {hapsim} | R Documentation |
Generates a random sample of haplotypes, given an haplotype object created from a data set
haplosim(n, hap, which.snp = NULL, seed = NULL, force.polym = TRUE, summary = TRUE)
n |
Number of haplotypes to generate |
hap |
Haplotype object created with haplodata |
which.snp |
A vector specifying which SNPs to include |
seed |
Seed for the random number generator |
force.polym |
if TRUE, all loci are polymorphic |
summary |
if TRUE, additional summary statistics are returned |
A list containing:
data |
Simulated sample |
freqs |
Allele frequency vector |
cor |
Correlation matrix |
div |
Locus-specific diversity scores |
mse.freqs |
MSE of allele frequencies |
mse.cor |
MSE of correlations |
Giovanni Montana
Montana, G. HapSim: a simulation tool for generating haplotype data with pre-specified allele frequencies and LD coefficients. 2005.
See also haplodata
# # Example 1 # data(ACEdata) # create the haplotype object x <- haplodata(ACEdata) # simulates a first sample of 100 haplotypes using all markers y1 <- haplosim(100, x) # compares allele frequencies in real and simulated samples plot(x$freqs, y1$freqs, title=paste("MSE:",y1$mse.freqs)); abline(a=0, b=1) # compares LD coefficients in real and simulated samples ldplot(mergemats(x$cor, y1$cor), ld.type='r') # simulates a second sample of 1000 haplotypes using the first 20 markers only y2 <- haplosim(1000, which.snp=seq(20), x) # # Example 2 # # simulate a sample of 500 haplotypes based on the ACE data set set.seed(100) data(ACEdata) n <- 500 x <- haplodata(ACEdata) y <- haplosim(n, x) # compute the haplotype frequencies # an haplotype starts at markers 17 and ends at marker 22 freq1 <- haplofreqs(ACEdata, 17, 22) freq2 <- haplofreqs(y$data, 17, 22) # extract the set of haplotypic configurations that are shared # by real and simulated data and their frequencies commonhapls <- intersect(names(freq1),names(freq2)) cfreq1 <- freq1[commonhapls] cfreq2 <- freq2[commonhapls] # compare real vs simulated haplotype frequencies par(mar=c(10.1, 4.1, 4.1, 2.1), xpd=TRUE) legend.text <- names(cfreq1) bp <- barplot(cbind(cfreq1,cfreq2), main="Haplotype Frequencies", names.arg=c("Real","Simulated"), col=heat.colors(length(legend.text))) legend(mean(range(bp)), -0.3, legend.text, xjust = 0.5, fill=heat.colors(length(legend.text)), horiz = TRUE) chisq.test(x=n*cfreq2, p=cfreq1, simulate.p.value = TRUE, rescale.p = TRUE)