prepare.cc {SimHap}    R Documentation

Prepare case-control data for inferring haplotypes

Description

prepare.cc prepares case-control data for haplotype inference when the case status variable may contain missing values. Running it first avoids the error that infer.haplos.cc returns when disease status is missing.

Usage

prepare.cc(geno, pheno, cc.var)

Arguments

geno a genotype data frame where each SNP is represented by two columns, one for each allele, in the form of haplo.dat.
pheno a data frame containing phenotype data with at least two columns - a subject identifier and an indicator of disease status.
cc.var the name of the column in pheno that indicates disease status. Must be given as a quoted string, e.g. "DISEASE".

Details

prepare.cc searches for missing values in cc.var and reduces geno and pheno to include only those individuals with known disease status. The reduced geno and pheno objects can then be passed to infer.haplos.cc.
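
The filtering step described above can be sketched in base R. This is an illustrative re-implementation, not the SimHap source: the function name prepare_cc_sketch, the toy data, and the assumption that the first column of both data frames holds a shared subject identifier are all hypothetical.

```r
# Illustrative sketch only; NOT the SimHap implementation.
# Assumes (hypothetically) that the first column of both geno and pheno
# is a subject identifier shared between the two data frames.
prepare_cc_sketch <- function(geno, pheno, cc.var) {
  stopifnot(cc.var %in% names(pheno))
  # rows of pheno whose disease status is known
  keep <- !is.na(pheno[[cc.var]])
  known.ids <- pheno[[1]][keep]
  list(
    geno  = geno[geno[[1]] %in% known.ids, , drop = FALSE],
    pheno = pheno[keep, , drop = FALSE]
  )
}

# Toy data: one SNP stored as two allele columns; subject 2 has unknown status
toy.geno <- data.frame(ID = 1:4,
                       SNP_1.a1 = c("A", "A", "G", "G"),
                       SNP_1.a2 = c("G", "A", "G", "A"))
toy.pheno <- data.frame(ID = 1:4, DISEASE = c(1, NA, 0, 1))

toy.new <- prepare_cc_sketch(toy.geno, toy.pheno, cc.var = "DISEASE")
toy.new$pheno$ID  # subjects 1, 3 and 4 remain
```

Matching on the identifier rather than on row position keeps the sketch correct even if geno and pheno are sorted differently; whether prepare.cc itself matches by identifier or by row order is not stated here.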

Value

A list with two components:

geno a genotype data frame where each SNP is represented by two columns, one for each allele, in the form of haplo.dat. Individuals with unknown disease status are removed.
pheno a data frame containing phenotype data with at least two columns - a subject identifier and an indicator of disease status. Individuals with unknown disease status are removed.
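
Before passing the result on, it can be worth checking that no missing status remains and that both components describe the same subjects. The helper below is a hypothetical sketch, not part of SimHap. It assumes the result is a list with components geno and pheno (as the Examples section accesses it) and that the subject identifier is the first column of each.

```r
# Hypothetical helper, not part of SimHap.
# Assumes x is a list with components geno and pheno, and that the
# subject identifier is the first column of both data frames.
check_prepared <- function(x, cc.var) {
  stopifnot(is.list(x), all(c("geno", "pheno") %in% names(x)))
  c(no.missing.status = !anyNA(x$pheno[[cc.var]]),
    same.subjects     = setequal(x$geno[[1]], x$pheno[[1]]))
}

# Toy object shaped like the value described above
toy <- list(
  geno  = data.frame(ID = c(1, 3),
                     SNP_1.a1 = c("A", "G"),
                     SNP_1.a2 = c("G", "G")),
  pheno = data.frame(ID = c(1, 3), DISEASE = c(1, 0))
)
check_prepared(toy, cc.var = "DISEASE")
```

Both elements of the returned logical vector should be TRUE for a correctly prepared object.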

Author(s)

Pamela A. McCaskie

References

McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple-imputation approach to haplotypic analysis of population-based data, [online] www.genepi.org.au/simhap.

See Also

infer.haplos.cc

Examples


data(SNP.dat)

# convert SNP.dat to format required by infer.haplos.cc
haplo.dat <- SNP2Haplo(SNP.dat)
data(pheno.dat)

# not run: will return an error due to missing data in variable 'DISEASE'
# myinfer<-infer.haplos.cc(geno=haplo.dat, pheno=pheno.dat, 
#       cc.var="DISEASE") 

newdata <- prepare.cc(geno=haplo.dat, pheno=pheno.dat, cc.var="DISEASE")
newhaplo.dat <- newdata$geno
newpheno.dat <- newdata$pheno
myinfer <- infer.haplos.cc(geno=newhaplo.dat, pheno=newpheno.dat,
        cc.var="DISEASE")

# prints haplotype frequencies among cases
myinfer$hap.freq.cases

# prints haplotype frequencies among controls
myinfer$hap.freq.controls 

# generate a haplo object in which haplotypes with a frequency
# below min.freq are grouped into a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)
mymodel <- haplo.bin(formula1=DISEASE~AGE+SBP+h.N1AA, 
        formula2=DISEASE~AGE+SBP, pheno=newpheno.dat, haplo=myhaplo, 
        sim=10)


[Package SimHap version 1.0.0 Index]