haplo.ccs {haplo.ccs}          R Documentation

Estimate Haplotype Relative Risks in Case-Control Data

Description

'haplo.ccs' estimates haplotype and covariate relative risks in case-control data by weighted logistic regression. Diplotype probabilities, which are estimated by EM computation with progressive insertion of loci, serve as the weights. The model is specified by a symbolic description of the linear predictor, which includes specification of an allele matrix and an inheritance mode using 'haplo'. This function requires the 'haplo.stats' and 'survival' packages to be installed. See 'haplo.em' for a description of EM computation of diplotype probabilities.
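The weighting scheme can be illustrated with base R alone. The following is a minimal sketch under stated assumptions, not the internal code of 'haplo.ccs': the data frame 'toy', its column names, and the diplotype probabilities in 'w' are all made up, and 'hap' is a hypothetical count of copies of one haplotype (additive coding). Each subject contributes one row per plausible diplotype, and each row is weighted by the probability of that diplotype.

```r
# Hypothetical expanded data: one row per subject and plausible diplotype.
# 'w' stands in for EM-estimated diplotype probabilities (made-up values).
toy <- data.frame(
  id  = c(1, 1, 2, 3, 3, 4, 5, 6, 6, 7, 8, 8),
  y   = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  hap = c(2, 1, 1, 0, 1, 2, 0, 1, 0, 1, 0, 2),
  w   = c(0.7, 0.3, 1, 0.4, 0.6, 1, 1, 0.8, 0.2, 1, 0.9, 0.1)
)

# The weights of each subject's rows sum to one.
w_sums <- tapply(toy$w, toy$id, sum)

# Weighted logistic regression; the quasibinomial family accepts
# non-integer prior weights without warnings.
fit <- glm(y ~ hap, family = quasibinomial, weights = w, data = toy)

# Exponentiated coefficients are odds ratios, the usual approximation
# to relative risks in case-control data.
odds_ratios <- exp(coef(fit))
```

Because rows from the same subject are correlated, the model-based standard errors from such a fit are not trustworthy, which is why robust (sandwich) standard errors are reported.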

Usage


haplo.ccs(formula, ...)

haplo.ccs.fit(y, x, int, geno, inherit.mode, control, referent, names.x,
              names.int, ...)

Arguments

formula a symbolic description of the model to be fit, which requires specification of an allele matrix and inheritance mode using 'haplo'. Note that although 'haplo' does not specify a default inheritance mode, 'additive' is the default inheritance mode for 'haplo.ccs'. More details on model formulae are given below.
control a list of control parameters for the EM computation of diplotype probabilities. See 'haplo.em.control'.
referent a character string representing the haplotype to be used as the referent. The haplotype that has the highest posterior probability is the default referent.
... optional model-fitting arguments to be passed to 'glm'.
y a vector of observations.
x the design matrix for environmental covariates.
int the design matrix for haplotype-environment interaction.
geno the allele matrix.
inherit.mode the inheritance mode specified by 'haplo'.
names.x the column names of the design matrix for covariates.
names.int the column names of the design matrix for haplotype-environmental interaction.

Details

A formula has the form 'y ~ terms' where 'y' is a numeric vector indicating case-control status and 'terms' is a series of terms which specifies a linear predictor for 'y'. A terms specification of the form 'first + second' indicates all the terms in 'first' together with all the terms in 'second' with duplicates removed. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on. The specification 'first*second' indicates the cross of 'first' and 'second'.
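The term handling described above is standard R formula behaviour and can be inspected with 'terms' from the 'stats' package. In this sketch 'a', 'b', and 'c' are placeholder variable names; 'terms' parses the formula symbolically, so no data are needed. The last call only parses a formula that contains a 'haplo' term and does not evaluate 'haplo' or 'geno'.

```r
# 'first + second' with a duplicated term: the duplicate is dropped.
dup <- attr(terms(y ~ a + b + a), "term.labels")   # "a" "b"

# 'first*second' expands to main effects plus their interaction.
cross <- attr(terms(y ~ a * b), "term.labels")     # "a" "b" "a:b"

# Main effects are ordered before interactions, whatever the input order.
tt <- terms(y ~ a:b + c + a + b)
ord <- attr(tt, "order")                           # 1 1 1 2

# Symbolic parse of a formula with a haplotype-by-covariate interaction.
hap_labels <- attr(terms(case ~ age + gender * haplo(geno)), "term.labels")
```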

Note that 'haplo.ccs.fit' is the workhorse function. The inputs 'y', 'x', 'geno', and 'int' represent, respectively, case-control status, the matrix of covariates, the matrix of alleles, and the matrix of terms that interact with the haplotypes to be estimated from the alleles. The argument 'inherit.mode' corresponds to the inheritance mode that is specified by 'haplo'. 'names.x' and 'names.int' correspond to the column names of 'x' and 'int', respectively. The background functions 'one', 'count.haps', and 'return.haps' are used in specifying the model terms and neatly packaging the results.

Value

'haplo.ccs' returns an object of class inheriting from '"haplo.ccs"'. More details appear later in this section.
The function 'summary' (i.e., 'summary.haplo.ccs') obtains or prints a summary of the results, which include haplotype and covariate relative risks, robust standard error estimates, and estimated diplotype probabilities.
The generic accessor functions 'coefficients', 'fitted.values', and 'residuals' extract corresponding features of the object returned by 'haplo.ccs'. The function 'vcov' (i.e., 'vcov.haplo.ccs') returns sandwich variance-covariance estimates. The function 'haps' extracts information returned by the EM computation of diplotype probabilities.
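For readers unfamiliar with sandwich estimates, the following base R sketch shows one common cluster-robust form for a weighted logistic fit. It is an illustration under stated assumptions and is not claimed to reproduce 'sandcov': the data frame 'toy' and its weights are made up, rows sharing an 'id' are treated as one cluster, and no small-sample correction is applied.

```r
# Hypothetical expanded data (made-up values): one row per subject and
# plausible diplotype, with weights summing to one within each subject.
toy <- data.frame(
  id  = c(1, 1, 2, 3, 3, 4, 5, 6, 6, 7, 8, 8),
  y   = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  hap = c(2, 1, 1, 0, 1, 2, 0, 1, 0, 1, 0, 2),
  w   = c(0.7, 0.3, 1, 0.4, 0.6, 1, 1, 0.8, 0.2, 1, 0.9, 0.1)
)
fit <- glm(y ~ hap, family = quasibinomial, weights = w, data = toy)

X  <- model.matrix(fit)
mu <- fitted(fit)

# "Bread": X' W X, using the working weights from the final IWLS iteration.
A <- crossprod(X * fit$weights, X)

# "Meat": weighted score contributions, summed within each subject.
scores <- X * (toy$w * (toy$y - mu))
U <- rowsum(scores, toy$id)
B <- crossprod(U)

# Sandwich variance-covariance estimate and robust standard errors.
V <- solve(A) %*% B %*% solve(A)
robust_se <- sqrt(diag(V))
```

Summing the scores within subject before forming the outer product is what accounts for the correlation among a subject's expanded rows.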
An object of class '"haplo.ccs"' is a list containing at least the following components:

formula the formula supplied.
call the matched call.
coefficients a named vector of coefficients.
covariance a named matrix of sandwich variance-covariance estimates, computed using 'sandcov'.
residuals the working residuals, i.e., the residuals from the final iteration of the IWLS fit.
fitted.values the fitted mean values, obtained by transforming the linear predictors by the expit function.
linear.predictors the linear fit on the logit scale.
df the model degrees of freedom.
rank the numeric rank of the fitted model.
family the family object, in this case, quasibinomial.
iter the number of iterations of IWLS used.
weights the working weights, i.e., the weights from the final iteration of the IWLS fit.
prior.weights the weights initially supplied, in this case, the diplotype probabilities estimated by the EM computation.
y a vector indicating case-control status, expanded for each subject by the number of plausible haplotypes for that subject.
id the numeric vector used to identify subjects, expanded for each subject by the number of plausible haplotypes for that subject.
converged a logical indicating whether the IWLS fit converged.
boundary a logical indicating whether the fitted values are on the boundary of the attainable values.
model the model matrix used.
terms the terms object used.
offset the offset vector used.
control the value of the control argument used.
contrasts the contrasts used.
xlevels a record of the levels of the factors used in fitting.
inheritance.mode the method of inheritance.
em.lnlike the value of the log likelihood at the last EM iteration.
em.lr the likelihood ratio statistic used to test the assumed model against the model that assumes complete linkage equilibrium among all loci.
em.df.lr the degrees of freedom for the likelihood ratio statistic.
em.nreps the count of haplotype pairs that map to each subject's marker genotypes.
hap1 character strings representing the possible first haplotype for each subject.
hap2 character strings representing the possible second haplotype for each subject.
hap.names character strings representing the unique haplotypes.
hap.probs the estimated frequency of each unique haplotype.
em.converged a logical indicating whether the EM computation converged.
em.max.pairs the maximum number of pairs of haplotypes per subject that are consistent with their marker data.
em.control a list of control parameters for the EM computation.
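Two of the components above can be made concrete with a few lines of base R. This is a sketch with made-up values: 'status' and 'n_pairs' are hypothetical, and the expansion shown is only an assumption about how 'y' and 'id' relate to the per-subject counts. The expit function is available in base R as 'plogis', with 'qlogis' as its inverse.

```r
# Expansion of case-control status and subject identifiers:
# one entry per plausible haplotype pair for each subject.
status  <- c(1, 0, 1)   # hypothetical status for three subjects
n_pairs <- c(2, 1, 3)   # hypothetical counts of plausible pairs
y_expanded  <- rep(status, times = n_pairs)              # 1 1 0 1 1 1
id_expanded <- rep(seq_along(status), times = n_pairs)   # 1 1 2 3 3 3

# Fitted values are the expit transform of the linear predictors.
eta <- c(-2, 0, 1.5)    # hypothetical linear predictors (logit scale)
mu  <- plogis(eta)      # same as 1 / (1 + exp(-eta)); plogis(0) is 0.5
```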

Note

The functions 'anova', 'logLik', and 'AIC' are not appropriate for models of class '"haplo.ccs"', because 'haplo.ccs' does not fit by maximum likelihood. Accordingly, model and null deviance are not reported.
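The same limitation is visible in base R for any quasi-likelihood family. The sketch below uses the built-in 'mtcars' data purely as a convenient example; it shows how 'glm' behaves with a quasibinomial family and makes no claim about what these functions return for '"haplo.ccs"' objects.

```r
# A quasibinomial fit has no true likelihood, so base R reports NA
# for the AIC component and for AIC().
fit_q <- glm(am ~ wt, family = quasibinomial, data = mtcars)
aic_component <- fit_q$aic
aic_value     <- AIC(fit_q)
```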

Author(s)

Benjamin French and Thomas Lumley

Department of Biostatistics

University of Washington

References

The help files for 'glm' and 'haplo.em' were instrumental in creating this help file.

See Also

glm, haplo.em, haplo.em.control, haplo, sandcov

Examples


library(haplo.ccs)
data(Renin)

## Fit a model for haplotype effects.

m1 <- haplo.ccs(case ~ haplo(geno[,1:12], mode = "additive"),
                control = haplo.em.control(min.posterior = 1e-4),
                referent = "223144")

## Fit a model for haplotype and covariate effects.

m2 <- haplo.ccs(case ~ gender + age + factor(race) +
                  haplo(geno[,1:12], mode = "additive"),
                control = haplo.em.control(min.posterior = 1e-4),
                referent = "223144")

## Fit a model for haplotype interaction with gender.

m3 <- haplo.ccs(case ~ age + factor(race) +
                  gender*haplo(geno[,1:12], mode = "additive"),
                control = haplo.em.control(min.posterior = 1e-4),
                referent = "223144")


[Package haplo.ccs version 1.0]