vs {BAMD}    R Documentation

Variable Selection in Bayesian Association Model

Description

This function carries out variable selection on the following linear mixed model

Y = X β + Z γ + ε

where the covariates for the random effects (in the Z-matrix) have missing values. The Z-matrix consists of Single Nucleotide Polymorphism (SNP) data and the Y-vector contains the phenotypic trait of interest. The X-matrix typically describes the family structure of the organisms.

The best models are identified by their Bayes Factors. The function uses the imputed values produced by the gs function.

Usage

vs(fname, n, p, s, nsim, keep = 5, prop = 0.75, betafile = "beta.csv", 
        gammafile = "gamma.csv", phi2file = "phi2.csv", sig2file = "sig2.csv", 
        missingfile = "Imputed_missing_vals")

Arguments

fname fname should be the name of a .csv file. This file should contain the Y, X, Z and R matrices for the model, in that order. Hence it should contain n × (1 + p + s + n) values. The input file should also have a header row. The Z matrix should use the values 1, 2, 3 for the SNPs and 0 for any missing SNPs. The program will convert the SNP codings to -1, 0, 1 and work with those.
n n refers to the length of the Y-vector; equivalent to the number of observations in the dataset.
p p is the number of columns of the X-matrix.
s s is the number of columns of the Z-matrix.
nsim nsim specifies the number of iterations of the Metropolis-Hastings chain to carry out.
keep keep specifies the number of models to store. The top keep models will be retained.
prop The candidate distribution for the Metropolis-Hastings chain is a mixture, one of whose components is a random walk. prop is the proportion of the time (between 0 and 1) that the random walk component is chosen.
betafile Name of the file containing the beta values that were output from gs.
gammafile Name of the file containing the gamma values that were output from gs.
phi2file Name of the file containing the phi2 values that were output from gs.
sig2file Name of the file containing the sig2 values that were output from gs.
missingfile Name of the file containing the missing SNP values that were output from gs.
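
The layout required of the fname file can be checked before calling vs. The sketch below is only an illustration: it builds toy matrices with hypothetical dimensions, uses an identity matrix as a placeholder for R (an assumption, since the contents of R are not described here), and verifies that the combined matrix has n rows and 1 + p + s + n columns. The helper recode_snp is a hypothetical name, not a BAMD function; it mimics the documented 1, 2, 3 to -1, 0, 1 conversion and marks missing SNPs as NA, which may differ from how the package handles them internally.

```r
# Hypothetical toy dimensions, chosen only for illustration.
n <- 8; p <- 3; s <- 5

set.seed(1)
Y <- matrix(rnorm(n), nrow = n, dimnames = list(NULL, "Y"))
X <- matrix(rbinom(n * p, 1, 0.5), nrow = n,
            dimnames = list(NULL, paste0("X", 1:p)))
# SNPs coded 1, 2, 3, with 0 marking a missing value.
Z <- matrix(sample(0:3, n * s, replace = TRUE), nrow = n,
            dimnames = list(NULL, paste0("Z", 1:s)))
# Assumption: an identity matrix stands in for the n-by-n R matrix.
R <- diag(n)
colnames(R) <- paste0("R", 1:n)

# Columns must appear in the order Y, X, Z, R.
input <- cbind(Y, X, Z, R)
stopifnot(nrow(input) == n, ncol(input) == 1 + p + s + n)

# Hypothetical helper: recode 1, 2, 3 to -1, 0, 1 and flag missing SNPs as NA.
recode_snp <- function(z) {
  out <- z - 2
  out[z == 0] <- NA
  out
}
Z_recoded <- recode_snp(Z)
```

Writing input with write.csv(..., row.names = FALSE) then yields a file with the header row and column order described for fname.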

Details

A Metropolis-Hastings algorithm is used to conduct a stochastic search through the model space to find the best models.
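
As a rough illustration of such a stochastic search, the sketch below runs a Metropolis-Hastings chain over binary inclusion vectors (one indicator per SNP) with a mixture candidate distribution. It is not the package's implementation. The random-walk component flips a single indicator; the second component, a uniform draw over all models, is an assumption, as is the toy log_score function standing in for a model's log Bayes Factor. All function names here are hypothetical.

```r
# Toy, hypothetical log-score standing in for a model's log Bayes factor.
# It is hard-wired to s = 5 SNPs and peaks at the model (1, 0, 1, 0, 0).
log_score <- function(model) -sum((model - c(1, 0, 1, 0, 0))^2)

propose <- function(model, prop) {
  if (runif(1) < prop) {
    # Random-walk component: flip one randomly chosen inclusion indicator.
    j <- sample.int(length(model), 1)
    model[j] <- 1 - model[j]
    model
  } else {
    # Assumed second component: draw a model uniformly from the model space.
    rbinom(length(model), 1, 0.5)
  }
}

mh_search <- function(s, nsim, prop = 0.75) {
  current <- rep(0, s)
  visited <- matrix(NA_real_, nrow = nsim, ncol = s)
  for (i in seq_len(nsim)) {
    candidate <- propose(current, prop)
    # Both components are symmetric, so the proposal densities cancel
    # and the acceptance ratio reduces to the ratio of scores.
    if (log(runif(1)) < log_score(candidate) - log_score(current)) {
      current <- candidate
    }
    visited[i, ] <- current
  }
  visited
}

set.seed(42)
chain <- mh_search(s = 5, nsim = 500)
```

A larger prop makes the chain explore mostly by small local moves, while a smaller prop lets it jump more often to unrelated models.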

Value

A matrix consisting of the best keep models and their Bayes Factors is returned.
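
The idea of retaining the top keep models can be sketched as follows. The exact layout of the matrix returned by vs is not specified here, so the structure below (one row per model, with inclusion indicators followed by a BF column) is an assumption, and the models and Bayes Factor values are made up purely for illustration.

```r
# Hypothetical candidate models (rows) over 4 SNPs, with made-up Bayes factors.
models <- rbind(c(1, 0, 0, 0),
                c(1, 1, 0, 0),
                c(0, 0, 1, 0),
                c(1, 0, 1, 0))
bayes_factor <- c(2.5, 7.1, 0.8, 4.3)
keep <- 2

# Rank models by Bayes factor and retain the best `keep` of them.
top <- order(bayes_factor, decreasing = TRUE)[seq_len(min(keep, nrow(models)))]
result <- cbind(models[top, , drop = FALSE], BF = bayes_factor[top])
```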

Author(s)

Vik Gopal viknesh@stat.ufl.edu

Maintainer: Vik Gopal <viknesh@stat.ufl.edu>

References

Gopal, V. "BAMD User Manual" http://www.stat.ufl.edu/~viknesh/assoc_model/assoc.html

See Also

gs

Examples

# Load example matrices and write to csv files.
data(Y, X, Z, R, Zprob)
write.csv(cbind(Y,X,Z,R), file="generatedData.csv", quote=FALSE, row.names=FALSE)
write.csv(Zprob, file="Zprob.csv", quote=FALSE, row.names=FALSE)
        
# Run the Gibbs sampler for 1000 iterations, keeping the last 800
gs(fname="generatedData.csv", fprob="Zprob.csv", n=8, p=3, s=5, nsim=1000, keep=800)

# Imputed values from the Gibbs sampler will be used by the variable selector
vs(fname="generatedData.csv", n=8, p=3, s=5, nsim=100, keep = 5)

# Remove all generated csv files
unlink("*.csv")

[Package BAMD version 1.2 Index]