gs {BAMD}    R Documentation

Estimate posterior parameters in Bayesian Association Model

Description

This function fits the following linear mixed model

Y = X β + Z γ + ε

where the covariates for the random effects (in the Z-matrix) have missing values. The Z-matrix consists of Single Nucleotide Polymorphism (SNP) data and the Y-vector contains the phenotypic trait of interest. The X-matrix typically describes the family structure of the organisms.

The model is fit by embedding it in a Bayesian framework and estimating the posterior parameters using a Gibbs sampler.

Usage

gs(fname, fprob, n, p, s, a = 2.01, b = 0.99099, c = 2.01, d = 0.99099, 
        beta.in = 1, gamma.in = 1, sig2.in = 1, phi2.in = 1, alpha = 0.05, 
        nsim = 1000, keep = 100, param = 1)

Arguments

fname fname should be the name of a .csv file. This file should contain the Y, X, Z and R matrices for the model, in that particular order. Hence it should contain n times (1 + p + s + n) values. The input file should also have a header row. The Z matrix should use the values 1, 2, 3 for the SNPs and 0 for any missing SNPs. The program will convert the SNP codings to -1, 0, 1 and work with those.
fprob fprob should also be a .csv file. It should contain one probability vector for each entry in the Z-table. Hence it should be a matrix of dimension n times 3s. The program will read in the entire table, but only store the distributions corresponding to the missing values. If uniform priors are to be used, there is no need to specify anything.
n n refers to the length of the Y-vector; equivalent to the number of observations in the dataset.
p p is the number of columns of the X-matrix.
s s is the number of columns of the Z-matrix.
a,b,c,d a, b, c and d are hyperparameters in the Bayesian set-up.
beta.in beta.in is the initial value for the Gibbs sampler. It should be a vector of length p.
gamma.in gamma.in is the initial value for the Gibbs sampler. It should be a vector of length s.
sig2.in sig2.in is the initial value for the Gibbs sampler. It should be a vector of length 1.
phi2.in phi2.in is the initial value for the Gibbs sampler. It should be a vector of length 1.
alpha alpha refers to the (1 - α)100% confidence intervals that the program should output.
nsim nsim specifies the number of iterations of the Gibbs sampler to carry out.
keep keep specifies which values from the Gibbs sampler chain to keep and use when computing the mean and confidence intervals. This lets the user discard early iterations as burn-in.
param param denotes the parametrization of the genotypes. It currently cannot be changed from 1.
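
The file layout and SNP coding described for fname and fprob can be sketched in a few lines of R. This is only an illustration and not the package's internal code: the toy sizes, the object names (Z_raw, Z_coded, expected_values, fprob_dim) and the exact mapping 1 -> -1, 2 -> 0, 3 -> 1 are assumptions made for the example.

```r
# Hypothetical toy sizes; none of these values come from the package data.
n <- 4; p <- 2; s <- 3

# Raw SNP codes as the input file expects them: 1, 2, 3, with 0 for missing.
Z_raw <- matrix(c(1, 2, 3,
                  0, 3, 1,
                  2, 2, 0,
                  3, 1, 2), nrow = n, ncol = s, byrow = TRUE)

# Assumed mapping: 1 -> -1, 2 -> 0, 3 -> 1; missing entries become NA.
Z_coded <- Z_raw - 2
Z_coded[Z_raw == 0] <- NA

# The data file holds Y (1 column), X (p), Z (s) and R (n) side by side,
# so it should contain n * (1 + p + s + n) values below the header row.
expected_cols <- 1 + p + s + n
expected_values <- n * expected_cols

# The probability file holds three probabilities per Z entry: n rows, 3s columns.
fprob_dim <- c(n, 3 * s)
```

Comparing expected_cols with ncol(read.csv(fname)) before calling gs may help catch a mismatch between the file and the n, p and s arguments.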

Details

For further details on the prior distributions used, please refer to the User Guide in the reference(s) given below.

Value

No R object is returned. Instead, as the routine is running, it prints debugging statements to show the user which iteration of the Gibbs sampler it is currently at. This allows the user to detect if something is going wrong with the routine.
The values sampled from the full conditionals will be stored in the following .csv files: beta.csv, gamma.csv, sig2.csv, phi2.csv and Imputed_missing_vals. The .csv files can be used to check that the chain has converged, while the imputed missing values for the Z-matrix will be used by the variable selector routine.
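
As a rough illustration of how a stored chain might be summarised, the sketch below discards burn-in and computes a posterior mean and a (1 - α)100% interval. It does not read real output from gs: the file beta_demo.csv is a simulated stand-in, and both the assumed layout (one row per iteration, one column per parameter, all nsim draws stored) and the use of empirical quantiles for the interval are assumptions, not documented behaviour of the package.

```r
# Simulated stand-in for one column of beta.csv; not real sampler output.
set.seed(1)
nsim <- 1000; keep <- 800; alpha <- 0.05
chain_file <- file.path(tempdir(), "beta_demo.csv")
write.csv(data.frame(beta1 = rnorm(nsim, mean = 2, sd = 0.5)),
          file = chain_file, row.names = FALSE)

# Assumed layout: one row per iteration, one column per parameter.
draws <- read.csv(chain_file)

# Keep the last `keep` draws, dropping the first nsim - keep as burn-in.
kept <- tail(draws$beta1, keep)

# Posterior mean and an equal-tailed interval from empirical quantiles.
post_mean <- mean(kept)
interval <- quantile(kept, probs = c(alpha / 2, 1 - alpha / 2))

# Clean up the temporary demo file.
unlink(chain_file)
```

A trace plot of the kept draws, for example plot(kept, type = "l"), is a common informal check: a stable band with no visible trend is consistent with convergence, though it does not prove it.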

Author(s)

Vik Gopal viknesh@stat.ufl.edu

Maintainer: Vik Gopal <viknesh@stat.ufl.edu>

References

Gopal, V. "BAMD User Manual" http://www.stat.ufl.edu/~viknesh/assoc_model/assoc.html

See Also

vs

Examples

# Load example matrices and write to csv files.
data(Y, X, Z, R, Zprob)
write.csv(cbind(Y,X,Z,R), file="generatedData.csv", quote=FALSE, row.names=FALSE)
write.csv(Zprob, file="Zprob.csv", quote=FALSE, row.names=FALSE)
        
# Run the Gibbs sampler with 1000 iterations, keeping the last 800
gs(fname="generatedData.csv", fprob="Zprob.csv", n=8, p=3, s=5, nsim=1000, keep=800)

# Remove all generated csv files (note: this deletes every .csv file in the working directory)
unlink("*.csv")

[Package BAMD version 1.2 Index]