bayesclust-package {bayesclust} | R Documentation
This package contains a suite of functions that allow the user to carry out the following hypothesis test on genetic data:
H_0 : No clusters
H_1 : 2, 3 or 4 clusters
The hypothesis test is formulated as a model selection problem, where the aim is to identify the model with the highest posterior probability. A hierarchical Bayes model is assumed for the data. Two points are worth noting. First, the null hypothesis is equivalent to saying that the population consists of a single cluster. Second, each function call tests only one alternative (2, 3 or 4 clusters) at a time, so the package lets the user test multiple hypotheses while controlling the False Discovery Rate (FDR).
This is a brief summary of the test procedure:
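As a rough illustration of what controlling the FDR across the three alternatives can look like, the sketch below applies the Benjamini-Hochberg adjustment from base R to three p-values. The p-values and the 0.05 level are made up for the example, and the choice of Benjamini-Hochberg is an assumption; the procedure used by fdr.test may differ, so consult its help page.

```r
# Hypothetical p-values for the three alternatives (k = 2, 3, 4 clusters).
# These numbers are invented for illustration only.
pvals <- c(k2 = 0.004, k3 = 0.030, k4 = 0.200)

# Benjamini-Hochberg adjustment from base R's stats package.
adj <- p.adjust(pvals, method = "BH")

# Reject H_0 for every alternative whose adjusted p-value
# does not exceed the chosen FDR level (assumed here to be 0.05).
fdr_level <- 0.05
significant <- adj <= fdr_level

print(adj)
print(significant)
```

With these inputs, the alternatives with 2 and 3 clusters are flagged as significant and the alternative with 4 clusters is not.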
1. Run cluster.test to compute the empirical posterior probability (EPP) of the null hypothesis. EPP will serve as the test statistic in this hypothesis test.
2. Run plot on the object returned in Step 1.
3. Generate replicates of EPP under the null hypothesis with nulldensity. This can be done concurrently with Steps 1 and 2, but be sure to use the same parameters in Steps 1 and 3.
4. Estimate the p-value of the observed EPP with emp2pval. This function takes the objects returned in Steps 1 and 3 as input.
5. Apply fdr.test to the objects returned in Step 4.
6. Run cluster.optimal on significant datasets to pick out optimal clusters.
7. Run plot on the object returned in Step 6 to view the optimal clustering/partition of the data.
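The idea of turning an observed test statistic and a set of null replicates into a p-value (the inputs described as coming from Steps 1 and 3) can be sketched in base R as follows. This is not the package's implementation: the helper empirical_pvalue and all numbers are hypothetical, and the sketch assumes that a small posterior probability of H_0 counts as evidence against H_0.

```r
# Empirical p-value: the share of null replicates at least as extreme as the
# observed statistic. Assumption: a small posterior probability of H_0 is
# evidence against H_0, so "extreme" means "less than or equal to observed".
empirical_pvalue <- function(observed, null_draws) {
  mean(null_draws <= observed)
}

# Made-up stand-ins for the quantities produced in Steps 1 and 3.
null_epp <- c(0.10, 0.35, 0.50, 0.62, 0.71, 0.80, 0.88, 0.93, 0.97, 0.99)
observed_epp <- 0.35

pval <- empirical_pvalue(observed_epp, null_epp)
print(pval)
```

In practice the null sample would hold many more replicates than the ten used here, since the resolution of an empirical p-value is limited by the number of replicates.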
For full details on the distributional assumptions, please refer to the papers listed in the references section. For further details on the individual functions, please refer to their respective help pages and the examples.
George Casella <casella@stat.ufl.edu>, Claudio Fuentes <cfuentes@stat.ufl.edu> and Vik Gopal <viknesh@stat.ufl.edu>
Maintainer: Vik Gopal <viknesh@stat.ufl.edu>
Fuentes, C. and Casella, G. (2008) "Testing for the Existence of Clusters" http://www.stat.ufl.edu/~casella/Papers/paper-v3.pdf
Gopal, V. "BayesClust User Manual" http://www.stat.ufl.edu/~viknesh/bayesclust/clust.html
cluster.test, cluster.optimal, emp2pval, nulldensity, fdr.test