nulldensity {bayesclust} | R Documentation |
In testing the following hypothesis,
H_0 : No clusters |
H_1 : k clusters |
nulldensity generates random variables from the distribution of
the Empirical Posterior Probability (EPP) under the null hypothesis.
nulldensity(nsim, n, k, mcs=0.2, a=2.01, b=0.990099, tau2=1, prop=0.25, p, file="")
nsim |
This denotes the number of random variables to generate from the distribution of EPP under the null. At least 100,000 is recommended when the intention is to carry out multiple testing; otherwise, 8,000-10,000 iterations will suffice. |
n |
n is the number of observations in the dataset for
testing this hypothesis. See cluster.test for more details. |
k |
k specifies the alternative hypothesis being tested. It
must take an integer value strictly greater than 1. |
mcs |
mcs stands for Minimum Cluster Size. It should be a value between 0
and 1. It instructs the test procedure to only consider clusters of a certain minimum
size. |
a |
a is a hyperparameter for the prior on σ^2.
Further details can be found in the references below. |
b |
Like a , b is also a hyperparameter for the prior on σ^2.
Further details can be found in the references below. |
tau2 |
tau2 is a hyperparameter for the prior on the mean μ for
each cluster. |
prop |
prop specifies what fraction of the space of partitions under the
null hypothesis should be sampled. It is recommended to be at least 0.25. |
p |
The observations are assumed to come from a multivariate normal
distribution of dimension p . |
file |
This argument is a character string. If specified, the output object will
be saved to this (binary) file. It can be loaded, inspected and altered later in
subsequent R sessions using load . If left unspecified, the object will not
be saved to a file and could be lost on quitting the R session. |
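As a minimal sketch of the save-and-reload workflow that file enables, the base R snippet below uses a stand-in list in place of a real nulldensity result. The variable name null1, the temporary path, the fields inside param and the use of save() are assumptions made for illustration; they do not describe how the package itself writes the file.

```r
# Stand-in for a nulldensity result (illustrative values only).
null1 <- structure(
  list(param = list(n = 12, k = 2, p = 2),   # hypothetical bookkeeping fields
       gen.values = runif(100)),             # placeholder for simulated EPP values
  class = "nulldensity"
)

path <- tempfile(fileext = ".RData")
save(null1, file = path)        # write the object to a binary file

rm(null1)                       # mimic starting a fresh R session
loaded <- load(path)            # load() returns the names of the restored objects
print(loaded)
print(length(null1$gen.values))
```

Because load restores objects under the names they had when saved, it is worth checking the returned names before overwriting anything in the workspace.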
The test statistic (EPP) is computed by the function cluster.test. To assess
the significance of the statistic, it is necessary to obtain its frequentist
p-value. This package does so by simulating the null distribution of the test
statistic with nulldensity and then extracting the sample quantile using
emp2pval.
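The idea of turning simulated null values into a p-value can be sketched in base R. This is not the code used by emp2pval: the stand-in draws, the observed value, and the assumption that small EPP values count as evidence against the null are all illustrative.

```r
set.seed(1)
# Stand-in for simulated null values (in practice, the gen.values component).
null.values <- rbeta(10000, 5, 2)
observed <- 0.30                 # hypothetical observed EPP

# Empirical p-value: share of null draws at least as extreme as the observed
# statistic (assumed here to mean "less than or equal to").
p.value <- mean(null.values <= observed)

# Monte Carlo standard error of that estimate; it shrinks as nsim grows.
mc.se <- sqrt(p.value * (1 - p.value) / length(null.values))

print(c(p.value = p.value, mc.se = mc.se))
```

The standard error line suggests why a larger nsim is advised for multiple testing: small p-values need many null draws before they are estimated with useful precision.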
A very small portion of the code has been written in C. The code becomes slower as
k gets larger in the alternative hypothesis.
For a particular dataset, this function can be run in parallel with cluster.test.
The object returned is of class ``nulldensity''. It is a list comprising two components.
param |
This component, again, exists purely for bookkeeping purposes.
When emp2pval is called, it takes two mandatory arguments - one of class ``cluster.test''
and the other of class ``nulldensity''. Both these objects have a parameter component, which
should match for the p-value conversion to proceed. |
gen.values |
This is a vector of length nsim , consisting of the simulations
from the null distribution. |
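The consistency check described for param might look roughly like the following sketch, which uses stand-in lists. The fields shown inside param (n, k, p) and the use of identical() are assumptions for illustration; the actual contents of param and the check performed by emp2pval may differ.

```r
# Stand-ins for the two objects; field names inside param are hypothetical.
test.obj <- structure(list(param = list(n = 12, k = 2, p = 2)),
                      class = "cluster.test")
null.obj <- structure(list(param = list(n = 12, k = 2, p = 2),
                           gen.values = runif(100)),
                      class = "nulldensity")

# Proceed only when both objects were produced with the same settings.
params.match <- identical(test.obj$param, null.obj$param)
if (!params.match) stop("parameter components differ")
print(params.match)
```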
Fuentes, C. and Gopal, V.
Fuentes, C. and Casella, G. (2008) "Testing for the Existence of Clusters" http://www.stat.ufl.edu/~casella/Papers/paper-v3.pdf
Gopal, V. "BayesClust User Manual" http://www.stat.ufl.edu/~viknesh/bayesclust/clust.html
cluster.test for further information on objects of class ``cluster.test''.
hist.nulldensity, which allows the user to plot a histogram of
simulated values in order to view the shape of the null distribution.
# Generate null density object.
null1 <- nulldensity(nsim=100, n=12, p=2, k=2)
hist(null1)