npEM {mixtools} | R Documentation |
Returns nonparametric EM algorithm output (Benaglia et al, 2009) for mixtures of multivariate (repeated measures) data where the coordinates of a row (case) in the data matrix are assumed to be independent, conditional on the mixture component (subpopulation) from which they are drawn.
npEM(x, mu0, blockid = 1:ncol(x), bw = bw.nrd0(as.vector(as.matrix(x))), samebw = TRUE, h = bw, eps = 1e-8, maxiter = 300, stochastic = FALSE, verb = TRUE)
x |
An n x r matrix of data. Each of the n rows is a case, and each case has r repeated measurements. These measurements are assumed to be conditionally independent, conditional on the mixture component (subpopulation) from which the case is drawn. |
mu0 |
Either an m x r matrix specifying the initial centers for the kmeans function, or an integer m specifying the number of initial centers, which are then chosen randomly in kmeans. |
blockid |
A vector of length r identifying coordinates (columns of x) that are assumed to be identically distributed (i.e., in the same block). For instance, the default has all distinct elements, indicating that no two coordinates are assumed identically distributed and thus a separate set of m density estimates is produced for each column of x. On the other hand, if blockid=rep(1,ncol(x)), then the coordinates in each row are assumed conditionally i.i.d. |
bw |
Bandwidth for density estimation, equal to the standard deviation of the kernel density. The default is a simplistic application of bw.nrd0 (the default bandwidth used by density) to the entire dataset, pooled into a single vector. |
samebw |
Logical: If TRUE, use the same bandwidth for each iteration and for each component and block. If FALSE, use a separate bandwidth for each component and block, and update this bandwidth at each iteration of the algorithm using a suitably modified bw.nrd0 method. |
h |
Alternative way to specify the bandwidth, to provide backward compatibility. |
eps |
Tolerance limit for declaring algorithm convergence. Convergence is declared whenever the maximum change in any coordinate of the lambda vector (of mixing proportion estimates) does not exceed eps. |
maxiter |
The maximum number of iterations allowed, for both stochastic and non-stochastic versions; for non-stochastic algorithms (stochastic = FALSE), convergence may be declared before maxiter iterations (see eps above). |
stochastic |
Logical flag: if FALSE (the default), run the non-stochastic version of the npEM algorithm, as in Benaglia et al (2009). Set to TRUE to run a stochastic version, which simulates the posteriors at each iteration and runs for maxiter iterations. |
verb |
If TRUE, print updates for every iteration of the algorithm as it runs. |
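The defaults for bw and blockid, and the stopping rule tied to eps, can be reproduced outside npEM to see what they amount to. The following is a minimal sketch in base R under stated assumptions: the data matrix is simulated, has_converged is a hypothetical helper written from the description of eps above, and nothing here calls or reproduces the internals of npEM.

```r
# Illustrative sketch only: base R, simulated data, no call to npEM.
set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)  # n = 10 cases, r = 4 coordinates

# Default bw: bw.nrd0 applied to all n * r values pooled into one vector,
# exactly as written in the usage line.
bw_default <- bw.nrd0(as.vector(as.matrix(x)))

# Default blockid: every coordinate is its own block (r blocks).
blockid_default <- 1:ncol(x)
# Alternative: all coordinates conditionally i.i.d. (a single block).
blockid_iid <- rep(1, ncol(x))
n_blocks <- c(default = length(unique(blockid_default)),
              iid     = length(unique(blockid_iid)))

# Hypothetical helper expressing the eps rule: stop when no coordinate of
# lambda changes by more than eps between consecutive iterations.
has_converged <- function(lambda_old, lambda_new, eps = 1e-8) {
  max(abs(lambda_new - lambda_old)) <= eps
}
```

With the default blockid, one set of m density estimates is produced per column; with blockid_iid, the columns share a single set.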
npEM returns a list of class npEM with the following items:
data |
The raw data (an n x r matrix). |
posteriors |
An n x m matrix of posterior probabilities, with one row for each observation and one column for each component. If stochastic = TRUE, this matrix is computed from an average over the maxiter iterations. |
bandwidth |
If samebw==TRUE, same as the bw input argument; otherwise, value of bw matrix at final iteration. This information is needed by any method that produces density estimates from the output. |
blockid |
Same as the blockid input argument, but recoded to have
positive integer values. Also needed by any method that produces density
estimates from the output. |
lambda |
The sequence of mixing proportions over iterations. |
lambdahat |
The final mixing proportions if stochastic = FALSE, or the average mixing proportions if stochastic = TRUE. |
loglik |
The sequence of log-likelihoods over iterations. |
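One common use of the posteriors item is to assign each case to its most probable component. The sketch below assumes only the documented shape of posteriors (n rows, m columns); the object fit is a hand-built stand-in with made-up values, not actual npEM output, and hard_label and cluster_sizes are illustrative names.

```r
# Stand-in for an npEM result: a plain list holding only a posteriors
# matrix with hypothetical values (n = 3 cases, m = 2 components).
fit <- list(
  posteriors = matrix(c(0.9, 0.1,
                        0.2, 0.8,
                        0.6, 0.4),
                      ncol = 2, byrow = TRUE)
)

# Hard classification: index of the largest posterior in each row.
hard_label <- apply(fit$posteriors, 1, which.max)

# Number of cases assigned to each component (empty components kept).
cluster_sizes <- table(factor(hard_label,
                              levels = seq_len(ncol(fit$posteriors))))
```

The same two lines should apply unchanged to a real fit, provided its posteriors item has the documented n x m layout.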
plot.npEM, normmixrm.sim, spEMsymloc, plotseq.npEM
## Examine and plot water-level task data set.

## First, try a 3-component solution where no two coordinates are
## assumed i.d.
data(Waterdata)
a <- npEM(Waterdata, mu0=3, bw=4) # Assume indep but not iid
plot(a) # This produces 8 plots, one for each coordinate

## Next, same thing but pairing clock angles that are directly opposite one
## another (1:00 with 7:00, 2:00 with 8:00, etc.)
b <- npEM(Waterdata, mu0=3, blockid=c(4,3,2,1,3,4,1,2), bw=4) # iid in pairs
plot(b) # Now only 4 plots, one for each block