hmm {hmm.discnp} | R Documentation |
Uses the EM algorithm to perform a maximum likelihood fit of a hidden Markov model to discrete data where the observations come from one of a number of finite discrete distributions, depending on the (hidden) state of the Markov chain. These distributions are specified (non-parametrically) by a matrix Rho = [rho_ij] where rho_ij = P(Y = y_i | S = j), Y being the observable random variable and S being the hidden state.
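As an illustration of the layout of Rho described above, here is a minimal sketch in base R. The values, the number of states, and the dimnames are hypothetical and chosen only for this example; they do not come from the package.

```r
# Hypothetical example: 3 observable values, 2 hidden states.
yval <- c("a", "b", "c")
Rho <- matrix(c(0.7, 0.2, 0.1,    # P(Y = a, b, c | S = 1)
                0.1, 0.3, 0.6),   # P(Y = a, b, c | S = 2)
              nrow = length(yval), ncol = 2,
              dimnames = list(yval, c("state1", "state2")))

# Each column is a probability distribution over yval,
# so both column sums should equal 1.
colSums(Rho)

# rho_ij for y_i = "c" and j = 2, i.e. P(Y = "c" | S = 2).
Rho["c", "state2"]
```

Because matrix() fills column by column, each group of three numbers above becomes the distribution for one state.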
hmm(y, yval=NULL, par0=NULL, K=NULL, rand.start=NULL, mixture=FALSE, tolerance=1e-4, verbose=FALSE, itmax=200, crit='PCLL', data.name=NULL)
y |
A vector or matrix of discrete data; missing values are allowed. If
y is a matrix, each column is interpreted as an independent
replicate of the observation sequence.
|
yval |
A vector of possible values for the data; it defaults to the sorted
unique values of y. If any value of y does not match
some value of yval, it will be treated as a MISSING VALUE.
|
par0 |
An optional (named) list of starting values for the parameters
of the model, with components tpm (transition probability
matrix) and Rho. The matrix Rho specifies the
probability that the observations take on each value in yval, given
the state of the hidden Markov chain. The columns of Rho
correspond to states, the rows to the values of yval.
If par0 is not specified, starting values are created by the
function init.all().
|
K |
The number of states in the hidden Markov chain; if par0 is
not specified K MUST be; if par0 is specified, K
is ignored.
Note that K=1 is acceptable; if K is 1 then all
observations are treated as being independent and the non-parametric
estimate of the distribution of the observations is calculated
in the obvious way.
|
rand.start |
A list consisting of two logical scalars which must be named
tpm and Rho. If tpm is TRUE then the function
init.all() chooses entries for the starting value of tpm at
random; likewise for Rho. This argument defaults to
list(tpm=FALSE,Rho=FALSE).
|
mixture |
A logical scalar; if TRUE then a mixture model (all rows of the transition probability matrix are identical) is fitted rather than a general hidden Markov model. |
tolerance |
If the value of the quantity used for the stopping criterion is less than tolerance then the EM algorithm is considered to have converged. |
verbose |
A logical scalar determining whether to print out details of the progress of the EM algorithm. |
itmax |
If the convergence criterion has not been met by the time itmax
EM steps have been performed, a warning message is printed out,
and the function stops. A value is returned by the function
anyway, with the logical component "converged" set to FALSE.
|
crit |
The name of the stopping criterion, which must be one of "PCLL" (percent change in log-likelihood; the default), "L2" (L-2 norm, i.e. square root of sum of squares of change in coefficients), or "Linf" (L-infinity norm, i.e. maximum absolute value of change in coefficients). |
data.name |
An identifying tag for the fit; if omitted, it defaults to the
name of data set y as determined by deparse(substitute(y)).
|
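The shape of a par0 list, as described for the arguments above, can be sketched as follows. All numeric values are hypothetical starting values invented for this example, and the checks are simply a sanity test one might run before calling hmm(); they are not part of the package.

```r
# Hypothetical starting values for a 2-state model with 3 observable values.
yval <- 1:3
tpm0 <- matrix(c(0.9, 0.1,
                 0.2, 0.8), nrow = 2, byrow = TRUE)    # rows sum to 1
Rho0 <- matrix(c(0.6, 0.3, 0.1,
                 0.1, 0.3, 0.6), nrow = length(yval))  # columns sum to 1
par0 <- list(tpm = tpm0, Rho = Rho0)

# Sanity checks: rows of tpm and columns of Rho are distributions,
# rows of Rho match yval, columns of Rho match the states.
stopifnot(all.equal(rowSums(par0$tpm), rep(1, 2)),
          all.equal(colSums(par0$Rho), rep(1, 2)),
          nrow(par0$Rho) == length(yval),
          ncol(par0$Rho) == nrow(par0$tpm))

# For a mixture model, all rows of the transition matrix are identical.
tpm.mix <- matrix(c(0.4, 0.6), nrow = 2, ncol = 2, byrow = TRUE)
```

Such a list would then be supplied as the par0 argument, in which case K is ignored.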
The hard work is done by a Fortran subroutine "recurse" (actually coded in Ratfor) which is dynamically loaded.
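The source of "recurse" is not shown here, so the following is only a sketch of a standard scaled forward recursion (as in the Rabiner tutorial) for computing a log likelihood, written in plain R. It assumes a single observation sequence with no missing values, coded as row indices of Rho. The function name forward.loglik and all numeric values are hypothetical.

```r
# Scaled forward recursion for the log likelihood of one sequence.
# y    : integer vector of row indices into Rho (no missing values assumed)
# tpm  : K x K transition probability matrix
# Rho  : matrix with Rho[i, j] = P(Y = yval[i] | S = j)
# ispd : initial state probability distribution (length K)
forward.loglik <- function(y, tpm, Rho, ispd) {
  alpha <- ispd * Rho[y[1], ]          # joint prob. of state and first obs.
  ll <- log(sum(alpha))
  alpha <- alpha / sum(alpha)          # rescale to avoid underflow
  for (t in seq_along(y)[-1]) {
    alpha <- as.vector(alpha %*% tpm) * Rho[y[t], ]
    ll <- ll + log(sum(alpha))         # accumulate log of scaling constants
    alpha <- alpha / sum(alpha)
  }
  ll
}

# Hypothetical parameters and data, for illustration only.
tpm <- matrix(c(0.9, 0.1,
                0.2, 0.8), nrow = 2, byrow = TRUE)
Rho <- matrix(c(0.6, 0.3, 0.1,
                0.1, 0.3, 0.6), nrow = 3)
y <- c(1, 1, 3, 2, 3)
forward.loglik(y, tpm, Rho, ispd = c(2, 1) / 3)
```

Rescaling at each step keeps the forward probabilities away from numerical underflow on long sequences; the log likelihood is recovered as the sum of the logs of the scaling constants.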
A list with components:
Rho |
The fitted value of the probability matrix Rho specifying the
distributions of the observations.
|
tpm |
The fitted value of the transition probability matrix tpm.
|
ispd |
The fitted initial state probability distribution, assumed to
be the (unique) stationary distribution for the chain, and thereby
determined by the transition probability matrix tpm.
|
log.like |
The final value of the log likelihood, as calculated through recursion. |
converged |
A logical scalar saying whether the algorithm satisfied the convergence criterion before the maximum of itmax EM steps was exceeded. |
nstep |
The number of EM steps performed by the algorithm. |
data.name |
An identifying tag, specified as an argument, or determined from the name of the argument y by deparse(substitute(y)). |
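Since ispd is taken to be the stationary distribution determined by tpm, the relationship can be sketched in base R as below. How the package itself computes ispd is not stated here; the helper name stationary and the example matrix are assumptions made for illustration.

```r
# Stationary distribution of an irreducible chain: solve pi %*% tpm = pi
# subject to sum(pi) = 1.
stationary <- function(tpm) {
  K <- nrow(tpm)
  # Stack the K balance equations with the normalisation constraint
  # and solve the (K + 1) x K system by least squares.
  A <- rbind(t(diag(K) - tpm), rep(1, K))
  b <- c(rep(0, K), 1)
  as.vector(qr.solve(A, b))
}

# Hypothetical transition probability matrix.
tpm <- matrix(c(0.9, 0.1,
                0.2, 0.8), nrow = 2, byrow = TRUE)
ispd <- stationary(tpm)

# The result should be left unchanged by one step of the chain.
all.equal(as.vector(ispd %*% tpm), ispd)
```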
If K=1 then tpm, ispd, converged, and nstep are all set equal to NA in the list returned by this function.
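For the K=1 case, the estimate of the distribution of the observations is presumably just the table of relative frequencies over yval. The sketch below, in base R with made-up data, shows that calculation together with the rule that values not matching yval are treated as missing. It is an assumption about what "the obvious way" means, not the package's code.

```r
# Hypothetical data with one missing value and one value (4) outside yval.
y <- c(1, 2, 2, 3, NA, 2, 1, 4)
yval <- 1:3

# Values not in yval become NA, i.e. they are treated as missing.
y.coded <- factor(y, levels = yval)

# Relative frequencies of the non-missing observations, as a one-column
# matrix laid out like Rho (rows correspond to the values of yval).
counts <- table(y.coded)
Rho1 <- matrix(as.vector(counts) / sum(counts), ncol = 1,
               dimnames = list(as.character(yval), NULL))
Rho1
```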
The ordering of the (hidden) states can be arbitrary. What the estimation procedure decides to call "state 1" may not be what you think of as being state number 1. The ordering of the states will be affected by the starting values used.
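If a particular ordering of the states is wanted after fitting, the components can be permuted consistently. The following is a minimal sketch assuming a list with components tpm, Rho and ispd as described under Value; the helper relabel.states, the ordering rule, and the numbers are hypothetical.

```r
# Reorder the states of a fitted model; 'fit' is assumed to be a list with
# components tpm, Rho and ispd.
relabel.states <- function(fit, perm) {
  fit$tpm  <- fit$tpm[perm, perm, drop = FALSE]  # permute rows and columns
  fit$Rho  <- fit$Rho[, perm, drop = FALSE]      # columns correspond to states
  fit$ispd <- fit$ispd[perm]
  fit
}

# Hypothetical fitted values, for illustration only.
fit <- list(tpm  = matrix(c(0.9, 0.1,
                            0.2, 0.8), nrow = 2, byrow = TRUE),
            Rho  = matrix(c(0.6, 0.3, 0.1,
                            0.1, 0.3, 0.6), nrow = 3),
            ispd = c(2, 1) / 3)

# Order the states by the probability of the first observable value.
perm <- order(fit$Rho[1, ])
fit2 <- relabel.states(fit, perm)
```

Any rule that yields a permutation of the state indices would work here; ordering by a row of Rho is just one convenient convention.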
Rolf Turner r.turner@auckland.ac.nz http://www.math.unb.ca/~rolf
Rabiner, L. R., "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE vol. 77, pp. 257-286, 1989.
Zucchini, W. and Guttorp, P., "A hidden Markov model for space-time precipitation," Water Resources Research vol. 27, pp. 1917-1923, 1991.
MacDonald, I. L., and Zucchini, W., "Hidden Markov and Other Models for Discrete-valued Time Series", Chapman & Hall, London, 1997.
Liu, Limin, "Hidden Markov Models for Precipitation in a Region of Atlantic Canada", Master's Report, University of New Brunswick, 1997.
# See the help for sim.hmm() for how to generate y.sim.
## Not run:
try <- hmm(y.sim,K=2,verb=TRUE)
try.mix <- hmm(y.sim,K=2,verb=TRUE,mixture=TRUE)
## End(Not run)