hmm {hmm.discnp}    R Documentation

Fit a hidden Markov model to discrete data.

Description

Uses the EM algorithm to perform a maximum likelihood fit of a hidden Markov model to discrete data where the observations come from one of a number of finite discrete distributions, depending on the (hidden) state of the Markov chain. These distributions are specified (non-parametrically) by a matrix Rho = [rho_ij] where rho_ij = P(Y = y_i | S = j), Y being the observable random variable and S being the hidden state.
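
To make the roles of the two matrices concrete, the following base R sketch builds a small transition probability matrix and a matrix Rho, then simulates a sequence from the resulting model. All numeric values, the name y.toy, and the choice of a uniformly random initial state are assumptions made for illustration; none of them come from the package.

```r
# Illustrative values only; not taken from the package.
yval <- c("a", "b", "c")                       # possible observed values
tpm  <- matrix(c(0.9, 0.1,
                 0.2, 0.8), nrow = 2, byrow = TRUE)   # each row sums to 1
Rho  <- matrix(c(0.7, 0.1,
                 0.2, 0.3,
                 0.1, 0.6), nrow = 3, byrow = TRUE,
               dimnames = list(yval, c("state1", "state2")))  # each column sums to 1

# Rho[i, j] = P(Y = yval[i] | S = j), so columns are distributions.
stopifnot(all(abs(rowSums(tpm) - 1) < 1e-12),
          all(abs(colSums(Rho) - 1) < 1e-12))

set.seed(1)
n <- 200
s <- integer(n)                                # hidden states
s[1] <- sample(1:2, 1)                         # assumption: uniform initial state
for (t in 2:n) s[t] <- sample(1:2, 1, prob = tpm[s[t - 1], ])

# Draw each observation from the column of Rho for the current state.
y.toy <- vapply(s, function(j) sample(yval, 1, prob = Rho[, j]), character(1))
table(y.toy)
```

A named list of the form list(tpm=tpm, Rho=Rho) has the same shape as the starting values described for par0 below.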

Usage

hmm(y, yval=NULL, par0=NULL, K=NULL, rand.start=NULL, mixture=FALSE,
    tolerance=1e-4, verbose=FALSE, itmax=200, crit='PCLL', data.name=NULL)

Arguments

y A vector or matrix of discrete data; missing values are allowed. If y is a matrix, each column is interpreted as an independent replicate of the observation sequence.
yval A vector of possible values for the data; it defaults to the sorted unique values of y. If any value of y does not match some value of yval, it will be treated as a MISSING VALUE.
par0 An optional (named) list of starting values for the parameters of the model, with components tpm (transition probability matrix) and Rho. The matrix Rho specifies the probability that the observations take on each value in yval, given the state of the hidden Markov chain. The columns of Rho correspond to states, the rows to the values of yval.
If par0 is not specified, starting values are created by the function init.all().
K The number of states in the hidden Markov chain. If par0 is not specified, K must be supplied; if par0 is specified, K is ignored.
Note that K=1 is acceptable; if K is 1 then all observations are treated as being independent and the non-parametric estimate of the distribution of the observations is calculated in the obvious way.
rand.start A list consisting of two logical scalars, which must be named tpm and Rho. If tpm is TRUE, then the function init.all() chooses the entries of the starting value of tpm at random; likewise for Rho. This argument defaults to list(tpm=FALSE,Rho=FALSE).
mixture A logical scalar; if TRUE then a mixture model (all rows of the transition probability matrix are identical) is fitted rather than a general hidden Markov model.
tolerance If the value of the quantity used for the stopping criterion is less than tolerance then the EM algorithm is considered to have converged.
verbose A logical scalar determining whether to print out details of the progress of the EM algorithm.
itmax The maximum number of EM steps. If the convergence criterion has not been met after itmax EM steps, a warning is issued and the iteration stops. The function still returns a value, with the logical component "converged" set to FALSE.
crit The name of the stopping criterion, which must be one of "PCLL" (percent change in log-likelihood; the default), "L2" (L-2 norm, i.e. square root of sum of squares of change in coefficients), or "Linf" (L-infinity norm, i.e. maximum absolute value of change in coefficients).
data.name An identifying tag for the fit; if omitted, it defaults to the name of data set y as determined by deparse(substitute(y)).
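
The three stopping criteria named for crit can be sketched in base R as follows. The helper stop.value() is hypothetical and is not part of hmm.discnp; in particular, the exact formula used for the percent change in log-likelihood is an assumption based only on the description above.

```r
# Hypothetical helper, not a function from hmm.discnp.
# old.theta/new.theta: parameter vectors from successive EM steps.
# old.ll/new.ll: log-likelihoods from successive EM steps.
stop.value <- function(crit = c("PCLL", "L2", "Linf"),
                       old.theta, new.theta, old.ll, new.ll) {
  crit <- match.arg(crit)
  d <- new.theta - old.theta
  switch(crit,
         PCLL = 100 * (new.ll - old.ll) / abs(old.ll),  # assumed definition
         L2   = sqrt(sum(d^2)),                         # root sum of squares
         Linf = max(abs(d)))                            # largest absolute change
}

# The EM iteration would be declared converged once this drops below tolerance.
v <- stop.value("L2", old.theta = c(0, 0), new.theta = c(3, 4),
                old.ll = -10, new.ll = -9)
v < 1e-4
```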

Details

The hard work is done by a Fortran subroutine "recurse" (actually coded in Ratfor) which is dynamically loaded.
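
For readers who want to see what such a recursion looks like, here is a minimal base R sketch of a scaled forward recursion for the log-likelihood of a single observation sequence. It is a textbook-style illustration and not a transcription of the package's Ratfor code. The function name forward.loglik(), the coding of y as row indices of Rho, and the treatment of a missing value as contributing probability 1 in every state are all assumptions.

```r
# Hypothetical illustration; not the package's "recurse" subroutine.
# y:    integer indices into the rows of Rho (NA = missing observation)
# ispd: initial state probability distribution
forward.loglik <- function(y, tpm, Rho, ispd) {
  K <- nrow(tpm)
  ll <- 0
  alpha <- ispd
  for (t in seq_along(y)) {
    if (t > 1) alpha <- as.vector(alpha %*% tpm)        # propagate one step
    # Emission probabilities per state; assumed to be 1 for a missing value.
    f <- if (is.na(y[t])) rep(1, K) else Rho[y[t], ]
    alpha <- alpha * f
    c.t <- sum(alpha)           # P(y_t | y_1, ..., y_{t-1})
    ll <- ll + log(c.t)
    alpha <- alpha / c.t        # rescale to avoid numerical underflow
  }
  ll
}

tpm   <- matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2, byrow = TRUE)
Rho   <- matrix(c(0.7, 0.2, 0.1,  0.1, 0.3, 0.6), nrow = 3)  # columns are states
ispd  <- c(2/3, 1/3)            # stationary distribution of this tpm
y.idx <- c(1, 3, NA, 2, 3)
ll    <- forward.loglik(y.idx, tpm, Rho, ispd)
```

Rescaling at every step keeps the forward probabilities away from zero for long sequences, and the log-likelihood is recovered as the sum of the logs of the scaling constants.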

Value

A list with components:

Rho The fitted value of the probability matrix Rho specifying the distributions of the observations.
tpm The fitted value of the transition probability matrix tpm.
ispd The fitted initial state probability distribution, assumed to be the (unique) stationary distribution for the chain, and thereby determined by the transition probability matrix tpm.
log.like The final value of the log likelihood, as calculated through recursion.
converged A logical scalar saying whether the algorithm satisfied the convergence criterion before the maximum of itmax EM steps was exceeded.
nstep The number of EM steps performed by the algorithm.
data.name An identifying tag, specified as an argument, or determined from the name of the argument y by deparse(substitute(y)).
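
Since ispd is described as the stationary distribution determined by tpm, it can be recovered from a transition probability matrix alone. The sketch below shows one way to do that in base R; the helper stationary() is hypothetical, and the package may compute ispd differently.

```r
# Hypothetical helper, not a function from hmm.discnp.
stationary <- function(tpm) {
  K <- nrow(tpm)
  # Solve p %*% tpm = p subject to sum(p) = 1 by stacking
  # (t(tpm) - I) on top of a row of ones and solving by least squares.
  A <- rbind(t(tpm) - diag(K), rep(1, K))
  b <- c(rep(0, K), 1)
  as.vector(qr.solve(A, b))
}

tpm  <- matrix(c(0.9, 0.1,
                 0.2, 0.8), nrow = 2, byrow = TRUE)
ispd <- stationary(tpm)
ispd                      # approximately 2/3 and 1/3
ispd %*% tpm              # unchanged by one step of the chain
```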

Note

If K=1 then tpm, ispd, converged, and nstep are all set equal to NA in the list returned by this function.
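
With K=1 the only remaining quantity is a single distribution over yval. One natural reading of "the obvious way" is the vector of relative frequencies of the non-missing observations; the sketch below shows that reading in base R. It is an interpretation, not a statement of what the package does internally, and the toy data are made up.

```r
# Made-up data; NA marks a missing observation.
y    <- c("a", "b", "a", NA, "c", "a")
yval <- c("a", "b", "c")

# Count each value of yval; table() drops NA by default.
counts <- table(factor(y, levels = yval))

# Relative frequencies arranged as a one-column Rho (one column per state).
Rho1 <- matrix(as.vector(counts) / sum(counts), ncol = 1,
               dimnames = list(yval, NULL))
Rho1
```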

Warning

The ordering of the (hidden) states can be arbitrary. What the estimation procedure decides to call "state 1" may not be what you think of as being state number 1. The ordering of the states will be affected by the starting values used.
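
The following base R sketch illustrates why the labelling is arbitrary: relabelling the states permutes the rows and columns of tpm and the columns of Rho, but leaves the distribution of the observations unchanged. The numeric values are made up for illustration.

```r
tpm  <- matrix(c(0.9, 0.1,
                 0.2, 0.8), nrow = 2, byrow = TRUE)
Rho  <- matrix(c(0.7, 0.2, 0.1,  0.1, 0.3, 0.6), nrow = 3)  # columns are states
ispd <- c(2/3, 1/3)                     # stationary distribution of tpm
stopifnot(all(abs(as.vector(ispd %*% tpm) - ispd) < 1e-12))

perm   <- c(2, 1)                       # swap the labels of the two states
tpm.p  <- tpm[perm, perm]               # permute rows and columns together
Rho.p  <- Rho[, perm]                   # permute columns only
ispd.p <- ispd[perm]

# The marginal distribution of an observation is the same under both labellings.
all.equal(as.vector(Rho %*% ispd), as.vector(Rho.p %*% ispd.p))
```

When comparing two fits, it can therefore help to match states by their columns of Rho rather than by their index.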

Author(s)

Rolf Turner r.turner@auckland.ac.nz http://www.math.unb.ca/~rolf

References

Rabiner, L. R., "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE vol. 77, pp. 257-286, 1989.

Zucchini, W. and Guttorp, P., "A hidden Markov model for space-time precipitation," Water Resources Research vol. 27, pp. 1917-1923, 1991.

MacDonald, I. L., and Zucchini, W., "Hidden Markov and Other Models for Discrete-valued Time Series", Chapman & Hall, London, 1997.

Liu, Limin, "Hidden Markov Models for Precipitation in a Region of Atlantic Canada", Master's Report, University of New Brunswick, 1997.

See Also

sim.hmm(), mps(), viterbi()

Examples

# See the help for sim.hmm() for how to generate y.sim.
## Not run: 
try <- hmm(y.sim,K=2,verbose=TRUE)
try.mix <- hmm(y.sim,K=2,verbose=TRUE,mixture=TRUE)
## End(Not run)

[Package hmm.discnp version 0.0-9 Index]