dthmm {HiddenMarkov} | R Documentation |
Creates a discrete time hidden Markov model object with class "dthmm"
.
dthmm(x, Pi, delta, distn, pm, pn = NULL, discrete = NULL, nonstat = TRUE)
x |
is a vector of length n containing the observed process. Alternatively, x could be specified as NULL , meaning that the data will be added later (e.g. simulated). |
Pi |
is the m times m transition probability matrix of the homogeneous hidden Markov chain. |
delta |
is the marginal probability distribution of the m hidden states at the first time point. |
distn |
is a character string with the distribution name, e.g. "norm" or "pois" . If the distribution is specified as "wxyz" then a distribution function called "pwxyz" should be available, in the standard R format (e.g. pnorm or ppois ). |
pm |
is a list object containing the (Markov dependent) parameter values associated with the distribution of the observed process (see below). |
pn |
is a list object containing the observation dependent parameter values associated with the distribution of the observed process (see below). |
discrete |
is logical, and is TRUE if distn is a discrete distribution. Set automatically for distributions already contained in the package. |
nonstat |
is logical, TRUE if the homogeneous Markov chain is assumed to be non-stationary, default. See “Details” below. |
A list
object with class "dthmm"
, containing the above arguments as named components.
t
and T
in R, we use i = 1, cdots, n.
X^{(i)} = (X_1, cdots, X_i)
where i = 1, cdots, n. Similarly for C^{(i)}.
π_{jk} = Pr{ C_{i+1}=k , | , C_i=j }
for all i (i.e. all time points), and j,k = 1, cdots, m.
The list object pm
contains parameter values for the probability distribution of the observed process that are dependent on the hidden Markov state. These parameters are generally required to be estimated. See “Modifications” in topic Mstep
when some do not require estimation.
Assume that the hidden Markov chain has m states, and that there are ell parameters that are dependent on the hidden Markov state. Then the list object pm
should contain ell named vector components each of length m. The names are determined by the required probability distribution.
For example, if distn == "norm"
, the arguments names must coincide with those used by the functions dnorm
or rnorm
, which are mean
and sd
. Each must be specified in either pm
or pn
. If they both vary according to the hidden Markov state then pm
should have the named components mean
and sd
. These are both vectors of length m containing the means and standard deviations of the observed process when the hidden Markov chain is in each of the m states. If, for example, sd
was “time” dependent, then sd
would be contained in pn
(see below).
If distn == "pois"
, then pm
should have one component named lambda
, being the parameter name in the function dpois
. Even if there is only one parameter, the vector component should still be within a list and named.
The list object pn
contains parameter values of the probability distribution for the observed process that are dependent on the observation number or “time”. These parameters are assumed to be known.
Assume that the observed process is of length n, and that there are ell parameters that are dependent on the observation number or time. Then the list object pn
should contain ell named vector components each of length n. The names, as in pm
, are determined by the required probability distribution.
For example, in the observed process we may count the number of successes in a known number of Bernoulli trials, i.e. the number of Bernoulli trials is known at each time point, but the probability of success varies according to a hidden Markov state. The prob
parameter of rbinom
(or dbinom
) would be specified in pm
and the size
parameter would specified in pn
.
One could also have a situation where the observed process was Gaussian, with the means varying according to the hidden Markov state, but the variances varying non-randomly according to the observation number (or vice versa). Here mean
would be specified within pm
and sd
within pn
. Note that a given parameter can only occur within one of pm
or pn
.
The “complete data likelihood”, L_c, is
L_c = Pr{ X_1=x_1, cdots, X_n=x_n, C_1=c_1, cdots, C_n=c_n },.
This can be shown to be
Pr{ X_1=x_1 ,|, C_1=c_1 } Pr{ C_1=c_1 } prod_{i=2}^n Pr{ X_i=x_i ,|, C_i=c_i } Pr{ C_i=c_i ,|, C_{i-1}=c_{i-1} },,
and hence, substituting model parameters, we get
L_c = delta_{c_1} π_{c_1c_2} π_{c_2c_3} cdots π_{c_{n-1}c_n} prod_{i=1}^n Pr{ X_i=x_i ,|, C_i=c_i },,
and so
log L_c = log delta_{c_1} + sum_{i=2}^n log π_{c_{i-1}c_i} + sum_{i=1}^n log Pr{ X_i=x_i ,|, C_i=c_i },.
Hence the “complete data likelihood” is split into three terms: the first relates to parameters of the marginal distribution (Markov chain), the second to the transition probabilities, and the third to the distribution parameters of the observed random variable. When the Markov chain is non-stationary, each term can be maximised separately.
When the hidden Markov chain is assumed to be stationary, delta = Pi^prime delta (see topic compdelta
), and then the first two terms of log L_c determine the transition probabilities Pi. This raises more complicated numerical problems, as the first term is effectively a constraint. We deal with this in a slightly ad-hoc manner by effectively disregarding the first term, which is assumed to be relatively small. In the M-step, the transition matrix is determined by the second term, then delta is estimated using the relation delta = delta Pi.
#----- Test Gaussian Distribution ----- Pi <- matrix(c(1/2, 1/2, 0, 1/3, 1/3, 1/3, 0, 1/2, 1/2), byrow=TRUE, nrow=3) delta <- c(0, 1, 0) x <- dthmm(NULL, Pi, delta, "norm", list(mean=c(1, 6, 3), sd=c(0.5, 1, 0.5))) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi))) #----- Test Poisson Distribution ----- Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow=TRUE, nrow=2) delta <- c(0, 1) x <- dthmm(NULL, Pi, delta, "pois", list(lambda=c(4, 0.1)), discrete = TRUE) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi))) #----- Test Exponential Distribution ----- Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow=TRUE, nrow=2) delta <- c(0, 1) x <- dthmm(NULL, Pi, delta, "exp", list(rate=c(2, 0.1))) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi))) #----- Test Beta Distribution ----- Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow=TRUE, nrow=2) delta <- c(0, 1) x <- dthmm(NULL, Pi, delta, "beta", list(shape1=c(2, 6), shape2=c(6, 2))) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi))) #----- Test Binomial Distribution ----- Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow=TRUE, nrow=2) delta <- c(0, 1) # vector of "fixed & known" number of Bernoulli trials pn <- list(size=rpois(1000, 10)+1) x <- dthmm(NULL, Pi, delta, "binom", list(prob=c(0.2, 0.8)), pn, discrete=TRUE) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi))) #----- Test Gamma Distribution ----- Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow=TRUE, nrow=2) delta <- c(0, 1) pm <- list(rate=c(4, 0.5), shape=c(3, 3)) x <- seq(0.01, 10, 0.01) plot(x, dgamma(x, rate=pm$rate[1], shape=pm$shape[1]), type="l", col="blue", ylab="Density") points(x, dgamma(x, rate=pm$rate[2], shape=pm$shape[2]), type="l", col="red") x <- dthmm(NULL, Pi, delta, "gamma", pm) x <- simulate(x, nsim=1000) # use above parameter values as initial values y <- BaumWelch(x) print(summary(y)) print(logLik(y)) hist(residuals(y)) # check parameter estimates print(sum(y$delta)) print(y$Pi %*% rep(1, ncol(y$Pi)))