Mstep {HiddenMarkov} | R Documentation |
Performs the maximisation step of the EM algorithm for a dthmm
process. This function is called by the BaumWelch
function. The Baum-Welch algorithm used in the HMM literature is a version of the EM algorithm.
Mstep.beta(x, cond, pm, pn, maxiter = 200)
Mstep.binom(x, cond, pm, pn)
Mstep.exp(x, cond, pm, pn)
Mstep.gamma(x, cond, pm, pn, maxiter = 200)
Mstep.glm(x, cond, pm, pn, family, link)
Mstep.lnorm(x, cond, pm, pn)
Mstep.logis(x, cond, pm, pn, maxiter = 200)
Mstep.norm(x, cond, pm, pn)
Mstep.pois(x, cond, pm, pn)
x |
is a vector of length n containing the observed process. |
cond |
is an object created by Estep. |
family |
character string, the GLM family, one of "gaussian", "poisson", "Gamma" or "binomial". |
link |
character string, the link function. If family == "binomial", then one of "logit", "probit" or "cloglog"; else one of "identity", "inverse" or "log". |
pm |
is a list object containing the current (Markov dependent) parameter estimates associated with the distribution of the observed process (see dthmm). These are only used as initial values if the algorithm within the Mstep is iterative. |
pn |
is a list object containing the observation dependent parameter values associated with the distribution of the observed process (see dthmm). |
maxiter |
maximum number of Newton-Raphson iterations. |
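To make the argument shapes concrete, the following base R sketch builds toy inputs and applies a closed-form weighted update for a Poisson rate. It is not the package's code: `mstep_pois_sketch` is a hypothetical name, and `cond$u` is assumed to be an n by m matrix of weights with one column per hidden state, which is how the Gaussian code in the Examples uses it.

```r
# Hypothetical helper, not part of HiddenMarkov: weighted Poisson M-step.
# Assumes cond$u is an n x m matrix of state weights (one column per state).
mstep_pois_sketch <- function(x, cond, pm, pn = NULL) {
  u <- cond$u
  # u * x recycles x down each column, giving u[i, j] * x[i];
  # the ratio of column sums is the weighted mean of x within each state
  lambda <- as.numeric(colSums(u * x) / colSums(u))
  list(lambda = lambda)
}

# Toy data: 6 observations, 2 states, hard 0/1 weights for clarity
x    <- c(1, 2, 3, 10, 11, 12)
cond <- list(u = cbind(c(1, 1, 1, 0, 0, 0), c(0, 0, 0, 1, 1, 1)))
pm   <- list(lambda = c(1, 1))  # current estimates; unused by a closed-form update
est  <- mstep_pois_sketch(x, cond, pm)
print(est$lambda)
```

With these hard weights the update reduces to the per-group sample means. The returned list has the same names as `pm`, which mirrors the return convention described for the Mstep functions.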
The functions Mstep.beta, Mstep.binom, Mstep.exp, Mstep.gamma, Mstep.lnorm, Mstep.logis, Mstep.norm and Mstep.pois perform the maximisation step for the Beta, Binomial, Exponential, Gamma, Log Normal, Logistic, Normal and Poisson distributions, respectively. Each function has the same argument list, even if specific arguments are redundant, because the functions are called from within other functions in a generic-like manner. Specific notes for some follow.
Mstep.beta
The R functions for the Beta distribution have arguments shape1 and shape2; the density also has ncp. We only use shape1 and shape2, i.e. ncp is assumed to be zero. Different combinations of "shape1" and "shape2" can be “time” dependent (specified in pn) and Markov dependent (specified in pm). However, each should only be specified in one (see topic dthmm).
Mstep.binom
The size argument of the binomial distribution should always be specified in the pn argument (see topic dthmm).
Mstep.gamma
The R functions for the Gamma distribution have arguments shape, rate and scale. Since scale is redundant, we only use shape and rate. Different combinations of "shape" and "rate" can be “time” dependent (specified in pn) and Markov dependent (specified in pm). However, each should only be specified in one (see topic dthmm).
Mstep.lnorm
Different combinations of "meanlog" and "sdlog" can be “time” dependent (specified in pn) and Markov dependent (specified in pm). However, each should only be specified in one (see topic dthmm).
Mstep.logis
Different combinations of "location" and "scale" can be “time” dependent (specified in pn) and Markov dependent (specified in pm). However, each should only be specified in one (see topic dthmm).
Mstep.norm
Different combinations of "mean" and "sd" can be “time” dependent (specified in pn) and Markov dependent (specified in pm). However, each should only be specified in one (see topic dthmm).
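The split between pm and pn described in these notes can be illustrated with the binomial case, where the number of trials is known for each observation and only the success probability is estimated. The sketch below is a base R illustration under stated assumptions, not the package's Mstep.binom: the name `mstep_binom_sketch` is hypothetical, and `cond$u` is assumed to be an n by m matrix of state weights.

```r
# Hypothetical helper, not part of HiddenMarkov: weighted binomial M-step
# with the known number of trials carried in pn$size.
mstep_binom_sketch <- function(x, cond, pm, pn) {
  u <- cond$u
  # Weighted successes divided by weighted trials, per state
  prob <- as.numeric(colSums(u * x) / colSums(u * pn$size))
  list(prob = prob)
}

x    <- c(1, 2, 8, 9)             # observed successes
pn   <- list(size = rep(10, 4))   # trials per observation (known, not estimated)
pm   <- list(prob = c(0.5, 0.5))  # Markov dependent parameter to estimate
cond <- list(u = cbind(c(1, 1, 0, 0), c(0, 0, 1, 1)))
est  <- mstep_binom_sketch(x, cond, pm, pn)
print(est$prob)
```

Only `prob` appears in the result, so the output keeps the structure of `pm`, while `size` stays in `pn` and is never updated.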
A list object with the same structure as pm (see topic dthmm).
Consider a distribution with two parameters where both parameters are Markov dependent, but one is known and the other requires estimation. For example, consider the Gaussian distribution. Say we know the Markov dependent means, but we need to estimate the standard deviations. Since both parameters are Markov dependent, they both need to be specified in the pm argument. The estimation of the distribution specific parameters takes place in the Mstep, in this case Mstep.norm. To achieve what we want, we need to modify this function. In this case it is relatively easy (see code in “Examples” below). From the function Mstep.norm, take the code under the section if (all(nms==c("mean", "sd"))), i.e. the case where both of the parameters are Markov dependent. However, replace the line where the mean is estimated with mean <- pm$mean, i.e. leave it as it was initially specified. Then source this revised function so that it is found by R in preference to the standard version in the package.
One needs to take a little more care when dealing with a distribution like the beta, where the cross derivatives of the log likelihood between the parameters, i.e. $\partial^2 \log L / (\partial \alpha_1 \partial \alpha_2)$, are non-zero.
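To see why the beta case needs care, the cross derivative can be written out for a single state. The derivation below is a sketch under an assumption: the quantity being maximised is taken to be a weighted log likelihood with weights $u_i$ (for example one column of cond$u), with $\alpha_1$ and $\alpha_2$ standing for shape1 and shape2, and $\psi_1$ denoting the trigamma function.

```latex
\log L(\alpha_1, \alpha_2)
  = \sum_i u_i \Big[ (\alpha_1 - 1)\log x_i + (\alpha_2 - 1)\log(1 - x_i)
      + \log\Gamma(\alpha_1 + \alpha_2) - \log\Gamma(\alpha_1) - \log\Gamma(\alpha_2) \Big]

\frac{\partial^2 \log L}{\partial \alpha_1^2}
  = \big[ \psi_1(\alpha_1 + \alpha_2) - \psi_1(\alpha_1) \big] \sum_i u_i

\frac{\partial^2 \log L}{\partial \alpha_1 \, \partial \alpha_2}
  = \psi_1(\alpha_1 + \alpha_2) \sum_i u_i \; > \; 0
```

Because the trigamma function is strictly positive, the off-diagonal term never vanishes. Under this formulation a Newton-Raphson step for one shape parameter depends on the other through the full 2 by 2 Hessian, so fixing one parameter changes the update for the other rather than simply deleting a line.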
The Mstep functions can be used to estimate the maximum likelihood parameters from a simple sample. See the example below where this is done for the logistic distribution.
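Independently of the package, the same idea can be sketched in base R by maximising an equally weighted logistic log likelihood numerically. This is only an illustration of the weighting: it uses `optim` with its default Nelder-Mead search in place of the Newton-Raphson iterations that Mstep.logis is documented to use, and the names `w`, `negloglik`, `fit` and `est` are introduced here for the sketch.

```r
set.seed(1)
n <- 20000
x <- rlogis(n, location = -2, scale = 1.5)
w <- rep(1 / n, n)  # equal weight for each datum

# Negative weighted log likelihood; the scale is kept positive
# by optimising over log(scale)
negloglik <- function(par) {
  -sum(w * dlogis(x, location = par[1], scale = exp(par[2]), log = TRUE))
}

# Start from the sample median and a unit scale
fit <- optim(c(median(x), log(1)), negloglik)
est <- c(location = fit$par[1], scale = exp(fit$par[2]))
print(est)
```

With a sample this large the estimates should land close to the simulation values of -2 and 1.5, although the exact digits depend on the random seed and the optimiser's tolerance.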
# Fit logistic distribution to a simple single sample

# Simulate data
n <- 20000
location <- -2
scale <- 1.5

x <- rlogis(n, location, scale)

# give each datum equal weight
cond <- NULL
cond$u <- matrix(rep(1/n, n), ncol=1)

# calculate maximum likelihood parameter estimates
# start iterations at values used to simulate
print(Mstep.logis(x, cond,
                  pm=list(location=location,
                          scale=scale)))

# Example with Gaussian Observations
# Assume that both mean and sd are Markov dependent, but the means
# are known and sd requires estimation (See "Modifications" above).

Mstep.norm <- function(x, cond, pm, pn){
    nms <- sort(names(pm))
    n <- length(x)
    m <- ncol(cond$u)
    if (all(nms==c("mean", "sd"))){
        mean <- pm$mean
        sd <- sqrt(apply((matrix(x, nrow = n, ncol=m) -
                   matrix(mean, nrow = n, ncol=m, byrow=TRUE))^2 * cond$u,
                   MARGIN=2, FUN=sum)/apply(cond$u, MARGIN = 2, FUN = sum))
        return(list(mean=mean, sd=sd))
    }
}

Pi <- matrix(c(1/2, 1/2,   0,
               1/3, 1/3, 1/3,
                 0, 1/2, 1/2),
             byrow=TRUE, nrow=3)
p1 <- c(1, 6, 3)
p2 <- c(0.5, 1, 0.5)
n <- 1000

pm <- list(mean=p1, sd=p2)

x <- dthmm(NULL, Pi, c(0, 1, 0), "norm", pm)

x <- simulate(x, n)

# use above parameter values as initial values
y <- BaumWelch(x)

print(y$delta)
print(y$pm)
print(y$Pi)