phmclust {mixPHM} | R Documentation |
This function fits proportional hazards models under different distributional assumptions for the underlying baseline hazard. Several options for imposing proportionality restrictions on the hazards are provided. The function also offers several variations of the EM algorithm, which differ in how the posteriors are used in the M-step.
phmclust(x, K, method = "separate", Sdist = "weibull", cutpoint = NULL, EMstart = NA, EMoption = "classification", EMstop = 0.01, maxiter = 100)
x |
Data frame or matrix of dimension n*p with survival times (NAs allowed). |
K |
Number of mixture components. |
method |
Imposing proportionality restrictions on the hazards:
With "separate" no restrictions are imposed, "main.g" relates to a group main effect,
"main.p" to variable main effects. "main.gp" reflects the proportionality assumption over groups
and variables. "int.gp" allows for interactions between groups and variables. |
Sdist |
Various survival distributions such as "weibull", "exponential", and "rayleigh". |
cutpoint |
Integer value with upper bound for observed dwell times. Values above this cutpoint are regarded as censored. If NULL, no censoring is performed. |
EMstart |
Vector of length n with starting values for group membership,
NA indicates random starting values. |
EMoption |
"classification" is based on deterministic cluster assignment,
"maximization" on deterministic assignment, and "randomization"
provides a posterior-based randomized cluster assignment. |
EMstop |
Stopping criterion for EM-iteration. |
maxiter |
Maximum number of iterations. |
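The difference between a deterministic and a randomized cluster assignment, as described for EMoption, can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the posterior matrix `post` and the helper names `assign_deterministic` and `assign_randomized` are hypothetical.

```r
# Hypothetical posterior matrix: 4 observations (rows) x 3 components (columns).
# Each row sums to 1.
post <- matrix(c(0.7, 0.2, 0.1,
                 0.1, 0.8, 0.1,
                 0.3, 0.3, 0.4,
                 0.5, 0.4, 0.1),
               ncol = 3, byrow = TRUE)

# Deterministic assignment: each observation goes to the component
# with the largest posterior probability.
assign_deterministic <- function(post) {
  max.col(post, ties.method = "first")
}

# Randomized assignment: draw one component per observation, using that
# observation's posterior row as the sampling probabilities.
assign_randomized <- function(post) {
  apply(post, 1, function(p) sample.int(length(p), size = 1, prob = p))
}

set.seed(1)
hard <- assign_deterministic(post)  # 1 2 3 1
rand <- assign_randomized(post)     # varies with the random seed
```

With the deterministic rule the grouping is fully determined by the posteriors, whereas the randomized rule can place an observation in a less likely component, which may help the algorithm escape poor starting partitions.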
The method "separate" corresponds to an ordinary mixture model. "main.g" imposes proportionality restrictions over variables (i.e., the group main effect allows for free-varying variable hazards). "main.p" imposes proportionality restrictions over groups (i.e., the variable main effect allows for free-varying group hazards).
If clusters with only one observation are generated, the algorithm stops.
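What a proportionality restriction means for Weibull hazards can be illustrated with a short base R sketch. It assumes the common parameterization h(t) = lambda * gamma * t^(gamma - 1), which may differ from the one used internally by the package; the function name `weibull_hazard` and all parameter values are hypothetical.

```r
# Weibull hazard under the assumed parameterization
# h(t) = lambda * gamma * t^(gamma - 1)
weibull_hazard <- function(t, lambda, gamma) {
  lambda * gamma * t^(gamma - 1)
}

t <- c(1, 10, 100)

# Same shape, different scale: the hazard ratio is constant over time,
# i.e. the two hazards are proportional.
ratio_same_shape <- weibull_hazard(t, lambda = 0.02, gamma = 1.5) /
  weibull_hazard(t, lambda = 0.01, gamma = 1.5)

# Different shapes: the hazard ratio changes with t,
# so the hazards are not proportional.
ratio_diff_shape <- weibull_hazard(t, lambda = 0.02, gamma = 1.5) /
  weibull_hazard(t, lambda = 0.01, gamma = 0.8)
```

Under this parameterization, two Weibull hazards are proportional exactly when they share the shape parameter, so a restriction over groups or variables amounts to tying shapes together while letting the scales differ.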
Returns an object of class mws with the following values:
K |
Number of components |
iter |
Number of EM iterations |
method |
Proportionality restrictions used for estimation |
Sdist |
Assumed survival distribution |
likelihood |
Log-likelihood value for each iteration |
pvisit |
Matrix of prior probabilities due to NA structure |
se.pvisit |
Standard errors for priors |
shape |
Matrix with shape parameters |
scale |
Matrix with scale parameters |
group |
Final deterministic cluster assignment |
posteriors |
Final probabilistic cluster assignment |
npar |
Number of estimated parameters |
aic |
Akaike information criterion |
bic |
Bayesian information criterion |
clmean |
Matrix with cluster means |
se.clmean |
Standard errors for cluster means |
clmed |
Matrix with cluster medians |
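The aic and bic values relate to the log-likelihood and npar through the standard definitions of these criteria. The base R sketch below uses those textbook formulas; whether the package uses the number of rows of x as the sample size is an assumption, and the helper `info_criteria` and the numbers passed to it are hypothetical.

```r
# Textbook definitions of the information criteria:
#   AIC = -2 * logLik + 2 * npar
#   BIC = -2 * logLik + log(n) * npar
info_criteria <- function(loglik, npar, n) {
  c(aic = -2 * loglik + 2 * npar,
    bic = -2 * loglik + log(n) * npar)
}

# Hypothetical final log-likelihood, parameter count, and sample size.
ic <- info_criteria(loglik = -1200, npar = 10, n = 300)
ic
```

Since likelihood holds one value per iteration, the value entering such a computation would be the last element, for example `tail(res$likelihood, 1)` for a fitted object `res`. Lower values of either criterion indicate a better trade-off between fit and complexity when comparing fits with different K or method.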
Mair, P., and Hudec, M. (2008). Analysis of dwell times in Web Usage Mining. Proceedings of the 31st Annual Conference of the German Classification Society on Data Analysis, Machine Learning, and Applications.
Collett, D. (2003). Modelling Survival Data in Medical Research. Boca Raton, FL: Chapman & Hall.
Celeux, G., and Govaert, G. (1992). A classification EM algorithm for clustering and two stochastic versions. Computational Statistics and Data Analysis, 14, 315-332.
data(webshop)

## Fit a Weibull mixture model (3 components) with classification EM
## Observations above 600 sec are regarded as censored
res1 <- phmclust(webshop, K = 3, cutpoint = 600)
res1
summary(res1)

## Fit a Rayleigh proportional hazards model (2 components, proportional over groups)
res2 <- phmclust(webshop, K = 2, method = "main.p", Sdist = "rayleigh")
res2
summary(res2)
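To make the effect of cutpoint = 600 concrete, the following base R sketch shows one way to represent right-censoring at a cutpoint. It is an illustration only: the helper `censor_at`, the toy dwell times, and the time/observed representation are assumptions and need not match how phmclust handles censoring internally.

```r
# Hypothetical dwell times in seconds for one page (NA = page not visited).
dwell <- c(12, 45, 610, NA, 300, 950)
cutpoint <- 600

# Right-censor at the cutpoint: times above it are truncated to the cutpoint
# and flagged as not fully observed. NA values stay NA.
censor_at <- function(x, cutpoint) {
  data.frame(time = pmin(x, cutpoint),
             observed = x <= cutpoint)
}

cens <- censor_at(dwell, cutpoint)
cens
```

Here the third and sixth dwell times exceed 600 seconds, so they are recorded as 600 with `observed = FALSE`, while the missing visit remains NA in both columns.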