mhtree {mclust1998}    R Documentation

Classification Tree for Model-Based Gaussian Hierarchical Clustering

Description

Determines a classification tree for agglomerative hierarchical clustering using criteria based on parameterizations of Gaussian mixture models that reflect the underlying geometry of the resulting clusters.

Usage

mhtree(data, modelid, partition, min.clusters = 1, verbose = FALSE, ...)
print.mhtree(x, ...)

Arguments

data matrix of observations.
modelid A character string specifying a parameterization of the MVN covariance matrix defined by the volume, shape and orientation characteristics of the underlying clusters. The allowed values of modelid and their interpretations are as follows: "EI" : uniform spherical; "VI" : spherical; "EEE" : uniform variance; "VVV" : unconstrained variance; "EFV" : fixed (user-supplied) shape, uniform volume; "VFV" : fixed (user-supplied) shape.
partition initial classification of the data. The default puts every observation in a singleton cluster.
min.clusters minimum number of clusters desired. The default is to carry out agglomerative hierarchical clustering until termination, that is, until all observations belong to a single group.
verbose A logical value; if TRUE, the model type is printed.
... Allows users to specify the required shape parameter for the two fixed shape models "EFV" and "VFV", and to change default parameters that are used in the algorithms underlying some models. In the print.mhtree function this argument is used for extra parameters to the print function.
x An mhtree object.
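
As a rough illustration of the volume, shape and orientation parameterization that modelid refers to, the sketch below builds one covariance matrix in base R. It assumes the decomposition described by Banfield and Raftery, Sigma = lambda * D %*% A %*% t(D), with A a diagonal shape matrix. The names lambda, D and Sigma and the value of lambda are illustrative only, and whether mhtree interprets its shape argument in exactly this way is an assumption.

```r
# Sketch only: not mhtree's internal code.
lambda <- 2                        # volume (scale) factor, hypothetical value
shape  <- c(1, 1/2, 1/3, 1/4)      # shape: relative sizes of the eigenvalues

set.seed(1)
# Orientation: an arbitrary orthogonal matrix from a QR decomposition
D <- qr.Q(qr(matrix(rnorm(16), nrow = 4, ncol = 4)))

# Covariance matrix with the chosen volume, shape and orientation
Sigma <- lambda * D %*% diag(shape) %*% t(D)

# The eigenvalues of Sigma recover lambda * shape, whatever D is
eigenvalues <- eigen(Sigma, symmetric = TRUE)$values
print(eigenvalues)
```

Under this reading, the "E" and "V" in the first position say whether lambda is shared across clusters or varies, and a fixed shape means A is supplied by the user rather than estimated.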

Value

An object of class "mhtree", consisting of a classification tree with attributes giving other information about the clustering process.

Notes

Only the six models illustrated in the examples below are supported at present. These correspond to the models discussed in the Banfield and Raftery reference. It may be desirable to transform the data in some way before attempting to partition it into clusters. Different permutations of the data may produce different classifications, because mhtree resolves ties in a way that depends on the order of the observations, and because values of the criterion that are close may change enough to affect the choice of merge pairs at a given stage.
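
The two points about transforming the data and about observation order can be checked with a small experiment. The sketch below uses base R's hclust() with Ward's criterion as a stand-in, not mhtree itself; the helper ward_labels and the choice of three clusters are assumptions made for illustration.

```r
# Stand-in sketch: base R's hclust() with Ward's criterion, NOT mhtree().
x <- scale(as.matrix(iris[, 1:4]))   # centre and scale each variable

# Cluster labels at k groups for the rows of m, using Ward's method
ward_labels <- function(m, k) {
  cutree(hclust(dist(m), method = "ward.D2"), k = k)
}

set.seed(1)                          # make the permutation reproducible
perm <- sample(nrow(x))              # a random reordering of the rows

labels_orig <- ward_labels(x, k = 3)
labels_perm <- ward_labels(x[perm, ], k = 3)

# Put the permuted labels back into the original row order
labels_back <- integer(nrow(x))
labels_back[perm] <- labels_perm

# Cross-tabulate the two classifications
agreement <- table(original = labels_orig, permuted = labels_back)
print(agreement)
```

If every row of the table has a single non-zero cell, the two runs found the same partition up to relabelling; counts spread over several cells indicate that the ordering changed the result.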

References

J. D. Banfield and A. E. Raftery, Model-based Gaussian and non-Gaussian Clustering, Biometrics, 49:803-821 (September 1993).

C. Fraley, Algorithms for Model-based Gaussian Hierarchical Clustering, Technical Report No. 311, Department of Statistics, University of Washington (October 1996), to appear in SIAM Journal on Scientific Computing.

See Also

mhclass, loglik, awe, partuniq

Examples

data(iris)

# Ellipsoidal, equal volume, shape and orientation
mhtree(iris[,1:4], modelid = "EEE")

# Ellipsoidal, equal volume, fixed shape, variable orientation
shape <- c(1,1/2,1/3,1/4)
mhtree(iris[,1:4], modelid = "EFV", shape=shape)

# Spherical, equal volume (Ward's method).
mhtree(iris[,1:4], modelid = "EI")

# Ellipsoidal, variable volume, fixed shape, variable orientation
mhtree(iris[,1:4], modelid = "VFV", shape=shape)

# Spherical, variable volume
mhtree(iris[,1:4], modelid = "VI")

# Unconstrained (default).
mhtree(iris[,1:4], modelid = "VVV")
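
The "EI" example above is labelled as Ward's method. Assuming the usual sum-of-squares form of Ward's criterion, the cost of a single merge can be sketched in base R as follows. This illustrates the criterion only and is not mhtree's implementation; the helper wss and the two groups a and b are hypothetical choices.

```r
# Within-group sum of squares about the group mean
wss <- function(m) sum(sweep(m, 2, colMeans(m))^2)

x <- as.matrix(iris[, 1:4])
a <- x[1:50, , drop = FALSE]      # two illustrative groups (arbitrary choice)
b <- x[51:100, , drop = FALSE]

# Direct computation: increase in total within-group sum of squares on merging
cost_direct <- wss(rbind(a, b)) - wss(a) - wss(b)

# Closed form: n_a * n_b / (n_a + n_b) * squared distance between group means
na <- nrow(a)
nb <- nrow(b)
cost_closed <- na * nb / (na + nb) * sum((colMeans(a) - colMeans(b))^2)

# The two computations agree up to rounding error
print(all.equal(cost_direct, cost_closed))
```

At each stage, an agglomerative procedure of this kind merges the pair of groups with the smallest such cost.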

