HINoV.Mod {clusterSim}                R Documentation

Modification of Carmone, Kara & Maxwell Heuristic Identification of Noisy Variables (HINoV) method

Description

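Modification of the Heuristic Identification of Noisy Variables (HINoV) method of Carmone, Kara & Maxwell (1999). The function returns pairwise corrected Rand (or Rand) indices between the partitions formed by the individual variables, together with their row sums ranked in decreasing order.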

Usage

HINoV.Mod (x, type="metric", s = 2, u, distance=NULL, 
        method = "kmeans", Index ="cRAND")

Arguments

x         data matrix
type      "metric" (default) - all variables are metric (ratio, interval); "nonmetric" - all variables are nonmetric (ordinal, nominal); or, for mixed data (metric and nonmetric), a vector giving "m" (metric) or "n" (nonmetric) for each variable, e.g. type=c("m", "n", "n", "m")
s         for metric data only: 1 - ratio data, 2 - interval or mixed (ratio & interval) data
u         number of clusters (for metric data only)
distance  NULL for the kmeans method (based on the data matrix) and for nonmetric data;
          for ratio data: "d1" - Manhattan, "d2" - Euclidean, "d3" - Chebychev (max), "d4" - squared Euclidean, "d5" - GDM1, "d6" - Canberra, "d7" - Bray-Curtis;
          for interval or mixed (ratio & interval) data: "d1", "d2", "d3", "d4", "d5"
method    NULL for nonmetric data;
          clustering method: "kmeans" (default), "single", "ward", "complete", "average", "mcquitty", "median", "centroid", "pam"
Index     "cRAND" - corrected Rand index (default); "RAND" - Rand index
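
To make the two Index options concrete, the following base-R sketch computes the Rand index (Rand 1971) and the corrected Rand index (Hubert & Arabie 1985) for two partitions of the same objects. The helper name partition_agreement and the toy label vectors are hypothetical and not part of clusterSim; how HINoV.Mod computes the indices internally is not stated here, so treat this as an illustration of the definitions only.

```r
# Hypothetical helper (not part of clusterSim): agreement between two
# partitions of the same objects, computed from their contingency table.
partition_agreement <- function(a, b) {
  tab <- table(a, b)                       # contingency table of the two partitions
  n <- sum(tab)
  pairs <- function(k) k * (k - 1) / 2     # number of object pairs among k objects
  n_pairs <- pairs(n)
  same_both <- sum(pairs(tab))             # pairs placed together in both partitions
  same_a <- sum(pairs(rowSums(tab)))       # pairs placed together in partition a
  same_b <- sum(pairs(colSums(tab)))       # pairs placed together in partition b

  # Rand index: share of object pairs on which the two partitions agree
  rand <- (n_pairs + 2 * same_both - same_a - same_b) / n_pairs

  # Corrected Rand index: agreement adjusted for its expected value under chance
  expected <- same_a * same_b / n_pairs
  crand <- (same_both - expected) / ((same_a + same_b) / 2 - expected)

  c(RAND = rand, cRAND = crand)
}

# Toy partitions of six objects (made-up labels for illustration)
a <- c(1, 1, 1, 2, 2, 2)
b <- c(1, 1, 2, 2, 2, 2)
res <- partition_agreement(a, b)
print(round(res, 3))
```

For these toy labels the Rand index is 10/15 and the corrected Rand index is 1.2/3.7, so the corrected value is noticeably lower once chance agreement is removed. The sketch does not guard against the degenerate case in which the corrected index has a zero denominator (for example, when both partitions put all objects in one cluster).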

Details

See file $R_HOME\library\clusterSim\pdf\HINoVMod_details.pdf for further details

Value

parim     m x m symmetric matrix (m - number of variables) of pairwise corrected Rand (or Rand) indices between the partition formed by the j-th variable and the partition formed by the l-th variable
topri     row sums of parim
stopri    values of topri ranked in decreasing order; as used in the Examples, the first column holds the variable number and the second the topri value
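
The relationship between the three returned components can be sketched in base R as follows. The matrix values are made up for illustration, the zero diagonal is an assumption (this page does not say how the diagonal of parim is filled), and the two-column layout of stopri is inferred from how the Examples index it.

```r
# Hypothetical 4 x 4 matrix of pairwise indices (made-up values);
# the zero diagonal is an assumption, not documented behaviour.
parim <- matrix(c(0.00, 0.62, 0.58, 0.03,
                  0.62, 0.00, 0.71, 0.01,
                  0.58, 0.71, 0.00, 0.10,
                  0.03, 0.01, 0.10, 0.00),
                nrow = 4, byrow = TRUE)

# topri: row sums of parim, one value per variable
topri <- rowSums(parim)

# stopri: variable numbers and topri values, ranked by decreasing topri
stopri <- cbind(variable = seq_along(topri), topri = topri)
stopri <- stopri[order(-topri), ]
print(stopri)
```

In this toy matrix variable 4 agrees very little with the other three, so it ends up in the last row of stopri.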

Author(s)

Marek Walesiak <marek.walesiak@ue.wroc.pl>, Andrzej Dudek <andrzej.dudek@ue.wroc.pl>

Department of Econometrics and Computer Science, University of Economics, Wroclaw, Poland http://keii.ue.wroc.pl/clusterSim

References

Carmone, F.J., Kara, A., Maxwell, S. (1999), HINoV: a new method to improve market segment definition by identifying noisy variables, "Journal of Marketing Research", November, vol. 36, 501-509.

Hubert, L.J., Arabie, P. (1985), Comparing partitions, "Journal of Classification", no. 1, 193-218.

Rand, W.M. (1971), Objective criteria for the evaluation of clustering methods, "Journal of the American Statistical Association", no. 336, 846-850.

Walesiak, M. (2005), Variable selection for cluster analysis - approaches, problems, methods, Plenary Session of the Committee on Statistics and Econometrics of the Polish Academy of Sciences, 15 March, Wroclaw.

See Also

hclust, kmeans, dist, dist.GDM, dist.BC, dist.SM, cluster.Sim

Examples

# for metric data
library(clusterSim)
data(data_ratio)
r1 <- HINoV.Mod(data_ratio, type="metric", s=1, u=4, method="kmeans",
                Index="cRAND")
print(r1$stopri)
plot(r1$stopri[,2],xlab="Variable number", ylab="topri",
xaxt="n", type="b")
axis(1,at=c(1:max(r1$stopri[,1])),labels=r1$stopri[,1])

# for nonmetric data
library(clusterSim)
data(data_nominal)
r2 <- HINoV.Mod(data_nominal, type="nonmetric", Index="cRAND")
print(r2$stopri)
plot(r2$stopri[,2], xlab="Variable number", ylab="topri",
xaxt="n", type="b")
axis(1,at=c(1:max(r2$stopri[,1])),labels=r2$stopri[,1])

# for mixed data
library(clusterSim)
data(data_mixed)
r3 <- HINoV.Mod(data_mixed, type=c("m","n","m","n"), s=2, u=3, distance="d1",
                method="complete", Index="cRAND")
print(r3$stopri)
plot(r3$stopri[,2], xlab="Variable number", ylab="topri",
xaxt="n", type="b")
axis(1,at=c(1:max(r3$stopri[,1])),labels=r3$stopri[,1])


[Package clusterSim version 0.36-4 Index]