lga {lga} | R Documentation |
Linear Grouping Analysis
lga(x, k, biter = NULL, niter = 10, showall = FALSE, scale = TRUE, nnode=NULL, silent=FALSE)
x |
a numeric matrix. |
k |
an integer for the number of clusters. |
biter |
an integer for the number of different starting hyperplanes to try. |
niter |
an integer for the number of iterations to attempt for convergence. |
showall |
logical. If TRUE then display all the outcomes, not just the best one. |
scale |
logical. If TRUE, scale the data before fitting by dividing each column by its standard deviation. |
nnode |
an integer giving the number of CPUs to use for parallel processing. Defaults to NULL, i.e. no parallel processing. |
silent |
logical. If TRUE, produces no text output during processing. |
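The effect of the scale argument can be illustrated in base R. This is only a sketch: scale_by_sd is a hypothetical helper, not a function from the lga package, and it assumes that no centring is applied, since the description above mentions only division by the column standard deviations.

```r
# Hypothetical helper (not part of the lga package): divide each column
# of x by its sample standard deviation, without centring.
scale_by_sd <- function(x) {
  x <- as.matrix(x)
  sds <- apply(x, 2, sd)   # one standard deviation per column
  sweep(x, 2, sds, "/")    # divide column j by sds[j]
}

set.seed(1)
x <- cbind(rnorm(50, sd = 1), rnorm(50, sd = 10))
xs <- scale_by_sd(x)
apply(xs, 2, sd)  # each column now has unit standard deviation
```

Scaling in this way stops a column measured on a much larger scale from dominating the orthogonal distances.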
This function tries to find k clusters using the linear grouping algorithm (LGA) described in Van Aelst et al. (2006). For each attempt, it takes up to niter steps to reach convergence, and it does this from biter different starting hyperplanes. It then selects the clustering with the smallest Residual Orthogonal Sum of Squares (ROSS).
If biter is left as NULL, it is chosen using the equation given in Van Aelst et al. (2006).
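To make the selection criterion concrete, the Residual Orthogonal Sum of Squares of a given clustering can be sketched in base R. The helper ross below is hypothetical and is not the package's internal code. It relies on a standard fact about orthogonal regression: for one cluster, the sum of squared orthogonal distances to the best-fitting hyperplane equals the smallest eigenvalue of the centred scatter matrix.

```r
# Hypothetical helper (not part of the lga package): residual orthogonal
# sum of squares for a given assignment of the rows of x to clusters.
ross <- function(x, cluster) {
  x <- as.matrix(x)
  total <- 0
  for (g in unique(cluster)) {
    xg <- x[cluster == g, , drop = FALSE]
    # Centre the cluster; the best-fitting hyperplane passes through its mean.
    xc <- sweep(xg, 2, colMeans(xg), "-")
    # The smallest eigenvalue of the scatter matrix is the sum of squared
    # orthogonal distances from the points to that hyperplane.
    ev <- eigen(crossprod(xc), symmetric = TRUE, only.values = TRUE)$values
    total <- total + min(ev)
  }
  total
}

# Two groups of points lying exactly on two different lines.
u <- seq(-1, 1, length.out = 20)
x <- rbind(cbind(u, 2 * u + 1), cbind(u, -u + 3))
cl <- rep(1:2, each = 20)

ross(x, cl)           # essentially zero: each group lies on its own line
ross(x, rep(1, 40))   # much larger: a single line cannot fit both groups
```

A multi-start procedure of the kind described above would compare such values across its candidate clusterings and keep the one with the smallest total.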
This function supports parallel computing via the nnode argument, using the package snow. To use parallel computing, one of MPI (e.g. lamboot) or PVM is necessary. For further details, see the documentation for snow.
Associated with the lga function are a print method and a plot method (see the examples). In the plot method, the fitted hyperplanes are also shown as dashed lines. When there are more than 2 dimensions, these lines represent the intersection of the fitted hyperplanes with the plane for each pair of axes.
An object of class "lga" with components:
cluster |
a vector containing the cluster memberships. |
ROSS |
the Residual Orthogonal Sum of Squares for the solution. |
converged |
logical. TRUE if at least one solution has converged. |
biter |
the biter setting used. |
niter |
the niter setting used. |
nconverg |
the number of converged solutions (out of biter starts). |
scaled |
logical. TRUE if the data were scaled. |
k |
the number of clusters to be found. |
x |
the (scaled if selected) dataset. |
Justin Harrington harringt@stat.ubc.ca
Van Aelst, S. and Wang, X. and Zamar, R. and Zhu, R. (2006) ‘Linear Grouping Using Orthogonal Regression’, Computational Statistics & Data Analysis 50, 1287–1312.
## Synthetic Data
## Make a dataset with 2 clusters in 2 dimensions
library(MASS)
set.seed(1234)
X <- rbind(mvrnorm(n=100, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
           mvrnorm(n=100, mu=c(1,1), Sigma=diag(0.1,2)+0.9))
lgaout <- lga(X,2)
plot(lgaout)
print(lgaout)

## nhl94 data set
data(nhl94)
plot(lga(nhl94, k=3, niter=30))

## Allometry data set
data(brain)
plot(lga(log(brain, base=10), k=3))

## Second Allometry data set
data(ob)
plot(lga(log(ob[,2:3]), k=3), pch=as.character(ob[,1]))

## Parallel processing case
## In this example, running using 4 nodes.

## Not run: 
set.seed(1234)
X <- rbind(mvrnorm(n=1e6, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
           mvrnorm(n=1e6, mu=c(1,1), Sigma=diag(0.1,2)+0.9))
abc <- lga(X, k=2, nnode=4)

## End(Not run)