seriate.matrix {seriation}R Documentation

Seriation of rows and columns of a matrix

Description

Tries to find an order for the rows and columns of a matrix by moving large values closer together creating a block structure.

Usage

## S3 method for class 'matrix':
seriate(x, method = NULL, control = NULL, 
    margin = c(1,2), ...)

Arguments

x a matrix.
method a character string with the name of the seriation method (default: "ME\_TSP").
control a list of control options passed on to the seriation algorithm.
margin a vector giving the margins to be seriated. 1 indicates rows, 2 indicates columns, c(1,2) indicates rows and columns.
... further arguments (unused).

Details

Currently the following methods are implemented:

"BEA"
Bond Energy Algorithm (BEA; McCormick 1972).

The algorithm tries to maximize the measure of effectiveness (see criterion) of a non-negative matrix. Due to the definition of this measure, the tasks of ordering rows and columns is separable and can be solved independently.

A row is arbitrarily placed; then rows are positioned one by one. When this is completed, the columns are treated similarly. The overall procedure amounts to two approximate traveling salesperson problems (TSP), one on the rows and one on the columns. The so-called `best insertion strategy' is used: rows (or columns) are inserted into the current permuted list of rows (or columns). Several consecutive runs of the algorithm might improve the energy.

Note that Arabie and Hubert (1990) question its use with non-binary data if the objective is to find a seriation or one-dimensional orderings of rows and columns.

The BEA code used in this package was implemented by Fionn Murtagh.

In control as element "rep" the number of runs can be specified. The results of the best run will be returned.

"BEA_TSP"
Use a TSP to optimize the measure od effectiveness (Lenstra 1974).

Use a TSP solver to optimize ME.

In control as element "method" a TSP solver method can be specified (see package TSP).

"PCA"
Principal component analysis.

Uses the projection of the data on its first principal component to determine the order.

Note that for a distance matrix calculated from x with Euclidean distance, this methods minimizes the least square criterion.

Value

Returns an object of class ser_permutation.

References

P. Arabie and L.J. Hubert (1990): The bond energy algorithm revisited, IEEE Transactions on Systems, Man, and Cybernetics, 20(1), 268–274.

J.K. Lenstra (1974): Clustering a Data Array and the Traveling-Salesman Problem, Operations Research, 22(2) 413–414.

W.T. McCormick, P.J. Schweitzer and T.W. White (1972): Problem decomposition and data reorganization by a clustering technique, Operations Research, 20(5), 993–1009.

See Also

criterion, seriate.

Examples

data("iris")
x <- as.matrix(iris[-5])

## to make the variables comparable, we scale the data
x <- scale(x, center = FALSE)

## try some methods
def.par <- par(no.readonly = TRUE)
layout(matrix(1:4, ncol = 2, byrow = TRUE), respect=TRUE)

pimage(x, main = "original data")
criterion(x)

order <- seriate(x)
pimage(x, order, main = "TSP to optimize ME")
criterion(x, order)

order <- seriate(x, method="PCA")
pimage(x, order, main = "first principal component")
criterion(x, order)

## 2 TSPs
order <- c(
    seriate(dist(x), method = "TSP"),
    seriate(dist(t(x)), method = "TSP")
)
pimage(x, order, main = "2 TSPs")
criterion(x, order)

par(def.par)

[Package seriation version 0.1-1 Index]