eqdist.etest {energy}    R Documentation

Multisample E-statistic (Energy) Test of Equal Distributions

Description

Performs the nonparametric multisample E-statistic (energy) test for equality of multivariate distributions.

Usage

 eqdist.etest(x, sizes, distance = FALSE, 
              incomplete = FALSE, N = 100, R = 999)

Arguments

x           data matrix of pooled sample, or the corresponding distance matrix if distance is TRUE
sizes       vector of sample sizes
distance    logical: if TRUE, the first argument is a distance matrix
incomplete  logical: if TRUE, compute incomplete E-statistics
N           maximum number of observations used from each sample for incomplete statistics
R           number of bootstrap replicates

Details

The k-sample multivariate E-test of equal distributions is performed. The statistic is computed from the original pooled samples, stacked in matrix x where each row is a multivariate observation, or the corresponding distance matrix. The first sizes[1] rows of x are the first sample, the next sizes[2] rows of x are the second sample, etc.
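As a sketch of the layout described above, the following base R code stacks two samples into a pooled matrix and recovers the row range of each sample from sizes. The data and the names s1, s2, starts, and ends are illustrative only and are not part of the package.

```r
## Two hypothetical samples in d = 3 dimensions (illustrative data only)
set.seed(1)
s1 <- matrix(rnorm(10 * 3), nrow = 10)            # first sample, 10 rows
s2 <- matrix(rnorm(15 * 3, mean = 1), nrow = 15)  # second sample, 15 rows

## Pooled sample: rows of s1 first, then rows of s2
x <- rbind(s1, s2)
sizes <- c(nrow(s1), nrow(s2))

## Row ranges implied by sizes: sample i occupies rows starts[i] .. ends[i]
ends <- cumsum(sizes)
starts <- ends - sizes + 1

## Euclidean distance matrix of the pooled rows, in the same row order
d <- as.matrix(dist(x))
```

The pair (x, sizes) built this way is the form of input the function expects; d illustrates the alternative in which distances between the pooled rows are supplied instead of the data.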

The test is implemented by nonparametric bootstrap, an approximate permutation test with R replicates. For large samples it is more efficient if x contains the data matrix rather than the distances. Incomplete statistics are supported for the two-sample test. If incomplete = TRUE, at most N observations from each sample (drawn by sampling without replacement) are used in the calculation of the statistic. If distance = TRUE, complete statistics are always computed.
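The idea of an approximate permutation test, and of limiting each sample to at most N observations, can be sketched in base R as follows. This is not the package's implementation: the helper names estat2, perm_test, and subsample are hypothetical, the p-value formula is one common convention, and the sketch draws the subsample once before testing, which may differ from how the package handles incomplete statistics.

```r
## Two-sample energy statistic from a Euclidean distance matrix d of the
## pooled sample; idx1 and idx2 are the row indices of the two samples.
estat2 <- function(d, idx1, idx2) {
  n1 <- length(idx1); n2 <- length(idx2)
  m12 <- mean(d[idx1, idx2])   # mean between-sample distance
  m11 <- mean(d[idx1, idx1])   # mean within-sample distance, sample 1
  m22 <- mean(d[idx2, idx2])   # mean within-sample distance, sample 2
  n1 * n2 / (n1 + n2) * (2 * m12 - m11 - m22)
}

## Approximate permutation test: reshuffle the pooled rows R times
perm_test <- function(x, sizes, R = 199) {
  d <- as.matrix(dist(x))
  n <- sum(sizes)
  i1 <- seq_len(sizes[1])
  i2 <- sizes[1] + seq_len(sizes[2])
  obs <- estat2(d, i1, i2)
  reps <- replicate(R, {
    p <- sample(n)             # random relabelling of the pooled rows
    estat2(d, p[i1], p[i2])
  })
  list(statistic = obs, p.value = (1 + sum(reps >= obs)) / (R + 1))
}

## Incomplete version: keep at most N rows per sample, without replacement
subsample <- function(x, sizes, N) {
  x <- as.matrix(x)
  ends <- cumsum(sizes); starts <- ends - sizes + 1
  keep <- unlist(lapply(seq_along(sizes), function(i) {
    rows <- starts[i]:ends[i]
    if (sizes[i] > N) sort(sample(rows, N)) else rows
  }))
  list(x = x[keep, , drop = FALSE], sizes = pmin(sizes, N))
}

set.seed(2)
x <- c(rpois(400, 2), rnbinom(600, size = 1, mu = 2))
sub <- subsample(x, c(400, 600), N = 100)
res <- perm_test(sub$x, sub$sizes, R = 199)
```

Computing the distance matrix once and permuting indices avoids recomputing distances in every replicate; the cost is storing all pairwise distances, which is why subsampling helps for large samples.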

The definition of the multisample E-statistic is given in the ksample.e documentation.
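For convenience, the standard form of the statistic is restated here as a sketch; the ksample.e documentation and the references below remain authoritative. The notation (samples A_1, ..., A_k with sizes n_1, ..., n_k, and X_{ip} for the p-th observation of sample i) is introduced for this restatement.

```latex
e(A_i, A_j) = \frac{n_i n_j}{n_i + n_j}
              \left[ 2 M_{ij} - M_{ii} - M_{jj} \right],
\qquad
M_{ij} = \frac{1}{n_i n_j} \sum_{p=1}^{n_i} \sum_{q=1}^{n_j}
         \left\| X_{ip} - X_{jq} \right\|,

\mathcal{E} = \sum_{1 \le i < j \le k} e(A_i, A_j).
```

Here the norm is the Euclidean distance between observations, and the k-sample statistic sums the pairwise e-distances over all k(k-1)/2 pairs of samples.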

Value

A list with class htest containing:

method     description of test
statistic  observed value of the test statistic
p.value    approximate p-value of the test
data.name  description of data
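A minimal sketch of reading these components from the returned object, using the iris data from the Examples section. The code runs only if the energy package is installed, and the 5% threshold is an assumption chosen for illustration.

```r
## Runs only when the energy package is installed
if (requireNamespace("energy", quietly = TRUE)) {
  res <- energy::eqdist.etest(iris[, 1:4], c(50, 50, 50), R = 199)
  print(res)                  # standard print method for htest objects
  stat <- res$statistic       # observed value of the test statistic
  pval <- res$p.value         # approximate p-value from the R replicates
  reject <- pval < 0.05       # illustrative decision at the 5% level
}
```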

Note

The pairwise e-distances between samples can be conveniently computed by the edist function, which returns a dist object. The function ksample.e computes the test statistic without storing the distances, which is more efficient than calling eqdist.etest with R = 0.

Author(s)

Maria L. Rizzo mrizzo @ bgnet.bgsu.edu and Gabor J. Szekely gabors @ bgnet.bgsu.edu

References

Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal Distributions in High Dimension, InterStat, November (5).

Szekely, G. J. (2000) Technical Report 03-05: E-statistics: Energy of Statistical Samples, Department of Mathematics and Statistics, Bowling Green State University.

See Also

ksample.e, edist, energy.hclust

Examples

 data(iris)
 
 ## test if the 3 varieties of iris data (d=4) have equal distributions
 eqdist.etest(iris[,1:4], c(50,50,50), R = 199)

 ## compare incomplete versions of the two-sample test (N = 100 vs. N = 200)
 x <- c(rpois(400, 2), rnbinom(600, size=1, mu=2))
 eqdist.etest(x, c(400, 600), incomplete=TRUE, N=100, R = 199)
 eqdist.etest(x, c(400, 600), incomplete=TRUE, N=200, R = 199)
  


[Package energy version 1.0-6]