dcov.test {energy} | R Documentation |
Distance covariance test of multivariate independence. Distance covariance and distance correlation are multivariate measures of dependence.
dcov.test(x, y, index = 1.0, R = 199)
x | matrix: first sample, observations in rows
y | matrix: second sample, observations in rows
R | number of replicates
index | exponent on Euclidean distance, in (0,2]
dcov.test performs a nonparametric test of multivariate independence. The test decision is obtained via permutation bootstrap, with R replicates.
The sample sizes (number of rows) of the two samples must agree, and samples must not contain missing values. The statistic is nV_n^2 where V_n(x,y) = dcov(x,y), which is based on the interpoint Euclidean distances within each sample, ||x_{i}-x_{j}|| and ||y_{i}-y_{j}||.
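As an illustration of how the statistic is assembled, here is a minimal base-R sketch. It assumes the double-centering formula for V_n^2 from (SRB 2007); the helper names centered_dist and dcov_stat are hypothetical and are not part of the energy package, whose own implementation may differ.

```r
# Pairwise Euclidean distances within one sample, raised to `index`,
# then double-centered (row means and column means removed, grand mean added).
centered_dist <- function(z, index = 1) {
  d <- as.matrix(dist(z))^index
  m <- rowMeans(d)                 # d is symmetric, so row means = column means
  d - outer(m, m, "+") + mean(d)
}

# Test statistic n * V_n^2, where V_n^2 is the mean of the elementwise
# product of the two centered distance matrices.
dcov_stat <- function(x, y, index = 1) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(nrow(x) == nrow(y), !anyNA(x), !anyNA(y))
  n <- nrow(x)
  A <- centered_dist(x, index)
  B <- centered_dist(y, index)
  n * mean(A * B)
}

x <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- matrix(rnorm(40), nrow = 20, ncol = 2)
dcov_stat(x, y)
```

Note that x and y may have different numbers of columns: distances are only ever taken between rows of the same sample.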
Distance correlation is a new measure of dependence between random vectors introduced by Szekely, Rizzo, and Bakirov (2007). For all distributions with finite first moments, distance correlation R generalizes the idea of correlation in two fundamental ways: (1) R(X,Y) is defined for X and Y in arbitrary dimension. (2) R(X,Y)=0 characterizes independence of X and Y.
Distance correlation satisfies 0 <= R <= 1, and R = 0 only if X and Y are independent. Distance covariance V provides a new approach to the problem of testing the joint independence of random vectors. The formal definitions of the population coefficients V and R are given in (SRB 2007). The definitions of the empirical coefficients are given in the energy dcov topic.
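For orientation, the empirical coefficients can be sketched in base R as follows. This assumes the sample definitions in (SRB 2007): V_n^2(x,y) is the mean of the elementwise product of the double-centered distance matrices, and R_n^2 = V_n^2(x,y) / sqrt(V_n^2(x) V_n^2(y)). The function name dcov_estimates is hypothetical; see the energy dcov topic for the package's own definitions.

```r
# Hypothetical helper: returns the four sample coefficients
# (dCov, dCor, dVar of x, dVar of y) for two samples with equal row counts.
dcov_estimates <- function(x, y, index = 1) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(nrow(x) == nrow(y), !anyNA(x), !anyNA(y))
  center <- function(z) {
    d <- as.matrix(dist(z))^index
    m <- rowMeans(d)
    d - outer(m, m, "+") + mean(d)
  }
  A <- center(x)
  B <- center(y)
  V2xy <- max(mean(A * B), 0)      # guard against tiny negative rounding error
  V2x  <- mean(A * A)
  V2y  <- mean(B * B)
  denom <- sqrt(V2x * V2y)
  dCor <- if (denom > 0) sqrt(V2xy / denom) else 0
  c(dCov = sqrt(V2xy), dCor = dCor, dVar.x = sqrt(V2x), dVar.y = sqrt(V2y))
}

x <- matrix(rnorm(60), nrow = 20, ncol = 3)
dcov_estimates(x, x^2)
```

The dCor component should always fall in [0, 1], matching the bound stated above.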
For all values of the index in (0,2), under independence the asymptotic distribution of n V_n^2 is a quadratic form of centered Gaussian random variables, with coefficients that depend on the distributions of X and Y. For the general problem of testing independence when the distributions of X and Y are unknown, the test based on n V_n^2 can be implemented as a permutation test. See (SRB 2007) for theoretical properties of the test, including statistical consistency.
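The permutation approach can be sketched in base R as below. This is an illustration under stated assumptions, not the package's implementation: the function name perm_dcov_test is hypothetical, and the p-value formula (1 + #{replicates >= observed}) / (1 + R) is one common convention for permutation tests.

```r
perm_dcov_test <- function(x, y, index = 1, R = 199) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  stopifnot(n == nrow(y), !anyNA(x), !anyNA(y))
  center <- function(z) {
    d <- as.matrix(dist(z))^index
    m <- rowMeans(d)
    d - outer(m, m, "+") + mean(d)
  }
  A <- center(x)
  B <- center(y)
  stat <- n * mean(A * B)          # observed n * V_n^2
  # Permuting the rows of y is equivalent to permuting the rows and
  # columns of its centered distance matrix with the same permutation.
  reps <- replicate(R, {
    p <- sample.int(n)
    n * mean(A * B[p, p])
  })
  list(statistic = stat,
       replicates = reps,
       p.value = (1 + sum(reps >= stat)) / (1 + R))
}

x <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- matrix(rnorm(40), nrow = 20, ncol = 2)
perm_dcov_test(x, y, R = 99)$p.value
```

Because the centered matrices are computed once and only re-indexed per replicate, each replicate costs a single elementwise product rather than a fresh distance computation.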
dcov.test returns a list with class htest containing
method | description of test
statistic | observed value of the test statistic
estimate | dCov(x,y)
estimates | a vector: [dCov(x,y), dCor(x,y), dVar(x), dVar(y)]
replicates | replicates of the test statistic
p.value | approximate p-value of the test
data.name | description of data
For the test of independence, the distance covariance test statistic is the V-statistic n V_n^2 (not dCov).
Maria L. Rizzo mrizzo@bgnet.bgsu.edu and Gabor J. Szekely gabors@bgnet.bgsu.edu
Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007),
Measuring and Testing Dependence by Correlation of Distances,
Annals of Statistics, Vol. 35 No. 6, pp. 2769-2794.
http://dx.doi.org/10.1214/009053607000000505
## independent multivariate data
x <- matrix(rnorm(60), nrow=20, ncol=3)
y <- matrix(rnorm(40), nrow=20, ncol=2)
dcov.test(x, y, R = 99)

## Not run:
## dependent multivariate data
library(MASS)
Sigma <- matrix(c(1, .1, 0, 0 , 1, 0, 0 ,.1, 1), 3, 3)
x <- mvrnorm(30, c(0, 0, 0), .1 * diag(3))
y <- mvrnorm(30, c(0, 0, 0), Sigma) * x
set.seed(123); dcov.test(x, y, index = 1.5)
set.seed(123); dcov.test(x, y)
detach("package:MASS")
## End(Not run)