mtsknn {MTSKNN} | R Documentation |
Tests whether two samples share the same underlying distribution, using a k-nearest-neighbors approach.
mtsknn(x,y,k)
x | A matrix or data frame. |
y | A matrix or data frame. |
k | An integer. |
The matrices or data frames x and y are the two samples to be tested. Each row holds the coordinates of one data point, so x and y must have the same number of columns. The integer k is the number of nearest neighbors used in the testing procedure.
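To make the role of k concrete, here is a minimal base-R sketch of the kind of statistic described by Schilling (1986): pool the two samples, find each point's k nearest neighbors, and take the proportion of neighbors that come from the same sample as the point itself. The helper name knn_same_sample_prop is hypothetical and is not part of MTSKNN. The sketch assumes Euclidean distance and does not reproduce the package's normalization to a Z score or its P value.

```r
# Hypothetical helper, NOT part of MTSKNN: proportion of k-nearest-neighbor
# pairs in the pooled sample whose two points belong to the same sample.
knn_same_sample_prop <- function(x, y, k) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(ncol(x) == ncol(y), k >= 1, k < nrow(x) + nrow(y))

  pooled <- rbind(x, y)
  labels <- c(rep(1L, nrow(x)), rep(2L, nrow(y)))

  d <- as.matrix(dist(pooled))  # pairwise Euclidean distances (assumption)
  diag(d) <- Inf                # a point is never its own neighbor

  # For each point, count how many of its k nearest neighbors share its label.
  same <- vapply(seq_len(nrow(pooled)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    sum(labels[nn] == labels[i])
  }, integer(1))

  sum(same) / (nrow(pooled) * k)
}
```

With two equally sized samples from the same distribution, this proportion should sit near 1/2. Values well above that suggest that points cluster with their own sample, which is the evidence against equality that the test looks for.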
A list consisting of the test statistic, the normalized Z score, and the corresponding P value.
This test is appropriate for the balanced case, where the two sample sizes are of similar magnitude. For the unbalanced case, where the sample sizes differ substantially, the more robust tests mtsknn.eq and mtsknn.neq are recommended.
Lisha Chen lisha.chen@yale.edu, Peng Dai peng.dai@stonybrook.edu and Wei Dou wei.dou@yale.edu
Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc., 81, 799-806.
Henze, N. (1988). A multivariate two-sample test based on the number of nearest neighbor type coincidences. Ann. Statist., 16, 772-783.
Chen, L. and Dou, W. (2009). Robust multivariate two-sample tests based on k nearest neighbors for unbalanced designs. Manuscript.
mtsknn.eq and mtsknn.neq
## Example of two samples from the same multivariate t distribution:
n <- 100
x <- matrix(rt(2*n, df=5), n, 2)
y <- matrix(rt(2*n, df=5), n, 2)
mtsknn(x, y, 3)

## Example of two samples from different distributions:
n <- 100
x <- matrix(rt(2*n, df=10), n, 2)
y <- matrix(rnorm(2*n), n, 2)
mtsknn(x, y, 3)