mtsknn.discard {MTSKNN} | R Documentation |
The function tests whether two samples share the same underlying distribution, based on a k-nearest-neighbors approach. The approach is robust in the unbalanced case because it discards the extra data points in the larger sample.
mtsknn.discard(x,y,k)
x | A matrix or data frame. |
y | A matrix or data frame. |
k | An integer. |
The matrices or data frames x and y are the two samples to be tested. Each row holds the coordinates of one data point. The integer k is the number of nearest neighbors used in the testing procedure.
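To make the role of the rows and of k concrete, the following base R sketch computes a nearest-neighbor coincidence proportion in the spirit of Schilling (1986): the share of k-nearest-neighbor pairs in the pooled data whose two points come from the same sample. The helper name knn_same_sample_prop is hypothetical and is not part of MTSKNN, and this is only an illustration under those assumptions, not the package's implementation.

```r
# Hypothetical helper (not part of MTSKNN): proportion of k-nearest-neighbor
# pairs in the pooled data whose two points belong to the same sample.
knn_same_sample_prop <- function(x, y, k) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  pooled <- rbind(x, y)                          # one data point per row
  label <- c(rep(1L, nrow(x)), rep(2L, nrow(y))) # sample membership of each row
  d <- as.matrix(dist(pooled))                   # pairwise Euclidean distances
  diag(d) <- Inf                                 # a point is not its own neighbor
  same <- vapply(seq_len(nrow(pooled)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]              # indices of the k nearest neighbors
    sum(label[nn] == label[i])                   # neighbors from the same sample
  }, integer(1))
  sum(same) / (nrow(pooled) * k)
}

set.seed(1)
x <- matrix(rnorm(2 * 50), 50, 2)
y <- matrix(rnorm(2 * 50, mean = 3), 50, 2)      # clearly shifted sample
prop_diff <- knn_same_sample_prop(x, y, 3)
prop_diff
```

A proportion near 1 means that neighbors mostly come from the same sample, which points toward different distributions. For two equally sized samples from the same distribution, the proportion should be close to one half.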
The test result contains the P value, the Z score and the test statistics.
This test is appropriate for the unbalanced case where the two sample sizes are at about the same level. Another robust test is mtsknn.neq.
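The discarding idea can be sketched in base R as follows. This documentation does not state how the points to discard are chosen, so the random subsampling used here is an assumption, and the helper name balance_by_discarding is hypothetical and not part of MTSKNN.

```r
# Hypothetical helper (not part of MTSKNN): randomly drop rows from the larger
# sample so that both samples end up with the same number of rows.
# Assumption: rows are discarded uniformly at random.
balance_by_discarding <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  m <- min(nrow(x), nrow(y))                     # size of the smaller sample
  if (nrow(x) > m) x <- x[sample.int(nrow(x), m), , drop = FALSE]
  if (nrow(y) > m) y <- y[sample.int(nrow(y), m), , drop = FALSE]
  list(x = x, y = y)
}

set.seed(42)
x <- matrix(rt(2 * 100, df = 5), 100, 2)
y <- matrix(rt(2 * 150, df = 5), 150, 2)         # larger sample
balanced <- balance_by_discarding(x, y)
dim(balanced$x)
dim(balanced$y)
```

In this sketch the smaller sample is kept unchanged, and a balanced two-sample procedure could then be applied to the two equally sized matrices.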
Lisha Chen lisha.chen@yale.edu, Peng Dai peng.dai@yale.edu and Wei Dou wei.dou@yale.edu
Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc., 81 799-806.
Henze, N. (1988). A multivariate two-sample test based on the number of nearest neighbor type coincidences. Ann. Statist., 16 772-783.
Chen, L. and Dou, W. (2009). Robust multivariate two-sample tests based on k nearest neighbors for unbalanced designs. Manuscript.
mtsknn, mtsknn.neq and mtsknn.eq
## Example of two samples from the same multivariate t distribution:
n <- 100
x <- matrix(rt(2*n, df=5), n, 2)
y <- matrix(rt(2*10*n, df=5), (10*n), 2)
mtsknn.discard(x, y, 3)

## Example of two samples from different distributions:
n <- 100
x <- matrix(rt(2*n, df=10), n, 2)
y <- matrix(rnorm(2*10*n), (10*n), 2)
mtsknn.discard(x, y, 3)