mtsknn.eq {MTSKNN} | R Documentation |
The function tests whether two samples share the same underlying distribution, using a k-nearest-neighbors approach. The approach is robust in the unbalanced case.
mtsknn.eq(x, y, k, clevel = 0.05, getpval = TRUE, print = TRUE)
x |
A matrix or data frame. |
y |
A matrix or data frame. |
k |
An integer. |
clevel |
The confidence level. Default value is 0.05. |
getpval |
Logical value. If TRUE, the p value of the test is calculated and reported; if FALSE, the p value is not calculated. |
print |
Logical value. If TRUE, the test result is reported; if FALSE, the test result is not reported. |
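To illustrate how these arguments fit together, the following sketch passes two data frames and sets the optional arguments explicitly. It assumes the MTSKNN package is installed (the call is skipped otherwise); the simulated data and the names x_df and y_df are made up for illustration.

```r
set.seed(42)
x_df <- data.frame(a = rnorm(100), b = rnorm(100))  # 100 points in 2 dimensions
y_df <- data.frame(a = rnorm(120), b = rnorm(120))  # 120 points in 2 dimensions

# Run the test only if the package is available on this machine.
if (requireNamespace("MTSKNN", quietly = TRUE)) {
  # k = 3 nearest neighbors, level 0.05, compute the p value, print the result
  MTSKNN::mtsknn.eq(x_df, y_df, k = 3, clevel = 0.05,
                    getpval = TRUE, print = TRUE)
}
```

Both inputs have the same number of columns, since each row is a point in the same space.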
The matrices or data frames x and y are the two samples to be tested. Each row holds the coordinates of one data point. The integer k is the number of nearest neighbors used in the testing procedure.
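The role of the rows and of k can be made concrete with a small base-R sketch of the basic nearest-neighbor coincidence statistic described by Schilling (1986): pool the two samples, find the k nearest neighbors of every point, and record the proportion of neighbors that come from the same sample as the point itself. This is an illustration only, not the implementation used by mtsknn.eq; the function name knn_coincidence is hypothetical, and Euclidean distance is an assumption.

```r
# Illustrative sketch only (hypothetical helper, not part of MTSKNN).
knn_coincidence <- function(x, y, k) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  pooled <- rbind(x, y)                         # each row is one data point
  label  <- c(rep(1L, nrow(x)), rep(2L, nrow(y)))
  d <- as.matrix(dist(pooled))                  # pairwise Euclidean distances
  diag(d) <- Inf                                # a point is not its own neighbor
  same <- vapply(seq_len(nrow(pooled)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]             # indices of the k nearest neighbors
    sum(label[nn] == label[i])                  # neighbors from the same sample
  }, integer(1))
  sum(same) / (nrow(pooled) * k)                # proportion of same-sample neighbors
}

set.seed(1)
x <- matrix(rnorm(2 * 50), 50, 2)
y <- matrix(rnorm(2 * 50, mean = 3), 50, 2)     # shifted sample
knn_coincidence(x, y, k = 3)                    # close to 1 for well-separated samples
```

A large proportion suggests that the samples occupy different regions, which is evidence against a common distribution. With balanced samples from the same distribution the proportion tends to be near one half.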
The test result at the given confidence level: whether the null hypothesis is rejected or accepted. The p value is also calculated and reported when getpval is TRUE.
This is appropriate for the unbalanced case where the two sample sizes are about the same level. Another robust test is mtsknn.neq.
Lisha Chen (lisha.chen@yale.edu), Peng Dai (peng.dai@yale.edu) and Wei Dou (wei.dou@yale.edu)
Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc., 81, 799-806.
Henze, N. (1988). A multivariate two-sample test based on the number of nearest neighbor type coincidences. Ann. Statist., 16, 772-783.
Chen, L. and Dou, W. (2009). Robust multivariate two-sample tests based on k nearest neighbors for unbalanced designs. Manuscript.
mtsknn, mtsknn.neq and mtsknn.discard
## Example of two samples from the same multivariate t distribution:
n <- 100
x <- matrix(rt(2*n, df=5), n, 2)
y <- matrix(rt(2*10*n, df=5), (10*n), 2)
mtsknn.eq(x, y, 3)

## Example of two samples from different distributions:
n <- 100
x <- matrix(rt(2*n, df=10), n, 2)
y <- matrix(rnorm(2*10*n), (10*n), 2)
mtsknn.eq(x, y, 3)