mtsknn.neq {MTSKNN}    R Documentation

A robust multivariate two-sample test based on k-nearest neighbors for unbalanced samples

Description

The function tests whether two samples share the same underlying distribution, using a k-nearest-neighbors approach. This approach is robust when the two sample sizes are unbalanced.

Usage

mtsknn.neq(x,y,k, delta=1.05, clevel=0.05, seed=12345, getpval=TRUE, print=TRUE, max.loop=20, level.seq="decrease")

Arguments

x A matrix or data frame.
y A matrix or data frame.
k An integer.
delta The parameter determining the size of each subsample.
clevel The significance level of the test (referred to as the confidence level). Default value is 0.05.
seed The seed set for random permutation in the test procedure.
getpval Logical value. If TRUE, the p value of the test is calculated and reported; if FALSE, the p value is not calculated.
print Logical value. If TRUE, the test result is reported; if FALSE, the test result is not reported.
max.loop After this specified number of loops, the test statistics are generated from a standard normal distribution instead of being computed from the sample.
level.seq Set to "decrease" by default, which means that the critical values for the sequential sub-tests decrease according to a predetermined rule.

Details

The matrices or data frames x and y are the two samples to be tested. Each row contains the coordinates of one data point. The integer k is the number of nearest neighbors used in the testing procedure.
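
The role of k and of the row-wise data layout can be illustrated with a minimal base-R sketch of the classical nearest-neighbor statistic of Schilling (1986): the proportion of k-nearest-neighbor pairs in the pooled sample whose two points come from the same sample. The helper names `knn_same_label_prop` and `null_mean` are hypothetical and are not part of MTSKNN. The sketch does not reproduce the subsampling controlled by delta or the sequential sub-tests, so it should not be read as the package's actual implementation.

```r
## Hypothetical helper (not part of MTSKNN): proportion of k nearest
## neighbors, in the pooled sample, that belong to the same sample.
knn_same_label_prop <- function(x, y, k) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  z <- rbind(x, y)                          # pooled sample, one point per row
  label <- c(rep(1L, nrow(x)), rep(2L, nrow(y)))
  d <- as.matrix(dist(z))                   # pairwise Euclidean distances
  diag(d) <- Inf                            # a point is not its own neighbor
  same <- vapply(seq_len(nrow(z)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]         # indices of the k nearest neighbors
    sum(label[nn] == label[i])              # neighbors from the same sample
  }, integer(1))
  sum(same) / (nrow(z) * k)
}

## Hypothetical helper: mean of the proportion when both samples share one
## distribution. A neighbor of a point from sample i then falls in sample i
## with probability (n_i - 1) / (n - 1).
null_mean <- function(n1, n2) {
  n <- n1 + n2
  (n1 * (n1 - 1) + n2 * (n2 - 1)) / (n * (n - 1))
}

set.seed(1)
x       <- matrix(rnorm(2 * 50), 50, 2)
y_same  <- matrix(rnorm(2 * 200), 200, 2)             # same distribution as x
y_shift <- matrix(rnorm(2 * 200, mean = 3), 200, 2)   # shifted distribution

t_same  <- knn_same_label_prop(x, y_same, k = 3)
t_shift <- knn_same_label_prop(x, y_shift, k = 3)
```

In this sketch a large proportion, relative to `null_mean(n1, n2)`, is evidence against the null hypothesis. For sample sizes 100 and 1500, as in the Examples, `null_mean(100, 1500)` is about 0.88, compared with about 0.50 for two samples of 800 each. The statistic therefore has little room to grow above its null mean when the samples are very unbalanced, which is one plausible reading of why a dedicated unbalanced procedure is useful.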

Value

The test result at the level given by clevel: whether or not the null hypothesis is rejected. The p value is also calculated and reported when getpval is TRUE.

Note

This is appropriate for the unbalanced case where the two sample sizes are about the same level. Another robust test is mtsknn.neq.

Author(s)

Lisha Chen (lisha.chen@yale.edu), Peng Dai (peng.dai@yale.edu) and Wei Dou (wei.dou@yale.edu)

References

Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc., 81, 799-806.

Henze, N. (1988). A multivariate two-sample test based on the number of nearest neighbor type coincidences. Ann. Statist., 16, 772-783.

Chen, L. and Dou, W. (2009). Robust multivariate two-sample tests based on k nearest neighbors for unbalanced designs. Manuscript.

See Also

mtsknn, mtsknn.eq and mtsknn.discard

Examples


library(MTSKNN)

## Example of two samples from the same multivariate t distribution
## (x has n rows, y has 15*n rows, so the design is unbalanced):
n <- 100
x <- matrix(rt(2*n, df=5), n, 2)
y <- matrix(rt(2*15*n, df=5), 15*n, 2)
mtsknn.neq(x, y, 3)

## Example of two samples from different distributions
## (t with 10 degrees of freedom versus standard normal):
n <- 100
x <- matrix(rt(2*n, df=10), n, 2)
y <- matrix(rnorm(2*15*n), 15*n, 2)
mtsknn.neq(x, y, 3)


[Package MTSKNN version 0.0-5 Index]