cvmts.pval {CvM2SL2Test} | R Documentation |
The Cramer-von Mises two sample test is to test whether two independent samples were drawn from the same population. The function cvmts.pval() evaluates the p-value for given test statistic.
cvmts.pval(cvmstats, m, n)
cvmstats |
an R object holding a list of computed Cramer-von Mises test scores. |
m |
sample size of the first sample. |
n |
sample size of the second sample. |
The returned value the p-value(s)
The function cvmts.pval() first construct the distribution function, which is represented as seqeunce of tables in my implementation. This step may be slow, depending the sample sizes n and m as well as the capacity of the computer. Once the distribution function is established, the function us convolution operation to compute the p-value(s) for given Cramer-von Mises test statistics. So, if you have a sequence of pairs of samples of the same sample sizes n and m, it is best to compute the test scores for all pairs of samples, and then call this function. If you compute the p-values seperately, the process may be slow.
Yuanhui Xiao and Ye Cui, Department of Mathematics and Statistics, Georgia State University, Atlanta, Georgia, 30302 matyxx@langate.gsu.edu
Yaunhui Xiao, Alexander Gordon and Andrei Yakovlev (2007), A C++ program for the Cramer-von Mises two sample test, Journal of Statistical Software, 17(8), http://www.jstatsoft.org
cvmts.pval
## sample size of the first sample n <- 10 ## create a sample x of size n from the normal distribution with mean 0 and ## standard deviation 1 x <- rnorm(n, 0, 1) ## sample size of the second sample m <- 10 ## create a sample y of size m from the normal distribution with mean 1 and ## standard deviation 1 y <- rnorm(m, 1, 1) ## compute the Cramer-von Mises test statistic cvm <- cvmts.test(x, y) ## compute the p-value for the test. pval <- cvmts.pval(cvm, n, m) ## Now suppose x is a list of samples of the same size (n), and y is a list ## of samples of the same size (m), and you want to test whether each pair ## (x[i], y[i]) were drawn from the same population, for i = 1, 2, ... ## a bad way to use cvmts.pval() ## for(i <- 1; i<=length(x); i++){ ## cvm <- cvmts.test(x[i], y[i]) ## pval <- cvmts.pval(cvm, n, m) ## } ## Why? In each call to the function cvmts.pval(), in the first step established is ## the distribution of the Cramer-von Mises test statistics under the ## assumption that the two sample were drawn from the same population. Then ## the distribution is used to calculate the p-value. The first step may ## be expensive if the sample sizes n and m are large. ## I prefer the following way ## initialize # cvms <- seq(1, length(x)) ## compute test scores # for(i <- 1; i<=length(x); i++){ # cvms[i] <- cvmts.test(x[i], y[i]) # } # compute p-values # pvals <- cvmts.pval(cvms, n, m)