permsep {permax}R Documentation

Permutation analysis for complete separation

Description

Given two groups of samples of high dimensional attribute vectors (eg DNA array expression levels), determines the number of attributes which completely separate the two groups (all values in one group strictly larger than in the other), and a permutation p-value for this quantity.

Usage

permsep(data, ig1, nperm=0, ig2, WHseed=NULL)

Arguments

data Data matrix or data frame. Each case is a column, and each row is an attribute (the opposite of the standard configuration).
ig1 The columns of data corresponding to group 1
nperm The number of random permutations to use in computing the p-values. The default is to use the entire permutation distribution, which is only feasible if the sample sizes are fairly small
ig2 The columns of data corresponding to group 2. The default is to include all columns not in ig1 in group 2. When both ig1 and ig2 are given, columns not in either are excluded.
WHseed Initial random number seed (a vector of 3 integers). If missing, an initial seed is generated from the runif() function. Not needed if all permutations are calculated.
Prints a vector giving the # genes with complete separation (all in one group larger than all in the other, the proportion of permutations with this many or more genes with complete separation (p-value) ('permutation' actually means a distinct rearrangement of columns into 2 groups), the average number of genes per permutation with complete separation, and the proportion of permutations with any genes with complete separation.
The value returned is a list with components ics = a vector indicating (with 1) which rows of data have complete separation, and dtcs = a vector containing the printed output.
Also, if nperm>0, then the output includes attributes seed.start giving the initial random number seed, and seed.end giving the value of the seed at the end. These can be accessed with the attributes and attr functions.

Details

For each gene there will be 0, 1 or 2 rearrangements with complete separation, depending on the number of unique values and the sizes of the two groups. Adding these numbers over genes and dividing by the number of rearrangements gives the average number per permutation. The value returned averages only over the rearrangements actually used, though.

It is strongly recommended that different seeds be used for different runs, and ideally the final seed from one run, attr(output,'seed.end'), would be used as the initial seed in the next run.

See Also

permax

Examples

   ngenes <- 1000
   m1 <- rnorm(ngenes,4,1)
   m2 <- rnorm(ngenes,4,1)
    exp1 <- cbind(matrix(exp(rnorm(ngenes*5,m1,1)),nrow=ngenes),
               matrix(exp(rnorm(ngenes*10,m2,1)),nrow=ngenes))
   exp1[exp1<20] <- 20
   sub <- exp1>20 & exp1<150
   exp1[sub] <- ifelse(runif(length(sub[sub]))<.5,20,exp1[sub])
   dimnames(exp1) <- list(paste('x',format(1:ngenes,justify='l'),sep=''),
                     paste('sample',format(1:ncol(exp1),justify='l'),sep=''))
   dimnames(exp1) <- list(paste('x',1:ngenes,sep=''),
                     paste('sample',1:ncol(exp1),sep=''))
   exp1 <- round(exp1)

 uuu <- permsep(exp1,1:5)


[Package permax version 1.2.1 Index]