recodeSNPs {scrime} | R Documentation |
Recodes the values used to specify the genotypes of the SNPs to other values. Such a recoding might be required to use other functions contained in this package.
recodeSNPs(mat, first.ref = FALSE, geno = 1:3, snp.in.col = FALSE)
mat |
a matrix or data frame consisting of character strings of length 2 that
specify the genotypes of the SNPs. Each of these character strings
must be a combination of the letters A, T, C, and G. Missing values can
be specified by "NN" or NA . Depending on
snp.in.col it is assumed that each row of mat represents
a SNP and each column a variable (snp.in.col = FALSE ), or vice versa. |
first.ref |
does the first letter in the string coding the heterozygous
genotype always stands for the more frequent allele? E.g., codes "CC"
for the homozygous reference genotype if the genotypes
of a SNP are coded by "CC" , "CG" and "GG" ? If TRUE ,
the value made up only of this first letter is set to geno[1] , and the
value made up only of the second letter is set to geno[3] . If FALSE ,
it is evaluated rowwise which of the homozygous genotypes has the higher frequency
and the more often occuring value is set to geno[1] , and the other to geno[3] . |
geno |
a numeric or character vector of length 3 giving the three values that
should be used to recode the genotypes. By default, geno = 1:3 which is the
coding, e.g., required by rowChisqStats or pamCat . |
snp.in.col |
does each column of mat correspond to a SNP (and each row to an array)?
If FALSE , it is assumed that each row represents a SNP, and each column an array. |
A matrix of the same size as mat
containing the recoded genotypes. (Missing values are
coded by NA
).
## Not run: # Generate an example data set consisting of 5 rows and 12 columns, # where it is assumed that each row corresponds to a SNP. mat <- matrix("", 10, 12) mat[c(1, 4, 6),] <- sample(c("AA", "AT", "TT"), 18, TRUE) mat[c(2, 3, 10),] <- sample(c("CC", "CG", "GG"), 18, TRUE) mat[c(5, 8),] <- sample(c("GG", "GT", "TT"), 12, TRUE) mat[c(7, 9),] <- sample(c("AA", "AC", "CC"), 12, TRUE) mat # Recode the SNPs recodeSNPs(mat) # Recode the SNPs by assuming that the first letter in # the heterogyzous genotype refers to the major allele. recodeSNPs(mat, first.ref = TRUE) ## End(Not run)