remove.short {tileHMM} | R Documentation |
Remove short regions that are likely to be spurious.
remove.short(regions, post, probe.pos, min.length = 1000, min.score = 0.8, summary.fun = mean)
regions | A matrix with information about the location of enriched regions. |
post | A numeric vector with the posterior probability of ChIP enrichment for each probe. |
probe.pos | A data frame with columns ‘chromosome’ and ‘position’ providing genomic coordinates for each probe. |
min.length | Minimum length of enriched regions (see Details). |
min.score | Minimum score for enriched regions (see Details). |
summary.fun | Function used to summarise posterior probe probabilities into region scores. |
All regions that are shorter than min.length and have a score of less than min.score will be removed. To filter regions based on only one of these values, set the other one to 0.
Region scores are calculated based on posterior probe probabilities. The summary function used should accept a single numeric argument and return a numeric vector of length 1. If the probabilities in post are log transformed they will be transformed back to linear space before they are summarised for each region.
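The scoring step described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper region_scores and its is.log argument are hypothetical, and the sketch assumes that each column of regions holds the index of the first and last probe of a region.

```r
# Hypothetical helper, not part of tileHMM: summarise posterior
# probabilities into one score per region.
# Assumption: 'regions' has two rows giving the first and last
# probe index of each region (one column per region).
region_scores <- function(regions, post, summary.fun = mean, is.log = FALSE) {
  # back-transform log probabilities to linear space before summarising
  if (is.log) post <- exp(post)
  vapply(seq_len(ncol(regions)), function(i) {
    idx <- regions[1, i]:regions[2, i]
    # summary.fun takes one numeric vector and returns a single number
    summary.fun(post[idx])
  }, numeric(1))
}

## made-up posterior probabilities for seven probes
post <- c(0.9, 0.95, 0.2, 0.1, 0.85, 0.6, 0.7)
## two regions: probes 1 to 2 and probes 5 to 7
regions <- matrix(c(1, 2, 5, 7), nrow = 2)

scores <- region_scores(regions, post)                         # mean per region
scores.min <- region_scores(regions, post, summary.fun = min)  # stricter summary
scores.log <- region_scores(regions, log(post), is.log = TRUE) # log-scale input
```

Any function that maps a numeric vector to a single number, such as median or min, could be passed as summary.fun in the same way; min gives a more conservative score than mean because a single low-probability probe lowers the score of the whole region.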
A matrix with two rows and one column for each remaining region.
Peter Humburg
## create two state HMM with t distributions
state.names <- c("one","two")
transition <- c(0.1, 0.1)
location <- c(1, 2)
scale <- c(1, 1)
df <- c(4, 6)
model <- getHMM(list(a=transition, mu=location, sigma=scale, nu=df),
    state.names)

## obtain observation sequence from model
obs <- sampleSeq(model, 500)

## make up some genomic probe coordinates
pos <- data.frame(chromosome = rep("chr1", times = 500),
    position = seq(1, 18000, length = 500))

## calculate posterior probability for state "one"
post <- posterior(obs, model, log=FALSE)

## get sequence of individually most likely states
state.seq <- apply(post, 2, which.max)
state.seq <- states(model)[state.seq]

## find regions attributed to state "one"
reg.pos <- region.position(state.seq, region="one")

## remove short and unlikely regions
reg.pos2 <- remove.short(reg.pos, post, pos, min.length = 200,
    min.score = 0.8)