RJaCGH {RJaCGH} | R Documentation |
This function fits a non-homogeneous hidden Markov model to CGH data through bayesian methods and Reversible Jump Markov chain Montecarlo.
RJaCGH(y, Chrom = NULL, Pos = NULL, model = "genome", burnin = 10000, TOT = 10000, k.max = 6, stat = NULL, mu.alfa = NULL, mu.beta = NULL, ka = NULL, g = NULL, prob.k = NULL, jump.parameters=list(), start.k = NULL, RJ=TRUE)
y |
Vector with Log Ratio observations. |
Chrom |
Vector with Chromosome indicator. Must be numeric (1 to 23) |
Pos |
Vector with Positions of every gene. They can be absolute to the genome or relative to the chromosome. |
model |
if model ="genome", the same model is fitted for
the whole genome. If model ="Chrom", a different model is
fitted for each chromosome. |
burnin |
Number of burn-in iterations in the Markov Chain |
TOT |
Number of iterations after the burn-in |
k.max |
Maximum number of hidden states to fit. |
stat |
Initial Distribution for the hidden states. Must be a
vector of size 1 + 2 + ... +k.max . If NULL, it is assumed a
uniform distribution for every model. |
mu.alfa |
Hyperparameter. See details |
mu.beta |
Hyperparameter. See details |
ka |
Hyperparameter. See details |
g |
Hyperparameter. See details |
prob.k |
Hyperparameter. See details |
jump.parameters |
List with the parameters for the MCMC jumps. See details. |
start.k |
Initial number of states. if NULL, a random draw from
prob.k is chosen. |
RJ |
Logical. If TRUE, Reversible Jump is performed. If not, MCMC over a fixed number of hidden states. |
RJaCGH fits the following bayesian model: There is a priori
distribution for the number of hidden states (different copy numbers)
as stated by prob.k
. If NULL, a uniform distribution between 1
and k.max
is used.
The hidden states follow a normal distribution which mean (mu
) follows
itself a normal distribution with mean
mu.alfa
and variance mu.beta
. If NULL, these are the
median of the data and the range. The variance (sigma.2
)of the hidden states
follows an Inverse Gamma distribution with parameters ka
and
g
. If NULL, these are 2
and diff(range(y))^2 / 50
,
respectively.
The model for the transition matrix is based on a random matrix
beta
whose diagonal is zero. The transition matrix, Q
,
has the form: newline
Q[i,j] = exp(-beta[i,j] + beta[i,j]*x) / sum(i,.) {exp(-beta[i,.] +
beta[i,.]*x}
The prior distribution for beta
is gamma with parameters 1, 1. newline
The x
are the distances between positions, normalized to lay
between zero and 1 (x=diff(Pos) / max(diff(Pos))
)
RJaCGH performs Markov Chain MonteCarlo with Reversible Jump to sample for the posterior distribution. Every sweep has 3 steps:
1.- A Metropolis-Hastings move is used to update, for a fixed number
of hidden states, mu
, sigma.2
and beta
. A
symmetric proposal with a normal distribution and standard deviation
sigma.tau.mu
, sigma.tau.sigma.2
and
sigma.tau.beta
is sampled.
2.- A transdimensional move is chosen, between birth (a new hidden state is sampled from the prior) or death (an existing hidden state is erased).
3.- Another transdimensional move is performed; an split move (divide
an existing state in two) or a combine move (join two adjacent
states). The length of the split is sampled from a normal distribution
with standard deviation tau.split.mu
for the mu
and
tau.split.beta
for beta
.
jump.parameters
must be a list with the parameters for the
moves. It must have components sigma.tau.mu
,
sigma.tau.sigma.2
, sigma.tau.beta
These are vectors of
length k.max
. tau.split.mu
, tau.split.beta
are vectors of
length 1. If any of them is NULL, a call to the internal function
get.jump()
is made to find 'good' values.
The object returned follows a hierarchy: newline
If y is a matrix or data.frame (i.e., several arrays), an object of
class RJaCGH.array
is returned, with components:
[[]] |
A list with an object of corresponding class (see below) for every array. |
array.names |
Vector with the names of the arrays. |
[[]] |
a list with as many objects as k.max, with the fits. |
k |
sequence of number of hidden states sampled. |
prob.b |
Number of birth moves performed (Includes burn-in. |
prob.d |
Number of death moves performed (Includes burn-in. |
prob.s |
Number of split moves performed (Includes burn-in. |
prob.c |
Number of combine moves performed (Includes burn-in. |
y |
y vector. |
Pos |
Pos vector. |
model |
model. |
Chrom |
Chromosome vector. |
x |
x vector of distances between genes. |
[[]] |
a list with as many components as chromosomes, of class
RJaCGH (See below). |
Pos |
Pos vector. |
model |
model. |
Chrom |
Chromosome vector. |
mu |
a matrix with the means sampled |
sigma.2 |
a matrix with the variances sampled |
beta |
an array of dimension 3 with beta values sampled |
stat |
vector of initial distribution |
loglik |
log likelihoods of every MCMC iteration |
prob.mu |
probability of aceptance of mu in the
Metropolis-Hastings step. |
prob.sigma.2 |
probability of aceptance of sigma.2 in the
Metropolis-Hastings step. |
prob.beta |
probability of aceptance of beta in the
Metropolis-Hastings step. |
Oscar Rueda and Ramon Diaz Uriarte
Oscar Rueda and Ramon Diaz Uriarte, in prep.
Cappe, Moulines and Ryden, 2005. Inference in Hidden Markov Models. Springer.
Green, P.J. (1995) Reversible Jump Markov Chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711-732.
summary.RJaCGH
,
states
, model.averaging
,
plot.RJaCGH
, trace.plot
,
gelman.brooks.plot
, collapseChain
y <- c(rnorm(100, 0, 1), rnorm(10, -3, 1), rnorm(20, 3, 1), rnorm(100,0, 1)) Pos <- runif(230) Pos <- cumsum(Pos) Chrom <- rep(1:23, rep(10, 23)) jp <- list(sigma.tau.mu=rep(0.5, 4), sigma.tau.sigma.2=rep(0.3, 4), sigma.tau.beta=rep(0.7, 4), tau.split.mu=0.5, tau.split.beta=0.5) fit.chrom <- RJaCGH(y=y, Pos=Pos, Chrom=Chrom, model="Chrom", burnin=10, TOT=1000, k.max = 4, jump.parameters=jp) ##RJ results for chromosome 5 table(fit.chrom[[5]]$k) fit.genome <- RJaCGH(y=y, Pos=Pos, Chrom=Chrom, model="genome", burnin=100, TOT=1000, jump.parameters=jp, k.max = 4) ## Results for the model with 3 states: apply(fit.genome[[3]]$mu, 2, summary) apply(fit.genome[[3]]$sigma.2, 2, summary) apply(fit.genome[[3]]$beta, c(1,2), summary)