bcp {bcp} | R Documentation |
bcp()
implements the Barry and Hartigan (1993) product partition model for the standard change point problem using Markov Chain Monte Carlo. Versions < 2.0 are quadratic in speed, and perform the default 550 iterations in approximately 0.75 seconds for a sequence of length 100. Versions => 2.0 are linear in speed and partition a sequence of length 10,000 in approximately 45 seconds (compared with 45 minutes for versions < 2.0). These times were computed on a PC with Windows XP, a Pentium D Processor (2.99 GHz) and 3.50GB of RAM.
bcp(x, w0 = 0.2, p0 = 0.2, burnin = 50, mcmc = 500, return.mcmc=FALSE)
x |
a vector of numerical data. |
w0 |
an optional numeric value specifying the prior variance. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan. |
p0 |
an optional numeric value specifying the prior probability of a change point at each location in the sequence. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan. |
burnin |
the number of burnin iterations. |
mcmc |
the number of iterations after burnin. |
return.mcmc |
if return.mcmc=TRUE the posterior means and the partitions after each iteration are returned. |
This algorithm is used when there exists an unknown partition of a sequence into contiguous blocks such that the mean is constant within blocks.
bcp()
returns a list containing the following components:
data |
a vector copy of the data. |
return.mcmc |
if return.mcmc=TRUE , the posterior means and the partitions after each iteration are returned. |
mcmc.means |
if return.mcmc=TRUE , mcmc.means contains the posterior means at for each iteration conditional on the state of the partition. |
mcmc.rhos |
if return.mcmc=TRUE , mcmc.rhos contains the partitions after each iteration. A value of 1 indicates the end of a block. |
blocks |
a vector of the number of blocks after each iteration. |
posterior.mean |
a vector of the posterior means over the iterations, excuding burnin. |
posterior.var |
a vector of the posterior variances over the iterations, excuding burnin. |
posterior.prob |
a vector of the posterior probability of changes over the iterations, excuding burnin. |
burnin |
the number of burnin iterations. |
mcmc |
the number of iterations after burnin. |
w0 |
an optional numeric value specifying the prior variance. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan. |
p0 |
an optional numeric value specifying the prior probability of a change point at each location in the sequence. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan. |
The generic accessor functions data
, blocks
, and posterior.mean
, posterior.prob
, burnin
, mcmc
, w0
, and p0
extract various useful components of the list returned by bcp()
. The functions summary.bcp()
, print.bcp()
, and plot.bcp()
are used to obtain summaries of the results.
The vector of data must contain no missing or NA
values. The user may set .Random.seed
to control the MCMC if desired.
Chandra Erdman and John W. Emerson
Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22. url: http://qed.econ.queensu.ca/jae/2003-v18.1/bai-perron/.
Daniel Barry and J. A. Hartigan (1993), A Bayesian Analysis for Change Point Problems, Journal of The American Statistical Association, 88, 309-19.
Olshen, A. B., Venkatraman, E. S., Lucito, R., Wigler, M. (2004), Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 5, 557-572. url: http://www.bioconductor.org/repository/release1.5/package/html/DNAcopy.html.
Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines (2006), The coda Package, version 0.10-7, CRAN: The Comprehensive R Network.
Snijders et al. (2001), Assembly of microarrays for genome-wide measurement of DNA copy number, Nature Genetics, 29, 263-264.
Achim Zeileis, Friedrich Leisch, Bruce Hansen, Kurt Hornik, Christian Kleiber (2006), The strucchange Package, version 1.3-1, CRAN: The Comprehensive R Network.
plot.bcp
, summary.bcp
, and print.bcp
for summaries of the results.
##### A random sample from a few normal distributions ##### testdata <- c(rnorm(50), rnorm(50, 5, 1), rnorm(50)) bcp.0 <- bcp(testdata) summary.bcp(bcp.0) plot.bcp(bcp.0) ##### Coriell chromosome 11 ##### data(coriell) chrom11 <- as.vector(na.omit(coriell$Coriell.05296[coriell$Chromosome==11])) bcp.11 <- bcp(chrom11) summary.bcp(bcp.11) plot.bcp(bcp.11) # to see bcp and Circular Binary Segmentation results run: if(require("DNAcopy")) { bcp.11$posterior.prob[length(bcp.11$posterior.brob)] <- 0 n <- length(chrom11) cbs <- segment(CNA(chrom11, rep(1, n), 1:n), verbose = 0) cbs.ests <- rep(unlist(cbs$output[6]), unlist(cbs$output[5])) op <- par(mfrow=c(2,1),col.lab="black",col.main="black") plot(1:n, bcp.11$posterior.mean, type="l", xlab="Location", ylab="Posterior Mean", main="Posterior Means") lines(cbs.ests, col="red") points(chrom11) plot(1:n, bcp.11$posterior.prob, type="l", ylim=c(0,1), xlab="Location", ylab="Posterior Probability of a Change", main="Change Point Locations") for(i in 1:(dim(cbs$output)[1]-1)) abline(v=cbs$output$loc.end[i], col="red") par(op) } else { cat("DNAcopy is not loaded") } ##### RealInt ##### data("RealInt") bcp.ri <- bcp(as.vector(RealInt)) summary.bcp(bcp.ri) plot.bcp(bcp.ri) # to see bcp and Bai and Perron results run: if(require("strucchange")) { bcp.ri$posterior.prob[length(bcp.ri$posterior.brob)] <- 0 bp <- breakpoints(RealInt ~ 1, h = 2)$breakpoints rho <- rep(0, length(RealInt)) rho[bp] <- 1 b.num<-1 + c(0,cumsum(rho[1:(length(rho)-1)])) bp.mean <- unlist(lapply(split(RealInt,b.num),mean)) bp.ri <- rep(0,length(RealInt)) for(i in 1:length(bp.ri)) bp.ri[i] <- bp.mean[b.num[i]] op <- par(mfrow=c(2,1),col.lab="black",col.main="black") xax <- seq(1961, 1987, length=103) plot(xax, bcp.ri$posterior.mean, type="l", xlab="Time", ylab="Mean", main="Posterior Means") lines(xax, bp.ri, col="blue") points(RealInt) plot(xax, bcp.ri$posterior.prob, type="l", ylim=c(0,1), xlab="Time", ylab="Posterior Probability", main="Posterior Probability of a Change") for(i in 1:length(bp.ri)) abline(v=xax[bp[i]], col="blue") par(op) } else { cat("strucchange is not loaded") }