bcp {bcp}R Documentation

A Package for Performing a Bayesian Analysis of Change Point Problems

Description

bcp() implements the Barry and Hartigan (1993) product partition model for the standard change point problem using Markov Chain Monte Carlo. Versions < 2.0 are quadratic in speed, and perform the default 550 iterations in approximately 0.75 seconds for a sequence of length 100. Versions => 2.0 are linear in speed and partition a sequence of length 10,000 in approximately 45 seconds (compared with 45 minutes for versions < 2.0). These times were computed on a PC with Windows XP, a Pentium D Processor (2.99 GHz) and 3.50GB of RAM.

Usage

 bcp(x, w0 = 0.2, p0 = 0.2, burnin = 50, mcmc = 500, return.mcmc=FALSE)

Arguments

x a vector of numerical data.
w0 an optional numeric value specifying the prior variance. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan.
p0 an optional numeric value specifying the prior probability of a change point at each location in the sequence. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan.
burnin the number of burnin iterations.
mcmc the number of iterations after burnin.
return.mcmc if return.mcmc=TRUE the posterior means and the partitions after each iteration are returned.

Details

This algorithm is used when there exists an unknown partition of a sequence into contiguous blocks such that the mean is constant within blocks.

Value

bcp() returns a list containing the following components:

data a vector copy of the data.
return.mcmc if return.mcmc=TRUE, the posterior means and the partitions after each iteration are returned.
mcmc.means if return.mcmc=TRUE, mcmc.means contains the posterior means at for each iteration conditional on the state of the partition.
mcmc.rhos if return.mcmc=TRUE, mcmc.rhos contains the partitions after each iteration. A value of 1 indicates the end of a block.
blocks a vector of the number of blocks after each iteration.
posterior.mean a vector of the posterior means over the iterations, excuding burnin.
posterior.var a vector of the posterior variances over the iterations, excuding burnin.
posterior.prob a vector of the posterior probability of changes over the iterations, excuding burnin.
burnin the number of burnin iterations.
mcmc the number of iterations after burnin.
w0 an optional numeric value specifying the prior variance. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan.
p0 an optional numeric value specifying the prior probability of a change point at each location in the sequence. If no value is specified, the default value of 0.2 is used, as recommended by Barry and Hartigan.


The generic accessor functions data, blocks, and posterior.mean, posterior.prob, burnin, mcmc, w0, and p0 extract various useful components of the list returned by bcp(). The functions summary.bcp(), print.bcp(), and plot.bcp() are used to obtain summaries of the results.

Note

The vector of data must contain no missing or NA values. The user may set .Random.seed to control the MCMC if desired.

Author(s)

Chandra Erdman and John W. Emerson

References

Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22. url: http://qed.econ.queensu.ca/jae/2003-v18.1/bai-perron/.

Daniel Barry and J. A. Hartigan (1993), A Bayesian Analysis for Change Point Problems, Journal of The American Statistical Association, 88, 309-19.

Olshen, A. B., Venkatraman, E. S., Lucito, R., Wigler, M. (2004), Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 5, 557-572. url: http://www.bioconductor.org/repository/release1.5/package/html/DNAcopy.html.

Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines (2006), The coda Package, version 0.10-7, CRAN: The Comprehensive R Network.

Snijders et al. (2001), Assembly of microarrays for genome-wide measurement of DNA copy number, Nature Genetics, 29, 263-264.

Achim Zeileis, Friedrich Leisch, Bruce Hansen, Kurt Hornik, Christian Kleiber (2006), The strucchange Package, version 1.3-1, CRAN: The Comprehensive R Network.

See Also

plot.bcp, summary.bcp, and print.bcp for summaries of the results.

Examples


  ##### A random sample from a few normal distributions #####
  testdata <- c(rnorm(50), rnorm(50, 5, 1), rnorm(50))
  bcp.0 <- bcp(testdata)
  summary.bcp(bcp.0)
  plot.bcp(bcp.0)
  
  ##### Coriell chromosome 11 #####
  data(coriell)
  chrom11 <- as.vector(na.omit(coriell$Coriell.05296[coriell$Chromosome==11]))
  bcp.11 <- bcp(chrom11)
  summary.bcp(bcp.11)
  plot.bcp(bcp.11)
  
  # to see bcp and Circular Binary Segmentation results run:
  if(require("DNAcopy")) {
  bcp.11$posterior.prob[length(bcp.11$posterior.brob)] <- 0
  n <- length(chrom11)
  cbs <- segment(CNA(chrom11, rep(1, n), 1:n), verbose = 0)
  cbs.ests <- rep(unlist(cbs$output[6]), unlist(cbs$output[5]))
  op <- par(mfrow=c(2,1),col.lab="black",col.main="black")
  plot(1:n, bcp.11$posterior.mean, type="l", xlab="Location", ylab="Posterior Mean", main="Posterior Means")
  lines(cbs.ests, col="red")
  points(chrom11)
  plot(1:n, bcp.11$posterior.prob, type="l", ylim=c(0,1), xlab="Location", ylab="Posterior Probability of a Change", main="Change Point Locations")
  for(i in 1:(dim(cbs$output)[1]-1)) abline(v=cbs$output$loc.end[i], col="red")
  par(op)
  } else {
        cat("DNAcopy is not loaded")
  }
  
  ##### RealInt #####
  data("RealInt")
  bcp.ri <- bcp(as.vector(RealInt))
  summary.bcp(bcp.ri)
  plot.bcp(bcp.ri)
  
  # to see bcp and Bai and Perron results run:
  if(require("strucchange")) {
  bcp.ri$posterior.prob[length(bcp.ri$posterior.brob)] <- 0
  bp <- breakpoints(RealInt ~ 1, h = 2)$breakpoints
  rho <- rep(0, length(RealInt))
  rho[bp] <- 1
  b.num<-1 + c(0,cumsum(rho[1:(length(rho)-1)]))
  bp.mean <- unlist(lapply(split(RealInt,b.num),mean))
  bp.ri <- rep(0,length(RealInt))
  for(i in 1:length(bp.ri)) bp.ri[i] <- bp.mean[b.num[i]]
  op <- par(mfrow=c(2,1),col.lab="black",col.main="black")
  xax <- seq(1961, 1987, length=103)
  plot(xax, bcp.ri$posterior.mean, type="l", xlab="Time", ylab="Mean", main="Posterior Means")
  lines(xax, bp.ri, col="blue")
  points(RealInt)
  plot(xax, bcp.ri$posterior.prob, type="l", ylim=c(0,1), xlab="Time", ylab="Posterior Probability", main="Posterior Probability of a Change")
  for(i in 1:length(bp.ri)) abline(v=xax[bp[i]], col="blue")
  par(op)
  } else {
  cat("strucchange is not loaded")
  }


[Package bcp version 2.0 Index]