btgp {tgp} | R Documentation |
The seven functions described below implement Bayesian regression models of varying complexity: linear model, linear CART, Gaussian process (GP), GP with jumps to the limiting linear model (LLM), treed GP, and treed GP LLM.
blm(X, Z, XX = NULL, bprior = "bflat", BTE = c(1000, 4000, 3), R = 1, m0r1 = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, trace = FALSE, verb = 1, ...) btlm(X, Z, XX = NULL, bprior = "bflat", tree = c(0.5, 2), BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, trace = FALSE, verb = 1, ...) bcart(X, Z, XX = NULL, bprior = "bflat", tree = c(0.5, 2), BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv=FALSE, trace = FALSE, verb = 1, ...) bgp(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", BTE = c(1000, 4000, 2), R = 1, m0r1 = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, nu = 1.5, trace = FALSE, verb = 1, ...) bgpllm(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", gamma=c(10,0.2,0.7), BTE = c(1000, 4000, 2), R = 1, m0r1 = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, nu = 1.5, trace = FALSE, verb = 1, ...) btgp(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", tree = c(0.5, 2), BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, linburn = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, nu = 1.5, trace = FALSE, verb = 1, ...) btgpllm(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", tree = c(0.5, 2), gamma=c(10,0.2,0.7), BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, linburn = FALSE, itemps = NULL, pred.n = TRUE, krige = TRUE, Ds2x = FALSE, improv = FALSE, nu = 1.5, trace = FALSE, verb = 1, ...)
Each of the above functions takes some subset of the following arguments...
X |
data.frame , matrix , or vector of inputs X |
Z |
Vector of output responses Z of length equal to the
leading dimension (rows) of X , i.e., length(Z) == nrow(X) |
XX |
Optional data.frame , matrix ,
or vector of predictive input locations
with the same number of columns as X , i.e.,
ncol(XX) == ncol(X) |
bprior |
Linear (beta) prior, default is "bflat" ;
alternates include "b0" hierarchical Normal prior,
"bmle" empirical Bayes Normal prior, "b0not" Bayesian
treed LM-style prior from Chipman et al. (same as "b0" but
without tau2 ), "bmzt" a independent Normal
prior (mean zero) with inverse-gamma variance (tau2 ).
The default "bflat" gives
an “improper” prior which can perform badly when the
signal-to-noise ratio is low. In these cases the “proper” hierarchical
specification "b0" or independent "b0" priors may perform better |
tree |
2-vector of tree process prior parameterization
c(alpha, beta) specifying
p(split leaf eta) = alpha*(1+depth(eta))^(-beta)
automatically giving zero probability to trees
with partitions containing less than |
gamma |
Limiting linear model parameters c(g, t1, t2) ,
with growth parameter g > 0
minimum parameter t1 >= 0 and maximum parameter t1 >= 0 , where
t1 + t2 <= 1 specifies
p(b|d)= t1 + exp(-g*(t2-t1)/(d-0.5)) |
corr |
Gaussian process correlation model. Choose between the isotropic
power exponential family ("exp" ) or the separable power exponential
family ("expsep" , default); the current version also supports
the isotropic Matern ("matern" ) as “beta” functionality |
BTE |
3-vector of Monte-carlo parameters (B)urn in, (T)otal, and (E)very. Predictive samples are saved every E MCMC rounds starting at round B, stopping at T. |
R |
Number of repeats or restarts of BTE MCMC rounds,
default R=1 is no restarts |
m0r1 |
If TRUE the responses Z will be scaled to have a mean of
zero and a range of 1; default is FALSE |
linburn |
If TRUE initializes MCMC with B (additional)
rounds of Bayesian Linear CART (btlm ); default is FALSE |
itemps |
Importance tempering inverse temperature ladder,
or powers to improve mixing. Can be a vector (0 < itemps )
which assumes a uniform prior over temperatures; or a matrix
with two columns where the second gives the distribution; or a list
(or data.frame ) with entries (columns) named $itemps
and $tprobs . This is “alpha” functionality |
pred.n |
TRUE (default) value results in prediction at
the inputs X ; FALSE
skips prediction at X resulting in a faster
implementation |
krige |
TRUE (default) value results in collection of kriging
means and variances at predictive (and/or data) locations; FALSE
skips the gathering of kriging statistics giving a savings in storage |
Ds2x |
TRUE results in ALC (Active Learning–Cohn)
computation of expected reduction in uncertainty calculations at the
X locations, which can be used for adaptive sampling;
FALSE (default) skips this computation, resulting in
a faster implementation |
improv |
TRUE results in samples from the expected
improvement (in reduction of uncertainty) at locations XX
about the global minimum which can be used for adaptive sampling;
FALSE (default) skips this computation, resulting in a faster
implementation |
nu |
“beta” functionality: fixed smoothness parameter for the Matern correlation function; nu+0.5 times differentiable predictive surfaces result |
trace |
TRUE results in a saving of samples from the
posterior distribution for most of the parameters in the model. The
default is FALSE for speed/storage reasons. See note below |
verb |
Level of verbosity of R-console print statements: from 0 (none); 1 (default) which shows the “progress meter”; 2 includes an echo of initialization parameters; up to 3 and 4 (max) with more info about successful tree operations |
... |
These ellipses arguments are interpreted as augmentations
to the prior specification generated by
params <- tgp.default.params(ncol(X)+1) .
You may use these to specify a custom setting of any of default parameters in the output list params
except those for which a specific argument is already provided
(e.g., params$corr or params$bprior ) or those which contradict
the type of b* function being called (e.g.,
params$tree or params$gamma ); these redundant or
possibly conflicting specifications will be ignored |
The functions and their arguments can be categorized by whether or not they use treed partitioning (T), GP models, and jumps to the LLM (or LM)
blm | LM | Linear Model |
btlm | T, LM | Treed Linear Model |
bcart | T | Treed Constant Model |
bgp | GP | GP Regression |
bgpllm | GP, LLM | GP with jumps to the LLM |
btgp | T, GP | treed GP Regression |
btgpllm | T, GP, LLM | treed GP with jumps to the LLM |
Each function implements a special case of the generic function
tgp
which is an interface to C/C++ code for treed Gaussian process
modeling of varying parameterization. Documentation for tgp
has been declared redundant, and has subsequently been removed. To see
how the b*
functions use tgp
simply examine the
function. In the latest version, with the addition of the ellipses
“...” argument, there is nothing that can be done
with the direct tgp
function that cannot also be done with a
b*
function
Only functions in the T (tree) category take the tree
argument;
GP category functions take the corr
argument; and LLM category
functions take the gamma
argument. Non-tree class functions omit
the parts
output, see below
bcart
is the same as btlm
except that only the
intercept term in the LM is estimated; the others are zero, thereby
implementing a Bayesian version of the original CART model
Please see vignette("tgp")
for detailed illustration
bgp
returns an object of class "tgp"
. The function plot.tgp
can be used to help visualize results.
An object of class "tgp"
is a list containing at least the following
components... The code{parts} output is unique to the T (tree) category functions.
Tree viewing is supported by tgp.trees
X |
Input argument: data.frame of inputs X |
n |
Number of rows in X , i.e., nrow(X) |
d |
Number of cols in X , i.e., ncol(X) |
Z |
Vector of output responses Z |
XX |
Input argument: data.frame of predictive locations XX |
nn |
Number of rows in XX , i.e., nrow(XX) |
BTE |
Input argument: Monte-carlo parameters |
R |
Input argument: restarts |
linburn |
Input argument: initialize MCMC with linear CART |
params |
list of model parameters generated by
tgp.default.params and subsequently modified according
to the calling b* function and its arguments |
dparams |
Double-representation of model input parameters used by the C-code |
itemps |
Input argument: data.frame of inverse
temperatures ($itemps ) (temperature ladder) and prior
probability distribution ($tprobs ) used for importance tempering |
Zp.mean |
Vector of mean predictive estimates at X locations |
Zp.q1 |
Vector of 5% predictive quantiles at X locations |
Zp.q2 |
Vector of 95% predictive quantiles at X locations |
Zp.q |
Vector of quantile norms Zp.q2-Zp.q1 |
Zp.km |
Vector of (expected) kriging means at X locations |
Zp.ks2 |
Vector of (expected) kriging variances at X locations |
ZZ.mean |
Vector of mean predictive estimates at XX locations |
ZZ.q1 |
Vector of 5% predictive quantiles at XX locations |
ZZ.q2 |
Vector of 95% predictive quantiles at XX locations |
ZZ.q |
Vector of quantile norms ZZ.q2-ZZ.q1 , used by the
ALM adaptive sampling algorithm |
ZZ.km |
Vector of (expected) kriging means at XX locations |
ZZ.ks2 |
Vector of (expected) kriging variances at XX locations |
Ds2x |
If argument Ds2x=TRUE , this vector contains ALC
statistics for XX locations |
improv |
If argument improv=TRUE , this vector contains expected
improvement (about the global minimum) statistics for XX locations |
response |
Name of response Z if supplied by data.frame
in argument, or "z" if none provided |
parts |
Internal representation of the regions depicted by partitions of the maximum a' posteriori (MAP) tree |
trees |
list of trees (maptree representation) which
were MAP as a function
of each tree height sampled between MCMC rounds B and
T |
trace |
If trace==TRUE , this list
contains traces of most of the model
parameters and posterior predictive distributions at input locations
XX . Otherwise the entry is FALSE . See note below |
ess |
Importance tempering effective sample size (ESS). If
itemps==NULL this corresponds to the total number of
samples collected, i.e.,
R*(BTE[2]-BTE[1])/BTE[3] ).
Otherwise the ESS will be lower due to a non-zero coefficient of variation of the calculated importance tempering weights |
Inputs X, XX, Z
containing NaN, NA
, or Inf
are
discarded with non-fatal warnings
Upon execution, MCMC reports are made every 1,000 rounds to indicate progress
Stationary (non-treed) processes on larger inputs (e.g., X,Z
)
of size greater than 500, *might* be slow in execution, especially on
older machines. Once the C code starts executing, it can be interrupted
in the usual way: either via Ctrl-C (Unix-alikes) or pressing the Stop
button in the R-GUI. When this happens, interrupt messages will
indicate which required cleanup measures completed before returning
control to R.
Regarding trace=TRUE
: Samples from the posterior will be
collected for all parameters in the model. GP parameters are collected
with reference to the locations in XX
, resulting
nn=nrow{XX}
traces of d,g,s2,tau2
, etc. Therefore, it
is recommended that nn
is chosen to be a small, representative,
set of input locations. Besides GP parameters, traces are saved for
the tree partitions, areas under the LLM, log posterior (as a function
of tree height), and samples from the posterior predictive
distributions. Note that since some traces are stored in
files, multiple tgp
/R sessions in the same working
directory can clobber the trace files of other sessions
Robert B. Gramacy rbgramacy@ams.ucsc.edu
Gramacy, R. B., Lee, H. K. H. (2006). Bayesian treed Gaussian process models. Available as UCSC Technical Report ams2006-01.
Gramacy, R. B., Lee, H. K. H. (2006). Adaptive design of supercomputer experiments. Available as UCSC Technical Report ams2006-02.
Chipman, H., George, E., & McCulloch, R. (1998). Bayesian CART model search (with discussion). Journal of the American Statistical Association, 93, 935–960.
Chipman, H., George, E., & McCulloch, R. (2002). Bayesian treed models. Machine Learning, 48, 303–324.
http://www.ams.ucsc.edu/~rbgramacy/tgp.html
plot.tgp
, tgp.trees
, predict.tgp
## ## Many of the examples below illustrate the above ## function(s) on random data. Thus it can be fun ## (and informative) to run them several times. ## # # simple linear response # # input and predictive data X <- seq(0,1,length=50) XX <- seq(0,1,length=99) Z <- 1 + 2*X + rnorm(length(X),sd=0.25) out <- blm(X=X, Z=Z, XX=XX) # try Linear Model plot(out) # plot the surface # # 1-d Example # # construct some 1-d nonstationary data X <- seq(0,20,length=100) XX <- seq(0,20,length=99) Z <- (sin(pi*X/5) + 0.2*cos(4*pi*X/5)) * (X <= 9.6) lin <- X>9.6; Z[lin] <- -1 + X[lin]/10 Z <- Z + rnorm(length(Z), sd=0.1) out <- btlm(X=X, Z=Z, XX=XX) # try Linear CART plot(out) # plot the surface tgp.trees(out) # plot the MAP trees out <- btgp(X=X, Z=Z, XX=XX) # use a treed GP plot(out) # plot the surface tgp.trees(out) # plot the MAP trees # # 2-d example # (using the isotropic correlation function) # # construct some 2-d nonstationary data exp2d.data <- exp2d.rand() X <- exp2d.data$X; Z <- exp2d.data$Z XX <- exp2d.data$XX # try a GP out <- bgp(X=X, Z=Z, XX=XX, corr="exp") plot(out) # plot the surface # try a treed GP LLM out <- btgpllm(X=X, Z=Z, XX=XX, corr="exp") plot(out) # plot the surface tgp.trees(out) # plot the MAP trees # # Motorcycle Accident Data # # get the data # and scale the response to zero mean and a rage of 1 (m0r1) require(MASS) # try a GP out <- bgp(X=mcycle[,1], Z=mcycle[,2], m0r1=TRUE) plot(out) # plot the surface # try a treed GP LLM # best to use the "b0" beta linear prior to capture common # common linear process throughout all regions (using the # elipses "...") out <- btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0", m0r1=TRUE) plot(out) # plot the surface tgp.trees(out) # plot the MAP trees # for other examples try the demos or the vignette