bark-package {bark} | R Documentation |
Implementation of BARK: Bayesian Additive Regression Kernels with Feature Selection, Zhi Ouyang, Ph.D. thesis, Duke University
Package: | bark |
Type: | Package |
Version: | 0.1-0 |
Date: | 2008-07-16 |
License: | GPL version 2 or newer |
LazyLoad: | yes |
Overview:
BARK is a Bayesian sum-of-kernels model.
For a numeric response y, we have
y = f(x) + e,
where e ~ N(0,sigma^2).
For a binary response y, P(Y=1 | x) = F(f(x)), where F
denotes the standard normal cdf (probit link).
In both cases, f is a sum of many Gaussian kernel functions, which allows very flexible inference for the unknown function f. The prior distribution on f is an approximation to a Cauchy process.
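The two response models above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper sum_of_kernels(), the kernel centers, the coefficients beta, and the per-dimension scale parameters lambda are hypothetical names chosen here, and the kernel parameterization exp(-sum_d lambda_d (x_d - c_d)^2) is an assumed form, not necessarily the one used inside bark().

```r
# Hypothetical helper, not part of the bark package: evaluates a sum of
# Gaussian kernels at each row of x.
sum_of_kernels <- function(x, centers, beta, lambda) {
  x <- as.matrix(x)
  centers <- as.matrix(centers)
  f <- numeric(nrow(x))
  for (j in seq_len(nrow(centers))) {
    # Squared distance to center j, weighted per dimension by lambda
    d2 <- sweep(x, 2, centers[j, ], "-")^2 %*% lambda
    f <- f + beta[j] * exp(-drop(d2))
  }
  f
}

set.seed(1)
x <- matrix(runif(20), ncol = 2)                 # 10 points, 2 features
centers <- matrix(c(0.2, 0.2,
                    0.8, 0.8), ncol = 2, byrow = TRUE)
beta <- c(1.5, -1)                               # kernel coefficients (assumed)
lambda <- c(4, 4)                                # scale parameters (assumed)

f <- sum_of_kernels(x, centers, beta, lambda)

# Numeric response: y = f(x) + e, with e ~ N(0, sigma^2)
y_numeric <- f + rnorm(length(f), sd = 0.1)

# Binary response: P(Y = 1 | x) = F(f(x)), F the standard normal cdf
p_binary <- pnorm(f)
y_binary <- rbinom(length(f), size = 1, prob = p_binary)
```

In the package itself the kernel locations, coefficients, and scale parameters are unknowns inferred from the data; the fixed values here only make the functional form concrete.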
Feature selection is achieved through inference on the scale parameters of the Gaussian kernels. BARK accepts four types of prior distribution for these parameters ("e", "d", "se", and "sd"), which enable either soft or hard shrinkage of the scale parameters.
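To see why inference on the scale parameters can act as feature selection, consider a single Gaussian kernel with one scale parameter per feature. The sketch below assumes the kernel form exp(-sum_d lambda_d (x_d - c_d)^2); gauss_kernel() is a hypothetical helper written for this illustration and is not a function of the bark package. When a scale parameter is exactly zero, the kernel no longer depends on that feature.

```r
# Hypothetical helper for illustration; not the bark package's code.
gauss_kernel <- function(x, center, lambda) {
  exp(-sum(lambda * (x - center)^2))
}

center <- c(0.5, 0.5, 0.5)
x1 <- c(0.1, 0.9, 0.2)
x2 <- c(0.1, 0.9, 0.9)   # differs from x1 only in the third feature

lambda_full  <- c(2, 2, 2)   # all three features active
lambda_drop3 <- c(2, 2, 0)   # third scale parameter shrunk to zero

# With all features active, the kernel distinguishes x1 from x2
k_full <- c(gauss_kernel(x1, center, lambda_full),
            gauss_kernel(x2, center, lambda_full))

# With the third scale parameter at zero, x1 and x2 give the same value
k_drop <- c(gauss_kernel(x1, center, lambda_drop3),
            gauss_kernel(x2, center, lambda_drop3))
```

A scale parameter that is small but nonzero only damps a feature's influence, whereas one set exactly to zero removes it, which is the distinction between soft and hard shrinkage.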
Functions:
bark()
sim.Friedman1()
sim.Friedman2()
sim.Friedman3()
sim.Circle()
Zhi Ouyang <zo2@stat.duke.edu>, Merlise Clyde <clyde@stat.duke.edu>, Robert Wolpert <wolpert@stat.duke.edu>
Maintainer: Zhi Ouyang <zo2@stat.duke.edu>
Ouyang, Zhi (2008) Bayesian Additive Regression Kernels.
Duke University. Ph.D. dissertation, Chapter 3.
Available at:
http://stat.duke.edu/people/theses/OuyangZ.html
## Simulate regression example
# Friedman 2 data set, 200 noisy training, 1000 noise-free testing
# Out-of-sample MSE in SVM (default RBF): 6500 (sd. 1600)
# Out-of-sample MSE in BART (default): 5300 (sd. 1000)
traindata <- sim.Friedman2(200, sd=125)
testdata <- sim.Friedman2(1000, sd=0)
fit.bark.d <- bark(traindata$x, traindata$y, testdata$x,
                   classification=FALSE, type="d")
# Posterior draws of the kernel scale parameters, one box per feature
boxplot(as.data.frame(fit.bark.d$theta.lambda))
# Out-of-sample mean squared error
mean((fit.bark.d$yhat.test.mean-testdata$y)^2)

## Simulate classification example
# Circle 5 with 2 signal dimensions and 3 noise dimensions
# Out-of-sample error rate in SVM (default RBF): 0.110 (sd. 0.02)
# Out-of-sample error rate in BART (default): 0.065 (sd. 0.02)
traindata <- sim.Circle(200, dim=5)
testdata <- sim.Circle(1000, dim=5)
fit.bark.se <- bark(traindata$x, traindata$y, testdata$x,
                    classification=TRUE, type="se")
boxplot(as.data.frame(fit.bark.se$theta.lambda))
# Out-of-sample misclassification rate
mean((fit.bark.se$yhat.test.mean>0)!=testdata$y)