mlegp {mlegp}    R Documentation

mlegp: maximum likelihood estimation of Gaussian process parameters

Description

Finds maximum likelihood estimates of Gaussian process parameters for a vector (or matrix) of one (or more) responses. For multiple responses, the user chooses between fitting independent Gaussian processes to the separate responses or fitting independent Gaussian processes to principal component weights obtained through singular value decomposition of the output. The latter is useful for functional output or data-rich situations.

Usage

mlegp(X, Z, constantMean = 1, nugget = NULL, min.nugget = 0, param.names = NULL, gp.names = NULL, 
        PC.UD = NULL, PC.num = NULL, PC.percent = NULL, 
        simplex.ntries = 5, simplex.maxiter = 500, simplex.reltol = 1e-8,  
        BFGS.maxiter = 500, BFGS.tol = 0.01, BFGS.h = 1e-10, seed = 0, verbose = 1)

Arguments

X the design matrix
Z vector or matrix of observations, corresponding to the rows of X
constantMean a value of 1 indicates that each Gaussian process will have a constant mean; otherwise the mean function will be a linear regression in X, plus an intercept term
nugget a positive number indicating an initial estimate for the nugget term, or a vector corresponding to the diagonal nugget matrix up to a multiplicative constant. If NULL (the default), mlegp estimates a nugget term only if there are replicates in the design matrix
min.nugget minimum value of the nugget term; 0 by default
param.names a vector of parameter names, corresponding to the columns of X; parameter names are ‘p1’, ‘p2’, ... by default
PC.UD the UD matrix if Z is a matrix of principal component weights; see mlegp-svd-functions
PC.num the number of principal component weights to keep in the singular value decomposition of Z
PC.percent if not NULL, the number of principal component weights kept is the minimum number that accounts for PC.percent of the total variance of the matrix Z
simplex.ntries the number of simplexes to run
simplex.maxiter maximum number of evaluations / simplex
simplex.reltol relative tolerance for simplex method, defaulting to 1e-8
BFGS.maxiter maximum number of iterations for BFGS method
BFGS.tol stopping condition for BFGS method is when norm(gradient) < BFGS.tol * max(1, norm(x)), where x is the parameter vector and norm is the Euclidean norm
BFGS.h derivatives are approximated as [f(x+BFGS.h) - f(x)] / BFGS.h
seed the random number seed
verbose a value of '1' or '2' will result in status updates being printed; a value of '2' results in more information
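
The roles of BFGS.h and BFGS.tol can be illustrated with a small base R sketch. This is not the package's C implementation: the helper names forward_diff_grad and bfgs_converged are hypothetical, and perturbing one coordinate at a time is an assumption about how the forward difference is applied to a parameter vector.

```r
# Sketch only: hypothetical helpers, not part of mlegp.

# Forward-difference gradient, [f(x + h) - f(x)] / h, one coordinate at a time
# (ASSUMPTION: per-coordinate perturbation).
forward_diff_grad <- function(f, x, h = 1e-10) {
  fx <- f(x)
  vapply(seq_along(x), function(i) {
    xh <- x
    xh[i] <- xh[i] + h
    (f(xh) - fx) / h
  }, numeric(1))
}

# Stopping rule: norm(gradient) < tol * max(1, norm(x)), Euclidean norm.
bfgs_converged <- function(grad, x, tol = 0.01) {
  enorm <- function(v) sqrt(sum(v^2))
  enorm(grad) < tol * max(1, enorm(x))
}

# Toy objective with its minimum at (1, 2).
f <- function(x) sum((x - c(1, 2))^2)

x_far  <- c(3, -1)
x_near <- c(1.001, 2.001)
g_far  <- forward_diff_grad(f, x_far)
g_near <- forward_diff_grad(f, x_near)

print(bfgs_converged(g_far, x_far))    # far from the minimum
print(bfgs_converged(g_near, x_near))  # close to the minimum
```

Because the step BFGS.h is very small, the approximate gradient carries some floating-point noise, which is one reason the stopping rule uses a relative tolerance rather than an exact zero gradient.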

Details

This function calls the C function fitGPFromR which in turn calls fitGP (both in the file fit_gp.h) to fit each Gaussian process.

Separate Gaussian processes are fit to the observations in each column of Z. Maximum likelihood estimates for correlation and nugget parameters are found through numerical methods (i.e., the Nelder-Mead Simplex and the L-BFGS method), while maximum likelihood estimates of the mean regression parameters and overall variance are calculated in closed form (given the correlation and (scaled) nugget parameters). Multiple simplexes are run, and estimates from the best simplex are used as initial values for the gradient (L-BFGS) method.
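
The two-stage strategy described above can be sketched in base R. The sketch makes several assumptions that do not come from this page: a Gaussian correlation function, a log parameterization with clamped ranges, a small fixed floor on the nugget, and R's optim() with methods "Nelder-Mead" and "BFGS" standing in for the package's own simplex and L-BFGS code. All function names are hypothetical.

```r
# Sketch only: base R stand-in for the two-stage fit, not mlegp's C code.
set.seed(1)
x <- seq(-5, 5, length.out = 11)
z <- sin(x) + rnorm(length(x), sd = 0.1)

# ASSUMPTION: Gaussian correlation with a scaled nugget added to the diagonal.
corr_matrix <- function(x, beta, nugget) {
  exp(-beta * outer(x, x, "-")^2) + diag(nugget, length(x))
}

# Closed-form estimates of the constant mean and the overall variance,
# given the correlation parameter and the scaled nugget.
closed_form <- function(x, z, beta, nugget) {
  R <- corr_matrix(x, beta, nugget)
  ones <- rep(1, length(z))
  mu <- sum(solve(R, z)) / sum(solve(R, ones))
  resid <- z - mu
  sigma2 <- sum(resid * solve(R, resid)) / length(z)
  list(mu = mu, sigma2 = sigma2, R = R)
}

# Negative profile log-likelihood (up to a constant) in log parameters.
neg_profile_loglik <- function(theta) {
  theta <- pmin(pmax(theta, -15), 15)  # keep exp() in a safe range
  est <- closed_form(x, z, exp(theta[1]), 1e-6 + exp(theta[2]))
  logdet <- as.numeric(determinant(est$R, logarithm = TRUE)$modulus)
  0.5 * (length(z) * log(est$sigma2) + logdet)
}

# Stage 1: several simplex runs from random starting points.
starts <- matrix(runif(10, -2, 2), ncol = 2)
simplex_fits <- lapply(seq_len(nrow(starts)), function(i) {
  optim(starts[i, ], neg_profile_loglik, method = "Nelder-Mead",
        control = list(maxit = 500, reltol = 1e-8))
})
values <- vapply(simplex_fits, function(fit) fit$value, numeric(1))
best <- simplex_fits[[which.min(values)]]

# Stage 2: refine the best simplex estimate with a gradient-based method.
refined <- optim(best$par, neg_profile_loglik, method = "BFGS")

final <- closed_form(x, z, exp(refined$par[1]), 1e-6 + exp(refined$par[2]))
print(c(mu = final$mu, sigma2 = final$sigma2))
```

Only the correlation and nugget parameters are searched numerically; the mean and variance are recomputed in closed form at every evaluation, which keeps the search space small.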

Gaussian processes are fit to principal component weights by utilizing the singular value decomposition (SVD) of Z, Z = UDVprime. Each column of Z should correspond to a single k-dimensional observation (e.g., functional output of a computer model, evaluated at a particular input).

In the complete SVD, Z is k x m, and r = min(k,m), U is k x r, D is r x r, containing the singular values along the diagonal, and Vprime is r x m. The output Z is approximated by keeping l < r singular values, keeping a UD matrix of dimension k x l, and the Vprime matrix of dimension l x m. Each column of Vprime now contains l principal component weights, which can be used to reconstruct the functional output.
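
The dimensions and the truncation described above can be checked with base R's svd(). This is an illustrative sketch on simulated data, not the package's internal code; the variable names are chosen here for illustration.

```r
# Sketch only: truncated SVD of a k x m output matrix using base R.
set.seed(1)
k <- 51   # length of each functional observation
m <- 50   # number of observations (columns of Z)
x <- seq(-5, 5, length.out = k)
Z <- sapply(seq_len(m), function(i) sin(x) + i + rnorm(k, sd = 0.01))

s <- svd(Z)
U <- s$u            # k x r
D <- diag(s$d)      # r x r, singular values on the diagonal
Vprime <- t(s$v)    # r x m
print(dim(U)); print(dim(D)); print(dim(Vprime))

# Keep l < r singular values.
l <- 2
UD_l <- U[, 1:l] %*% D[1:l, 1:l]          # k x l
Vprime_l <- Vprime[1:l, , drop = FALSE]   # l x m, one column of weights per observation

# Reconstruct the output from the retained weights.
Z_approx <- UD_l %*% Vprime_l
print(max(abs(Z - Z_approx)))
```

Here the simulated output is a shared curve plus an observation-specific offset and a little noise, so two components are expected to capture nearly all of the structure.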

Value

an object of class gp.list if Z has more than 1 column, otherwise an object of class gp

Note

The random number seed is 0 by default, but should be randomly set by the user.

In some situations, especially for noiseless data, it may be desirable to force a nugget term in order to make the variance-covariance matrix of the Gaussian process more stable; this can be done by setting the argument min.nugget.
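
The stabilizing effect of a nugget can be seen in a small base R sketch that compares condition numbers with and without a diagonal term. The Gaussian correlation function and the nugget value 1e-6 are assumptions chosen for illustration, not values taken from the package.

```r
# Sketch only: effect of a nugget on the conditioning of a correlation matrix.
x <- seq(0, 1, length.out = 20)

# ASSUMPTION: Gaussian correlation between closely spaced design points.
R <- exp(-outer(x, x, "-")^2)

cond_no_nugget   <- kappa(R, exact = TRUE)
cond_with_nugget <- kappa(R + diag(1e-6, length(x)), exact = TRUE)

print(cond_no_nugget)    # very large: the matrix is numerically singular
print(cond_with_nugget)  # much smaller once the diagonal is inflated
```

Adding the nugget bounds the smallest eigenvalue away from zero, which is why a small forced nugget can make inversion of the variance-covariance matrix more reliable for noiseless data.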

If fitting multiple Gaussian processes, the arguments min.nugget and nugget apply to all Gaussian processes being fit.

In some cases, the variance-covariance matrix is stable in C but not stable in R. When this happens, this function will attempt to impose a minimum value for the nugget term, and this will be reported. However, the user is encouraged to refit the GP and manually set the argument min.nugget in mlegp.

When fitting Gaussian processes to principal component weights, a minimum of two principal component weights must be used.

Author(s)

Garrett M. Dancik dancikg@nsula.edu

References

Santner, T.J., Williams, B.J., Notz, W., 2003. The Design and Analysis of Computer Experiments (New York: Springer).

Heitmann, K., Higdon, D., Nakhleh, C., Habib, S., 2006. Cosmic Calibration. The Astrophysical Journal, 646, 2, L1-L4.

http://users.nsula.edu/dancikg/mlegp/

See Also

createGP for details of the gp object; gp.list for details of the gp.list object; mlegp-svd-functions for details on fitting Gaussian processes to high-dimensional data using principal component weights; the L-BFGS method uses C code written by Naoaki Okazaki (http://www.chokkan.org/software/liblbfgs)

Examples


###### fit a single Gaussian process ######
x = -5:5; y1 = sin(x) + rnorm(length(x),sd=.1)
fit1 = mlegp(x, y1)

## summary and diagnostic plots ##
summary(fit1)
plot(fit1)

###### fit multiple Gaussian processes to multiple observations ######
x = -5:5 
y1 = sin(x) + rnorm(length(x),sd=.1)
y2 = sin(x) + 2*x + rnorm(length(x), sd = .1)
fitMulti = mlegp(x, cbind(y1,y2))

## summary and diagnostic plots ##
summary(fitMulti)
plot(fitMulti)

###### fit multiple Gaussian processes using principal component weights ######

## generate functional output ##
x = seq(-5,5,by=.2)
p = 1:50
y = matrix(0,length(p), length(x))
for (i in p) {
        y[i,] = sin(x) + i + rnorm(length(x), sd  = .01)
}

## we now have 50 functional observations (each of length 51) ##
for (i in p) {
        plot(x,y[i,], type = "l", col = i, ylim = c(min(y), max(y)))
        par(new=TRUE)
}

## fit GPs to the two most important principal component weights ##
numPCs = 2
fitPC = mlegp(p, t(y), PC.num = numPCs)
plot(fitPC) ## diagnostics

## reconstruct the output Y = UDV'
Vprime = matrix(0,numPCs,length(p))
Vprime[1,] = predict(fitPC[[1]])
Vprime[2,] = predict(fitPC[[2]])

predY = fitPC$UD%*%Vprime
m1 = min(y[39,], predY[,39])
m2 = max(y[39,], predY[,39])

plot(x, y[39,], type="l", lty = 1, ylim = c(m1,m2), ylab = "original y" )
par(new=TRUE)
plot(x, predY[,39], type = "p", col = "red", ylim = c(m1,m2), ylab = "predicted y" )

[Package mlegp version 2.2.6 Index]