mlelcd {LogConcDEAD} | R Documentation |
Uses Shor's r-algorithm to compute the maximum
likelihood estimator of a log-concave density based on an
i.i.d. sample. The estimator is uniquely determined by its value at the data
points. The output is an object of class "LogConcDEAD", which contains all the information needed to plot the estimator using the plot method, or to evaluate it using the function dlcd.
mlelcd(x, w=rep(1/nrow(x),nrow(x)), y=initialy(x), verbose=-1, alpha=5, c=1,
       sigmatol=10^-8, integraltol=10^-4, ytol=10^-4, stepscale=5.1,
       stepscale2=2, stepscale3=1.5, stepscale4=1.05, desiredsize=3.3, Jtol=0.001)

lcd.mle(x, w=rep(1/nrow(x),nrow(x)), y=initialy(x), verbose=-1, alpha=5, c=1,
        sigmatol=10^-8, integraltol=10^-4, ytol=10^-4, stepscale=5.1,
        stepscale2=2, stepscale3=1.5, stepscale4=1.05, desiredsize=3.3, Jtol=0.001)
x |
Data in R^d, in the form of an n x d numeric matrix |
w |
Vector of weights w_i such that the computed estimator
maximizes w[1] log f(x[1,]) + ... + w[n] log f(x[n,]) |
y |
Vector giving starting point for the r-algorithm. If none given, a kernel estimate is used. |
verbose |
|
alpha |
Scalar parameter for SolvOpt |
c |
Scalar giving starting step size |
sigmatol |
Real-valued scalar giving one of the stopping
criteria: Relative change in sigma must be below
sigmatol for algorithm to terminate. (See Details) |
ytol |
Real-valued scalar giving one of the stopping criteria: Relative change in y must be
below ytol for algorithm to terminate. (See Details) |
integraltol |
Real-valued scalar giving one of the stopping
criteria: |int exp(h_y(x)) dx - 1| must be below
integraltol for algorithm to terminate. (See Details) |
stepscale, stepscale2, stepscale3, stepscale4, desiredsize |
Scalar parameters for SolvOpt. Changing these is not recommended. |
Jtol |
Parameter controlling when Taylor expansion is used in computing the function sigma |
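As an illustration of the x and w arguments, the sketch below collapses repeated rows of a small, made-up data matrix and assigns each distinct point a weight equal to its relative frequency, so that the weights sum to one as the default 1/n weights do. It uses base R only; the data and the variable names xx and key are hypothetical.

```r
# Hypothetical rounded data in R^2 containing repeated rows.
x <- matrix(c(0.1, 0.2,
              0.1, 0.2,
              0.5, 0.7,
              0.9, 0.3,
              0.9, 0.3,
              0.9, 0.3), ncol = 2, byrow = TRUE)

# One string key per row, used to detect duplicated observations.
key <- apply(x, 1, paste, collapse = "_")

# Keep one row per distinct point, in order of first appearance.
xx <- x[!duplicated(key), , drop = FALSE]

# Weight each distinct point by its relative frequency in the sample.
counts <- table(factor(key, levels = unique(key)))
w <- as.numeric(counts) / nrow(x)

# The weights sum to one, like the default rep(1/nrow(x), nrow(x)).
stopifnot(isTRUE(all.equal(sum(w), 1)))
```

The pair xx and w could then be supplied as mlelcd(xx, w = w) in place of the raw matrix with equal weights.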
The log-concave maximum likelihood density estimator based on data X_1, ..., X_n is the function that maximizes
(w_1 log f(X_1) + ... + w_n log f(X_n))
subject to the constraint that f is log-concave. For i.i.d. data, the weights w_i should be 1/n for each i.
The logarithm of this estimator is a function of the form h_y for some y in R^n, where
h_y(x) = inf{h(x): h concave, h(x_i) >= y_i for i = 1, ..., n}.
Functions of this form may equivalently be specified by dividing C_n, the convex hull of the data, into simplices C_j for j in J (triangles in 2d, tetrahedra in 3d etc), and setting
f(x) = exp{b_j^T x - beta_j}
for x in C_j, and f(x) = 0 for x not in C_n.
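In one dimension the definition of h_y reduces to something easy to compute: h_y is the least concave majorant of the points (x_i, y_i), that is, the upper concave hull joined by straight lines. The base-R sketch below illustrates this special case only. The function name concave_majorant is hypothetical and is not part of LogConcDEAD, and the x values are assumed to be distinct.

```r
# Least concave majorant of the points (x[i], y[i]), evaluated at `at`.
# Assumes distinct x values. Points outside range(x) get -Inf, so that
# exp(h_y) is zero outside the convex hull of the data.
concave_majorant <- function(x, y, at) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  keep <- integer(0)                      # indices of the hull vertices
  for (i in seq_along(x)) {
    # Drop the last vertex while it lies on or below the chord joining
    # the vertex before it to the new point i.
    while (length(keep) >= 2) {
      a <- keep[length(keep) - 1]
      b <- keep[length(keep)]
      cross <- (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
      if (cross >= 0) keep <- keep[-length(keep)] else break
    }
    keep <- c(keep, i)
  }
  # Linear interpolation between hull vertices; approx() gives NA outside.
  h <- approx(x[keep], y[keep], xout = at)$y
  h[is.na(h)] <- -Inf
  h
}

x <- c(0, 1, 2, 3)
y <- c(0, -1, 1, 0)            # the value at x = 1 lies below the hull
concave_majorant(x, y, at = c(1, 2, 4))
# 0.5  1.0 -Inf
```

At x = 1 the returned value 0.5 exceeds y = -1, which shows that h_y(x_i) >= y_i need not hold with equality at every data point.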
This function uses Shor's r-algorithm (an iterative subgradient-based procedure) to minimize over vectors y in R^n the function
sigma(y) = -1/n (y_1 + ... + y_n) + int exp(h_y(x)) dx.
This is equivalent to finding the log-concave maximum likelihood estimator, as demonstrated in Cule, Samworth and Stewart (2008).
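To make the objective concrete, the following base-R sketch evaluates sigma in one dimension, taking the integrand to be exp(h_y(x)). It rests on two simplifying assumptions that are stated here rather than taken from the package: the x values are sorted and distinct, and y is already concave along x, so that h_y is just the linear interpolant of the points (x_i, y_i). The name sigma_1d is hypothetical.

```r
# sigma(y) = -mean(y) + integral of exp(h_y) in one dimension.
# Assumes x is strictly increasing and y is concave along x, so that
# h_y is the piecewise-linear interpolant of (x, y).
sigma_1d <- function(x, y) {
  stopifnot(!is.unsorted(x, strictly = TRUE), length(x) == length(y))
  width <- diff(x)
  u <- y[-length(y)]                 # h_y at the left end of each interval
  v <- y[-1]                         # h_y at the right end of each interval
  d <- v - u
  # Exact integral of exp(linear function) over each interval:
  # width * (exp(v) - exp(u)) / (v - u), with limit width * exp(u) if v == u.
  seg <- ifelse(d == 0, width * exp(u), width * (exp(v) - exp(u)) / d)
  -mean(y) + sum(seg)
}

# Example: y taken from the standard normal log-density, which is concave.
x <- seq(-3, 3, length.out = 61)
y <- dnorm(x, log = TRUE)
sigma_1d(x, y)
```

The second term of sigma is the total mass of exp(h_y); the integraltol stopping criterion described under Arguments requires this mass to be close to 1 before the algorithm terminates.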
An implementation of Shor's r-algorithm based on SolvOpt is used.
Computing sigma makes use of the qhull library, adapted from the R implementation in geometry. Code from geometry is copied here as it is not currently possible to use compiled code from another package. For points not in general position, this requires a Taylor expansion of sigma, discussed in Cule and Dümbgen (2008).
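A one-dimensional analogue may help explain why a series expansion is useful. The integral of exp of a linear function over a unit interval with endpoint values u and v is (exp(v) - exp(u)) / (v - u), which loses accuracy through cancellation when u and v nearly coincide. The sketch below switches to a short Taylor series when |v - u| falls below a threshold that plays a role similar to Jtol. This is an illustration under that assumption, not the package's multivariate code; the name J1 and the number of series terms are hypothetical choices.

```r
# Integral over [0, 1] of exp((1 - t) * u + t * v) dt for scalar u and v.
# For |v - u| < tol, use exp(u) * (1 + d/2 + d^2/6 + d^3/24) with d = v - u,
# the leading terms of the Taylor series of (exp(d) - 1) / d.
J1 <- function(u, v, tol = 0.001) {
  d <- v - u
  if (abs(d) < tol) {
    exp(u) * (1 + d / 2 + d^2 / 6 + d^3 / 24)
  } else {
    (exp(v) - exp(u)) / d
  }
}

J1(0, 0)        # equal endpoint values: the series branch returns exp(0) = 1
J1(0, 1)        # closed-form branch
J1(0, 1e-4)     # series branch
```

With tol = 0.001 the first omitted series term is of order d^4 / 120, so the truncation error in the series branch is far below double-precision rounding error.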
lcd.mle is deprecated, but retained for compatibility with previous versions.
An object of class "LogConcDEAD", with the following components:
x |
Data copied from input (may be reordered) |
w |
weights copied from input (may be reordered) |
logMLE |
vector of the log of the maximum likelihood estimate, evaluated at the observation points |
NumberOfEvaluations |
Vector containing the number of steps, number of function evaluations, and number of subgradient evaluations. If the SolvOpt algorithm fails, the first component will be an error code (<0). |
MinSigma |
Real-valued scalar giving minimum value of the objective function |
b |
matrix (see Details) |
beta |
vector (see Details) |
triang |
matrix containing final triangulation of the convex hull of the data |
verts |
matrix containing details of triangulation for use in dlcd |
vertsoffset |
matrix containing details of triangulation for use in dlcd |
chull |
Vector containing vertices of faces of the convex hull of the data |
outnorm |
matrix where each row is an outward-pointing normal vector for a face of the convex hull of the
data. The number of rows depends on the number of faces of the
convex hull. |
outoffset |
matrix where each row is a point on a face of
the convex hull of the data. The number of rows depends on the
number of faces of the convex hull. |
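Given the descriptions of outnorm and outoffset above, one natural use is to test whether a point lies inside the convex hull of the data, outside of which the estimated density is zero. The sketch below assumes that row k of outoffset is a point on the face whose outward normal is row k of outnorm. The two matrices are written out by hand for the unit square rather than taken from a fitted object, and the function name in_hull is hypothetical.

```r
# Hand-made stand-ins for the outnorm and outoffset components,
# describing the four faces of the unit square [0, 1]^2.
outnorm <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))            # outward normals
outoffset <- rbind(c(0, 0.5), c(1, 0.5), c(0.5, 0), c(0.5, 1))    # a point on each face

# A point p is inside the hull if it lies on the inner side of every face,
# i.e. (offset_k - p) . normal_k >= 0 for all faces k.
in_hull <- function(p, outnorm, outoffset, tol = 1e-12) {
  centred <- sweep(outoffset, 2, p)          # offset_k - p, row by row
  all(rowSums(centred * outnorm) >= -tol)
}

in_hull(c(0.5, 0.5), outnorm, outoffset)   # TRUE
in_hull(c(10, 0), outnorm, outoffset)      # FALSE
```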
For one-dimensional data, the active set algorithm of logcondens is faster, and may be preferred.
The authors gratefully acknowledge the assistance of Lutz Duembgen at the University of Bern for his insight into the objective function sigma.
Further references, including definitions and background material, may be found in Cule, Samworth and Stewart (2008).
Madeleine Cule mlc40@cam.ac.uk
Robert B. Gramacy
Richard Samworth
Barber, C. B., Dobkin, D. P., and Huhdanpaa, H. T. (1996) The Quickhull algorithm for convex hulls, ACM Trans. on Mathematical Software, 22(4), p. 469-483. http://www.qhull.org
Cule, M. L. and Dümbgen, L. (2008) On an auxiliary function for log-density estimation, University of Bern technical report. http://arxiv.org/abs/0807.4719
Cule, M. L., Samworth, R. J., and Stewart, M. I. (2008) Maximum likelihood estimation of a log-concave density, Submitted. http://arxiv.org/abs/0804.3989
Kappel, F. and Kuntsevich, A. V. (2000) An implementation of Shor's r-algorithm, Computational Optimization and Applications, 15. http://www.uni-graz.at/imawww/kuntsevich/solvopt/
Shor, N. Z. (1985) Minimization methods for nondifferentiable functions Springer-Verlag
geometry, logcondens, interplcd, plot.LogConcDEAD, interpmarglcd, rlcd, dlcd, dmarglcd
## Some simple normal data, and a few plots
x <- matrix(rnorm(200), ncol=2)
lcd <- mlelcd(x)
g <- interplcd(lcd)
par(mfrow=c(2,2), ask=TRUE)
plot(lcd, g=g, type="c")
plot(lcd, g=g, type="c", uselog=TRUE)
plot(lcd, g=g, type="i")
plot(lcd, g=g, type="i", uselog=TRUE)

## Some plots of marginal estimates
par(mfrow=c(1,1))
g.marg1 <- interpmarglcd(lcd, marg=1)
g.marg2 <- interpmarglcd(lcd, marg=2)
plot(lcd, marg=1, g.marg=g.marg1)
plot(lcd, marg=2, g.marg=g.marg2)

## generate some points from the fitted density
generated <- rlcd(100, lcd)
## column-wise mean of the generated sample (one value per coordinate)
genmean <- colMeans(generated)

## evaluate the fitted density
mypoint <- c(0, 0)
dlcd(mypoint, lcd, uselog=FALSE)
mypoint <- c(10, 0)
dlcd(mypoint, lcd, uselog=FALSE)

## evaluate the marginal density
dmarglcd(0, lcd, marg=1)
dmarglcd(1, lcd, marg=2)