metrics {monomvn} | R Documentation |
These functions calculate the root-mean-squared-error, the expected log likelihood, and the Kullback-Leibler (KL) divergence (a.k.a. distance) between two multivariate normal (MVN) distributions, each described by its mean vector and covariance matrix.
rmse.muS(mu1, S1, mu2, S2)
Ellik.norm(mu1, S1, mu2, S2, quiet=FALSE)
kl.norm(mu1, S1, mu2, S2, quiet=FALSE, symm=FALSE)
mu1 |
mean vector of first (estimated) MVN |
S1 |
covariance matrix of first (estimated) MVN |
mu2 |
mean vector of second (true, baseline, or comparator) MVN |
S2 |
covariance matrix of second (true, baseline, or comparator) MVN |
quiet |
when FALSE (default), gives a warning if
the accuracy package cannot be loaded to deal with (possibly)
non-positive definite S1 and S2 |
symm |
when TRUE a symmetrized version of the
KL divergence is used; see the note below |
The root-mean-squared-error is calculated between the entries of the mean vectors, and the upper-triangular part of the covariance matrices (including the diagonal).
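The description above can be sketched in base R as follows. The helper name rmse_sketch is hypothetical, and pooling the mean entries and the covariance entries into a single average is an assumption about how the two sets of entries are combined; it is not taken from the package source.

```r
# Hypothetical helper, not the package source: pool the differences of the
# mean entries and of the upper-triangular covariance entries (diagonal
# included), then take the root of their mean square.
rmse_sketch <- function(mu1, S1, mu2, S2) {
  ut <- upper.tri(S1, diag = TRUE)      # logical mask, diagonal included
  d <- c(mu1 - mu2, S1[ut] - S2[ut])    # every entry that is compared
  sqrt(mean(d^2))
}

mu_a <- c(0, 0); S_a <- diag(2)
mu_b <- c(1, 0); S_b <- diag(c(2, 1))
rmse_sketch(mu_a, S_a, mu_b, S_b)
```

Only the upper triangle is used because covariance matrices are symmetric, so the lower triangle would count every off-diagonal difference twice.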
The KL divergence is given by the formula:
0.5 (log(|S1|/|S2|) + tr(inv(S1)S2) + t(mu1-mu2)inv(S1)(mu1-mu2) - N)
where N is length(mu1), which must agree with the dimensions of the other parameters.
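As a rough illustration, the closed form for the KL divergence between two MVNs can be evaluated with base R alone. The helper kl_sketch is hypothetical and is not the package implementation: it assumes S1 and S2 are symmetric positive definite matrices, uses the standard closed form in which both the trace and the quadratic form are weighted by inv(S1), and does none of the stabilization that kl.norm performs.

```r
# Hypothetical helper, not the package source. Assumes S1 and S2 are
# symmetric positive definite matrices, so both log-determinants are finite.
kl_sketch <- function(mu1, S1, mu2, S2) {
  N <- length(mu1)
  stopifnot(length(mu2) == N, all(dim(S1) == N), all(dim(S2) == N))
  # log-determinants, computed on the log scale to avoid overflow
  ld1 <- as.numeric(determinant(S1, logarithm = TRUE)$modulus)
  ld2 <- as.numeric(determinant(S2, logarithm = TRUE)$modulus)
  S1i <- solve(S1)                        # inv(S1)
  d <- mu1 - mu2
  tr <- sum(diag(S1i %*% S2))             # tr(inv(S1) S2)
  qf <- as.numeric(t(d) %*% S1i %*% d)    # quadratic form in the mean difference
  0.5 * (ld1 - ld2 + tr + qf - N)
}

mu_a <- c(0, 0); S_a <- diag(2)
mu_b <- c(1, 0); S_b <- diag(c(2, 1))
kl_sketch(mu_a, S_a, mu_b, S_b)   # one direction
kl_sketch(mu_b, S_b, mu_a, S_a)   # swapped arguments give a different value
```

Swapping the argument order changes the result, which is the asymmetry that the symm argument is meant to address.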
The expected log likelihood can be formulated in terms of the KL divergence. That is, the expected log likelihood of data simulated from the normal distribution with parameters mu2 and S2 under the estimated normal with parameters mu1 and S1 is given by
-0.5 ln((2 pi e)^N |S2|) - kl.norm(mu1, S1, mu2, S2).
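The relationship above (negative entropy of the second MVN minus the KL divergence) can be checked numerically with a small base R sketch. The helper ellik_sketch is hypothetical, assumes positive definite matrix inputs, and computes the KL term with the standard closed form rather than by calling kl.norm.

```r
# Hypothetical helper, not the package source. Assumes S1 and S2 are
# symmetric positive definite matrices.
ellik_sketch <- function(mu1, S1, mu2, S2) {
  N <- length(mu1)
  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
  S1i <- solve(S1)
  d <- mu1 - mu2
  tr <- sum(diag(S1i %*% S2))
  qf <- as.numeric(t(d) %*% S1i %*% d)
  kl <- 0.5 * (logdet(S1) - logdet(S2) + tr + qf - N)
  # -0.5 ln((2 pi e)^N |S2|) is the negative entropy of the second MVN
  -0.5 * (N * log(2 * pi * exp(1)) + logdet(S2)) - kl
}

# Identical standard normals in one dimension: the KL term vanishes and the
# result reduces to -0.5 * log(2 * pi * e).
ellik_sketch(0, matrix(1), 0, matrix(1))
```

Expanding the two terms shows that log|S2| and N cancel, leaving -0.5 (N ln(2 pi) + ln|S1| + tr(inv(S1)S2) + t(mu1-mu2)inv(S1)(mu1-mu2)), which is the expectation of the log density of the first MVN taken directly.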
The sechol function from the accuracy package is used to decompose (possibly) non-positive definite S1 and/or S2 to give more stable and robust calculations in the face of numerical instabilities that might occur for larger problems.
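To show why a triangular decomposition helps, the sketch below obtains both the log-determinant and the inverse from a single factorization. It uses the ordinary base R chol function, not sechol, so it only works for strictly positive definite input; the helper name chol_logdet_inv is hypothetical.

```r
# Assumes S is symmetric positive definite. Base R chol() stops with an
# error otherwise, which is the situation a generalized decomposition is
# meant to handle.
chol_logdet_inv <- function(S) {
  R <- chol(S)                            # upper-triangular, S = t(R) %*% R
  list(logdet = 2 * sum(log(diag(R))),    # log|S| from the factor's diagonal
       inv = chol2inv(R))                 # inv(S) without forming |S| itself
}

S <- matrix(c(4, 2, 2, 3), 2)
out <- chol_logdet_inv(S)
out$logdet
out$inv %*% S
```

Summing logs of the diagonal avoids forming the determinant itself, which can underflow or overflow as the dimension grows.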
The expected log likelihood is a real number. The RMSE is a non-negative real number. The KL divergence is a non-negative real number measuring the distance between the two normal distributions; it is zero when the two distributions are identical.
The KL divergence is not symmetric. Therefore, in general,
kl.norm(mu1,S1,mu2,S2) != kl.norm(mu2,S2,mu1,S1).
A symmetric version can be constructed from
0.5 * (kl.norm(mu1,S1,mu2,S2) + kl.norm(mu2,S2,mu1,S1))
or obtained by using symm = TRUE.
Robert B. Gramacy bobby@statslab.cam.ac.uk
http://www.statslab.cam.ac.uk/~bobby/monomvn.html
mu1 <- rnorm(5)
s1 <- matrix(rnorm(100), ncol=5)
S1 <- t(s1) %*% s1

mu2 <- rnorm(5)
s2 <- matrix(rnorm(100), ncol=5)
S2 <- t(s2) %*% s2

## RMSE
rmse.muS(mu1, S1, mu2, S2)

## expected log likelihood
Ellik.norm(mu1, S1, mu2, S2)

## KL is not symmetric
kl.norm(mu1, S1, mu2, S2)
kl.norm(mu2, S2, mu1, S1)

## symmetric version
kl.norm(mu2, S2, mu1, S1, symm=TRUE)