mahal.dist {optmatch} | R Documentation |
Calculates Mahalanobis distances between treatment and control observations on given variables, assembling them into a discrepancy matrix (or matrices) from which pairmatch() or fullmatch() can determine optimal matches.
mahal.dist(distance.fmla, data, structure.fmla = NULL, inverse.cov = NULL)
distance.fmla |
A formula with variables to be combined in the Mahalanobis distance on its right-hand side and the treatment variable on its left. |
data |
Data frame in which distance.fmla and (if specified) structure.fmla are to be evaluated. |
structure.fmla |
Optional formula specifying subclasses within which matching is to be performed. If omitted, no subclassification is done. If given, its left-hand side gives the treatment variable and its right-hand side gives the variables on which to stratify the sample prior to matching. |
inverse.cov |
Optional inverse covariance matrix of the variables to be combined into the Mahalanobis distance. If omitted, the function attempts to calculate it itself. |
Mahalanobis distance tracks the discrepancy between points on a number of given variables, after standardizing the variables and taking account of their covariances. It is best suited to variables whose joint distribution resembles a multivariate Normal.
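As an illustration of the definition only, the sketch below computes a Mahalanobis distance between two units by hand using base R. The data values are hypothetical, and the use of the plain sample covariance is an assumption: this page does not say which covariance estimate mahal.dist uses internally, nor whether it reports the squared or unsquared distance.

```r
# Toy data (hypothetical): two covariates measured on five units.
X <- cbind(date  = c(68.6, 67.3, 70.2, 69.1, 71.0),
           cum.n = c(14, 10, 3, 12, 7))

# Inverse of the sample covariance matrix (an assumed choice of estimate).
S_inv <- solve(cov(X))

# Squared distance between unit 1 and unit 2, written out from the definition:
# (x - y)' S^{-1} (x - y)
delta <- X[1, ] - X[2, ]
d_sq  <- as.numeric(t(delta) %*% S_inv %*% delta)

# Base R's mahalanobis() returns the same squared distance.
d_sq_check <- mahalanobis(X[1, ], center = X[2, ], cov = cov(X))

# Distance on the unsquared scale.
d <- sqrt(d_sq)
d
```

Multiplying by the inverse covariance is what standardizes the variables and accounts for their correlation, so a one-unit gap in date and a one-unit gap in cum.n are not weighted equally.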
The purpose of giving a structure.fmla argument is to speed up large problems. Variables appearing on its right-hand side are interacted to create the subclasses. If structure.fmla is given, its left-hand side is used to define the treatment and control groups, so nothing needs to be placed on the left-hand side of distance.fmla.
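To make the idea of interacting stratification variables concrete, here is a minimal base R sketch. The data frame dat and its columns z, pt and ne are hypothetical, and interaction() is used only to illustrate the concept; it is not claimed to be how the function builds its subclasses internally.

```r
# Hypothetical data: treatment indicator z and two stratifying variables.
dat <- data.frame(
  z  = c(1, 0, 0, 1, 0, 1, 0, 0),
  pt = c(0, 0, 0, 0, 1, 1, 1, 1),
  ne = c(0, 0, 1, 1, 0, 0, 1, 1)
)

# Crossing the right-hand-side variables of a formula such as z ~ pt + ne
# gives one subclass per observed combination of levels.
subclass <- interaction(dat$pt, dat$ne, drop = TRUE)

# Treated and control counts within each subclass.
counts <- table(subclass, treated = dat$z)
counts

# Number of treated-control pairs whose distance must be computed,
# with and without subclassification.
pairs_within  <- sum(counts[, "1"] * counts[, "0"])
pairs_overall <- sum(dat$z == 1) * sum(dat$z == 0)
```

Because distances are needed only for treated-control pairs inside the same subclass, pairs_within is never larger than pairs_overall, which is where the speed-up on large problems comes from.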
The function attempts to calculate the inverse covariance itself, so ordinarily you shouldn't need to give it one. If you'll be calling the function repeatedly, however, you can save time by computing and storing the inverse covariance once and passing it as the inverse.cov argument, rather than having it recomputed on every call.
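A stored inverse covariance might be prepared along the following lines. This is a sketch under stated assumptions: the data frame dat is hypothetical, and the plain sample covariance from cov() is assumed, which may differ from what the function computes by default. The row and column names are set to the variable names, following the pattern of the dimnames used for inverse.cov in the examples on this page.

```r
# Hypothetical data frame standing in for the matching sample.
dat <- data.frame(
  z     = c(1, 0, 0, 1, 0, 0),
  date  = c(68.6, 67.3, 70.2, 69.1, 71.0, 68.0),
  cum.n = c(14, 10, 3, 12, 7, 9)
)

vars <- c("date", "cum.n")

# Assumption: plain sample covariance of the matching variables.
icv <- solve(cov(dat[, vars]))

# Label rows and columns by variable name.
dimnames(icv) <- list(vars, vars)

# The stored matrix could then be reused across calls, e.g.
#   mahal.dist(z ~ date + cum.n, dat, inverse.cov = icv)
icv
```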
Object of class optmatch.dlist, which is suitable to be given as distance argument to fullmatch or pairmatch.
Specifically, a list of matrices, one for each subclass defined by the interaction of variables appearing on the right-hand side of structure.fmla. Each of these is a number of treatments by number of controls matrix of Mahalanobis distances.
The list also carries some metadata as attributes, data that is not of direct interest to the user but is useful to fullmatch() and pairmatch().
Ben B. Hansen
makedist, pscore.dist, fullmatch, pairmatch
library(optmatch)
data(nuclearplants)
mhd1 <- mahal.dist(pr~date+cum.n, nuclearplants)
lapply(mhd1, round)
attributes(mhd1)
fullmatch(mhd1)

##- Mahalanobis within subclasses defined by levels of pt
mhd2 <- mahal.dist(~date+cum.n, nuclearplants, pr~pt)
lapply(mhd2, round)
fullmatch(mhd2)

##- Trick mahal.dist into returning absolute differences on a scalar.
mhd3 <- mahal.dist(pr~date, nuclearplants,
                   inverse.cov=matrix(1,1,1,dimnames=list("date", "date")))
mhd3[[1]]

##- Matching within calipers of 3 years
fullmatch(mhd1/(mhd3<3))
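
The caliper idiom in the last example relies on ordinary R arithmetic: a logical comparison is treated as 1 or 0, so dividing by it leaves within-caliper distances unchanged and turns the rest into Inf. The sketch below shows this on plain matrices with hypothetical values. That the matching functions treat Inf as a forbidden pairing is an assumption not stated on this page.

```r
# Hypothetical 2 x 3 distance matrices (rows: treated, columns: controls).
d_mahal <- matrix(c(1.2, 0.4, 2.5, 0.9, 0.7, 3.1), nrow = 2,
                  dimnames = list(c("t1", "t2"), c("c1", "c2", "c3")))
d_years <- matrix(c(1, 4, 2, 0.5, 5, 2.5), nrow = 2,
                  dimnames = dimnames(d_mahal))

# (d_years < 3) is TRUE/FALSE, i.e. 1/0 in arithmetic, so the division
# keeps entries inside the caliper and sends the others to Inf.
d_caliper <- d_mahal / (d_years < 3)
d_caliper
```

One caveat of this idiom: a distance of exactly 0 outside the caliper would become NaN (0/0) rather than Inf.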