pairmatch {optmatch} | R Documentation |
Given a treatment group, a larger control reservoir, and discrepancies between each treatment and control unit, finds a pairing of treatment units to controls that minimizes the sum of discrepancies.
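To make "minimizes the sum of discrepancies" concrete, here is a minimal base-R sketch that brute-forces a toy problem with two treatments and three controls. The matrix values and unit names are made up for illustration, and the exhaustive search is only a teaching device, not the method the package uses.

```r
# Toy discrepancies (made-up numbers): rows are treatments, columns are controls.
d <- matrix(c(1.0, 2, 6,
              1.5, 5, 7),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("t1", "t2"), c("c1", "c2", "c3")))

# Try every assignment of distinct controls to the two treatments
# and keep the one with the smallest total discrepancy.
best <- list(total = Inf, controls = NULL)
for (j1 in seq_len(ncol(d))) {
  for (j2 in seq_len(ncol(d))) {
    if (j1 == j2) next                      # a control can be used only once
    total <- d[1, j1] + d[2, j2]
    if (total < best$total) best <- list(total = total, controls = c(j1, j2))
  }
}

best$total                                          # smallest achievable sum
setNames(colnames(d)[best$controls], rownames(d))   # control chosen for each treatment
```

Note that giving t1 its closest control (c1) first would force t2 onto c2 for a total of 6, whereas the optimal pairing swaps them for a total of 3.5.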
pairmatch(distance, controls = 1, tol = 0.001)
distance | A matrix of nonnegative discrepancies, each indicating the permissibility and desirability of matching the unit corresponding to its row (a 'treatment') to the unit corresponding to its column (a 'control'); or a list of such matrices made using makedist. Finite discrepancies indicate permissible matches, with smaller discrepancies indicating more desirable matches. Matrix distance, or the matrix elements of distance, must have row and column names. |
controls | The number of controls to be matched to each treatment. |
tol | Tolerance; see fullmatch for details. |
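As a sketch of the kind of input the distance argument describes, the base-R code below builds a named discrepancy matrix by hand. The data frame, its column names (z, age) and the 10-year cutoff are hypothetical; makedist, mentioned above, is the package's own route to such matrices and is not used here.

```r
# Hypothetical data: 'z' marks treatment (1) vs control (0), 'age' is a covariate.
dat <- data.frame(z   = c(1, 1, 0, 0, 0),
                  age = c(30, 45, 29, 50, 44),
                  row.names = c("a", "b", "c", "d", "e"))
trt <- dat$z == 1

# Rows are treatments, columns are controls; entries are absolute age differences.
dist_mat <- abs(outer(dat$age[trt], dat$age[!trt], "-"))

# Row and column names are required, so take them from the data frame.
dimnames(dist_mat) <- list(rownames(dat)[trt], rownames(dat)[!trt])

# Mark matches more than 10 years apart as impermissible (non-finite).
dist_mat[dist_mat > 10] <- Inf
dist_mat
```

A matrix like dist_mat could then be passed as the distance argument.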
This is a wrapper to fullmatch; see its documentation for more information.
fullmatch tries to guess the order in which units would have been given in a data frame, and to order the factor that it returns accordingly. If the dimnames of distance, or the matrices it lists, are not simply row numbers of the data frame you're working with, then you should compare the names of fullmatch's output to your row names in order to be sure things are in the proper order. You can relieve yourself of these worries by using makedist to produce the distances, as it passes the ordering of units to fullmatch, which then uses it to order its outputs.
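The name comparison recommended above can be sketched in base R as follows. The data frame and the mock_match factor are hypothetical stand-ins for your data and for a real matching result; only the check and the reordering by name are the point.

```r
# Hypothetical data frame with meaningful row names.
dat <- data.frame(z = c(1, 0, 1, 0),
                  row.names = c("u1", "u2", "u3", "u4"))

# Stand-in for a matching result: a named factor whose order
# (treatments first, then controls) differs from the data frame's.
mock_match <- factor(c(u1 = "m.1", u3 = "m.2", u2 = "m.1", u4 = "m.2"))

# Check whether the names line up with the data frame rows.
same_order <- identical(names(mock_match), rownames(dat))
same_order

# Index by row name so each unit receives its own matched-set label.
dat$matched_set <- mock_match[rownames(dat)]
dat
```

Indexing by name rather than by position makes the assignment safe whichever order the result comes back in.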
The value of tol can have a substantial effect on computation time; with smaller values, computation takes longer. Not every tolerance can be met, and how small a tolerance is too small varies with the machine and with the details of the problem. If fullmatch can't guarantee that the tolerance is as small as the given value of argument tol, then matching proceeds but a warning is issued.
Primarily, a named vector of class c('optmatch', 'factor'). Elements of this vector correspond to members of the treatment and control groups in reference to which the matching problem was posed, and are named accordingly; the names are taken from the row and column names of distance. Each element of the vector is the concatenation of: (i) a character abbreviation of subclass.indices, if that argument was given, or the string 'm' if it was not; (ii) the string '.'; and (iii) a nonnegative integer or the string NA. In this last place, positive whole numbers indicate placement of the unit into a matched set, a number beginning with zero indicates a unit that was not matched, and NA indicates that all or part of the matching problem given to fullmatch was found to be infeasible.
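The label format described above can be decoded with a few lines of base R. The lab vector below is a hand-written mock that follows the description (prefix 'm', a '.', then a set number or NA). It is an assumption about what such output looks like, not actual output from the package.

```r
# Mock labels in the documented form; the unit names A, B, H, ... are invented.
lab <- c(A = "m.1", B = "m.2", H = "m.1", I = "m.2", J = "m.01", K = "m.NA")

# Keep only the part after the last ".".
suffix <- sub("^.*\\.", "", lab)

infeasible <- suffix == "NA"                          # problem found infeasible
unmatched  <- !infeasible & startsWith(suffix, "0")   # number beginning with zero
matched    <- !infeasible & !unmatched                # placed in a matched set

# List the members of each matched set.
sets <- split(names(lab)[matched], lab[matched])
sets
names(lab)[unmatched]
names(lab)[infeasible]
```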
Secondarily, fullmatch returns various data about the matching process and its result, stored as attributes of the named vector which is its primary output. In particular, the exceedances attribute gives upper bounds, not necessarily sharp, for the amount by which the sum of distances between matched units in the result of fullmatch exceeds the least possible sum of distances between matched units in a feasible solution to the matching problem given to fullmatch. Such a bound is also printed by print.optmatch.
Hansen, B.B. and Klopfer, S.O. (2006), ‘Optimal full matching and related designs via network flows’, Journal of Computational and Graphical Statistics, 15, 609–627.
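library(optmatch)
data(plantdist)
# Flag, among the 26 letter-named plants, those that label the rows of plantdist
pr <- logical(26)
pr[match(dimnames(plantdist)[[1]], LETTERS)] <- TRUE
# Pair each treatment (row) with one control (column), using the default controls = 1
plantspm <- pairmatch(plantdist)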