stratumStructure {optmatch} | R Documentation |
Tabulate treatment:control ratios occurring in matched sets, and the frequency of their occurrence.
stratumStructure(stratum,trtgrp=NULL)
stratum | Matched strata, as returned by fullmatch or pairmatch.
trtgrp | Dummy variable for treatment group membership. (Not required if stratum is an optmatch object, as returned by fullmatch or pairmatch.)
A table showing frequency of occurrence of those treatment:control
ratios that occur.
The ‘effective sample size’ of the stratification, in matched
pairs, is given as an attribute of the table, named
‘comparable.num.matched.pairs’; see Note.
For comparing treatment and control groups both of size 10,
say, a stratification consisting of two strata, one with 9 treatments
and 1 control and the other with 1 treatment and 9 controls, has a
smaller ‘effective sample size’, intuitively, than a stratification
into 10 matched pairs, despite the fact that both contain 20 subjects
in total. stratumStructure
first summarizes this aspect of the structure of the stratification it
is given, then goes on to identify one number as the stratification's
effective sample size. The ‘comparable.num.matched.pairs’ attribute
returned by stratumStructure is the sum of harmonic means of the sizes of
the treatment and control subgroups of each stratum, a general way of
calibrating such differences as well as differences in the number of
subjects contained in a stratification. For example, by this metric
the 9:1, 1:9 stratification is comparable to 3.6 matched pairs.
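The arithmetic behind that figure can be checked by hand in base R. This is a minimal sketch: effective_pairs is a hypothetical helper written for illustration, not an optmatch function, and the stratum sizes are those of the example in this Note.

```r
# Hypothetical helper (not part of optmatch): the sum over strata of the
# harmonic mean of the treatment count and the control count.
effective_pairs <- function(n.t, n.c) {
  h <- 1 / (0.5 / n.t + 0.5 / n.c)
  sum(h)
}

# Two strata: 9 treatments with 1 control, and 1 treatment with 9 controls.
lopsided <- effective_pairs(n.t = c(9, 1), n.c = c(1, 9))

# Ten matched pairs: each stratum has 1 treatment and 1 control.
ten_pairs <- effective_pairs(n.t = rep(1, 10), n.c = rep(1, 10))

print(c(lopsided = lopsided, ten_pairs = ten_pairs))
```

Each lopsided stratum contributes a harmonic mean of 1.8, so the two together count as 3.6 matched pairs, against 10 for the pair match.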
Why should effective sample size be calculated this way? The phrase ‘effective sample size’ suggests the observations are taken to be similar in information content. If they are modeled as random variables, this suggests assuming that they have the same variance, sigma^2, conditional on the stratum in which they reside. If that is the case, and if also treatment and control observations differ in expectation by a constant that is the same for each stratum, then it can be shown that the optimum weights with which to combine treatment-control contrasts across strata, s, are proportional to the stratum-wise harmonic means of treatment and control counts, h[s] = 1/(0.5/n.t[s] + 0.5/n.c[s]) (Kalton, 1968).
The thus-weighted average of contrasts then has variance 2*sigma^2/sum(h). This motivates the use of sum(h) as a measure of effective sample size. Since for a matched pair s, h[s]=1, sum(h) can be thought of as the number of matched pairs needed to attain comparable precision.
(Alternatively, the stratification might be taken into account when comparing treatment and control groups using fixed effects in an ordinary least-squares regression, as in Hansen (2004). This leads to the same result. A still different formulation, in which outcomes are not modeled as random variables but assignment to treatment or control is, again suggests the same weighting across strata, and a measure of precision featuring sum(h) in a similar role; see Hansen and Bowers (2008).)
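The variance claim can be verified numerically under the stated assumptions (independent observations with a common within-stratum variance). The following is a sketch only: the stratum sizes and the value of v, which stands for that common variance, are made up for illustration, and nothing here calls optmatch.

```r
# Illustrative stratum sizes and common outcome variance (assumed values).
n.t <- c(9, 1, 2, 1)
n.c <- c(1, 9, 3, 1)
v <- 4

# Stratum-wise harmonic means of treatment and control counts.
h <- 1 / (0.5 / n.t + 0.5 / n.c)

# Variance of each stratum's difference of treatment and control means.
contrast_var <- v * (1 / n.t + 1 / n.c)

# Variance of the average of independent contrasts, weights proportional to h.
w <- h / sum(h)
weighted_var <- sum(w^2 * contrast_var)

# Closed form: twice the common variance divided by sum(h).
closed_form <- 2 * v / sum(h)

# For comparison: equal weights across strata give a larger variance.
equal_var <- sum((1 / length(h))^2 * contrast_var)

print(c(weighted = weighted_var, closed = closed_form, equal = equal_var))
```

The harmonic-mean weights are simply inverse-variance weights here, since each contrast has variance 2*v/h[s]; that is why they minimize the variance of the combined estimate.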
Ben Hansen
Kalton, G. (1968), ‘Standardization: A technique to control for extraneous variables’, Applied Statistics, 17, 118–136.
Hansen, B.B. (2004), ‘Full Matching in an Observational Study of Coaching for the SAT’, Journal of the American Statistical Association, 99, 609–618.
Hansen, B.B. and Bowers, J. (2008), ‘Covariate balance in simple, stratified and clustered comparative studies’, Statistical Science, 23, to appear.
library(optmatch)
data(plantdist)
plantsfm <- fullmatch(plantdist)  # A full match with unrestricted
                                  # treatment-control balance
# A full match restricted to between 2 and 3 controls per treatment
plantsfm1 <- fullmatch(plantdist, min.controls=2, max.controls=3)
stratumStructure(plantsfm)
stratumStructure(plantsfm1)