svyrecvar {survey}    R Documentation

Variance estimation for multistage surveys

Description

Compute the variance of a total under multistage sampling, using a recursive descent algorithm.

Usage

svyrecvar(x, clusters, stratas, fpcs, postStrata = NULL,
          lonely.psu = getOption("survey.lonely.psu"),
          one.stage = getOption("survey.ultimate.cluster"))

Arguments

x            Matrix of data or estimating functions
clusters     Data frame or matrix with cluster ids for each stage
stratas      Strata for each stage
fpcs         Information on population and sample size for each stage, created by as.fpc
postStrata   Post-stratification information as created by postStratify or calibrate
lonely.psu   How to handle strata with a single PSU
one.stage    If TRUE, compute a one-stage (ultimate-cluster) estimator

Details

The main use of this function is to compute the variance of the sum of a set of estimating functions under multistage sampling. The sampling is assumed to be simple or stratified random sampling within clusters at each stage except perhaps the last stage. The variance of a statistic is computed from the variance of estimating functions as described by Binder (1983).

Use one.stage=TRUE for compatibility with other software that does not perform multi-stage calculations, and set options(survey.ultimate.cluster=TRUE) to make this the default.

The idea of a recursive algorithm is due to Bellhouse (1985). Texts such as Cochran (1977) and Sarndal et al. (1991) describe the decomposition of the variance into a single-stage between-cluster estimator and a within-cluster estimator; here that decomposition is applied recursively at each stage.
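The recursion can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes a single variable, no stratification, and simple random sampling without replacement at every stage, and it contributes zero for any stage with only one sampled unit. The function recursive_var and its arguments wx, ids and popsizes are hypothetical names introduced for this illustration.

```r
# Sketch only: unstratified, one variable, SRS without replacement at each stage.
# wx       : weighted observations (value times sampling weight)
# ids      : data frame with one column of cluster ids per stage
# popsizes : data frame of the same shape giving the population count at each stage
recursive_var <- function(wx, ids, popsizes) {
  cl <- as.character(ids[[1]])
  totals <- tapply(wx, cl, sum)      # weighted total of each sampled unit at this stage
  n <- length(totals)
  N <- popsizes[[1]][1]              # population count at this stage
  f <- n / N                         # sampling fraction at this stage
  # between-unit term with finite population correction (zero if only one unit)
  v <- if (n > 1) (1 - f) * n / (n - 1) * sum((totals - mean(totals))^2) else 0
  if (ncol(ids) > 1) {
    # within-unit terms: recurse into each sampled unit, scaled by f
    for (g in unique(cl)) {
      idx <- cl == g
      v <- v + f * recursive_var(wx[idx],
                                 ids[idx, -1, drop = FALSE],
                                 popsizes[idx, -1, drop = FALSE])
    }
  }
  v
}

# Toy data: 2 of 4 PSUs sampled; 2 of 3 units in PSU A and 2 of 4 in PSU B
y        <- c(1, 3, 2, 6)
ids      <- data.frame(psu = c("A", "A", "B", "B"),
                       ssu = c("a1", "a2", "b1", "b2"))
popsizes <- data.frame(N1 = c(4, 4, 4, 4), N2 = c(3, 3, 4, 4))
w        <- (4 / 2) * (popsizes$N2 / 2)   # inverse inclusion probabilities
wx       <- y * w

v_two <- recursive_var(wx, ids, popsizes)                  # both stages
v_one <- recursive_var(wx, ids[, 1, drop = FALSE],
                       popsizes[, 1, drop = FALSE])        # first stage only
```

In this toy example the two-stage result exceeds the first-stage-only result by the within-cluster contributions, each scaled by the first-stage sampling fraction.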

If one.stage is a positive integer, it specifies the number of stages of sampling to use in the recursive estimator.
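As a usage sketch of the one.stage setting, the following compares the standard error of a mean under the multistage and the ultimate-cluster estimators by switching the survey.ultimate.cluster option. It assumes the survey package is installed and reuses the apiclus2 data from the Examples; the object names se_multi and se_one are introduced here for illustration only.

```r
library(survey)
data(api)

# two-stage cluster sample with population sizes supplied at both stages
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# multistage (recursive) variance; the previous option value is saved in old
old <- options(survey.ultimate.cluster = FALSE)
se_multi <- SE(svymean(~api00, dclus2))

# one-stage (ultimate-cluster) variance
options(survey.ultimate.cluster = TRUE)
se_one <- SE(svymean(~api00, dclus2))

# restore the previous setting
options(old)
```

Because the multistage estimator adds non-negative within-cluster terms to the first-stage term, se_one should not exceed se_multi for this design.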

Value

A covariance matrix

Note

A simple set of finite population corrections will only be exactly correct when each successive stage uses simple or stratified random sampling without replacement. A correction under general unequal probability sampling (e.g., PPS) would require joint inclusion probabilities (or, at least, sampling probabilities for units not included in the sample), information that is not generally available.

For a PPS survey, one option is to treat the survey as sampled with replacement by omitting the fpc argument. This appears to be the most widely used approach in other software. Another option is to treat the survey as if it were stratified, grouping together units with similar sampling probabilities.
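The with-replacement approximation can be sketched in base R as a between-PSU variance with no finite population correction. This is an illustration under simplifying assumptions (a single variable, no stratification), not the package's code; wr_var, wx and psu are hypothetical names.

```r
# Sketch: with-replacement variance of an estimated total.
# wx  : weighted observations (value times sampling weight)
# psu : first-stage cluster id for each observation
wr_var <- function(wx, psu) {
  totals <- tapply(wx, as.character(psu), sum)   # weighted total per PSU
  n <- length(totals)
  # between-PSU variability only; no finite population correction is applied
  n / (n - 1) * sum((totals - mean(totals))^2)
}

wx   <- c(3, 9, 8, 24)
psu  <- c("A", "A", "B", "B")
v_wr <- wr_var(wx, psu)
```

Only the first-stage cluster ids enter the calculation, so sampling probabilities at later stages affect the result only through the weights.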

References

Bellhouse, D. R. (1985). Computing methods for variance estimation in complex surveys. Journal of Official Statistics, 1(3).

Binder, D. A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.

Cochran, W. (1977). Sampling Techniques. 3rd edition. Wiley.

Sarndal, C.-E., Swensson, B., and Wretman, J. (1991). Model Assisted Survey Sampling. Springer.

See Also

svrVar for replicate weight designs

svyCprod for a description of how variances are estimated at each stage

Examples

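data(mu284)
# two-stage sample: id1 and id2 identify the units at each stage,
# and fpc supplies the corresponding population sizes n1 and n2
dmu284 <- svydesign(id = ~id1 + id2, fpc = ~n1 + n2, data = mu284)
svytotal(~y1, dmu284)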

data(api)
# two-stage cluster sample
dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)
summary(dclus2)
svymean(~api00, dclus2)
svytotal(~enroll, dclus2,na.rm=TRUE)

# two-stage, treated as sampled with replacement (no fpc supplied)
dclus2wr<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
summary(dclus2wr)
svymean(~api00, dclus2wr)
svytotal(~enroll, dclus2wr,na.rm=TRUE)


[Package survey version 3.6-13 Index]