epi.clustersize {epiR}    R Documentation

Sample size for cluster-sample surveys

Description

Estimates the number of clusters to be sampled using a cluster-sample design.

Usage

epi.clustersize(p, b, rho, epsilon, conf.level = 0.95)

Arguments

p the estimated prevalence of disease in the population.
b the number of units to be sampled per cluster.
rho the intra-cluster correlation, a measure of the variation between clusters compared with the variation within clusters.
epsilon scalar, the acceptable absolute error.
conf.level scalar, defining the level of confidence in the computed result.

Value

A list containing the following:

clusters the estimated number of clusters to be sampled.
units the total number of units to be sampled.
design the design effect.

Note

The intra-cluster correlation (rho) is higher when the between-cluster variation is greater than the within-cluster variation. The design effect (D) depends on rho and b (the number of units sampled per cluster): D = 1 + rho * (b - 1), or equivalently rho = (D - 1) / (b - 1). Design effects of 2, 4, and 7 can be used to estimate rho when the intra-cluster correlation is thought to be low, medium, and high, respectively. A design effect of 7.5 should be used when the intra-cluster correlation is unknown.
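
The relationship between the design effect and rho can be sketched in a few lines of R. The helper names design_effect and rho_from_design are hypothetical and are not part of epiR, and the cluster size of 50 is only an illustrative choice.

```r
# Hypothetical helpers (not part of epiR) for the relationship in this Note.

# Design effect from the intra-cluster correlation and the cluster size:
# D = 1 + rho * (b - 1)
design_effect <- function(rho, b) 1 + rho * (b - 1)

# Inverse: the intra-cluster correlation implied by a design effect.
rho_from_design <- function(D, b) (D - 1) / (b - 1)

# rho implied by the low, medium and high design effects when b = 50
# (b = 50 is an illustrative assumption).
b <- 50
D <- c(low = 2, medium = 4, high = 7)
rho <- rho_from_design(D, b)
print(round(rho, 3))

# Converting back recovers the original design effects.
print(design_effect(rho, b))
```

With 50 units per cluster, the three design effects correspond to rho values of roughly 0.02, 0.06 and 0.12. The same design effect implies a larger rho when fewer units are sampled per cluster.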

References

Otte J, Gumm I (1997). Intra-cluster correlation coefficients of 20 infections calculated from the results of cluster-sample surveys. Preventive Veterinary Medicine 31: 147 - 150.

Bennett S, Woods T, Liyanage WM, Smith DL (1991). A simplified general method for cluster-sample surveys of health in developing countries. Rapport trimestriel de statistiques sanitaires mondiales 44: 98 - 106.

Examples

## The expected prevalence of disease in a population of cattle is 0.10.
## We wish to conduct a survey, sampling 50 animals per farm. No data
## are available to provide an estimate of rho, though we suspect
## the intra-cluster correlation for this disease to be relatively high.
## We wish to be 95% certain of being within 10% of the true population
## prevalence of disease. How many herds should be sampled?

## A design effect of 7 is used for a high intra-cluster correlation (see Note):
p <- 0.10
b <- 50
D <- 7
rho <- (D - 1) / (b - 1)
epi.clustersize(p = p, b = b, rho = rho, epsilon = 0.10, conf.level = 0.95)

## We need to sample 485 herds (24250 samples in total).
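
As a rough cross-check, the figures above can be reproduced by hand. The sketch below is an assumption about the arithmetic, not a copy of the epiR source: it inflates the simple-random-sample size for a proportion by the design effect and treats epsilon as a fraction of p (an error of 0.10 * 0.10 = 0.01 on the prevalence scale), which is the reading under which 485 herds results. The variable names are illustrative.

```r
# Hand calculation under stated assumptions; not the epiR implementation.
p <- 0.10          # expected prevalence
b <- 50            # units sampled per cluster
D <- 7             # assumed design effect
epsilon <- 0.10    # error, taken here as a proportion of p (assumption)
conf.level <- 0.95

# Two-sided standard normal quantile for the chosen confidence level.
z <- qnorm(1 - (1 - conf.level) / 2)

# Units needed: simple-random-sample size multiplied by the design effect.
n.units <- z^2 * p * (1 - p) * D / (epsilon * p)^2

# Round up to whole clusters, then convert back to total units.
clusters <- ceiling(n.units / b)
units <- clusters * b

print(c(clusters = clusters, units = units))
```

Under the same assumptions, reading epsilon as an error of 0.10 on the prevalence scale itself would give only 5 clusters, so the interpretation of epsilon has a large effect on the result.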

[Package epiR version 0.9-15 Index]