rel.risk.cint {corpora}    R Documentation

Conservative confidence interval for the relative risk ratio (corpora)

Description

This function approximates a conservative confidence interval for the relative risk coefficient, i.e. the ratio r = p_1/p_2 between two population proportions, based on frequency counts from two corpora. The approximation is computed from individual confidence intervals for the two proportions, with confidence levels adjusted accordingly.

Usage


rel.risk.cint(k1, n1, k2, n2,
              conf.level = 0.95, alternative = c("two.sided", "less", "greater"),
              method = c("binomial", "z.score"), correct = TRUE)

Arguments

k1            frequency of a type in the first corpus (or an integer vector of type frequencies)
n1            the sample size of the first corpus (or an integer vector specifying the sizes of different samples)
k2            frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1)
n2            the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1)
conf.level    the desired confidence level (defaults to 95%)
alternative   a character string specifying the alternative hypothesis, yielding a two-sided (two.sided, default), lower one-sided (less) or upper one-sided (greater) confidence interval
method        a character string specifying whether the individual confidence intervals for the two proportions are based on the binomial test (binomial) or the z-score test (z.score)
correct       if TRUE (the default), apply Yates' continuity correction for the z-score test

Details

This function computes individual confidence intervals for the two population proportions p_1 (from k_1 and n_1) and p_2 (from k_2 and n_2). A confidence interval for the relative risk ratio r = p_1 / p_2 is then determined in such a way that r lies within the interval whenever p_1 and p_2 lie within their respective confidence intervals.
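The construction described above can be illustrated with a minimal sketch in base R. The helper ratio.interval is hypothetical and not part of the corpora package; it assumes two-sided binomial (Clopper-Pearson) intervals from binom.test and combines them by dividing opposite endpoints, which is one way to satisfy the stated containment property. The actual implementation of rel.risk.cint may differ in detail.

```r
# Hypothetical helper for illustration only (not the corpora implementation).
# 'level' is the confidence level used for EACH individual interval.
ratio.interval <- function(k1, n1, k2, n2, level) {
  ci1 <- binom.test(k1, n1, conf.level = level)$conf.int  # interval for p_1
  ci2 <- binom.test(k2, n2, conf.level = level)$conf.int  # interval for p_2
  # Smallest ratio: lowest p_1 over highest p_2; largest ratio: the reverse.
  c(lower = ci1[1] / ci2[2], upper = ci1[2] / ci2[1])
}

# Illustrative counts: 30 hits in 1000 tokens vs. 10 hits in 1000 tokens
ri <- ratio.interval(30, 1000, 10, 1000, level = 0.975)
print(ri)
```

Whenever p_1 and p_2 both lie in their intervals, p_1 / p_2 cannot be smaller than the lower bound or larger than the upper bound computed this way. If the lower bound for p_2 is zero (e.g. for k2 = 0), the upper bound of the ratio is infinite.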

Thus, when both intervals are computed at a confidence level of, e.g., .975 and the two samples are independent, r falls within its confidence interval in at least .975^2, or approximately .95, of all cases. This adjustment of confidence levels is made automatically. Note that r may fall within its confidence interval even when p_1 or p_2 lies outside its respective interval; rel.risk.cint therefore computes a conservative confidence interval that is wider than necessary.
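The arithmetic behind the adjustment can be checked directly. The documentation does not state the exact formula used; the sketch below assumes that each individual interval is computed at the square root of the requested level, which is consistent with the .975 example above.

```r
conf.level <- 0.95             # requested level for the ratio interval
adjusted <- sqrt(conf.level)   # assumed level for each individual interval
print(adjusted)                # roughly 0.9747
print(adjusted^2)              # recovers the requested level
print(0.975^2)                 # the rounded example gives 0.950625
```

Rounding the individual level up to .975 therefore yields a joint level slightly above .95.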

Exact confidence intervals for the odds ratio coefficient theta = (p_1 / (1-p_1)) / (p_2 / (1-p_2)) can be computed with the fisher.test function. However, these exact intervals are computationally very expensive and may cause R to run out of memory for large frequency counts. In addition, fisher.test only computes a single confidence interval for each function call (i.e., it cannot be applied to vectorised data).
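Because fisher.test handles one contingency table per call, vectorised data have to be processed one case at a time. The following sketch shows one way to do this with mapply; the wrapper odds.cint and the counts are illustrative assumptions, not part of the corpora package.

```r
# Illustrative frequency counts for two types in two corpora
k1 <- c(30, 5);  n1 <- c(1000, 1000)
k2 <- c(10, 8);  n2 <- c(1000, 2000)

# Hypothetical wrapper: exact interval for the odds ratio theta of one case
odds.cint <- function(k1, n1, k2, n2, conf.level = 0.95) {
  # Columns are the corpora; rows are occurrences vs. non-occurrences
  tab <- matrix(c(k1, n1 - k1, k2, n2 - k2), nrow = 2)
  fisher.test(tab, conf.level = conf.level)$conf.int
}

# One fisher.test call per case; rows of 'res' correspond to cases
res <- t(mapply(odds.cint, k1, n1, k2, n2))
colnames(res) <- c("lower", "upper")
print(res)
```

Each row requires a separate exact computation, so this approach scales poorly to long vectors or large counts.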

Value

A data frame with two columns, labelled lower for the lower boundary and upper for the upper boundary of the confidence interval. The number of rows is determined by the length of the longest input vector (k1, n1, k2, n2 and conf.level).

Author(s)

Stefan Evert

See Also

prop.cint, chisq.pval, fisher.pval, fisher.test


[Package corpora version 0.3-2.1 Index]