besag.newell {SpatialEpi} | R Documentation |
Besag-Newell cluster detection method. There are differences between the original paper and our implementation:
The first two and the last differences arise because we view the testing at an area-by-area level, rather than a case-by-case level.
besag.newell(geo, population, cases, expected.cases=NULL, k, alpha.level)
geo | an n x 2 table of the (x,y)-coordinates of the area centroids |
cases | aggregated case counts for all n areas |
population | aggregated population counts for all n areas |
expected.cases | expected numbers of disease for all n areas |
k | number of cases to consider |
alpha.level | α-level threshold used to declare significance |
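As an illustration of what an unstratified expected.cases vector could look like, the sketch below multiplies each area's population by the overall incidence. This is one common choice and an assumption on our part: this page does not state what besag.newell does when expected.cases=NULL. All numbers are made up.

```r
# Toy data for four hypothetical areas (made-up numbers)
population <- c(1000, 2500, 4000, 2500)
cases      <- c(3, 5, 12, 5)

# Overall incidence across all areas
overall.rate <- sum(cases) / sum(population)

# Unstratified expected counts: each area's population times the overall rate
# (an assumed convention, not confirmed by this help page)
expected.cases <- population * overall.rate

# By construction the expected counts add up to the observed total
stopifnot(isTRUE(all.equal(sum(expected.cases), sum(cases))))
```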
For the population and cases tables, the rows are grouped by area first, and then, within each area, the counts for each stratum are listed. It is important that the tables are balanced: the strata appear in the same order for every area, and the count for each area/stratum combination appears exactly once (even if it is zero).
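A minimal base-R sketch of the balanced layout described above. The area names, stratum names, counts, and the helper is.balanced are all hypothetical. The last step uses the balanced layout to compute indirectly standardized expected counts; it is shown only to illustrate why the ordering matters and is not claimed to reproduce the package's expected function.

```r
# Hypothetical example: 3 areas, 2 strata (all names and counts are made up)
areas  <- c("A", "B", "C")
strata <- c("young", "old")

# expand.grid varies its first factor fastest, so rows come out grouped by
# area, with the strata in the same order inside every area
tab <- expand.grid(stratum = strata, area = areas, stringsAsFactors = FALSE)
tab$population <- c(500, 300, 800, 200, 400, 600)
tab$cases      <- c(1, 4, 2, 0, 0, 6)   # zero counts are kept, not dropped

# Hypothetical helper: TRUE if rows are grouped by area and every area lists
# every stratum exactly once, in the same order
is.balanced <- function(area, stratum) {
  n.strata <- length(unique(stratum))
  per.area <- split(stratum, factor(area, levels = unique(area)))
  all(lengths(per.area) == n.strata) &&
    all(vapply(per.area, function(s) identical(s, per.area[[1]]), logical(1))) &&
    identical(rle(area)$values, unique(area))
}
stopifnot(is.balanced(tab$area, tab$stratum))

# Aggregated per-area counts, as passed to besag.newell
population <- tapply(tab$population, tab$area, sum)
cases      <- tapply(tab$cases, tab$area, sum)

# Because the table is balanced, it reshapes into a strata-by-area matrix
n.strata <- length(strata)
pop.mat  <- matrix(tab$population, nrow = n.strata)
case.mat <- matrix(tab$cases, nrow = n.strata)

# Stratum-specific rates pooled over areas, then applied to each area
rates <- rowSums(case.mat) / rowSums(pop.mat)
expected.cases <- colSums(pop.mat * rates)
```

If a zero-count row were dropped or two strata swapped in one area, the matrix reshape would silently pair counts with the wrong stratum, which is why the balance check comes first.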
List containing
clusters | information on all clusters that are α-level significant, in decreasing order of the p-value |
p.values | for each of the n areas, p-values of each cluster of size at least k |
m.values | for each of the n areas, the number of areas needed to observe at least k cases |
observed.k.values | based on m.values, the actual number of cases used to compute the p-values |
The clusters list elements are themselves lists reporting:
location.IDs.included | IDs of areas in cluster, in order of distance |
population | population of cluster |
number.of.cases | number of cases in cluster |
expected.cases | expected number of cases in cluster |
SMR | estimated SMR of cluster |
p.value | p-value |
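The quantities above can be illustrated with a small base-R sketch of an area-by-area scan: for each area, neighbours are added in order of distance until at least k cases are accumulated, and a Poisson upper-tail probability is computed from the cases actually observed. This is a sketch under stated assumptions, not the package's source code: the toy data, the helper scan.area, and the use of plain Euclidean distance between centroids are ours.

```r
# Toy inputs for five hypothetical areas (all numbers are made up)
geo      <- cbind(x = c(0, 1, 2, 5, 6), y = c(0, 0, 1, 5, 5))
cases    <- c(2, 4, 1, 6, 3)
expected <- c(1.5, 2.0, 1.0, 2.5, 2.0)
k        <- 5

# Pairwise Euclidean distances between centroids (assumed distance measure)
d <- as.matrix(dist(geo))

# Hypothetical helper: scan outward from area i
scan.area <- function(i) {
  # Areas ordered by distance from area i (area i itself comes first)
  ord <- order(d[i, ])
  cum.cases <- cumsum(cases[ord])
  # Smallest number of areas whose accumulated cases reach k
  # (NA if k exceeds the total number of cases)
  m <- which(cum.cases >= k)[1]
  obs.k <- cum.cases[m]
  lambda <- sum(expected[ord][seq_len(m)])
  # Upper-tail Poisson probability of obs.k or more cases given lambda
  p <- ppois(obs.k - 1, lambda, lower.tail = FALSE)
  c(m = m, observed.k = obs.k, expected = lambda,
    SMR = obs.k / lambda, p.value = p)
}

# One row per area
res <- t(sapply(seq_len(nrow(geo)), scan.area))

# Areas whose candidate cluster is significant at the 5% level
significant <- res[res[, "p.value"] <= 0.05, , drop = FALSE]
```

Here the m column plays the role of m.values and the observed.k column the role of observed.k.values: with k = 5, an area whose neighbours accumulate 2 and then 6 cases is tested on 6 cases, not 5.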
Albert Y. Kim
Besag J. and Newell J. (1991) The Detection of Clusters in Rare Diseases. Journal of the Royal Statistical Society. Series A (Statistics in Society), 154, 143–155.
library(SpatialEpi)

## Load Pennsylvania Lung Cancer Data
data(pennLC)
data <- pennLC$data

## Process geographical information and convert to grid
geo <- pennLC$geo[,2:3]
geo <- latlong2grid(geo)

## Get aggregated counts of population and cases for each county
population <- tapply(data$population, data$county, sum)
cases <- tapply(data$cases, data$county, sum)

## Based on the 16 strata levels, compute expected numbers of disease
n.strata <- 16
expected.cases <- expected(data$population, data$cases, n.strata)

## Set Parameters
k <- 1250
alpha.level <- 0.05

# not controlling for strata
results <- besag.newell(geo, population, cases, expected.cases=NULL, k, alpha.level)

# controlling for strata
results <- besag.newell(geo, population, cases, expected.cases, k, alpha.level)