BNCInChargeOf {corpora} | R Documentation
This data set lists collocations (in the sense of Sinclair 1991) of the phrase "in charge of" found in the British National Corpus, World Edition (BNC). A span size of 3 and a frequency threshold of 5 were used, i.e. all words that occur at least five times within a distance of three tokens from the key phrase "in charge of" are listed as collocates. Collocations were not allowed to cross sentence boundaries.
See Aston & Burnard (1998) for more information about the BNC, or go to http://www.natcorp.ox.ac.uk/.
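The span and threshold criteria described above can be illustrated with a minimal sketch on a toy corpus. This is not the procedure that was used to build the data set: the sentences, the helper `collect_span()`, and the lowered threshold are hypothetical, and the sketch assumes that the distance of three tokens is measured from the edges of the key phrase on both sides.

```r
# Toy corpus: one character vector of tokens per sentence (hypothetical data).
sentences <- list(
  c("she", "was", "put", "in", "charge", "of", "the", "whole", "department"),
  c("the", "officer", "in", "charge", "of", "the", "case", "resigned"),
  c("nobody", "was", "in", "charge")
)
key       <- c("in", "charge", "of")
span      <- 3  # span size in tokens, as in the description
threshold <- 2  # the data set uses 5; lowered here for the toy corpus

# Hypothetical helper: collect all tokens within `span` positions to the left
# or right of each occurrence of `key`. Working on one sentence at a time
# guarantees that the span never crosses a sentence boundary.
collect_span <- function(tokens, key, span) {
  n <- length(tokens)
  k <- length(key)
  out <- character(0)
  if (n < k) return(out)
  for (i in seq_len(n - k + 1)) {
    if (all(tokens[i:(i + k - 1)] == key)) {
      left  <- if (i > 1) tokens[max(1, i - span):(i - 1)] else character(0)
      first <- i + k  # first position after the key phrase
      right <- if (first <= n) tokens[first:min(n, first + span - 1)] else character(0)
      out <- c(out, left, right)
    }
  }
  out
}

in_span    <- unlist(lapply(sentences, collect_span, key = key, span = span))
freq       <- table(in_span)
collocates <- freq[freq >= threshold]  # words meeting the frequency threshold
print(collocates)
```

The third sentence contributes nothing because it contains "in charge" without "of". Filtering of punctuation and non-alphabetic tokens is left out of this sketch.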
data(BNCInChargeOf)
A data set with 250 rows and the following columns:
collocate
f.in
N.in
f.out
N.out
Punctuation, numbers and any words containing non-alphabetic characters (except for -) were not considered as potential collocates. Likewise, the number of tokens inside / outside the span given in the columns N.in and N.out only includes simple alphabetic word forms.
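One common use of counts like these is to build a 2 x 2 contingency table for each collocate and test for association with the key phrase. The sketch below assumes that f.in and f.out hold the frequency of the collocate inside and outside the span (the column descriptions are not given here, so this reading is an assumption). The counts are hypothetical and not taken from BNCInChargeOf, and Fisher's exact test is only one of several possible association measures.

```r
# Hypothetical counts for a single collocate (NOT real values from the data set).
row <- data.frame(collocate = "example",
                  f.in = 20,  N.in = 1000,
                  f.out = 980, N.out = 99000)

# 2 x 2 contingency table:
#   rows    = token position (inside / outside the span)
#   columns = token type (the collocate / any other word)
ct <- with(row, matrix(c(f.in,  N.in  - f.in,
                         f.out, N.out - f.out),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(c("inside", "outside"),
                                       c("collocate", "other"))))

# Expected frequency inside the span if the collocate were distributed
# independently of the key phrase.
expected.in <- with(row, N.in * (f.in + f.out) / (N.in + N.out))

# One-sided Fisher's exact test: is the collocate more frequent inside
# the span than expected by chance?
test <- fisher.test(ct, alternative = "greater")

print(ct)
print(expected.in)
print(test$p.value)
```

With the real data, the same table could be built row by row after `data(BNCInChargeOf)`, replacing the hypothetical `row` with one row of the data set.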
Stefan Evert (http://purl.org/stefan.evert)
Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.
Sinclair, John (1991). Corpus, Concordance, Collocation. Oxford University Press, Oxford.