BNCdomains {corpora} | R Documentation
This data set gives the number of documents and tokens in each of the 18 domains represented in the British National Corpus, World Edition (BNC). See Aston & Burnard (1998) for more information about the BNC and the domain classification, or go to http://www.natcorp.ox.ac.uk/.
data(BNCdomains)
A data set with 19 rows and the following columns:
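domain: the name of the BNC domain
documents: the number of documents in that domain
tokens: the number of tokens in that domain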
For one document in the BNC, the domain classification is missing. This document is represented by the code Unlabeled in the data set, which is why the table has 19 rows for 18 domains.
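The following is a minimal sketch of how the data set might be loaded and inspected, assuming the corpora package is installed. The variable names `labeled` and `tokens.per.doc` are introduced here for illustration only and are not part of the package.

```r
library(corpora)   # assumption: the corpora package is installed
data(BNCdomains)

# Show the structure of the table (19 rows; columns domain, documents, tokens)
str(BNCdomains)

# Drop the row for the single document without a domain classification
labeled <- subset(BNCdomains, domain != "Unlabeled")

# Average document length per domain (illustrative derived column)
labeled$tokens.per.doc <- labeled$tokens / labeled$documents

# List the domains by size in tokens, largest first
labeled[order(labeled$tokens, decreasing = TRUE), ]
```

Filtering out the Unlabeled row first keeps per-domain summaries restricted to the 18 actual domains.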
Marco Baroni (baroni@sslmit.unibo.it)
Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.