readNewsgroup {tm} | R Documentation |
Returns a function which reads in a newsgroup document as found in the UCI KDD newsgroup data set.
readNewsgroup(DateFormat = "%d %B %Y %H:%M:%S", ...)
DateFormat | the format of the Date header in the newsgroup document. |
... | arguments for the generator function. |
Formally this function is a function generator, i.e., it returns a
function (which reads in a newsgroup document) with a well-defined
signature. Via lexical scoping, the returned function can still access
the arguments passed to the generator (e.g., DateFormat, which specifies
the format of the Date header in the newsgroup document).
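The generator pattern described above can be sketched in base R. This is a minimal illustration, not the tm implementation: `make_date_reader` and the fields of the returned list are hypothetical names chosen for the example.

```r
# Hypothetical generator, for illustration only; not the tm implementation.
make_date_reader <- function(DateFormat = "%d %B %Y %H:%M:%S", ...) {
  # The returned function has a fixed signature, but it still sees
  # DateFormat because R closures keep their defining environment.
  function(elem, language, load, id) {
    list(
      id = id,
      language = language,
      date_format = DateFormat,
      n_lines = length(elem$content)
    )
  }
}

# Build a reader with a custom date format, then call it like any function.
reader <- make_date_reader(DateFormat = "%Y-%m-%d")
doc <- reader(
  elem = list(content = c("Subject: test", "", "Body text"), uri = NULL),
  language = "en_US",
  load = TRUE,
  id = "doc-1"
)
doc$date_format
```

The caller of `reader` never passes `DateFormat`; the value is fixed when the generator is called and looked up through the closure's enclosing environment.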
A function with the signature elem, language, load, id:
elem | A list with the two named elements content and uri. The first element must hold the document to be read in, the second element must hold a call to extract this document. The call is evaluated upon a request for load on demand. |
language | A character vector giving the text's language. |
load | A logical value indicating whether the document corpus should be immediately loaded into memory. |
id | A character vector representing a unique identification string for the returned text document. |
The function returns a NewsgroupDocument representing content.
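To illustrate the shape of the elem argument, here is a small base R sketch. It assumes a temporary file stands in for a newsgroup document and uses `readLines` as the extraction call; both are choices made for this example, not something tm prescribes.

```r
# Write a small stand-in document to a temporary file (assumed example data).
path <- tempfile(fileext = ".txt")
writeLines(c("From: someone@example.org", "Subject: hello", "", "Body"), path)

# elem holds the document itself plus an unevaluated call that can
# re-extract it later.
elem <- list(
  content = readLines(path),
  uri = substitute(readLines(p), list(p = path))
)

# The stored call is only evaluated when asked for, as with load on demand.
is.call(elem$uri)
reloaded <- eval(elem$uri)
identical(reloaded, elem$content)
```

Storing a call rather than the data means the content can be dropped from memory and recovered later by evaluating `elem$uri`.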
Ingo Feinerer
Use getReaders to list available reader functions.
See strptime for date format specifications.
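As a quick check of the default DateFormat, the following sketch parses a made-up date string with strptime. The sample string and the UTC time zone are assumptions for the example. Because %B matches the full month name in the current locale, the sketch switches LC_TIME to "C" temporarily so that English month names are recognized.

```r
# %B is locale-dependent, so use the C locale for the duration of the example.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

# Made-up Date value matching the default format "%d %B %Y %H:%M:%S".
x <- strptime("21 April 1993 14:05:32",
              format = "%d %B %Y %H:%M:%S", tz = "UTC")
parsed <- format(x, "%Y-%m-%d %H:%M:%S")
parsed

# Restore the previous locale setting.
Sys.setlocale("LC_TIME", old_locale)
```

If the Date headers in a collection use a different layout, such as abbreviated month names, a matching format string would be passed as DateFormat instead.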