readDOC {tm} | R Documentation |
Returns a function that reads in a Microsoft Word document and extracts its text.
readDOC(...)
... |
Arguments for the generator function. |
Formally, this function is a function generator, i.e., it returns a function (which reads in a text document) with a well-defined signature, but which can still access the arguments passed to the generator via lexical scoping. This is especially useful for reader functions for complex data structures that need many configuration options.
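The generator pattern described above can be sketched in base R. This is only an illustration of lexical scoping, not tm's implementation: make_reader, prefix and strip_blank are hypothetical names, and the result is a plain list rather than a tm document.

```r
# Hypothetical generator, not part of tm: make_reader and its
# arguments are illustrative names only.
make_reader <- function(prefix = "", strip_blank = TRUE) {
  # The returned function has a fixed signature, but it can still see
  # prefix and strip_blank through lexical scoping.
  function(elem, language, load, id) {
    text <- elem$content
    if (strip_blank) {
      # Drop lines that are empty or contain only whitespace.
      text <- text[nzchar(trimws(text))]
    }
    list(id = id, language = language, content = paste0(prefix, text))
  }
}

# Configure once, then call the resulting reader with the fixed signature.
reader <- make_reader(prefix = "> ")
doc <- reader(list(content = c("Hello", "", "World"), uri = NULL),
              language = "en", load = TRUE, id = "doc1")
print(doc$content)
```

The configuration is supplied once to the generator, so every call to the resulting reader needs only the four standard arguments.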
Note that this MS Word reader needs the tool antiword installed and accessible on your system.
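One way to check for the tool before reading documents is sketched below. It assumes that "accessible" means the antiword executable can be found on the search path, which this page does not state explicitly.

```r
# Sys.which() returns "" when the executable is not found on the PATH.
antiword_path <- Sys.which("antiword")

if (nzchar(antiword_path)) {
  message("antiword found at: ", antiword_path)
} else {
  message("antiword not found; the MS Word reader will not be able to extract text.")
}
```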
A function with the signature elem, language, load, id:
elem |
A list with the two named elements content and uri. The first element must hold the document to be read in, the second element must hold a call to extract this document. The call is evaluated upon a request for load on demand. |
language |
A character vector giving the text's language. |
load |
A logical value indicating whether the document
corpus should be immediately loaded into memory. |
id |
A character vector representing a unique identification
string for the returned text document. |
The function returns a PlainTextDocument representing the text in content.
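How the four arguments might interact, in particular load and the call stored in uri, can be illustrated with a small base R mock. sketch_reader and load_on_demand are hypothetical helpers. They return plain lists instead of a PlainTextDocument and do not reproduce tm's internals.

```r
# Hypothetical stand-in for a reader; it returns a plain list rather
# than a PlainTextDocument.
sketch_reader <- function(elem, language, load, id) {
  list(
    id = id,
    language = language,
    # With load = TRUE the content is stored right away; otherwise only
    # the call in uri is kept for later evaluation.
    content = if (load) elem$content else NULL,
    uri = elem$uri,
    cached = load
  )
}

# Hypothetical helper: evaluate the stored call on first access.
load_on_demand <- function(doc) {
  if (!doc$cached) {
    doc$content <- eval(doc$uri)
    doc$cached <- TRUE
  }
  doc
}

elem <- list(content = c("first line", "second line"),
             uri = quote(c("first line", "second line")))

# Nothing is loaded until load_on_demand() evaluates the call in uri.
lazy <- sketch_reader(elem, language = "en", load = FALSE, id = "doc1")
lazy <- load_on_demand(lazy)
print(lazy$content)
```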
Ingo Feinerer
Use getReaders to list available reader functions.