convertReut21578XMLPlain {tm} | R Documentation |
Transform a Reuters-21578 XML document into a plain text document.
convertReut21578XMLPlain(node, ...)
node | An XML node representing a <REUTERS></REUTERS> element from a well-formed Reuters-21578 XML file.
... | Further arguments passed on by calling functions.
A PlainTextDocument representing node.
Ingo Feinerer
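# Locate the Reuters-21578 sample files shipped with tm
reut21578 <- system.file("texts", "reut21578", package = "tm")

# Build a corpus, reading each file with the Reuters-21578 XML reader
reut21578TDC <- Corpus(DirSource(reut21578),
                       readerControl = list(reader = readReut21578XML,
                                            language = "en_US",
                                            load = TRUE))

# Show the first document as read by the XML reader
reut21578TDC[[1]]

# Convert the first document to plain text
asPlain(reut21578TDC[[1]], convertReut21578XMLPlain)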
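To give a feel for the kind of input node is expected to be, the sketch below parses a small hand-written <REUTERS></REUTERS> element with the XML package and pulls out its title and body text. It is an illustration only and is not the implementation used by convertReut21578XMLPlain; the sample element and the variable names (sample_xml, doc_title, doc_body, doc_id) are made up for this example, and the XML package is assumed to be installed.

```r
library(XML)

# Hand-written stand-in for one <REUTERS> element (not real corpus data)
sample_xml <- paste0(
  '<REUTERS NEWID="1">',
  '<TEXT><TITLE>Sample title</TITLE>',
  '<BODY>Sample body text.</BODY></TEXT>',
  '</REUTERS>'
)

# Parse the string and take the root <REUTERS> node
doc  <- xmlParse(sample_xml, asText = TRUE)
node <- xmlRoot(doc)

# Extract the pieces a plain text version would typically keep
doc_id    <- xmlGetAttr(node, "NEWID")
doc_title <- xmlValue(node[["TEXT"]][["TITLE"]])
doc_body  <- xmlValue(node[["TEXT"]][["BODY"]])

cat(doc_id, doc_title, doc_body, sep = "\n")
```

A node obtained this way has the same general shape as the elements found in the Reuters-21578 sample files, so it can serve as a small test input when exploring how the XML is laid out.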