TermDocumentMatrix {tm} | R Documentation |
Constructs a term-document matrix or a document-term matrix.
TermDocumentMatrix(object, control = list())
DocumentTermMatrix(object, control = list())
object | a corpus |
control | a named list of control options. The component weighting must be a weighting function capable of handling a TermDocumentMatrix. It defaults to weightTf for term frequency weighting. All other options are delegated internally to a termFreq call. |
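As a rough illustration of how the control list is split between the weighting and termFreq, the sketch below builds two matrices from the crude corpus that ships with tm. It assumes a reasonably recent tm release in which termFreq accepts the stopwords and removePunctuation options; option names may differ in other versions, so the termFreq documentation for the installed version is the authority.

```r
library(tm)
data("crude")

# No control options: default term frequency weighting (weightTf).
tdm_all <- TermDocumentMatrix(crude)

# "weighting" is used by TermDocumentMatrix itself; the remaining
# entries are handed on to termFreq (option names assumed, see above).
tdm_clean <- TermDocumentMatrix(
  crude,
  control = list(weighting = weightTf,
                 stopwords = TRUE,
                 removePunctuation = TRUE)
)

# Number of distinct terms before and after filtering.
c(all = nTerms(tdm_all), clean = nTerms(tdm_clean))
```

Both matrices cover the same documents; only the set of terms changes, because stop word removal and punctuation stripping happen while the terms are counted.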
An object of class TermDocumentMatrix or class DocumentTermMatrix containing a sparse term-document matrix or document-term matrix. The following slots contain useful information:
Weighting | The weighting applied to the matrix. |
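To make the two layouts concrete, here is a small base R sketch that does not use tm at all. The toy documents and the variable names (docs, tokens, terms) are made up for illustration, and the result is a dense matrix rather than the sparse structure the package returns. Rows are terms and columns are documents in a term-document matrix; the document-term matrix is its transpose.

```r
# Three hypothetical documents, split on single spaces.
docs <- c(d1 = "oil prices rise",
          d2 = "oil output falls",
          d3 = "prices fall as output rises")
tokens <- strsplit(tolower(docs), " ", fixed = TRUE)
terms  <- sort(unique(unlist(tokens)))

# Count each term per document: one column per document.
tdm <- vapply(
  tokens,
  function(tok) tabulate(match(tok, terms), nbins = length(terms)),
  integer(length(terms))
)
dimnames(tdm) <- list(Terms = terms, Docs = names(docs))

# The document-term layout is the transpose: one row per document.
dtm <- t(tdm)

dim(tdm)  # terms x documents
dim(dtm)  # documents x terms
```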
Ingo Feinerer
The documentation of termFreq gives an extensive list of possible options.
Available weighting functions shipped with the tm package are weightTf, weightTfIdf, and weightBin.
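One way to compare the shipped weighting functions is to build a plain term frequency matrix once and reweight it. This sketch assumes that weightBin and weightTfIdf can be called directly on an existing matrix and that as.matrix converts the result to a dense matrix, which holds in recent tm releases but may not in older ones.

```r
library(tm)
data("crude")

# Raw term counts (the default weightTf weighting).
tdm_tf <- TermDocumentMatrix(crude)

# Binary weighting: non-zero counts become 1.
tdm_bin <- weightBin(tdm_tf)

# tf-idf weighting: down-weights terms that occur in many documents.
tdm_idf <- weightTfIdf(tdm_tf)

# Dense copies for comparison; fine for a corpus this small.
m_tf  <- as.matrix(tdm_tf)
m_bin <- as.matrix(tdm_bin)

range(m_tf)   # counts
range(m_bin)  # 0 and 1 only
```

Reweighting an existing matrix avoids tokenising the corpus again, which may matter for larger collections.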
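data("crude")
# Term-document matrix: terms in rows, documents in columns
tdm <- TermDocumentMatrix(crude,
                          control = list(weighting = weightTfIdf,
                                         stopwords = TRUE))
# Document-term matrix: documents in rows, terms in columns
dtm <- DocumentTermMatrix(crude,
                          control = list(weighting = weightTfIdf,
                                         stopwords = TRUE))
# Show terms 165 to 170 for the first five documents, in both layouts
inspect(tdm[165:170, 1:5])
inspect(dtm[1:5, 165:170])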