Functions for Text Mining and Topic Modeling


Documentation for package ‘textmineR’ version 1.6.0

Help Pages

acq                 50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
acq2                50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
CalcLikelihood      Calculate the log likelihood of a document term matrix given a topic model
CalcLikelihoodC     Internal helper functions for 'textmineR'
CalcSumSquares      Internal helper functions for 'textmineR'
CalcTopicModelR2    Function to calculate R-squared of a topic model.
CorrectS            Function to remove some forms of pluralization.
DepluralizeDtm      Run the CorrectS function on columns of a document term matrix.
documents           50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
dtm                 50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
Dtm2Docs            Convert a DTM to a Character Vector of documents
Dtm2DocsC           Internal helper functions for 'textmineR'
Files2Vec           Function for reading text files into R
FitLdaModel         Fit a topic model using Latent Dirichlet Allocation
FormatRawLdaOutput  Format Raw Output from lda::lda.collapsed.gibbs.sampler()
GetPhiPrime         Calculate a matrix whose rows represent P(topic_i|tokens)
GetProbableTerms    Get cluster labels using a "more probable" method of terms
GetTopTerms         Get Top Terms for each topic from a topic model
HellDist            Hellinger Distance
HellingerMat        Internal helper functions for 'textmineR'
Hellinger_cpp       Internal helper functions for 'textmineR'
JSD                 Jensen-Shannon Divergence
JSDmat              Internal helper functions for 'textmineR'
JSD_cpp             Internal helper functions for 'textmineR'
LabelTopics         Get some topic labels using a "more probable" method of terms
MakeSparseDTM       Convert a sparse simple triplet document term matrix to a sparse Matrix
model               50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
NgramTokenizer      Get n-grams when creating a document term matrix
ProbCoherence       Probabilistic coherence of topics
RecursiveRbind      Recursively call rBind from the Matrix package.
TermDocFreq         Get term frequencies and document frequencies from a document term matrix.
TmParallelApply     An OS-independent parallel version of 'lapply'
Vec2Dtm             Convert a character vector to a document term matrix of class Matrix.