Text Mining Package


Documentation for package ‘tm’ version 0.5-2

Help Pages

-- A --

acq 50 Exemplary News Articles from the Reuters-21578 XML Data Set of Topic acq
as.PlainTextDocument Create Objects of Class PlainTextDocument
as.PlainTextDocument.PlainTextDocument Create Objects of Class PlainTextDocument
as.PlainTextDocument.RCV1Document Create Objects of Class PlainTextDocument
as.PlainTextDocument.Reuters21578Document Create Objects of Class PlainTextDocument
Author Access and Modify Text Documents

-- C --

c.Corpus Combine Corpora, Documents, and Term-Document Matrices
c.TermDocumentMatrix Combine Corpora, Documents, and Term-Document Matrices
c.TextDocument Combine Corpora, Documents, and Term-Document Matrices
CMetaData Volatile Corpus
CMetaData.Corpus Volatile Corpus
colnames.DocumentTermMatrix Row, Column, Dim Names, Document IDs, and Terms
colnames.TermDocumentMatrix Row, Column, Dim Names, Document IDs, and Terms
Content Access and Modify Text Documents
Content<- Access and Modify Text Documents
Content<-.PlainTextDocument Access and Modify Text Documents
Content<-.XMLDocument Access and Modify Text Documents
convert_UTF_8 Convert Encoding to UTF-8
Corpus Volatile Corpus
crude 20 Exemplary News Articles from the Reuters-21578 XML Data Set of Topic crude

-- D --

DataframeSource Data Frame Source
DateTimeStamp Access and Modify Text Documents
DBControl Permanent Corpus Constructor
Description Access and Modify Text Documents
Dictionary Dictionary
Dictionary.character Dictionary
Dictionary.TermDocumentMatrix Dictionary
dim.DocumentTermMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
dim.TermDocumentMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
dimnames.DocumentTermMatrix Row, Column, Dim Names, Document IDs, and Terms
dimnames.TermDocumentMatrix Row, Column, Dim Names, Document IDs, and Terms
DirSource Directory Source
dissimilarity Dissimilarity
dissimilarity.PlainTextDocument Dissimilarity
dissimilarity.TermDocumentMatrix Dissimilarity
DMetaData Volatile Corpus
DMetaData.PCorpus Permanent Corpus Constructor
DMetaData.VCorpus Volatile Corpus
DMetaData<- Volatile Corpus
DMetaData<-.PCorpus Permanent Corpus Constructor
DMetaData<-.VCorpus Volatile Corpus
Docs Row, Column, Dim Names, Document IDs, and Terms
DocumentTermMatrix Term-Document Matrix
DublinCore Meta Data Management
DublinCore<- Meta Data Management

-- E --

eoi Access Sources
eoi.DataframeSource Access Sources
eoi.DirSource Access Sources
eoi.URISource Access Sources
eoi.VectorSource Access Sources
eoi.XMLSource Access Sources

-- F --

findAssocs Find Associations in a Term-Document Matrix
findAssocs.matrix Find Associations in a Term-Document Matrix
findAssocs.TermDocumentMatrix Find Associations in a Term-Document Matrix
findFreqTerms Find Frequent Terms
FunctionGenerator Function Generator

-- G --

getElem Access Sources
getElem.DataframeSource Access Sources
getElem.DirSource Access Sources
getElem.URISource Access Sources
getElem.VectorSource Access Sources
getElem.XMLSource Access Sources
getFilters List Available Filters
getReaders List Available Readers
getSources List Available Sources
getTransformations List Available Transformations
GmaneSource Gmane Source

-- H --

Heading Access and Modify Text Documents

-- I --

ID Access and Modify Text Documents
inspect Inspect Objects
inspect.PCorpus Inspect Objects
inspect.TermDocumentMatrix Inspect Objects
inspect.VCorpus Inspect Objects

-- L --

Language Access and Modify Text Documents
LocalMetaData Access and Modify Text Documents

-- M --

makeChunks Split a Corpus into Chunks
materialize Materialize Lazy Mappings
meta Meta Data Management
meta.Corpus Meta Data Management
meta.TextDocument Meta Data Management
meta.TextRepository Meta Data Management
meta<- Meta Data Management
meta<-.Corpus Meta Data Management
meta<-.TextDocument Meta Data Management
meta<-.TextRepository Meta Data Management

-- N --

ncol.DocumentTermMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
ncol.TermDocumentMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
nDocs The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
nrow.DocumentTermMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
nrow.TermDocumentMatrix The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix
nTerms The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix

-- O --

Origin Access and Modify Text Documents

-- P --

PCorpus Permanent Corpus Constructor
pGetElem Access Sources
pGetElem.DataframeSource Access Sources
pGetElem.DirSource Access Sources
pGetElem.VectorSource Access Sources
PlainTextDocument Plain Text Document
plot.TermDocumentMatrix Visualize a Term-Document Matrix
preprocessReut21578XML Preprocess the Reuters-21578 XML Archive
prescindMeta Prescind Document Meta Data

-- R --

RCV1Document RCV1 Text Document
readDOC Read In a MS Word Document
readGmane Read In a Gmane RSS Feed
readPDF Read In a PDF Document
readPlain Read In a Text Document
readRCV1 Read In a Reuters Corpus Volume 1 Document
readReut21578XML Read In a Reuters-21578 XML Document
readReut21578XMLasPlain Read In a Reuters-21578 XML Document
readTabular Read In a Text Document
readXML Read In an XML Document
removeNumbers Remove Numbers from a Text Document
removeNumbers.PlainTextDocument Remove Numbers from a Text Document
removePunctuation Remove Punctuation Marks from a Text Document
removePunctuation.PlainTextDocument Remove Punctuation Marks from a Text Document
removeSparseTerms Remove Sparse Terms from a Term-Document Matrix
removeWords Remove Words from a Text Document
removeWords.PlainTextDocument Remove Words from a Text Document
RepoMetaData Text Repository
Reuters21578Document Reuters-21578 Text Document
ReutersSource Reuters-21578 XML Source
rownames.DocumentTermMatrix Row, Column, Dim Names, Document IDs, and Terms
rownames.TermDocumentMatrix Row, Column, Dim Names, Document IDs, and Terms

-- S --

searchFullText Full Text Search
searchFullText.PlainTextDocument Full Text Search
sFilter Statement Filter
Source Access Sources
stemCompletion Complete Stems
stemDocument Stem Words
stemDocument.PlainTextDocument Stem Words
stepNext Access Sources
stepNext.Source Access Sources
stopwords Multilingual Stopwords
stripWhitespace Strip Whitespace from a Text Document
stripWhitespace.PlainTextDocument Strip Whitespace from a Text Document

-- T --

TermDocumentMatrix Term-Document Matrix
termFreq Term Frequency Vector
Terms Row, Column, Dim Names, Document IDs, and Terms
TextDocument Access and Modify Text Documents
TextRepository Text Repository
tm_filter Filter and Index Functions on Corpora
tm_filter.Corpus Filter and Index Functions on Corpora
tm_index Filter and Index Functions on Corpora
tm_index.Corpus Filter and Index Functions on Corpora
tm_intersect Intersection between Documents and Words
tm_intersect.PlainTextDocument Intersection between Documents and Words
tm_map Transformations on Corpora
tm_map.PCorpus Transformations on Corpora
tm_map.VCorpus Transformations on Corpora
tm_reduce Combine Transformations
tm_startCluster Allow 'tm' to Use a Cluster
tm_stopCluster Allow 'tm' to Use a Cluster

-- U --

URISource Uniform Resource Identifier Source

-- V --

VCorpus Volatile Corpus
VectorSource Vector Source

-- W --

weightBin Weight Binary
WeightFunction Weighting Function
weightTf Weight by Term Frequency
weightTfIdf Weight by Term Frequency - Inverse Document Frequency
writeCorpus Write a Corpus to Disk

-- X --

XMLSource XML Source