Corpus-class {tm}                R Documentation

Corpus

Description

A class representing a collection of text documents (called a corpus in linguistics).

Objects from the Class

Objects can be created by calls of the form new("Corpus",...) or by calling the function Corpus.

Slots

CMetaData:
Object of class MetaDataNode containing the metadata specific to the document collection (corpus) as tag-value pairs, together with information about child nodes in the form of a binary tree. The tree is useful for reconstructing metadata after, e.g., merging document collections.
DMetaData:
Object of class data.frame containing the document-specific metadata for the collection. This data frame typically holds clustering or classification results, which are metadata for the documents but form an entity of their own (e.g., with a name, a value range, etc.).
DBControl:
Object of class list with three named components: useDb indicates whether database support is activated, dbName holds the path to the database storage, and dbType stores the database type.

Extends

Class list, directly.
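
The slot layout and the list inheritance described above can be sketched with a small stand-in class. This is only an illustration built on the base methods package: the class name ToyCorpus, the metadata column cluster, and all values are hypothetical, and the CMetaData slot is left out because MetaDataNode is defined inside tm. It is not tm's own class definition.

```r
library(methods)

# Hypothetical stand-in for illustration only; NOT tm's class definition.
# The CMetaData slot is omitted because MetaDataNode lives inside tm.
setClass("ToyCorpus",
         representation(DMetaData = "data.frame",
                        DBControl = "list"),
         contains = "list")

docs <- list("first document", "second document")

# The unnamed argument fills the list part; named arguments fill the slots.
# All values here are placeholders.
toy <- new("ToyCorpus", docs,
           DMetaData = data.frame(cluster = c(1L, 2L)),
           DBControl = list(useDb = FALSE, dbName = "", dbType = "DB1"))

# Because the class extends list, list operations apply to the documents.
n_docs <- length(toy)
first_doc <- toy[[1]]

# The slots carry per-document metadata and the database settings.
n_meta_rows <- nrow(toy@DMetaData)
uses_db <- toy@DBControl$useDb
```

Keeping one row of DMetaData per document, as in this sketch, is what lets clustering or classification results travel with the collection.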

Methods

CMetaData
signature(object = "Corpus"): Returns the corpus-specific metadata as a list of tag-value pairs.
DMetaData
signature(object = "Corpus"): Returns the document-specific metadata as a data frame.
DBControl
signature(object = "Corpus"): Returns the database configuration settings.
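
Accessor methods of this kind are typically S4 generics named after the slot they read. The sketch below shows that pattern on a toy class; the class name ToyCollection and the slot values are hypothetical, and the code is an assumption about the general technique rather than tm's actual source.

```r
library(methods)

# Hypothetical toy class with a single slot; not tm's definition.
setClass("ToyCollection",
         representation(DBControl = "list"),
         contains = "list")

# A generic named after the slot, with a method dispatched on the class,
# mirrors the signature(object = ...) entries listed above.
setGeneric("DBControl", function(object) standardGeneric("DBControl"))
setMethod("DBControl", signature(object = "ToyCollection"),
          function(object) object@DBControl)

# Placeholder values for the three named components.
toy <- new("ToyCollection", list("a document"),
           DBControl = list(useDb = FALSE, dbName = "", dbType = "DB1"))

settings <- DBControl(toy)
component_names <- names(settings)
```

Calling the accessor instead of reading the slot with @ keeps client code independent of how the class stores its settings.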

Author(s)

Ingo Feinerer

See Also

MetaDataNode-class, Corpus


[Package tm version 0.3-3 Index]