triples {lsa} | R Documentation |
Stores, manages, and retrieves SPO triples (subject, predicate, object) bound to the document columns of a document-term matrix.
getTriple( M, subject, predicate )
setTriple( M, subject, predicate, object )
delTriple( M, subject, predicate, object )
getSubjectId( M, subject )
M |
the document term matrix (see textmatrix ). |
subject |
column number or column name (e.g., "doc3" or 3 ). |
predicate |
predicate of the triple sentence (e.g., "has_category" or "has_grade" ). |
object |
value of the triple sentence (e.g., "14" or 14 ). |
SPO-Triples are simple facts of the uniform structure (subject, predicate, object). A subject is typically a document in the given document-term matrix M, i.e. its document title (as in the column names) or its column position. A key-value pair (the predicate and the object) can be bound to this subject.
This can be used, for example, to store classification information about the documents of the text base used.
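As a rough illustration of the classification use case, the sketch below labels each document with a category and then collects the documents that share one label. It assumes the lsa package is installed; the `categories` vector and its values are made up for the example, and the stand-in matrix is prepared by hand in the way described further down this page.

```r
library(lsa)  # assumption: the lsa package is installed

# A stand-in document-term matrix: 2 terms x 3 documents.
x <- matrix(2, nrow = 2, ncol = 3,
            dimnames = list(c("dog", "mouse"), c("doc1", "doc2", "doc3")))
class(x) <- "textmatrix"      # usually done by textmatrix()
environment(x) <- new.env()   # usually done by textmatrix()

# Hypothetical category labels, one per document.
categories <- c(doc1 = "15", doc2 = "7", doc3 = "15")
for (doc in names(categories)) {
  setTriple(x, doc, "has_category", categories[[doc]])
}

# Collect the documents whose "has_category" objects include "15".
in_cat_15 <- Filter(
  function(doc) "15" %in% as.character(getTriple(x, doc, "has_category")),
  colnames(x)
)
print(in_cat_15)
```

The objects are stored as strings here, matching the examples on this page; comparing via `as.character()` is only a precaution in the sketch.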
The triple data is stored in the environment of M constructed by textmatrix().
Whenever a matrix has to be used which has not been generated by this function, its class should be set to 'textmatrix' and an environment has to be added manually via:
class(mymatrix) = "textmatrix"
environment(mymatrix) = new.env()
Alternatively, as.matrix() can be used to convert a matrix to a textmatrix. To spare memory, the manual method might be of advantage.
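To see why an environment attached to the matrix is a convenient place for such data, here is a minimal base R sketch. The helper `add_fact()` and the variable name `facts` are hypothetical and are not the lsa implementation; the point is only that environments have reference semantics, so a function can record facts without the caller reassigning the matrix.

```r
# Hypothetical helper for illustration only; NOT the lsa implementation.
m <- matrix(0, nrow = 2, ncol = 3,
            dimnames = list(c("dog", "mouse"), c("doc1", "doc2", "doc3")))
environment(m) <- new.env()   # attach an environment to the matrix

add_fact <- function(M, subject, predicate, object) {
  env <- environment(M)       # same environment object as the caller's
  if (!exists("facts", envir = env, inherits = FALSE)) {
    env$facts <- data.frame(S = integer(), P = character(), O = character(),
                            stringsAsFactors = FALSE)
  }
  env$facts <- rbind(env$facts,
                     data.frame(S = subject, P = predicate, O = object,
                                stringsAsFactors = FALSE))
  invisible(NULL)
}

# No reassignment of m is needed: the facts live in the shared environment.
add_fact(m, 1L, "has_category", "15")
add_fact(m, 1L, "has_grade", "5")

facts <- environment(m)$facts
print(facts)
```

This would also explain why calls such as setTriple(x, ...) in the examples are used for their side effect rather than for a return value, although the actual storage layout inside lsa may differ from this sketch.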
In getTriple(), the arguments subject and predicate are optional.
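The following sketch exercises the call forms of getTriple() that this page describes: no filter, a subject only, and a subject plus a predicate. It assumes the lsa package is installed. The exact structure returned by the less specific calls is not documented here, so the sketch inspects it with str() rather than assuming a shape.

```r
library(lsa)  # assumption: the lsa package is installed

x <- matrix(2, nrow = 2, ncol = 3,
            dimnames = list(c("dog", "mouse"), c("doc1", "doc2", "doc3")))
class(x) <- "textmatrix"
environment(x) <- new.env()

setTriple(x, "doc1", "has_category", "15")
setTriple(x, "doc1", "has_grade", "5")
setTriple(x, "doc2", "has_category", "7")

all_triples  <- getTriple(x)                          # neither subject nor predicate
doc1_triples <- getTriple(x, "doc1")                  # subject only
doc1_cats    <- getTriple(x, "doc1", "has_category")  # subject and predicate
doc1_by_pos  <- getTriple(x, 1, "has_category")       # subject given as column number

# Inspect the structures instead of assuming their shape.
str(all_triples)
str(doc1_triples)
print(doc1_cats)
print(doc1_by_pos)
```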
textmatrix |
the document-term matrix (including row and column names). |
Fridolin Wild fridolin.wild@wu-wien.ac.at
x = matrix(2,2,3) # we fake a document term matrix
rownames(x) = c("dog","mouse") # fake term names
colnames(x) = c("doc1","doc2","doc3") # fake doc titles
class(x) = "textmatrix" # usually done by textmatrix()
environment(x) = new.env() # usually done by textmatrix()
setTriple(x, "doc1", "has_category", "15")
setTriple(x, "doc2", "has_category", "7")
setTriple(x, "doc1", "has_grade", "5")
setTriple(x, "doc1", "has_category", "11")
getTriple(x, "doc1")
getTriple(x, "doc1")[[2]]
getTriple(x, "doc1", "has_category") # -> [1] "15" "11"
delTriple(x, "doc1", "has_category", "15")
getTriple(x, "doc1", "has_category") # -> [1] "11"