textcat_options {textcat} | R Documentation |
Get and set options used for n-gram based text categorization.
textcat_options(option, value)
option | Character string indicating the option to get or set (see Details). If missing, all options are returned as a list. |
value | Value to be set. If omitted, the current value of the given option is returned. |
Currently, the following options are available:
n: Default: 5L.
split: Default: "[[:space:][:punct:][:digit:]]+".
tolower: Default: TRUE.
reduce: Default: TRUE.
useBytes: Default: FALSE.
ignore: Default: "_" (corresponding to a word boundary).
size: Default: 1000L.
method: see textcat. Default: "CT", giving the Cavnar-Trenkle out-of-place measure.
See textcat_profile_db for how the first 6 options (n, split, tolower, reduce, useBytes and ignore) are used when computing n-gram profiles.
See textcnt in package tau, which provides the functionality for term or pattern counting of text documents employed by textcat.
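The three calling forms described above (no arguments, option only, option plus value) can be sketched as follows. This is a minimal illustration, assuming the textcat package is installed and that the installed version exposes size as an option, as this page documents; other versions of the package may organize their options differently. The variable names all_opts, old_size and new_size are introduced here for illustration only.

```r
# Assumes the 'textcat' package is installed and that textcat_options()
# behaves as described on this page (option names may differ across versions).
library(textcat)

# With no arguments, all current options are returned as a list.
all_opts <- textcat_options()
print(names(all_opts))

# With only 'option', the current value of that option is returned.
old_size <- textcat_options("size")

# With both 'option' and 'value', the option is set to the new value.
textcat_options("size", 500L)
new_size <- textcat_options("size")
print(new_size)

# Restore the previous value so that later calls are unaffected.
textcat_options("size", old_size)
```

Saving the old value before changing an option, and restoring it afterwards, keeps a temporary change from leaking into later profile or distance computations in the same session.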