SnowballStemmer {Snowball}    R Documentation
R interface to Weka's Snowball stemmers.
SnowballStemmer(x, control = NULL)
x
    a character vector with words to be stemmed.
control
    an object of class Weka_control, a character vector of control
    options, or NULL (default). Available options can be obtained
    online using the Weka Option Wizard WOW, or from the Weka documentation.
The Snowball stemmers contain the Porter stemmer and several other stemmers for different languages. See http://snowball.tartarus.org/ for more information.
SnowballStemmer is an interface to Weka's wrapper classes for the Java version of the Snowball stemmers. The corresponding jar cannot be included in package RWeka due to license restrictions, and hence is made available via the separate package Snowball.
The Omegahat package Rstem provides an R interface to a C version of Porter's word stemming algorithm.
A character vector with the stemmed words.
Other R interfaces to Weka stemmers (RWeka_stemmers)
## Test the supplied vocabulary for the default stemmer ('porter'):
source <- readLines(system.file("words", "porter", "voc.txt",
                                package = "Snowball"))
result <- SnowballStemmer(source)
target <- readLines(system.file("words", "porter", "output.txt",
                                package = "Snowball"))
## Any differences?
any(result != target)
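The control argument can be used to select a stemmer other than the default. The following is a minimal sketch, not taken from the package examples: it assumes that packages RWeka and Snowball (and a working Java installation) are available, and that the underlying Weka class accepts an option S naming the stemmer, as described in the Weka documentation. The word list and the choice of "english" are illustrative only.

```r
## Assumption: RWeka supplies Weka_control(), Snowball supplies SnowballStemmer().
library("RWeka")
library("Snowball")

## Illustrative input; any character vector of words will do.
words <- c("running", "runs", "easily")

## Default stemmer ('porter'), no control options.
stems_default <- SnowballStemmer(words)

## Assumption: Weka option 'S' names the Snowball stemmer to use.
stems_english <- SnowballStemmer(words, control = Weka_control(S = "english"))

## Both calls return one stem per input word.
stems_default
stems_english
```

Passing the options through Weka_control rather than as a raw character vector keeps option names and values paired, which is easier to read when several options are set.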