text2vec-package | text2vec |
as.lda_c | Converts a sparse document-term matrix to 'lda_c' format |
char_tokenizer | Simple tokenization functions that perform string splitting |
check_analogy_accuracy | Checks accuracy of word embeddings on the analogy task |
create_corpus | Create a corpus |
create_dtm | Document-term matrix construction |
create_dtm.itoken | Document-term matrix construction |
create_dtm.list | Document-term matrix construction |
create_tcm | Term-co-occurrence matrix construction |
create_tcm.itoken | Term-co-occurrence matrix construction |
create_tcm.list | Term-co-occurrence matrix construction |
create_vocabulary | Creates a vocabulary of unique terms |
create_vocabulary.character | Creates a vocabulary of unique terms |
create_vocabulary.itoken | Creates a vocabulary of unique terms |
create_vocabulary.list | Creates a vocabulary of unique terms |
dist2 | Pairwise Distance Matrix Computation |
distances | Pairwise Distance Matrix Computation |
fit | Fits model to data |
fit.Matrix | Fits model to data |
fit.matrix | Fits model to data |
fit_transform | Fit model to data, then transform it |
fit_transform.Matrix | Fit model to data, then transform it |
fit_transform.matrix | Fit model to data, then transform it |
get_dtm | Extract document-term matrix |
get_idf | Inverse document-frequency scaling matrix |
get_tcm | Extract term-co-occurrence matrix |
get_tf | Term-frequency scaling matrix |
GlobalVectors | Creates Global Vectors word-embeddings model. |
GloVe | Creates Global Vectors word-embeddings model. |
glove | Fit a GloVe word-embedding model |
hash_vectorizer | Vocabulary and hash vectorizers |
idir | Creates iterator over text files from the disk |
ifiles | Creates iterator over text files from the disk |
itoken | Iterators over input objects |
itoken.character | Iterators over input objects |
itoken.iterator | Iterators over input objects |
itoken.list | Iterators over input objects |
LatentDirichletAllocation | Creates Latent Dirichlet Allocation model. |
LatentSemanticAnalysis | Latent Semantic Analysis model |
LDA | Creates Latent Dirichlet Allocation model. |
LSA | Latent Semantic Analysis model |
movie_review | IMDB movie reviews |
normalize | Matrix normalization |
pdist2 | Pairwise Distance Matrix Computation |
prepare_analogy_questions | Prepares list of analogy questions |
prune_vocabulary | Prune vocabulary |
psim2 | Pairwise Similarity Matrix Computation |
regexp_tokenizer | Simple tokenization functions that perform string splitting |
RelaxedWordMoversDistance | Creates a model for calculating the "relaxed word mover's distance". |
RWMD | Creates a model for calculating the "relaxed word mover's distance". |
sim2 | Pairwise Similarity Matrix Computation |
similarities | Pairwise Similarity Matrix Computation |
space_tokenizer | Simple tokenization functions that perform string splitting |
split_into | Split a vector for parallel processing |
text2vec | text2vec |
TfIdf | TfIdf |
tokenizers | Simple tokenization functions that perform string splitting |
transform | Transforms Matrix-like object using 'model' |
transform.Matrix | Transforms Matrix-like object using 'model' |
transform.matrix | Transforms Matrix-like object using 'model' |
transform_binary | Scale a document-term matrix |
transform_filter_commons | Remove terms from a document-term matrix |
transform_tf | Scale a document-term matrix |
transform_tfidf | Scale a document-term matrix |
vectorizers | Vocabulary and hash vectorizers |
vocabulary | Creates a vocabulary of unique terms |
vocab_vectorizer | Vocabulary and hash vectorizers |
word_tokenizer | Simple tokenization functions that perform string splitting |