add_collocation_label | Choose and add collocation strings based on collocation categories |
agg_tcorpus | Aggregate the tokens data |
as.tcorpus | Force an object to be a tCorpus class |
as.tcorpus.default | Force an object to be a tCorpus class |
as.tcorpus.tCorpus | Force an object to be a tCorpus class |
backbone_filter | Extract the backbone of a network |
browse_hits | View hits in a browser |
browse_texts | Create and view a full text browser |
calc_chi2 | Vectorized computation of chi^2 statistic for a 2x2 crosstab containing the values [a, b] [c, d] |
code_dictionary | Dictionary lookup |
code_features | Code features in a tCorpus based on a search string |
compare_corpus | Compare tCorpus vocabulary to that of another (reference) tCorpus |
compare_documents | Calculate the similarity of documents |
compare_subset | Compare vocabulary of a subset of a tCorpus to the rest of the tCorpus |
context | Get a context vector |
corenlp_tokens | coreNLP example sentences |
count_tcorpus | Count results of search hits, or of a given feature in tokens |
create_tcorpus | Create a tCorpus |
create_tcorpus.character | Create a tCorpus |
create_tcorpus.corpus | Create a tCorpus |
create_tcorpus.data.frame | Create a tCorpus |
create_tcorpus.factor | Create a tCorpus |
deduplicate | Deduplicate documents |
delete_columns | Delete column from the data and meta data |
delete_meta_columns | Delete column from the data and meta data |
docfreq_filter | Support function for subset method |
dtm_compare | Compare two document term matrices |
dtm_wordcloud | Plot a word cloud from a dtm |
ego_semnet | Create an ego network |
emoticon_dict | Dictionary with common ASCII emoticons |
feats_to_columms | Cast the "feats" column in UDpipe tokens to columns |
feature_associations | Get common nearby features given a query or query hits |
feature_stats | Feature statistics |
feature_subset | Filter features |
freq_filter | Support function for subset method |
get | Access the data from a tCorpus |
get_dfm | Create a document term matrix |
get_dtm | Create a document term matrix |
get_global_i | Compute global feature positions |
get_kwic | Get keyword-in-context (KWIC) strings |
get_meta | Access the data from a tCorpus |
get_stopwords | Get a character vector of stopwords |
laplace | Laplace (i.e. add constant) smoothing |
lda_fit | Estimate an LDA topic model |
melt_quanteda_dict | Convert a quanteda dictionary to a long data.table format |
merge_tcorpora | Merge tCorpus objects |
plot.contextHits | S3 plot for contextHits class |
plot.featureAssociations | Visualize feature associations |
plot.featureHits | S3 plot for featureHits class |
plot.vocabularyComparison | Visualize vocabularyComparison |
plot_semnet | Visualize a semnet network |
plot_words | Plot a wordcloud with words ordered and coloured according to a dimension (x) |
preprocess | Preprocess feature |
preprocess_tokens | Preprocess tokens in a character vector |
print.contextHits | S3 print for contextHits class |
print.featureHits | S3 print for featureHits class |
print.tCorpus | S3 print for tCorpus class |
refresh_tcorpus | Refresh a tCorpus object using the current version of corpustools |
replace_dictionary | Replace tokens with dictionary match |
require_package | Check if package with given version exists |
search_contexts | Search for documents or sentences using Boolean queries |
search_dictionary | Dictionary lookup |
search_features | Find tokens using a Lucene-like search query |
search_recode | Recode features in a tCorpus based on a search string |
semnet | Create a semantic network based on the co-occurrence of tokens in documents |
semnet_window | Create a semantic network based on the co-occurrence of tokens in token windows |
set | Modify the token and meta data.tables of a tCorpus |
set_levels | Change levels of factor columns |
set_meta | Modify the token and meta data.tables of a tCorpus |
set_meta_levels | Change levels of factor columns |
set_meta_name | Change column names of data and meta data |
set_name | Change column names of data and meta data |
set_network_attributes | Set some default network attributes for pretty plotting |
set_special | Designate columns as having a special meaning (token, lemma, POS, relation, parent) |
sgt | Simple Good Turing smoothing |
show_udpipe_models | Show the names of udpipe models |
sotu_texts | State of the Union addresses |
stopwords_list | Basic stopword lists |
subset | Subset a tCorpus |
subset.tCorpus | S3 subset for tCorpus class |
subset_meta | Subset a tCorpus |
subset_query | Subset tCorpus token data using a query |
summary.contextHits | S3 summary for contextHits class |
summary.featureHits | S3 summary for featureHits class |
summary.tCorpus | Summary of a tCorpus object |
tCorpus | tCorpus: a corpus class for tokenized texts |
tcorpus | tCorpus: a corpus class for tokenized texts |
tCorpus$code_dictionary | Dictionary lookup |
tCorpus$code_features | Code features in a tCorpus based on a search string |
tCorpus$compare_corpus | Compare tCorpus vocabulary to that of another (reference) tCorpus |
tCorpus$compare_documents | Calculate the similarity of documents |
tCorpus$compare_subset | Compare vocabulary of a subset of a tCorpus to the rest of the tCorpus |
tCorpus$context | Get a context vector |
tCorpus$deduplicate | Deduplicate documents |
tCorpus$delete_columns | Delete column from the data and meta data |
tCorpus$delete_meta_columns | Delete column from the data and meta data |
tCorpus$dfm | Create a document term matrix |
tCorpus$dtm | Create a document term matrix |
tCorpus$feats_to_columns | Cast the "feats" column in UDpipe tokens to columns |
tCorpus$feature_associations | Get common nearby terms given a feature query |
tCorpus$feature_stats | Feature statistics |
tCorpus$feature_subset | Filter features |
tCorpus$get | Access the data from a tCorpus |
tCorpus$get_meta | Access the data from a tCorpus |
tCorpus$kwic | Get keyword-in-context (KWIC) strings |
tCorpus$lda_fit | Estimate an LDA topic model |
tCorpus$preprocess | Preprocess feature |
tCorpus$replace_dictionary | Replace tokens with dictionary match |
tCorpus$search_contexts | Search for documents or sentences using Boolean queries |
tCorpus$search_features | Find tokens using a Lucene-like search query |
tCorpus$search_recode | Recode features in a tCorpus based on a search string |
tCorpus$semnet | Create a semantic network based on the co-occurrence of tokens in documents |
tCorpus$semnet_window | Create a semantic network based on the co-occurrence of tokens in token windows |
tCorpus$set | Modify the token and meta data.tables of a tCorpus |
tCorpus$set_levels | Change levels of factor columns |
tCorpus$set_meta | Modify the token and meta data.tables of a tCorpus |
tCorpus$set_meta_levels | Change levels of factor columns |
tCorpus$set_meta_name | Change column names of data and meta data |
tCorpus$set_name | Change column names of data and meta data |
tCorpus$set_special | Designate columns as having a special meaning (token, lemma, POS, relation, parent) |
tCorpus$subset | Subset a tCorpus |
tCorpus$subset_meta | Subset a tCorpus |
tCorpus$subset_query | Subset tCorpus token data using a query |
tCorpus$top_features | Show top features |
tCorpus_compare | Corpus comparison |
tCorpus_create | Creating a tCorpus |
tCorpus_data | Methods and functions for viewing, modifying and subsetting tCorpus data |
tCorpus_docsim | Document similarity |
tCorpus_features | Preprocessing, subsetting and analyzing features |
tCorpus_modify_by_reference | Modify tCorpus by reference |
tCorpus_querying | Use Boolean queries to analyze the tCorpus |
tCorpus_semnet | Feature co-occurrence based semantic network analysis |
tCorpus_topmod | Topic modeling |
tokens_to_tcorpus | Create a tCorpus based on tokens (i.e. preprocessed texts) |
tokenWindowOccurence | Gives the window in which a term occurred in a matrix |
top_features | Show top features |