multigapweightkernel {stringkernels}    R Documentation

Multiple gap-weight kernels

Description

Compute gap-weighted kernels for multiple match lengths at once and pack them into a single object from which precomputed kernels can be retrieved.

Usage

multigapweightkernel(items, maxlength, kernelarray = NULL, lambda = 0.75, 
    normalized = TRUE, tokenizer = openNLP::tokenize, minlength = 1)

## S4 method for signature 'multigapweight':
getkernel(mgw, length, use_dummy = FALSE)

Arguments

items        List of input texts.
maxlength    Maximum match length.
kernelarray  Optional array of kernel values to use.
lambda       Gap length penalty factor.
normalized   Logical; if TRUE, normalize the kernel values.
tokenizer    String tokenizer function. By default, openNLP's tokenize is used to split the text into words, but users may supply their own function.
minlength    Minimum match length.
mgw          multigapweight object returned by multigapweightkernel.
length       Match length of the kernel to retrieve.
use_dummy    Logical; if TRUE, create a kernel with dummy values (see precomputedkernel).
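
As an illustration of the tokenizer argument, the sketch below defines a simple whitespace tokenizer. The name whitespace_tokenizer is hypothetical, and the sketch assumes that the tokenizer is called with a single text as a character string and returns a character vector of tokens; this page does not spell out that contract, so it should be checked against the package before use.

```r
# Hypothetical whitespace tokenizer (not part of stringkernels).
# Assumption: the tokenizer receives one text as a character string
# and returns a character vector of tokens.
whitespace_tokenizer <- function(text) {
  tokens <- strsplit(text, "[[:space:]]+")[[1]]
  tokens[nzchar(tokens)]  # drop the empty string produced by leading whitespace
}

whitespace_tokenizer("  crude oil prices rose ")

# Possible use (not run here):
# multigapweightkernel(items, maxlength = 3, tokenizer = whitespace_tokenizer)
```

If the items are document objects rather than plain strings, a conversion such as as.character may be needed inside the function.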

Details

The dynamic programming algorithm used for the gap-weighted kernel builds up its matching statistics one match length at a time: the statistics for length n are derived from those for length n - 1.

Computing the kernel for a single match length n therefore takes nearly as much time as computing the kernels for all lengths n' <= n.

This function exploits that by computing the kernel matrices for all lengths from minlength to maxlength in one step. The getkernel method then retrieves the matrix for the desired length and wraps it in a kernel object with the precomputed values.
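
The following is a minimal sketch of a textbook-style dynamic program for the gap-weighted subsequence kernel between two token vectors. It is not the package's implementation; the function name gap_weighted_all_lengths and its arguments are hypothetical, and normalization is left out. It is meant only to show why the values for every length up to maxlength fall out of a single run: each pass of the outer loop produces the value for one length and is the input to the next.

```r
# Hypothetical sketch, not the stringkernels implementation.
# s, t: character vectors of tokens; returns unnormalized kernel values
# for match lengths 1..maxlength.
gap_weighted_all_lengths <- function(s, t, maxlength, lambda = 0.75) {
  n <- length(s)
  m <- length(t)
  match <- outer(s, t, "==")       # match[i, j] is TRUE when s[i] == t[j]
  M <- lambda^2 * match            # weight of length-1 matches ending at (i, j)
  k <- numeric(maxlength)
  k[1] <- sum(M)
  for (l in seq_len(maxlength)[-1]) {
    # D[i + 1, j + 1] accumulates the weights in M over all positions
    # up to (i, j), decayed by lambda per skipped position.
    # Row 1 and column 1 are zero padding.
    D <- matrix(0, n + 1, m + 1)
    for (i in seq_len(n)) {
      for (j in seq_len(m)) {
        D[i + 1, j + 1] <- M[i, j] + lambda * D[i, j + 1] +
          lambda * D[i + 1, j] - lambda^2 * D[i, j]
      }
    }
    # A length-l match ending at (i, j) extends any length-(l - 1) match
    # that ends strictly before i and j.
    M <- lambda^2 * match * D[seq_len(n), seq_len(m), drop = FALSE]
    k[l] <- sum(M)
  }
  k
}

gap_weighted_all_lengths(c("a", "x", "b"), c("a", "b"), maxlength = 2, lambda = 0.5)
```

Asking for maxlength = 3 runs exactly the same passes as maxlength = 2 plus one more, which is why keeping the intermediate values costs almost nothing extra.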

Value

A multigapweight object that contains the kernel value array (a kernel matrix with an additional dimension for length) and the kernel parameters.
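
To picture "a kernel matrix with an additional dimension for length", the sketch below builds a toy three-dimensional array and slices out the matrix for one match length. The items x items x length layout, the index arithmetic, and the helper matrix_for_length are assumptions made for illustration; the actual internal layout of a multigapweight object is not documented here, and the values are placeholders rather than real kernel values.

```r
# Toy sizes; every value below is a placeholder, not a real kernel value.
n_items   <- 3
minlength <- 2
maxlength <- 4
lengths   <- minlength:maxlength

# Assumed layout: items x items x length.
kernel_array <- array(0, dim = c(n_items, n_items, length(lengths)))
for (idx in seq_along(lengths)) {
  # Marker value: fill each slice with its own match length.
  kernel_array[, , idx] <- lengths[idx]
}

# Hypothetical helper: pick the matrix that belongs to one match length.
matrix_for_length <- function(kernel_array, len, minlength) {
  kernel_array[, , len - minlength + 1]
}

k3 <- matrix_for_length(kernel_array, 3, minlength)
dim(k3)
```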

Author(s)

Martin Kober
martin.kober@gmail.com

See Also

precomputedkernel

Examples


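library(tm)
library(kernlab)        # provides kernelMatrix
library(stringkernels)

## This is necessary to make tm's corpora usable with 
## stringkernels' S4 classes.
setOldClass(c("VCorpus", "Corpus"))
setIs("Corpus", "list")

data(crude)

## Compute the kernels for match lengths 2 and 3 in one step.
m <- multigapweightkernel(crude, maxlength = 3, minlength = 2)

## Retrieve a precomputed kernel object for each length.
k2 <- getkernel(m, 2)
k3 <- getkernel(m, 3)

## Kernel matrices for the first five documents.
kernelMatrix(k2, crude[1:5])
kernelMatrix(k3, crude[1:5])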



[Package stringkernels version 0.8.8 Index]