LCS {qualV} | R Documentation |
Determines the longest common subsequence of two strings.
LCS(a, b)
a | vector (numeric or character), missing values are not accepted |
b | vector (numeric or character), missing values are not accepted |
A longest common subsequence (LCS) is a common subsequence of two strings of maximum length. The LCS problem consists of finding an LCS of two given strings and its length (LLCS). The quality similarity index (QSI) is computed by dividing the LLCS by the maximum of the lengths of 'a' and 'b'.
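The definitions above can be illustrated with a minimal dynamic-programming sketch in base R. The helper name `lcs_sketch` and its return fields are hypothetical and chosen for illustration only. This is not the code used by `LCS()`, merely one common way to obtain an LCS, its length, and the ratio described above.

```r
# Illustrative sketch only: lcs_sketch is a hypothetical helper,
# not the implementation behind qualV::LCS().
lcs_sketch <- function(a, b) {
  stopifnot(length(a) > 0, length(b) > 0, !anyNA(a), !anyNA(b))
  m <- length(a)
  n <- length(b)

  # tab[i + 1, j + 1] holds the LLCS of a[1:i] and b[1:j];
  # the first row and column stand for the empty prefixes.
  tab <- matrix(0L, nrow = m + 1, ncol = n + 1)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      if (a[i] == b[j]) {
        tab[i + 1, j + 1] <- tab[i, j] + 1L
      } else {
        tab[i + 1, j + 1] <- max(tab[i, j + 1], tab[i + 1, j])
      }
    }
  }

  # Trace back from the bottom-right corner to recover one LCS.
  # Other LCSs of the same length may exist.
  idx <- integer(0)
  i <- m
  j <- n
  while (i > 0 && j > 0) {
    if (a[i] == b[j]) {
      idx <- c(i, idx)
      i <- i - 1
      j <- j - 1
    } else if (tab[i, j + 1] >= tab[i + 1, j]) {
      i <- i - 1
    } else {
      j <- j - 1
    }
  }

  llcs <- tab[m + 1, n + 1]
  list(LLCS = llcs, LCS = a[idx], QSI = llcs / max(m, n))
}

res <- lcs_sketch(c("b", "c", "a", "b", "c", "b"), c("a", "b", "c", "c", "b"))
res$LLCS
res$QSI
```

For these two vectors the LLCS is 4 (for example "b", "c", "c", "b"), so the ratio is 4 divided by 6, the length of the longer vector.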
a | vector 'a' |
b | vector 'b' |
LLCS | length of LCS |
LCS | longest common subsequence |
QSI | quality similarity index |
va | one possible LCS of vector 'a' |
vb | one possible LCS of vector 'b' |
Because it is based on the most prominent but simple calculation scheme, this algorithm is not very efficient with respect to its time and memory requirements.
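To make the memory remark concrete, assume the simple scheme is the usual full dynamic-programming table, which stores (length(a) + 1) * (length(b) + 1) entries. When only the length of the LCS is needed, two rows of that table are enough. The sketch below shows this variant; `llcs_two_rows` is a hypothetical helper, not part of qualV, and unlike `LCS()` it does not return the subsequence itself.

```r
# Illustrative sketch only: llcs_two_rows is a hypothetical helper,
# not part of qualV. It returns the LLCS but not the subsequence.
llcs_two_rows <- function(a, b) {
  stopifnot(!anyNA(a), !anyNA(b))
  n <- length(b)
  prev <- integer(n + 1)  # table row for the previous prefix of a
  for (i in seq_along(a)) {
    curr <- integer(n + 1)  # table row for a[1:i]
    for (j in seq_len(n)) {
      curr[j + 1] <- if (a[i] == b[j]) {
        prev[j] + 1L
      } else {
        max(prev[j + 1], curr[j])
      }
    }
    prev <- curr
  }
  prev[n + 1]
}

llcs_two_rows(c("b", "c", "a", "b", "c", "b"), c("a", "b", "c", "c", "b"))
```

The running time is still proportional to length(a) * length(b); only the memory use drops to two rows of length(b) + 1 entries.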
Wagner, R. A. and Fischer, M. J. (1974) The String-to-String Correction Problem. Journal of the ACM, 21, 168-173.
Paterson, M. and Dancík, V. (1994) Longest Common Subsequences. Mathematical Foundations of Computer Science, 841, 127-142.
Gusfield, D. (1997) Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, England, ISBN 0-521-58519-8.
library(qualV)       # provides LCS, f.slope and the phyto data
library(KernSmooth)  # provides dpill

# direct use
a <- c("b", "c", "a", "b", "c", "b")
b <- c("a", "b", "c", "c", "b")
LCS(a, b)

# a constructed example
x <- seq(0, 2 * pi, 0.1)     # time
y <- 5 + sin(x)              # a process
o <- y + rnorm(x, sd = 0.2)  # observation with random error
p <- y + 0.1                 # simulation with systematic bias
plot(x, o); lines(x, p)

lcs <- LCS(f.slope(x, o), f.slope(x, p))  # too much noise
lcs$LLCS
lcs$QSI

# smooth the observations before comparing
os <- ksmooth(x, o, kernel = "normal", bandwidth = dpill(x, o), x.points = x)$y
lcs <- LCS(f.slope(x, os), f.slope(x, p))
lcs$LLCS
lcs$QSI

# observed and simulated data with non-matching time intervals
data(phyto)
bbobs <- dpill(obs$t, obs$y)
n <- tail(obs$t, n = 1) - obs$t[1] + 1
obsdpill <- ksmooth(obs$t, obs$y, kernel = "normal", bandwidth = bbobs, n.points = n)
obss <- data.frame(t = obsdpill$x, y = obsdpill$y)
obss <- obss[match(sim$t, obss$t), ]  # keep only the time points present in sim
obs_f1 <- f.slope(obss$t, obss$y)
sim_f1 <- f.slope(sim$t, sim$y)
lcs <- LCS(obs_f1, sim_f1)
lcs$QSI