Indexed Hierarchical Approximate String Matching
Luís M. S. Russo, Gonzalo Navarro, Arlindo L.
Oliveira
Abstract:
We present a new search procedure for approximate string matching
over suffix trees. We show that hierarchical verification,
which is a well-established technique for on-line searching, can also
be used with an indexed approach. For this, we need that the index supports
bidirectionality, meaning that the search for a pattern can be updated by
adding a letter at the right or at the left. This turns out to be easily
supported by most compressed text self-indexes, which represent
the index and the text essentially in the same space of the compressed text
alone. To complete the symbiotic exchange, our hierarchical verification
largely reduces the need to access the text, which is expensive in compressed
text self-indexes. The resulting algorithm can, in particular, run over an
existing fully compressed suffix tree, which makes it very appealing for
applications in computational biology. We compare our algorithm with related
approaches, showing that our method offers an interesting space/time tradeoff,
and in particular does not need of any parameterization, which is necessary
in the most successful competing approaches.
Luis Manuel Silveira Russo
2008-09-03