Algorithms to Compute K-Approximate Periodicities
Loading...
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
A cover of a string w is a string u such that each letter of w occurs in some occurrences
of u. In this thesis, we present the k-approximate maximal cover, where we allow for
at most k mismatches at each occurrence of u in w, u is not required to cover the
entire string and to compute the coverage we consider only the exact matches. Our
ultimate objective is to find an approximate cover that encompasses the maximum
number of exact positions covered while allowing for k mismatches. For this problem,
we present an efficient algorithm that executes in O(n2kLave)-time, where n is the
length of the given string and Lave is the average length of the longest common prefix
between every substring and the original string with up to k mismatches allowed.
To further broaden the scope of string analysis, we address two novel problems: the
k-frequency cover and the k-border. The k-frequency cover problem seeks to identify
the longest substring u of x, that has the maximum number of its k-approximate
occurrences over all the other substrings in x. We propose an algorithm with a
computational complexity of O(n2) to resolve this challenge efficiently. The longest
k-border problem focuses on determining the longest prefix of string x that matches
a suffix of x of the same length while allowing at most k mismatches. We propose an
O(n2)-time algorithm to compute the longest k-border of every substring of x.