Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/29850
Title: | Algorithms to Compute K-Approximate Periodicities |
Authors: | Liu, Chenge |
Advisor: | Mhaskar, Neerja |
Department: | Computing and Software |
Publication Date: | 2024 |
Abstract: | A cover of a string w is a string u such that each letter of w occurs in some occurrences of u. In this thesis, we present the k-approximate maximal cover, where we allow for at most k mismatches at each occurrence of u in w, u is not required to cover the entire string and to compute the coverage we consider only the exact matches. Our ultimate objective is to find an approximate cover that encompasses the maximum number of exact positions covered while allowing for k mismatches. For this problem, we present an efficient algorithm that executes in O(n2kLave)-time, where n is the length of the given string and Lave is the average length of the longest common prefix between every substring and the original string with up to k mismatches allowed. To further broaden the scope of string analysis, we address two novel problems: the k-frequency cover and the k-border. The k-frequency cover problem seeks to identify the longest substring u of x, that has the maximum number of its k-approximate occurrences over all the other substrings in x. We propose an algorithm with a computational complexity of O(n2) to resolve this challenge efficiently. The longest k-border problem focuses on determining the longest prefix of string x that matches a suffix of x of the same length while allowing at most k mismatches. We propose an O(n2)-time algorithm to compute the longest k-border of every substring of x. |
URI: | http://hdl.handle.net/11375/29850 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Liu_Chenge_2024May_MSc.pdf | 316.9 kB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.