Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/24947
Title: | Computing Lyndon Arrays |
Authors: | Liut, Michael Adam |
Advisor: | Deza, Antoine Franek, Frantisek |
Department: | Computing and Software |
Keywords: | Combinatorics on Words;String Algorithmics;Strings;Suffix Sorting;Lyndon Substrings;Lyndon Arrays;Maximal Lyndon Substrings |
Publication Date: | 2019 |
Abstract: | There are at least two reasons to have an efficient algorithm for identifying all maximal Lyndon substrings in a string: first, in 2015, Bannai et al. introduced a linear algorithm to compute all runs in a string that relies on knowing all maximal Lyndon substrings of the input string, and second, in 2017, Franek et al. showed a linear co-equivalence of sorting suffixes and sorting maximal Lyndon substrings of a string (inspired by a novel suffix sorting algorithm of Baier). In 2016, Franek et al. presented a brief overview of algorithms for com- puting the Lyndon array that encodes the knowledge of maximal Lyndon substrings of the input string. It discussed four different algorithms. Two known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on iterated Duval’s algorithm for Lyndon factorization and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying the NSV (Next Smaller Value) algorithm. The overview also discusses a recursive version of Duval’s algorithm with a quadratic complexity and an algorithm emulating the NSV approach with a possible O(n log(n)) complexity. The authors at that time did not know of Baier’s algorithm. In 2017, Paracha proposed in her Ph.D. thesis an algorithm for the Lyndon array. The proposed algorithm was interesting as it emulated Farach’s recursive approach for computing suffix trees in linear time and introduced τ-reduction; which might be of independent interest. This was the starting point of this Ph.D. thesis. The primary aim is: (a) developing, analyzing, proving correct, and implementing in C++ a linear algorithm for computing the Lyndon array based on Baier’s suffix sorting; (b) analyzing, proving correct, and implementing in C++ the algorithm proposed by Paracha; and (c) empirically comparing the performance of these two algorithms with the iterative version of Duval’s algorithm. |
URI: | http://hdl.handle.net/11375/24947 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
liut_michael_a_finalsubmission2019september_phd.pdf | PhD Dissertation | 1.78 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.