Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/22749
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Franek, Frantisek | -
dc.contributor.author | Paracha, Asma | -
dc.date.accessioned | 2018-04-23T16:48:09Z | -
dc.date.available | 2018-04-23T16:48:09Z | -
dc.date.issued | 2018 | -
dc.identifier.uri | http://hdl.handle.net/11375/22749 | -
dc.description.abstract | Strings are simple yet widely applicable data structures. Their applications range from modelling DNA and protein sequences to information retrieval, web page search, and many more. Due to their simplicity, there are few structural properties that could be exploited for analysis of string algorithms and their auxiliary data structures. Thus, from the beginning, researchers have paid close attention to periodic properties of strings, such as runs, which are maximal fractional periodicities. The runs conjecture, which states that a string has fewer runs than its length, was first posed in 1999 by Kolpakov and Kucherov and was settled only in 2015 by Bannai et al. via specific Lyndon roots referred to as L-roots. This method maps each run to the starting points of its L-roots, which form mutually disjoint subsets of the indices of the string. This relationship between runs and maximal Lyndon factors (substrings) of a string is not coincidental, as Bannai et al. used the knowledge of all maximal Lyndon factors with respect to an order and its inverse to compute all runs in linear time. Thus, computing all maximal Lyndon factors efficiently becomes important. In this thesis, we review the fundamental properties of Lyndon strings, including the famous Lyndon factorization and its linear solution due to Duval. In addition, we explore a new and conceptually simple data structure called the Lyndon array and its relationship to the suffix array. Finally, we discuss Baier's 2015 algorithm for sorting suffixes, which identifies and sorts the maximal Lyndon factors in phase 2 in O(log ∑) steps for a string of length n over an alphabet ∑. We examine the fact that Baier's algorithm sorts the suffixes by sorting the maximal Lyndon factors, and present a different, potentially faster algorithm for phase 2.
Our goal was to gather all the relevant well-known and some unpublished facts about Lyndon strings and their relationship to runs. In addition, we present a novel O(n log(n)) recursive algorithm for computing Lyndon arrays that may be competitive with Baier's for strings with large alphabets. | en_US
dc.language.iso | en | en_US
dc.subject | Lyndon | en_US
dc.title | LYNDON FACTORS AND PERIODICITIES IN STRINGS | en_US
dc.type | Thesis | en_US
dc.contributor.department | Computing and Software | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Doctor of Philosophy (PhD) | en_US
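
For readers unfamiliar with the objects named in the abstract, the following is a minimal illustrative sketch, not code from the thesis. It shows the textbook form of Duval's linear-time Lyndon factorization and two simple ways to obtain a Lyndon array (the length of the longest Lyndon substring starting at each position). The function names are hypothetical. The suffix-rank variant sorts suffixes naively, so it only illustrates the link to the suffix array; it is neither Baier's algorithm nor the O(n log(n)) recursive algorithm the abstract mentions.

```python
def duval(s):
    """Lyndon factorization of s (textbook form of Duval's algorithm)."""
    n = len(s)
    i = 0
    factors = []
    while i < n:
        j, k = i + 1, i
        # Extend while s[i:j] is still a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i          # s[i:j+1] is itself a Lyndon word
            else:
                k += 1         # still repeating the current period
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def lyndon_array_naive(s):
    """Quadratic sketch: the first Lyndon factor of each suffix
    is the longest Lyndon substring starting at that position."""
    return [len(duval(s[i:])[0]) for i in range(len(s))]


def lyndon_array_from_suffix_ranks(s):
    """Lyndon array via 'next smaller suffix' on suffix ranks.
    Assumption: the suffix array is built by naive sorting, for clarity only."""
    n = len(s)
    sa = sorted(range(n), key=lambda i: s[i:])   # naive suffix array
    rank = [0] * n                               # inverse suffix array
    for r, i in enumerate(sa):
        rank[i] = r
    result = [0] * n
    stack = []                                   # positions awaiting a smaller suffix
    for i in range(n):
        while stack and rank[stack[-1]] > rank[i]:
            j = stack.pop()
            result[j] = i - j
        stack.append(i)
    for j in stack:                              # no smaller suffix to the right
        result[j] = n - j
    return result


if __name__ == "__main__":
    print(duval("banana"))
    print(lyndon_array_naive("banana"))
    print(lyndon_array_from_suffix_ranks("banana"))
```

Both Lyndon-array functions should agree on any input; the second one reflects the relationship to the suffix array that the abstract refers to, under the assumptions stated above.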
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Description | Size | Format
paracha_asma_mushakbar_finalsubmission2018January_PhD.pdf | Open Access | 787.46 kB | Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
