Welcome to the upgraded MacSphere! We're putting the finishing touches on it; if you notice anything amiss, email macsphere@mcmaster.ca

EFFICIENT COMPUTATION OF CLOSED SUBSTRINGS

Loading...
Thumbnail Image

Date

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

A \df{closed string} $ u $ is either of length one or contains a border that occurs only as a prefix and as a suffix of $ u $, without appearing elsewhere within $ u $. This thesis presents a fast and practical $ \mathcal{O}(n \log n) $ time algorithm to compute all $ \mathcal{O}(n^2) $ closed substrings of a string $ w[1..n] $. This is achieved by introducing a compact representation of all closed substrings using only $ \mathcal{O}(n \log n) $ space. Additionally, a simple and space-efficient approach is proposed to compute all maximal closed substrings (MCSs) using the suffix array ($\mathsf{SA}$) and the longest common prefix ($\mathsf{LCP}$) array of $ w[1..n] $. Given a Fibonacci word $f_n$, where $n>5$, the thesis shows that the exact number of MCSs is $ M(f_n) \approx \left(1 + \frac{1}{\phi^2}\right) F_n \approx 1.382 F_n $, where $ \phi $ is the golden ratio, and $ F_n $ is the $ n $-th Fibonacci number. These results highlight novel combinatorial properties of closed substrings in structured sequences. To complement the theoretical findings, an efficient implementation of the algorithms for computing closed strings and MCSs is provided. The implementation has been thoroughly tested and is designed to support practical applications, facilitating further exploration of closed substrings in various contexts.

Description

Citation

Endorsement

Review

Supplemented By

Referenced By