Stochastic Heuristic Program for Target Motif Identification
Abstract
<p>Identifying motifs that are "close" to one or more substrings in each sequence of a given set, and hence characterize that set, is an important problem in computational biology. The target motif identification problem asks for motifs that characterize one given set of sequences but are far from every substring in another given set of sequences. This problem is NP-hard and hence is unlikely to have efficient optimal solution algorithms. In this thesis, we propose a set of modifications to one of the most popular stochastic heuristics for finding motifs, Gibbs Sampling [LAB+93], which allow this heuristic to detect target motifs. We also present the results of four simulation studies and tests on real protein datasets, which suggest that these modified heuristics are very good at (and are even, in some cases, necessary for) detecting target motifs.</p>
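<p>To make the problem statement concrete, the following minimal sketch checks whether a candidate string satisfies a target-motif condition. It is an illustration only, not the modified Gibbs Sampling heuristic of the thesis. It assumes Hamming distance as the closeness measure and two hypothetical thresholds, <code>d_close</code> and <code>d_far</code>; the abstract does not specify either, and all function names are invented for this example.</p>

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(motif: str, sequence: str) -> int:
    """Smallest Hamming distance from motif to any same-length substring."""
    k = len(motif)
    if k == 0 or len(sequence) < k:
        raise ValueError("sequence must be at least as long as a non-empty motif")
    return min(
        hamming(motif, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
    )


def is_target_motif(
    motif: str,
    positives: Sequence[str],
    negatives: Sequence[str],
    d_close: int,  # assumed threshold: maximum distance to count as "close"
    d_far: int,    # assumed threshold: minimum distance to count as "far"
) -> bool:
    """True if motif is close to some substring of every positive sequence
    and far from every substring of every negative sequence."""
    close_to_all = all(min_distance(motif, s) <= d_close for s in positives)
    far_from_all = all(min_distance(motif, s) >= d_far for s in negatives)
    return close_to_all and far_from_all
```

<p>Verifying a single candidate in this way is cheap; the difficulty lies in searching the space of candidate motifs, which is where a stochastic heuristic comes in.</p>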