Bioinformatic Applications in Protein Low Complexity Regions and Targeted Metagenomics
Abstract
Part I: Low complexity regions (LCRs) are common motifs in eukaryotic proteins, despite being
mutationally unstable. For LCRs to be so widely used and tolerated, there
must be regulatory mechanisms that compensate for their presence. I have endeavored to
characterize the relationships and co-evolution of LCRs with the abundance of the proteins that
host them as well as the transcripts which encode them. As the abundance of a gene product is
ultimately responsible for its associated phenotype, any relationships have implications for the
many neurodegenerative diseases associated with LCR expansion. I found that such relationships
do exist. LCRs are more often associated with low-abundance proteins, but the opposite is true
at the RNA level: LCR-encoding transcripts have higher abundance. Investigating the
co-evolution of LCRs and transcript abundance revealed that, on short evolutionary timescales,
indels in LCRs influence the selective pressures on transcript abundance. Viewing LCRs through the
previously unexplored lens of abundance has generated new results which, together with
explorations of information flow and low complexity in untranslated regions, expand our
knowledge of the functional impacts of LCR evolution.
Part II: A common problem in DNA sequencing is that the DNA
of interest makes up only a small proportion of the DNA in a sample. This challenge can be
compounded when the DNA of interest comes from many different organisms. Targeted
metagenomics is a set of techniques that aim to bias sequencing results towards the DNA of
interest. Many of these techniques rely on carefully designed probes that are specific to
targets of interest. I have developed a bioinformatic tool, HUBDesign, to design oligonucleotide
probes that capture identifying sequences from a given set of targets. Using
HUBDesign and other methods, I have contributed to projects in contexts ranging from clinical
to ancient DNA.