Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/28495
Title: | DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH |
Other Titles: | DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH |
Authors: | Dial, Keshav |
Advisor: | Magarvey, Nathan |
Department: | Biochemistry |
Keywords: | Deep Learning;Cheminformatics;BERT;LLM;Bioinformatics;T5;genomic mining;GNN |
Publication Date: | 2023 |
Abstract: | Deep learning models now dominate performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adopt these technological innovations. Because we are in the midst of a data explosion, this is not for lack of training data. Instead, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. During my PhD, I showcased the use of large language models across a variety of data domains to improve common workflows in natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline, reducing running times eightfold. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrated how intelligent querying of multi-omic datasets can facilitate the prediction of gene signatures for encoded microbial metabolites. The models and workflows I developed are wide in scope, in the hope of providing a blueprint for how these industry-standard tools can be applied across the entirety of natural product drug discovery. |
URI: | http://hdl.handle.net/11375/28495 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Dial_Keshav_202304_PhD.pdf | | 17.38 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.