
DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH

dc.contributor.advisor: Magarvey, Nathan
dc.contributor.author: Dial, Keshav
dc.contributor.department: Biochemistry [en_US]
dc.date.accessioned: 2023-05-03T17:57:51Z
dc.date.available: 2023-05-03T17:57:51Z
dc.date.issued: 2023
dc.description.abstract: Deep learning models now lead performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adopt these technological innovations. Because we are in the midst of a data explosion, this is not for lack of training data. Instead, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. In this thesis, I showcase the use of large language models across a variety of data domains to improve common workflows in natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline, decreasing running times 8-fold. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrate how intelligent querying of multi-omic datasets can facilitate the gene signature prediction of encoded microbial metabolites. The models and workflows I developed are wide in scope, in the hope of providing a blueprint for how these industry-standard tools can be applied across the entirety of natural product drug discovery. [en_US]
dc.description.degree: Doctor of Philosophy (PhD) [en_US]
dc.description.degreetype: Thesis [en_US]
dc.identifier.uri: http://hdl.handle.net/11375/28495
dc.language.iso: en [en_US]
dc.subject: Deep Learning [en_US]
dc.subject: Cheminformatics [en_US]
dc.subject: BERT [en_US]
dc.subject: LLM [en_US]
dc.subject: Bioinformatics [en_US]
dc.subject: T5 [en_US]
dc.subject: genomic mining [en_US]
dc.subject: GNN [en_US]
dc.title: DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH [en_US]
dc.title.alternative: DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Dial_Keshav_202304_PhD.pdf
Size: 16.98 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission