Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/28495
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Magarvey, Nathan | -
dc.contributor.author | Dial, Keshav | -
dc.date.accessioned | 2023-05-03T17:57:51Z | -
dc.date.available | 2023-05-03T17:57:51Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://hdl.handle.net/11375/28495 | -
dc.description.abstract | Deep learning models are dominating performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adapt to these new technological innovations. As we are in the midst of a data explosion, it is not for lack of training data. Instead, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. During my PhD, I showcase the use of large language models across a variety of data domains to improve common workflows in the field of natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline decreasing running times by 8X. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrate how intelligent querying of multi-omic datasets can be used to facilitate the gene signature prediction of encoded microbial metabolites. The models and workflows I developed are wide in scope with the hopes of blueprinting how these industry standard tools can be applied across the entirety of natural product drug discovery. | en_US
dc.language.iso | en | en_US
dc.subject | Deep Learning | en_US
dc.subject | Cheminformatics | en_US
dc.subject | BERT | en_US
dc.subject | LLM | en_US
dc.subject | Bioinformatics | en_US
dc.subject | T5 | en_US
dc.subject | genomic mining | en_US
dc.subject | GNN | en_US
dc.title | DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH | en_US
dc.title.alternative | DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH | en_US
dc.type | Thesis | en_US
dc.contributor.department | Biochemistry | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Doctor of Philosophy (PhD) | en_US
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Description | Size | Format
Dial_Keshav_202304_PhD.pdf | | 17.38 MB | Adobe PDF
Access is allowed from: 2024-04-28
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
