Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/24866
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Hope, Kristin | -
dc.contributor.advisor | McArthur, Andrew | -
dc.contributor.advisor | Leber, Brian | -
dc.contributor.author | Tran, Damian | -
dc.date.accessioned | 2019-10-02T14:17:25Z | -
dc.date.available | 2019-10-02T14:17:25Z | -
dc.date.issued | 2019 | -
dc.identifier.uri | http://hdl.handle.net/11375/24866 | -
dc.description.abstract | Evidence-based software engineering (EBSE) solutions for drug discovery that are effective, affordable, and accessible all-in-one are lacking. This thesis chronicles the progression and accomplishments of the AiDA (Artificially-intelligent Desktop Assistant) functional artificial intelligence (AI) project for the purposes of drug discovery in the challenging acute myeloid leukemia context (AML). AiDA is a highly automated combined natural language processing (NLP) and spreadsheet feature extraction solution that harbours potential to disrupt the state of current research investigation methods using big data and aggregated literature. The completed work includes a text-to-function (T2F) NLP method for automated text interpretation, a ranked-list algorithm for multi-dataset analysis, and a custom multi-purpose neural network engine presented to the user using an open-source graphics engine. Validation of the deep learning engine using MNIST and CIFAR machine learning benchmark datasets showed performance comparable to state-of-the-art libraries using similar architectures. An n-dimensional word embedding method for the handling of unstructured natural language data was devised to feed convolutional neural network (CNN) models that over 25 random permutations correctly predicted functional responses to up to 86.64% of over 300 validation transcripts. The same CNN NLP infrastructure was then used to automate biomedical context recognition in >20000 literature abstracts with up to 95.7% test accuracy over several permutations. The AiDA platform was used to compile a bidirectional ranked list of potential gene targets for pharmaceuticals by extracting features from leukemia microarray data, followed by mining of the PubMed biomedical citation database to extract recyclable pharmaceutical candidates. Downstream analysis of the candidate therapeutic targets revealed enrichments in AML- and leukemic stem cell (LSC)-related pathways. The applicability of the AiDA algorithms in whole and part to the larger biomedical research field is explored. | en_US
dc.language.iso | en | en_US
dc.subject | Acute myeloid leukemia | en_US
dc.subject | Drug discovery | en_US
dc.subject | Deep learning | en_US
dc.subject | Artificial intelligence | en_US
dc.subject | Literature review | en_US
dc.subject | Natural language processing | en_US
dc.subject | Automated data analysis | en_US
dc.subject | Chatbot | en_US
dc.subject | Convolutional neural network | en_US
dc.subject | Automated pipeline | en_US
dc.title | Automated Text Mining and Ranked List Algorithms for Drug Discovery in Acute Myeloid Leukemia | en_US
dc.type | Thesis | en_US
dc.contributor.department | Health Sciences | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Master of Science (MSc) | en_US
dc.description.layabstract | Lead generation is an integral requirement of any research organization in all fields and is typically a time-consuming and therefore expensive task. This is due to the requirement of human intuition to be applied iteratively over a large body of evidence. In this thesis, a new technology called the Artificially-intelligent Desktop Assistant (AiDA) is explored in order to provide a large number of leads from accumulated biomedical information. AiDA was created using a combination of classical statistics, deep learning methods, and modern graphical interface engineering. It aims to simplify the interface between the researcher and an assortment of bioinformatics tasks by organically interpreting written text messages and responding with the appropriate task. AiDA was able to identify several potential targets for new pharmaceuticals in acute myeloid leukemia (AML), a cancer of the blood, by reading whole-genome data. It then discovered appropriate therapeutics by automatically scanning through the accumulated body of biomedical research papers. Analysis of the discovered drug targets shows that together, they are involved in key biological processes that are known by the scientific community to be involved in leukemia and other cancers. | en_US
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Description | Size | Format | Access
tran_damian_v_201909_MSc.pdf | Main thesis document | 4.21 MB | Adobe PDF | Open Access
S6_mutation_enrichments.xlsx | Supplementary Table S6 | 67.93 kB | Microsoft Excel XML | Open Access
S7_context_abstracts.xlsx | Supplementary Table S7 | 476.53 kB | Microsoft Excel XML | Open Access
S8_context_filtered.tsv | Supplementary Table S8 | 415.05 kB | Unknown | Open Access
S1_drug_target_gene_ranks.xlsx | Supplementary Table S1 | 625.98 kB | Microsoft Excel XML | Open Access
S3_AI_func_dataset.tsv | Supplementary Table S3 | 66.94 kB | Unknown | Open Access
S2_categories_verbal.tsv | Supplementary Table S2 | 44.87 kB | Unknown | Open Access
S4_gene_abstracts.tsv | Supplementary Table S4 | 64.52 MB | Unknown | Open Access
S5_drug_abstracts.tsv | Supplementary Table S5 | 62.35 MB | Unknown | Open Access


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
