Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/24866
Title: Automated Text Mining and Ranked List Algorithms for Drug Discovery in Acute Myeloid Leukemia
Authors: Tran, Damian
Advisor: Hope, Kristin; McArthur, Andrew; Leber, Brian
Department: Health Sciences
Keywords: Acute myeloid leukemia; Drug discovery; Deep learning; Artificial intelligence; Literature review; Natural language processing; Automated data analysis; Chatbot; Convolutional neural network; Automated pipeline
Publication Date: 2019
Abstract: Evidence-based software engineering (EBSE) solutions for drug discovery that are at once effective, affordable, and accessible are lacking. This thesis chronicles the progression and accomplishments of the AiDA (Artificially-intelligent Desktop Assistant) functional artificial intelligence (AI) project for drug discovery in the challenging context of acute myeloid leukemia (AML). AiDA is a highly automated solution combining natural language processing (NLP) and spreadsheet feature extraction that harbours potential to disrupt current research investigation methods using big data and aggregated literature. The completed work includes a text-to-function (T2F) NLP method for automated text interpretation, a ranked-list algorithm for multi-dataset analysis, and a custom multi-purpose neural network engine presented to the user through an open-source graphics engine. Validation of the deep learning engine on the MNIST and CIFAR machine learning benchmark datasets showed performance comparable to state-of-the-art libraries using similar architectures. An n-dimensional word embedding method for handling unstructured natural language data was devised to feed convolutional neural network (CNN) models that, over 25 random permutations, correctly predicted functional responses to up to 86.64% of more than 300 validation transcripts. The same CNN NLP infrastructure was then used to automate biomedical context recognition in >20000 literature abstracts, with up to 95.7% test accuracy over several permutations. The AiDA platform was used to compile a bidirectional ranked list of potential gene targets for pharmaceuticals by extracting features from leukemia microarray data, followed by mining of the PubMed biomedical citation database to extract recyclable pharmaceutical candidates. Downstream analysis of the candidate therapeutic targets revealed enrichments in AML- and leukemic stem cell (LSC)-related pathways. The applicability of the AiDA algorithms, in whole and in part, to the larger biomedical research field is explored.
URI: http://hdl.handle.net/11375/24866
Appears in Collections:Open Access Dissertations and Theses

Files in This Item:
File                            Description              Size       Format               Access
tran_damian_v_201909_MSc.pdf    Main thesis document     4.21 MB    Adobe PDF            Open Access
S6_mutation_enrichments.xlsx    Supplementary Table S6   67.93 kB   Microsoft Excel XML  Open Access
S7_context_abstracts.xlsx       Supplementary Table S7   476.53 kB  Microsoft Excel XML  Open Access
S8_context_filtered.tsv         Supplementary Table S8   415.05 kB  Unknown              Open Access
S1_drug_target_gene_ranks.xlsx  Supplementary Table S1   625.98 kB  Microsoft Excel XML  Open Access
S3_AI_func_dataset.tsv          Supplementary Table S3   66.94 kB   Unknown              Open Access
S2_categories_verbal.tsv        Supplementary Table S2   44.87 kB   Unknown              Open Access
S4_gene_abstracts.tsv           Supplementary Table S4   64.52 MB   Unknown              Open Access
S5_drug_abstracts.tsv           Supplementary Table S5   62.35 MB   Unknown              Open Access


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
