Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/29748
Title: | Deep Learning for Nematode Image-based Antiparasitic Drug Discovery |
Authors: | Wang, Lyuyang |
Advisor: | Moradi, Mehdi |
Department: | Computing and Software |
Publication Date: | 2024 |
Abstract: | Deep learning has revolutionized the domain of computer vision, with various neural network architectures excelling in tasks like biomedical image classification. This thesis focuses on using deep learning methods to automate the classification of nematode images for drug discovery. The screening process involves visually examining thousands of images of C. elegans exposed to natural extracts for signs of abnormal development, including morphological defects and reproductive issues. Our dataset comprises 12,717 microscopic images associated with natural product extracts. Approximately one-third of the images are labeled by an expert, and the remainder are unlabeled. Depending on the drug discovery objective, we define the classification as binary (Normal/Abnormal), six-class (only the six most common phenotype combinations), or 27-class (all phenotype combinations). Initially, we explored fully supervised and semi-supervised learning approaches for binary classification, utilizing high-confidence pseudo-labels from the unlabeled data to progressively enrich our training dataset. To better identify groups with similar visual observations and improve the classification performance, we propose the Triple Cluster Classification (TriCC) method, which enables the detection of underlying feature patterns in C. elegans. TriCC includes self-supervised contrastive learning, unsupervised image clustering, and supervised classification using labeled data to map clusters to the phenotype combinations. Following this, we propose a semi-supervised nematode image classifier based on self-supervised representation learning with Mix-up Barlow Twins (MBT-NC). This system integrates self-supervised learning (SSL) for feature representation (MBT) with a supervised classification stage (NC). We use additional linearly interpolated samples in MBT to enhance batch information utilization. MBT-NC outperforms the two previously developed methods, achieving test accuracies of 89.6%, 83.4%, and 77.6% for binary, six-class, and 27-class classification, respectively. |
URI: | http://hdl.handle.net/11375/29748 |
Appears in Collections: | Open Access Dissertations and Theses |
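
The abstract mentions enriching the training set with high-confidence pseudo-labels from the unlabeled images. The following is a minimal sketch of that general idea, not the thesis implementation: the function name `select_pseudo_labels`, the 0.95 threshold, and the example probabilities are all assumptions made for illustration. It keeps only those unlabeled samples whose highest predicted class probability reaches the threshold.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.95,  # assumed value; the abstract does not state one
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) pairs for confident predictions."""
    selected = []
    for idx, p in enumerate(probs):
        # Class with the highest predicted probability for this sample.
        best_class = max(range(len(p)), key=lambda c: p[c])
        # Keep the sample only if the model is confident enough.
        if p[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Hypothetical classifier outputs for four unlabeled images,
# ordered as (Normal, Abnormal).
unlabeled_probs = [
    [0.98, 0.02],
    [0.60, 0.40],
    [0.03, 0.97],
    [0.50, 0.50],
]

confident = select_pseudo_labels(unlabeled_probs, threshold=0.95)
print(confident)
```

In an iterative scheme, the selected samples would be added to the labeled set with their predicted classes, the classifier retrained, and the selection repeated on the remaining unlabeled images.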
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Wang_Lyuyang_finalsubmission202404_MSc.pdf | | 4.75 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.