Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/24319
Title: Using machine learning to predict long non-coding RNAs and exploring their evolutionary patterns and prevalence in plant transcriptomes
Authors: Simopoulos, Caitlin
Advisor: Weretilnyk, Elizabeth; Golding, Brian
Department: Biology
Keywords: lncRNA; machine learning; phylogenetic signal; transcriptomes; extremophile; Eutrema salsugineum; plants
Publication Date: 2019
Abstract: Long non-protein coding RNAs (lncRNAs) represent a diverse and enigmatic class of RNA. With roles associated with development and stress responses, these non-coding gene regulators are essential, yet they remain understudied in plants. Thus far, of just over 430 experimentally validated lncRNAs, only 13 are derived from plant systems, and many of these do not meet the classic criteria of the RNA class. Without a solid definition of what makes a lncRNA, and with few empirically validated transcripts, currently available prediction methods fall short. To address this deficiency in lncRNA research, we constructed and applied a machine learning-based lncRNA prediction protocol that does not impose predefined rules and utilises only experimentally confirmed lncRNAs in its training datasets. Through model evaluation, we found that our novel lncRNA prediction tool had an estimated accuracy of over 96%. In a study that predicted lncRNAs from transcriptomes of evolutionarily diverse plant species, we determined that molecular features of lncRNAs display different phylogenetic signal patterns compared to those of protein-coding genes. Additionally, our analyses suggested that stress-resistant species express fewer lncRNAs than more stress-sensitive species. To expand on these results, we used the prediction tool in concert with a transcriptomic study of two natural accessions of the drought-tolerant species Eutrema salsugineum. These two ecotypes were previously reported to show little physiological difference in a first drought but to differ significantly in a second; we instead demonstrated that they displayed vastly different transcriptomic responses, including in the expression of lncRNAs, to both a first and a second drought treatment. In conclusion, the prediction tool can be applied to studies that further our knowledge of lncRNA evolution and can serve as an additional tool in classic transcriptomic studies.
The suggested importance of lncRNAs in drought resistance, together with evidence of their expression in two natural E. salsugineum accessions, merits further study of the molecular and evolutionary mechanisms of these putatively regulatory transcripts.
URI: http://hdl.handle.net/11375/24319
Appears in Collections:Open Access Dissertations and Theses

Files in This Item:
File: Simopoulos_Caitlin_MA_2019April_PhD.pdf
Access is allowed from: 2020-04-24
Size: 1.98 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
