Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/12779
Title: Improving specimen identification: Informative DNA using a statistical Bayesian method
Authors: Lou, Melanie
Advisor: Golding, G. B.
Department: Biology
Keywords: Theory of segregating sites; Bayesian theory; Species identification; Mitochondrial DNA; DNA barcoding; Treeless statistical assignment method; Computational Biology; Ecology and Evolutionary Biology; Evolution; Genetics and Genomics; Population Biology
Publication Date: Apr-2013
Abstract: This work investigates the assignment of unknown sequences to their species of origin. In particular, I examine four questions: Is existing (GenBank) data reliable for accurate species identification? Does a segregating sites algorithm make accurate species identifications, and how does it compare to another Bayesian method? Does broad sampling of reference species improve the information content of reference data? And does an extended model of the theory of segregating sites better describe the genetic variation in a set of sequences (of a species or population)? Though we did not find unusually similar between-species sequences in GenBank, there was evidence of unusually divergent within-species sequences, suggesting that GenBank data should be used with caution and with a firm understanding of the GenBank species involved. To address challenging identifications resulting from an overlap between within- and between-species variation, we introduced a Bayesian treeless statistical assignment method that makes use of segregating sites. Assignments with simulated and Drosophila (fruit fly) sequences show that this method can provide fast, high-probability assignments for recently diverged species. Where reference sequences have low information content, adding even one broadly sampled reference sequence can increase the number of correct assignments. Finally, an extended theory of segregating sites generates more realistic probability estimates of the genetic variability of a set of sequences. Species are dynamic entities, and this work highlights ideas and methods to address dynamic genetic patterns in species.
URI: http://hdl.handle.net/11375/12779
Identifier: opendissertations/7637; 8698; 3551744
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: fulltext.pdf (Open Access)
Size: 569.31 kB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
