Skip navigation
  • Home
  • Browse
    • Communities
      & Collections
    • Browse Items by:
    • Publication Date
    • Author
    • Title
    • Subject
    • Department
  • Sign on to:
    • My MacSphere
    • Receive email
      updates
    • Edit Profile


McMaster University Home Page
  1. MacSphere
  2. Open Access Dissertations and Theses Community
  3. Open Access Dissertations and Theses
Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/25930
Full metadata record
DC FieldValueLanguage
dc.contributor.advisorMcNicholas, Paul-
dc.contributor.authorClark, Katharine-
dc.date.accessioned2020-10-15T13:51:27Z-
dc.date.available2020-10-15T13:51:27Z-
dc.date.issued2020-
dc.identifier.urihttp://hdl.handle.net/11375/25930-
dc.description.abstractUnsupervised classification is a problem often plagued by outliers, yet there is a paucity of work on handling outliers in unsupervised classification. Mixtures of Gaussian distributions are a popular choice in model-based clustering. A single outlier can affect parameters estimation and, as such, must be accounted for. This issue is further complicated by the presence of multiple outliers. Predicting the proportion of outliers correctly is paramount as it minimizes misclassification error. It is proved that, for a finite Gaussian mixture model, the log-likelihoods of the subset models are distributed according to a mixture of beta-type distributions. This relationship is leveraged in two ways. First, an algorithm is proposed that predicts the proportion of outliers by measuring the adherence of a set of subset log-likelihoods to a beta-type mixture reference distribution. This algorithm removes the least likely points, which are deemed outliers, until model assumptions are met. Second, a hypothesis test is developed, which, at a chosen significance level, can test whether a dataset contains a single outlier.en_US
dc.language.isoenen_US
dc.subjectclusteringen_US
dc.subjectoutliersen_US
dc.subjectmodel selectionen_US
dc.subjectmixture modelsen_US
dc.titleOutlier Detection in Gaussian Mixture Modelsen_US
dc.typeThesisen_US
dc.contributor.departmentStatisticsen_US
dc.description.degreetypeThesisen_US
dc.description.degreeMaster of Science (MSc)en_US
Appears in Collections:Open Access Dissertations and Theses

Files in This Item:
File Description SizeFormat 
Clark_Katharine_M_2020Aug_MSc.pdf
Open Access
440.13 kBAdobe PDFView/Open
Show simple item record Statistics


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.

Sherman Centre for Digital Scholarship     McMaster University Libraries
©2022 McMaster University, 1280 Main Street West, Hamilton, Ontario L8S 4L8 | 905-525-9140 | Contact Us | Terms of Use & Privacy Policy | Feedback

Report Accessibility Issue