McMaster University Home Page
Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/20598
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | McNicholas, Paul D. | -
dc.contributor.author | Blostein, Martin | -
dc.date.accessioned | 2016-10-05T18:37:47Z | -
dc.date.available | 2016-10-05T18:37:47Z | -
dc.date.issued | 2016 | -
dc.identifier.uri | http://hdl.handle.net/11375/20598 | -
dc.description.abstract | Clustering and classification are fundamental problems in statistical and machine learning, with a broad range of applications. A common approach is the Gaussian mixture model, which assumes that each cluster or class arises from a distinct Gaussian distribution. This thesis studies a robust, high-dimensional extension of the Gaussian mixture model that automatically detects outliers and noise, and a computationally efficient implementation thereof. The contaminated Gaussian distribution is a robust elliptic distribution that allows for automatic detection of "bad points", and is used to make the usual factor analysis model robust. In turn, the mixtures of contaminated Gaussian factor analyzers (MCGFA) algorithm allows high-dimensional, robust clustering, classification and detection of bad points. A family of MCGFA models is created through the introduction of different constraints on the covariance structure. A new, efficient implementation of the algorithm is presented, along with an account of its development. The fast implementation permits thorough testing of the MCGFA algorithm, and its performance is compared to two natural competitors: parsimonious Gaussian mixture models (PGMM) and mixtures of modified t factor analyzers (MMtFA). The algorithms are tested systematically on simulated and real data. | en_US
dc.language.iso | en | en_US
dc.subject | clustering | en_US
dc.subject | classification | en_US
dc.subject | statistical learning | en_US
dc.subject | machine learning | en_US
dc.subject | robust | en_US
dc.subject | computational statistics | en_US
dc.subject | mixture models | en_US
dc.title | An Efficient Implementation of a Robust Clustering Algorithm | en_US
dc.type | Thesis | en_US
dc.contributor.department | Mathematics and Statistics | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Master of Science (MSc) | en_US
Appears in Collections: Open Access Dissertations and Theses
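
The abstract describes the contaminated Gaussian distribution as one that allows automatic detection of bad points. The sketch below is not taken from the thesis. It illustrates the idea for a single component, assuming the common two-part form of the density, alpha * N(mean, cov) + (1 - alpha) * N(mean, eta * cov), with alpha the proportion of good points and eta > 1 an inflation factor. The function names, the default values of alpha and eta, and the 0.5 threshold are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian N(mean, cov) at point x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance, computed without forming the inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def prob_good(x, mean, cov, alpha, eta):
    """Posterior probability that x came from the 'good' part of the density.

    Assumed density (hypothetical parameterisation for this sketch):
        alpha * N(mean, cov) + (1 - alpha) * N(mean, eta * cov),
    with alpha in (0, 1) and eta > 1.
    """
    log_good = np.log(alpha) + gaussian_logpdf(x, mean, cov)
    log_bad = np.log1p(-alpha) + gaussian_logpdf(x, mean, eta * cov)
    # Normalise in log space for numerical stability.
    return float(np.exp(log_good - np.logaddexp(log_good, log_bad)))


def is_bad_point(x, mean, cov, alpha=0.95, eta=10.0):
    """Flag x as a bad point when the 'good' posterior falls below 0.5."""
    return prob_good(x, mean, cov, alpha, eta) < 0.5


# Illustrative inputs only: a standard bivariate Gaussian component.
mean = np.zeros(2)
cov = np.eye(2)
typical = np.array([0.1, -0.2])
outlier = np.array([6.0, 6.0])

print(is_bad_point(typical, mean, cov))
print(is_bad_point(outlier, mean, cov))
```

In a full mixture model, a check of this kind would be applied within each fitted component after estimation; the parameter values here are fixed by hand only to keep the example self-contained.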

Files in This Item:
File | Description | Size | Format
blostein_martin_201609_msc.pdf | Open Access | 183.32 kB | Adobe PDF
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.