Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/21982
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | McNicholas, Paul | -
dc.contributor.author | Tang, Yang | -
dc.date.accessioned | 2017-10-03T18:46:07Z | -
dc.date.available | 2017-10-03T18:46:07Z | -
dc.date.issued | 2017-11 | -
dc.identifier.uri | http://hdl.handle.net/11375/21982 | -
dc.description.abstract | Broadly speaking, cluster analysis is the organization of a data set into meaningful groups, and mixture model-based clustering has recently attracted wide interest in statistics. Historically, the Gaussian mixture model has dominated the model-based clustering literature. When model-based clustering is performed on a large number of observed variables, it is well known that Gaussian mixture models can represent an over-parameterized solution. To this end, this thesis focuses on the development of novel non-Gaussian mixture models for high-dimensional continuous and categorical data. We developed a mixture of joint generalized hyperbolic models (JGHM), which exhibits different marginal amounts of tail-weight. Moreover, it takes into account the cluster-specific subspace and, therefore, limits the number of parameters to estimate. This is a novel approach, which is applicable to high-dimensional, and potentially very high-dimensional, spaces with arbitrary correlation between dimensions. Three different mixture models are developed using forms of the mixture of latent trait models to realize model-based clustering of high-dimensional binary data. A family of mixtures of latent trait models with common slope parameters is developed to reduce the number of parameters to be estimated. This approach facilitates a low-dimensional visual representation of the clusters. We further developed penalized latent trait models to accommodate ultra-high-dimensional binary data; these models also perform automatic variable selection. For all models and families of models developed in this thesis, the algorithms used for model-fitting and parameter estimation are presented. Real and simulated data sets are used to assess the clustering ability of the models. | en_US
dc.language.iso | en | en_US
dc.subject | clustering | en_US
dc.subject | non-Gaussian | en_US
dc.subject | latent variables | en_US
dc.subject | mixture models | en_US
dc.subject | categorical data | en_US
dc.subject | variational method | en_US
dc.title | Dimensionality Reduction with Non-Gaussian Mixtures | en_US
dc.type | Thesis | en_US
dc.contributor.department | Mathematics and Statistics | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Doctor of Philosophy (PhD) | en_US
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Description | Size | Format
Tang_Yang_2017April_PhD.pdf | PhDThesis | 2.27 MB | Adobe PDF
Access is allowed from: 2018-04-27
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
