Dimensionality Reduction with Non-Gaussian Mixtures

dc.contributor.advisor: McNicholas, Paul
dc.contributor.author: Tang, Yang
dc.contributor.department: Mathematics and Statistics [en_US]
dc.date.accessioned: 2017-10-03T18:46:07Z
dc.date.available: 2017-10-03T18:46:07Z
dc.date.issued: 2017-11
dc.description.abstract: Broadly speaking, cluster analysis is the organization of a data set into meaningful groups, and mixture model-based clustering has recently received wide interest in statistics. Historically, the Gaussian mixture model has dominated the model-based clustering literature. When model-based clustering is performed on a large number of observed variables, it is well known that Gaussian mixture models can represent an over-parameterized solution. To this end, this thesis focuses on the development of novel non-Gaussian mixture models for high-dimensional continuous and categorical data. We developed a mixture of joint generalized hyperbolic models (JGHM), which exhibits different marginal amounts of tail-weight. Moreover, it takes into account the cluster-specific subspace and therefore limits the number of parameters to estimate. This is a novel approach that is applicable to high, and potentially very high, dimensional spaces with arbitrary correlation between dimensions. Three different mixture models are developed using forms of the mixture of latent trait models to realize model-based clustering of high-dimensional binary data. A family of mixtures of latent trait models with common slope parameters is developed to reduce the number of parameters to be estimated. This approach facilitates a low-dimensional visual representation of the clusters. We further developed penalized latent trait models to accommodate ultra-high-dimensional binary data; these models also perform automatic variable selection. For all models and families of models developed in this thesis, the algorithms used for model fitting and parameter estimation are presented. Real and simulated data sets are used to assess the clustering ability of the models. [en_US]
dc.description.degree: Doctor of Philosophy (PhD) [en_US]
dc.description.degreetype: Thesis [en_US]
dc.identifier.uri: http://hdl.handle.net/11375/21982
dc.language.iso: en [en_US]
dc.subject: clustering [en_US]
dc.subject: non-Gaussian [en_US]
dc.subject: latent variables [en_US]
dc.subject: mixture Models [en_US]
dc.subject: categorical data [en_US]
dc.subject: variational method [en_US]
dc.title: Dimensionality Reduction with Non-Gaussian Mixtures [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Tang_Yang_2017April_PhD.pdf
Size: 2.22 MB
Format: Adobe Portable Document Format
Description: PhDThesis

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission