Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/22758
Title: | Dimension Reduction and Clustering of High Dimensional Data using a Mixture of Generalized Hyperbolic Distributions |
Authors: | Pathmanathan, Thinesh |
Advisor: | McNicholas, Sharon |
Department: | Statistics |
Keywords: | Model-based clustering, dimension reduction, statistical learning |
Publication Date: | 2018 |
Abstract: | Model-based clustering is a probabilistic approach that views each cluster as a component in an appropriate mixture model. The Gaussian mixture model is one of the most widely used model-based methods. However, this model tends to perform poorly when clustering high-dimensional data due to the over-parametrized solutions that arise in high-dimensional spaces. This work instead considers the approach of combining dimension reduction techniques with clustering via a mixture of generalized hyperbolic distributions. The dimension reduction techniques principal component analysis and factor analysis, along with their extensions, were reviewed. Then, the aforementioned dimension reduction techniques were individually paired with the mixture of generalized hyperbolic distributions in order to demonstrate the clustering performance achieved under each method using both simulated and real data sets. For a majority of the data sets, the clustering method utilizing principal component analysis exhibited better classification results compared to the clustering method based on extending the factor analysis model. |
URI: | http://hdl.handle.net/11375/22758 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Pathmanathan_Thinesh_2018March_MSc.pdf | | 890.48 kB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.