Dimension Reduction and Clustering of High Dimensional Data using a Mixture of Generalized Hyperbolic Distributions

Abstract

Model-based clustering is a probabilistic approach that views each cluster as a component in an appropriate mixture model. The Gaussian mixture model is one of the most widely used model-based methods. However, this model tends to perform poorly when clustering high-dimensional data because of the over-parametrized solutions that arise in high-dimensional spaces. This work instead considers combining dimension reduction techniques with clustering via a mixture of generalized hyperbolic distributions. The dimension reduction techniques, principal component analysis and factor analysis, along with their extensions, were reviewed. These techniques were then individually paired with the mixture of generalized hyperbolic distributions to demonstrate the clustering performance achieved under each method using both simulated and real data sets. For a majority of the data sets, the clustering method utilizing principal component analysis exhibited better classification results than the clustering method based on extending the factor analysis model.
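
As a rough illustration of the over-parametrization described above, the sketch below counts the free parameters of an unconstrained Gaussian mixture with G components in p dimensions: G - 1 mixing proportions, G·p mean entries, and G·p(p+1)/2 covariance entries. The function name and the example values (3 components, 100 variables, 5 retained dimensions) are illustrative assumptions and are not taken from the thesis. The count applies to the Gaussian mixture only, not to the generalized hyperbolic mixture studied in the work.

```python
def gmm_free_parameters(n_components: int, n_dims: int) -> int:
    """Count free parameters of a Gaussian mixture with full covariances.

    Hypothetical helper for illustration only.
    """
    mixing = n_components - 1                       # proportions sum to one
    means = n_components * n_dims                   # one mean vector per component
    covariances = n_components * n_dims * (n_dims + 1) // 2  # symmetric matrices
    return mixing + means + covariances


if __name__ == "__main__":
    # Assumed example: 3 clusters, 100 original variables, 5 retained dimensions.
    full = gmm_free_parameters(3, 100)
    reduced = gmm_free_parameters(3, 5)
    print(f"original space: {full} parameters")
    print(f"reduced space:  {reduced} parameters")
```

Because the covariance term grows quadratically in p, clustering in a lower-dimensional representation shrinks the number of quantities to estimate considerably, which is the motivation for pairing dimension reduction with a mixture model.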
