|Title:||Subspace Clustering with the Multivariate-t Distribution|
|Advisor:||McNicholas, Paul D.; Franczak, Brian C.|
|Abstract:||Clustering procedures suitable for the analysis of very high-dimensional data are needed for many modern data sets. One model-based clustering approach called high-dimensional data clustering (HDDC) uses a family of Gaussian mixture models to model the sub-populations of the observed data, i.e., to perform cluster analysis. The HDDC approach is based on the idea that high-dimensional data usually exist in lower-dimensional subspaces; as such, the dimension of each subspace, called the intrinsic dimension, can be estimated for each sub-population of the observed data. As a result, each of these Gaussian mixture models can be fitted using only a fraction of the total number of model parameters. This family of models has gained attention due to its superior classification performance compared to other families of mixture models; however, it still suffers from the usual limitations of Gaussian mixture model-based approaches. Herein, a robust analogue of the HDDC approach is proposed. This approach, which extends the HDDC procedure to include the multivariate-t distribution, encompasses 28 models that rectify one of the major shortcomings of the HDDC procedure. Our tHDDC procedure is fitted to both simulated and real data sets and is compared to the HDDC procedure using an image reconstruction problem that arose from satellite imagery of Mars' surface.|
|Appears in Collections:||Open Access Dissertations and Theses|
Files in This Item:
|Pesevski_Angelina_2017_Master.pdf||1.3 MB||Adobe PDF|
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.