Analysis of Three-Way Data and Other Topics in Clustering and Classification

Gallaugher, Michael Patrick Brian

Files

Primary gallaugher_michael_pb_finalsubmission2020march_phd..pdf (3.18 MB)

Date

2020

Authors

Gallaugher, Michael Patrick Brian

Abstract

Clustering and classification are processes for finding underlying group structure in heterogeneous data. With the rise of the “big data” phenomenon, more complex data structures have made traditional clustering methods often inadvisable or infeasible. This thesis presents methodology for analyzing three examples of these more complex data types. The first is three-way (matrix variate) data, or data that come in the form of matrices. Particular emphasis is placed on clustering skewed three-way data and high-dimensional three-way data. The second is clickstream data, which record a user’s internet search patterns. Finally, co-clustering methodology is discussed for very high-dimensional two-way (multivariate) data. Parameter estimation for all of these methods is based on the expectation-maximization (EM) algorithm. Both simulated and real data are used for illustration.

Keywords

clustering, classification, mixture models, matrix variate distributions

URI

http://hdl.handle.net/11375/25359

Collections

Open Access Dissertations and Theses
