On Clustering Comparisons Using Data From a Seroprevalence Study
Abstract
Several longitudinal clustering approaches are discussed and compared in an application to a seroprevalence study. The data contain information about the behaviours of individuals over the course of the COVID-19 pandemic. First, the longitudinal clustering methods compared in this thesis are reviewed: longitudinal k-means, growth mixture models, latent class growth analysis, and a two-step approach that combines growth curve models with k-means. Longitudinal model-based clustering based on a modified Cholesky decomposition of a Gaussian mixture and Gaussian linear means is also reviewed. The Bayesian information criterion (BIC) is used as the primary criterion for selecting the number of components, and the adjusted Rand index (ARI) is used to measure the similarity between the clusterings produced by different models. The approaches are then compared in their attempts to identify gathering patterns within the population of the seroprevalence dataset.
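As a rough illustration of the two criteria named in the abstract, the following is a minimal sketch in plain Python. It is not taken from the thesis. The function names (`bic`, `adjusted_rand_index`) and the toy label vectors are hypothetical, and the BIC is written in the "lower is better" convention, which is an assumption: some software reports the negated quantity, so that larger values are preferred.

```python
import math
from collections import Counter


def comb2(n: int) -> int:
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC in the 'lower is better' convention (an assumption; see lead-in)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def adjusted_rand_index(labels_a, labels_b) -> float:
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")

    # Contingency counts: how many observations share each pair of labels.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    index = sum(comb2(c) for c in pair_counts.values())
    sum_a = sum(comb2(c) for c in a_counts.values())
    sum_b = sum(comb2(c) for c in b_counts.values())
    total_pairs = comb2(len(labels_a))

    if total_pairs == 0:
        return 1.0  # fewer than two observations: partitions trivially agree

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (index - expected) / (max_index - expected)


# Toy labels for four hypothetical individuals.
same_up_to_relabelling = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the ARI compares which observations are grouped together rather than the label values themselves, two models that find the same groups under different label names still score 1, which is what makes it suitable for comparing clusterings from different methods.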