Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/31784
Title: | A Novel Approach for Simulation-Based Power Estimation and Joint Modeling of Microbiome Counts |
Authors: | Agronah, Michael |
Advisor: | Bolker, Benjamin |
Department: | Computational Engineering and Science |
Keywords: | Microbiome;Differential abundance;Statistical power;Sample size;Reduced rank;Longitudinal data |
Publication Date: | 2025 |
Abstract: | Advances in microbiome research have greatly enhanced our understanding of how microbial communities influence human health and disease. The advent of high-throughput sequencing technologies such as 16S rRNA amplicon and shotgun metagenomic sequencing has enabled researchers to generate microbiome abundance data for statistical analysis. These technological developments, together with the development of statistical methods, have enabled researchers to detect differences in microbial composition across experimental conditions. Despite these advancements, challenges remain in statistical power and sample size estimation, and in modeling correlations between taxa within subjects when analyzing associations between microbiome data and covariates. This PhD thesis addresses these challenges by developing new methods for power and sample size determination and by proposing methods for the joint analysis of microbiome data that account for correlations among taxa in differential abundance studies. We first develop two novel simulation methods (Chapter 2) designed to generate realistic microbiome count data for power and sample size estimation and for evaluating the performance of the models proposed in this thesis. We then develop a new method for estimating statistical power in differential abundance studies and apply it to evaluate whether existing microbiome studies have sufficient power to detect differences in microbiome abundance (Chapter 3). Our findings suggest that differential abundance studies have low power to detect biologically meaningful differences. We extend our power estimation procedure into a novel method for sample size estimation in differential abundance studies (Chapter 4). Applying this procedure to real microbiome data sets suggests that the sample sizes seen in the differential abundance literature may be too small to detect meaningful effects.
Most existing methods for differential abundance studies analyze individual taxa separately. We propose the Reduced Rank Multivariate Mixed Model (RRMM), which jointly models all taxa while accounting for correlations within subjects (Chapter 5). Because microbiome data are high-dimensional, modeling the correlations between taxa through a full variance-covariance matrix requires estimating thousands or even millions of parameters, making it computationally infeasible. RRMM reduces the number of parameters by applying rank reduction to the variance-covariance matrix. We show through simulation and real microbiome data that RRMM improves the precision of effect size estimates relative to standard methods such as the models implemented in the DESeq2 and NBZIMM R packages. We extend RRMM to longitudinal microbiome designs by developing the Longitudinal Reduced Rank Mixed Model (LRRMM) (Chapter 6). LRRMM jointly analyzes all taxa in a longitudinal study and models correlation and changes over time. Analyses of real and simulated data demonstrate that LRRMM yields more precise effect size estimates (where effect size is a measure of the magnitude of the difference in taxon abundance between experimental conditions) than the models implemented in the NBZIMM package, which model individual taxa separately. Together, these contributions strengthen the methodological foundation for microbiome research, offering methods for simulation, power analysis, and modeling that account for correlations between taxa within subjects in microbiome data. |
URI: | http://hdl.handle.net/11375/31784 |
Appears in Collections: | Open Access Dissertations and Theses |
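As a rough illustration of the abstract's point about parameter counts, the sketch below compares the number of free parameters in an unstructured covariance matrix over p taxa with the number in a rank-d factor-loading representation. The taxon counts, the rank d = 2, and the function names are illustrative assumptions, not values or code from the thesis, and the reduced-rank count assumes the common lower-trapezoidal identifiability constraint; RRMM's exact parameterization may differ.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unstructured p x p covariance matrix."""
    return p * (p + 1) // 2


def reduced_rank_params(p: int, d: int) -> int:
    """Free parameters in a p x d factor-loading matrix of rank d.

    Assumption: a lower-trapezoidal constraint is used for
    identifiability, which removes d * (d - 1) / 2 entries.
    """
    if not 0 < d <= p:
        raise ValueError("rank d must satisfy 0 < d <= p")
    return p * d - d * (d - 1) // 2


if __name__ == "__main__":
    # Hypothetical numbers of taxa; d = 2 is an arbitrary illustrative rank.
    for p in (100, 1000, 5000):
        print(p, full_cov_params(p), reduced_rank_params(p, d=2))
```

Under these assumptions, the unstructured count grows quadratically in the number of taxa, while the reduced-rank count grows only linearly for a fixed rank.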
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Agronah_Michael_2025May_PhD.pdf | | 19.24 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.