Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/31784
Title: A Novel Approach for Simulation-Based Power Estimation and Joint Modeling of Microbiome Counts
Authors: Agronah, Michael
Advisor: Bolker, Benjamin
Department: Computational Engineering and Science
Keywords: Microbiome;Differential abundance;Statistical power;Sample size;Reduced rank;Longitudinal data
Publication Date: 2025
Abstract: Advances in microbiome research have greatly enhanced our understanding of how microbial communities influence human health and disease. The advent of high-throughput sequencing technologies such as 16S rRNA amplicon and shotgun metagenomic sequencing has enabled researchers to generate microbiome abundance data for statistical analysis. These technological developments, together with the development of statistical methods, have enabled researchers to detect differences in microbial composition across experimental conditions. Despite these advancements, challenges remain in statistical power and sample size estimation, and in modeling correlations between taxa within subjects when analyzing associations between microbiome data and covariates. This PhD thesis addresses these challenges by developing new methods for power and sample size determination, and by proposing methods for joint analysis of microbiome data that account for correlations among taxa in differential abundance studies. We first developed two novel simulation methods (Chapter 2) designed to generate realistic microbiome count data for power and sample size estimation, and for evaluating the performance of the models we propose in this thesis. We then developed a new method for estimating statistical power in differential abundance studies, and applied it to evaluate whether existing microbiome studies have sufficient power to detect differences in microbiome abundance (Chapter 3). Our findings suggest that differential abundance studies have low power to detect biologically meaningful differences. We extended our power estimation procedure to develop a novel method for sample size estimation in differential abundance studies (Chapter 4). Applying this sample size estimation procedure to real microbiome data sets suggests that the sample sizes seen in the differential abundance microbiome literature may be too small to detect meaningful effects.
Most existing methods for differential abundance studies analyze individual taxa separately. We propose the Reduced Rank Multivariate Mixed Model (RRMM), which jointly models all taxa while accounting for correlations within subjects (Chapter 5). Due to the high dimensionality of microbiome data, modeling the correlations between taxa through a full variance-covariance matrix requires estimating thousands or even millions of parameters, making it computationally infeasible. RRMM reduces the number of parameters by applying rank reduction to the variance-covariance matrix. We show, through simulation and using real microbiome data, that RRMM improves the precision of effect size estimates (i.e., estimates of the magnitude of the difference in taxon abundance between experimental conditions) relative to standard methods such as the models implemented in the DESeq2 and NBZIMM R packages. We extended RRMM to longitudinal microbiome designs, developing the Longitudinal Reduced Rank Mixed Model (LRRMM) (Chapter 6). LRRMM jointly analyzes all taxa in a longitudinal study and models correlation and changes over time. Analyses of real and simulated data demonstrate that LRRMM improves the precision of effect size estimates relative to the models implemented in the NBZIMM package, which model individual taxa separately. Together, these contributions enhance the methodological foundation for microbiome research, offering methods for simulation, power analysis, and modeling that accounts for correlations between taxa within subjects in microbiome data.
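Illustrative note (not part of the thesis abstract): the parameter savings from rank reduction can be sketched with simple counting. The snippet below assumes a factor-analytic form in which the p x p covariance matrix is built from a p x d loading matrix whose upper triangle is fixed at zero, giving p*d - d(d-1)/2 free parameters. This is one common reduced-rank parameterization; the exact form used by RRMM may differ, and the function names and the choice d = 2 are hypothetical.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unstructured p x p covariance matrix."""
    return p * (p + 1) // 2


def reduced_rank_params(p: int, d: int) -> int:
    """Free parameters in a rank-d factor-analytic covariance.

    Assumes a p x d loading matrix with its upper triangle fixed at
    zero for identifiability (an assumption, not taken from the thesis).
    """
    if not 0 < d <= p:
        raise ValueError("rank d must satisfy 0 < d <= p")
    return p * d - d * (d - 1) // 2


if __name__ == "__main__":
    # Hypothetical numbers of taxa; d = 2 is an arbitrary illustrative rank.
    for p in (100, 1000, 3000):
        print(p, full_cov_params(p), reduced_rank_params(p, 2))
```

Under these assumptions, the unstructured count grows quadratically in the number of taxa while the reduced-rank count grows linearly for a fixed rank, which is consistent with the "thousands or even millions" of parameters mentioned in the abstract.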
URI: http://hdl.handle.net/11375/31784
Appears in Collections:Open Access Dissertations and Theses

Files in This Item:
File: Agronah_Michael_2025May_PhD.pdf
Description: Open Access
Size: 19.24 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.