Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/27880
Title: | Computational inference and prediction in public health |
Authors: | Cygu, Steve Bicko |
Advisor: | Dushoff, Jonathan |
Department: | Computational Engineering and Science |
Keywords: | Machine learning;Public health;Cancer;Survival data |
Publication Date: | 2022 |
Abstract: | Computational approaches that draw on large datasets are an important means for institutions to identify strategies for improving public health. The art in such approaches, for example in health research, lies in managing the trade-off between two perspectives: inference and prediction. Many techniques from statistical methods (SM) and machine learning (ML) may, in principle, be used for both. However, SM has a well-established focus on inference: it builds probabilistic models that yield a quantitative measure of confidence about the magnitude of an effect. Simulation-based validation approaches can be used in conjunction with SM to explicitly verify assumptions and, if necessary, redefine the specified model. ML, on the other hand, uses general-purpose algorithms to find patterns that best predict the outcome, makes minimal assumptions about the data-generating process, and may be more effective in a number of situations. My work employs both SM- and ML-based computational approaches to investigate particular public health problems. Chapter One provides philosophical background and compares the application of the two approaches in public health. Chapter Two describes and implements penalized Cox proportional hazards models for time-to-event data with time-varying covariates. Chapter Three applies traditional survival models and machine learning algorithms to predict survival times of cancer patients while incorporating information about the time-varying covariates. Chapter Four discusses and implements various approaches for computing predictions and effects for generalized linear (mixed) models. Finally, Chapter Five implements and compares various statistical models for handling univariate and multivariate binary outcomes for water, sanitation and hygiene (WaSH) data. |
URI: | http://hdl.handle.net/11375/27880 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
cygu_steve_bicko_finalsubmission2022september_phd.pdf | | 1.59 MB | Adobe PDF
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.