Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/27069
Title: Methods to Simulate Correlated Binomial Random Variables
Authors: Lai, Winfield
Advisor: Canty, Angelo; Davies, Katherine
Department: Mathematics and Statistics
Keywords: Statistics;Binomial;Correlation;Simulation;Multivariate
Publication Date: 2021
Abstract: Single nucleotide polymorphisms (SNPs) have been used to describe a person's risk of developing diseases. Simulating a collection of d correlated autosomal biallelic SNPs is useful for obtaining empirical results for statistical tests in settings such as small sample sizes. Such a collection can be modeled as a random vector X = (X_1, ..., X_d), where X_i ∼ binomial(2, p_i) and p_i is the minor allele frequency of the ith SNP. The pairwise correlations between the components of X can be specified by a d × d symmetric positive definite correlation matrix whose diagonal entries all equal one. Two versions of a novel method to simulate X are developed in this thesis: one generates correlated binomials directly, and the other generates correlated Bernoulli random vectors and sums them componentwise. Two existing methods to simulate X are also discussed and implemented. In particular, a method involving the multivariate normal by Madsen and Birkes (2013) is compared to our novel methods for d ≥ 3. Our novel binomial method has a different variance for the Fisher-transformed sample correlation than the other two methods. Overall, if the target pairwise correlations are smaller than the lowest possible upper bound and the number of SNPs is low, then our novel Bernoulli method works best, since it is faster than the Madsen and Birkes method and has comparable variability and bias for the sample correlation.
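The idea of summing correlated Bernoulli vectors componentwise can be illustrated for the simplest case, d = 2. The sketch below is not the thesis's algorithm (the R code for that is in the attached file); it is a minimal illustration that assumes the joint distribution of a Bernoulli pair is fixed by its marginals and correlation, and that adding two independent such pairs gives binomial(2, p_i) marginals with the same correlation. The function names and the parameter values in the usage note are hypothetical.

```python
import math
import random


def bernoulli_pair_probs(p1, p2, rho):
    """Joint pmf of (B1, B2) with Bernoulli(p1), Bernoulli(p2) marginals
    and correlation rho. Raises ValueError if rho is not attainable."""
    p11 = p1 * p2 + rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    # Frechet bounds on P(B1 = 1, B2 = 1); these limit the feasible rho.
    lo, hi = max(0.0, p1 + p2 - 1.0), min(p1, p2)
    if not lo <= p11 <= hi:
        raise ValueError("rho is outside the feasible range for these marginals")
    return {
        (1, 1): p11,
        (1, 0): max(0.0, p1 - p11),
        (0, 1): max(0.0, p2 - p11),
        (0, 0): max(0.0, 1.0 - p1 - p2 + p11),
    }


def simulate_binomial2_pair(p1, p2, rho, n, rng):
    """Draw n pairs (X1, X2), each the componentwise sum of two
    independent correlated Bernoulli pairs, so X_i ~ binomial(2, p_i)."""
    probs = bernoulli_pair_probs(p1, p2, rho)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    draws = rng.choices(outcomes, weights=weights, k=2 * n)
    return [
        (draws[2 * i][0] + draws[2 * i + 1][0],
         draws[2 * i][1] + draws[2 * i + 1][1])
        for i in range(n)
    ]


def sample_correlation(pairs):
    """Pearson sample correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher transformation of a correlation coefficient."""
    return math.atanh(r)
```

For example, `simulate_binomial2_pair(0.3, 0.4, 0.25, 20000, random.Random(0))` returns 20000 pairs with values in {0, 1, 2}, and `fisher_z(sample_correlation(pairs))` gives the Fisher-transformed sample correlation for that draw. For d ≥ 3 the joint Bernoulli distribution is no longer determined by the marginals and pairwise correlations alone, which is where this simple construction stops being sufficient.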
URI: http://hdl.handle.net/11375/27069
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: Lai_Winfield_2021September_MScStatistics.pdf (Open Access)
Description: Thesis file
Size: 689.74 kB
Format: Adobe PDF

File: Lai_Winfield_2021September_MScStatistics_RCode.txt (Open Access)
Description: R code found in the appendix of the thesis
Size: 36.43 kB
Format: Text


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
