McMaster University Home Page
Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/12780
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Hamid, Jemila S | en_US
dc.contributor.advisor | N. Balakrishnan, R. Viveros | en_US
dc.contributor.author | Jana, Sayantee | en_US
dc.date.accessioned | 2014-06-18T17:00:43Z | -
dc.date.available | 2014-06-18T17:00:43Z | -
dc.date.created | 2012-12-20 | en_US
dc.date.issued | 2013-04 | en_US
dc.identifier.other | opendissertations/7638 | en_US
dc.identifier.other | 8699 | en_US
dc.identifier.other | 3552261 | en_US
dc.identifier.uri | http://hdl.handle.net/11375/12780 | -
dc.description.abstract | <p>Recent advances in technology have allowed researchers to collect high-dimensional biological data simultaneously. In genomic studies, for instance, measurements from tens of thousands of genes are taken from individuals across several experimental groups. In time course microarray experiments, gene expression is measured at several time points for each individual across the whole genome, resulting in a massive amount of data. In such experiments, researchers are faced with two types of high-dimensionality. The first is global high-dimensionality, which is common to all genomic experiments. Global high-dimensionality arises because inference is carried out on tens of thousands of genes, resulting in multiplicity. This challenge is often addressed with statistical methods for multiple comparisons, such as the Bonferroni correction or false discovery rate (FDR) control. We refer to the second type of high-dimensionality as gene-specific high-dimensionality, which arises in time course microarray experiments because, in such experiments, the sample size is often smaller than the number of time points ($n</p> <p>In this thesis, we use the growth curve model (GCM), which is a generalized multivariate analysis of variance (GMANOVA) model, and propose a moderated test statistic for testing a special case of the general linear hypothesis, which is especially useful for identifying genes that are expressed. We use the trace test for the GCM and modify it so that it can be used in high-dimensional situations. We consider two types of moderation: the Moore-Penrose generalized inverse and Stein's shrinkage estimator of $ S $. We performed extensive simulations to assess the performance of the moderated test and compared the results with those of the original trace test. We calculated the empirical level and power of the test under many scenarios. Although the focus is on hypothesis testing, we also provide a moderated maximum likelihood estimator for the parameter matrix, assess its performance by investigating the bias and mean squared error of the estimator, and compare the results with those of the maximum likelihood estimators. Since the parameters are matrices, we consider distance measures in both power and level comparisons as well as when investigating bias and mean squared error. We also illustrate our approach using time course microarray data taken from a study on lung cancer. We identified 1053 non-noise genes from a pool of 22,277 genes, which is approximately 5% of the total number of genes. This is consistent with results from most biological experiments, where around 5% of genes are found to be differentially expressed.</p> | en_US
dc.subject | Generalized multivariate analysis of variance | en_US
dc.subject | growth curve model | en_US
dc.subject | high-dimensional data | en_US
dc.subject | Euclidean distance | en_US
dc.subject | multivariate bias and mean square error | en_US
dc.subject | moderated trace test | en_US
dc.subject | Moore-Penrose generalized inverse | en_US
dc.subject | Biostatistics | en_US
dc.subject | Multivariate Analysis | en_US
dc.title | The Growth Curve Model for High Dimensional Data and its Application in Genomics | en_US
dc.type | thesis | en_US
dc.contributor.department | Mathematics and Statistics | en_US
dc.description.degree | Master of Science (MSc) | en_US
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Size | Format
fulltext.pdf (Open Access) | 938.43 kB | Adobe PDF

Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
