Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/11352
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Beyene, Joseph | en_US
dc.contributor.advisor | Narayanaswamy Balakrishnan, Aaron Childs | en_US
dc.contributor.author | Yang, Xiao Di | en_US
dc.date.accessioned | 2014-06-18T16:54:23Z | -
dc.date.available | 2014-06-18T16:54:23Z | -
dc.date.created | 2011-09-28 | en_US
dc.date.issued | 2011-10 | en_US
dc.identifier.other | opendissertations/6325 | en_US
dc.identifier.other | 7377 | en_US
dc.identifier.other | 2262620 | en_US
dc.identifier.uri | http://hdl.handle.net/11375/11352 | -
dc.description.abstract | With the advance of technology, the collection and storage of data has become routine. Huge amounts of data are increasingly produced from biological experiments. The advent of DNA microarray technologies has enabled scientists to measure expressions of tens of thousands of genes simultaneously. Single nucleotide polymorphisms (SNPs) are being used in genetic association studies with a wide range of phenotypes, for example, complex diseases. These high-dimensional problems are becoming more and more common. The "large p, small n" problem, in which there are more variables than samples, is currently a challenge that many statisticians face. The penalized variable selection method is an effective way to deal with the "large p, small n" problem. In particular, the Lasso (least absolute shrinkage and selection operator) proposed by Tibshirani has become an effective method for this type of problem. The Lasso works well for covariates that can be treated individually. When the covariates are grouped, it does not work well. Elastic net, group lasso, group MCP and group bridge are extensions of the Lasso. Group lasso enforces sparsity at the group level, rather than at the level of the individual covariates. Group bridge and group MCP produce sparse solutions both at the group level and at the level of the individual covariates within a group. Our simulation study shows that the group lasso forces complete grouping, group MCP encourages grouping to a rather slight extent, and group bridge is somewhere in between. If one expects the proportion of nonzero group members to be greater than one-half, group lasso may be a good choice; otherwise group MCP would be preferred. If one expects this proportion to be close to one-half, one may wish to use group bridge. A real data analysis example is also conducted for genetic variation (SNP) data to find the associations between SNPs and West Nile disease. | en_US
dc.subject | Lasso | en_US
dc.subject | High-Dimensional | en_US
dc.subject | Penalized Variable Selection Methods | en_US
dc.subject | Applied Statistics | en_US
dc.subject | Biostatistics | en_US
dc.subject | Statistical Models | en_US
dc.title | STATISTICAL METHODS FOR VARIABLE SELECTION IN THE CONTEXT OF HIGH-DIMENSIONAL DATA: LASSO AND EXTENSIONS | en_US
dc.type | thesis | en_US
dc.contributor.department | Mathematics and Statistics | en_US
dc.description.degree | Master of Science (MSc) | en_US
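
Illustrative note: the abstract's contrast between the Lasso (sparsity at the level of individual covariates) and the group lasso (sparsity at the group level) can be seen in the shrinkage step each penalty induces. The sketch below is not taken from the thesis. It assumes the standard soft-thresholding and block soft-thresholding operators, omits the group-size weights that are often applied to the group lasso penalty, and uses made-up coefficients, groups, and a made-up threshold `lam`.

```python
import math


def soft_threshold(beta, lam):
    """Lasso-style shrinkage: each coefficient is shrunk toward zero on its own."""
    return [math.copysign(max(abs(b) - lam, 0.0), b) for b in beta]


def group_soft_threshold(beta, groups, lam):
    """Group-lasso-style shrinkage: each group's Euclidean norm is shrunk.

    `groups` is a list of index lists (hypothetical layout for this sketch).
    A group is either kept as a whole or set to zero as a whole.
    """
    out = [0.0] * len(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        # The scale factor is 0 whenever the group's norm is at most lam.
        scale = max(1.0 - lam / norm, 0.0) if norm > 0 else 0.0
        for j in idx:
            out[j] = scale * beta[j]
    return out


# Made-up example: two groups of two covariates each.
beta = [3.0, 0.2, -0.1, 0.3]
groups = [[0, 1], [2, 3]]

lasso_result = soft_threshold(beta, 0.5)
group_result = group_soft_threshold(beta, groups, 0.5)
```

With these inputs, the individual shrinkage zeroes the small coefficient inside the first group, while the group-wise shrinkage keeps both members of the first group nonzero and sets the whole second group to zero.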
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Size | Format
fulltext.pdf (Open Access) | 688.93 kB | Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
