Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/28888
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | McNicholas, Paul | -
dc.contributor.author | Zhang, Yiran | -
dc.date.accessioned | 2023-09-14T18:24:48Z | -
dc.date.available | 2023-09-14T18:24:48Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://hdl.handle.net/11375/28888 | -
dc.description.abstract | The silhouette score is a widely used technique to evaluate the quality of a clustering result. One of the current issues with the silhouette score is its sensitivity to outliers, which can lead to misleading interpretations. This problem is caused by the silhouette score using the arithmetic mean to calculate the average intra and inter-cluster distances. To address this issue, three modified silhouette scores are presented: GenSil, TrimSil, and extended TrimSil, which replace the arithmetic mean with the generalized mean, the trimmed mean and a modified trimmed mean, respectively. Experiments on both simulated and real-world datasets show that GenSil is the most effective method, significantly reducing the impact of outliers and achieving high silhouette scores with negative parameter values. TrimSil also improves silhouette scores but performs worse than GenSil, while the extended TrimSil outperforms TrimSil but is still less effective than GenSil. To further aid in selecting the optimal number of clusters with these modified silhouette scores, a more straightforward visualization technique, the silhouette-parameter plot, is also introduced. | en_US
dc.language.iso | en | en_US
dc.subject | Silhouette Score | en_US
dc.subject | Clustering | en_US
dc.subject | Generalized Mean | en_US
dc.subject | Trimmed Mean | en_US
dc.title | Modified Silhouette Score with Generalized Mean and Trimmed Mean | en_US
dc.type | Thesis | en_US
dc.contributor.department | Mathematics and Statistics | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Master of Science (MSc) | en_US
Appears in Collections:Open Access Dissertations and Theses
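
The idea described in the abstract, swapping the arithmetic mean inside the silhouette score for a generalized or trimmed mean, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the symmetric trimming rule, and the singleton-cluster convention are assumptions made for illustration, and the exact definitions of GenSil, TrimSil, and extended TrimSil should be taken from the thesis PDF. The sketch assumes at least two clusters and no duplicate points (so all distances are positive, which negative exponents require).

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def generalized_mean(values, p):
    """Power mean M_p of positive values; p = 1 gives the arithmetic mean."""
    if p == 0:
        # The limit p -> 0 is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def trimmed_mean(values, trim):
    """Mean after dropping a fraction `trim` from each end (assumed symmetric)."""
    ordered = sorted(values)
    k = int(trim * len(ordered))
    kept = ordered[k:len(ordered) - k] or ordered  # fall back if all trimmed
    return sum(kept) / len(kept)


def silhouette(points, labels, average=arithmetic_mean):
    """Mean silhouette width with a pluggable averaging function."""
    scores = []
    for i, (x, lab) in enumerate(zip(points, labels)):
        # Collect distances from point i to every other point, by cluster.
        by_cluster = {}
        for j, (y, other) in enumerate(zip(points, labels)):
            if j != i:
                by_cluster.setdefault(other, []).append(math.dist(x, y))
        if lab not in by_cluster:
            scores.append(0.0)  # singleton cluster: s(i) = 0 by convention
            continue
        a = average(by_cluster[lab])  # intra-cluster average
        b = min(average(d) for c, d in by_cluster.items() if c != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight clusters, with one outlier in cluster 0.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 9),
          (10, 0), (10, 1), (11, 0), (11, 1)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

print("arithmetic :", silhouette(points, labels))
print("power p=-1 :", silhouette(points, labels, lambda d: generalized_mean(d, -1)))
print("trimmed 25%:", silhouette(points, labels, lambda d: trimmed_mean(d, 0.25)))
```

Passing the averaging function as a parameter keeps the silhouette formula itself unchanged, so the three variants differ only in how the intra- and inter-cluster distances are summarized. Sweeping `p` or `trim` over a range and recording the score at each value would give the raw numbers for a plot of score against parameter, in the spirit of the silhouette-parameter plot the abstract mentions.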

Files in This Item:
File | Description | Size | Format
Zhang_Yiran_2023Sep_MSc.pdf | Open Access | 1.15 MB | Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
