Modified Silhouette Score with Generalized Mean and Trimmed Mean

dc.contributor.advisor: McNicholas, Paul
dc.contributor.author: Zhang, Yiran
dc.contributor.department: Mathematics and Statistics (en_US)
dc.date.accessioned: 2023-09-14T18:24:48Z
dc.date.available: 2023-09-14T18:24:48Z
dc.date.issued: 2023
dc.description.abstract: The silhouette score is a widely used technique to evaluate the quality of a clustering result. One of the current issues with the silhouette score is its sensitivity to outliers, which can lead to misleading interpretations. This problem is caused by the silhouette score using the arithmetic mean to calculate the average intra- and inter-cluster distances. To address this issue, three modified silhouette scores are presented: GenSil, TrimSil, and extended TrimSil, which replace the arithmetic mean with the generalized mean, the trimmed mean, and a modified trimmed mean, respectively. Experiments on both simulated and real-world datasets show that GenSil is the most effective method, significantly reducing the impact of outliers and achieving high silhouette scores with negative parameter values. TrimSil also improves silhouette scores but performs worse than GenSil, while the extended TrimSil outperforms TrimSil but is still less effective than GenSil. To further aid in selecting the optimal number of clusters with these modified silhouette scores, a more straightforward visualization technique, the silhouette-parameter plot, is also introduced. (en_US)
dc.description.degree: Master of Science (MSc) (en_US)
dc.description.degreetype: Thesis (en_US)
dc.identifier.uri: http://hdl.handle.net/11375/28888
dc.language.iso: en (en_US)
dc.subject: Silhouette Score (en_US)
dc.subject: Clustering (en_US)
dc.subject: Generalized Mean (en_US)
dc.subject: Trimmed Mean (en_US)
dc.title: Modified Silhouette Score with Generalized Mean and Trimmed Mean (en_US)
dc.type: Thesis (en_US)
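
The abstract describes swapping the arithmetic mean inside the silhouette score for a generalized or trimmed mean. The sketch below is one plausible reading of that idea, not the thesis's own definitions of GenSil or TrimSil, which this record does not give. The function names, the symmetric trimming rule, the Euclidean distance, and the convention of scoring singleton clusters as 0 are all assumptions made for illustration.

```python
import numpy as np


def generalized_mean(x, p):
    """Power mean of positive values; p=1 is arithmetic, p=0 geometric, p=-1 harmonic."""
    x = np.asarray(x, dtype=float)
    with np.errstate(divide="ignore"):  # zero distances give the limiting value 0 for p <= 0
        if p == 0:
            return float(np.exp(np.mean(np.log(x))))
        return float(np.mean(x ** p) ** (1.0 / p))


def trimmed_mean(x, frac):
    """Mean after dropping a fraction `frac` (< 0.5) of values from each end (assumed rule)."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(frac * x.size)
    return float(np.mean(x[k:x.size - k]))


def silhouette(X, labels, mean_fn=np.mean):
    """Average silhouette score, with `mean_fn` used to average the distances."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances (assumption: the record does not name a metric).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        own[i] = False  # exclude the point itself from its own cluster
        if not own.any():
            continue  # singleton cluster: score left at 0 by convention
        a = mean_fn(D[i, own])
        b = min(mean_fn(D[i, labels == c])
                for c in np.unique(labels) if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())
```

With the default `mean_fn` this reduces to the ordinary silhouette score. Passing `mean_fn=lambda d: generalized_mean(d, -1)` or `mean_fn=lambda d: trimmed_mean(d, 0.1)` gives the modified variants under the assumptions above. The parameter values here are arbitrary examples.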

Files

Original bundle

Name: Zhang_Yiran_2023Sep_MSc.pdf
Size: 1.12 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission