
Quantifying Trust in Deep Learning Ultrasound Models by Investigating Hardware and Operator Variance

dc.contributor.advisor: Noseworthy, Michael
dc.contributor.author: Zhu, Calvin
dc.contributor.department: Biomedical Engineering
dc.date.accessioned: 2021-10-01T19:41:06Z
dc.date.available: 2021-10-01T19:41:06Z
dc.date.issued: 2021
dc.description.abstract: Ultrasound (US) is the most widely used medical imaging modality due to its low cost, portability, real-time imaging capability, and use of non-ionizing radiation. However, unlike other imaging modalities such as CT or MRI, it is heavily operator dependent, requiring trained expertise to leverage these benefits. Recently, there has been an explosion of interest in artificial intelligence (AI) across the medical community, and many are turning to deep learning (DL) models to assist in diagnosis. However, deep learning models do not perform as well when their training data are not fully representative of the problem. When the data encountered at deployment differ from the training data, model performance suffers, which can lead to misdiagnosis. This issue is known as dataset shift. Two aims were proposed to address dataset shift. The first was to quantify how US operator skill and hardware affect acquired images. The second was to use this skill quantification method to screen and match data to deep learning models to improve performance. A BLUE phantom from CAE Healthcare (Sarasota, FL) with various mock lesions was scanned by three operators using three different US systems (Siemens S3000, Clarius L15, and Ultrasonix SonixTouch), producing 39,013 images. DL models were trained on a specific set to classify the presence of a simulated tumour and tested with data from differing sets. The Xception, VGG19, and ResNet50 architectures were used to test these effects across varying model frameworks. K-Means clustering was used to separate images into clusters by operator and hardware. The clustering algorithm was then used to screen incoming images during deployment and match each input to the DL model trained specifically on data from that operator or hardware. Results showed a noticeable difference when models were given data from differing datasets, with the largest accuracy drop being from 81.26% to 31.26%. Overall, operator differences affected DL model performance more significantly. Clustering models were much more successful at separating hardware data than operator data. The proposed method reflects this result, with much higher accuracy on the hardware test set than on the operator data.
dc.description.degree: Master of Applied Science (MASc)
dc.description.degreetype: Thesis
dc.identifier.uri: http://hdl.handle.net/11375/26951
dc.language.iso: en
dc.subject: Ultrasound, Deep Learning, Machine Learning, K-Means Clustering, Trust, Hardware Variance, Operator Variance
dc.title: Quantifying Trust in Deep Learning Ultrasound Models by Investigating Hardware and Operator Variance
dc.type: Thesis
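The abstract describes a deployment-time screening step: K-Means assigns an incoming image to an operator/hardware cluster, and the image is then classified by the DL model trained on data from that cluster. The following is a minimal sketch of that routing idea, not the thesis code; the file names, the cluster count, and the route_and_classify helper are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans
from tensorflow.keras.models import load_model

# Illustrative: fit K-Means on flattened training images so that each cluster
# roughly corresponds to one hardware/operator group (hypothetical data file).
train_images = np.load("train_images.npy")  # assumed shape (N, H, W, C)
kmeans = KMeans(n_clusters=3, random_state=0).fit(
    train_images.reshape(len(train_images), -1)
)

# Hypothetical per-cluster classifiers, each trained only on its cluster's images
# (e.g. Xception, VGG19, or ResNet50 backbones, as in the thesis).
per_cluster_models = {
    0: load_model("model_cluster0.h5"),
    1: load_model("model_cluster1.h5"),
    2: load_model("model_cluster2.h5"),
}

def route_and_classify(image: np.ndarray) -> np.ndarray:
    """Assign an incoming image to its nearest cluster, then classify it
    (simulated tumour present / absent) with that cluster's model."""
    cluster = int(kmeans.predict(image.reshape(1, -1))[0])
    return per_cluster_models[cluster].predict(image[np.newaxis, ...])

Routing in this way only helps to the extent that the clusters actually separate the acquisition conditions, which is consistent with the reported result that the approach performed better on hardware differences than on operator differences.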

Files

Original bundle

Name: ZHU_CALVIN_2021SEPT_MASc.pdf
Size: 20.26 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission