Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/29934
Title: TinyML Inference Enablement and Acceleration on Microcontrollers: The Case of Healthcare
Authors: Sun, Bailian
Advisor: Hassan, Mohamed
Department: Electrical and Computer Engineering
Publication Date: 2024
Abstract: Controlling high blood pressure (BP) can prevent more than half of the deaths caused by cardiovascular diseases (CVDs). Towards this target, continuous BP monitoring is a must. Existing Convolutional Neural Network (CNN)-based solutions rely on server-like infrastructure with huge computation and memory capabilities, which makes them impractical and raises several security, privacy, reliability, and latency concerns. To address these challenges, an alternative solution has emerged: deploying machine learning algorithms on tiny devices. The unprecedented boom in tinyML development also makes optimizing network inference strategies on resource-constrained microcontrollers (MCUs) highly relevant. The thesis makes three contributions. First, it contributes to the general field of tinyML by proposing novel techniques that enable the fitting of five popular CNNs - AlexNet, LeNet, SqueezeNet, ResNet, and MobileNet - into extremely constrained edge devices with limited computation, memory, and power budgets. The proposed techniques combine novel architecture modifications, pruning, and quantization methods. Second, building on this stepping stone, the thesis proposes a tinyML-based solution that enables accurate and continuous BP estimation using only photoplethysmogram (PPG) signals. Third, the thesis proposes several techniques to accelerate CNN inference. From the hardware perspective, we discuss architecture-aware accelerations based on cache and multi-core specifications; from the software perspective, we develop application-aware optimizations with an existing real-time compatible C library to maximize computation and intermediate buffer reuse. These solutions require only general MCU features and thus generalize broadly across various networks and devices.
We conduct an extensive evaluation using thousands of real Intensive Care Unit (ICU) patient records, several tiny edge devices, and all five of the aforementioned CNNs. Results show accuracy comparable to that of server-based solutions. The proposed acceleration strategies achieve up to a 71% reduction in inference latency.
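The abstract names pruning and quantization as ingredients for fitting CNNs into MCU memory but does not detail them. As a rough illustration only, the sketch below shows two generic textbook variants: magnitude pruning and symmetric per-tensor int8 quantization. It is not the thesis's method; the function names (`prune_by_magnitude`, `quantize_int8`, `dequantize`) and the flat list-of-floats weight layout are assumptions made for this example.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Generic magnitude pruning (assumed for illustration, not taken from the thesis).
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns (quantized_values, scale) such that weight ~= q * scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate floating-point weights."""
    return [v * scale for v in q]
```

In this sketch each weight is stored in one byte instead of four, and pruned weights become exact zeros that a sparse or compressed layout could skip; the reconstruction error of any single weight is bounded by half of `scale`.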
URI: http://hdl.handle.net/11375/29934
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: Sun_Bailian_202406_Master.pdf
Description: Open Access
Size: 1.39 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
