Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/31980
Title: Self-Supervised Masked Autoencoding Meets Federated Learning for Electric Vehicle Battery State-of-Health Estimation
Authors: Ismail, Mohanad
Advisor: Ahmed, Ryan
Department: Mechanical Engineering
Keywords: Electric Vehicle; State-of-Health Estimation; Self-Supervised Learning; Masked Autoencoding; Federated Learning; Cloud Computing; Edge Computing; Fine-Tuning Strategies; Masking Ratio Optimization; Data Scarcity and Heterogeneity; Privacy-Preserving Machine Learning
Publication Date: 2025
Abstract: EVs live and die by their batteries. To keep drivers safe and confident in their vehicles, we need efficient, accurate, and private ways to track each battery's state of health (SoH). But labelled EV data is scarce, sharing raw data raises privacy concerns, and large models strain on-board hardware. This thesis tackles all three problems at once through a two-step remedy (sketched in code below):
  1. Learn data representations without labels: each car trains a small autoencoder to reconstruct its own collected sensor data after randomly hiding parts of the signal.
  2. Share knowledge, not data: instead of uploading raw data, every car sends only its trained model parameters to a remote cloud server; the server aggregates the parameters from all cars and sends the improved model back.
Four questions guide the work:
  1. Does this use of unlabelled data improve the model's performance?
  2. How much of the signal should be hidden to learn the best representations?
  3. What is the best strategy for incorporating the limited labelled data into the model?
  4. Does aggregating separately trained models hurt accuracy compared with fully centralized training?
Our experiments show a 17% lower average MAE, with up to a 60% improvement in the best cases, when the available unlabelled data is used rather than training exclusively on labelled data. Hiding 30-40% of the signal strikes the best balance between challenge and clarity. Finally, model aggregation stays within 0.05 Ah of centralized training on average, virtually no loss in accuracy, with zero raw-data exposure. The thesis combines cloud computing, self-supervised learning (SSL), and federated learning (FL) into a lightweight, privacy-friendly pipeline for fleet-wide SoH estimation; it offers evidence that unfrozen fine-tuning outperforms frozen variants, the first systematic look at how the masking ratio shapes battery time-series representation learning, and practical proof that sharing model weights instead of data leaves accuracy essentially untouched and privacy intact.
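The two steps above lend themselves to a compact illustration. The following is a minimal sketch, not the thesis's implementation: the 1D-convolutional architecture, channel counts, window length, masking scheme, and the plain parameter-averaging rule (FedAvg-style) are all assumptions made for the example.

    import copy
    import torch
    import torch.nn as nn

    class MaskedAutoencoder(nn.Module):
        """Step 1: reconstruct sensor windows after randomly hiding time steps."""
        def __init__(self, n_channels: int = 4, hidden: int = 32):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv1d(n_channels, hidden, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
            )
            self.decoder = nn.Conv1d(hidden, n_channels, kernel_size=5, padding=2)

        def forward(self, x: torch.Tensor, mask_ratio: float = 0.35) -> torch.Tensor:
            # Hide a random fraction of time steps (30-40% worked best per the thesis).
            keep = (torch.rand(x.size(0), 1, x.size(2), device=x.device) > mask_ratio).float()
            recon = self.decoder(self.encoder(x * keep))
            hidden = (1.0 - keep).expand_as(x)
            # Reconstruction loss is computed only on the hidden positions.
            return ((recon - x) ** 2 * hidden).sum() / hidden.sum().clamp(min=1.0)

    def fedavg(client_states):
        """Step 2: the server averages client parameters; raw data never leaves the car."""
        avg = copy.deepcopy(client_states[0])
        for key in avg:
            avg[key] = torch.stack([s[key].float() for s in client_states]).mean(dim=0)
        return avg

    # One communication round with three simulated vehicles.
    global_model = MaskedAutoencoder()
    client_states = []
    for _ in range(3):
        local = copy.deepcopy(global_model)
        opt = torch.optim.Adam(local.parameters(), lr=1e-3)
        loss = local(torch.randn(8, 4, 128))  # stand-in for real sensor windows
        opt.zero_grad()
        loss.backward()
        opt.step()
        client_states.append(local.state_dict())
    global_model.load_state_dict(fedavg(client_states))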
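The abstract's contrast between unfrozen and frozen fine-tuning comes down to whether the pretrained encoder's parameters keep learning once labels arrive. A hypothetical snippet, with stand-in encoder and SoH-head modules (the real architectures are in the thesis):

    import torch
    import torch.nn as nn

    # Stand-ins for the pretrained encoder and a small SoH regression head.
    encoder = nn.Sequential(nn.Conv1d(4, 32, kernel_size=5, padding=2), nn.ReLU())
    head = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1))

    def trainable_params(freeze_encoder: bool):
        # Frozen variant: the pretrained encoder becomes a fixed feature extractor
        # and only the head learns from the scarce labelled data.
        # Unfrozen variant (the better performer): labels update everything.
        for p in encoder.parameters():
            p.requires_grad = not freeze_encoder
        params = list(encoder.parameters()) + list(head.parameters())
        return [p for p in params if p.requires_grad]

    opt = torch.optim.Adam(trainable_params(freeze_encoder=False), lr=1e-4)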
URI: http://hdl.handle.net/11375/31980
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: ismail_mohanad_2025july_masc.pdf (Open Access)
Size: 15.74 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
