Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/31458
Full metadata record
DC Field | Value | Language
dc.contributor.author | Wheat, Lesley | -
dc.contributor.author | Mohrenschildt, Martin V. | -
dc.contributor.author | Habibi, Saeid | -
dc.contributor.author | Al-Ani, Dhafar | -
dc.date.accessioned | 2025-04-01T16:16:58Z | -
dc.date.available | 2025-04-01T16:16:58Z | -
dc.date.issued | 2024-11-13 | -
dc.identifier | 10.1109/ACCESS.2024.3497716 | -
dc.identifier.issn | 10.1109/ACCESS.2024.3497716 | -
dc.identifier.uri | http://hdl.handle.net/11375/31458 | -
dc.description.abstract | Bearing fault diagnosis is a well-developed field and an active area of research in which the combination of model-free machine learning techniques with vibration data has become a popular approach. However, vibration data from rotating machines can contain domain shifts beyond the causes accepted in this research area (different part models, operating conditions and sensor locations), and these shifts can enable data leakage between training and test datasets. To demonstrate the impact of data leakage, six common bearing diagnosis methods are applied to two datasets using three data splitting methods, and the resulting classification performance is compared. Diagnosis is performed using Principal Component Analysis (PCA), Supervised Principal Component Analysis (SPCA) and Linear Discriminant Analysis (LDA) in combination with frequency analysis and envelope analysis feature extraction methods. Datasets from McMaster University and Paderborn University are used as experimental data sources, and produce vastly differing results (over a 40% drop in accuracy) depending on the selected dataset splitting method, revealing a previously unknown domain shift. Despite strong results for diagnosis methods using frequency response analysis on the data from McMaster, these results are not expected to generalize due to possible data leakage. Out of fifty-five previous works using the Paderborn dataset, ten are identified as likely to be affected and only six properly address the problem. Recommendations are given for future experiment design, model creation and model evaluation. | en_US
dc.description.sponsorship | This work was supported in part by FedDev Ontario project 814996, Natural Sciences and Engineering Research Council of Canada (NSERC) Create project CREAT-482038-2016 and D&V-NSERC Alliance project ALLRP-549016-2019. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IEEE Access | en_US
dc.subject | Machine Learning | en_US
dc.subject | Vibration | en_US
dc.subject | Fault Diagnosis | en_US
dc.subject | Data Leakage | en_US
dc.subject | Domain Shift | en_US
dc.subject | Rotating Machines | en_US
dc.title | Impact of Data Leakage in Vibration Signals Used for Bearing Fault Diagnosis | en_US
dc.type | Article | en_US
dc.contributor.department | Computing and Software | en_US
Appears in Collections: Student Publications (Not Graduate Theses)
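
The abstract attributes the accuracy drop to the choice of dataset splitting method, but this record does not name the three methods the authors compared. As a generic illustration only, the sketch below contrasts a segment-level random split with a split that holds out whole recordings. The `recording` field, the synthetic dataset, and the function names are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def random_split(samples, test_fraction, seed=0):
    """Shuffle individual segments, ignoring which recording they came from."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def group_split(samples, test_fraction, seed=0):
    """Hold out whole recordings so none contributes to both sets."""
    by_group = defaultdict(list)
    for s in samples:
        by_group[s["recording"]].append(s)
    groups = sorted(by_group)
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, int(round(len(groups) * test_fraction)))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if s["recording"] not in test_groups]
    test = [s for s in samples if s["recording"] in test_groups]
    return train, test


def shared_recordings(train, test):
    """Recordings that appear on both sides of a split (potential leakage)."""
    return {s["recording"] for s in train} & {s["recording"] for s in test}


# Hypothetical dataset: 8 recordings, each cut into 50 segments.
samples = [
    {"recording": r, "segment": i, "label": r % 2}
    for r in range(8)
    for i in range(50)
]

train_r, test_r = random_split(samples, 0.25)
train_g, test_g = group_split(samples, 0.25)

print("random split, shared recordings:", len(shared_recordings(train_r, test_r)))
print("group split, shared recordings:", len(shared_recordings(train_g, test_g)))
```

With a segment-level shuffle, segments from the same recording land in both training and test sets, so a classifier can score well by recognizing recording-specific traits instead of the fault. Holding out whole recordings removes that overlap by construction.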

Files in This Item:
File | Description | Size | Format
Impact_of_Data_Leakage_in_Vibration_Signals_Used_for_Bearing_Fault_Diagnosis.pdf | Open Access | 3.11 MB | Adobe PDF


This item is licensed under a Creative Commons License.