MacSphere, McMaster University
Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/32246
Title: BEST PRACTICES FOR DATA QUALITY ASSURANCE FOR HOSPITAL ELECTRONIC MEDICAL RECORD RESEARCH PLATFORMS
Authors: Schneider, Tyler
Advisor: Holbrook, Anne
Department: eHealth
Keywords: Electronic medical records;Data validation;Data quality;Epic;Pharmacoepidemiology
Publication Date: 2025
Abstract:
Background: Electronic medical records (EMRs) are a rich source of clinical data across many patients. However, the data must be high quality to be used for research. There is a paucity of information on the quality of Canadian hospital EMRs for research, and of comprehensive EMR data quality assessment checklists.
Purpose: This thesis aims to validate data from a leading Canadian hospital EMR, then use a scoping review to develop a survey for experts to rate items for inclusion in a checklist on assessing EMR data quality for research.
Methods: An entity relationship diagram (ERD) for key research data was created by navigating over 20,000 data tables. Data validation was completed iteratively, either by manual chart review or by comparison with Canadian Institute for Health Information Discharge Abstract Database (CIHI-DAD) data for agreement. A scoping review was conducted to identify data quality checklists or frameworks potentially relevant to EMR data quality, to be summarized and included in an online survey. Survey items will be rated on a 3-point scale, and content validity ratios will be calculated for inclusion in the resulting expert opinion checklist.
Results: The ERD showed 43 tables were used for key research data. We validated data across 5 themes: Demographics, Exposures, Outcomes, Potential Confounders, and Timestamping. Most items validated with over 95% agreement, but some diagnoses for Outcomes and Potential Confounders performed poorly, necessitating the use of linked CIHI-DAD data. For survey development, 533 potentially relevant checklist items were identified and summarized as 42 data quality items in the survey.
Conclusions: EMR data validation took many iterations to create an accurate ERD. Most key research data in the EMR had high agreement, but linked coded data is required for some diagnoses. Our survey will result in a comprehensive, expert opinion checklist of EMR data quality for research.
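The two quantities named in the abstract, percent agreement and the content validity ratio, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes Lawshe's formula for the content validity ratio, CVR = (n_e - N/2) / (N/2), and assumes that a rating of 3 on the 3-point scale means "essential"; the abstract states neither. The function names and sample values are hypothetical.

```python
from typing import Hashable, Sequence


def percent_agreement(emr_values: Sequence[Hashable],
                      reference_values: Sequence[Hashable]) -> float:
    """Percentage of records where the EMR value equals the reference value.

    The reference could be a manual chart review or a linked abstract
    database; pairing by position is an assumption of this sketch.
    """
    if len(emr_values) != len(reference_values):
        raise ValueError("both sequences must have the same length")
    if not emr_values:
        raise ValueError("at least one record is required")
    matches = sum(1 for e, r in zip(emr_values, reference_values) if e == r)
    return 100.0 * matches / len(emr_values)


def content_validity_ratio(ratings: Sequence[int],
                           essential_rating: int = 3) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2), ranging from -1 to 1.

    n_e is the number of experts giving the 'essential' rating and N is
    the panel size. Treating 3 as 'essential' is an assumption.
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one rating is required")
    n_essential = sum(1 for r in ratings if r == essential_rating)
    return (n_essential - n / 2) / (n / 2)


# Illustrative (made-up) data: 3 of 4 records agree; 8 of 10 experts rate 3.
agreement = percent_agreement(["M", "F", "F", "M"], ["M", "F", "F", "F"])
cvr = content_validity_ratio([3, 3, 3, 2, 1, 3, 3, 3, 3, 3])
print(f"agreement = {agreement:.1f}%, CVR = {cvr:.2f}")
```

Under Lawshe's approach, an item is retained when its CVR exceeds a critical value that depends on panel size; the threshold used in the thesis is not given in the abstract.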
URI: http://hdl.handle.net/11375/32246
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: Schneider_Tyler_M_2025Aug_eHealthMSc.pdf
Embargoed until: 2026-08-25
Size: 1.04 MB
Format: Adobe PDF


Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
