Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/27510
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Fei, Chiang | -
dc.contributor.author | Zheng, Zheng | -
dc.date.accessioned | 2022-05-03T14:12:30Z | -
dc.date.available | 2022-05-03T14:12:30Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | http://hdl.handle.net/11375/27510 | -
dc.description.abstract | This thesis focuses on tackling three problems in two data quality dimensions: data currency and data consistency. We first address the problem of estimating data currency in a relational database, and we argue that data currency is a relative notion that is dependent on an individual’s update pattern. This pattern can have spatial and temporal variance among individuals. By learning the patterns from an update history, we present a probabilistic system for identifying and cleaning stale values. Secondly, we explore the problem of estimating data currency in distributed database settings. Replicas of the same data item often exhibit varying consistency levels when executing read and write requests due to system availability and network limitations. When one or more replicas respond to a query, estimating the currency of the returned data item is essential for applications requiring timely data. Depending on how confident the estimation is, the query may dynamically decide to return the retrieved replicas, or wait for the remaining replicas to respond. We present a system that accurately estimates whether the retrieved replicas are current or stale, and guarantees that the estimation satisfies a user-given confidence threshold. Using this confidence-bounded replica currency estimation, we implement a novel DYNAMIC consistency level in the open-source NoSQL database Cassandra. Finally, we tackle the problem of resolving inconsistencies in a database. Data consistency is often measured by whether the data adheres to a set of data quality rules. Recent work has proposed a new class of data quality rules that considers the data semantics with respect to (w.r.t.) an ontology. As the data evolves w.r.t. these rules and ontology, we propose a system to re-align and repair the data and the ontology w.r.t. these ontological rules. | en_US
dc.language.iso | en | en_US
dc.title | Enforcement and Refinement of Data Currency and Data Consistency | en_US
dc.type | Thesis | en_US
dc.contributor.department | Computing and Software | en_US
dc.description.degreetype | Thesis | en_US
dc.description.degree | Candidate in Philosophy | en_US
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File | Description | Size | Format
Zheng_Zheng_202204_PhD.pdf | Open Access | 5.19 MB | Adobe PDF

Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.