Enforcement and Refinement of Data Currency and Data Consistency

dc.contributor.advisor: Fei, Chiang
dc.contributor.author: Zheng, Zheng
dc.contributor.department: Computing and Software (en_US)
dc.date.accessioned: 2022-05-03T14:12:30Z
dc.date.available: 2022-05-03T14:12:30Z
dc.date.issued: 2022
dc.description.abstract: This thesis focuses on tackling three problems in two data quality dimensions: data currency and data consistency. We first address the problem of estimating data currency in a relational database, and we argue that data currency is a relative notion that is dependent on an individual’s update pattern. This pattern can have spatial and temporal variance among individuals. By learning the patterns from an update history, we present a probabilistic system for identifying and cleaning stale values. Secondly, we explore the problem of estimating data currency in distributed database settings. Replicas of the same data item often exhibit varying consistency levels when executing read and write requests due to system availability and network limitations. When one or more replicas respond to a query, estimating the currency of the returned data item is essential for applications requiring timely data. Depending on how confident the estimation is, the query may dynamically decide to return the retrieved replicas, or wait for the remaining replicas to respond. We present a system that accurately estimates whether the retrieved replicas are current or stale, and guarantees that the estimation satisfies a user-given confidence threshold. Using this confidence-bounded replica currency estimation, we implement a novel DYNAMIC consistency level in the open-source NoSQL database Cassandra. Finally, we tackle the problem of resolving inconsistencies in a database. Data consistency is often measured by whether the data adheres to a set of data quality rules. Recent work has proposed a new class of data quality rules that considers the data semantics with respect to (w.r.t.) an ontology. As the data evolves w.r.t. these rules and ontology, we propose a system to re-align and repair the data and the ontology w.r.t. these ontological rules. (en_US)
dc.description.degree: Candidate in Philosophy (en_US)
dc.description.degreetype: Thesis (en_US)
dc.identifier.uri: http://hdl.handle.net/11375/27510
dc.language.iso: en (en_US)
dc.title: Enforcement and Refinement of Data Currency and Data Consistency (en_US)
dc.type: Thesis (en_US)

Files

Original bundle

Name: Zheng_Zheng_202204_PhD.pdf
Size: 5.07 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission
Description: