Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/27510
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Fei, Chiang | - |
dc.contributor.author | Zheng, Zheng | - |
dc.date.accessioned | 2022-05-03T14:12:30Z | - |
dc.date.available | 2022-05-03T14:12:30Z | - |
dc.date.issued | 2022 | - |
dc.identifier.uri | http://hdl.handle.net/11375/27510 | - |
dc.description.abstract | This thesis tackles three problems in two data quality dimensions: data currency and data consistency. We first address the problem of estimating data currency in a relational database, and we argue that data currency is a relative notion that depends on an individual’s update pattern. This pattern can vary spatially and temporally among individuals. By learning the patterns from an update history, we present a probabilistic system for identifying and cleaning stale values. Secondly, we explore the problem of estimating data currency in distributed database settings. Replicas of the same data item often exhibit varying consistency levels when executing read and write requests due to system availability and network limitations. When one or more replicas respond to a query, estimating the currency of the returned data item is essential for applications requiring timely data. Depending on how confident the estimation is, the query may dynamically decide to return the retrieved replicas or wait for the remaining replicas to respond. We present a system that accurately estimates whether the retrieved replicas are current or stale, and guarantees that the estimation satisfies a user-given confidence threshold. Using this confidence-bounded replica currency estimation, we implement a novel DYNAMIC consistency level in the open-source NoSQL database Cassandra. Finally, we tackle the problem of resolving inconsistencies in a database. Data consistency is often measured by whether the data adheres to a set of data quality rules. Recent work has proposed a new class of data quality rules that considers the data semantics with respect to (w.r.t.) an ontology. As the data evolves w.r.t. these rules and ontology, we propose a system to re-align and repair the data and the ontology w.r.t. these ontological rules. | en_US |
dc.language.iso | en | en_US |
dc.title | Enforcement and Refinement of Data Currency and Data Consistency | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Computing and Software | en_US |
dc.description.degreetype | Thesis | en_US |
dc.description.degree | Candidate in Philosophy | en_US |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Zheng_Zheng_202204_PhD.pdf | | 5.19 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.