|Title:||Discovering Ontology Functional Dependencies|
|Department:||Computing and Software|
|Abstract:||Functional Dependencies (FDs) are commonly used in data cleaning to identify dirty and inconsistent data values. However, many errors require user input for specific domain knowledge. For example, consider the drugs Advil and Crocin. FDs treat these two drugs as different because they are not syntactically equal. However, Advil and Crocin are synonyms, as they are two different drugs with similar chemical compounds but marketed under distinct names in different countries. While FDs have traditionally been used in existing data cleaning solutions to model syntactic equivalence, they cannot model broader relationships (e.g., synonym, Is-A (inheritance)) defined by ontologies. In this thesis, we take a first step toward discovering a new class of dependencies called Ontology Functional Dependencies (OFDs). OFDs model attribute relationships based on relationships in a given ontology. We present two effective algorithms to discover OFDs using synonym and inheritance relationships. Our discovery algorithms search for minimal OFDs and prune redundant ones. Both algorithms traverse the search lattice in a level-wise, breadth-first search (BFS) manner. In addition, we have developed a set of pruning rules to avoid considering unnecessary candidates in the search lattice. We present an experimental study describing the performance and scalability of our techniques. Experimental results show that both algorithms are effective in practice and discover OFDs efficiently for large datasets with millions of tuples. We also present a qualitative study showing that the discovered OFDs are meaningful, with high precision and recall.|
|Appears in Collections:||Open Access Dissertations and Theses|
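The contrast the abstract draws between a syntactic FD and a synonym-based OFD can be illustrated with a minimal sketch. This is not the thesis's discovery algorithm: it only checks one candidate dependency, under the simplifying assumption that a synonym OFD holds when, within every group of tuples that agree on the left-hand side, the right-hand-side values share at least one common sense in the ontology. The function name `holds_synonym_ofd`, the attribute names, the placeholder value `DrugX`, and the sense identifiers `s1` and `s2` are all hypothetical and made up for illustration.

```python
from collections import defaultdict


def holds_synonym_ofd(rows, lhs, rhs, senses):
    """Check a single candidate dependency lhs -> rhs (simplified sketch).

    rows   : list of dicts, one per tuple
    lhs    : list of attribute names on the left-hand side
    rhs    : single attribute name on the right-hand side
    senses : hypothetical toy ontology, mapping a value to the set of
             sense identifiers under which it may be interpreted
    """
    # Group the right-hand-side values by the left-hand-side values.
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].add(row[rhs])

    for values in groups.values():
        common = None
        for v in values:
            # Assumption: a value missing from the ontology is only
            # equivalent to itself, which reduces to the plain FD check.
            s = senses.get(v, {v})
            common = set(s) if common is None else common & s
        if not common:
            return False  # no single sense covers every value in the group
    return True


# Toy data, invented for illustration only.
rows = [
    {"condition": "headache", "drug": "Advil"},
    {"condition": "headache", "drug": "Crocin"},
    {"condition": "other", "drug": "DrugX"},
]
senses = {"Advil": {"s1"}, "Crocin": {"s1"}, "DrugX": {"s2"}}

with_ontology = holds_synonym_ofd(rows, ["condition"], "drug", senses)
without_ontology = holds_synonym_ofd(rows, ["condition"], "drug", {})
```

With the toy ontology, the two drug names in the `headache` group share a sense, so the dependency holds. With an empty ontology, the check falls back to syntactic equality and the same tuples count as a violation.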
Files in This Item:
|Baskaran_Sridevi_201609_MScComputerScience.pdf||533.58 kB||Adobe PDF|
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.