Database dependency discovery: a machine learning approach
Abstract:Database dependencies, such as functional and multivalued dependencies, express the presence of structure in database relations, that can be utilised in the database design process. The discovery of database dependencies can be viewed as an induction problem, in which general rules dependencies are obtained from specific facts the relation. This viewpoint has the advantage of abstracting away as much as possible from the particulars of the dependencies. The algorithms in this paper are designed such that they can easily be generalised to other kinds of dependencies.
Like in current approaches to computational induction such as inductive logic programming, we distinguish between top-down algorithms and bottom-up algorithms. In a top-down approach, hypotheses are generated in a systematic way and then tested against the given relation. In a bottom-up approach, the relation is inspected in order to see what dependencies it may satisfy or violate. We give a simple but inefficient top-down algorithm, a bi-directional algorithm, and a bottom-up algorithm. In the case of functional dependencies, these algorithms have been implemented in the FDEP system and evaluated experimentally. The bottom-up algorithm is the most efficient of the three, and also outperforms other algorithms from the literature.
Document Type: Research Article
Affiliations: 1: Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK E-mail: Peter.Flach@bristol.ac.uk 2: Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia E-mail: Iztok.Savnik@fri.uni-lj.si
Publication date: January 1, 1999