Compiling Source Descriptions for Efficient and Flexible Information Integration

Authors: Ambite J.L. (1); Knoblock C.A. (2); Muslea I. (3); Philpot A.G. (4)

Source: Journal of Intelligent Information Systems, Volume 16, Number 2, 4 March 2001, pp. 149-187 (39)

Publisher: Springer

Abstract:

Integrating data from heterogeneous data sources is a critical problem that has received a great deal of attention in recent years. There are two competing approaches to address this problem. The traditional approach, which first appeared in Multibase and more recently in HERMES and TSIMMIS, often called global-as-view, defines the global model as a view on the sources. A more recent approach, sometimes referred to as local-as-view or view rewriting, defines the sources as views on the global model. The disadvantage of the first approach is that a person must re-engineer the definitions of the global model whenever any of the sources change or when new sources are added. The view rewriting approach does not suffer from this drawback, but the problem of rewriting queries into equivalent plans using views is computationally hard and must be performed for each query at run-time.

In this paper we propose a hybrid approach that amortizes the cost of query processing over all queries by pre-compiling the source descriptions into a minimal set of integration axioms. Using this approach, the sources are defined in terms of the global model and then compiled into axioms that define the global model in terms of the sources. These axioms can be efficiently instantiated at run-time to determine the most appropriate rewriting to answer a query and facilitate traditional cost-based query optimization. Our approach combines the flexibility of the local-as-view approach with the run-time efficiency of the query processing in global-as-view systems. We have implemented this approach for the SIMS and Ariadne information mediators and provide empirical results that demonstrate that in practice the approach scales to large numbers of sources and that the approach can compile the axioms for a variety of real-world domains in a matter of seconds.
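As a rough illustration of the compile-once, look-up-at-run-time idea, the sketch below inverts a handful of local-as-view style source descriptions into a table keyed by global relation. It is a deliberately simplified toy and not the paper's compilation algorithm: each source is assumed to be a plain projection of a single global relation (no joins or constraints), and every name in it (`SourceDescription`, `compile_axioms`, `answer_plans`, `src_a`, `airport`, and so on) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDescription:
    """Local-as-view style entry: a source described as a view on one global relation."""
    source: str
    global_relation: str
    attributes: frozenset  # global attributes the source exposes (assumed projection only)


def compile_axioms(descriptions):
    """Invert the descriptions once: global relation -> sources able to populate it."""
    axioms = {}
    for d in descriptions:
        axioms.setdefault(d.global_relation, []).append(d)
    return axioms


def answer_plans(axioms, relation, wanted):
    """Run-time lookup: names of sources whose attributes cover the requested ones."""
    wanted = frozenset(wanted)
    return [d.source for d in axioms.get(relation, []) if wanted <= d.attributes]


# Hypothetical source descriptions, written in terms of the global model.
descriptions = [
    SourceDescription("src_a", "airport", frozenset({"code", "name"})),
    SourceDescription("src_b", "airport", frozenset({"code", "name", "city"})),
    SourceDescription("src_c", "flight", frozenset({"number", "origin"})),
]

axioms = compile_axioms(descriptions)  # done once, ahead of any query
print(answer_plans(axioms, "airport", {"code", "city"}))  # ['src_b']
```

Adding a source here means appending one description and recompiling, while each query only performs a dictionary lookup and a subset check, which mirrors the trade-off the abstract describes in a much reduced form.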

Keywords: information integration; information mediators; axiom compilation; heterogeneous data sources; SIMS; Ariadne

Language: English

Document Type: Regular paper

Affiliations:
1: Information Sciences Institute, Integrated Media Systems Center, and Department of Computer Science, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292. ambite@isi.edu
2: Information Sciences Institute, Integrated Media Systems Center, and Department of Computer Science, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292. knoblock@isi.edu
3: Information Sciences Institute, Integrated Media Systems Center, and Department of Computer Science, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292. muslea@isi.edu
4: Information Sciences Institute, Integrated Media Systems Center, and Department of Computer Science, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292. philpot@isi.edu

Publication date: 2001-03-04
