Query and data mapping across heterogeneous information sources

January 2001

Publisher: Stanford University, 408 Panama Mall, Suite 217, Stanford, CA, United States

ISBN: 978-0-493-08549-4

Order Number: AAI3000016

Pages: 185

Abstract

The Internet has brought together information sources worldwide. Integrating such heterogeneous and autonomous sources is challenging because of their non-uniform query languages and data representations. To help users query different sources uniformly, we have developed an integration system, or mediator, for optimally mapping queries and data across disparate contexts. Such a translation technique is essential for many important applications that require querying sources and analyzing data on the web, such as meta-searching, e-commerce, and web mining. This thesis presents our solutions for the main functionalities of mediation: query translation, post-filtering, and data translation. First, the mediator must translate a user query for a source to execute. We develop a general approximate query mapping mechanism that finds the closest mappings under virtually any closeness criterion, such as minimal-superset, maximal-subset, or some hybrid scheme that combines both precision and recall. Furthermore, for the important special case of minimal-superset mapping (and its dual case, maximal-subset mapping), we present efficient algorithms that do not rely on query normal forms. Since the translation machinery relies on separately supplied rules for rewriting basic query constraints, we also develop algorithms for rewriting IR predicates commonly used for document retrieval. Second, because a translated query may return extra answers that do not match the original query, the mediator must perform post-filtering to remove them. We develop an algorithm for deriving the optimal filters that incur the least processing cost, and report experiments that quantify the worst-case costs (i.e., for superset mappings). Finally, to present the query results uniformly, the mediator must translate the native data retrieved from the external source. We adopt our general query mapping framework for data translation by modeling data as a set of conjunctive constraints. The machinery can deal with flat data as well as hierarchically structured information such as XML.
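
The interplay between superset query mapping and post-filtering described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it handles only flat conjunctive queries, and every name in it (`Constraint`, `Rule`, `translate`, `run`, the sample records and rules) is a hypothetical chosen for illustration. It assumes each user constraint has a hand-written rule giving a source constraint that is either equivalent to it or looser than it, and it re-applies at the mediator only those constraints that were not translated exactly.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Record = dict


@dataclass(frozen=True)
class Constraint:
    name: str
    test: Callable[[Record], bool]


@dataclass(frozen=True)
class Rule:
    # Constraint the source can evaluate; None if the source has no counterpart.
    source_constraint: Optional[Constraint]
    # True if the source constraint is equivalent to the user constraint.
    exact: bool


def translate(query: list[Constraint], rules: dict[str, Rule]):
    """Map a conjunctive user query to (source_query, post_filter).

    Assumes every non-exact rule is looser than the user constraint,
    so the source query returns a superset of the true answers.
    """
    source_query, post_filter = [], []
    for c in query:
        rule = rules.get(c.name, Rule(None, False))
        if rule.source_constraint is not None:
            source_query.append(rule.source_constraint)
        if not rule.exact:
            # The source may return false positives for c; recheck it later.
            post_filter.append(c)
    return source_query, post_filter


def run(query, rules, source_data):
    source_query, post_filter = translate(query, rules)
    # Simulate the source executing the translated query.
    superset = [r for r in source_data if all(c.test(r) for c in source_query)]
    # Mediator-side post-filtering removes the extra answers.
    return [r for r in superset if all(c.test(r) for c in post_filter)]


# Hypothetical source that supports keyword conjunction but not phrase search.
records = [
    {"title": "data mining on the web", "year": 2000},
    {"title": "mining web data", "year": 2000},
    {"title": "data mining basics", "year": 1999},
]

query = [
    Constraint("title_phrase", lambda r: "data mining" in r["title"]),
    Constraint("year_eq", lambda r: r["year"] == 2000),
]

rules = {
    # Phrase search is relaxed to "both words appear": a superset mapping.
    "title_phrase": Rule(
        Constraint("title_words",
                   lambda r: {"data", "mining"} <= set(r["title"].split())),
        exact=False,
    ),
    # Year equality is supported natively by the source.
    "year_eq": Rule(Constraint("year_eq", lambda r: r["year"] == 2000), exact=True),
}

source_query, post_filter = translate(query, rules)
answers = run(query, rules, records)
print(answers)
```

In this toy run the source returns the first two records, and the mediator's phrase check discards "mining web data". Only the relaxed constraint is rechecked, which is a simple stand-in for the cost-minimizing filter derivation the abstract mentions.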

Cited By

Tryfonopoulos C, Koubarakis M and Drougas Y (2009). Information filtering and query indexing for an information retrieval model, ACM Transactions on Information Systems, 27:2, (1-47), Online publication date: 1-Feb-2009.

Contributors

Héctor García-Molina
Stanford University
- Publication Years: 1979 - 2019
- Publication counts: 406
- Citation count: 21,676
- Available for Download: 217
- Downloads (cumulative): 219,724
- Downloads (12 months): 13,058
- Downloads (6 weeks): 2,118
- Average Downloads per Article: 1,013
- Average Citation per Article: 53
Kevin Chang
University of Illinois Urbana-Champaign
- Publication Years: 1996 - 2021
- Publication counts: 91
- Citation count: 3,907
- Available for Download: 65
- Downloads (cumulative): 52,299
- Downloads (12 months): 2,905
- Downloads (6 weeks): 352
- Average Downloads per Article: 805
- Average Citation per Article: 43

Recommendations

Boolean Query Mapping Across Heterogeneous Information Sources

Searching over heterogeneous information sources is difficult because of the nonuniform query languages. Our approach is to allow a user to compose Boolean queries in one rich front-end language. For each user query and target source, we transform the ...

Query Relaxation across Heterogeneous Data Sources
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

The fundamental assumption for query rewriting in heterogeneous environments is that the mappings used for the rewriting are complete, i.e., every relation and attribute mentioned in the query is associated, through mappings, to relations and attributes ...
An adaptive approach to query mediation across heterogeneous information sources
COOPIS '96: Proceedings of the First IFCIS International Conference on Cooperative Information Systems

The authors propose a query mediation framework to support customizable information gathering across heterogeneous and autonomous information sources. Instead of an integrated (and static) global schema, they propose an adaptive approach to ...
