Data-driven understanding and refinement of schema mappings

Published: 01 May 2001

Abstract

At the heart of many data-intensive applications is the problem of quickly and accurately transforming data into a new form. Database researchers have long advocated the use of declarative queries for this process. Yet tools for creating, managing and understanding the complex queries necessary for data transformation are still too primitive to permit widespread adoption of this approach. We present a new framework that uses data examples as the basis for understanding and refining declarative schema mappings. We identify a small set of intuitive operators for manipulating examples. These operators permit a user to follow and refine an example by walking through a data source. We show that our operators are powerful enough both to identify a large class of schema mappings and to distinguish effectively between alternative schema mappings. These operators permit a user to quickly and intuitively build and refine complex data transformation queries that map one data source into another.
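As a hedged illustration of how data examples can distinguish between alternative mappings, the sketch below compares two hypothetical candidate mappings (an inner join and a left outer join) over made-up source relations and reports the target tuples on which they disagree. It is not the paper's operator set; all relation names, field names, and functions are assumptions introduced for this example.

```python
# Toy source relations; every name and value here is hypothetical.
employees = [
    {"name": "Ana", "dept_id": 1},
    {"name": "Raj", "dept_id": 2},
    {"name": "Li", "dept_id": None},  # no department recorded
]
departments = [
    {"dept_id": 1, "dept_name": "Sales"},
    {"dept_id": 2, "dept_name": "Research"},
]


def inner_join_mapping(emps, depts):
    """Candidate mapping A: keep only employees with a matching department."""
    by_id = {d["dept_id"]: d["dept_name"] for d in depts}
    return [(e["name"], by_id[e["dept_id"]]) for e in emps if e["dept_id"] in by_id]


def outer_join_mapping(emps, depts):
    """Candidate mapping B: keep every employee, padding a missing department with None."""
    by_id = {d["dept_id"]: d["dept_name"] for d in depts}
    return [(e["name"], by_id.get(e["dept_id"])) for e in emps]


def distinguishing_examples(emps, depts):
    """Return the target tuples produced by exactly one of the two candidate mappings."""
    a = set(inner_join_mapping(emps, depts))
    b = set(outer_join_mapping(emps, depts))
    return sorted(a ^ b, key=repr)


print(distinguishing_examples(employees, departments))  # prints [('Li', None)]
```

The single tuple on which the two candidates differ is the kind of example a user could inspect to decide whether employees without a department belong in the target.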


Published In

ACM SIGMOD Record, Volume 30, Issue 2
June 2001
625 pages
ISSN: 0163-5808
DOI: 10.1145/376284
  • SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
    May 2001
    630 pages
    ISBN: 1581133324
    DOI: 10.1145/375663
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2001
Published in SIGMOD Volume 30, Issue 2

Qualifiers

  • Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 15
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 29 Nov 2024

Cited By

  • (2023) GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example. Proceedings of the ACM on Management of Data 1:2 (1-26). DOI: 10.1145/3589265. Online publication date: 20-Jun-2023
  • (2022) Data Transformation from Hierarchical Model to Relational Model Based on Example Programming. Hans Journal of Data Mining 12:04 (334-350). DOI: 10.12677/HJDM.2022.124032. Online publication date: 2022
  • (2021) Towards Knowledge Exchange: State-of-the-Art and Open Problems. SOFSEM 2021: Theory and Practice of Computer Science (13-27). DOI: 10.1007/978-3-030-67731-2_2. Online publication date: 11-Jan-2021
  • (2020) Knowledge translation. Proceedings of the VLDB Endowment 13:12 (2018-2032). DOI: 10.14778/3407790.3407806. Online publication date: 14-Sep-2020
  • (2019) Interactive Mapping Specification with Exemplar Tuples. ACM Transactions on Database Systems 44:3 (1-44). DOI: 10.1145/3321485. Online publication date: 5-Jun-2019
  • (2018) Automated migration of hierarchical data to relational tables using programming-by-example. Proceedings of the VLDB Endowment 11:5 (580-593). DOI: 10.1145/3187009.3177735. Online publication date: 1-Jan-2018
  • (2018) Automated migration of hierarchical data to relational tables using programming-by-example. Proceedings of the VLDB Endowment 11:5 (580-593). DOI: 10.1145/3177732.3177735. Online publication date: 5-Oct-2018
  • (2018) MfeCNN: Mixture Feature Embedding Convolutional Neural Network for Data Mapping. IEEE Transactions on NanoBioscience 17:3 (165-171). DOI: 10.1109/TNB.2018.2841053. Online publication date: Jul-2018
  • (2017) Approximation Algorithms for Schema-Mapping Discovery from Data Examples. ACM Transactions on Database Systems 42:2 (1-41). DOI: 10.1145/3044712. Online publication date: 28-Apr-2017
  • (2017) Interactive Mapping Specification with Exemplar Tuples. Proceedings of the 2017 ACM International Conference on Management of Data (667-682). DOI: 10.1145/3035918.3064028. Online publication date: 9-May-2017
