DOI: 10.5555/1182635.1164135

Nested mappings: schema mapping reloaded

Published: 01 September 2006

Abstract

Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings for both relational and nested data are well-known. In this work, we present a new formalism for schema mapping that extends these existing formalisms in two significant ways. First, our nested mappings allow for nesting and correlation of mappings. This results in a natural programming paradigm that often yields more accurate specifications. In particular, we show that nested mappings can naturally preserve correlations among data that existing mapping formalisms cannot. We also show that using nested mappings for purposes of exchanging data from a source to a target will result in less redundancy in the target data. The second extension to the mapping formalism is the ability to express, in a declarative way, grouping and data merging semantics. This semantics can be easily changed and customized to the integration task at hand. We present a new algorithm for the automatic generation of nested mappings from schema matchings (that is, simple element-to-element correspondences between schemas). We have implemented this algorithm, along with algorithms for the generation of transformation queries (e.g., XQuery) based on the nested mapping specification. We show that the generation algorithms scale well to large, highly nested schemas. We also show that using nested mappings in data exchange can drastically reduce the execution cost of producing a target instance, particularly over large data sources, and can also dramatically improve the quality of the generated data.

Published In

VLDB '06: Proceedings of the 32nd international conference on Very large data bases
September 2006
1269 pages

Sponsors

  • SIGMOD: ACM Special Interest Group on Management of Data
  • K.I.S.S. SIG on Databases
  • AJU Information Technology Co., Ltd
  • US Army ITC-PAC Asian Research Office
  • Google Inc.
  • The Database Society of Japan
  • Samsung SOS
  • Advanced Information Technology Research Center
  • Naver
  • Microsoft
  • Korea Information Science Society
  • SK telecom
  • Systems Applications Products
  • ORACLE
  • International Business Management
  • Air Force Office of Scientific Research/Asian Office of Aerospace R&D
  • Kosef
  • Kaist
  • LG Electronics
  • CCF-DBS

Publisher

VLDB Endowment

Cited By

  • (2019) Rewriting of plain SO tgds into nested tgds. Proceedings of the VLDB Endowment 12:11 (1526-1538). DOI: 10.14778/3342263.3342631. Online publication date: 1-Jul-2019
  • (2015) Function Symbols in Tuple-Generating Dependencies. Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (65-77). DOI: 10.1145/2745754.2745756. Online publication date: 20-May-2015
  • (2014) Nested dependencies. Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (176-187). DOI: 10.1145/2594538.2594544. Online publication date: 18-Jun-2014
  • (2013) Automatic web spreadsheet data extraction. Proceedings of the 3rd International Workshop on Semantic Search Over the Web (1-8). DOI: 10.1145/2509908.2509909. Online publication date: 30-Aug-2013
  • (2013) Value invention in data exchange. Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (157-168). DOI: 10.1145/2463676.2465311. Online publication date: 22-Jun-2013
  • (2013) A sound and complete chase procedure for constrained tuple-generating dependencies. Journal of Intelligent Information Systems 40:1 (63-84). DOI: 10.1007/s10844-012-0216-5. Online publication date: 1-Feb-2013
  • (2011) Generating SPARQL executable mappings to integrate ontologies. Proceedings of the 30th international conference on Conceptual modeling (118-131). DOI: 10.5555/2075144.2075158. Online publication date: 31-Oct-2011
  • (2011) Spreadsheet-based complex data transformation. Proceedings of the 20th ACM international conference on Information and knowledge management (1749-1754). DOI: 10.1145/2063576.2063829. Online publication date: 24-Oct-2011
  • (2011) Tractable XML data exchange via relations. Proceedings of the 20th ACM international conference on Information and knowledge management (1629-1638). DOI: 10.1145/2063576.2063813. Online publication date: 24-Oct-2011
  • (2011) Leveraging query logs for schema mapping generation in U-MAP. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (121-132). DOI: 10.1145/1989323.1989337. Online publication date: 12-Jun-2011
