Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1247480.1247559acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

From complete to incomplete information and back

Published: 11 June 2007 Publication History

Abstract

Incomplete information arises naturally in numerous data management applications. Recently, several researchers have studied query processing in the context of incomplete information. Most work has combined the syntax of a traditional query language like relational algebra with a nonstandard semantics such as certain or ranked possible answers. There are now also languages with special features to deal with uncertainty. However, to the standards of the data management community, to date no language proposal has been made that can be considered a natural analog to SQL or relational algebra for the case of incomplete information.
In this paper we propose such a language, World-set Algebra, which satisfies the robustness criteria and analogies to relational algebra that we expect. The language supports the contemplation on alternatives and can thus map from a complete database to an incomplete one comprising several possible worlds. We show that World-set Algebra is conservative over relational algebra in the sense that any query that maps from a complete database to a complete database (a complete-to-complete query) is equivalent to a relational algebra query. Moreover, we give an efficient algorithm for effecting this translation.
We then study algebraic query optimization of such queries. We argue that query languages with explicit constructs for handling uncertainty allow for the more natural and simple expression of many real-world decision support queries. The results of this paper not only suggest a language for specifying queries in this way, but also allow for their efficient evaluation in any relational database management system.

References

[1]
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.
[2]
S. Abiteboul, P. Kanellakis, and G. Grahne. On the representation and querying of sets of possible worlds. Theor. Comput. Sci., 78(1):158--187, 1991.
[3]
P. Andritsos, A. Fuxman, and R. J. Miller. Clean answers over dirty databases: A probabilistic approach. In Proc. ICDE, 2006.
[4]
L. Antova, C. Koch, and D. Olteanu. 1010 6 worlds and beyond: Efficient representation and processing of incomplete information. In Proc. ICDE, 2007.
[5]
L. Antova, C. Koch, and D. Olteanu. World-set decompositions: Expressiveness and efficient algorithms. In Proc. ICDT, 2007.
[6]
M. Arenas, L. E. Bertossi, and J. Chomicki. Consistent query answers in inconsistent databases. In Proc. PODS, 1999.
[7]
M. Arenas, L. E. Bertossi, and J. Chomicki. "Answer sets for consistent query answering in inconsistent databases". TPLP, 3(4-5):393--424, 2003.
[8]
O. Benjelloun, A. D. Sarma, A. Halevy, and J. Widom. ULDBs: Databases with uncertainty and lineage. In Proc. VLDB, 2006.
[9]
O. Benjelloun, A. D. Sarma, C. Hayworth, and J. Widom. An Introduction to ULDBs and the Trio System. IEEE Data Engineering Bulletin, 29(1):5--16, Mar. 2006.
[10]
N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. In Proc. VLDB, 2004
[11]
R. Fagin, P. G. Kolaitis, R. J. Miller, and L. Popa. Data exchange: semantics and query answering. Theoretical Computer Science, 336(1):89--124, 2005.
[12]
G. Grahne. Dependency satisfaction in databases with incomplete information. In Proc. VLDB, pages 37--45, 1984.
[13]
G. Grahne. The Problem of Incomplete Information in Relational Databases. Number 554 in LNCS. Springer-Verlag, 1991.
[14]
T. J. Green and V. Tannen. "Models for Incomplete and Probabilistic Information". In International Workshop on Incompleteness and Inconsistency in Databases (IIDB), 2006.
[15]
T. Griffin and R. Hull. "A Framework for Implementing Hypothetical Queries". In Proc. SIGMOD, 1997.
[16]
T. Imielinski and W. Lipski. Incomplete information in relational databases. Journal of ACM, 31:761--791, 1984.
[17]
T. Imielinski, S. Naqvi, and K. Vadaparty. Incomplete objects - a data model for design and planning applications. In Proc. SIGMOD, pages 288--297, 1991.
[18]
N. Leone, G. Greco, G. Ianni, V. Lio, G. Terracina, T. Eiter, W. Faber, M. Fink, G. Gottlob, R. Rosati, D. Lembo, M. Lenzerini, M. Ruzzi, E. Kalka, B. Nowicki, and W. Staniszkis. "The INFOMIX system for advanced integration of incomplete and inconsistent data" In Proc. SIGMOD, pages 915--917, 2005.
[19]
L. Libkin and L. Wong. Semantic representations and query languages for OR-sets. In Proc. PODS, pages 37--48, 1993.
[20]
J. Paredaens and D. V. Gucht. Converting nested algebra expressions into flat algebra expressions. TODS, 17(1):65--93, 1992.
[21]
M. Poess and C. Floyd. New TPC Benchmarks for Decision Support and Web Commerce. SIGMOD Record, 29(4), 2000.
[22]
R. Rantzau and C. Mangold. Laws for rewriting queries containing division operators. In Proc. ICDE, 2006.
[23]
Stanford Trio Project. TriQL-The Trio Query Language, Oct. 2006. http://infolab.stanford.edu/~widom/triql.html.
[24]
Transaction Processing Performance Council. TPC Benchmark H (Decision Support), revision 2.6.0 edition, 2006. http://www.tpc.org/tpch/spec/tpch2.6.0.pdf.

Cited By

View all
  • (2022)Revealing Top-k Dominant Individuals in Incomplete Data Based on Spark EnvironmentGenetic and Evolutionary Computing10.1007/978-981-16-8430-2_43(471-480)Online publication date: 4-Jan-2022
  • (2022)Top-k Dominating Queries on Incremental DatasetsDatabase Systems for Advanced Applications. DASFAA 2022 International Workshops10.1007/978-3-031-11217-1_6(79-88)Online publication date: 16-Jul-2022
  • (2019)Skyline queries over incomplete data streamsThe VLDB Journal10.1007/s00778-019-00577-628:6(961-985)Online publication date: 17-Oct-2019
  • Show More Cited By

Index Terms

  1. From complete to incomplete information and back

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
    June 2007
    1210 pages
    ISBN:9781595936868
    DOI:10.1145/1247480
    • General Chairs:
    • Lizhu Zhou,
    • Tok Wang Ling,
    • Program Chair:
    • Beng Chin Ooi
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 June 2007

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. hypothetical queries
    2. incomplete information
    3. query rewriting
    4. use cases
    5. world-set algebra

    Qualifiers

    • Article

    Conference

    SIGMOD/PODS07
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)18
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 16 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2022)Revealing Top-k Dominant Individuals in Incomplete Data Based on Spark EnvironmentGenetic and Evolutionary Computing10.1007/978-981-16-8430-2_43(471-480)Online publication date: 4-Jan-2022
    • (2022)Top-k Dominating Queries on Incremental DatasetsDatabase Systems for Advanced Applications. DASFAA 2022 International Workshops10.1007/978-3-031-11217-1_6(79-88)Online publication date: 16-Jul-2022
    • (2019)Skyline queries over incomplete data streamsThe VLDB Journal10.1007/s00778-019-00577-628:6(961-985)Online publication date: 17-Oct-2019
    • (2018)Combined geo-social searchGeoinformatica10.5555/3238836.323886122:3(615-660)Online publication date: 1-Jul-2018
    • (2018)Parallel and Progressive Approaches for Skyline Query Over Probabilistic Incomplete DatabaseIEEE Access10.1109/ACCESS.2018.28063796(13289-13301)Online publication date: 2018
    • (2018)Finding Top- $k$ Dominance on Incomplete Big Data Using MapReduce FrameworkIEEE Access10.1109/ACCESS.2018.27970486(7872-7887)Online publication date: 2018
    • (2018)Incomplete data managementFrontiers of Computer Science: Selected Publications from Chinese Universities10.1007/s11704-016-6195-x12:1(4-25)Online publication date: 1-Feb-2018
    • (2017)Declarative Probabilistic Programming with DatalogACM Transactions on Database Systems10.1145/313270042:4(1-35)Online publication date: 27-Oct-2017
    • (2017)Combined geo-social search: computing top-k join queries over incomplete informationGeoInformatica10.1007/s10707-017-0297-y22:3(615-660)Online publication date: 25-Mar-2017
    • (2016)ENFrameACM Transactions on Database Systems10.1145/287720541:1(1-44)Online publication date: 18-Mar-2016
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media