Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2745754.2745768acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

Join Dependency Testing, Loomis-Whitney Join, and Triangle Enumeration

Published: 20 May 2015 Publication History

Abstract

In this paper, we revisit two fundamental problems in database theory. The first one is called join dependency (JD) testing, where we are given a relation r and a JD, and need to determine whether the JD holds on r. The second problem is called JD existence testing, where we need to determine if there exists any non-trivial JD that holds on r.
We prove that JD testing is NP-hard even if the JD is defined only on binary relations (i.e., each with only two attributes). Unless P = NP, this result puts a negative answer to the question whether it is possible to efficiently test JDs defined exclusively on small (in terms of attribute number) relations. The question has been open since the classic NP-hard proof of Maier, Sagiv, and Yannakakis in JACM'81 which requires the JD to involve a relation of Ω(d) attributes, where d is the number of attributes in r.
For JD existence testing, the challenge is to minimize the computation cost because the problem is known to be solvable in polynomial time. We present a new algorithm for solving the problem I/O-efficiently in the external memory model. Our algorithm in fact settles the closely related Loomis-Whitney (LW) enumeration problem, and as a side product, achieves the optimal I/O complexity for the triangle enumeration problem, improving a recent result of Pagh and Silvestri in PODS'14.

References

[1]
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley Publishing Company, 1995.
[2]
A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. CACM, 31(9):1116--1127, 1988.
[3]
L. Arge, P. Ferragina, R. Grossi, and J. S. Vitter. On sorting strings in external memory (extended abstract). In STOC, pages 540--548, 1997.
[4]
A. Atserias, M. Grohe, and D. Marx. Size bounds and query plans for relational joins. SIAM J. of Comp., 42(4):1737--1767, 2013.
[5]
C. Beeri and M. Vardi. On the complexity of testing implications of data dependencies. Computer Science Report, Hebrew Univ, 1980.
[6]
P. C. Fischer and D. Tsou. Whether a set of multivalued dependencies implies a join dependency is NP-hard. SIAM J. of Comp., 12(2):259--266, 1983.
[7]
M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
[8]
X. Hu, Y. Tao, and C.-W. Chung. I/O-efficient algorithms on triangle listing and counting. To appear in ACM TODS, 2014.
[9]
P. C. Kanellakis. On the computational complexity of cardinality constraints in relational databases. IPL, 11(2):98--101, 1980.
[10]
D. Maier. The Theory of Relational Databases. Available Online at http://web.cecs.pdx.edu/ $\sim$maier/TheoryBook/TRD.html, 1983.
[11]
D. Maier, Y. Sagiv, and M. Yannakakis. On the complexity of testing implications of functional and join dependencies. JACM, 28(4):680--695, 1981.
[12]
H. Q. Ngo, E. Porat, C. Ré, and A. Rudra. Worst-case optimal join algorithms: {extended abstract}. In PODS, pages 37--48, 2012.
[13]
J. Nicolas. Mutual dependencies and some results on undecomposable relations. In VLDB, pages 360--367, 1978.
[14]
R. Pagh and F. Silvestri. The input/output complexity of triangle enumeration. In PODS, pages 224--233, 2014.

Cited By

View all
  • (2017)Comparing MapReduce and Pipeline Implementations for Counting TrianglesElectronic Proceedings in Theoretical Computer Science10.4204/EPTCS.237.2237(20-33)Online publication date: 11-Jan-2017
  • (2017)2-3 Cuckoo Filters for Faster Triangle Listing and Set IntersectionProceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3034786.3056115(247-260)Online publication date: 9-May-2017
  • (2016)Counting Triangles in Large Graphs by Random SamplingIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2016.255666328:8(2013-2026)Online publication date: 1-Aug-2016
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PODS '15: Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
May 2015
358 pages
ISBN:9781450327572
DOI:10.1145/2745754
  • General Chair:
  • Tova Milo,
  • Program Chair:
  • Diego Calvanese
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 May 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. join dependency
  2. loomis-whitney join
  3. triangle enumeration

Qualifiers

  • Research-article

Funding Sources

  • HKRGC

Conference

SIGMOD/PODS'15
Sponsor:
SIGMOD/PODS'15: International Conference on Management of Data
May 31 - June 4, 2015
Victoria, Melbourne, Australia

Acceptance Rates

PODS '15 Paper Acceptance Rate 25 of 80 submissions, 31%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)7
  • Downloads (Last 6 weeks)0
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2017)Comparing MapReduce and Pipeline Implementations for Counting TrianglesElectronic Proceedings in Theoretical Computer Science10.4204/EPTCS.237.2237(20-33)Online publication date: 11-Jan-2017
  • (2017)2-3 Cuckoo Filters for Faster Triangle Listing and Set IntersectionProceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3034786.3056115(247-260)Online publication date: 9-May-2017
  • (2016)Counting Triangles in Large Graphs by Random SamplingIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2016.255666328:8(2013-2026)Online publication date: 1-Aug-2016
  • (2016)On Efficient External-Memory Triangle Listing2016 IEEE 16th International Conference on Data Mining (ICDM)10.1109/ICDM.2016.0021(101-110)Online publication date: Dec-2016

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media