Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

On the optimal nesting order for computing N-relational joins

Published: 01 September 1984 Publication History

Abstract

Using the nested loops method, this paper addresses the problem of minimizing the number of page fetches necessary to evaluate a given query to a relational database. We first propose a data structure whereby the number of page fetches required for query evaluation is substantially reduced and then derive a formula for the expected number of page fetches. An optimal solution to our problem is the nesting order of relations in the evaluation program, which minimizes the number of page fetches. Since the minimization of the formula is NP-hard, as shown in the Appendix, we propose a heuristic algorithm which produces a good suboptimal solution in polynomial time. For the special case where the input query is a “tree query,” we present an efficient algorithm for finding an optimal nesting order.

References

[1]
ABDEL-WAHAB, H.M., AND KAMDEDA, T. On strictly optimal schedules for the cumulative costoptimal scheduling problem. Computing 24 (1980), 61-86.
[2]
ADACHI, J., SAITO, T., AND INOSE, H. On improving the retrieval efficiency in a database. In Proceedings 20th National Con{erence, Information Society of Japan, 1979, 743-744.
[3]
ADACHI, J., SAITO, T., AND INOSE, H. Optimization of query processing in a database. In Proceedings I981 National Con/erence, IECE of Japan, 1981, 6/57.
[4]
ASTRAHAN, M.M., ET AL. System R: Relational approach to database management. ACM Trans. Database Syst. 1, 2 (July 1976), 97-137.
[5]
BERNSTEIN, P.A., AND CHIU, D-M.W. Using semijoins to solve relational queries. J. ACM 28, 1 ( Jan. 1981), 25-40.
[6]
BERNSTEIN, P.A., AND GOODMAN, N. Power of natural semijoins. SIAM J. Comput. 10, 4 (Nov. 1981), 751-771.
[7]
Blasgen, M.W., and Eswaran, K.P. Storage and access in relational databases. IBM Syst. J. 16, 4 (1977), 363-377.
[8]
CARDENAS, A.F. Analysis and performance of inverted database structures. Comrnun. ACM 18, 5 (May 1975), 253-263.
[9]
COl)D, E.F. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377-387.
[10]
DATE, C.J. An Introduction to Data Base Systems. Addison-Wesley, Reading, Mass., 1975.
[11]
GAREY, M.R., AND JOHNSON, D.S. Computers and Intractability: A Guide to the Theory of NP- Completeness. Freeman and Co., San Francisco, 1979.
[12]
HAMILTON, P., AND MANUEL T. Relational databases do it more easily. Electronics (Mar. 24, 1981). 102-103.
[13]
KAMBAYASHI, Y., YASUURA, K., IWAMA, K., AND YAJIMA, S. The join operation in hierarchical memories. Unpublished manuscript, Dept. Information Science, Kyoto Univ., Japan, Feb. 1981.
[14]
KIM, W. A new way to compute the product and join of relations, in Proceedings ACM-SIGMOD International Conference on Management of Data (1980), 179-187.
[15]
KNUTH, D.E. The Art of Computer Programming, Vol. III; Searching and Sorting. Addison- Wesley, Reading, Mass., 1973.
[16]
LAWLER, E.L. Sequencing jobs to minimize total weighted completion time subject to precedence constraints. Ann. Discrete Math. 2 {1978), 75-90.
[17]
MONMA, C.L., AND SIDNEY, J.B. Sequencing with series-parallel precedence constraints. Math Oper. Res. 4 (1979), 215-224.
[18]
PERCHERER, R.M. Efficient exploration of product spaces. In Proceedings ACM-SIGMOD International Conference on Management of Data {1976), 169-177.
[19]
SELINGER, P.G., ET AL. Access path selection in a relational database management system. In Proceedings A CM-SIGMOD International Conference on Management of Data (1979), 23-34.
[20]
STONEBRAKER, M., WONG, E., AND KREPS, P. The design and implementation of INGRES. ACM Trans. Database Systo 1, 3 {Sept. 1976), 189-222.
[21]
ULLMAN, J.D. Principles of Database Systems. Computer Science Press, Potomac, Md., 1980.
[22]
WON6, E., AND YOUSSEFI, K. Decomposition--a strategy for query processing. ACM Trans. Database Syst. 1, 3 (Sept. 1976), 223-241.
[23]
YAO, S.B. Approximating block accesses in database organizations. Commun. ACM 20, 4 (Apr. 1977), 260-261.
[24]
YAO, S.B. Optimization of query evaluation algorithms.ACM Trans. Database Systm. 4, 2 (June 1979), 133-155.
[25]
YAO, S.B., AND DEJONG, D. Evaluation of database access paths. In ProceedingsACM-SIGMOD International Conference on Management of Data (May 1978), 66-77.
[26]
YONG, P-Y. Masters Thesis, Dept. of Computing Science, Simon Fraser Univ., Nov. 1983.
[27]
GELENBE, E., AND GARDY, D. On the size of projections: I. Inf. Process. Lett. 14, 1 (1982), 18- 21.

Cited By

View all
  • (2024)Sub-optimal Join Order Identification with L1-errorProceedings of the ACM on Management of Data10.1145/36392722:1(1-24)Online publication date: 26-Mar-2024
  • (2024)On the Optimal Linear Contraction Order of Tree Tensor Networks, and BeyondSIAM Journal on Scientific Computing10.1137/23M161286X46:5(B647-B668)Online publication date: 8-Oct-2024
  • (2024)Optimizing the Execution of Product Data ModelsIEEE Access10.1109/ACCESS.2024.336608112(25397-25410)Online publication date: 2024
  • Show More Cited By

Recommendations

Reviews

Frederick Neil Springsteel

This work concerns certain algorithmic questions about efficient information retrieval in database management. This clearly written, original research addresses the specific question of minimizing the total page fetches needed to evaluate typical queries about a relational database. The approach is based on the well-known “nested loops method” of iterative evaluation and improves upon it by adding heuristics and by solving special cases. The chief contributions of this compact article are: (1)A new data structure to guide the nested loops method: the “query graph.” (2)A crucial formula for the expected number of page accesses in an evaluation. (3)A proof that minimizing this expected number is an intractable (NP-complet- e) problem. (4)A heuristic algorithm for finding an optimal nesting order for the N relations referenced in a query: it runs in cubic time if ties do not occur among successors. (5)For the special case where the given query is a “tree query,” a chain-merging algorithm which minimizes the expected number of page fetches and which takes at most 0( N*Nlog N) time. Clearly, this paper advances the state of the theory of nested loops methods, especially by its efficient algorithms (see (4) and (5) above). However, its several derivations of probabilistic formulas, culminating in (2) above, are based on some convenient assumptions which may not apply in practice. For example, in a given query Q = P 1 & P 2 & . . . & P m, the predicates { P j} are assumed to be independent in the sense that none is implied by the others. Subset conditions violate this. Also, to use the papers algorithms one would have to precompute numbers critical to their formulas: the “selectivity factor” of each join predicate P j = R i.a op R j.b, i.e., the expected fraction of all tuple pairs from the two relations which satisfy P j. Means and estimated costs of finding these factors are not given. Moreover, the critical hypothesis is made that the values of the attributes a,b from each relation are uniformly distributed. As the authors note, this is strictly impossible in a relational database, which has nonrepeated tuples, but it is an approximation often made in the literature [1].

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 9, Issue 3
Sept. 1984
172 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1270
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 1984
Published in TODS Volume 9, Issue 3

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)196
  • Downloads (Last 6 weeks)20
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Sub-optimal Join Order Identification with L1-errorProceedings of the ACM on Management of Data10.1145/36392722:1(1-24)Online publication date: 26-Mar-2024
  • (2024)On the Optimal Linear Contraction Order of Tree Tensor Networks, and BeyondSIAM Journal on Scientific Computing10.1137/23M161286X46:5(B647-B668)Online publication date: 8-Oct-2024
  • (2024)Optimizing the Execution of Product Data ModelsIEEE Access10.1109/ACCESS.2024.336608112(25397-25410)Online publication date: 2024
  • (2023)Quantum-Inspired Digital Annealing for Join OrderingProceedings of the VLDB Endowment10.14778/3632093.363211217:3(511-524)Online publication date: 1-Nov-2023
  • (2023)Join Order Selection with Deep Reinforcement Learning: Fundamentals, Techniques, and ChallengesProceedings of the VLDB Endowment10.14778/3611540.361157616:12(3882-3885)Online publication date: 1-Aug-2023
  • (2023)LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution PlansProceedings of the VLDB Endowment10.14778/3587136.358715016:7(1777-1789)Online publication date: 1-Mar-2023
  • (2023)Efficiently Computing Join Orders with Heuristic SearchProceedings of the ACM on Management of Data10.1145/35889271:1(1-26)Online publication date: 30-May-2023
  • (2023)AIM: A practical approach to automated index management for SQL databases2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00257(3349-3362)Online publication date: Apr-2023
  • (2023)Communication-Avoiding Recursive Aggregation2023 IEEE International Conference on Cluster Computing (CLUSTER)10.1109/CLUSTER52292.2023.00024(197-208)Online publication date: 31-Oct-2023
  • (2023)Host-graph-sensitive RETE nets for incremental graph pattern matching with nested graph conditionsJournal of Logical and Algebraic Methods in Programming10.1016/j.jlamp.2022.100841131(100841)Online publication date: Feb-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media