Abstract
Joins on set-valued attributes (set joins) have numerous database applications. In this paper we propose PRETTI (PREfix Tree based seT joIn) – a suite of set join algorithms for containment, overlap and equality join predicates. Our algorithms use prefix trees and inverted indices. These structures are constructed on-the-fly if they are not already precomputed. This feature makes our algorithms usable for relations without indices and when joining intermediate results during join queries with more than two relations. Another feature of our algorithms is that results are output continuously during their execution and not just at the end. Experiments on real life datasets show that the total execution time of our algorithms is significantly less than that of previous approaches, even when the indices required by our algorithms are not precomputed.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (September 1994)
Cai, J., Chakaravarthy, V.T., Kaushik, R., Naughton, J.F.: On the complexity of join predicates. In: ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems (2001)
Grahne, G., Zhu, J.: Efficiently using prefix-trees in mining frequent itemsets. In: IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI) (2003)
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2000)
Helmer, S., Moerkotte, G.: Evaluation of main memory join algorithms for joins with set comparison join predicates. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (1997)
Helmer, S., Moerkotte, G.: A study of four index structures for set-valued attributes of low cardinality. Technical report, University of Mannheim (1999)
Mamoulis, N.: Efficient processing of joins on set-valued attributes. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2003)
Melnik, S.: Set containment joins: Testbed, http://www-db.stanford.edu/~melnik/scj
Melnik, S., Garcia-Molina, H.: Divide-and-conquer algorithm for computing set containment joins. In: Jensen, C.S., Jeffery, K., Pokorný, J., Šaltenis, S., Bertino, E., Böhm, K., Jarke, M. (eds.) EDBT 2002. LNCS, vol. 2287, p. 427. Springer, Heidelberg (2002)
Melnik, S., Garcia-Molina, H.: Adaptive algorithms for set containment joins. ACM Transactions on Database Systems (TODS) 28(2) (2003)
Ramasamy, K., Patel, J.M., Naughton, J.F., Kaushik, R.: Set containment joins: the good, the bad and the ugly. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (2000)
Rantzau, R.: Processing frequent itemset discovery queries by division and set containment join operators. In: 8th ACM SIGMOD DMKD Workshop (2003)
Sarawagi, S., Kirpal, A.: Efficient set joins on similarity predicates. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2004)
Stonebraker, M.: Object-relational DBMS: The Next Great Wave. Morgan Kaufmann, San Francisco (1996)
Zheng, Z., Kohavi, R., Mason, L.: Real world performance of association rule algorithms. In: Intl. Conf. on Knowledge Discovery and Data Mining (KDD) (2001)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jampani, R., Pudi, V. (2005). Using Prefix-Trees for Efficiently Computing Set Joins. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_69
Download citation
DOI: https://doi.org/10.1007/11408079_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer ScienceComputer Science (R0)