Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Data Partitioning for Parallel Spatial Join Processing

Published: 01 June 1998 Publication History

Abstract

The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcome this problem is to preserve spatial locality in task decomposition. In this paper we show that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.

References

[1]
1. D.J. Abel, V. Gaede, R.A. Power, and X. Zhou. "Resequencing and clustering to improve the performance of spatial join," Technical report, CSIRO Mathematical and Information Sciences, Australia, 1997.
[2]
2. D.J. Abel, B.C. Ooi, K.-L. Tan, R. Power, and J.X. Yu. "Spatial join strategies in distributed spatial dbms," in LNCS 951: Proc. of 4th Int. Symp. on Large Spatial Databases (SSD'95), 346-367, Springer-Verlag, 1995.
[3]
3. D.J. Abel and J.L. Smith. "A data structure and algorithm based on a linear key for a rectangle retrieval problem," Computer Vision, Graphics and Image Processing, 24(1):1-13, 1983.
[4]
4. T. Brinkhoff, H.P. Kriegel, and B. Seeger. "Efficient processing of spatial joins using R-trees," in Proc. ACM SIGMOD Int. Conf. on Management of Data, 237-246, 1993.
[5]
5. T. Brinkhoff, H.P. Kriegel, and B. Seeger. "Parallel processing of spatial joins using R-trees," in Proc. of 12th International Conference on Data Engineering, 1996.
[6]
6. D.J. DeWitt and J. Gray. "Parallel database systems: the future of database processing," C. ACM, 35(6):85-98, 1992.
[7]
7. D.J. DeWitt et al. "Practical skew handling in parallel join," in Proc. 18th Int. Conf. on Very Large Data Bases, Vancouver, Canada, 27-40, 1992.
[8]
8. O. Günther. "Efficient computation of spatial joins," in Proc. of 9th International Conference on Data Engineering, Vienna, Austria, 50-59, 1993.
[9]
9. R.H. Güting. "An introduction to spatial database systems," VLDB Journal, 3(4):357-399, 1994.
[10]
10. A. Guttman. "R-trees: A dynamic index structure for spatial searching," in Proc. ACM SIGMOD Int. Conf. on Management of Data, 47-54, 1984.
[11]
11. E.G. Hoel and H. Samet. "Performance of data-parallel spatial operations," in Proc. 20th Int. Conf. on Very Large Data Bases, 156-167, 1995.
[12]
12. E. Horowitz and S. Sahni. Fundamentals of Computer Algorithms, Computer Science Press, 1978.
[13]
13. K.A. Hua and C. Lee. "Handing data skew in multiprocessor database computers using partition tuning," in Proc. of 17th International Conference on Very Large Data Bases, Barcelona, 523-535, 1991.
[14]
14. H. Ishihata, T. Horie, S. Inano, T. Shimizu, and S. Kato. "CAP-IID architecture," in Proc. of the 1st Fujitsu-ANU CAP Workshop, Kawasaki, Japan, 1990.
[15]
15. M. Kitsuregawa and Y. Ogawa. "Bucket spreading parallel hash: A new, robust, parallel hash join method for data skew in the super database computer (SDC)," in Proc. 16th Int. Conf. on Very Large Data Bases, 210-221, 1990.
[16]
16. M. Kitsuregawa, H. Tanaka, and T. Motooka. "Application of hash to database machine and its architecture," New Generation Computing, 1(1):66-74, 1983.
[17]
17. M.L. Lo and C.V. Ravishankar. "Spatial joins using seeded trees," in Proc. ACM SIGMOD Int. Conf. on Management of Data, 209-220, 1994.
[18]
18. M.L. Lo and C.V. Ravishankar. "Spatial hash-join," in Proc. ACM SIGMOD Int. Conf. on Management of Data, Montreal, Canada, 247-258, 1996.
[19]
19. J.H. Lu, B.C. Ooi, and K.L. Tan. Query Processing in Parallel Relational Database Systems, IEEE Computer Society Press, 1994.
[20]
20. J. Orenstein and F.A. Manola. "Probe spatial data modeling and query processing in an image database application," IEEE TOSE, 14(5):611-629, 1988.
[21]
21. J.M. Patel and D.J. DeWitt. "Partition based spatial-merge join," in Proc. ACM SIGMOD Int. Conf. on Management of Data, Montreal, Canada, 259-270, 1996.
[22]
22. F.P. Preparata and M.I. Shamos. Computational Geometry: an introduction. Springer-Verlag, 1985.
[23]
23. T. Sellis, N. Roussopoulos, and C. Faloutsos. "The R+-tree: a dynamic index for multi-dimensional objects," in Proc. 13th Int. Conf. on Very Large Data Bases, 3-11, 1987.
[24]
24. M. Stonebraker, J. Frew, K. Gardels, and J. Meredith. "The SEQUOIA 2000 storage benchmark," in Proc. of ACM SIGMOD Int. Conf. on Management of Data, Washington, DC, 2-11, 1993.
[25]
25. X. Zhou. Parallel Processing in Relational Database Systems, Ph.D. thesis, University of Queensland, 1994.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Geoinformatica
Geoinformatica  Volume 2, Issue 2
June 1998
93 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 June 1998

Author Tags

  1. data partitioning
  2. parallel processing
  3. spatial join

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2020)Indexing of real time geospatial data by IoT enabled devicesJournal of Ambient Intelligence and Smart Environments10.3233/AIS-20056512:4(281-312)Online publication date: 1-Jan-2020
  • (2019)Computing over encrypted spatial data generated by IoTTelecommunications Systems10.1007/s11235-018-0479-470:2(193-229)Online publication date: 1-Feb-2019
  • (2019)Spatial data management in apache sparkGeoinformatica10.1007/s10707-018-0330-923:1(37-78)Online publication date: 1-Jan-2019
  • (2017)On Spatial Joins in MapReduceProceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/3139958.3139967(1-10)Online publication date: 7-Nov-2017
  • (2017)iSPEEDProceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/3139958.3139961(1-10)Online publication date: 7-Nov-2017
  • (2017)Optimizing Spatial Queries in MapReduceProceedings of the 2017 ACM International Conference on Management of Data10.1145/3055167.3055179(28-30)Online publication date: 14-May-2017
  • (2016)GCMFProceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/2996913.2996982(1-10)Online publication date: 31-Oct-2016
  • (2016)Multi-Assignment Single Joins for Parallel Cross-Match of Astronomic Catalogs on Heterogeneous ClustersProceedings of the 28th International Conference on Scientific and Statistical Database Management10.1145/2949689.2949705(1-12)Online publication date: 18-Jul-2016
  • (2016)A Spatial Mashup Service for Efficient Evaluation of Concurrent $k$ -NN QueriesIEEE Transactions on Computers10.1109/TC.2015.248521565:8(2428-2442)Online publication date: 7-Jul-2016
  • (2016)Events and SightingsIEEE Annals of the History of Computing10.1109/MAHC.2016.3138:3(88-88)Online publication date: 1-Jul-2016
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media