Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Large-scale indexing of spatial data in distributed repositories: the SD-Rtree

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

We propose a scalable distributed data structure (SDDS) called SD-Rtree. We intend our structure for point, window and kNN queries over large spatial datasets distributed on clusters of interconnected servers. The structure balances the storage and processing load over the available resources, and aims at minimizing the size of the cluster. SD-Rtree generalizes the well-known Rtree structure. It uses a distributed balanced binary tree that scales with insertions to potentially any number of storage servers through splits of the overloaded ones. A user/application manipulates the structure from a client node. The client addresses the tree through its image that can be possibly outdated due to later split. This may generate addressing errors, solved by the forwarding among the servers. Specific messages towards the clients incrementally correct the outdated images. We present the building of an SD-Rtree through insertions, focusing on the split and rotation algorithms. We follow with the query algorithms. We describe then a flexible allocation protocol which allows to cope with a temporary shortage of storage resources through data storage balancing. Experiments show additional aspects of SD-Rtree and compare its behavior with a distributed quadtree. The results justify our various design choices and the overall utility of the structure.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Adelson-Velskii G.M., Landis E.M.: An algorithm for the organization of information. Doklady Akademii Nauk SSSR 146, 263–266 (1962)

    MathSciNet  Google Scholar 

  2. Arge, L., Eppstein, D., Goodrich, M.T.: Skip-Webs: efficient distributed data structures for multi-dimensional data sets. In: Proc. Intl. Symp. on Principles of Distributed Computing (PODC), pp. 69–76 (2005)

  3. Beckmann, N., Kriegel, H.P., Schneider, R., Seeger B.: The R*tree: an efficient and robust access method for points and rectangles. In: Proc. ACM Symp. on the Management of Data (SIGMOD), pp. 322–331 (1990)

  4. Crainiceanu, A., Linga, P., Gehrke, J., Shanmugasundaram, J.: Querying peer-to-peer networks using P-trees. In: Proc. Intl. Workshop on the Web and Databases (WebDB), pp. 25–30 (2004)

  5. Devine, R.: Design and implementation of DDH: a distributed dynamic hashing algorithm. In: Proc. Intl. Conf. on Foundations of Data Organization and Algorithms (FODO), pp. 101–114 (1993)

  6. du Mouza, C., Litwin, W., Rigaux, P.: Dynamic storage balancing in a distributed spatial index. In: Proc. Intl. Symp. on Geographic Information Systems (ACM-GIS) (2007)

  7. du Mouza, C., Litwin, W., Rigaux, P.: SD-Rtre: a scalable distributed Rtree. In: Proc. Intl. Conf. on Data Engineering (ICDE), pp. 96–305 (2007)

  8. Gaede V., Günther O.: Multidimensional access methods. ACM Comput. Surv. 30(2), 170–231 (1998)

    Article  Google Scholar 

  9. US Census Bureau Geography Division. Tiger/Line files (2007). http://www.census.gov/geo/www/tiger/

  10. Guttman, A.: R-trees : A dynamic index structure for spatial searching. In: Proc. ACM Symp. on the Management of Data (SIGMOD), pp. 45–57 (1984)

  11. Hambrusch, S.E., Khokhar, A.A.: Maintaining spatial data sets in distributed-memory machines. In: Proc. Intl. Parallel Processing Symp. (IPPS), pp. 702–707 (1997)

  12. Jagadish, H.V., Ooi, B.C, Vu, Q.H.: BATON: a balanced tree structure for peer-to-peer networks. In: Proc. Intl. Conf. on Very Large Data Bases (VLDB), pp. 661–672 (2005)

  13. Jagadish, H.V., Ooi, B.C., Vu, Q.H., Zhang, R., Zhou, A.: VBI-Tree: a peer-to-peer framework for supporting multi-dimensional indexing schemes. In: Proc. Intl. Conf. on Data Engineering (ICDE) (2006)

  14. Karlsson, J.S.: hQT*: a scalable distributed data structure for high-performance spatial accesses. In: Proc. Intl. Conf. on Foundations of Data Organization and Algorithms (FODO), pp. 37–46 (1998)

  15. Kriakov, V, Delis, A., Kollios, G.: Management of highly dynamic multidimensional data in a cluster of workstations. In: Proc. Intl. Conf. on Extending Data Base Technology (EDBT), pp. 748–764 (2004)

  16. Litwin, W., Neimat, M.-A., Schneider, D.A.: RP*: a family of order preserving scalable distributed data structures. In: Proc. Intl. Conf. on Very Large Data Bases (VLDB), pp. 342–353 (1994)

  17. Litwin W., Neimat M.-A., Schneider D.A.: LH*: a scalable, distributed data structure. ACM Trans. Database Syst. (TODS) 21(4), 480–525 (1996)

    Article  Google Scholar 

  18. Liu, B., Lee, W.-C., Lee, D.L.: Supporting complex multi-dimensional queries in P2P systems. In: Proc. Intl. Conf. on Distributed Computing Systems (ICDCS), pp. 155–164 (2005)

  19. Mondal, A., Lifu, Y., Kitsuregawa, M.: P2PR-Tree: an R-tree-based spatial index for peer-to-peer environments. In: EDBT Workshops, pp. 516–525 (2004)

  20. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest neighbor queries. In: Proc. ACM Symp. on the Management of Data (SIGMOD), pp. 71–79 (1995)

  21. Samet H.: Foundations of Multi-dimensional Data Structures. Kaufmann, Morgan (2006)

    Google Scholar 

  22. Tanin E., Harwood A., Samet H.: Using a distributed quadtree index in peer-to-peer networks. VLDB J. 16(2), 165–178 (2007)

    Article  Google Scholar 

  23. Theodoridis, Y., Silva, J.R.O., Nascimento, M.A.: On the generation of spatiotemporal datasets. In: Proc. Intl. Conf. on Large Spatial Databases (SSD), pp. 147–164 (1999)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Cédric du Mouza.

Rights and permissions

Reprints and permissions

About this article

Cite this article

du Mouza, C., Litwin, W. & Rigaux, P. Large-scale indexing of spatial data in distributed repositories: the SD-Rtree. The VLDB Journal 18, 933–958 (2009). https://doi.org/10.1007/s00778-009-0135-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-009-0135-4

Keywords

Navigation