Abstract
Effective indexing is crucial for any AI and big data analysis task involving huge datasets. Recently, machine-learned models for indexing have achieved much attention, and we apply such for spatial data, specifically huge collections of polygons. We propose an index structure \(\text {SPL}^{{index}}\) that organizes polygons into a tree of clusters with linear regression models for effective branching in search. It integrates an effective layout of polygon data to disk space that minimize disk access and amount of data to be kept in main memory. The approach is shown outperform the state-of-the-art R-tree for both range and point queries.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Beckmann, N., Kriegel, H.P., Schneider, R., Seeger, B.: The R*-tree: an efficient and robust access method for points and rectangles. In: Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, pp. 322–331 (1990)
Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18, 509–517 (1975)
Davitkova, A., Milchevski, E., Michel, S.: The ML-Index: a multidimensional, learned index for point, range, and nearest-neighbor queries. In: EDBT, pp. 407–410 (2020)
Ding, J., et al.: ALEX: an updatable adaptive learned index. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 969–984 (2020)
Ding, J., Nathan, V., Alizadeh, M., Kraska, T.: Tsunami: a learned multi-dimensional index for correlated data and skewed workloads. ArXiv abs/2006.13282 (2020)
Ferragina, P., Vinciguerra, G.: The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proc. VLDB Endow. 13(8), 1162–1175 (2020)
Finkel, R.A., Bentley, J.L.: Quad trees a data structure for retrieval on composite keys. Acta Informatica 4, 1–9 (1974)
Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: FITing-Tree: a data-aware index structure. In: Proceedings of the 2019 International Conference on Management of Data, SIGMOD Conference 2019, Amsterdam, The Netherlands, 30 June–5 July 2019, pp. 1189–1206. ACM (2019)
Georgiadis, T., Mamoulis, N.: Raster intervals: an approximation technique for polygon intersection joins. Proc. ACM Manag. Data 1(1), 1–18 (2023)
Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, pp. 47–57 (1984)
Higuchi, S., Takemasa, J., Koizumi, Y., Tagami, A., Hasegawa, T.: Feasibility of longest prefix matching using learned index structures. SIGMETRICS Perform. Eval. Rev. 48(4), 45–48 (2021)
Kamel, I., Faloutsos, C.: Hilbert r-tree: an improved rtree using fractals. In: VLDB, vol. 94, pp. 500–509. Citeseer (1994)
Kraska, T., et al.: SageDB: a learned database system. In: Conference on Innovative Data Systems Research (2019)
Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: Proceedings of the 2018 International Conference on Management of Data, pp. 489–504 (2018)
Lee, K.C., Zheng, B., Li, H., Lee, W.C.: Approaching the skyline in Z order. In: VLDB, vol. 7, pp. 279–290 (2007)
Li, P., Hua, Y., Jia, J., Zuo, P.: FINEdex: a fine-grained learned index scheme for scalable and concurrent memory systems. VLDB Endow. 15(2), 321–334 (2021)
Li, P., Lu, H., Zheng, Q., Yang, L., Pan, G.: LISA: a learned index structure for spatial data. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 2119–2133 (2020)
Nathan, V., Ding, J., Alizadeh, M., Kraska, T.: Learning multi-dimensional indexes. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (2019)
Orenstein, J.A., Merrett, T.H.: A class of data structures for associative searching. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (1984)
Pandey, V., van Renen, A., Kipf, A., Sabek, I., Ding, J., Kemper, A.: The case for learned spatial indexes. ArXiv abs/2008.10349 (2020)
Park, S.C., Shin, H., Choi, B.K.: A sweep line algorithm for polygonal chain intersection and its applications. In: Kimura, F. (ed.) Geometric Modelling. ITIFIP, vol. 75, pp. 309–321. Springer, Boston, MA (2001). https://doi.org/10.1007/978-0-387-35490-3_21
Qi, J., Liu, G., Jensen, C.S., Kulik, L.: Effectively learning spatial indices. Proc. VLDB Endow. 13(12), 2341–2354 (2020)
Sellis, T., Roussopoulos, N., Faloutsos, C.: The R+-tree: a dynamic index for multi-dimensional objects. Technical report (1987)
Singla, S., Eldawy, A., Diao, T., Mukhopadhyay, A., Scudiero, E.: The raptor join operator for processing big raster+ vector data. In: Proceedings of the 29th International Conference on Advances in Geographic Information Systems, pp. 324–335 (2021)
Teng, D., Baig, F., Sun, Q., Kong, J., Wang, F.: IDEAL: a vector-raster hybrid model for efficient spatial queries over complex polygons. In: 2021 22nd IEEE International Conference on Mobile Data Management (MDM), pp. 99–108. IEEE (2021)
Wang, C., Yu, J.: GLIN: a lightweight learned indexing mechanism for complex geometries. arXiv preprint arXiv:2207.07745 (2022)
Wang, H., Fu, X., Xu, J., Lu, H.: Learned index for spatial queries. In: 2019 20th IEEE International Conference on Mobile Data Management (MDM), pp. 569–574. IEEE (2019)
Yang, Z., et al.: Qd-tree: learning data layouts for big data analytics. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020, pp. 193–208. ACM, New York (2020)
Yousfi, H., Mesmoudi, A., Hadjali, A., Matallah, H., Lahfa, F.: Efficient R-tree exploration for big spatial data. In: Kacprzyk, J., Balas, V.E., Ezziyyani, M. (eds.) AI2SD 2020. AISC, vol. 1418, pp. 865–874. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-90639-9_70
Zardbani, F., Mamoulis, N., Idreos, S., Karras, P.: Adaptive indexing of objects with spatial extent. Proc. VLDB Endow. 16, 2248–2260 (2023)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec. 25(2), 103–114 (1996)
Acknowledgement
This research is partly funded by Independent Research Fund Denmark (No. 1032-00481B). The authors are grateful to Prof. Hua Lu for initiating this project and for valuable advice.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vahedi, M., Christiansen, H. (2024). \(\text {SPL}^{{index}}\): A Spatial Polygon Learned Index. In: Appice, A., Azzag, H., Hacid, MS., Hadjali, A., Ras, Z. (eds) Foundations of Intelligent Systems. ISMIS 2024. Lecture Notes in Computer Science(), vol 14670. Springer, Cham. https://doi.org/10.1007/978-3-031-62700-2_24
Download citation
DOI: https://doi.org/10.1007/978-3-031-62700-2_24
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62699-9
Online ISBN: 978-3-031-62700-2
eBook Packages: Computer ScienceComputer Science (R0)