Abstract
Semantic-aware spatial keyword search is an important technique for digital map services. However, existing indexing and search methods have limited pruning effect due to the high dimensionality in semantic space, causing query efficiency to be a serious issue. To handle this problem, this paper proposes a novel pivot-based hierarchical indexing structure S2R-tree to integrate spatial and semantic information in a seamless way. Instead of indexing objects in the original semantic space, we carefully design a space mechanism to transform the high dimensional semantic vectors to a low dimensional space, so that more effective pruning effect can be achieved. On top of the S2R-tree, an efficient query processing algorithm is further designed, which not only ensures efficient query processing by a set of theoretical bounds, but also returns accurate results despite of the indexing in the low dimensional space. Furthermore, we conduct extensive experiments to evaluate and compare our proposed and baseline methods.
Similar content being viewed by others
References
Andreas J, Klein D (2014) How much do word embeddings encode about syntax? In: ACL, pp 822–827
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Bozkaya T, Özsoyoglu ZM (1997) Distance-based indexing for high-dimensional metric spaces. In: SIGMOD, pp 357–368
Cao X, Cong G, Jensen CS, Ooi BC (2011) Collective spatial keyword querying. In: SIGMOD, pp 373–384
Cao X, Chen L, Cong G, Jensen CS, Qu Q, Skovsgaard A, Wu D, Yiu ML (2012) Spatial keyword querying. In: ER, pp 16–29
Cao X, Chen L, Cong G, Xiao X (2012) Keyword-aware optimal route search. PVLDB 5(11):1136– 1147
Cao X, Chen L, Cong G, Guan J, Phan N, Xiao X (2013) KORS: keyword-aware optimal route search system. In: ICDE, pp 1340–1343
Chen L, Cong G (2015) Diversity-aware top-k publish/subscribe for text stream. In: SIGMOD, pp 347–362
Chen L, Cong G, Cao X (2013) An efficient query indexing mechanism for filtering geo-textual data. In: SIGMOD, pp 749–760
Chen L, Cong G, Jensen CS, Wu D (2013) Spatial keyword query processing: an experimental evaluation, vol 6, pp 217–228
Chen L, Cui Y, Cong G, Cao X (2014) SOPS: a system for efficient processing of spatial-keyword publish/subscribe. PVLDB 7(13):1601–1604
Chen L, Cong G, Cao X, Tan K (2015) Temporal spatial-keyword top-k publish/subscribe. In: ICDE, pp 255–266
Chen L, Lin X, Hu H, Jensen CS, Xu J (2015) Answering why-not questions on spatial keyword top-k queries. In: ICDE, pp 279–290
Chen W, Zhao L, Xu J, Liu G, Zheng K, Zhou X (2015) Trip oriented search on activity trajectory. J Comput Sci Technol 30(4):745–761
Chen J, Xu J, Liu C, Li Z, Liu A, Ding Z (2017) Multi-objective spatial keyword query with semantics. In: DASFAA 2017, pp 34–48
Chen Z, Cong G, Zhang Z, Fu TZJ, Chen L (2017) Distributed publish/subscribe query processing on the spatio-textual data stream. In: ICDE, pp 1095–1106
Chen L, Shang S, Zhang Z, Cao X, Jensen CS, Kalnis P (2018) Location-aware top-k term publish/subscribe. In: ICDE, pp 749–760
Cong G, Jensen CS, Wu D (2009) Efficient retrieval of the top-k most relevant spatial web objects. PVLDB 2(1):337–348
Fariha A, Sarwar SM, Meliou A (2018) Squid: semantic similarity-aware query intent discovery. In: SIGMOD Conference 2018, pp 1745–1748
Felipe ID, Hristidis V, Rishe N (2008) Keyword search on spatial databases. In: ICDE, pp 656–665
Gao Y, Zhao J, Zheng B, Chen G (2016) Efficient collective spatial keyword query processing on road networks. IEEE Trans Intell Transp Syst 17(2):469–480
Gunasekaran YD, Rahman MF, Hasani S, Zhang N, Das G (2018) DBLOC: density based clustering over location based services. In: SIGMOD, pp 1697–1700
Han J, Wen J (2013) Mining frequent neighborhood patterns in a large labeled graph. In: CIKM, pp 259–268
Han J, Wen J, Pei J (2014) Within-network classification using radius-constrained neighborhood patterns. In: CIKM, pp 1539–1548
Han J, Zheng K, Sun A, Shang S, Wen J (2016) Discovering neighborhood pattern queries by sample answers in knowledge base. In: ICDE, pp 1014–1025
Henao R, Li C, Carin L, Su Q, Shen D, Wang G, Wang W, Min MR, Zhang Y (2018) Baseline needs more love: on simple word-embedding-based models and associated pooling mechanisms. In: ACL, pp 440–450
Jagadish HV, Ooi BC, Tan K, Yu C, Zhang R (2005) idistance: an adaptive b+-tree based indexing method for nearest neighbor search. In: TODS, vol 30, pp 364–397
Li G, Xu J, Feng J (2012) Keyword-based k-nearest neighbor search in spatial databases. In: CIKM, pp 2144–2148
Li F, Yao B, Tang M, Hadjieleftheriou M (2013) Spatial approximate string search. TKDE 25(6):1394–1409
Li M, Chen L, Cong G, Gu Y, Yu G (2016) Efficient processing of location-aware group preference queries. In: CIKM, pp 559–568
Li X, Cheng Y, Cong G, Chen L (2017) Discovering pollution sources and propagation patterns in urban area. In: KDD, pp 1863–1872
Liu H, Xu J, Zheng K, Liu C, Du L, Wu X (2017) Semantic-aware query processing for activity trajectories. In: WSDM, pp 283–292
Liu A, Wang W, Shang S, Li Q, Zhang X (2018) Efficient task assignment in spatial crowdsourcing with worker and task privacy protection. GeoInformatica 22 (2):335–362
Mahmood AR, Aref WG (2017) Query processing techniques for big spatial-keyword data. In: SIGMOD, pp 1777–1782
Novak D, Batko M, Zezula P (2011) Metric index: an efficient and scalable solution for precise and approximate similarity search. Inf Syst 36(4):721–733
Qian Z, Xu J, Zheng K, Sun W, Li Z, Guo H (2016) On efficient spatial keyword querying with semantics. In: DASFAA, pp 149–164
Qian Z, Xu J, Zheng K, Zhao P, Zhou X (2018) Semantic-aware top-k spatial keyword queries. World Wide Web J 21(3):573–594
Ray S, Blanco R, Goel AK (2017) High performance location-based services in a main-memory database. GeoInformatica 21(2):293–322
Rocha-Junior JB, Gkorgkas O, Jonassen S, Nørvåg K (2011) Efficient processing of top-k spatial keyword queries. In: SSTD, pp 205–222
Shang S, Ding R, Yuan B, Xie K, Zheng K, Kalnis P (2012) User oriented trajectory search for trip recommendation. In: EDBT’12, pp 156–167
Shang S, Ding R, Zheng K, Jensen CS, Kalnis P, Zhou X (2014) Personalized trajectory matching in spatial networks. VLDB J 23(3):449–468
Shang S, Liu J, Zheng K, Lu H, Pedersen TB, Wen J (2015) Planning unobstructed paths in traffic-aware spatial networks. GeoInformatica 19(4):723–746
Shang S, Chen L, Wei Z, Guo D, Wen J (2016) Dynamic shortest path monitoring in spatial networks. J Comput Sci Technol 31(4):637–648
Shang S, Chen L, Jensen CS, Wen J, Kalnis P (2017) Searching trajectories by regions of interest. IEEE Trans Knowl Data Eng 29(7):1549–1562
Shang S, Chen L, Wei Z, Jensen CS, Zheng K, Kalnis P (2017) Trajectory similarity join in spatial networks. PVLDB 10(11):1178–1189
Shang S, Chen L, Wei Z, Jensen CS, Zheng K, Kalnis P (2018) Parallel trajectory similarity joins in spatial networks. VLDB J 27(3):395–420
Shang S, Chen L, Zheng K, Jensen CS, Wei Z, Kalnis P (2019) Parallel trajectory-to-location join. IEEE Trans Knowl Data Eng 31(6):1194–1207
Sun J, Xu J, Zheng K, Liu C (2017) Interactive spatial keyword querying with semantics. In: CIKM, pp 1727–1736
Traina C Jr, Filho RFS, Traina AJM, Vieira MR, Faloutsos C (2007) The omni-family of all-purpose access methods: a simple and effective way to make similarity search more efficient. VLDB J 16(4):483–505
Wang T, Li G, Feng J (2011) Efficient algorithms for top-k keyword queries on spatial databases. In: MDM, pp 285–286
Xu J, Gao Y, Liu C, Zhao L, Ding Z (2015) Efficient route search on hierarchical dynamic road networks. Distrib Parallel Databases 33(2):227–252
Yao B, Li F, Hadjieleftheriou M, Hou K (2010) Approximate string search in spatial databases. In: ICDE, pp 545–556
Yue X, Xi M, Chen B, Gao M, He Y, Xu J (2019) A revocable group signatures scheme to provide privacy-preserving authentications. In: MONET
Zhang C, Zhang Y, Zhang W, Lin X, Cheema MA, Wang X (2014) Diversified spatial keyword search on road networks. In: EDBT, pp 367–378
Zhang D, Chan C, Tan K (2014) Processing spatial keyword query as a top-k aggregation query. In: SIGIR, pp 355–364
Zhao K, Chen L, Cong G (2016) Topic exploration in spatio-temporal document collections. In: SIGMOD, pp 985–998
Zhao K, Liu Y, Yuan Q, Chen L, Chen Z, Cong G (2016) Towards personalized maps: mining user preferences from geo-textual data. PVLDB 9(13):1545–1548
Zhao J, Gao Y, Chen G, Chen R (2018) Why-not questions on top-k geo-social keyword queries in road networks. In: ICDE 2018, pp 965–976
Zheng K, Su H, Zheng B, Shang S, Xu J, Liu J, Zhou X (2015) Interactive top-k spatial keyword queries. In: ICDE, pp 423–434
Zheng K, Zheng B, Xu J, Liu G, Liu A, Li Z (2017) Popularity-aware spatial keyword search on activity trajectories. World Wide Web 20(4):749–773
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant Nos. 61872258, 61572335, 61772356, 61876117, and 61802273, the Dongguan Innovative Research Team Program under grant number 2018607201008, the Australian Research Council discovery projects under grant numbers DP160102412, DP170104747, DP180100212, and the Open Program of State Key Laboratory of Software Architecture under item number SKLSAOP1801.
Author information
Authors and Affiliations
Corresponding authors
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Chen, X., Xu, J., Zhou, R. et al. S2R-tree: a pivot-based indexing structure for semantic-aware spatial keyword search. Geoinformatica 24, 3–25 (2020). https://doi.org/10.1007/s10707-019-00372-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10707-019-00372-z