Abstract
Shortest path distances between node pairs on road networks are essential for many applications. Traditional methods, such as breadth first search (BFS) and Dijkstra algorithm, focus on precise result. However, they are difficult to apply to the large-scale road network because of the high time cost. And some methods precomputed the shortest path of all node pairs and stored them, then answer distance queries by simple lookups. But these methods need high space cost. For some applications, such as finding nearest point of interest (POI) for travel recommendations, which only need the approximate distances. Therefore, it is important to find a method to estimate the shortest path distance timely with low time and space cost. In this paper, we use CatBoost, a machine learning method based on gradient boosting decision trees, to estimate the shortest distance. We first obtain the node features based on landmarks and combine them with the longitude and latitude of the node and the real Euclidean distance between node pairs as the node features. Then we fed them into CatBoost model to train the model. The space complexity is O(k|V|) and the query time complexity is O(1). We conduct experiments on the real road network in Xiamen and New York City. Experiments demonstrate that our model can estimate the shortest distance with low error and small time cost and space cost.
This work was supported in part by the Natural Science Foundation of Guangdong under Grant 2021A1515011578; in part by the Natural Science Foundation of China under Grant 61672441 and Grant 61673324; in part by the Natural Science Foundation of Fujian under Grant 2018J01097; in part by the Shenzhen Basic Research Program under Grant JCYJ20170818141325209 and Grant JCYJ20190809161603551.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Chow, E.: A graph search heuristic for shortest distance paths. Citeseer (2005)
Dorogush, A.V., Ershov, V., Gulin, A.: CatBoost: gradient boosting with categorical features support. arXiv preprint arXiv:1810.11363 (2018)
Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864 (2016)
Gubichev, A., Bedathur, S., Seufert, S., Weikum, G.: Fast and accurate estimation of shortest paths in large graphs. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 499–508 (2010)
Maleki, S., Nguyen, D., Lenharth, A., Garzarán, M., Padua, D., Pingali, K.: DSMR: a parallel algorithm for single-source shortest path problem. In: Proceedings of the 2016 International Conference on Supercomputing, pp. 1–14 (2016)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Potamias, M., Bonchi, F., Castillo, C., Gionis, A.: Fast shortest path distance estimation in large networks. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 867–876 (2009)
Qi, J., Wang, W., Zhang, R., Zhao, Z.: A learning based approach to predict shortest-path distances (2020)
Rattigan, M.J., Maier, M., Jensen, D.: Using structure indices for efficient approximation of network properties. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 357–366 (2006)
Rizi, F.S., Schloetterer, J., Granitzer, M.: Shortest path distance approximation using deep learning techniques. In: 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 1007–1014. IEEE (2018)
Takes, F.W., Kosters, W.A.: Adaptive landmark selection strategies for fast shortest path computation in large real-world graphs. In: 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), vol. 1, pp. 27–34. IEEE (2014)
Tang, J.T., Wang, T., Wang, J.: Shortest path approximate algorithm for complex network analysis. Ruanjian Xuebao/J. Softw. 22(10), 2279–2290 (2011)
Zhang, H., Xie, X., Wen, Y., Zhang, Y.: A twig-based algorithm for top-k subgraph matching in large-scale graph data. In: Wang, G., Lin, X., Hendler, J., Song, W., Xu, Z., Liu, G. (eds.) WISA 2020. LNCS, vol. 12432, pp. 475–487. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-60029-7_43
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, L., Lai, Y., Chen, Q., Zeng, W., Yang, F., Yi, F. (2021). Shortest Path Distance Prediction Based on CatBoost. In: Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C. (eds) Web Information Systems and Applications. WISA 2021. Lecture Notes in Computer Science(), vol 12999. Springer, Cham. https://doi.org/10.1007/978-3-030-87571-8_12
Download citation
DOI: https://doi.org/10.1007/978-3-030-87571-8_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87570-1
Online ISBN: 978-3-030-87571-8
eBook Packages: Computer ScienceComputer Science (R0)