Quantifying the topological similarities of different parts of urban road networks enables us to understand urban growth patterns. Although conventional statistics provide useful information about the characteristics of either a single node’s direct neighbours or the entire network, such metrics fail to measure the similarities of subnetworks or capture local, indirect neighbourhood relationships. Here we propose a graph-based machine learning method to quantify the spatial homogeneity of subnetworks. We apply the method to 11,790 urban road networks across 30 cities worldwide to measure the spatial homogeneity of road networks within each city and across different cities. We find that intracity spatial homogeneity is highly associated with socioeconomic status indicators such as gross domestic product and population growth. Moreover, intercity spatial homogeneity values obtained by transferring the model across different cities reveal the intercity similarity of urban network structures, which originated in Europe and were passed on to cities in the United States and Asia. The socioeconomic development and intercity similarity revealed using our method can be leveraged to understand and transfer insights between cities. The method also enables us to address urban policy challenges, including network planning in rapidly urbanizing areas and regional inequality.
Data availability
We used the publicly available road network data from OpenStreetMap (https://www.openstreetmap.org/), retrieved via the OSMnx Python package (https://github.com/gboeing/osmnx). We also used publicly available images from Google Maps (https://www.google.com/maps) to validate the node merging results. The population data come from https://worldpopulationreview.com/, a visualization platform for the open datasets owned by the United Nations. The airport flow data for 21 cities are from the 2019 Annual Airport Traffic Report at https://www.panynj.gov/airports/en/statistics-general-info.html, owned by the Port Authority of New York and New Jersey, and are also publicly available. The airport flow data for the other nine cities can be accessed from the links listed in Supplementary Section 4. Both the road network data and the socioeconomic data are available at the online data warehouse: https://github.com/jiang719/road-network-predictability.git. The data are also available via Zenodo at https://doi.org/10.5281/zenodo.5866593 (ref. 109).
Code availability
The source code used to produce the training and testing results is available at the online data warehouse: https://github.com/jiang719/road-network-predictability.git. The code is also available via Zenodo at https://doi.org/10.5281/zenodo.5866593 (ref. 109).
Acknowledgements
We thank S. Rao from Purdue University for discussions about the comparison between the spatial homogeneity metric and existing network metrics. S.L. acknowledges support from the Ross-Lynn fellowship, Purdue University. T.Y. is partly funded by the National Science Foundation (grant number 1638311).
Author contributions
J.X., T.Y., S.V.U. and J.M. proposed the question. J.X., N.J., S.L., Q.P. and J.M. designed the research. N.J. trained and tested the GNN models. S.L. and Q.P. performed the intracity analysis. N.J. and J.X. conducted the intercity analysis. J.X. and J.M. drew the figures. J.X., T.Y., S.V.U. and J.M. wrote the paper.
