Proceeding Paper

Urban Functional Zone Mapping by Integrating Multi-Source Data and Spatial Relationship Characteristics †

1 School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2 China Mobile Communications Group Jiangsu Co., Ltd., Nanjing Branch, Nanjing 210008, China
* Author to whom correspondence should be addressed.
Presented at the 31st International Conference on Geoinformatics, Toronto, ON, Canada, 14–16 August 2024.
Proceedings 2024, 110(1), 17; https://doi.org/10.3390/proceedings2024110017
Published: 4 December 2024
(This article belongs to the Proceedings of The 31st International Conference on Geoinformatics)

Abstract

Timely and precise acquisition of urban functional zone (UFZ) information is crucial for effective urban planning, management, and resource allocation. However, current UFZ mapping approaches focus primarily on the visual and semantic characteristics of individual functional units, often overlooking the crucial spatial relationships between them, which leads to classification inaccuracies. To address this limitation, our study presents a novel framework for UFZ classification that integrates visual image features, Points of Interest (POI) semantic attributes, and spatial relationship information. This framework leverages the OpenStreetMap (OSM) road network to partition the study area into functional units, employs a graph model to represent urban functional nodes and their spatial topological relationships, and uses a Graph Convolutional Network (GCN) to fuse these multi-dimensional features through end-to-end learning for accurate urban function discrimination. Experiments using Gaofen-2 (GF-2) satellite imagery, POI data, and OSM road network data from Shenzhen, China show that our method improves classification accuracy across all functional categories, surpassing approaches that rely solely on visual or semantic features. The overall classification accuracy reached 87.92%, a 2.08% increase over methods that disregard spatial relationship features. Our method also outperformed similar techniques, underscoring its effectiveness and potential for widespread application in UFZ classification.

1. Introduction

The timely and accurate acquisition of geospatial information, such as the type, location, and extent of urban functional zones (UFZs), is crucial for urban planning and management, resource allocation, and environmental protection [1]. Currently, methods for identifying UFZs can generally be divided into three main categories: remote sensing-based, social sensing-based, and multi-source data fusion-based approaches [2].
UFZs exhibit both natural physical attributes and socio-economic characteristics. Remote sensing technology can capture the visual characteristics, elemental composition, and spatial structure of functional zones [3], whereas social sensing data, such as taxi trajectories, social media, and Points of Interest (POI), capture the rhythms of human activity and the functional attributes of cities. Integrating these diverse data sources can provide complementary information for the discrimination of UFZs and enhance their recognition accuracy. Various data integration methods have been explored, combining remote-sensing images with POI [4,5,6], trajectory data [7,8], and social media data [9]. However, these methods focus mainly on the features of individual functional units and frequently fail to consider their spatial context. In real urban environments, there are often complex spatial relationships among the units of UFZs, which provide valuable insights for their identification.
To effectively leverage this spatial information, we propose a UFZ classification model that integrates multi-source data and spatial relationship features. By considering the spatial topological relationships of functional units and integrating their internal feature information, this model significantly improves the recognition accuracy of UFZs.

2. Methodology

2.1. Study Region and Data Sets

The selected study area is Shenzhen, China. Shenzhen (113°43′ E~114°38′ E, 22°24′ N~22°52′ N) is a coastal city in southern China, located on the east coast of the Pearl River Estuary, as shown in Figure 1. The city's main functional categories include residential, commercial, industrial, public service, green space, and political, educational, and cultural zones. In this study, the OpenStreetMap (OSM) road network is used to define the basic spatial units, namely street blocks. Gaofen-2 (GF-2) satellite images with a spatial resolution of 1 m are used to extract the visual features of functional units. Additionally, semantic features are extracted from approximately 300,000 POIs collected from Baidu Maps in 2019.

2.2. Proposed Method

The proposed UFZ mapping method integrates multi-source data and spatial relationship features, and its overall workflow is depicted in Figure 2. The process begins by dividing the study area into street block units using the three-level OSM road network and constructing a region adjacency graph (RAG) from the generated street blocks. Next, a Graph Convolutional Network (GCN) performs feature learning on the nodes of the RAG; the GCN model effectively integrates the internal features of the nodes with the spatial relationships among them. Finally, by classifying the nodes in the graph, the categories of the functional units are obtained. The advantage of this method is that it accounts for both the spatial relationships and the internal features of UFZs, improving classification accuracy.

2.2.1. Graph Structure Construction Based on Street Block Units

Given the constraints imposed by road infrastructure on urban planning and human activities, this study uses street blocks, defined by multi-level road networks (including primary, secondary, and tertiary roads), as the basic unit for urban function analysis. To leverage GCN capabilities on any graph structure, we treat each street block as a node and construct an undirected graph G = (V, E, A) to represent the spatial relationships between these irregular street blocks. In this graph, V represents the nodes, and E denotes the edges that connect these nodes, illustrating the spatial relationships between street blocks. An edge e_ij ∈ E signifies a connection between nodes v_i and v_j. The region adjacency matrix A ∈ ℝ^{N×N} defines the adjacency between street blocks, where N represents the total number of nodes. The adjacency matrix is determined by the distance between the centers of all adjacent blocks, as follows:
A_{ij} = \begin{cases} \dfrac{1}{d_{ij}}, & \text{if blocks } i \text{ and } j \text{ are adjacent} \\ 0, & \text{otherwise} \end{cases} \qquad (1)
where d_ij is the distance between the centers of two adjacent blocks i and j, and A_ij indicates the adjacency between them. This region adjacency matrix serves as the foundation for feature propagation during training of the subsequent GCN model.
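As a concrete illustration, the following minimal sketch builds the weighted region adjacency matrix of Equation (1) in Python. It is a sketch under stated assumptions, not the authors' code: the street blocks are assumed to be available as polygons in a geopandas GeoDataFrame, adjacency is taken as polygons sharing a boundary, and the helper name build_region_adjacency is hypothetical.

```python
import numpy as np
import geopandas as gpd  # assumed container for the street-block polygons


def build_region_adjacency(blocks: gpd.GeoDataFrame) -> np.ndarray:
    """Weighted region adjacency matrix following Equation (1)."""
    n = len(blocks)
    centers = blocks.geometry.centroid          # block centers used for d_ij
    A = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(i + 1, n):
            # Two street blocks are treated as adjacent when their
            # polygons share a boundary (an assumption of this sketch).
            if blocks.geometry.iloc[i].touches(blocks.geometry.iloc[j]):
                d_ij = centers.iloc[i].distance(centers.iloc[j])
                if d_ij > 0:
                    A[i, j] = A[j, i] = 1.0 / d_ij   # A_ij = 1/d_ij
    return A
```

For a city-scale road network, the O(N²) pairwise test would normally be replaced by a spatial-index query, but the weighting itself is unchanged.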

2.2.2. Multimodal Feature Extraction

This study proposes a multimodal feature extraction framework that integrates GF-2 satellite imagery and POI data. For each block image, the pre-trained Se-ResNet50 model is used for feature learning to obtain a vector representation of the visual features. Simultaneously, the POI vocabulary within each block is processed with the natural language processing model Word2Vec to obtain a word-vector representation of its semantic features. Letting f_image and f_poi denote the visual and semantic feature vectors of a block, respectively, we define x = f_image + f_poi as its feature vector representation, which combines the information of the two modalities and provides a more comprehensive representation of each block.
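To make the fusion step concrete, the sketch below pairs a pre-trained Se-ResNet50 backbone (obtained here via the timm library, which is an assumption; the paper does not name its source) with gensim's Word2Vec for the POI semantics. The paper writes x = f_image + f_poi without specifying whether "+" denotes element-wise summation or concatenation; concatenation is assumed here, and the block's semantic vector is taken as the mean of its POI word vectors.

```python
import numpy as np
import torch
import timm                        # assumed source of the pre-trained Se-ResNet50
from gensim.models import Word2Vec

# num_classes=0 strips the classifier head so the backbone returns feature vectors.
cnn = timm.create_model("seresnet50", pretrained=True, num_classes=0)
cnn.eval()


def block_features(block_image: torch.Tensor, poi_words: list,
                   w2v: Word2Vec) -> torch.Tensor:
    """Fused feature vector x for one street block (hypothetical helper)."""
    with torch.no_grad():
        f_image = cnn(block_image.unsqueeze(0)).squeeze(0)   # visual feature f_image
    # Semantic feature f_poi: mean Word2Vec vector of the block's POI vocabulary.
    vecs = [w2v.wv[w] for w in poi_words if w in w2v.wv]
    f_poi = (torch.as_tensor(np.mean(vecs, axis=0)).float() if vecs
             else torch.zeros(w2v.vector_size))
    return torch.cat([f_image, f_poi])                       # fused node feature x
```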

2.2.3. UFZs Learning Based on Multimodal Features and GCN

After obtaining the multimodal features of the nodes and the adjacency matrix that characterizes their spatial relationships, the GCN model is adopted to aggregate information between each central node and its neighboring nodes. GCN is a powerful graph neural network architecture that captures local relationships between nodes by applying convolution operations to graph-structured data. Let X_G = [x_1, x_2, ..., x_N]^T ∈ ℝ^{N×D} denote the feature matrix, where N is the number of nodes, D is the dimension of the node feature vectors, and x_i is the feature vector of the i-th node. The forward propagation of the GCN is then formalized as:
H^{(l+1)} = \mathrm{ReLU}\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X_G^{(l)} \theta^{(l)} \right) \qquad (2)
In Equation (2), H^{(l+1)} is the output feature matrix of the (l+1)-th GCN layer; Ã is the normalized adjacency matrix, and D̃ is the degree matrix, a diagonal matrix comprising the degree (number of neighboring nodes) of each node; X_G^{(l)} is the input feature matrix of the l-th layer, and θ^{(l)} is the learnable weight matrix of the l-th layer.
The operation of the GCN consists of three stages. First, the feature information of each node is transmitted to its neighboring nodes and aggregated with their features to obtain local neighborhood features. Then, each node's own features are combined with the received neighborhood information to form a new feature representation. Finally, a nonlinear activation function (ReLU) is applied to improve the model's expressive power, enabling it to learn more complex feature representations.
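A minimal PyTorch sketch of one such layer, implementing Equation (2) with dense matrices, is given below. The authors' implementation is not published, so this is an illustration only, with Ã taken as A + I (self-loops added) as in standard GCN formulations.

```python
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph convolution: ReLU(D̃^(-1/2) Ã D̃^(-1/2) X θ), per Equation (2)."""

    def __init__(self, in_dim: int, out_dim: int, activation: bool = True):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)  # learnable weights θ^(l)
        self.activation = activation

    def forward(self, X: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        A_tilde = A + torch.eye(A.size(0))       # adjacency with self-loops (Ã)
        d = A_tilde.sum(dim=1)                   # node degrees
        D_inv_sqrt = torch.diag(d.pow(-0.5))     # D̃^(-1/2)
        H = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ self.theta(X)
        return torch.relu(H) if self.activation else H
```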

3. Experiments and Analysis

3.1. Experimental Setup

Our hardware environment was a computer with a GTX 1080 GPU, 32 GB of memory, and an 8-thread CPU. The software stack comprised the Windows 10 operating system, CUDA 11, and PyTorch 1.7. For training the Se-ResNet50 model, the batch size was set to 32, the cross-entropy loss function and SGD optimizer were used for model optimization, and the number of iterations was set to 30. For the Word2Vec-based semantic feature representation, the word-vector dimension was set to 60, given the limited diversity of POI samples. For GCN training, the SGD optimizer and cross-entropy loss function were also used, with the learning rate set to 1 × 10⁻⁴. All datasets were divided into training and testing sets at a 7:3 ratio.
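Under this setup, a full-graph training loop might look like the following sketch. The hidden width, epoch count, and the X, A, labels, and train_mask variables are illustrative assumptions; GCNLayer is the sketch from Section 2.2.3, and only the optimizer, loss, and learning rate follow the paper.

```python
import torch
import torch.nn as nn


class GCN(nn.Module):
    """Two-layer GCN for street-block classification; the depth is an assumption."""

    def __init__(self, in_dim: int, hidden_dim: int, n_classes: int):
        super().__init__()
        self.layer1 = GCNLayer(in_dim, hidden_dim)                       # Section 2.2.3 sketch
        self.layer2 = GCNLayer(hidden_dim, n_classes, activation=False)  # raw logits

    def forward(self, X: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        return self.layer2(self.layer1(X, A), A)


# X (node features), A (region adjacency matrix), labels, and train_mask
# (boolean mask for the 70% training split) are assumed to exist.
model = GCN(in_dim=X.size(1), hidden_dim=128, n_classes=6)  # 6 UFZ categories
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)    # setup reported above
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):                                    # epoch count is illustrative
    optimizer.zero_grad()
    logits = model(X, A)                                    # full-graph forward pass
    loss = loss_fn(logits[train_mask], labels[train_mask])  # supervise training nodes only
    loss.backward()
    optimizer.step()
```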

3.2. Results and Analysis

In this experiment, we tested and compared four methods that use different features, namely, only visual features (Baseline1), only POI semantic features (Baseline2), a combination of the two aforementioned features (Baseline3), and the fusion of visual features, semantic features, and spatial relationships (Proposed). The confusion matrices for the four methods are presented in Figure 3.
The UFZ classification results of these methods are illustrated in Figure 4, and the overall accuracy (OA) and Kappa coefficient are presented in Table 1.
The experimental results demonstrate that methods based on fused features achieve better performance than those relying on a single feature. The proposed method achieved the best classification results, with an OA of 87.92% and a Kappa coefficient of 0.8405, improvements of 2.08% and 2.72%, respectively, over the Baseline3 method. The accuracy for the different functional zones is as follows (Figure 3): 85.8% for commercial, 89.2% for residential, 89.3% for industrial, 89.3% for public service, 87.9% for political, educational, and cultural, and 81.5% for green space.
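For reference, the two reported metrics can be computed from the test-set predictions as in the short sketch below, using scikit-learn's accuracy_score and cohen_kappa_score; the model, X, A, labels, and test_mask variables carry over from the training sketch and are assumptions of this illustration.

```python
import torch
from sklearn.metrics import accuracy_score, cohen_kappa_score

with torch.no_grad():
    pred = model(X, A).argmax(dim=1)        # predicted UFZ class per street block

y_true = labels[test_mask].numpy()          # ground truth on the 30% test split
y_pred = pred[test_mask].numpy()
oa = accuracy_score(y_true, y_pred)         # overall accuracy (OA)
kappa = cohen_kappa_score(y_true, y_pred)   # Kappa coefficient
print(f"OA = {oa:.2%}, Kappa = {kappa:.4f}")
```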

4. Conclusions

UFZs not only have physical and socio-economic properties but also exhibit complex spatial relationships. This study proposes a UFZ classification method that integrates visual image features, POI semantic features, and the spatial relationship features between functional units. Experimental results on the Shenzhen dataset show that the overall accuracy and Kappa coefficient of the proposed method are 87.92% and 0.8405, respectively. Compared with methods that rely only on visual and/or semantic features, it significantly improves the classification accuracy of UFZs. These results demonstrate the strong advantage of spatial relationship features for improving UFZ classification. In the future, we will introduce knowledge graph technology to model the complex spatial and semantic relationships in functional areas and study a UFZ classification framework that couples knowledge graphs with multimodal large models to further improve the mapping accuracy of UFZs.

Author Contributions

D.Z., Y.C. and W.S. conceived and designed the model. W.S. performed the experiments. X.D. analyzed the data. D.Z. wrote the paper. Y.C. and W.L. reviewed and revised the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX24_1217).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Acknowledgments

We appreciate the helpful comments of the reviewers.

Conflicts of Interest

The authors declare that they have no conflicts of interest. China Mobile Communications Group Jiangsu Co., Ltd. has no commercial conflict of interest in this work.

References

1. Chen, Y.; Dang, X.; Zhu, D.; Huang, Y.; Qin, K. Urban functional zone mapping by coupling domain knowledge graphs and high-resolution satellite images. Trans. GIS 2024, 28, 1510–1535.
2. Chen, Y.; Shi, W.; Dang, X.; Wu, C.; Li, S. Classification of Urban Functional Zones by Integrating Spatial Features of VHR Satellite Images and Semantic Features of POI Data. In Proceedings of the 2022 29th International Conference on Geoinformatics, Beijing, China, 15–18 August 2022; pp. 1–6.
3. Chen, Y.; Yao, S.; Hu, Z.; Huang, B.; Miao, L.; Zhang, J. Built-Up Area Extraction Combing Densely Connected Dual-Attention Network and Multiscale Context. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 5128–5143.
4. Du, S.; Du, S.; Liu, B.; Zhang, X.; Zheng, Z. Large-scale urban functional zone mapping by integrating remote sensing images and open social data. GIScience Remote Sens. 2020, 57, 411–430.
5. Xu, N.; Luo, J.; Wu, T.; Dong, W.; Liu, W.; Zhou, N. Identification and Portrait of Urban Functional Zones Based on Multisource Heterogeneous Data and Ensemble Learning. Remote Sens. 2021, 13, 373.
6. Guo, Z.; Wen, J.; Xu, R. A Shape and Size Free-CNN for Urban Functional Zone Mapping with High-Resolution Satellite Images and POI Data. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5622117.
7. Cai, L.; Zhang, L.; Liang, Y.; Li, J. Discovery of urban functional regions based on Node2vec. Appl. Intell. 2022, 52, 16886–16899.
8. Yan, X.; Jiang, Z.; Luo, P.; Wu, H.; Dong, A.; Mao, F.; Wang, Z.; Liu, H.; Yao, Y. A multimodal data fusion model for accurate and interpretable urban land use mapping with uncertainty analysis. Int. J. Appl. Earth Obs. Geoinf. 2024, 129, 103805.
9. Xu, X.; Bai, Y.; Liu, Y.; Zhao, X.; Sun, Y. MM-UrbanFAC: Urban Functional Area Classification Model Based on Multimodal Machine Learning. IEEE Trans. Intell. Transp. Syst. 2022, 23, 8488–8497.
Figure 1. GF-2 satellite imagery of Shenzhen.
Figure 2. The proposed framework for UFZ classification.
Figure 3. Confusion matrices for different features. (a) Visual features; (b) semantic features; (c) visual and semantic features; (d) visual, semantic, and spatial relationship features.
Figure 4. UFZ mapping results that integrate visual, semantic, and spatial relationship features.
Table 1. OA and Kappa coefficient of UFZ classification.

Method      Feature                                        OA (%)    Kappa (%)
Baseline1   Visual                                         70.51     61.80
Baseline2   Semantic                                       77.21     69.87
Baseline3   Visual and semantic                            85.84     81.33
Proposed    Visual, semantic, and spatial relationship     87.92     84.05
