EfficientUNet+: A Building Extraction Method for Emergency Shelters Based on Deep Learning
Figures
Figure 1. The technical route of this study.
Figure 2. EfficientUNet+ module structure.
Figure 3. MBConv structure.
Figure 4. Operation flow of the scSE module.
Figure 5. Spatial distribution of park emergency shelter sites within the Fifth Ring Road of Beijing.
Figure 6. Part of the details of the WHU aerial building dataset and the Google building dataset of emergency shelters within the Fifth Ring Road of Beijing. (a) WHU aerial building dataset. (b) Google building dataset of emergency shelters within the Fifth Ring Road of Beijing.
Figure 7. Original image, building ground truth, and extraction results of the emergency shelter in Chaoyang Park. (a) Original image. (b) Ground truth. (c) Extraction results.
Figure 8. Extraction results of buildings in emergency shelters of Chaoyang Park. (a) Google image. (b) Building ground truth. (c) Building extraction results.
Figure 9. Feature map visualization. (a) Sample image. (b) Depth = 1. (c) Depth = 2. (d) Depth = 3. (e) Depth = 4. (f) Depth = 5.
Figure 10. Partial details of the buildings in the emergency shelter extracted by different methods. (a) Original image. (b) Ground truth. (c) EfficientUNet+. (d) DeepLabv3+. (e) PSPNet. (f) ResUNet. (g) HRNet.
Figure 11. Accuracy comparison chart of different methods.
Figure 12. Building extraction results with or without the scSE. (a) Original image. (b) Ground truth. (c) EfficientUNet+. (d) EfficientUNet (without scSE).
Figure 13. Building extraction results with different loss functions. (a) Original image. (b) Ground truth. (c) Loss_CE_BW + Loss_Dice. (d) Loss_CE + Loss_Dice.
Figure 14. Building extraction results with and without transfer learning. (a) Original image. (b) Ground truth. (c) EfficientUNet+ with transfer learning. (d) EfficientUNet+ without transfer learning.
Figure 15. Visualization graph of training loss and epochs.
Abstract
1. Introduction
- (1) We use EfficientNet-b0 as the encoder to balance model accuracy and speed. Because the features extracted by the encoder largely determine the segmentation result, we also embed spatial and channel squeeze and excitation (scSE) blocks in the decoder to recalibrate the features (a minimal sketch follows this list).
- (2) Accurately delineating the boundaries of positive samples (buildings) has long been a challenge for segmentation. We tackle it from the loss-function side by weighting the building boundary region in the cross-entropy term and combining it with the Dice loss (see the loss sketch after this list).
- (3) Producing a large number of samples for emergency shelters within the Fifth Ring Road of Beijing is time-consuming and labor-intensive. We therefore apply transfer learning from the public WHU aerial building dataset, achieving high extraction accuracy with only a few target samples while improving the computational efficiency and robustness of the model (see the fine-tuning sketch after this list).
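To make contribution (1) concrete, the following is a minimal PyTorch sketch of a concurrent spatial and channel squeeze-and-excitation (scSE) block in the spirit of Roy et al. [43]. The class name, reduction ratio, and the additive fusion of the two branches are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    """Concurrent spatial and channel squeeze-and-excitation (scSE)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Channel squeeze and excitation (cSE): global pooling + bottleneck -> per-channel weights
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial squeeze and excitation (sSE): 1x1 conv -> per-pixel attention map
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recalibrate the feature map along channels and spatial positions, then fuse the branches.
        return x * self.cse(x) + x * self.sse(x)
```

A block like this would be applied to the feature map of each decoder stage before it is passed on to the next upsampling step.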
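For contribution (2), the sketch below shows one way to realize a boundary-weighted cross-entropy term combined with a Dice term (Loss_CE_BW + Loss_Dice). The morphological approximation of the boundary band via max-pooling, the kernel size, and the boundary weight value are assumptions made for illustration; the paper's exact weighting scheme may differ.

```python
import torch
import torch.nn.functional as F

def boundary_weights(mask: torch.Tensor, kernel: int = 5, w_boundary: float = 2.0) -> torch.Tensor:
    """Per-pixel weights emphasizing a band around building boundaries.

    mask: binary ground truth of shape (N, 1, H, W) with values in {0, 1}.
    The boundary band is approximated as dilation minus erosion of the mask,
    both implemented with max-pooling.
    """
    pad = kernel // 2
    dilated = F.max_pool2d(mask, kernel, stride=1, padding=pad)
    eroded = 1.0 - F.max_pool2d(1.0 - mask, kernel, stride=1, padding=pad)
    boundary = (dilated - eroded).clamp(0.0, 1.0)
    return 1.0 + (w_boundary - 1.0) * boundary  # weight 1 elsewhere, w_boundary on the band

def loss_ce_bw_dice(logits: torch.Tensor, mask: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Boundary-weighted binary cross-entropy plus Dice loss."""
    ce = F.binary_cross_entropy_with_logits(logits, mask, weight=boundary_weights(mask))
    prob = torch.sigmoid(logits)
    intersection = (prob * mask).sum()
    dice = 1.0 - (2.0 * intersection + eps) / (prob.sum() + mask.sum() + eps)
    return ce + dice
```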
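Contribution (3) follows a standard pretrain-then-fine-tune recipe. The sketch below uses the segmentation_models_pytorch package purely as an off-the-shelf stand-in for EfficientUNet+ (the paper does not state which implementation was used), and the checkpoint file name is hypothetical; it also illustrates selecting the EfficientNet-b0 encoder [42] and scSE decoder attention from contribution (1).

```python
import torch
import segmentation_models_pytorch as smp

# UNet-style model with an EfficientNet-b0 encoder and scSE attention in the decoder
# (an approximation of EfficientUNet+, not the authors' exact code).
model = smp.Unet(
    encoder_name="efficientnet-b0",
    encoder_weights=None,          # weights come from WHU pretraining below
    in_channels=3,
    classes=1,                     # single building/background mask
    decoder_attention_type="scse",
)

# Transfer learning: start from weights learned on the WHU aerial building dataset
# ("whu_pretrained.pth" is a hypothetical checkpoint name), then fine-tune on the
# much smaller Beijing emergency-shelter dataset with Adam.
model.load_state_dict(torch.load("whu_pretrained.pth", map_location="cpu"))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# ...a standard training loop over the target-domain samples follows.
```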
2. Methods
2.1. EfficientUNet+ Module Overview
2.2. EfficientNet-b0
2.3. scSE Module
2.4. Loss Function
2.5. Transfer Learning
3. Experimental Results
3.1. Study Area and Data
3.1.1. Study Area
3.1.2. Dataset
3.2. Experimental Environment and Parameter Settings
3.3. Accuracy Evaluation
3.4. Experimental Results
4. Discussion
4.1. Comparison to State-of-the-Art Studies
4.2. Ablation Experiment
4.2.1. scSE Module
4.2.2. Loss Function
4.2.3. Transfer Learning
4.3. Efficiency Evaluation
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
CNN | Convolutional Neural Network |
FCN | Fully Convolutional Network |
DSM | Digital Surface Model |
GIS | Geographic Information System |
scSE | Spatial and Channel Squeeze and Excitation |
sSE | Spatial Squeeze and Excitation |
cSE | Channel Squeeze and Excitation |
BN | Batch Normalization |
SE | Squeeze and Excitation |
mIoU | Mean Intersection over Union |
TP | True Positive |
FP | False Positive |
FN | False Negative |
Adam | Adaptive Moment Estimation |
GPU | Graphics Processing Unit |
PSPNet | Pyramid Scene Parsing Network |
ResUNet | Residual UNet |
HRNet | High-Resolution Net |
References
- Chen, Q.; Wang, L.; Waslander, S.; Liu, X. An end-to-end shape modeling framework for vectorized building outline generation from aerial images. ISPRS J. Photogramm. Remote Sens. 2020, 170, 114–126. [Google Scholar] [CrossRef]
- Janalipour, M.; Mohammadzadeh, A. Evaluation of effectiveness of three fuzzy systems and three texture extraction methods for building damage detection from post-event LiDAR data. Int. J. Digit. Earth 2018, 11, 1241–1268. [Google Scholar] [CrossRef]
- Melgarejo, L.; Lakes, T. Urban adaptation planning and climate-related disasters: An integrated assessment of public infrastructure serving as temporary shelter during river floods in Colombia. Int. J. Disaster Risk Reduct. 2014, 9, 147–158. [Google Scholar] [CrossRef] [Green Version]
- GB21734-2008; Earthquake Emergency Shelter Site and Supporting Facilities. National Standards of People’s Republic of China: Beijing, China, 2008.
- Jing, J. Beijing Municipal Planning Commission announced the Outline of Planning for Earthquake and Emergency Refuge Places (Outdoor) in Beijing Central City. Urban Plan. Newsl. 2007, 21, 1. [Google Scholar]
- Yu, J.; Wen, J. Multi-criteria Satisfaction Assessment of the Spatial Distribution of Urban Emergency Shelters Based on High-Precision Population Estimation. Int. J. Disaster Risk Sci. 2016, 7, 413–429. [Google Scholar] [CrossRef] [Green Version]
- Hui, J.; Du, M.; Ye, X.; Qin, Q.; Sui, J. Effective Building Extraction from High-Resolution Remote Sensing Images with Multitask Driven Deep Neural Network. IEEE Geosci. Remote Sens. Lett. 2018, 16, 786–790. [Google Scholar] [CrossRef]
- Xu, Z.; Zhou, Y.; Wang, S.; Wang, L.; Li, F.; Wang, S.; Wang, Z. A Novel Intelligent Classification Method for Urban Green Space Based on High-Resolution Remote Sensing Images. Remote Sens. 2020, 12, 3845. [Google Scholar] [CrossRef]
- Dai, Y.; Gong, J.; Li, Y.; Feng, Q. Building segmentation and outline extraction from UAV image-derived point clouds by a line growing algorithm. Int. J. Digit. Earth 2017, 10, 1077–1097. [Google Scholar] [CrossRef]
- Zeng, Y.; Guo, Y.; Li, J. Recognition and extraction of high-resolution satellite remote sensing image buildings based on deep learning. Neural Comput. Appl. 2021, 5, 2691–2706. [Google Scholar] [CrossRef]
- Jing, W.; Xu, Z.; Ying, L. Texture-based segmentation for extracting image shape features. In Proceedings of the 19th International Conference on Automation and Computing (ICAC), London, UK, 13–14 September 2013; pp. 13–14. [Google Scholar]
- Huang, Z.; Cheng, G.; Wang, H.; Li, H.; Shi, L.; Pan, C. Building extraction from multi-source remote sensing images via deep deconvolution neural networks. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 1835–1838. [Google Scholar]
- Blaschke, T. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 2010, 65, 2–16. [Google Scholar] [CrossRef] [Green Version]
- Chen, G.; Zhang, X.; Wang, Q.; Dai, F.; Gong, Y.; Zhu, K. Symmetrical dense-shortcut deep fully convolutional networks for semantic segmentation of very-high-resolution remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1633–1644. [Google Scholar] [CrossRef]
- Zhang, J.; Li, T.; Lu, X.; Cheng, Z. Semantic classification of high-resolution remote-sensing images based on mid-level features. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 2343–2353. [Google Scholar] [CrossRef]
- Gong, M.; Zhan, T.; Zhang, P.; Miao, Q. Superpixel-based difference representation learning for change detection in multispectral remote sensing images. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2658–2673. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137. [Google Scholar] [CrossRef] [Green Version]
- Liu, H.; Luo, J.; Huang, B.; Hu, X.; Sun, Y.; Yang, Y.; Xu, N.; Zhou, N. DE-Net: Deep Encoding Network for Building Extraction from High-Resolution Remote Sensing Imagery. Remote Sens. 2019, 11, 2380. [Google Scholar] [CrossRef] [Green Version]
- Zhu, Q.; Liao, C.; Han, H.; Mei, X.; Li, H. MAP-Net: Multiple attending path neural network for building footprint extraction from remote sensed imagery. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6169–6181. [Google Scholar] [CrossRef]
- Yang, X.; Sun, H.; Fu, K.; Yang, J.; Sun, X.; Yan, M.; Guo, Z. Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sens. 2018, 10, 132. [Google Scholar] [CrossRef] [Green Version]
- Jin, Y.; Xu, W.; Zhang, C.; Luo, X.; Jia, H. Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images. Remote Sens. 2021, 13, 692. [Google Scholar] [CrossRef]
- Tang, Z.; Chen, C.; Jiang, C.; Zhang, D.; Luo, W.; Hong, Z.; Sun, H. Capsule–Encoder–Decoder: A Method for Generalizable Building Extraction from Remote Sensing Images. Remote Sens. 2022, 14, 1235. [Google Scholar] [CrossRef]
- Li, S.; Fu, S.; Zheng, D. Rural Built-Up Area Extraction from Remote Sensing Images Using Spectral Residual Methods with Embedded Deep Neural Network. Sustainability 2022, 14, 1272. [Google Scholar] [CrossRef]
- Chen, X.; Qiu, C.; Guo, W.; Yu, A.; Tong, X.; Schmitt, M. Multiscale Feature Learning by Transformer for Building Extraction from Satellite Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 2503605. [Google Scholar] [CrossRef]
- Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. arXiv 2015, arXiv:1411.4038. [Google Scholar]
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 645–657. [Google Scholar] [CrossRef] [Green Version]
- Bittner, K.; Adam, F.; Cui, S.; Korner, M.; Reinartz, P. Building footprint extraction from VHR remote sensing images combined with normalized DSMs using fused Fully Convolutional Networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 2615–2629. [Google Scholar] [CrossRef] [Green Version]
- Xu, Y.; Wu, L.; Xie, Z.; Chen, Z. Building Extraction in Very High Resolution Remote Sensing Imagery Using Deep Learning and Guided Filters. Remote Sens. 2018, 10, 144. [Google Scholar] [CrossRef] [Green Version]
- Wei, S.; Ji, S.; Lu, M. Toward automatic building footprint delineation from aerial images using CNN and regularization. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2178–2189. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. arXiv 2015, arXiv:1505.04597. [Google Scholar]
- Ye, Z.; Fu, Y.; Gan, M.; Deng, J.; Comber, A.; Wang, K. Building extraction from very high resolution aerial imagery using joint attention deep neural network. Remote Sens. 2019, 11, 2970. [Google Scholar] [CrossRef] [Green Version]
- Qin, X.; Zhang, Z.; Huang, C.; Dehghan, M.; Zaiane, O.; Jagersand, M. U2Net: Going deeper with nested U-structure for salient object detection. Pattern Recognit. 2020, 106, 107404. [Google Scholar] [CrossRef]
- Peng, X.; Zhong, R.; Li, Z.; Li, Q. Optical remote sensing image change detection based on attention mechanism and image difference. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7296–7307. [Google Scholar] [CrossRef]
- Wang, Y.; Zeng, X.; Liao, X.; Zhuang, D. B-FGC-Net: A Building Extraction Network from High Resolution Remote Sensing Imagery. Remote Sens. 2022, 14, 269. [Google Scholar] [CrossRef]
- Wang, H.; Miao, F. Building extraction from remote sensing images using deep residual U-Net. Eur. J. Remote Sens. 2022, 55, 71–85. [Google Scholar] [CrossRef]
- Tian, Q.; Zhao, Y.; Li, Y.; Chen, J.; Chen, X.; Qin, K. Multiscale Building Extraction with Refined Attention Pyramid Networks. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
- Cao, D.; Xing, H.; Wong, M.; Kwan, M.; Xing, H.; Meng, Y. A Stacking Ensemble Deep Learning Model for Building Extraction from Remote Sensing Images. Remote Sens. 2021, 13, 3898. [Google Scholar] [CrossRef]
- Tadepalli, Y.; Kollati, M.; Kuraparthi, S.; Kora, P. EfficientNet-B0 Based Monocular Dense-Depth Map Estimation. Trait. Signal 2021, 38, 1485–1493. [Google Scholar] [CrossRef]
- Zhao, P.; Huang, L. Multi-Aspect SAR Target Recognition Based on EfficientNet and GRU. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Waikoloa, HI, USA, 26 September–2 October 2020; pp. 1651–1654. [Google Scholar]
- Alhichri, H.; Alswayed, A.; Bazi, Y.; Ammour, N.; Alajlan, N. Classification of Remote Sensing Images using EfficientNet-B3 CNN Model with Attention. IEEE Access 2021, 9, 14078–14094. [Google Scholar] [CrossRef]
- Ferrari, L.; Dell’Acqua, F.; Zhang, P.; Du, P. Integrating EfficientNet into an HAFNet Structure for Building Mapping in High-Resolution Optical Earth Observation Data. Remote Sens. 2021, 13, 4361. [Google Scholar] [CrossRef]
- Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019; p. 97. [Google Scholar]
- Roy, A.; Navab, N.; Wachinger, C. Concurrent Spatial and Channel Squeeze & Excitation in Fully Convolutional Networks. arXiv 2018, arXiv:1803.02579. [Google Scholar]
- Mondal, A.; Agarwal, A.; Dolz, J.; Desrosiers, C. Revisiting CycleGAN for semi-supervised segmentation. arXiv 2019, arXiv:1908.11569. [Google Scholar]
- Qin, X.; He, S.; Yang, X.; Dehghan, M.; Qin, Q.; Martin, J. Accurate outline extraction of individual building from very high-resolution optical images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1775–1779. [Google Scholar] [CrossRef]
- Das, A.; Chandran, S. Transfer Learning with Res2Net for Remote Sensing Scene Classification. In Proceedings of the 11th International Conference on Cloud Computing, Data Science and Engineering (Confluence), Amity University, Noida, India, 28–29 January 2021; pp. 796–801. [Google Scholar]
- Zhu, Q.; Shen, F.; Shang, P.; Pan, Y.; Li, M. Hyperspectral Remote Sensing of Phytoplankton Species Composition Based on Transfer Learning. Remote Sens. 2019, 11, 2001. [Google Scholar] [CrossRef] [Green Version]
- Seventh National Census Communiqué; National Bureau of Statistics: Beijing, China, 2021.
- Ji, S.; Wei, S. Building extraction via convolution neural networks from an open remote sensing building dataset. Acta Geod. Cartogr. Sin. 2019, 48, 448–459. [Google Scholar]
- Wang, Y.; Chen, C.; Ding, M.; Li, J. Real-time dense semantic labeling with dual-Path framework for high-resolution remote sensing image. Remote Sens. 2019, 11, 3020. [Google Scholar] [CrossRef] [Green Version]
- Ji, S.; Wei, S.; Lu, M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans. Geosci. Remote Sens. 2018, 57, 574–586. [Google Scholar] [CrossRef]
- Chen, L.C.E.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 833–851. [Google Scholar]
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6230–6239. [Google Scholar]
- Zhang, Z.X.; Liu, Q.J.; Wang, Y.H. Road Extraction by Deep Residual U-Net. IEEE Geosci. Remote Sens. Lett. 2018, 15, 749–753. [Google Scholar] [CrossRef] [Green Version]
- Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 3349–3364. [Google Scholar] [CrossRef] [Green Version]
Stage | Operator | Resolution | Layers |
---|---|---|---|
1 | Conv 3 × 3 | 512 × 512 | 1 |
2 | MBConv1, k3 × 3 | 256 × 256 | 1 |
3 | MBConv6, k3 × 3 | 256 × 256 | 2 |
4 | MBConv6, k5 × 5 | 128 × 128 | 2 |
5 | MBConv6, k3 × 3 | 64 × 64 | 3 |
6 | MBConv6, k5 × 5 | 32 × 32 | 3 |
7 | MBConv6, k5 × 5 | 32 × 32 | 4 |
8 | MBConv6, k3 × 3 | 16 × 16 | 1 |
9 | Conv1 × 1&Pooling&FC | 8 × 8 | 1 |
Precision | Recall | F1-Score | mIoU |
---|---|---|---|
93.01% | 89.17% | 91.05% | 90.97% |
Methods | Precision | Recall | F1-Score | mIoU |
---|---|---|---|---|
DeepLabv3+ [52] | 90.52% | 87.15% | 88.80% | 88.92% |
PSPNet [53] | 76.40% | 75.34% | 75.87% | 78.36% |
ResUNet [54] | 88.51% | 80.72% | 84.44% | 85.16% |
HRNet [55] | 89.14% | 83.43% | 86.19% | 86.63% |
EfficientUNet+ | 93.01% | 89.17% | 91.05% | 90.97% |
Method | Decoder | Precision | Recall | F1-Score | mIoU |
---|---|---|---|---|---|
EfficientUNet | Without scSE | 90.81% | 88.23% | 89.50% | 89.54% |
EfficientUNet+ | With scSE | 93.01% | 89.17% | 91.05% | 90.97% |
Loss Function | Precision | Recall | F1-Score | mIoU |
---|---|---|---|---|
Loss_CE + Loss_Dice | 92.07% | 87.39% | 89.67% | 89.71% |
Loss_CE_BW + Loss_Dice | 93.01% | 89.17% | 91.05% | 90.97% |
Transfer Learning | Precision | Recall | F1-Score | mIoU |
---|---|---|---|---|
— | 92.75% | 88.92% | 90.79% | 90.73% |
√ | 93.01% | 89.17% | 91.05% | 90.97% |
Time | DeepLabv3+ | PSPNet | ResUNet | HRNet | EfficientUNet+ |
---|---|---|---|---|---|
Inference time | 16.31 s | 13.42 s | 15.96 s | 32.05 s | 11.16 s |
Training time | 362.77 min | 312.82 min | 334.77 min | 427.98 min | 279.05 min |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).