Abstract
Height estimation plays a crucial role in the planning and assessment of urban development, enabling effective decision-making and evaluation of urban built areas. Accurate estimation of building heights from remote sensing optical imagery is challenging because both the overall structure of complex scenes and the elevation details of individual buildings must be preserved. This paper proposes a novel end-to-end deep learning-based network (Stereoential Net) comprising a modified stereo U-Net (mSUNet) and a multi-scale differential shortcut connection module (MSDSCM) at the decoding end. The proposed Stereoential network fuses multi-scale differential decoder features to preserve fine details for improved height estimation from stereo optical imagery. Unlike existing methods, our approach does not use any multi-spectral satellite imagery; instead, it employs only freely available optical imagery, yet it achieves superior performance. We evaluate our proposed network on two benchmark datasets: the IEEE Data Fusion Contest 2018 (DFC2018) dataset and the 42-cities dataset. The 42-cities dataset comprises 42 densely populated cities in China with diverse buildings of varying shapes and sizes. The quantitative and qualitative results show that our proposed network outperforms state-of-the-art (SOTA) algorithms on DFC2018. On the 42-cities dataset, our method reduces the root-mean-square error (RMSE) by 0.31 m compared to state-of-the-art multi-spectral approaches. The code will be made publicly available via a GitHub repository.
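The RMSE figure quoted in the abstract is measured in metres of building height. As a minimal sketch of how such a metric is typically computed, and not the authors' evaluation code, the following uses a hypothetical helper `rmse` and invented height values; any masking of non-building pixels or per-city averaging used in the paper is not modelled here.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length height sequences (metres)."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    # Square each per-sample height error, average, then take the square root.
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical heights in metres, invented purely for illustration.
reference_heights = [10.0, 20.0, 30.0, 40.0]
predicted_heights = [13.0, 16.0, 30.0, 40.0]

error_m = rmse(predicted_heights, reference_heights)
print(f"RMSE: {error_m:.2f} m")
```

Because the errors are squared before averaging, a few large height errors (for example on tall buildings) raise the RMSE more than many small ones, so a lower RMSE suggests fewer gross mistakes.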
Acknowledgements
We thank Dr. Usman Nazir for his assistance with proofreading and for comments that greatly improved the manuscript.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jabbar, S., Taj, M. (2024). Stereoential Net: Deep Network for Learning Building Height Using Stereo Imagery. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1967. Springer, Singapore. https://doi.org/10.1007/978-981-99-8178-6_36
DOI: https://doi.org/10.1007/978-981-99-8178-6_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8177-9
Online ISBN: 978-981-99-8178-6
eBook Packages: Computer Science, Computer Science (R0)