DOI: 10.1145/3423323.3423407
research-article

PP-LinkNet: Improving Semantic Segmentation of High Resolution Satellite Imagery with Multi-stage Training

Published: 12 October 2020

Abstract

Road network and building footprint extraction is essential for many applications, such as updating maps, traffic regulations, city planning, ride-hailing, and disaster response. Mapping road networks is currently both expensive and labor-intensive. Recently, improvements in image segmentation through the application of deep neural networks have shown promising results in extracting road segments from large-scale, high-resolution satellite imagery. However, significant challenges remain due to the lack of sufficient labeled training data needed to build models for industry-grade applications. In this paper, we propose a two-stage transfer learning technique to improve the robustness of semantic segmentation for satellite images; it leverages noisy pseudo ground truth masks obtained automatically (without human labor) from crowd-sourced OpenStreetMap (OSM) data. We further propose Pyramid Pooling-LinkNet (PP-LinkNet), an improved deep neural network for segmentation that uses focal loss, a poly learning rate schedule, and a context module. We demonstrate the strengths of our approach through evaluations on three popular datasets over two tasks, namely road extraction and building footprint detection. Specifically, we obtain 78.19% meanIoU on the SpaceNet building footprint dataset, and 67.03% and 77.11% on the road topology metric on the SpaceNet and DeepGlobe road extraction datasets, respectively.
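
As a rough illustration of three ingredients the abstract names (focal loss, a poly learning rate, and a mean IoU score), the sketch below uses the commonly published textbook forms of each. It is not the authors' implementation: the function names and the default values `gamma=2.0`, `alpha=0.25`, and `power=0.9` are assumptions chosen for illustration, and the abstract does not say which exact variants or hyperparameters the paper uses.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one pixel: -alpha_t * (1 - p_t)**gamma * log(p_t).

    prob is the predicted foreground probability, target is 0 or 1.
    gamma and alpha are illustrative defaults, not values from the paper.
    """
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def poly_learning_rate(base_lr, step, max_steps, power=0.9):
    """Polynomial decay: base_lr * (1 - step / max_steps)**power."""
    return base_lr * (1.0 - step / max_steps) ** power


def mean_iou(pred, truth, num_classes=2):
    """Mean intersection-over-union over classes that occur in pred or truth.

    pred and truth are flat sequences of integer class labels.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # A confident correct pixel is penalized far less than a confident wrong one.
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
    # The learning rate decays from base_lr toward zero over training.
    print([round(poly_learning_rate(0.1, s, 100), 4) for s in (0, 50, 100)])
    # Toy 4-pixel mask: one foreground pixel is a false positive.
    print(mean_iou([1, 1, 0, 0], [1, 0, 0, 0]))
```

The focal term `(1 - p_t)**gamma` down-weights pixels the model already classifies well, which is the usual motivation for using it when road or building pixels are a small fraction of a satellite tile.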

Information & Contributors

Information

Published In

SUMAC'20: Proceedings of the 2nd Workshop on Structuring and Understanding of Multimedia heritAge Contents
October 2020
70 pages
ISBN:9781450381550
DOI:10.1145/3423323
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. building footprint
  2. hyperspectral imaging
  3. mapping application
  4. multi-stage training
  5. pp-linknet
  6. remote sensing
  7. road network
  8. transfer learning

Qualifiers

  • Research-article

Conference

MM '20
Acceptance Rates

Overall Acceptance Rate 5 of 6 submissions, 83%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 26
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 30 Nov 2024

Cited By

  • (2024) Artificial Intelligence for Digital Heritage Innovation: Setting up a R&D Agenda for Europe. Heritage 7:2 (794-816). DOI: 10.3390/heritage7020038. Online publication date: 6-Feb-2024
  • (2024) A Digital 4D Information System on the World Scale: Research Challenges, Approaches, and Preliminary Results. Applied Sciences 14:5 (1992). DOI: 10.3390/app14051992. Online publication date: 28-Feb-2024
  • (2024) Updating road maps at city scale with remote sensed images and existing vector maps. IEEE Transactions on Geoscience and Remote Sensing (1-1). DOI: 10.1109/TGRS.2024.3375807. Online publication date: 2024
  • (2023) Road Extraction With Satellite Images and Partial Road Maps. IEEE Transactions on Geoscience and Remote Sensing 61 (1-14). DOI: 10.1109/TGRS.2023.3261332. Online publication date: 2023
  • (2023) DPPNet: An Efficient and Robust Deep Learning Network for Land Cover Segmentation From High-Resolution Satellite Images. IEEE Transactions on Emerging Topics in Computational Intelligence 7:1 (128-139). DOI: 10.1109/TETCI.2022.3182414. Online publication date: Feb-2023
  • (2023) CSU-Net: Contour Semantic Segmentation Self-Enhancement for Human Head Detection. IEEE Access 11 (987-999). DOI: 10.1109/ACCESS.2022.3233419. Online publication date: 2023
  • (2023) Automated Road Extraction from Remotely Sensed Imagery using ConnectNet. Journal of the Indian Society of Remote Sensing 51:10 (2105-2120). DOI: 10.1007/s12524-023-01747-4. Online publication date: 8-Sep-2023
  • (2022) MECA-Net: A MultiScale Feature Encoding and Long-Range Context-Aware Network for Road Extraction from Remote Sensing Images. Remote Sensing 14:21 (5342). DOI: 10.3390/rs14215342. Online publication date: 25-Oct-2022
  • (2022) Active learning based semantic segmentation for extraction of minute objects from multispectral satellite images. IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium (7274-7277). DOI: 10.1109/IGARSS46834.2022.9884592. Online publication date: 17-Jul-2022
  • (2022) Segmenting across places: The need for fair transfer learning with satellite imagery. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2915-2924). DOI: 10.1109/CVPRW56347.2022.00329. Online publication date: Jun-2022
