Open access

A Detailed Analysis on the Use of General-purpose Vision Transformers for Remote Sensing Image Segmentation

Authors:

Miguel Gonçalves,

Jacinto Estima

GeoAI '23: Proceedings of the 6th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery

Pages 20 - 29

https://doi.org/10.1145/3615886.3627751

Published: 20 November 2023

Abstract

Image segmentation is currently a hot topic in the context of Earth observation through remote sensing. Recent research has advanced many new models designed specifically for remote sensing image segmentation, often with sophisticated architectures and purpose-built mechanisms for this domain. Our work, on the other hand, explores the use of recent general-purpose image segmentation Transformer models in this same context, with emphasis on the adopted training strategy and its influence on segmentation performance. Our objective is to assess the degree to which domain-specific architectures are indeed required to achieve state-of-the-art results, and to assess the role of training strategies in the performance of general models. We tested different model sizes and a variety of training strategies, including adaptations to 4-channel inputs, over two datasets used in previous studies. Results show that general-purpose models are indeed competitive with the current state of the art, without relying on purpose-built architectures for remote sensing images.
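The abstract mentions adapting models to 4-channel inputs but does not say how this is done. As an illustration only, the sketch below shows one common way to let a network pretrained on 3-channel images accept a fourth band: the weights of the first convolution (or patch-embedding) layer gain an extra input-channel slice, initialised here with the mean of the existing slices. The function name, the nested-list weight layout, and the mean-initialisation heuristic are all assumptions made for this example, not the authors' method.

```python
from typing import List

Kernel = List[List[float]]        # one kh x kw slice of a filter
ConvWeights = List[List[Kernel]]  # layout: [out_channels][in_channels][kh][kw]


def expand_input_channels(weights: ConvWeights, extra: int = 1) -> ConvWeights:
    """Return a copy of `weights` with `extra` input channels appended.

    Each new channel is initialised with the mean of the existing
    input-channel slices of the same filter (an assumed heuristic,
    not taken from the paper). The input weights are left unmodified.
    """
    expanded: ConvWeights = []
    for filt in weights:  # filt has layout [in_channels][kh][kw]
        n_in = len(filt)
        kh, kw = len(filt[0]), len(filt[0][0])
        # Element-wise mean over the existing input channels.
        mean_slice = [
            [sum(filt[c][i][j] for c in range(n_in)) / n_in for j in range(kw)]
            for i in range(kh)
        ]
        # Copy the pretrained slices, then append the new ones.
        copied = [[row[:] for row in channel] for channel in filt]
        copied += [[row[:] for row in mean_slice] for _ in range(extra)]
        expanded.append(copied)
    return expanded


# Toy weights: 1 output filter, 3 input channels, 1x1 kernel.
rgb_weights = [[[[0.3]], [[0.6]], [[0.9]]]]
four_band_weights = expand_input_channels(rgb_weights)
print(len(four_band_weights[0]))  # input channels after expansion
```

In a deep learning framework the same operation would be applied to the weight tensor of the first layer, leaving the remaining pretrained weights untouched. Other initialisations for the new channel (zeros, or a copy of a single existing channel) are equally plausible choices.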

Cited By

Lekavičius J, Gružauskas V (2024). Data Augmentation with Generative Adversarial Network for Solar Panel Segmentation from Remote Sensing Images. Energies 17:13 (3204). Online publication date: 29-Jun-2024.
https://doi.org/10.3390/en17133204

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision problems
        Image segmentation

Published In

GeoAI '23: Proceedings of the 6th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery

November 2023

135 pages

ISBN:9798400703485

DOI:10.1145/3615886

Editors:
Shawn Newsam (University of California, Merced, CA, USA)
Lexie Yang (Oak Ridge National Laboratory, TN, USA)
Gengchen Mai (University of Georgia, GA, USA)
Bruno Martins (University of Lisbon, Portugal)
Dalton Lunga (Oak Ridge National Laboratory, TN, USA)
Song Gao (University of Wisconsin, Madison, WI, USA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSPATIAL: ACM Special Interest Group on Spatial Information

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Portuguese Recovery and Resilience Plan
Fundação para a Ciência e Tecnologia - FCT

Conference

SIGSPATIAL '23

Sponsor:

SIGSPATIAL

SIGSPATIAL '23: The 31st ACM International Conference on Advances in Geographic Information Systems

November 13, 2023

Hamburg, Germany

Acceptance Rates

Overall Acceptance Rate 17 of 25 submissions, 68%
