Deep Neural Networks for Hyperspectral Remote Sensing Image Processing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 15 January 2025 | Viewed by 15951

Special Issue Editors

Guest Editor
Center for Hyperspectral Imaging in Remote Sensing (CHIRS), Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
Interests: hyperspectral image processing; multi-source remote sensing image fusion; artificial intelligence

Guest Editor
Center for Hyperspectral Imaging in Remote Sensing (CHIRS), Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
Interests: hyperspectral image processing; artificial intelligence; remote sensing

Guest Editor
Center for Hyperspectral Imaging in Remote Sensing (CHIRS), Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
Interests: thermal infrared; hyperspectral; quantitative remote sensing

Guest Editor
State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
Interests: hyperspectral anomaly detection; network compression; efficient distributed learning

Guest Editor
Center for Hyperspectral Imaging in Remote Sensing (CHIRS), Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
Interests: remote sensing image processing

Guest Editor
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Interests: thermal infrared; hyperspectral; quantitative remote sensing

Special Issue Information

Dear Colleagues,

A hyperspectral image (HSI) is a three-dimensional cube containing rich spatial and spectral information across hundreds of narrow, contiguous wavebands generated by an imaging spectrometer. Each pixel in a hyperspectral remote sensing image corresponds to a nearly continuous spectral curve, which reflects the diagnostic spectral absorption differences of surface materials and provides rich spectral information for the accurate extraction of ground object information. Thanks to their high spectral resolution, hyperspectral images have received considerable attention and have essential applications in both military and civil fields. In recent years, with the continuous improvement in the hyperspectral data acquisition capability of satellite and aerial platforms, hyperspectral image processing has moved towards big-data-driven feature information extraction. However, processing the massive data collected by these platforms with traditional image analysis methodologies is impractical and ineffective. This calls for powerful techniques that can extract reliable and useful information; deep neural networks have therefore been gradually applied to HSI processing owing to their strong generalization ability and their capacity to extract high-level semantic features.
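
To make the data model concrete, the short sketch below shows how such a cube is typically held in memory and how the per-pixel spectral curve mentioned above is accessed. The dimensions and values are illustrative assumptions only, not tied to any particular sensor.

```python
import numpy as np

# Illustrative dimensions only: 145 x 145 pixels, 200 narrow contiguous bands.
height, width, n_bands = 145, 145, 200

# A hyperspectral cube is a 3-D array: two spatial axes plus one spectral axis.
hsi_cube = np.random.rand(height, width, n_bands).astype(np.float32)

# The nearly continuous spectral curve of one ground pixel is a 1-D slice.
row, col = 72, 10
spectrum = hsi_cube[row, col, :]          # shape: (n_bands,)

# Diagnostic absorption features are often compared after a simple normalization.
normalized = (spectrum - spectrum.min()) / (spectrum.max() - spectrum.min() + 1e-8)
print(spectrum.shape, float(normalized.min()), float(normalized.max()))
```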

This Special Issue aims to explore features that genuinely benefit hyperspectral remote sensing interpretation tasks and provides a forum for researchers working on deep-learning-based hyperspectral image processing to report their findings and share their experiences with the HSI community. All contributions on deep neural networks for hyperspectral remote sensing image processing are welcome in this Special Issue. Topics of interest include, but are not limited to, the following:

  • Deep neural networks for target detection, band selection, and classification in hyperspectral images.
  • Deep learning for surface parameter retrieval from thermal infrared images.
  • Deep feature extraction for multi-source remote sensing images.
  • The hybrid architecture of CNN and transformer for hyperspectral applications.
  • Feature fusion and learning for hyperspectral image processing.
  • Lightweight design of deep models.
  • Reviews/surveys of recent applications and techniques for hyperspectral images.

Dr. Yulei Wang
Prof. Dr. Meiping Song
Dr. Enyu Zhao
Dr. Weiying Xie
Dr. Chunyan Yu
Prof. Dr. Caixia Gao
Prof. Dr. Silvia Liberata Ullo
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep neural networks
  • deep feature extraction
  • lightweight models
  • hyperspectral remote sensing images
  • thermal infrared
  • quantitative remote sensing
  • multi-sensor and multi-platform analysis
  • remote sensing applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (13 papers)


Research

Jump to: Other

24 pages, 8893 KiB  
Article
Assessing Data Preparation and Machine Learning for Tree Species Classification Using Hyperspectral Imagery
by Wenge Ni-Meister, Anthony Albanese and Francesca Lingo
Remote Sens. 2024, 16(17), 3313; https://doi.org/10.3390/rs16173313 - 6 Sep 2024
Abstract
Tree species classification using hyperspectral imagery shows incredible promise in developing a large-scale, high-resolution model for identifying tree species, providing unprecedented details on global tree species distribution. Many questions remain unanswered about the best practices for creating a global, general hyperspectral tree species classification model. This study aims to address three key issues in creating a hyperspectral species classification model. We assessed the effectiveness of three data-labeling methods to create training data, three data-splitting methods for training/validation/testing, and machine-learning and deep-learning (including semi-supervised deep-learning) models for tree species classification using hyperspectral imagery at National Ecological Observatory Network (NEON) Sites. Our analysis revealed that the existing data-labeling method using the field vegetation structure survey performed reasonably well. The random tree data-splitting technique was the most efficient method for both intra-site and inter-site classifications to overcome the impact of spatial autocorrelation to avoid the potential to create a locally overfit model. Deep learning consistently outperformed random forest classification; both semi-supervised and supervised deep-learning models displayed the most promising results in creating a general taxa-classification model. This work has demonstrated the possibility of developing tree-classification models that can identify tree species from outside their training area and that semi-supervised deep learning may potentially utilize the untapped terabytes of unlabeled forest imagery. Full article
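
As a side note on the data-splitting issue raised in this abstract, a grouped split (all pixels of one tree crown kept in the same partition) is a common way to limit spatial autocorrelation between training and test sets. The sketch below illustrates the idea with scikit-learn's GroupShuffleSplit; the array names, sizes, and crown IDs are placeholders, not the authors' code.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical labeled pixels: spectra, species labels, and the crown (tree) each pixel came from.
pixels = np.random.rand(1000, 200)               # 1000 pixels x 200 bands
species = np.random.randint(0, 4, size=1000)     # 4 species classes
tree_id = np.random.randint(0, 120, size=1000)   # 120 individual tree crowns

# Group-aware split: every pixel of a given tree lands entirely in train or test,
# which limits spatial autocorrelation between the two partitions.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(pixels, species, groups=tree_id))

assert set(tree_id[train_idx]).isdisjoint(tree_id[test_idx])
print(len(train_idx), "training pixels,", len(test_idx), "test pixels")
```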
Figures: graphical abstract; NEON study-site locations and NIWO vegetation sampling plots; data sources (RGB, hyperspectral composite, lidar CHM) and experimental workflow; reflectance de-noising; comparison of the three annotation methods; deep-learning network designs (SwAV pre-training, semi-supervised, and supervised MLP); per-species mean reflectance; label-selection results; model transferability from NIWO to RMNP under the three data-splitting methods; and pre-training comparisons.

26 pages, 6739 KiB  
Article
Pansharpening Based on Multimodal Texture Correction and Adaptive Edge Detail Fusion
by Danfeng Liu, Enyuan Wang, Liguo Wang, Jón Atli Benediktsson, Jianyu Wang and Lei Deng
Remote Sens. 2024, 16(16), 2941; https://doi.org/10.3390/rs16162941 - 11 Aug 2024
Viewed by 543
Abstract
Pansharpening refers to the process of fusing multispectral (MS) images with panchromatic (PAN) images to obtain high-resolution multispectral (HRMS) images. However, due to the low correlation and similarity between MS and PAN images, as well as inaccuracies in spatial information injection, HRMS images often suffer from significant spectral and spatial distortions. To address these issues, a pansharpening method based on multimodal texture correction and adaptive edge detail fusion is proposed in this paper. To obtain a texture-corrected (TC) image that is highly correlated and similar to the MS image, the target-adaptive CNN-based pansharpening (A-PNN) method is introduced. By constructing a multimodal texture correction model, intensity, gradient, and A-PNN-based deep plug-and-play correction constraints are established between the TC and source images. Additionally, an adaptive degradation filter algorithm is proposed to ensure the accuracy of these constraints. Since the TC image obtained can effectively replace the PAN image and considering that the MS image contains valuable spatial information, an adaptive edge detail fusion algorithm is also proposed. This algorithm adaptively extracts detailed information from the TC and MS images to apply edge protection. Given the limited spatial information in the MS image, its spatial information is proportionally enhanced before the adaptive fusion. The fused spatial information is then injected into the upsampled multispectral (UPMS) image to produce the final HRMS image. Extensive experimental results demonstrated that compared with other methods, the proposed algorithm achieved superior results in terms of both subjective visual effects and objective evaluation metrics. Full article
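
For readers unfamiliar with detail injection, the sketch below shows the generic scheme that pansharpening methods build on: upsample the MS image and add spatial detail extracted from the PAN image. It is a deliberately simplified baseline with assumed array shapes, not the multimodal texture correction or adaptive edge detail fusion proposed in this paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

# Assumed inputs: a 4-band low-resolution MS image and a panchromatic image at 4x resolution.
ms = np.random.rand(64, 64, 4).astype(np.float32)
pan = np.random.rand(256, 256).astype(np.float32)

# Upsample the MS image to the PAN grid (cubic interpolation).
upms = zoom(ms, (4, 4, 1), order=3)

# Extract spatial detail from the PAN image with a simple high-pass filter.
pan_low = uniform_filter(pan, size=9)
detail = pan - pan_low

# Inject the same detail into every band (a basic additive injection model).
hrms = upms + detail[..., None]
print(hrms.shape)
```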
Figures: proposed model framework; iterative convergence on the WorldView-3 dataset; reduced-scale fusion results on the QuickBird, WorldView-2, and WorldView-3 datasets; full-scale fusion results on the GaoFen-2 and WorldView-2 datasets; parameter settings for the four datasets; and ablation results on the WorldView-3 dataset.

21 pages, 9928 KiB  
Article
DDSR: Degradation-Aware Diffusion Model for Spectral Reconstruction from RGB Images
by Yunlai Chen and Xiaoyan Zhang
Remote Sens. 2024, 16(15), 2692; https://doi.org/10.3390/rs16152692 - 23 Jul 2024
Viewed by 588
Abstract
The reconstruction of hyperspectral images (HSIs) from RGB images is an attractive low-cost approach to recover hyperspectral information. However, existing approaches focus on learning an end-to-end mapping of RGB images and their corresponding HSIs with neural networks, which makes it difficult to ensure generalization due to the fact that they are trained on data with a specific degradation process. As a new paradigm of generative models, the diffusion model has shown great potential in image restoration, especially in noisy contexts. To address the unstable generalization ability of end-to-end models while exploiting the powerful ability of the diffusion model, we propose a degradation-aware diffusion model. The degradation process from HSI to RGB is modeled as a combination of multiple degradation operators, which are used to guide the inverse process of the diffusion model by utilizing a degradation-aware correction. By integrating the degradation-aware correction to the diffusion model, we obtain an efficient solver for spectral reconstruction, which is robust to different degradation patterns. Experiment results on various public datasets demonstrate that our method achieves competitive performance and shows a promising generalization ability. Full article
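
The linear part of the HSI-to-RGB degradation described above can be written per pixel as y = Hx, where H stacks three camera response curves. The toy sketch below illustrates only that forward operator with made-up Gaussian responses; it is not the diffusion solver or the learned correction from the paper.

```python
import numpy as np

n_bands = 31  # assumed: 31 bands spanning 400-700 nm, as in common spectral-reconstruction datasets
wavelengths = np.linspace(400, 700, n_bands)

def gaussian_response(center, width=40.0):
    r = np.exp(-0.5 * ((wavelengths - center) / width) ** 2)
    return r / r.sum()

# A hypothetical 3 x n_bands camera response matrix H (one row per RGB channel).
H = np.stack([gaussian_response(c) for c in (610.0, 540.0, 460.0)])

hsi = np.random.rand(64, 64, n_bands)          # toy hyperspectral cube
rgb = hsi.reshape(-1, n_bands) @ H.T           # apply the linear degradation y = Hx per pixel
rgb = rgb.reshape(64, 64, 3)
print(rgb.shape)
```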
Figures: overall DDSR architecture with degradation-aware correction; the U-Net-based network with sinusoidal time embedding and its ResBlock; the modeled degradation process (linear spectral downsampling plus JPEG compression); MRAE and L1 error heatmaps on the ARAD-1K dataset; spectral density curves on the Foster and CAVE datasets; RMSE curves on the KAUST dataset; and error heatmaps on the NUS, CAVE, and ICVL datasets.

29 pages, 10649 KiB  
Article
Hyperspectral Image Classification Based on Double-Branch Multi-Scale Dual-Attention Network
by Heng Zhang, Hanhu Liu, Ronghao Yang, Wei Wang, Qingqu Luo and Changda Tu
Remote Sens. 2024, 16(12), 2051; https://doi.org/10.3390/rs16122051 - 7 Jun 2024
Cited by 1 | Viewed by 626
Abstract
Although extensive research shows that CNNs achieve good classification results in HSI classification, they still struggle to effectively extract spectral sequence information from HSIs. Additionally, the high-dimensional features of HSIs, the limited number of labeled samples, and the common sample imbalance significantly restrict classification performance improvement. To address these issues, this article proposes a double-branch multi-scale dual-attention (DBMSDA) network that fully extracts spectral and spatial information from HSIs and fuses them for classification. The designed multi-scale spectral residual self-attention (MSeRA), as a fundamental component of dense connections, can fully extract high-dimensional and intricate spectral information from HSIs, even with limited labeled samples and imbalanced distributions. Additionally, this article adopts a dataset partitioning strategy to prevent information leakage. Finally, this article introduces a hyperspectral geological lithology dataset to evaluate the accuracy and applicability of deep learning methods in geology. Experimental results on the geological lithology hyperspectral dataset and three other public datasets demonstrate that the DBMSDA method exhibits superior classification performance and robust generalization ability compared to existing methods. Full article
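
As background, spectral (channel-wise) attention of the kind used in dual-attention networks can be reduced to a squeeze-and-excitation style gate over bands. The sketch below is such a generic module with assumed channel counts, not the MSeRA or DBMSDA implementation.

```python
import torch
import torch.nn as nn

class SpectralAttention(nn.Module):
    """Generic channel-wise (spectral) attention: weight each band by a learned gate."""
    def __init__(self, n_bands: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(n_bands, n_bands // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(n_bands // reduction, n_bands),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w

x = torch.randn(2, 64, 9, 9)   # batch of 9x9 patches with 64 spectral channels
print(SpectralAttention(64)(x).shape)
```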
Figures: overall DBMSDA structure; the MSeRA module; ResNet and DenseNet architectures; the spectral and spatial attention modules; the KY dataset location and the four hyperspectral datasets; parameter studies (number of MSeRA units, kernel-scale combinations, block-patch size, number of labeled training pixels); classification maps of all compared methods on the KY, Hu, IN, and SV datasets; and results with minimal training samples.

17 pages, 6317 KiB  
Article
Spectral Reconstruction from Thermal Infrared Multispectral Image Using Convolutional Neural Network and Transformer Joint Network
by Enyu Zhao, Nianxin Qu, Yulei Wang and Caixia Gao
Remote Sens. 2024, 16(7), 1284; https://doi.org/10.3390/rs16071284 - 5 Apr 2024
Cited by 1 | Viewed by 1084
Abstract
Thermal infrared remotely sensed data, by capturing the thermal radiation characteristics emitted by the Earth’s surface, plays a pivotal role in various domains, such as environmental monitoring, resource exploration, agricultural assessment, and disaster early warning. However, the acquisition of thermal infrared hyperspectral remotely sensed imagery necessitates more complex and higher-precision sensors, which in turn leads to higher research and operational costs. In this study, a novel Convolutional Neural Network (CNN)–Transformer combined block, termed CTBNet, is proposed to address the challenge of thermal infrared multispectral image spectral reconstruction. Specifically, the CTBNet comprises blocks that integrate CNN and Transformer technologies (CTB). Within these CTBs, an improved self-attention mechanism is introduced, which not only considers features across spatial and spectral dimensions concurrently, but also explicitly extracts incremental features from each channel. Compared to other algorithms, the proposed method more closely aligns with the true spectral curves in the reconstruction of hyperspectral images across the spectral dimension. Through a series of experiments, this approach has been proven to ensure robustness and generalizability, outperforming some state-of-the-art algorithms across various metrics. Full article
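
To illustrate what a combined CNN-Transformer block looks like in general, the sketch below pairs a convolution (local features) with multi-head self-attention over flattened spatial tokens (global context). The layer sizes are arbitrary assumptions and the block is not the CTB proposed in this paper.

```python
import torch
import torch.nn as nn

class ConvTransformerBlock(nn.Module):
    """Toy CNN-Transformer hybrid: local features from a conv layer, global context from self-attention."""
    def __init__(self, channels: int = 32, heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.GELU())
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)                                  # local spatial features
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)             # (b, h*w, c) token sequence
        attn_out, _ = self.attn(*([self.norm(tokens)] * 3))
        tokens = tokens + attn_out                        # residual global refinement
        return tokens.transpose(1, 2).reshape(b, c, h, w)

y = ConvTransformerBlock()(torch.randn(1, 32, 16, 16))
print(y.shape)
```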
Figures: CTBNet and CTB structures (including the FNN and the K/V mapping processes); portions of the hyperspectral data; simulated spectral response functions, including a simulated MODIS response; error and truth maps for the land and water regions; and radiance spectral curves for both regions.

20 pages, 5955 KiB  
Article
CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field
by Qiang Wu, Liang Huang, Bo-Hui Tang, Jiapei Cheng, Meiqi Wang and Zixuan Zhang
Remote Sens. 2024, 16(6), 1061; https://doi.org/10.3390/rs16061061 - 16 Mar 2024
Viewed by 1250
Abstract
Dynamic monitoring of cropland using high spatial resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network employs a large number of convolution and pooling operations to mine the deep features of cropland, the accumulation of irrelevant features and the loss of key features will lead to poor detection results. To effectively solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; this network combines an adaptive receptive field and multiscale feature transmission fusion to achieve accurate detection of cropland change information. CroplandCDNet first effectively extracts the multiscale features of cropland from bitemporal remote sensing images through the feature extraction module and subsequently embeds the receptive field adaptive SK attention (SKA) module to emphasize cropland change. Moreover, the SKA module effectively uses spatial context information for the dynamic adjustment of the convolution kernel size of cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the content of cropland change. In the experiments, the proposed method is compared with six advanced change detection methods using the cropland change detection dataset (CLCD). The experimental results show that CroplandCDNet achieves the best F1 and OA at 76.04% and 94.47%, respectively. Its precision and recall are second best of all models at 76.46% and 75.63%, respectively. Moreover, a generalization experiment was carried out using the Jilin-1 dataset, which effectively verified the reliability of CroplandCDNet in cropland change detection. Full article
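
The adaptive receptive field idea can be illustrated with a minimal selective-kernel module: two convolution branches with different kernel sizes whose outputs are mixed by softmax weights computed from a global descriptor. The sketch below shows that generic mechanism with assumed channel counts; it is not the SKA module itself.

```python
import torch
import torch.nn as nn

class SelectiveKernel(nn.Module):
    """Minimal selective-kernel fusion: softmax-weighted mix of a 3x3 and a 5x5 branch."""
    def __init__(self, channels: int = 32, reduction: int = 4):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, 2 * channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                    # global descriptor, shape (b, c)
        b, c = s.shape
        w = self.fc(s).view(b, 2, c).softmax(dim=1)       # per-channel weights for the two branches
        return u3 * w[:, 0].view(b, c, 1, 1) + u5 * w[:, 1].view(b, c, 1, 1)

print(SelectiveKernel()(torch.randn(2, 32, 64, 64)).shape)
```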
Figures: CroplandCDNet structure; data augmentation examples; the feature extraction and change detection modules; the SKA structure; CLCD sample images; change detection visualizations for four scenes compared against CDNet, DSIFN, SNUNet, BIT, L-UNet, and P2V-CD; ablation visualizations on the CLCD dataset; and generalization results on the Jilin-1 cropland change detection dataset.

22 pages, 8724 KiB  
Article
Hyperspectral Image Classification on Large-Scale Agricultural Crops: The Heilongjiang Benchmark Dataset, Validation Procedure, and Baseline Results
by Hongzhe Zhang, Shou Feng, Di Wu, Chunhui Zhao, Xi Liu, Yuan Zhou, Shengnan Wang, Hongtao Deng and Shuang Zheng
Remote Sens. 2024, 16(3), 478; https://doi.org/10.3390/rs16030478 - 26 Jan 2024
Cited by 2 | Viewed by 1737
Abstract
Over the past few decades, researchers have shown sustained and robust investment in exploring methods for hyperspectral image classification (HSIC). The utilization of hyperspectral imagery (HSI) for crop classification in agricultural areas has been widely demonstrated for its feasibility, flexibility, and cost-effectiveness. However, numerous coexisting issues in agricultural scenarios, such as limited annotated samples, uneven distribution of crops, and mixed cropping, could not be explored insightfully in the mainstream datasets. The limitations within these impractical datasets have severely restricted the widespread application of HSIC methods in agricultural scenarios. A benchmark dataset named Heilongjiang (HLJ) for HSIC is introduced in this paper, which is designed for large-scale crop classification. For practical applications, the HLJ dataset covers a wide range of genuine agricultural regions in Heilongjiang Province; it provides rich spectral diversity enriched through two images from diverse time periods and vast geographical areas with intercropped multiple crops. Simultaneously, considering the urgent demand of deep learning models, the two images in the HLJ dataset have 319,685 and 318,942 annotated samples, along with 151 and 149 spectral bands, respectively. To validate the suitability of the HLJ dataset as a baseline dataset for HSIC, we employed eight classical classification models in fundamental experiments on the HLJ dataset. Most of the methods achieved an overall accuracy of more than 80% with 10% of the labeled samples used for training. Furthermore, the advantages of the HLJ dataset and the impact of real-world factors on experimental results are comprehensively elucidated. The comprehensive baseline experimental evaluation and analysis affirm the research potential of the HLJ dataset as a large-scale crop classification dataset. Full article
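
The baseline protocol of training on roughly 10% of the labeled samples can be reproduced with a simple per-class random split over the ground-truth map, as sketched below. The label map, class count, and ratio are illustrative assumptions, not the HLJ benchmark's official split.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth map: 0 = unlabeled, 1..5 = crop classes.
gt = rng.integers(0, 6, size=(512, 512))
train_ratio = 0.10

train_mask = np.zeros_like(gt, dtype=bool)
for cls in np.unique(gt):
    if cls == 0:
        continue                                   # skip unlabeled pixels
    rows, cols = np.nonzero(gt == cls)
    n_train = max(1, int(train_ratio * rows.size))
    pick = rng.choice(rows.size, size=n_train, replace=False)
    train_mask[rows[pick], cols[pick]] = True

test_mask = (gt > 0) & ~train_mask
print(train_mask.sum(), "training pixels,", test_mask.sum(), "test pixels")
```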
Figures: study regions of Heilongjiang Province; the HLJ dataset construction flowchart; pseudocolor images and ground-truth maps for the HLJ-Raohe, HLJ-Yan, WHU-Hi-LongKou, WHU-Hi-HanChuan, Yellow River Estuary, Indian Pines, and Salinas datasets; classification maps of all compared methods on HLJ-Raohe and HLJ-Yan; Soybean and Corn classification performance and overall performance versus training-sample size; t-SNE visualizations of the labeled samples; and spectral curves of the HLJ dataset.

18 pages, 6951 KiB  
Article
Self-Supervised Deep Multi-Level Representation Learning Fusion-Based Maximum Entropy Subspace Clustering for Hyperspectral Band Selection
by Yulei Wang, Haipeng Ma, Yuchao Yang, Enyu Zhao, Meiping Song and Chunyan Yu
Remote Sens. 2024, 16(2), 224; https://doi.org/10.3390/rs16020224 - 5 Jan 2024
Cited by 1 | Viewed by 1026
Abstract
As one of the most important techniques for hyperspectral image dimensionality reduction, band selection has received considerable attention, whereas self-representation subspace clustering-based band selection algorithms have received quite a lot of attention with good effect. However, many of them lack the self-supervision of representations and ignore the multi-level spectral–spatial information of HSI and the connectivity of subspaces. To this end, this paper proposes a novel self-supervised multi-level representation learning fusion-based maximum entropy subspace clustering (MLRLFMESC) method for hyperspectral band selection. Firstly, to learn multi-level spectral–spatial information, self-representation subspace clustering is embedded between the encoder layers of the deep-stacked convolutional autoencoder and its corresponding decoder layers, respectively, as multiple fully connected layers to achieve multi-level representation learning (MLRL). A new auxiliary task is constructed for multi-level representation learning and multi-level self-supervised training to improve its capability of representation. Then, a fusion model is designed to fuse the multi-level spectral–spatial information to obtain a more distinctive coefficient matrix for self-expression, where the maximum entropy regularization (MER) method is employed to promote connectivity and the uniform dense distribution of band elements in each subspace. Finally, subspace clustering is conducted to obtain the final band subset. Experiments have been conducted on three hyperspectral datasets, and the corresponding results show that the proposed MLRLFMESC algorithm significantly outperforms several other band selection methods in classification performance. Full article
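
For context on what band selection produces, the sketch below shows a much simpler clustering-based baseline: cluster the bands and keep the band nearest each cluster centre. It only illustrates the notion of selecting a representative band subset and is not the MLRLFMESC algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hsi = rng.random((100, 100, 200))             # toy cube with 200 bands
n_selected = 20

# Treat each band as one vector of all its pixel values.
bands = hsi.reshape(-1, hsi.shape[-1]).T      # shape: (200, 10000)

km = KMeans(n_clusters=n_selected, n_init=10, random_state=0).fit(bands)

# Pick, from each cluster, the band closest to the cluster centroid.
selected = []
for k in range(n_selected):
    idx = np.nonzero(km.labels_ == k)[0]
    d = np.linalg.norm(bands[idx] - km.cluster_centers_[k], axis=1)
    selected.append(int(idx[np.argmin(d)]))
print(sorted(selected))
```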
Figures: overall MLRLFMESC flowchart; pseudo-color maps and class distributions of the IP, PU, and SA datasets; OA box plots for the different band selection methods on the three datasets; classification performance (OA, AA, Kappa) versus number of selected bands; distributions of the selected bands; and classification result maps comparing UBS, E-FDPC, ISSC, ASPS_MN, DSC, MLRLFMESC, and all bands on each dataset (appendix figures cover the PU and SA datasets).

24 pages, 15242 KiB  
Article
Pan-Sharpening Network of Multi-Spectral Remote Sensing Images Using Two-Stream Attention Feature Extractor and Multi-Detail Injection (TAMINet)
by Jing Wang, Jiaqing Miao, Gaoping Li, Ying Tan, Shicheng Yu, Xiaoguang Liu, Li Zeng and Guibing Li
Remote Sens. 2024, 16(1), 75; https://doi.org/10.3390/rs16010075 - 24 Dec 2023
Viewed by 1165
Abstract
Achieving a balance between spectral resolution and spatial resolution in multi-spectral remote sensing images is challenging due to physical constraints. Consequently, pan-sharpening technology was developed to address this challenge. While significant progress was recently achieved in deep-learning-based pan-sharpening techniques, most existing deep learning approaches face two primary limitations: (1) convolutional neural networks (CNNs) struggle with long-range dependency issues, and (2) significant detail loss during deep network training. Moreover, despite these methods’ pan-sharpening capabilities, their generalization to full-sized raw images remains problematic due to scaling disparities, rendering them less practical. To tackle these issues, we introduce in this study a multi-spectral remote sensing image fusion network, termed TAMINet, which leverages a two-stream coordinate attention mechanism and multi-detail injection. Initially, a two-stream feature extractor augmented with the coordinate attention (CA) block is employed to derive modal-specific features from low-resolution multi-spectral (LRMS) images and panchromatic (PAN) images. This is followed by feature-domain fusion and pan-sharpening image reconstruction. Crucially, a multi-detail injection approach is incorporated during fusion and reconstruction, ensuring the reintroduction of details lost earlier in the process, which minimizes high-frequency detail loss. Finally, a novel hybrid loss function is proposed that incorporates spatial loss, spectral loss, and an additional loss component to enhance performance. The proposed methodology’s effectiveness was validated through experiments on WorldView-2 satellite images, IKONOS, and QuickBird, benchmarked against current state-of-the-art techniques. Experimental findings reveal that TAMINet significantly elevates the pan-sharpening performance for large-scale images, underscoring its potential to enhance multi-spectral remote sensing image quality. Full article
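
The coordinate attention (CA) idea referenced above factorizes attention into two directional gates obtained by pooling along the height and width axes. The sketch below is a minimal version of that general mechanism with assumed channel sizes, not TAMINet's exact block.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal coordinate attention: encode position by pooling along H and W separately."""
    def __init__(self, channels: int = 32, reduction: int = 8):
        super().__init__()
        mid = max(4, channels // reduction)
        self.shared = nn.Sequential(nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True))
        self.to_h = nn.Conv2d(mid, channels, 1)
        self.to_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                        # (b, c, h, 1)
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # (b, c, w, 1)
        y = self.shared(torch.cat([pooled_h, pooled_w], dim=2))       # joint directional transform
        y_h, y_w = torch.split(y, [h, w], dim=2)
        gate_h = torch.sigmoid(self.to_h(y_h))                        # (b, c, h, 1)
        gate_w = torch.sigmoid(self.to_w(y_w)).permute(0, 1, 3, 2)    # (b, c, 1, w)
        return x * gate_h * gate_w

print(CoordinateAttention()(torch.randn(2, 32, 64, 48)).shape)
```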
Figures: graphical abstract; detailed TAMINet architecture (feature extraction, fusion, and reconstruction stages); pan-sharpening visualizations, difference maps, and histograms on the IKONOS, QuickBird, and WorldView-2 datasets against GS, IHS, Brovey, PRACS, PNN, PanNet, TFNet, MSDCNN, SRPPNN, and λ-PNN; and the study of the loss-weight settings α, β, and μ under the constraint α + β + μ = 1.

27 pages, 7141 KiB  
Article
TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification
by Ping Zhang, Haiyang Yu, Pengao Li and Ruili Wang
Remote Sens. 2023, 15(22), 5331; https://doi.org/10.3390/rs15225331 - 12 Nov 2023
Cited by 2 | Viewed by 1731
Abstract
Hyperspectral images’ (HSIs) classification research has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, these studies primarily incorporated Transformer blocks at the end of their network architectures. Due to significant differences between the spectral and spatial features in HSIs, the extraction of both global and local spectral–spatial features remains incomplete. To address this challenge, this paper introduces a novel method called TransHSI. This method incorporates a new spectral–spatial feature extraction module that leverages 3D CNNs to fuse Transformer to extract the local and global spectral features of HSIs, then combining 2D CNNs and Transformer to capture the local and global spatial features of HSIs comprehensively. Furthermore, a fusion module is proposed, which not only integrates the learned shallow and deep features of HSIs but also applies a semantic tokenizer to transform the fused features, enhancing the discriminative power of the features. This paper conducts experiments on three public datasets: Indian Pines, Pavia University, and Data Fusion Contest 2018. The training and test sets are selected based on a disjoint sampling strategy. We perform a comparative analysis with 11 traditional and advanced HSI classification algorithms. The experimental results demonstrate that the proposed method, TransHSI algorithm, achieves the highest overall accuracies and kappa coefficients, indicating a competitive performance. Full article
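
A typical front end for patch-based HSI classifiers of this kind is PCA over the spectral dimension followed by extraction of a square spatial patch around each labeled pixel. The sketch below shows that generic preprocessing with an assumed patch size and component count; it is not the TransHSI code.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
hsi = rng.random((145, 145, 200)).astype(np.float32)   # toy cube
n_components, patch = 30, 13                           # assumed PCA depth and patch size

# Spectral reduction: PCA over the band dimension.
h, w, b = hsi.shape
reduced = PCA(n_components=n_components).fit_transform(hsi.reshape(-1, b)).reshape(h, w, n_components)

# Pad and cut a square spatial patch centred on one labeled pixel.
pad = patch // 2
padded = np.pad(reduced, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")

def extract_patch(row: int, col: int) -> np.ndarray:
    return padded[row:row + patch, col:col + patch, :]

print(extract_patch(10, 20).shape)   # (13, 13, 30): the per-sample network input
```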
Figures: graphical abstract; the Transformer encoder and its MHSA; the TransHSI framework with its spectral-spatial feature extraction and fusion modules; false-color maps, ground truth, and training/test splits for the Indian Pines, Pavia University, and DFC2018 datasets; the effect of patch size and PCA dimensionality on OA; classification maps of all compared methods on the three datasets; 3D confusion matrices; t-SNE visualizations of original and output features; activation maps of the ablation variants on the Pavia University dataset; and the effect of the training-sample percentage.

17 pages, 6875 KiB  
Article
Invariant Attribute-Driven Binary Bi-Branch Classification of Hyperspectral and LiDAR Images
by Jiaqing Zhang, Jie Lei, Weiying Xie and Daixun Li
Remote Sens. 2023, 15(17), 4255; https://doi.org/10.3390/rs15174255 - 30 Aug 2023
Viewed by 1096
Abstract
The fusion of hyperspectral and LiDAR images plays a crucial role in remote sensing by capturing spatial relationships and modeling semantic information for accurate classification and recognition. However, existing methods, such as Graph Convolutional Networks (GCNs), face challenges in constructing effective graph structures due to variations in local semantic information and limited receptiveness to large-scale contextual structures. To overcome these limitations, we propose an Invariant Attribute-driven Binary Bi-branch Classification (IABC) method, which is a unified network that combines a binary Convolutional Neural Network (CNN) and a GCN with invariant attributes. Our approach utilizes a joint detection framework that can simultaneously learn features from small-scale regular regions and large-scale irregular regions, resulting in an enhanced structural representation of HSI and LiDAR images in the spectral–spatial domain. This approach not only improves the accuracy of classification and recognition but also reduces storage requirements and enables real-time decision making, which is crucial for effectively processing large-scale remote sensing data. Extensive experiments demonstrate the superior performance of our proposed method in hyperspectral image analysis tasks. The combination of CNNs and GCNs allows for the accurate modeling of spatial relationships and effective construction of graph structures. Furthermore, the integration of binary quantization enhances computational efficiency, enabling the real-time processing of large-scale data. Therefore, our approach presents a promising opportunity for advancing remote sensing applications using deep learning techniques. Full article
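To make the two ingredients emphasized in the abstract more tangible, the sketch below shows, under simplifying assumptions, how binary weight quantization with a straight-through estimator can be attached to a CNN branch and how its logits can be fused with those of a graph branch. The `BiBranchClassifier`, its `alpha` fusion weight, and the linear stand-in for the GCN are hypothetical illustrations, not the authors' implementation.

```python
# A minimal PyTorch sketch of binary (sign) weight quantization with a
# straight-through estimator, plus late fusion of CNN and graph-branch
# logits. Layer shapes and the fusion weight alpha are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConv2d(nn.Conv2d):
    def forward(self, x):
        # Binarize weights to {-1, +1}, scaled by their mean magnitude;
        # the straight-through trick keeps gradients flowing to the
        # full-precision weights during training.
        scale = self.weight.abs().mean()
        w_bin = torch.sign(self.weight) * scale
        w = self.weight + (w_bin - self.weight).detach()
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

class BiBranchClassifier(nn.Module):
    def __init__(self, in_ch=64, n_classes=15, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        self.cnn = nn.Sequential(BinaryConv2d(in_ch, 64, 3, padding=1),
                                 nn.BatchNorm2d(64), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(64, n_classes))
        self.gcn_fc = nn.Linear(64, n_classes)   # stands in for a GCN branch

    def forward(self, patches, graph_feats):
        # patches: (B, in_ch, H, W) small regular regions for the CNN branch;
        # graph_feats: (B, 64) node features aggregated by a graph branch.
        return (self.alpha * self.cnn(patches)
                + (1 - self.alpha) * self.gcn_fc(graph_feats))

logits = BiBranchClassifier()(torch.randn(4, 64, 9, 9), torch.randn(4, 64))
print(logits.shape)  # torch.Size([4, 15])
```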
Figure 1. The graph structure construction is influenced by feature variations within the same class field.
Figure 2. The architecture of the proposed IABC. Invariant attributes are captured by Invariant Attribute Extraction (IAE) and then transformed to construct an effective graph structure for the GCN. Spatial Consistency Fusion (SCF) is designed to enhance the consistency of similar features in the observed area's terrain feature information for the CNN. The collaboration between the CNN and GCN improves classification performance, while the CNN with binary weights reduces storage requirements and accelerates inference.
Figure 3. Classification maps of different methods for the Houston2013 dataset.
Figure 4. Classification maps of different methods for the Trento dataset.

Other


15 pages, 10968 KiB  
Technical Note
RANet: Relationship Attention for Hyperspectral Anomaly Detection
by Yingzhao Shao, Yunsong Li, Li Li, Yuanle Wang, Yuchen Yang, Yueli Ding, Mingming Zhang, Yang Liu and Xiangqiang Gao
Remote Sens. 2023, 15(23), 5570; https://doi.org/10.3390/rs15235570 - 30 Nov 2023
Cited by 2 | Viewed by 1007
Abstract
Hyperspectral anomaly detection (HAD) is of great interest for the exploration of unknown targets. Existing methods focus only on local similarity, which may limit detection performance. To cope with this problem, we propose relationship attention-guided unsupervised learning with convolutional autoencoders (CAEs) for HAD, called RANet. First, instead of focusing only on local similarity, RANet, for the first time, attends to topological similarity by leveraging a graph attention network (GAT) to capture the deep topological relationships embedded in a customized incidence matrix built from completely unlabeled data mixed with anomalies. Notably, the attention intensity of the GAT is self-adaptively controlled by the adjacency reconstruction ability, which effectively reduces human intervention. Next, we adopt an unsupervised CAE that is jointly learned with the topological relationship attention to achieve satisfactory model performance. Finally, on the basis of background reconstruction, we detect anomalies by the reconstruction error. Extensive experiments on hyperspectral images (HSIs) demonstrate that our proposed RANet outperforms existing fully unsupervised methods. Full article
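As a rough illustration of the final detection step, the sketch below reconstructs an HSI with a plain convolutional autoencoder and scores each pixel by its spectral reconstruction error. The GAT-based topological attention and the joint training loss are deliberately omitted, and the `TinyCAE` name, network width, and band count are arbitrary assumptions, not the paper's settings.

```python
# A minimal PyTorch sketch of reconstruction-error anomaly scoring:
# background pixels are reconstructed well by the autoencoder, anomalies
# are not, so the per-pixel L2 reconstruction error serves as the score.
import torch
import torch.nn as nn

class TinyCAE(nn.Module):
    def __init__(self, bands=189, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(bands, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden // 2, 3, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.Conv2d(hidden // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, bands, 3, padding=1))

    def forward(self, x):                 # x: (1, bands, H, W)
        return self.dec(self.enc(x))

def anomaly_map(model, hsi):
    # Per-pixel spectral reconstruction error as the anomaly score.
    with torch.no_grad():
        err = (model(hsi) - hsi).pow(2).sum(dim=1).sqrt()   # (1, H, W)
    return err.squeeze(0)

hsi = torch.randn(1, 189, 100, 100)
print(anomaly_map(TinyCAE(), hsi).shape)  # torch.Size([100, 100])
```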
Figure 1. An illustration of the existing methods. Some anomalies (red blocks) are far apart in spatial dimensions and, thus, may be considered as background (blue blocks).
Figure 2. High-level overview of the proposed RANet. In the topological-aware (TA) module, an incidence matrix D and a representation set S are fed into the GAT to obtain a topological feature map $\hat{\mathbf{S}}$; the reconstruction of D (denoted as $\hat{\mathbf{D}}$) is obtained after the decoding process. In the reconstruction backbone, a CAE, combined with the previously obtained $\hat{\mathbf{S}}$, reconstructs H into $\hat{\mathbf{H}}$. The joint learning part trains the TA module and the reconstruction backbone together in an end-to-end manner.
Figure 3. (Left) Locations of the background samples in the pseudo-color image. (Middle, Right) Spectral curves of the background samples in the corresponding colors. The legend indicates the spectral angle distance (SAD) and the Euclidean distance (ED) of the two spectral curves.
Figure 4. Implementation details on the four HSIs: (a) the threshold $\eta$; (b) the number of hidden nodes.
Figure 5. Implementation details of the hyperparameters $\lambda_1$ and $\lambda_2$ on (a) Texas Coast-1, (b) Texas Coast-2, (c) Los Angeles, and (d) San Diego. Different colors refer to different intervals of values of $\lambda_1$ and $\lambda_2$.
Figure 6. False-color data, reference maps, and detection maps of the compared methods for (a) Texas Coast-1, (b) Texas Coast-2, (c) Los Angeles, and (d) San Diego.
Figure 7. ROC curves of (TPR, FPR) for the algorithms on (a) Texas Coast-1, (b) Texas Coast-2, (c) Los Angeles, and (d) San Diego.
Figure 8. Background–anomaly separation analysis of the compared methods on (a) Texas Coast-1, (b) Texas Coast-2, (c) Los Angeles, and (d) San Diego.
Figure 9. Detection accuracy comparison under four factors related to the performance of RANet on the four HSIs.
13 pages, 15032 KiB  
Technical Note
Retrieval of Land Surface Temperature over Mountainous Areas Using Fengyun-3D MERSI-II Data
by Yixuan Xue, Xiaolin Zhu, Zihao Wu and Si-Bo Duan
Remote Sens. 2023, 15(23), 5465; https://doi.org/10.3390/rs15235465 - 23 Nov 2023
Viewed by 1192
Abstract
Land surface temperature (LST) is an important physical quantity in the energy exchange of hydrothermal cycles between the land and the near-surface atmosphere at regional and global scales. However, the traditional thermal infrared radiative transfer equation (RTE) and LST retrieval algorithms are generally based on the assumptions of a homogeneous and isotropic surface, which ignore the terrain effects of a heterogeneous topography. This can cause significant errors when the traditional RTE and other algorithms are used to retrieve LST over mountainous areas. In this study, a mountainous thermal infrared radiative transfer model incorporating terrain effect correction is used to retrieve mountainous LST from FY-3D MERSI-II data, and in situ station data are used to evaluate the performance of the iterative single-channel algorithm. The elevation of the study region ranges from 500 m to 2200 m, and the minimum sky view factor (SVF) reaches 0.75. Results show that the spatial distribution of the retrieved LST follows the topographic features, with larger values in the lower valleys and smaller values on the higher ridges. In addition, the overall bias and RMSE between the retrieved LSTs and five in situ stations are −0.70 K and 2.64 K, respectively, which demonstrates that the iterative single-channel algorithm performs well when the terrain effect is taken into account. Accurate LST estimation is meaningful for mountainous ecological environment monitoring and global climate research. Such adjacent terrain effect corrections should be considered in future research on complex terrains, especially with high-spatial-resolution TIR data. Full article
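The accuracy figures quoted above come from comparing retrieved LST with in situ station measurements. The short NumPy sketch below computes the same bias and RMSE statistics; the station temperatures in it are made-up placeholders, not the paper's data.

```python
# A small NumPy sketch of the validation step: bias and RMSE between
# retrieved LST and in situ station measurements. The values below are
# hypothetical placeholders for illustration only.
import numpy as np

def bias_rmse(retrieved_lst, in_situ_lst):
    diff = np.asarray(retrieved_lst, dtype=float) - np.asarray(in_situ_lst, dtype=float)
    return diff.mean(), np.sqrt((diff ** 2).mean())

retrieved = [283.1, 279.4, 281.8, 285.0, 278.2]   # K, hypothetical
measured  = [284.0, 280.1, 282.5, 288.3, 278.9]   # K, hypothetical
bias, rmse = bias_rmse(retrieved, measured)
print(f"bias = {bias:.2f} K, RMSE = {rmse:.2f} K")
```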
Figure 1. The location and elevation of the study areas: the overall regions of the FY-3D LST retrieval images for study area-I (a) and study area-II (b), and (c) the locations of the in situ stations.
Figure 2. Spectral response functions of FY-3D bands 24 and 25.
Figure 3. Spatial distribution over study area-I of the (a) aspect angle, (b) slope angle, (c) SVF, and (d) ASTER GED bare soil emissivity, where the white regions represent missing values.
Figure 4. Spatial distribution of the LST images retrieved from FY-3D band 24 with terrain effect correction on (a) 10 November (05:45 UTC), (b) 11 November (05:25 UTC), (c) 11 November (19:25 UTC), and (d) 14 November (18:30 UTC) in 2021.
Figure 5. Spatial distribution over study area-II of the (a) aspect angle, (b) slope angle, (c) SVF, and (d) ASTER GED bare soil emissivity.
Figure 6. Spatial distribution of the LST images retrieved from FY-3D band 24 with terrain effect correction on (a) 11 November (05:25 UTC), (b) 12 November (19:05 UTC), (c) 13 November (18:45 UTC), and (d) 14 November (06:10 UTC) in 2021.
Figure 7. Scatterplot of the LST bias and RMSE between the five in situ sites (A1701, A1702, A1703, A1707, and A1708) and the retrieved FY-3D LST.