Search Results (583)

Search Parameters:
Keywords = dilated Convolution

21 pages, 24146 KiB  
Article
SMEP-DETR: Transformer-Based Ship Detection for SAR Imagery with Multi-Edge Enhancement and Parallel Dilated Convolutions
by Chushi Yu and Yoan Shin
Remote Sens. 2025, 17(6), 953; https://doi.org/10.3390/rs17060953 - 7 Mar 2025
Abstract
Synthetic aperture radar (SAR) serves as a pivotal remote sensing technology, offering critical support for ship monitoring, environmental observation, and national defense. Although optical detection methods have achieved good performance, SAR imagery still faces challenges, including speckle, complex backgrounds, and small, dense targets. Reducing false alarms and missed detections while improving detection performance remains a key objective in the field. To address these issues, we propose SMEP-DETR, a transformer-based model with multi-edge enhancement and parallel dilated convolutions. This model integrates a speckle denoising module, a multi-edge information enhancement module, and a parallel dilated convolution and attention pyramid network. Experimental results demonstrate that SMEP-DETR achieves a high mAP of 98.6% on SSDD, 93.2% on HRSID, and 80.0% on LS-SSDD-v1.0, surpassing several state-of-the-art algorithms. Visualization results validate the model’s capability to effectively mitigate the impact of speckle noise while preserving valuable information in both inshore and offshore scenarios. Full article
(This article belongs to the Special Issue Remote Sensing Image Thorough Analysis by Advanced Machine Learning)
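The parallel dilated convolution idea in the abstract can be illustrated with a short, generic PyTorch sketch: several 3×3 convolutions with different dilation rates run in parallel over the same feature map, and their outputs are concatenated and fused by a 1×1 convolution. The dilation rates, channel counts, and module name below are illustrative assumptions, not the exact layers of SMEP-DETR.

```python
import torch
import torch.nn as nn

class ParallelDilatedBlock(nn.Module):
    """Run 3x3 convolutions with several dilation rates in parallel and fuse them.

    Illustrative sketch only; rates and channels are assumptions, not SMEP-DETR's design.
    """
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # padding == dilation keeps the spatial size for a 3x3 kernel
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)  # 1x1 conv after concat

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 64, 80, 80)
print(ParallelDilatedBlock(64, 64)(x).shape)  # torch.Size([1, 64, 80, 80])
```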
Show Figures

Figure 1. The architecture of the proposed SMEP-DETR. Ⓒ denotes the concatenate operation and ⊕ denotes the element-wise add operation.
Figure 2. The structure of the multi-edge information enhancement module.
Figure 3. Diagram of the parallel dilated convolution and attention pyramid network.
Figure 4. Visualization of SMEP-DETR and comparison detectors on SSDD: (a) inshore scene with large-scale ship targets, (b) inshore scene with both large and small ships, (c) offshore scene with significant speckle interference, (d) offshore scene with multiple targets. Red bounding boxes represent predicted ships, yellow ellipses indicate missing detections, and blue ellipses denote false alarms.
Figure 5. Visualization of SMEP-DETR and comparison detectors on HRSID and LS-SSDD-v1.0. (a,b) Samples from HRSID, (c,d) samples of LS-SSDD-v1.0. (a) Offshore scene with closely spaced targets, (b) inshore scene with docked objects near the shoreline, (c) offshore scene containing extremely small targets, (d) inshore scene with extensive background information. Red bounding boxes represent predicted ships, yellow ellipses indicate missing detections, and blue ellipses denote false alarms.
19 pages, 5899 KiB  
Article
DGBL-YOLOv8s: An Enhanced Object Detection Model for Unmanned Aerial Vehicle Imagery
by Chonghao Wang and Huaian Yi
Appl. Sci. 2025, 15(5), 2789; https://doi.org/10.3390/app15052789 - 5 Mar 2025
Viewed by 169
Abstract
Unmanned aerial vehicle (UAV) imagery often suffers from significant object scale variations, high target density, and varying distances due to shooting conditions and environmental factors, leading to reduced robustness and low detection accuracy in conventional models. To address these issues, this study proposes DGBL-YOLOv8s, an improved object detection model tailored for UAV perspectives based on YOLOv8s. First, a Dilated Wide Residual (DWR) module is introduced to replace the C2f module in the backbone network of YOLOv8, enhancing the model’s capability to capture fine-grained features and contextual information. Second, the neck structure is redesigned by incorporating a Global-to-Local Spatial Aggregation (GLSA) module combined with a Bidirectional Feature Pyramid Network (BiFPN), which strengthens feature fusion. Third, a lightweight shared convolution detection head is proposed, incorporating shared convolution and batch normalization techniques. Additionally, to further improve small object detection, a dedicated small-object detection head is introduced. Results from experiments on the VisDrone dataset reveal that DGBL-YOLOv8s enhances detection accuracy by 8.5% relative to the baseline model, alongside a 34.8% reduction in parameter count. The overall performance exceeds that of most current detection models, confirming the advantages of the proposed improvements. Full article
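The DWR module above is motivated by enlarging the receptive field with dilated convolutions. The receptive-field arithmetic behind such comparisons (see Figures 7 and 8 below) follows a standard formula: a k×k convolution with dilation d behaves like a kernel of effective size k + (k−1)(d−1), and stacking layers grows the receptive field accordingly. The small Python helper below applies that generic formula; the example layer stack is illustrative, not DGBL-YOLOv8s's actual backbone.

```python
def effective_kernel(k, d):
    """Effective kernel size of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

def receptive_field(layers):
    """Receptive field of a stack of (kernel, dilation, stride) layers.

    Standard recursion: r <- r + (k_eff - 1) * jump; jump <- jump * stride.
    """
    r, jump = 1, 1
    for k, d, s in layers:
        r += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return r

# e.g. three 3x3 convolutions with dilations 1, 3, 5 (illustrative rates only)
print(receptive_field([(3, 1, 1), (3, 3, 1), (3, 5, 1)]))  # 19
```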
Show Figures

Figure 1. Architecture of the DGBL-YOLOv8.
Figure 2. C2f_DWR.
Figure 3. Architecture of the DWR module.
Figure 4. Architecture of the GLBA-BiFPN.
Figure 5. Architecture of the GLSA module.
Figure 6. Architecture of the LSCDH.
Figure 7. Comparison of the changes in the receptive field of the backbone network.
Figure 8. Line graph of receptive field contrast.
Figure 9. Daytime detection results: (a) unimproved detection results and (b) improved detection results.
Figure 10. (I) Night detection results: (a) unimproved detection results and (b) improved detection results. (II) Night detection results: (a) unimproved detection results and (b) improved detection results.
Figure 11. Comparison of false detection: (a) unimproved detection results and (b) improved detection results.
Figure 12. Comparison of missed detection: (a) unimproved detection results and (b) improved detection results.
Figure 13. mAP comparison between different model categories.
20 pages, 3815 KiB  
Article
A Benchmark for Water Surface Jet Segmentation with MobileHDC Method
by Yaojie Chen, Qing Quan, Wei Wang and Yunhan Lin
Appl. Sci. 2025, 15(5), 2755; https://doi.org/10.3390/app15052755 - 4 Mar 2025
Viewed by 138
Abstract
Intelligent jet systems are widely used in various fields, including firefighting, marine operations, and underwater exploration. Accurate extraction and prediction of jet trajectories are essential for optimizing their performance, but challenges arise due to environmental factors such as climate, wind direction, and suction efficiency. To address these issues, we introduce two novel jet segmentation datasets, Libary and SegQinhu, which cover both indoor and outdoor environments under varying weather conditions and temporal intervals. These datasets present significant challenges, including occlusions and strong light reflections, making them ideal for evaluating jet trajectory segmentation methods. Through empirical evaluation of several state-of-the-art (SOTA) techniques on these datasets, we observe that general methods struggle with highly imbalanced pixel distributions in jet trajectory images. To overcome this, we propose a data-driven pipeline for jet trajectory extraction and segmentation. At its core is MobileHDC, a new baseline model that leverages the MobileNetV2 architecture and integrates dilated convolutions to enhance the receptive field without increasing computational cost. Additionally, we introduce a parallel convolutional block and a decoder to fuse multi-level features, enabling a better capture of contextual information and improving the continuity and accuracy of jet segmentation. The experimental results show that our method outperforms existing SOTA techniques on both jet-specific datasets, highlighting the effectiveness of our approach. Full article
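MobileHDC stacks dilated convolutions on top of MobileNetV2 features (the figure captions below mention hybrid dilated convolution layers with C = 160 and C1 = 256 channels). A common way to stack dilated convolutions without the gridding artifact is to vary the rates, e.g. 1, 2, 5, as in hybrid dilated convolution. The sketch below assumes such a rate schedule and those channel counts; it is an illustration, not the paper's exact MobileHDC head.

```python
import torch
import torch.nn as nn

def dilated_bn_relu(in_ch, out_ch, rate):
    """3x3 dilated conv + BN + ReLU; padding = rate keeps H x W."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=rate, dilation=rate, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class HDCHead(nn.Module):
    """Hybrid dilated convolution stack: rates such as (1, 2, 5) avoid the
    gridding artifact of repeating a single rate. Rates/channels are assumptions."""
    def __init__(self, in_ch=160, mid_ch=256, rates=(1, 2, 5)):
        super().__init__()
        layers, ch = [], in_ch
        for r in rates:
            layers.append(dilated_bn_relu(ch, mid_ch, r))
            ch = mid_ch
        self.body = nn.Sequential(*layers)

    def forward(self, x):  # x: features from a MobileNetV2-style backbone
        return self.body(x)

feat = torch.randn(1, 160, 32, 32)   # stand-in for backbone features
print(HDCHead()(feat).shape)         # torch.Size([1, 256, 32, 32])
```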
Show Figures

Figure 1. Segmentation performance of the SAM model on the Libary and SegQinhu datasets, revealing issues in segmentation where the model tends to misclassify background as jet flow.
Figure 2. Relative frequency of annotated jet pixels within an image over the 1300 images in the Libary dataset (a) and the 823 images in the SegQinhu dataset (b), respectively. Here, the fraction of jet pixels serves as proxy for the size of the objects of interest within an image. (a) Libary, (b) SegQinhu.
Figure 3. Sample images of jet states from the Libary dataset under various conditions, including strong lighting and minimal pixel coverage.
Figure 4. Sample images of jet morphologies from the SegQinhu dataset under various conditions, including occlusion, partial coverage, and reflective scenarios.
Figure 5. An overview of the basic architecture of our proposed model. Here, we set the parameters N₁, N₂, N₃ for the repeated times as N₁ = 6, N₂ = 4 and N₃ = 2. The operation ⊕ represents the concatenation operation.
Figure 6. Diagram of dilated convolution. When the dilation rate is 1, it behaves identically to a standard convolution.
Figure 7. Diagram of hybrid dilated convolution layers, where C and C1 represent the number of channels, with C = 160 and C1 = 256, and r = a indicates the dilation rate = a. Additionally, xₛ represents the feature maps from the 7th layer of the MobileNetV2 network.
Figure 8. Visualization of the jet segmentation results of the different methods on the Libary testing dataset.
Figure 9. Visualization of the jet segmentation results of the different methods on the SegQinhu testing dataset.
21 pages, 3926 KiB  
Article
S4Det: Breadth and Accurate Sine Single-Stage Ship Detection for Remote Sense SAR Imagery
by Mingjin Zhang, Yingfeng Zhu, Longyi Li, Jie Guo, Zhengkun Liu and Yunsong Li
Remote Sens. 2025, 17(5), 900; https://doi.org/10.3390/rs17050900 - 4 Mar 2025
Viewed by 178
Abstract
Synthetic Aperture Radar (SAR) is a remote sensing technology that can realize all-weather, all-day monitoring, and it is widely used in ocean ship monitoring tasks. Recently, many oriented detectors have been used for ship detection in SAR images. However, these methods often find it difficult to balance detection accuracy and speed, and noise around targets in inshore SAR scenes degrades detection network performance. In addition, the rotation representation still suffers from boundary discontinuity. To address these issues, we propose S4Det, a Sinusoidal Single-Stage SAR image detection method that enables real-time oriented ship target detection. Two key mechanisms were designed to address inshore scene processing and angle regression challenges. Specifically, a Breadth Search Compensation Module (BSCM) resolved the limited detection capability issue observed within inshore scenarios. Neural Discrete Codebook Learning was strategically integrated with Multi-scale Large Kernel Attention, capturing context information around the target and mitigating the information loss inherent in dilated convolutions. To tackle the boundary discontinuity arising from the periodic nature of the target regression angle, we developed a Sine Fourier Transform Coding (SFTC) technique. The angle is represented using diverse sine components, and the discrete Fourier transform is applied to convert these periodic components to the frequency domain for processing. Finally, our S4Det achieved 92.2% mAP and 31+ FPS on the RSSDD dataset with an RTX A5000 GPU, outperforming prevalent mainstream oriented detection networks. The robustness of the proposed S4Det was also verified on another public dataset, RSDD. Full article
(This article belongs to the Section AI Remote Sensing)
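The SFTC idea encodes the regression angle θ ∈ [−π/2, π/2) with four phase-shifted sine components (initial phases 0°, 90°, 180°, 270°, see Figure 4 below) so that the loss no longer jumps at the angular boundary. Below is a minimal NumPy sketch of one plausible encode/decode pair of that kind; the exact component definition is an assumption, and the discrete Fourier transform step used in the paper is omitted.

```python
import numpy as np

PHASES = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi  # 0, 90, 180, 270 degrees

def sine_encode(theta):
    """Encode an angle as four phase-shifted sine components."""
    return np.sin(theta + PHASES)

def sine_decode(components):
    """Recover the angle from the four components (four-step phase shift)."""
    s0, s1, s2, s3 = components
    return np.arctan2((s0 - s2) / 2.0, (s1 - s3) / 2.0)

theta = -0.4 * np.pi                         # some angle in [-pi/2, pi/2)
code = sine_encode(theta)
print(np.isclose(sine_decode(code), theta))  # True
```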
Show Figures

Figure 1. The overall framework of the proposed S4Det. BSCM (comprising MLKA and NDCL) is integrated into the top layer of the neck network, incorporating a convolutional attention mechanism for noise reduction and utilizing SFTC to encode angle information for loss calculation and training.
Figure 2. Comparison of the network design of the existing method (a) and our method (b). In our approach, the backbone and neck share a similar structure, while introducing the BSCM, attention detection head strategy, and angle SFTC.
Figure 3. Feature heatmap visualization of the outputs from MLKA, NDCL, and BSCM.
Figure 4. Coding and decoding process of SFTC. Sine encoding is applied to the predicted angle θ ∈ [−π/2, π/2), using a four-step phase shift method with initial phases set at 0, 90, 180, and 270 degrees.
Figure 5. Target area distributions.
Figure 6. Learning curves of the loss values for different dilation designs.
Figure 7. Learning curves of the loss values for different modules.
Figure 8. Detection performance for different M values in all scenes.
Figure 9. Learning curves of the loss values for different methods.
Figure 10. AP curves on RSDD. (a) AP₅₀ curve. (b) AP₇₅ curve. (c) mAP curve.
Figure 11. Visualization of the detection results of different methods on RSSDD. Red rectangles indicate the actual ship targets. Green and purple rectangles represent the detection results of five comparative methods and our method, respectively.
Figure 12. Visualization of the detection results of different methods on RSDD. The red rectangular detection boxes represent the actual ground truth annotation. The green and blue rectangular detection boxes denote the detection results of the five comparison methods and the proposed method, respectively. The red ellipses in the picture represent false detection, and the yellow ellipses represent missed detection.
21 pages, 21254 KiB  
Article
Lightweight Explicit 3D Human Digitization via Normal Integration
by Jiaxuan Liu, Jingyi Wu, Ruiyang Jing, Han Yu, Jing Liu and Liang Song
Sensors 2025, 25(5), 1513; https://doi.org/10.3390/s25051513 - 28 Feb 2025
Viewed by 165
Abstract
In recent years, generating 3D human models from images has gained significant attention in 3D human reconstruction. However, deploying large neural network models in practical applications remains challenging, particularly on resource-constrained edge devices. This problem arises primarily because large neural network models require significantly higher computational power, which imposes greater demands on hardware capabilities and increases inference time. To address this issue, we can optimize the network architecture to reduce the number of model parameters, thereby alleviating the heavy reliance on hardware resources. We propose a lightweight and efficient 3D human reconstruction model that balances reconstruction accuracy and computational cost. Specifically, our model integrates Dilated Convolutions and the Cross-Covariance Attention mechanism into its architecture to construct a lightweight generative network. This design effectively captures multi-scale information while significantly reducing model complexity. Additionally, we introduce an innovative loss function tailored to the geometric properties of normal maps. This loss function provides a more accurate measure of surface reconstruction quality and enhances the overall reconstruction performance. Experimental results show that, compared with existing methods, our approach reduces the number of training parameters by approximately 80% while maintaining the generated model’s quality. Full article
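The model pairs Dilated Convolutions with Cross-Covariance Attention, an attention variant that forms a C×C attention map over channels instead of an N×N map over tokens, so its cost scales linearly with the number of tokens. The sketch below follows the commonly published cross-covariance attention formulation (L2-normalized queries and keys, learnable temperature); head count and dimensions are illustrative and may differ from the paper's blocks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossCovarianceAttention(nn.Module):
    """Cross-covariance attention: the C x C channel attention map makes the
    cost linear in the number of tokens N. Dimensions here are illustrative."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (B, N, C) token features
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 4, 1)   # each: (B, heads, C_head, N)
        q = F.normalize(q, dim=-1)             # normalize along the token axis
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature  # (B, heads, C_head, C_head)
        out = attn.softmax(dim=-1) @ v         # (B, heads, C_head, N)
        out = out.permute(0, 3, 1, 2).reshape(B, N, C)
        return self.proj(out)

x = torch.randn(2, 256, 64)
print(CrossCovarianceAttention(64)(x).shape)  # torch.Size([2, 256, 64])
```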
Show Figures

Figure 1. This figure demonstrates our method’s reconstruction capability. Our approach successfully generates detailed 3D human models. The top row shows the input images, the middle row presents the generated normal maps, and the bottom row displays the reconstructed 3D human models.
Figure 2. Overview. (A) Pipeline for 3D human reconstruction; (B) architectural framework of the loss function for normal map generation.
Figure 3. Overview of Lite-GN. The generative network is designed using an encoder–decoder architecture. Each module comprises Mₙ Dilated Convolution Blocks and a Cross-Covariance Attention Block.
Figure 4. The detailed architectures of the Dilated Convolution Block and Cross-Covariance Attention Block are illustrated.
Figure 5. Examples from the THuman2.0 dataset.
Figure 6. Existing methods exhibit varying limitations in 3D human reconstruction: PIFuHD [13] demonstrates competent clothing reconstruction capabilities but encounters challenges with complex pose estimation. While ICON [39] achieves reasonable overall performance, its ability to reconstruct intricate clothing details remains constrained. Similarly, PaMIR [14] shows insufficient reconstruction fidelity, particularly in capturing fine-grained surface details. In contrast, both ECON [19] and our proposed method demonstrate superior reconstruction quality, effectively addressing these limitations through advanced architectural designs.
Figure 7. User preference. Our method demonstrated higher user preference compared with the baseline approach. The user study results further validate our approach’s effectiveness in meeting the demand for high-quality human body reconstruction. In the figure, the orange color represents our method, while the blue color indicates the comparative method.
Figure 8. To enhance the reconstruction quality, the facial and hand regions of the generated model are replaced with the corresponding components from the SMPL-X model.
Figure 9. This figure demonstrates the generated texture maps from various viewpoints using different text prompts.
Figure 10. We developed a pose-parameter-driven avatar using generated human models based on the SCANimate [60] framework.
21 pages, 5606 KiB  
Article
CE-RoadNet: A Cascaded Efficient Road Network for Road Extraction from High-Resolution Satellite Images
by Ke-Nan Cheng, Weiping Ni, Han Zhang, Junzheng Wu, Xiao Xiao and Zhigang Yang
Remote Sens. 2025, 17(5), 831; https://doi.org/10.3390/rs17050831 - 27 Feb 2025
Viewed by 129
Abstract
The reconstruction of road networks from high-resolution satellite images is of significant importance across a range of disciplines, including traffic management, vehicle navigation and urban planning. However, existing models are computationally demanding and memory-intensive due to their high model complexity, rendering them impractical in many real-world applications. In this work, we present Cascaded Efficient Road Network (CE-RoadNet), a novel neural network architecture which emphasizes the elegance and simplicity of its design, while also retaining a noteworthy level of performance in road extraction tasks. First, a simple encoder–decoder architecture (Effi-RoadNet) is proposed, which leverages smoothed dilated convolutions combined with an attention-guided feature fusion module to aggregate features from multiple levels. Subsequently, an extended variant termed CE-RoadNet is designed in a cascaded architecture to enhance the feature representation ability of the model. Benefiting from the concise network design and the prominent representational ability of the stacking mechanism, our network achieves a better trade-off between accuracy and efficiency. Extensive experiments on public road datasets demonstrate that our approach achieves state-of-the-art results with lower complexity. All codes and models will be released soon to facilitate reproduction of our results. Full article
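Effi-RoadNet uses "smoothed" dilated convolutions. One published way to smooth a dilated convolution is to let neighbouring positions interact through a small depthwise convolution before the dilated kernel samples its sparse grid, which reduces gridding artifacts. The sketch below implements that generic idea; whether CE-RoadNet uses this exact variant, or these kernel sizes, is an assumption.

```python
import torch
import torch.nn as nn

class SmoothedDilatedConv(nn.Module):
    """Dilated 3x3 convolution preceded by a small depthwise 'smoothing'
    convolution, one generic way to reduce gridding artifacts. Whether
    CE-RoadNet uses exactly this variant is an assumption."""
    def __init__(self, channels, dilation):
        super().__init__()
        # a depthwise (2*d - 1) filter lets the holes of the dilated kernel
        # see their neighbours before the sparse sampling happens
        k = 2 * dilation - 1
        self.smooth = nn.Conv2d(channels, channels, k, padding=k // 2,
                                groups=channels, bias=False)
        self.dilated = nn.Conv2d(channels, channels, 3, padding=dilation,
                                 dilation=dilation, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.dilated(self.smooth(x))) + x)  # residual

x = torch.randn(1, 64, 64, 64)
print(SmoothedDilatedConv(64, dilation=4)(x).shape)  # torch.Size([1, 64, 64, 64])
```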
Show Figures

Graphical abstract
Figure 1. Natural image and remote sensing image segmentation task.
Figure 2. The overall framework of the proposed method.
Figure 3. Smoothed dilated convolution residual block.
Figure 4. Illustration of attention-guided feature fusion sub-module.
Figure 5. Qualitative evaluations between CE-RoadNet and comparative methods on the DeepGlobe road dataset. Maps in the first line are 1024 × 1024 while the rest are 256 × 256.
Figure 6. Qualitative evaluations between CE-RoadNet and comparative methods on the Massachusetts Road dataset. Maps in the first line are 768 × 768 while the rest are 256 × 256.
Figure 7. Feature activation visualization. Attention maps and the feature embeddings from stage 1 and stage 2. We visualize the normalized activation map for each feature channel in small squares. Notably, we display the first 64 channels in each visualization.
Figure 8. APLS results on the DeepGlobe dataset of the different models. Each bubble’s area is proportional to the params of the corresponding model. All models here take 256² images as the input.
19 pages, 3572 KiB  
Article
MOSSNet: A Lightweight Dual-Branch Multiscale Attention Neural Network for Bryophyte Identification
by Haixia Luo, Xiangfen Zhang, Feiniu Yuan, Jing Yu, Hao Ding, Haoyu Xu and Shitao Hong
Symmetry 2025, 17(3), 347; https://doi.org/10.3390/sym17030347 - 25 Feb 2025
Viewed by 161
Abstract
Bryophytes, including liverworts, mosses, and hornworts, play an irreplaceable role in soil moisture retention, erosion prevention, and pollution monitoring. The precise identification of bryophyte species enhances our understanding and utilization of their ecological functions. However, their complex morphology and structural symmetry make identification difficult. Although deep learning improves classification efficiency, challenges remain due to limited datasets and the inadequate adaptation of existing methods to multi-scale features, causing poor performance in fine-grained multi-classification. Thus, we propose MOSSNet, a lightweight neural network for bryophyte feature detection. It has a four-stage architecture that efficiently extracts multi-scale features using a modular design with symmetry consideration in feature representation. At the input stage, the Convolutional Patch Embedding (CPE) module captures representative features through a two-layer convolutional structure. In each subsequent stage, Dual-Branch Multi-scale (DBMS) modules are employed, with one branch utilizing convolutional operations and the other utilizing the Dilated Convolution Enhanced Attention (DCEA) module for multi-scale feature fusion. The DBMS module extracts fine-grained and coarse-grained features by a weighted fusion of the outputs from the two branches. Evaluating MOSSNet on the self-constructed dataset BryophyteFine reveals a Top-1 accuracy of 99.02% in classifying 26 bryophyte species, 7.13% higher than the best existing model, while using only 1.58 M parameters and 0.07 G FLOPs. Full article
(This article belongs to the Section Computer)
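The DBMS module fuses a convolutional branch with a dilated-convolution attention branch through a weighted combination of the two outputs. A minimal sketch of such a dual-branch weighted fusion is given below, using simple learnable normalized weights; the branch contents and the weighting scheme are assumptions rather than MOSSNet's exact design.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Weighted fusion of a local conv branch and a dilated-conv branch.
    Branches and the normalized-weight scheme are illustrative assumptions."""
    def __init__(self, channels, dilation=3):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.context = nn.Conv2d(channels, channels, 3, padding=dilation,
                                 dilation=dilation, bias=False)
        self.w = nn.Parameter(torch.ones(2))   # learnable branch weights

    def forward(self, x):
        w = torch.relu(self.w)
        w = w / (w.sum() + 1e-4)               # normalize so the weights sum to ~1
        return w[0] * self.local(x) + w[1] * self.context(x)

x = torch.randn(1, 32, 56, 56)
print(DualBranchFusion(32)(x).shape)  # torch.Size([1, 32, 56, 56])
```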
Show Figures

Figure 1. Demonstration of interclass similarity and intraclass variability.
Figure 2. The overall MOSSNet framework.
Figure 3. DBMS module detailed structure.
Figure 4. Image types in the BryophyteFine dataset.
Figure 5. Heat map of model classification confusion matrix.
Figure 6. Distribution of model parameters and Mean Average Precision.
20 pages, 3901 KiB  
Article
Design and Implementation of a Lightweight and Energy-Efficient Semantic Segmentation Accelerator for Embedded Platforms
by Hui Li, Jinyi Li, Bowen Li, Zhengqian Miao and Shengli Lu
Micromachines 2025, 16(3), 258; https://doi.org/10.3390/mi16030258 - 25 Feb 2025
Viewed by 220
Abstract
With the rapid development of lightweight network models and efficient hardware deployment techniques, the demand for real-time semantic segmentation in areas such as autonomous driving and medical image processing has increased significantly. However, realizing efficient semantic segmentation on resource-constrained embedded platforms still faces many challenges. As a classical lightweight semantic segmentation network, ENet has attracted much attention due to its low computational complexity. In this study, we optimize the ENet semantic segmentation network to significantly reduce its computational complexity through structural simplification and 8-bit quantization and improve its hardware compatibility through the optimization of on-chip data storage and data transfer, while maintaining 51.18% mIoU. The optimized network is successfully deployed on a hardware accelerator and SoC system based on the Xilinx ZYNQ ZCU104 FPGA. In addition, we optimize the computational units for transposed convolution and dilated convolution and improve the on-chip data storage and data transfer design. The optimized system achieves a frame rate of 130.75 FPS, which meets the real-time processing requirements in areas such as autonomous driving and medical imaging. Meanwhile, the power consumption of the accelerator is 3.479 W, the throughput reaches 460.8 GOPS, and the energy efficiency reaches 132.2 GOPS/W. These results fully demonstrate the effectiveness of the optimization and deployment strategies in achieving a balance between computational efficiency and accuracy, which makes the system well suited for resource-constrained embedded platform applications. Full article
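The accelerator relies on 8-bit quantization of the network. As a point of reference, a generic symmetric per-tensor INT8 scheme looks like the sketch below (scale = max|x|/127, values rounded and clamped); the paper's actual quantization parameters and calibration procedure are not reproduced here.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: returns int8 codes and the scale."""
    scale = np.max(np.abs(x)) / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 3, 3).astype(np.float32)
q, s = quantize_int8(w)
# rounding error of symmetric quantization stays within half a quantization step
print(np.max(np.abs(dequantize(q, s) - w)) <= s / 2 + 1e-6)  # True
```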
Show Figures

Figure 1. Optimization of network structure.
Figure 2. The overall architecture of the proposed accelerator.
Figure 3. Flowchart of accelerator data stream.
Figure 4. Schematic design for dealing with discontinuities between dilation convolution columns.
Figure 5. Schematic representation of optimized row-caching convolution sliding window for dilation convolution.
Figure 6. Overview of line buffer module.
Figure 7. Overview of convolution window with delay cell.
Figure 8. Overview of weight window generation module.
Figure 9. Overview of feature map read-state machine.
Figure 10. Overview of configurable computing array.
Figure 11. Overview of the array adder tree.
Figure 12. The input–output situation of the array addition tree when running transposed convolution.
Figure 13. The input–output situation of the second row of PE arrays during transposed convolution.
Figure 14. The input–output situation of the first and third rows of PE arrays during transposed convolution.
Figure 15. Switching of input and output buffers.
Figure 16. Internal structure diagram of the buffer group.
Figure 17. Lightweight semantic segmentation model test image. (a) Original image; (b) labeled image; (c) 8-bit quantized lightweight network recognition result.
Figure 18. System block design diagram.
Figure 19. Overall functional simulation diagram.
Figure 20. Overall accelerator power consumption.
22 pages, 11312 KiB  
Article
Multi-Scale Kolmogorov-Arnold Network (KAN)-Based Linear Attention Network: Multi-Scale Feature Fusion with KAN and Deformable Convolution for Urban Scene Image Semantic Segmentation
by Yuanhang Li, Shuo Liu, Jie Wu, Weichao Sun, Qingke Wen, Yibiao Wu, Xiujuan Qin and Yanyou Qiao
Remote Sens. 2025, 17(5), 802; https://doi.org/10.3390/rs17050802 - 25 Feb 2025
Viewed by 212
Abstract
The introduction of an attention mechanism in remote sensing image segmentation improves segmentation accuracy. In this paper, a novel multi-scale KAN-based linear attention (MKLA) segmentation network, MKLANet, is developed to promote better segmentation results. A hybrid global–local attention mechanism in the feature decoder is designed to enhance the aggregation of global–local context and avoid potential blocking artifacts during feature extraction and segmentation. The local attention channel adopts the MKLA block, which brings the merits of KAN convolution into a Mamba-like linear attention block to improve the handling of linear and nonlinear features and complex function approximation with few extra computations. The global attention channel uses a long-range cascade encoder–decoder block, which mainly employs a 7 × 7 depth-wise convolution token mixer and a lightweight 7 × 7 dilated deep convolution to capture long-distance spatial features and retain key spatial information. In addition, to enrich the input of the attention block, a deformable convolution module is developed between the encoder output and the corresponding scale decoder, which improves the expressive ability of the segmentation model without increasing the depth of the network. Experimental results on the Vaihingen dataset (83.68% mIoU, 92.98% OA, and 91.08% mF1), the UAVid dataset (69.78% mIoU and 96.51% OA), the LoveDA dataset (51.53% mIoU, 86.42% OA, and 67.19% mF1), and the Potsdam dataset (97.14% mIoU, 92.64% OA, and 93.8% mF1) outperform other advanced attention-based approaches in terms of small target and edge segmentation. Full article
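The global attention channel described above is built around a 7 × 7 depth-wise convolution token mixer followed by a lightweight 7 × 7 dilated convolution. The sketch below shows one way such a large-kernel mixer can be written; treating the dilated convolution as depth-wise, the dilation rate, and the final 1×1 projection are assumptions, not the paper's exact layers.

```python
import torch
import torch.nn as nn

class LargeKernelTokenMixer(nn.Module):
    """7x7 depthwise conv followed by a 7x7 dilated depthwise conv and a 1x1
    pointwise conv. Depthwise (groups=channels) keeps the cost low.
    The dilation rate (3) and the pointwise projection are assumptions."""
    def __init__(self, channels, dilation=3):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 7, padding=3,
                            groups=channels, bias=False)
        self.dw_dilated = nn.Conv2d(channels, channels, 7,
                                    padding=3 * dilation, dilation=dilation,
                                    groups=channels, bias=False)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pw(self.dw_dilated(self.dw(x)))

x = torch.randn(1, 96, 32, 32)
print(LargeKernelTokenMixer(96)(x).shape)  # torch.Size([1, 96, 32, 32])
```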
Show Figures

Graphical abstract
Figure 1. The overall pipeline and module of the proposed MKLANet; (a) the pipeline of the proposed MKLA network; (b) channel fusion (CF) block structure; (c) KAN convolution representation theorem; (d) illustration of MKLA decoder block; (e) the diagram of DC blocks; (f) segmentation head.
Figure 2. The flowchart of DC.
Figure 3. Illustration of the SSM and its equivalent form.
Figure 4. Structural comparison of (a) Mamba and (b) MKLA.
Figure 5. Notations of activations of the KAN function and activation function ϕ(x).
Figure 6. The experimental results of the ISPRS Vaihingen dataset.
Figure 7. Semantic segmentation results of MKLANet and comparison on UAVid dataset.
Figure 8. Semantic segmentation results of LoveDA dataset.
Figure 9. Visualization comparisons of the Potsdam dataset.
Figure 10. Experimental results of ablation study on different datasets. Red and gray boxes are added to all subfigures to highlight the differences of different methods.
20 pages, 6191 KiB  
Article
Research on High-Precision Gas Concentration Inversion for Imaging Fourier Transform Spectroscopy Based on Multi-Scale Feature Attention Model
by Jianhao Luo, Wei Zhao, Feipeng Ouyang, Kaiyang Sheng and Shurong Wang
Appl. Sci. 2025, 15(5), 2438; https://doi.org/10.3390/app15052438 - 25 Feb 2025
Viewed by 203
Abstract
The accurate monitoring of greenhouse gas (GHG) concentrations is crucial in mitigating global warming. The imaging Fourier transform spectrometer (IFTS) is an effective tool for measuring GHG concentrations, offering high throughput and a wide spectral measurement range. To address the issue of spectral inconsistency during target gas detection, which is influenced by external environmental factors and makes it difficult to achieve high-precision gas concentration inversion, this paper proposes a multi-scale feature attention (MDISE) model. The model uses a multi-scale dilated convolution (MD) module to retain both global and local shallow features of the spectra; introduces the one-dimensional Inception (1D Inception) module to further extract multi-scale deep features; and incorporates the channel attention mechanism (SE) module to enhance attention to important spectral wavelengths, suppressing redundant and interfering information. A target gas detection system was built in the laboratory, and the proposed model was tested on gas samples collected by two channels of a short and medium-wavelength infrared imaging Fourier transform spectrometer (SMWIR-IFTS). The experimental results show that the MDISE model reduces the root mean square error (RMSE) by 79.14%, 76.59%, and 69.80% in the first channel and by 81.45%, 82.65%, and 74.01% in the second channel, compared to the partial least squares regression (PLSR), support vector regression (SVR), and conventional one-dimensional convolutional neural network (1D-CNN) models, respectively. Additionally, the MDISE model achieved average coefficient of determination (R²) values of 0.997 and 0.995 for the concentration intervals in both channels. The MDISE model demonstrates excellent performance and significantly improves the accuracy of GHG concentration inversion. Full article
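The MDISE model combines a multi-scale dilated convolution (MD) module with SE channel attention on 1D spectra. The sketch below wires generic versions of those two pieces together for a one-dimensional spectrum; kernel sizes, dilation rates, and the reduction ratio are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class SE1d(nn.Module):
    """Squeeze-and-excitation over the channel dimension of a 1D feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, L)
        w = self.fc(x.mean(dim=-1))            # squeeze over the spectral axis
        return x * w.unsqueeze(-1)             # re-weight channels

class MDSEBlock(nn.Module):
    """Multi-scale dilated 1D convolutions followed by SE channel attention."""
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates])
        self.se = SE1d(out_ch * len(rates))

    def forward(self, x):                      # x: (B, in_ch, L) spectrum
        return self.se(torch.cat([b(x) for b in self.branches], dim=1))

spectrum = torch.randn(4, 1, 256)              # batch of 1-channel spectra
print(MDSEBlock(1, 16)(spectrum).shape)        # torch.Size([4, 48, 256])
```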
Show Figures

Figure 1. (a) Internal core structure of SMWIR-IFTS; (b) external 3D structure of SMWIR-IFTS.
Figure 2. Flowchart of the overall processing.
Figure 3. (a) Interferogram data cube; (b) interference image sequence; (c) baseline of interferometric intensity sequence; (d) apodization of interferometric intensity sequence; (e) phase correction of interferometric intensity sequence; (f) reconstructed radiance spectrum of each temperature sampling point.
Figure 4. Schematic diagram of three-layer radiance transfer model.
Figure 5. Overall structure of MDISE model.
Figure 6. Structure of (a) MD module; (b) 1D Inception module; and (c) SE module.
Figure 7. Target gas detection system (a) schematic diagram; (b) experimental scene.
Figure 8. (a) Gas radiance spectra; (b) 90% CO₂ absorbance spectrum collected by the 1st filter channel at a blackbody temperature of 383 K.
Figure 9. (a) Gas radiance spectra; (b) 90% CO₂ absorbance spectrum collected by the 2nd filter channel at a blackbody temperature of 408 K.
Figure 10. CO₂ absorbance spectra collected by 1st filter channel at blackbody temperatures of (a) 353 K; (b) 373 K; and (c) 383 K.
Figure 11. CO₂ absorbance spectra collected by 2nd filter channel at blackbody temperatures of (a) 353 K; (b) 373 K; and (c) 408 K.
Figure 12. Regression prediction scatter plots of (a) PLSR; (b) SVR; (c) 1D-CNN; and (d) MDISE models in 1st filter channel at a blackbody temperature of 353 K.
Figure 13. Regression prediction scatter plots of (a) PLSR; (b) SVR; (c) 1D-CNN; and (d) MDISE models in 2nd filter channel at a blackbody temperature of 353 K.
Figure 14. Visualization of the feature weights generated by (a) Base model; (b) 1D Inception model; (c) MD-1D Inception model; and (d) MDISE model.
26 pages, 15621 KiB  
Article
Integrated Convolution and Attention Enhancement-You Only Look Once: A Lightweight Model for False Estrus and Estrus Detection in Sows Using Small-Target Vulva Detection
by Yongpeng Duan, Yazhi Yang, Yue Cao, Xuan Wang, Riliang Cao, Guangying Hu and Zhenyu Liu
Animals 2025, 15(4), 580; https://doi.org/10.3390/ani15040580 - 18 Feb 2025
Viewed by 299
Abstract
Accurate estrus detection and optimal insemination timing are crucial for improving sow productivity and enhancing farm profitability in intensive pig farming. However, sows’ estrus typically lasts only 48.4 ± 1.0 h, and interference from false estrus further complicates detection. This study proposes an enhanced YOLOv8 model, Integrated Convolution and Attention Enhancement (ICAE), for vulvar detection to identify the estrus stages. This model innovatively divides estrus into three phases (pre-estrus, estrus, and post-estrus) and distinguishes five different estrus states, including pseudo-estrus. ICAE-YOLO integrates the Convolution and Attention Fusion Module (CAFM) and Dual Dynamic Token Mixing (DDTM) for improved feature extraction, Dilation-wise Residual (DWR) for expanding the receptive field, and Focaler-Intersection over Union (Focaler-IoU) for boosting the performance across various detection tasks. To validate the model, it was trained and tested on a dataset of 6402 sow estrus images and compared with YOLOv8n, YOLOv5n, YOLOv7tiny, YOLOv9t, YOLOv10n, YOLOv11n, and the Faster R-CNN. The results show that ICAE-YOLO achieves an mAP of 93.4%, an F1-Score of 92.0%, GFLOPs of 8.0, and a model size of 4.97 M, reaching the highest recognition accuracy among the compared models, while maintaining a good balance between model size and performance. This model enables accurate, real-time estrus monitoring in complex, all-weather farming environments, providing a foundation for automated estrus detection in intensive pig farming. Full article
(This article belongs to the Special Issue Animal Health and Welfare Assessment of Pigs)
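Focaler-IoU, used in the loss above, re-maps the IoU onto an interval [d, u] so training can emphasize easy or hard samples before a 1 − IoU style loss is applied. The snippet below shows one published formulation of that re-mapping; the interval bounds are illustrative, and how ICAE-YOLO combines it with its box regression loss is not shown.

```python
import torch

def focaler_iou(iou, d=0.0, u=0.95):
    """Linearly re-map IoU onto [d, u]: values below d go to 0, above u go to 1.
    Interval bounds here are illustrative, not ICAE-YOLO's tuned values."""
    return torch.clamp((iou - d) / (u - d), min=0.0, max=1.0)

iou = torch.tensor([0.10, 0.50, 0.97])
print(focaler_iou(iou))          # tensor([0.1053, 0.5263, 1.0000])
print(1.0 - focaler_iou(iou))    # a simple loss built from the re-mapped IoU
```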
Show Figures

Figure 1. Data acquisition process. (a) Structure of the pig farm; (b) data collection process.
Figure 2. Data enhancement. (a) Original image; (b) brightness enhancement; (c) contrast enhancement; (d) random flipping; (e) random rotation (with cropping).
Figure 3. Vulva images of sows in true and pseudo-estrus. (a) Pre-estrus sow vulva; (b) vulva of sow in pseudo-estrus due to ZEN intoxication.
Figure 4. Partial images in estrus dataset of sows.
Figure 5. C3 and C2f modules. (a) Module C3; (b) module C2f.
Figure 6. Structure of DDTM Attention module.
Figure 7. Structure of CAFM Attention module.
Figure 8. Structure of DWR module.
Figure 9. ICAEM-YOLOv8 network structure diagram.
Figure 10. Performance improvement effect of each module combination on the algorithm. (a) Comparison of recognition performances; (b) comparison of GFLOPs.
Figure 11. Precision curve, Recall curve, and PR curve of ICAE-YOLO. (a) Precision–confidence curve; (b) Recall–confidence curve; (c) precision–Recall curve.
Figure 12. ICAE-YOLO results.
Figure 13. Effectiveness of ICAE-YOLO in recognizing sows in different estrous states.
Figure 14. Comparison of performances of different models of algorithm. (a) mAP; (b) precision; (c) Recall; (d) F1-Score; (e) GFLOPs; (f) parameters.
Figure 15. Precision, Recall and mAP curves for eight algorithmic iterative processes. (a) Precision; (b) Recall; (c) mAP.
Figure 16. Average recognition accuracy of eight algorithms in each category.
Figure 17. Variation in loss curves during iterations of several algorithms. (a) Box loss; (b) dfl loss; (c) cls loss.
Figure 18. Performance radargrams of the eight algorithms. (a) Comparison of radar map performances; (b) area occupied by each algorithm in radargram.
Figure 19. Sow estrus data collection and identification system workflow.
Figure 20. Conditions leading to model restriction. (a) Extreme light exposure; (b) shading; (c) bacterial vaginitis.
18 pages, 510 KiB  
Article
MCDCNet: Mask Classification Combined with Adaptive Dilated Convolution for Image Semantic Segmentation
by Geng Wei, Junbo Wang, Bingxian Shi, Xiaolin Zhu, Bo Cao and Tong Liu
Appl. Sci. 2025, 15(4), 2012; https://doi.org/10.3390/app15042012 - 14 Feb 2025
Viewed by 293
Abstract
Effectively classifying each pixel in an image is an important research topic in semantic segmentation. Existing methods typically require the network to directly generate a feature map of the same size as the original image and classify each pixel, which makes it difficult for the network to fully leverage the representations from the backbone. To handle this challenge, this paper proposes a method named mask classification combined with an adaptive dilated convolution network (MCDCNet). Firstly, a Vision Transformer (ViT)-based module is employed to capture contextual features as the backbone. Secondly, the Spatial Extraction Module (SEM) is proposed to extract multi-scale spatial information through adaptive dilated convolution while preserving the original feature size. This spatial information is then integrated into the corresponding contextual features to enhance the representation. Finally, a novel inference process is proposed that incorporates the instance activation map (IAM)-based decoder for semantic segmentation, thereby enhancing the network’s capability to capture and comprehend semantic features. The experimental results demonstrate that our network significantly outperforms other per-pixel classification networks across several semantic segmentation datasets. In particular, on Cityscapes, MCDCNet achieves 80.3 mIoU with 11.8 M parameters, demonstrating that the network is able to deliver a strong segmentation performance while maintaining a relatively low parameter count. Full article
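The SEM extracts multi-scale spatial information with adaptive dilated convolutions "while preserving the original feature size"; for a stride-1 convolution with an odd kernel size k and dilation d, this only requires padding p = d·(k − 1)/2. The quick check below verifies that rule in PyTorch; it is a generic property of dilated convolutions, not MCDCNet's actual module.

```python
import torch
import torch.nn as nn

def same_size_dilated_conv(channels, kernel_size, dilation):
    """Stride-1 dilated conv whose output keeps the input H x W:
    padding = dilation * (kernel_size - 1) // 2 (odd kernel sizes)."""
    pad = dilation * (kernel_size - 1) // 2
    return nn.Conv2d(channels, channels, kernel_size, padding=pad, dilation=dilation)

x = torch.randn(1, 16, 40, 40)
for d in (1, 2, 3, 6, 12):
    assert same_size_dilated_conv(16, 3, d)(x).shape == x.shape
print("spatial size preserved for all dilation rates")
```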
Show Figures

Figure 1. The mIoU versus parameters on Cityscapes. Seg50, Seg75, and Seg100 represent input image resolutions of 1536 × 768, 1024 × 512, and 2048 × 1024, respectively. The orange triangle represents our model, while the blue circles represent others.
Figure 2. The proposed architecture of MCDCNet. The bottom shows the Spatial Extraction Module (SEM) and inference structure proposed in this paper.
Figure 3. Comparison of inference structures of three types of networks. (a–c) refer to the inference processes used in the traditional semantic segmentation network, mask classification network, and MCDCNet, respectively.
Figure 4. A ViT-based module with MSCA and FFN as the core components.
Figure 5. Visualization results on ADE20k. We compare MCDCNet with TopFormer, SegNeXt-tiny, and SegNeXt-small in the two images above. The top image shows that MCDCNet predicts large areas more accurately. The bottom image demonstrates that MCDCNet classifies categories with similar features more accurately.
Figure 6. Visualization results on ADE20k. We visualized the ablation experiments for the network architecture component. There are three network structures in total: MSCA, MSCA + SEM, and MSCA + SEM + IAM. The visualization results effectively demonstrate that the network structure of MCDCNet is optimal.
46 pages, 13796 KiB  
Review
Measurement Techniques for Interfacial Rheology of Surfactant, Asphaltene, and Protein-Stabilized Interfaces in Emulsions and Foams
by Ronald Marquez and Jean-Louis Salager
Colloids Interfaces 2025, 9(1), 14; https://doi.org/10.3390/colloids9010014 - 14 Feb 2025
Viewed by 617
Abstract
This work provides a comprehensive review of experimental methods used to measure rheological properties of interfacial layers stabilized by surfactants, asphaltenes, and proteins that are relevant to systems with large interfacial areas, such as emulsions and foams. Among the shear methods presented, the deep channel viscometer, bicone rheometer, and double-wall ring rheometers are the most utilized. On the other hand, the main dilational rheology techniques discussed are surface waves, capillary pressure, oscillating Langmuir trough, oscillating pendant drop, and oscillating spinning drop. Recent developments—including machine learning and artificial intelligence (AI) models, such as artificial neural networks (ANN) and convolutional neural networks (CNN)—to calculate interfacial tension from drop shape analysis in shorter times and with higher precision are critically analyzed. Additionally, configurations involving an Atomic Force Microscopy (AFM) cantilever contacting a bubble, a microtensiometer platform, rectangular and radial Langmuir troughs, and high-frequency oscillation drop setups are presented. The significance of Gibbs–Marangoni effects and interfacial rheological parameters on the (de)stabilization of emulsions is also discussed. Finally, a critical review of the recent literature on the measurement of interfacial rheology is presented. Full article
(This article belongs to the Special Issue Rheology of Complex Fluids and Interfaces)
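For the oscillating-area techniques surveyed here (oscillating pendant drop, oscillating spinning drop, oscillating Langmuir trough), the measured quantity is the complex dilational viscoelastic modulus obtained from the interfacial tension response to a small sinusoidal area perturbation. The standard definition is recalled below for orientation; the notation is generic and not taken from this review.

```latex
% Complex dilational modulus from an area oscillation A(t) = A_0 + \Delta A\,\sin(\omega t),
% with the tension response \gamma(t) lagging the area by a phase angle \delta:
E^{*}(\omega) \;=\; \frac{\mathrm{d}\gamma}{\mathrm{d}\ln A}
            \;=\; E'(\omega) + i\,E''(\omega)
            \;=\; |E|\cos\delta \;+\; i\,|E|\sin\delta ,
\qquad |E| \;=\; \frac{\Delta\gamma}{\Delta A / A_{0}} .
```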
Show Figures

Graphical abstract

Graphical abstract
Full article ">Figure 1
<p>The two types of interfacial rheology measurement in a limited flat region: (<b>A</b>) Shear interfacial rheology—characterized by a constant interfacial area during deformation—and (<b>B</b>) dilatational (simplified as dilational or compression) interfacial rheology, with a variable interfacial area while maintaining shape. Arrows indicate the change after deformation is imposed.</p>
Full article ">Figure 2
<p>Shear strain σ, the stress response τ, and δ for an oscillatory shear test.</p>
Full article ">Figure 3
<p>Oscillatory variation of the interfacial area and the interfacial tension for periodic measurement of oscillatory deformation.</p>
Full article ">Figure 4
<p>Deep channel viscometer. The internal and external cylinders are separated by a distance y<sub>0</sub>, and the base rotates with an angular velocity ω<sub>0</sub>, creating a gap δ through which the liquid flows. The inner fixed cylinder and the outer fixed cylinder define the channel where the liquid is located. The rotating shaft at the base, turning at angular velocity ω<sub>0</sub>, generates shear flow within the liquid as it moves through the gap δ. Reproduced from [<a href="#B73-colloids-09-00014" class="html-bibr">73</a>].</p>
Full article ">Figure 5
<p>Schematic configuration of the bicone interfacial rheometer [<a href="#B76-colloids-09-00014" class="html-bibr">76</a>]. The rotating bicone probe is placed at the interface between Liquid 1 and Liquid 2, allowing for the characterization of the shear and dilational properties at the liquid–liquid interface. The height vertical positions z = 0, z = H<sub>1</sub>, and z = H<sub>2</sub> represent the levels of the two liquid phases, while the radii R<sub>1</sub> and R<sub>2</sub> denote the distances corresponding to the container and the bicone, respectively. The motor transducer controls the angular velocity and detects the response of the interfacial film. α is the cone angle in the range of 5° to 10° for most configurations. Reproduced from [<a href="#B104-colloids-09-00014" class="html-bibr">104</a>].</p>
Full article ">Figure 6
<p>Schematic configuration of the ring viscometer. The Du Noüy ring is placed in the liquid sample, and the device calculates surface tension based on the force required to detach the ring from the liquid interface (Red represents undefined conditions).</p>
Full article ">Figure 7
<p>Double-ring interfacial shear rheometer configuration: (<b>A</b>) Laboratory setup. (<b>B</b>) Geometry with a double-wall and circular ring. Reproduced from [<a href="#B109-colloids-09-00014" class="html-bibr">109</a>].</p>
Full article ">Figure 8
<p>Main components of the magnetic rod interfacial stress rheometer setup. (<b>A</b>) A magnetized rod is suspended at the air–water interface within a Langmuir trough, where surface tension holds it in place. (<b>B</b>) Surrounding the trough are Helmholtz coils, which generate a magnetic field gradient that exerts a force on the rod, causing it to shear the interfacial film. The rod’s movement is primarily influenced by shear stress interactions between the rod and the adjacent glass slides. Reproduced from [<a href="#B95-colloids-09-00014" class="html-bibr">95</a>].</p>
Full article ">Figure 9
<p>Frequency ranges for different methods used in measuring dilational surface elasticity (E). Adapted from [<a href="#B4-colloids-09-00014" class="html-bibr">4</a>].</p>
Full article ">Figure 10
<p>Longitudinal wave measurement equipment. Waves are generated through an electric or mechanical generator, and the motion is recorded via an optical system [<a href="#B101-colloids-09-00014" class="html-bibr">101</a>]. A barrier and a tracer particle, observed through a microscope, help track the movement of the waves. Reproduced from [<a href="#B28-colloids-09-00014" class="html-bibr">28</a>].</p>
Full article ">Figure 11
<p>(<b>A</b>) Oscillating pendant drop interfacial rheology equipment. (<b>B</b>) The drop volume is controlled to vary the area in a sinusoidal manner, the drop profile is recorded with a digital camera, and results are obtained using data analysis software [<a href="#B27-colloids-09-00014" class="html-bibr">27</a>]. Reproduced from [<a href="#B123-colloids-09-00014" class="html-bibr">123</a>].</p>
Full article ">Figure 12
<p>Configurations of the capillary pressure method (<b>A</b>,<b>B</b>). In both configurations, a piezo piston is used to control the liquid flow, and a pressure sensor measures the capillary pressure generated between two immiscible liquids (Liquid 1 and Liquid 2) or between a liquid and air. Reproduced from [<a href="#B124-colloids-09-00014" class="html-bibr">124</a>].</p>
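For reference, the measured capillary pressure is related to the interfacial tension through the Young–Laplace equation; for a spherical meniscus of radius R between the two phases (a standard relation, not specific to the configurations shown):
\[
\Delta P = P_{\mathrm{inside}} - P_{\mathrm{outside}} = \frac{2\gamma}{R} .
\]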
Full article ">Figure 13
<p>Deep neural network architecture employed in pendant drop tensiometry for predicting surface tension from the shape of a pendant drop. The network takes as input a vectorized representation of the drop’s shape, consisting of discrete points sampled along its contour, with both radial and vertical coordinates. The network architecture begins with 226 input neurons, processing the shape information through a series of fully connected hidden layers with varying neuron counts (512, 1024, 256, and 16). Reproduced from [<a href="#B126-colloids-09-00014" class="html-bibr">126</a>].</p>
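A minimal sketch of a network with the layer sizes quoted in this caption is given below. PyTorch, ReLU activations, and a single-output regression head are assumptions for illustration; none of these details are specified in the listing.

import torch
import torch.nn as nn

# Hedged sketch of the fully connected architecture quoted in the caption:
# 226 inputs (sampled radial and vertical contour coordinates) followed by
# hidden layers of 512, 1024, 256, and 16 neurons. The activations and the
# single-output regression head are illustrative assumptions.
drop_shape_mlp = nn.Sequential(
    nn.Linear(226, 512), nn.ReLU(),
    nn.Linear(512, 1024), nn.ReLU(),
    nn.Linear(1024, 256), nn.ReLU(),
    nn.Linear(256, 16), nn.ReLU(),
    nn.Linear(16, 1),  # predicted surface tension (or a related shape parameter)
)

contour = torch.randn(8, 226)          # batch of vectorized drop contours
prediction = drop_shape_mlp(contour)   # shape: (8, 1)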
Full article ">Figure 14
<p>Convolutional neural network architecture for axisymmetric drop shape analysis and surface tension prediction. Reproduced from [<a href="#B127-colloids-09-00014" class="html-bibr">127</a>].</p>
Full article ">Figure 15
<p>Workflow for predicting pendant drop surface tension using a convolutional neural network (CNN), starting from the generation of drop profiles with the Young–Laplace equation. (<b>a</b>–<b>h</b>) are described in the body of the article. Reproduced from [<a href="#B125-colloids-09-00014" class="html-bibr">125</a>].</p>
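The profile-generation step of such a workflow can be sketched by integrating the dimensionless Young–Laplace (Bashforth–Adams) equations for an axisymmetric drop. The snippet below is a minimal illustration under assumed conventions (sign of the gravity term, apex curvature scaled to 1), not the exact procedure of the cited work.

import numpy as np
from scipy.integrate import solve_ivp

def drop_profile(bond, s_max=4.0, n=400):
    """Integrate the dimensionless Young-Laplace equations along the arc
    length s to obtain a synthetic axisymmetric drop contour for a given
    Bond number. Lengths are in units of the apex radius of curvature."""
    def rhs(s, y):
        x, z, phi = y
        # Apex limit: sin(phi)/x -> 1 as s -> 0 in these dimensionless units.
        curvature = np.sin(phi) / x if x > 1e-8 else 1.0
        return [np.cos(phi), np.sin(phi), 2.0 - bond * z - curvature]
    s_eval = np.linspace(1e-8, s_max, n)
    sol = solve_ivp(rhs, (1e-8, s_max), [1e-8, 0.0, 0.0],
                    t_eval=s_eval, rtol=1e-8)
    return sol.y[0], sol.y[1]   # radial (x) and vertical (z) coordinates

# Profiles computed for a range of Bond numbers could be rasterized into
# synthetic drop images and paired with their known parameters to build a
# training set for the CNN.
x, z = drop_profile(bond=0.3)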
Full article ">Figure 16
<p>Recent version of the oscillating spinning drop rheometer ULA. (<b>A</b>) Digital camera; (<b>B</b>) Microscope; (<b>C</b>) Rotating chamber housing the capillary; (<b>D</b>) Main body with digital display for rotational speed and capillary temperature control; (<b>E</b>) Computer and software for data acquisition and processing. Reproduced from [<a href="#B88-colloids-09-00014" class="html-bibr">88</a>].</p>
Full article ">Figure 17
<p>(<b>a</b>) Schematic diagram of the experimental setup: an air bubble is deposited onto a PMMA (polymethyl methacrylate) substrate that is immersed in an aqueous solution containing surfactant molecules (Triton X-100). The bubble is stabilized and a cantilever from an AFM (Atomic Force Microscope) makes contact with the bubble at its apex. (<b>b</b>) Thermal spectrum of the bubble, immersed in a 3 × 10<sup>−4</sup> mM Triton X-100 solution. Reproduced from [<a href="#B137-colloids-09-00014" class="html-bibr">137</a>].</p>
Full article ">Figure 18
<p>Methods used to analyze the interfacial stress contributions for asphaltene nanoaggregates at oil–water interfaces. The left panel presents a representation of the asphaltene nanoaggregates adsorbed at the interface between hexadecane and Milli-Q water. Three experimental setups are depicted: planar compression using a Langmuir trough, compressional and dilational rheology using a radial trough, and shear rheology with a double-wall ring (DWR) apparatus. Reproduced from [<a href="#B139-colloids-09-00014" class="html-bibr">139</a>].</p>
Full article ">Figure 19
<p>(<b>a</b>) Microtensiometer platform that uses a capillary tensiometer method. The setup includes an inline pressure transducer capable of measuring pressures below 5000 Pa. It is equipped with a DC motor that oscillates the pressure, enabling the study of dilational rheology. (<b>b</b>) Microbutton device designed for experiments involving water-on-oil interfaces. The water droplet sits on a cover glass, which is placed over oil in a confined environment for precise control of the interface. Reproduced from [<a href="#B138-colloids-09-00014" class="html-bibr">138</a>].</p>
Full article ">Figure 20
<p>Interfacial rheology at the crude oil–water interface in the presence of salts. (<b>a</b>) Formation of a viscoelastic film at the interface after 24 h of contact with deionized water. (<b>b</b>) Dilational modulus (E<sup>s′</sup>, squares, left axis) and shear modulus (G<sup>s′</sup>, solid lines, right axis) evolve as a function of time, representing the reorganization and formation of the interfacial “skin”. Reproduced from [<a href="#B136-colloids-09-00014" class="html-bibr">136</a>].</p>
Full article ">Figure 21
<p>Property changes near the optimum formulation (HLD = 0). (<b>a</b>) Interfacial tension, (<b>b</b>) emulsion type and inversion, (<b>c</b>) emulsion stability, (<b>d</b>) droplet size, (<b>e</b>) emulsion viscosity, and (<b>f</b>) dilational modulus. Reproduced from [<a href="#B162-colloids-09-00014" class="html-bibr">162</a>].</p>
Full article ">Figure 22
<p>Interfacial dilational elasticity (E) and phase behavior as a function of salinity and time for the sodium dodecyl sulfate (SDS) (1 wt%)/n-pentanol (3 wt%)/kerosene/brine system. (<b>A</b>) The salinity scan shows dilational modulus E across Winsor I, Winsor III, and Winsor II regions, with a pronounced minimum at the optimum formulation (5.2 wt% NaCl, HLD = 0), coinciding with the formation of a bicontinuous microemulsion middle phase M. (<b>B</b>) Dynamic E measurements for oil-in-water (O-W) and microemulsion-in-water (M-W) configurations at 5.2 wt% NaCl showing the fast exchanges of interfacial properties in M-W systems, attributable to the microemulsion acting as a surfactant reservoir which makes Marangoni effects almost negligible. Adapted from [<a href="#B166-colloids-09-00014" class="html-bibr">166</a>].</p>
Full article ">Figure 23
<p>Schematic of foam stability as a function of pressure, showing the behavior of CO<sub>2</sub> foam transitioning from a gaseous phase to a dense-phase regime. In the dense-phase CO<sub>2</sub> region (shaded zone), foam stability decreases significantly due to lamella destabilization and bubble coalescence, according to the literature reports [<a href="#B190-colloids-09-00014" class="html-bibr">190</a>].</p>
Full article ">
18 pages, 1807 KiB  
Article
3DVT: Hyperspectral Image Classification Using 3D Dilated Convolution and Mean Transformer
by Xinling Su and Jingbo Shao
Photonics 2025, 12(2), 146; https://doi.org/10.3390/photonics12020146 - 11 Feb 2025
Viewed by 434
Abstract
Hyperspectral imaging and laser technology both rely on different wavelengths of light to analyze the characteristics of materials, revealing their composition, state, or structure through precise spectral data. In hyperspectral image (HSI) classification tasks, the limited number of labeled samples and the lack [...] Read more.
Hyperspectral imaging and laser technology both rely on different wavelengths of light to analyze the characteristics of materials, revealing their composition, state, or structure through precise spectral data. In hyperspectral image (HSI) classification tasks, the limited number of labeled samples and the lack of feature extraction diversity often lead to suboptimal classification performance. Furthermore, traditional convolutional neural networks (CNNs) primarily focus on local features in hyperspectral data, neglecting long-range dependencies and global context. To address these challenges, this paper proposes a novel model that combines CNNs with an average pooling Vision Transformer (ViT) for hyperspectral image classification. The model utilizes three-dimensional dilated convolution and two-dimensional convolution to extract multi-scale spatial–spectral features, while the ViT captures global features and long-range dependencies in the hyperspectral data. Unlike the traditional ViT encoder, which uses linear projection, our model replaces it with average pooling projection. This change strengthens local feature extraction and compensates for the ViT encoder’s limitations in this respect. The hybrid approach combines the local feature extraction strengths of CNNs with the long-range dependency modeling of Transformers, significantly improving overall performance in hyperspectral image classification tasks. The proposed method also holds promise for the classification of fiber laser spectra, where high precision is crucial for distinguishing between different fiber laser characteristics. Experimental results demonstrate that the CNN-Transformer model substantially improves classification accuracy on three benchmark hyperspectral datasets, with overall accuracies of 99.35%, 99.31%, and 99.66% on the IP, PU, and SV datasets, respectively. These advancements offer potential benefits for applications such as high-performance optical fiber sensing, laser medicine, and environmental monitoring, where accurate spectral classification is essential. Full article
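A minimal PyTorch sketch of the kind of hybrid described here is given below. It illustrates the general pattern only (a 3D dilated convolution over the spectral–spatial cube, a 2D convolution, an average-pooling patch embedding in place of linear projection, and a Transformer encoder); all layer sizes, dilation rates, and the pooling configuration are illustrative assumptions and do not reproduce the published 3DVT architecture.

import torch
import torch.nn as nn

class DilatedConvViTSketch(nn.Module):
    """Illustrative CNN + Transformer hybrid for HSI patch classification."""
    def __init__(self, bands=30, patch=13, embed_dim=64, num_classes=16):
        super().__init__()
        # 3D dilated convolution over (spectral, H, W); depth padding chosen
        # so the number of bands is preserved.
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), dilation=(2, 1, 1),
                      padding=(6, 1, 1)),
            nn.BatchNorm3d(8), nn.ReLU(),
        )
        self.conv2d = nn.Sequential(
            nn.Conv2d(8 * bands, embed_dim, kernel_size=3, padding=1),
            nn.BatchNorm2d(embed_dim), nn.ReLU(),
        )
        # Average-pooling projection instead of the usual linear patch embedding.
        self.pool_embed = nn.AvgPool2d(kernel_size=3, stride=2, padding=1)
        encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                 # x: (B, 1, bands, patch, patch)
        f = self.conv3d(x)                # (B, 8, bands, patch, patch)
        b, c, d, h, w = f.shape
        f = self.conv2d(f.reshape(b, c * d, h, w))   # (B, embed_dim, H, W)
        f = self.pool_embed(f)                       # downsampled token grid
        tokens = f.flatten(2).transpose(1, 2)        # (B, N, embed_dim)
        z = self.encoder(tokens).mean(dim=1)         # global token average
        return self.head(z)

logits = DilatedConvViTSketch()(torch.randn(2, 1, 30, 13, 13))  # (2, 16)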
(This article belongs to the Special Issue Advanced Fiber Laser Technology and Its Application)
Show Figures

Figure 1

Figure 1
<p>Overall framework of the 3DVT network model.</p>
Full article ">Figure 2
<p>ViT encoder with average pooling projection.</p>
Full article ">
30 pages, 16247 KiB  
Article
A Scale-Invariant Looming Detector for UAV Return Missions in Power Line Scenarios
by Jiannan Zhao, Qidong Zhao, Chenggen Wu, Zhiteng Li and Feng Shuang
Biomimetics 2025, 10(2), 99; https://doi.org/10.3390/biomimetics10020099 - 10 Feb 2025
Viewed by 413
Abstract
Unmanned aerial vehicles (UAVs) offer an efficient solution for power grid maintenance, but collision avoidance during return flights is challenged by crossing power lines, especially for small drones with limited computational resources. Conventional visual systems struggle to detect thin, intricate power lines, which [...] Read more.
Unmanned aerial vehicles (UAVs) offer an efficient solution for power grid maintenance, but collision avoidance during return flights is challenged by crossing power lines, especially for small drones with limited computational resources. Conventional visual systems struggle to detect thin, intricate power lines, which are often overlooked or misinterpreted. While deep learning methods have improved static power line detection in images, they still struggle in dynamic scenarios where collision risks must be detected in real time. Inspired by the hypothesis that the Lobula Giant Movement Detector (LGMD) distinguishes sparse, incoherent background motion from the continuous, clustered motion contours of a looming object, we propose a Scale-Invariant Looming Detector (SILD). SILD detects motion by preprocessing video frames, enhances motion regions using attention masks, and simulates biological arousal to recognize looming threats while suppressing noise. It also predicts impending collisions during high-speed flight and overcomes the limitations of motion vision to maintain consistent sensitivity to looming objects at different scales. We compare SILD with existing static power line detection techniques, including the Hough transform and D-LinkNet with a dilated convolution-based encoder–decoder architecture. Our results show that SILD strikes an effective balance between detection accuracy and real-time processing efficiency, making it well suited for UAV-based power line detection, where high precision and low latency are essential. Furthermore, we evaluated the model under various conditions and successfully deployed it on a UAV-embedded board for collision avoidance testing at power lines. This approach provides a novel perspective for UAV obstacle avoidance in power line scenarios. Full article
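A minimal NumPy/SciPy sketch of the generic excitation–inhibition structure that LGMD-style detectors build on is given below. It illustrates the idea only (frame differencing, lateral inhibition, optional attention masking, and a summed membrane potential); it is not the SILD implementation, and all parameter values and the attention mask are assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def lgmd_like_response(prev_frame, curr_frame, attention_mask=None,
                       inhibition_sigma=1.5, inhibition_gain=0.6,
                       threshold=0.1):
    """Illustrative LGMD-style looming cue computed from two grayscale frames.

    Frame differencing stands in for the P layer, a Gaussian blur stands in
    for the lateral-inhibition (DPC-like) spreading, an optional attention
    mask emphasizes line-like regions, and the normalized sum acts as a
    membrane potential that rises for clustered, expanding motion contours.
    """
    # Luminance change between consecutive frames (excitation)
    p = np.abs(curr_frame.astype(float) - prev_frame.astype(float))

    # Lateral inhibition: subtract a spatially spread copy so that sparse,
    # incoherent motion cancels while clustered contours survive.
    s = np.maximum(p - inhibition_gain * gaussian_filter(p, inhibition_sigma), 0.0)

    # Optional attention mask (e.g., emphasizing elongated, line-like regions).
    if attention_mask is not None:
        s = s * attention_mask

    mp = s.sum() / s.size          # membrane-potential-like scalar
    return mp, mp > threshold      # warning flag if the potential is high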
Show Figures

Figure 1

Figure 1
<p>(<b>a</b>) Example of a complex power line detection scenario in the real world. (<b>b</b>) A simplified abstraction of the complex power line detection scenario for unmanned aerial vehicles (UAVs). Key elements are split into four categories: The tracking lines (green), obstacles (blue), the horizontal lines (red), and the background (unlabeled).</p>
Full article ">Figure 2
<p>Flow diagram of the proposed visual system sensible to normal-size objects and small targets. It mainly consists of three modules: a preprocessing module (<b>top left</b>), an attention module (<b>top right</b>), and a Lobula Giant Movement Detector (LGMD)-based module (<b>bottom right</b>). There are two signal processing loops; the solid black line represents the workflow of the main processing loop, which extracts looming threat information via image velocity. The dotted red line denotes the attention loop to acquire the preferred attention to power lines. At the start, the feedback loop is processed before the looming detection loop. Note that the power lines are at the left top of images within the preprocessing module, and the small red cubes in the LGMD with the introduction of the distributed presynaptic connection mechanism (D-LGMD) module denote the kernel designed for looming power lines, which was proposed in our previous work [<a href="#B25-biomimetics-10-00099" class="html-bibr">25</a>]. * Sequence: Capturing image motion by recording changes in luminance within the field of view.</p>
Full article ">Figure 3
<p>Illustration of location-induced uneven sensibility. One object with off-center distance <span class="html-italic">e</span> approaches the camera (locust) with forward velocity <span class="html-italic">v</span>. The initial state of the looming object is <math display="inline"><semantics> <mrow> <mo>(</mo> <mi>θ</mi> <mo>,</mo> <mi>d</mi> <mo>)</mo> </mrow> </semantics></math>, and <math display="inline"><semantics> <mrow> <mo>(</mo> <mi>θ</mi> <mo>(</mo> <mi>t</mi> <mo>)</mo> <mo>,</mo> <mi>d</mi> <mo>(</mo> <mi>t</mi> <mo>)</mo> </mrow> </semantics></math>) is the state of the looming object at time <span class="html-italic">t</span>.</p>
Full article ">Figure 4
<p>Neural network schematic of the proposed visual system. The network consists of four neural layers in sequence: retina, photoreceptor, DPC, and G layer.</p>
Full article ">Figure 5
<p>(<b>a</b>) Three-dimensional and (<b>b</b>) vertical view of an attention kernel <math display="inline"><semantics> <mrow> <msub> <mi>W</mi> <mi>A</mi> </msub> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>,</mo> <mi>σ</mi> <mo>,</mo> <mi>θ</mi> <mo>,</mo> <mi>ξ</mi> <mo>)</mo> </mrow> </mrow> </semantics></math>, where <math display="inline"><semantics> <mrow> <mi>A</mi> <mo>=</mo> <mn>2</mn> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>B</mi> <mo>=</mo> <mn>3</mn> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>σ</mi> <mo>=</mo> <mn>2.0</mn> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>θ</mi> <mo>=</mo> </mrow> </semantics></math> 45°, and <math display="inline"><semantics> <mrow> <mi>ξ</mi> <mo>=</mo> <mn>0.5</mn> </mrow> </semantics></math>.</p>
Full article ">Figure 6
<p>Experiment to verify location-induced uneven sensibility. (<b>a</b>) An experimental scene built on AirSim. (<b>b</b>) Two-dimensional illustrational description of the experimental scene settings in (<b>a</b>). (<b>c</b>) P-layer output heatmap of the original D-LGMD model. (<b>d</b>) Original D-LGMD model output <math display="inline"><semantics> <mrow> <mover accent="true"> <mi>G</mi> <mo stretchy="false">^</mo> </mover> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>,</mo> <mi>t</mi> <mo>)</mo> </mrow> </mrow> </semantics></math> heatmap.</p>
Full article ">Figure 7
<p>Experiment settings built on AirSim, where a drone flies toward three black square objects of the same size. Note that the square in the center also moves toward the drone to generate strong looming stimuli, while the off-center cubes are static.</p>
Full article ">Figure 8
<p>Ablation experiment of location correction. (<b>a</b>) Input signal <math display="inline"><semantics> <mrow> <mi>L</mi> <mo>(</mo> <mi>x</mi> <mo>,</mo> <msub> <mi>y</mi> <mn>0</mn> </msub> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>. Comparative results of (<b>b</b>) P-layer output <math display="inline"><semantics> <mrow> <mi>P</mi> <mo>(</mo> <mi>x</mi> <mo>,</mo> <msub> <mi>y</mi> <mn>0</mn> </msub> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>, (<b>c</b>) DPC-layer output <math display="inline"><semantics> <mrow> <mi>S</mi> <mo>(</mo> <mi>x</mi> <mo>,</mo> <msub> <mi>y</mi> <mn>0</mn> </msub> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>, and (<b>d</b>) G-layer output <math display="inline"><semantics> <mrow> <mover accent="true"> <mi>G</mi> <mo stretchy="false">^</mo> </mover> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <msub> <mi>y</mi> <mn>0</mn> </msub> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </mrow> </semantics></math> with and without location correction.</p>
Full article ">Figure 9
<p>Comparison of G-layer output <math display="inline"><semantics> <mrow> <mover accent="true"> <mi>G</mi> <mo stretchy="false">^</mo> </mover> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </mrow> </semantics></math> between the D-LGMD model (<b>a</b>) without and (<b>b</b>) with the correction function. Without location correction, the intense response to the non-central squares misleads the D-LGMD into false alarms.</p>
Full article ">Figure 10
<p>Input image at time <math display="inline"><semantics> <msub> <mi>t</mi> <mn>0</mn> </msub> </semantics></math>, where there are two power lines at the front and two cubes in the background as interfering noise, and a UAV flies towards the scene.</p>
Full article ">Figure 11
<p>Comparison between the proposed model with and without the attention module. (<b>a</b>) Input signal <math display="inline"><semantics> <mrow> <mi>L</mi> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>. (<b>b</b>) Comparison of retina output <math display="inline"><semantics> <mrow> <mi>R</mi> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>. (<b>c</b>) Comparison of the P-layer output <math display="inline"><semantics> <mrow> <mi>P</mi> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>. (<b>d</b>) P-layer output <math display="inline"><semantics> <mrow> <mi>P</mi> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math> of the proposed model without the attention module solely. (<b>e</b>) Comparison of the DPC-layer output <math display="inline"><semantics> <mrow> <mi>S</mi> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </semantics></math>. (<b>f</b>) Comparison of the G-layer output <math display="inline"><semantics> <mrow> <mover accent="true"> <mi>G</mi> <mo stretchy="false">^</mo> </mover> <mrow> <mo>(</mo> <msub> <mi>x</mi> <mn>0</mn> </msub> <mo>,</mo> <mi>y</mi> <mo>,</mo> <msub> <mi>t</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> </mrow> </semantics></math>.</p>
Full article ">Figure 12
<p>Neural response comparison between the D-LGMD model (left green box) and the proposed SILD (right orange box) in the experiment. (<b>a</b>) Grayscale input samples. The comparative output samples of the (<b>b</b>) P layer, (<b>c</b>) DPC layer, and (<b>d</b>) G layer. The whole process lasts 68 frames in the experiment, and the sampled images are at frames 31, 44, 56, and 68.</p>
Full article ">Figure 13
<p>Comparative results of (<b>a</b>) normalized output MP and (<b>b</b>) unnormalized output MP between the proposed model and the original D-LGMD model.</p>
Full article ">Figure 14
<p>Input images of three comparison groups under different conditions (<b>top</b>) with the thermogram output from the proposed model (<b>bottom</b>). Group (<b>a</b>) shows two background disturbances of city and snow; group (<b>b</b>) shows two image noise disturbances of falling leaves and rain; and group (<b>c</b>) shows two low-image-texture disturbances of foggy and dark.</p>
Full article ">Figure 15
<p>MP output variation curves for the three comparison groups in <a href="#biomimetics-10-00099-f014" class="html-fig">Figure 14</a>. Figure (<b>a</b>) shows the MP output variation under three background disturbances: city, snow, and plain. Figure (<b>b</b>) illustrates the MP output variation under two image noise disturbances: falling leaves and rain. Figure (<b>c</b>) presents the MP output variation under two low-image-texture disturbances: foggy and dark. In all three groups, the dashed line marks the frame at which the UAV begins to gradually decelerate, and the red triangle marks the spurious signal produced when the model is disturbed by falling leaves.</p>
Full article ">Figure 16
<p>Example results of the four models. The inputs of the first three lines are from real-world datasets, and the input of the last line is generated on the AirSim [<a href="#B34-biomimetics-10-00099" class="html-bibr">34</a>] platform.</p>
Full article ">Figure 17
<p>Experimental results of the proposed system. (<b>a</b>) Experimental site; (<b>b</b>) quadrotor obstacle avoidance system for power line; (<b>c</b>) MP output variation curves during actual flight. The red pentagram shows when the MP value exceeds the threshold (set to 4000), triggering the drone to autonomously hover and land.</p>
Full article ">