
Search Results (692)

Search Parameters:
Keywords = point cloud classification

24 pages, 2264 KiB  
Review
Transforming Architectural Digitisation: Advancements in AI-Driven 3D Reality-Based Modelling
by Kai Zhang and Francesco Fassi
Heritage 2025, 8(2), 81; https://doi.org/10.3390/heritage8020081 - 18 Feb 2025
Abstract
The capture of 3D reality has demonstrated increased efficiency and consistently accurate outcomes in architectural digitisation. Nevertheless, despite advancements in data collection, 3D reality-based modelling still lacks full automation, especially in the post-processing and modelling phase. Artificial intelligence (AI) has been a significant focus, especially in computer vision, and tasks such as image classification and object recognition might be beneficial for the digitisation process and its subsequent utilisation. This study aims to examine the potential outcomes of integrating AI technology into the field of 3D reality-based modelling, with a particular focus on its use in architecture and cultural-heritage scenarios. The main methods used for data collection are laser scanning (static or mobile) and photogrammetry. As a result, image data, including RGB-D data (files containing both RGB colours and depth information) and point clouds, have become the most common raw datasets available for object mapping. This study comprehensively analyses the current use of 2D and 3D deep learning techniques in documentation tasks, particularly downstream applications. It also highlights the ongoing research efforts in developing real-time applications with the ultimate objective of achieving generalisation and improved accuracy. Full article
(This article belongs to the Section Architectural Heritage)
Figure 1: A typical pipeline of DL for object detection.
Figure 2: Chronological overview of 2D-image object detection algorithms, involving convolutional networks [6,16,17,18], hand-engineered features [31,32,33,34,35], two-stage and one-stage detectors [1,19,20,35,36,37,38,39], and attention-based detectors [4,40,41,42].
Figure 3: Chronological overview of 3D object detection algorithms, involving deep learning methods for point cloud [2,3,47–58] and RGB-D data [5,65,69–77] processing.
Figure 4: The proposed semantic photogrammetric pipeline in the work of Stathopoulou et al. [105].
Figure 5: MLMR classification levels (capital details) for the Milan Cathedral from Teruggi et al. [114].
28 pages, 25975 KiB  
Article
Analysis of the Qualitative Parameters of Mobile Laser Scanning for the Creation of Cartographic Works and 3D Models for Digital Twins of Urban Areas
by Ľudovít Kovanič, Patrik Peťovský, Branislav Topitzer, Peter Blišťan and Ondrej Tokarčík
Appl. Sci. 2025, 15(4), 2073; https://doi.org/10.3390/app15042073 - 16 Feb 2025
Viewed by 336
Abstract
This article focuses on the assessment of point clouds obtained by various laser scanning methods as a tool for 3D mapping and Digital Twin concepts. The presented research employed terrestrial and mobile laser scanning methods to obtain high-precision spatial data, enabling efficient spatial documentation of urban structures and infrastructure. As a reference method, static terrestrial laser scanning (TLS) was chosen. Mobile laser scanning (MLS) data obtained by devices such as Lidaretto, the Stonex X120GO laser scanning device, and an iPhone 13 Pro with an Emlid scanning kit and GNSS antenna Reach RX were evaluated. Analyses based on comparing methods of classification, differences in individual objects, detail/density, and noise were performed. The results confirm the high accuracy of the methods and their ability to support the development of digital twins and smart solutions that enhance the efficiency of infrastructure management and planning. Full article
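The comparisons described in this abstract (cloud-to-cloud differences against a TLS reference, point density, and noise) can be reproduced with standard open-source tooling. The sketch below is a minimal illustration using Open3D; the file names, the 1 m density grid, and the outlier-removal parameters are assumptions, not values from the study.

```python
import numpy as np
import open3d as o3d

# Hypothetical file names; the study's actual datasets are not distributed here.
reference = o3d.io.read_point_cloud("tls_reference.ply")   # static TLS scan (reference)
evaluated = o3d.io.read_point_cloud("mls_lidaretto.ply")   # one of the MLS point clouds

# Cloud-to-cloud differences: nearest-neighbour distance from each MLS point
# to the TLS reference approximates the per-object deviation analysis.
distances = np.asarray(evaluated.compute_point_cloud_distance(reference))
print(f"mean deviation: {distances.mean():.3f} m, RMS: {np.sqrt((distances**2).mean()):.3f} m")

# Rough point-density estimate: points per m^2 on a 1 m horizontal grid (top view).
xy = np.asarray(evaluated.points)[:, :2]
cells, counts = np.unique(np.floor(xy).astype(int), axis=0, return_counts=True)
print(f"median density: {np.median(counts):.0f} points/m^2 over {len(cells)} occupied cells")

# Simple noise indicator: statistical outlier removal reports how many points
# deviate strongly from their local neighbourhood (e.g., noise on a wall surface).
_, inlier_idx = evaluated.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
print(f"flagged as noise: {len(evaluated.points) - len(inlier_idx)} points")
```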
Figure 1: Map display of Slovakia showing the city of Žiar nad Hronom (a), display of the orthomosaic of the study area (b), representation of the 3D model of the study area (c) highlighted by red marks.
Figure 2: Surveying equipment used in the study.
Figure 3: Example of a GCP placement for the Leica RTC360 terrestrial laser scanner (a), a CP for the Lidaretto mobile laser scanner (b), and a CP and GCP for the Stonex X120GO mobile laser scanner (c).
Figure 4: Distribution of positions for the TLS survey.
Figure 5: Leica RTC360 terrestrial laser scanner (a), Lidaretto mobile laser scanner placed on various carriers (b), Stonex X120GO handheld laser scanner (c), and a combined setup consisting of an iPhone 13 Pro with an Emlid scanning kit and a GNSS antenna Reach RX (d).
Figure 6: Measurement trajectory using mobile laser scanners Stonex X120GO (a), Lidaretto (b), and iPhone 13 Pro with Emlid scanning kit and GNSS antenna Reach RX (c).
Figure 7: Diagram of the optimized workflow.
Figure 8: The resulting point clouds obtained by the methods under study—3D view and top view of the TLS (a), Lidaretto (b), Stonex X120GO (c), and iPhone 13 Pro with Emlid scanning kit and GNSS antenna Reach RX (d).
Figure 9: Viewing automatic and manual classification on an individual object. Legend: brown—Ground, green—Vegetation class, red—Buildings class, blue—Hardscape class, grey—Unclassified class.
Figure 10: Analysis of the differences in the point clouds—tree trunk.
Figure 11: Analysis of the differences in the point clouds—corners (A–D) of a building.
Figure 12: Analysis of the differences in the point clouds—cross-sections of the mast of a street lamp.
Figure 13: Density of the points per 1 m²—top view of the point clouds.
Figure 14: Histogram showing the point density in the point clouds obtained by different methods.
Figure 15: Noise in the point clouds obtained by the devices under study—an example on the wall of the building.
15 pages, 3658 KiB  
Article
A Hard Negatives Mining and Enhancing Method for Multi-Modal Contrastive Learning
by Guangping Li, Yanan Gao, Xianhui Huang and Bingo Wing-Kuen Ling
Electronics 2025, 14(4), 767; https://doi.org/10.3390/electronics14040767 - 16 Feb 2025
Viewed by 155
Abstract
Contrastive learning has emerged as a dominant paradigm for understanding 3D open-world environments, particularly in the realm of multi-modalities. However, due to the nature of self-supervised learning and the limited size of 3D datasets, pre-trained models in the 3D point cloud domain often suffer from overfitting in downstream tasks, especially in zero-shot classification. To tackle this problem, we design a module to mine and enhance hard negatives from datasets, which are useful to improve the discrimination of models. This module could be seamlessly integrated into cross-modal contrastive learning frameworks, addressing the overfitting issue by enhancing the mined hard negatives during the process of training. This module consists of two key components: mining and enhancing. In the process of mining, we identify hard negative samples by examining similarity relationships between vision–vision and vision–text modalities, locating hard negative pairs within the visual domain. In the process of enhancing, we compute weighting coefficients via the similarity differences of these mined hard negatives. By enhancing the mined hard negatives while leaving others unchanged, we improve the overall performance and discrimination of models. A series of experiments demonstrate that our module can be easily incorporated into various contrastive learning frameworks, leading to improved model performance in both zero-shot and few-shot tasks. Full article
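As a rough illustration of the mining-and-enhancing idea sketched in the abstract (not the authors' exact formulation), the snippet below up-weights the hardest in-batch negatives of a cross-modal InfoNCE-style loss. The top-k mining rule and the fixed boost factor are assumptions standing in for the paper's similarity-difference coefficients.

```python
import torch
import torch.nn.functional as F

def hnme_infonce(img_feat, pc_feat, tau=0.07, top_k=5, boost=1.5):
    """Cross-modal InfoNCE with a simple hard-negative mining/enhancing step.

    img_feat, pc_feat: (B, D) embeddings of paired samples. The mining rule
    (top-k most similar negatives) and the constant boost are illustrative.
    """
    img = F.normalize(img_feat, dim=-1)
    pc = F.normalize(pc_feat, dim=-1)
    sim = img @ pc.t() / tau                      # (B, B) similarity matrix
    b = sim.size(0)
    eye = torch.eye(b, dtype=torch.bool, device=sim.device)

    # Mine: for each anchor, pick the top-k most similar *negative* pairs.
    neg_sim = sim.masked_fill(eye, float("-inf"))
    hard_idx = neg_sim.topk(k=min(top_k, b - 1), dim=1).indices

    # Enhance: up-weight the mined hard negatives, leave the rest unchanged.
    weights = torch.ones_like(sim)
    weights.scatter_(1, hard_idx, boost)
    logits = sim + weights.log()                  # exp(logit + log w) = w * exp(logit)

    targets = torch.arange(b, device=sim.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Because the weighting acts only inside the softmax over negatives, positives are untouched, which mirrors the "enhance the mined hard negatives while leaving others unchanged" idea.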
Figure 1: The modified framework of CLIP2Point. We add a textual branch and apply HNME to cross-modal contrastive learning during pre-training.
Figure 2: The framework of OpenShape. Two cross-modal similarity matrices are computed, and our HNME method is applied to the cross-modalities between images and point clouds. Snowflakes and sparks respectively indicate that the encoder parameters are frozen and learnable during training. Pink, green, blue, and purple boxes represent image, text, and point cloud features and positive sample pair similarity, respectively.
Figure 3: Qualitative process of mining and enhancing hard negatives. (a) indicates the anchor and its positive, false, and true negative samples. After step 1, (b) circles the candidate hard negative samples with a dotted box, but they are not all true hard negatives. So we identify the true ones in step 2, as shown in (c); the final mined hard negatives are circled by solid boxes. After enhancing in (d), they become closer to the anchor in the feature space.
Figure 4: The judgment accuracy of the initial and trained models with different δ. The x axis is the value of δ; as δ decreases, accuracies gradually increase and tend to be stable.
Figure 5: (a) shows that the rate of similarity difference between positive and negative sample pairs varies with the similarity of the negative sample pair. After calculating the exponents, the coefficient is above 1, as shown in (b).
Figure 6: These heatmaps indicate sample similarity relationships in a batch before and after enhancing hard negatives. (a) shows the original cosine similarities between image–depth pairs. (b) indicates the similarity relationships after enhancing; the brighter parts are the similarities of enhanced negatives. The legend in the rightmost column indicates the colors of different similarities, which are expanded by the temperature coefficient τ.
24 pages, 11349 KiB  
Article
Multi-Size Voxel Cube (MSVC) Algorithm—A Novel Method for Terrain Filtering from Dense Point Clouds Using a Deep Neural Network
by Martin Štroner, Martin Boušek, Jakub Kučera, Hana Váchová and Rudolf Urban
Remote Sens. 2025, 17(4), 615; https://doi.org/10.3390/rs17040615 - 11 Feb 2025
Viewed by 336
Abstract
When filtering highly rugged terrain from dense point clouds (particularly in technical applications such as civil engineering), the most widely used filtering approaches yield suboptimal results. Here, we proposed and tested a novel ground-filtering algorithm, a multi-size voxel cube (MSVC), utilizing a deep neural network. This is based on the voxelization of the point cloud, the classification of individual voxels as ground or non-ground using surrounding voxels (a “voxel cube” of 9 × 9 × 9 voxels), and the gradual reduction in voxel size, allowing the acquisition of custom-level detail and highly rugged terrain from dense point clouds. The MSVC performance on two dense point clouds, capturing highly rugged areas with dense vegetation cover, was compared with that of the widely used cloth simulation filter (CSF) using manually classified terrain as the reference. MSVC consistently outperformed the CSF filter in terms of the correctly identified ground points, correctly identified non-ground points, balanced accuracy, and the F-score. Another advantage of this filter lay in its easy adaptability to any type of terrain, enabled by the utilization of machine learning. The only disadvantage lay in the necessity to manually prepare training data. On the other hand, we aim to account for this in the future by producing neural networks trained for individual landscape types, thus eliminating this phase of the work. Full article
(This article belongs to the Special Issue New Perspectives on 3D Point Cloud (Third Edition))
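To make the voxel-cube idea concrete, the following sketch (a simplified assumption-based illustration, not the released MSVC code) voxelises a point cloud and extracts the 9 × 9 × 9 block of per-voxel point counts around one voxel, which is the kind of input the deep network would classify as ground or non-ground before the voxel size is reduced in the next pass.

```python
import numpy as np

def voxelize_counts(points, voxel_size):
    """Return a dense 3D grid of per-voxel point counts and the grid origin."""
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / voxel_size).astype(int)
    dims = idx.max(axis=0) + 1
    grid = np.zeros(dims, dtype=np.int32)
    np.add.at(grid, tuple(idx.T), 1)               # count points per voxel
    return grid, origin

def voxel_cube(grid, center, size=9):
    """Extract the size^3 block of counts centred on `center` (zero-padded)."""
    half = size // 2
    padded = np.pad(grid, half, mode="constant")
    c = np.asarray(center) + half
    return padded[c[0]-half:c[0]+half+1,
                  c[1]-half:c[1]+half+1,
                  c[2]-half:c[2]+half+1]

# Example with random points and a 2 m voxel size (illustrative values only).
pts = np.random.rand(100_000, 3) * np.array([100.0, 100.0, 30.0])
grid, origin = voxelize_counts(pts, voxel_size=2.0)
cube = voxel_cube(grid, center=(10, 10, 5))        # 9x9x9 input for the classifier
print(grid.shape, cube.shape)
```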
Figure 1: A 2D illustration of the point cloud (profile) and its voxelization to 2 × 2 × 2 m voxels. Individual dots represent the centers of the voxels, color-coded to represent the number of points in the voxel (see the color bar). The central red square indicates the evaluated voxel, and the large orange square indicates the entire area used for its evaluation (2D representation of the voxel cube). (a) shows the overall view, (b) the detail.
Figure 2: A 2D illustration of the progressive reduction in vegetation with a gradual reduction in the voxel size (color-coding indicates the number of points in the voxel relative to the most populated voxel; grey indicates voxels with no points; and the greyed-out part of the point cloud indicates the points removed in previous steps). (a) A voxel size of 3.38 m; (b) a voxel size of 1.90 m; (c) a voxel size of 1.42 m; (d) a voxel size of 0.6 m.
Figure 3: (a) The misclassification of voxels with low numbers of points (marked with red arrows) as non-ground and (b) the solution to this problem through the use of the additional shifted grid (blue lines); voxels classified as ground in any of the grids (thick lines) are considered ground and carried forward to the next step.
Figure 4: Gradual filtering with stepwise reduction in the voxel size: (a) original point cloud; (b) step 2 (voxel size 4.5 m); (c) step 5 (voxel size 1.9 m); (d) step 15—final result (voxel size 0.11 m).
Figure 5: Flowchart of the multi-size voxel cube (MSVC) algorithm.
Figure 6: Data 1 with the vegetation color-coded according to the vegetation height: (a) training data, (b) test data; note that the training data contain all types of terrain as well as the vegetation character present in the test data.
Figure 7: Data 2—training area (a,b) and the testing areas Boulders (c,d), Tower (e,f), and Rugged (g,h).
Figure 8: Data 1—best classification results: (a) CSF (cloth resolution 2.5 cm; threshold 25 cm); (b) MSVC (voxel size 6 cm); (c) detail of the CSF classification; (d) detail of the same area classified by MSVC; the color-coded points indicate erroneously preserved vegetation, along with its height.
Figure 9: Classification success for Data 2—Boulders: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green.
Figure 10: Classification success for Data 2—Tower: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green. Blue ovals indicate areas with the biggest differences in the performance of the filters, where CSF identified more points falsely as ground.
Figure 11: Classification success for Data 2—Rugged: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green. Colored ovals indicate areas with the biggest differences in the performance of the filters.
Figure 12: The terrain model of the Data 2—Tower area with buildings shown; note that no buildings were present in the training data.
Figure A1: Data 2—location of individual data in the area: (a) Data 2 Training, (b) Data 2 Boulders, (c) Data 2 Tower, (d) Data 2 Rugged.
21 pages, 16141 KiB  
Article
The Development of a Sorting System Based on Point Cloud Weight Estimation for Fattening Pigs
by Luo Liu, Yangsen Ou, Zhenan Zhao, Mingxia Shen, Ruqian Zhao and Longshen Liu
Agriculture 2025, 15(4), 365; https://doi.org/10.3390/agriculture15040365 - 8 Feb 2025
Viewed by 405
Abstract
As large-scale and intensive fattening pig farming has become mainstream, the increase in farm size has led to more severe issues related to the hierarchy within pig groups. Due to genetic differences among individual fattening pigs, those that grow faster enjoy a higher social rank. Larger pigs with greater aggression continuously acquire more resources, further restricting the survival space of weaker pigs. Therefore, fattening pigs must be grouped rationally, and the management of weaker pigs must be enhanced. This study, considering current fattening pig farming needs and actual production environments, designed and implemented an intelligent sorting system based on weight estimation. The main hardware structure of the partitioning equipment includes a collection channel, partitioning channel, and gantry-style collection equipment. Experimental data were collected, and the original scene point cloud was preprocessed to extract the back point cloud of fattening pigs. Based on the morphological characteristics of the fattening pigs, the back point cloud segmentation method was used to automatically extract key features such as hip width, hip height, shoulder width, shoulder height, and body length. The segmentation algorithm first calculates the centroid of the point cloud and the eigenvectors of the covariance matrix to reconstruct the point cloud coordinate system. Then, based on the variation characteristics and geometric shape of the consecutive horizontal slices of the point cloud, hip width and shoulder width slices are extracted, and the related features are calculated. Weight estimation was performed using Random Forest, Multilayer perceptron (MLP), linear regression based on the least squares method, and ridge regression models, with parameter tuning using Bayesian optimization. The mean squared error, mean absolute error, and mean relative error were used as evaluation metrics to assess the model’s performance. Finally, the classification capability was evaluated using the median and average weights of the fattening pigs as partitioning standards. The experimental results show that the system’s average relative error in weight estimation is approximately 2.90%, and the total time for the partitioning process is less than 15 s, which meets the needs of practical production. Full article
(This article belongs to the Special Issue Modeling of Livestock Breeding Environment and Animal Behavior)
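The coordinate-system reconstruction step described above (centroid plus covariance eigenvectors) is essentially a PCA alignment of the back point cloud, after which slices along the body axis can be profiled. The sketch below is a minimal interpretation under that assumption; the axis ordering, slice count, and width definition are illustrative, not the paper's exact conventions.

```python
import numpy as np

def reconstruct_coordinates(points):
    """Re-express a back point cloud in axes defined by its covariance eigenvectors.

    Assumption: the largest-variance axis is treated as body length, the second
    as body width, and the smallest as height, as in a standard PCA alignment.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = np.cov(centered, rowvar=False)                 # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending eigenvalues
    axes = eigvecs[:, ::-1]                              # length, width, height order
    return centered @ axes, centroid, axes

def slice_widths(aligned, n_slices=50):
    """Cross-axis span of consecutive slices along the body-length axis, the kind
    of profile used to locate the shoulder-width and hip-width slices."""
    length = aligned[:, 0]
    edges = np.linspace(length.min(), length.max(), n_slices + 1)
    widths = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        s = aligned[(length >= lo) & (length < hi)]
        widths.append(np.ptp(s[:, 1]) if len(s) else 0.0)
    return np.array(widths)
```

The resulting width profile, together with heights taken from the third axis, would feed the Random Forest or MLP regressors mentioned in the abstract.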
Figure 1: Experimental data collection device diagram.
Figure 2: Pass-through filtering.
Figure 3: Pig back point cloud division.
Figure 4: Coordinate system reconstruction result of the point cloud.
Figure 5: x-axis span of a slice.
Figure 6: Schematic diagram of the operation of the column equipment.
Figure 7: A 3D diagram of the column device.
Figure 8: Hardware connection diagram.
Figure 9: Relationship between 'eps' and 'min_points' and the number of running hours and categories.
Figure 10: DBSCAN clustering of different 'eps' and 'min_points' values.
Figure 11: DBSCAN clustering and voxel downsampling effect.
Figure 12: Scatter plot of redundant and normal samples.
Figure 13: Model test results and error comparison.
Figure 14: Operation display of sorting equipment and system platform.
27 pages, 3505 KiB  
Article
DeepDR: A Two-Level Deep Defect Recognition Framework for Meteorological Satellite Images
by Xiangang Zhao, Xiangyu Chang, Cunqun Fan, Manyun Lin, Lan Wei and Yunming Ye
Remote Sens. 2025, 17(4), 585; https://doi.org/10.3390/rs17040585 - 8 Feb 2025
Viewed by 290
Abstract
Raw meteorological satellite images often suffer from defects such as noise points and lines due to atmospheric interference and instrument errors. Current solutions typically rely on manual visual inspection to identify these defects. However, manual inspection is labor-intensive, lacks uniform standards, and is prone to both false positives and missed detections. To address these challenges, we propose DeepDR, a two-level deep defect recognition framework for meteorological satellite images. DeepDR consists of two modules: a transformer-based noise image classification module for the first level and a noise region segmentation module based on a pseudo-label training strategy for the second level. This framework enables the automatic identification of defective cloud images and the detection of noise points and lines, thereby significantly improving the accuracy of defect recognition. To evaluate the effectiveness of DeepDR, we have collected and released two satellite cloud image datasets from the FengYun-1 satellite, which include noise points and lines. Subsequently, we conducted comprehensive experiments to demonstrate the superior performance of our approach in addressing the satellite cloud image defect recognition problem. Full article
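The two-level design (image-level defect classification first, pixel-level noise segmentation only when a defect is flagged) can be wired together as in the sketch below. Both sub-models are hypothetical stand-ins for the paper's transformer classifier and pseudo-label-trained segmenter, and the 0.5 threshold is an assumption.

```python
import torch
import torch.nn as nn

class TwoLevelDefectRecognizer(nn.Module):
    """Level 1 decides whether a satellite cloud image is defective; level 2
    localises the noise points/lines. The sub-models passed in are placeholders."""

    def __init__(self, classifier: nn.Module, segmenter: nn.Module, threshold: float = 0.5):
        super().__init__()
        self.classifier = classifier   # assumed to return one defect logit per image
        self.segmenter = segmenter     # assumed to return per-pixel noise logits
        self.threshold = threshold

    @torch.no_grad()
    def forward(self, image: torch.Tensor):
        # image: (1, C, H, W) single cloud image
        p_defect = torch.sigmoid(self.classifier(image)).item()
        if p_defect < self.threshold:
            return {"defective": False, "mask": None}
        mask = torch.sigmoid(self.segmenter(image)) > self.threshold   # (1, 1, H, W)
        return {"defective": True, "mask": mask}
```

Skipping segmentation for images the classifier accepts as clean is what keeps the second, more expensive level off the common case.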
Figure 1: Some examples of satellite cloud images with noise points and lines. The noise points are marked with red boxes. In (a,b), satellite cloud images with noise points are displayed, while (c,d) show images with noise lines.
Figure 2: Illustration of the proposed framework DeepDR.
Figure 3: Illustration of the Transformer-based noise image classifier.
Figure 4: Illustration of the proposed pseudo-label-based noise region segmentation.
Figure 5: Some example images of the noise point dataset.
Figure 6: Some example images of the noise line dataset.
Figure 7: A selection of results from the noise points experiment. (a,b) are remote sensing satellite images containing noise, and (c,d) are normal remote sensing satellite images.
Figure 8: A selection of results from the noise lines experiment. (a,b) are remote sensing satellite images containing lines, and (c,d) are normal remote sensing satellite images.
Figure 9: The precision of all methods on normal and noise point images.
Figure 10: The recall of all methods on normal and noise point images.
Figure 11: The F1 score for all methods on normal and noise point images.
Figure 12: The precision of all methods on normal and noise line images.
Figure 13: The recall of all methods on normal and noise line images.
Figure 14: The F1 score for all methods on normal and noise line images.
Figure 15: The visualization results of image segmentation methods for meteorological satellite images containing noise points. The noise points are marked with red boxes.
Figure 16: The visualization results of image segmentation methods for meteorological satellite images containing noise lines.
25 pages, 6553 KiB  
Article
Tree Species Classification Based on Point Cloud Completion
by Haoran Liu, Hao Zhong, Guangqiang Xie and Ping Zhang
Forests 2025, 16(2), 280; https://doi.org/10.3390/f16020280 - 6 Feb 2025
Viewed by 341
Abstract
LiDAR is an active remote sensing technology widely used in forestry applications, such as forest resource surveys, tree information collection, and ecosystem monitoring. However, due to the resolution limitations of 3D-laser scanners and the canopy occlusion in forest environments, the tree point clouds obtained often have missing data. This can reduce the accuracy of individual tree segmentation, which subsequently affects the tree species classification. To address the issue, this study used point cloud data with RGB information collected by the UAV platform to improve tree species classification by completing the missing point clouds. Furthermore, the study also explored the effects of point cloud completion, feature selection, and classification methods on the results. Specifically, both a traditional geometric method and a deep learning-based method were used for point cloud completion, and their performance was compared. For the classification of tree species, five machine learning algorithms—Random Forest (RF), Support Vector Machine (SVM), Back Propagation Neural Network (BPNN), Quadratic Discriminant Analysis (QDA), and K-Nearest Neighbors (KNN)—were utilized. This study also ranked the importance of features to assess the impact of different algorithms and features on classification accuracy. The results showed that the deep learning-based completion method provided the best performance (avgCD = 6.14; avgF1 = 0.85), generating more complete point clouds than the traditional method. On the other hand, compared with SVM and BPNN, RF showed better performance in dealing with multi-classification tasks with limited training samples (OA-87.41%, Kappa-0.85). Among the six dominant tree species, Pinus koraiensis had the highest classification accuracy (93.75%), while that of Juglans mandshurica was the lowest (82.05%). In addition, the vegetation index and the tree structure parameter accounted for 50% and 30%, respectively, in the top 10 features in terms of feature importance. The point cloud intensity also had a high contribution to the classification results, indicating that the lidar point cloud data can also be used as an important basis for tree species classification. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
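The classification stage described above (Random Forest on per-tree features such as vegetation indices, structural parameters, and point-cloud intensity, plus an importance ranking) can be sketched with scikit-learn as below. The feature table, its column names, and the split ratio are hypothetical placeholders.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

# Hypothetical per-tree feature table: vegetation indices, structure metrics,
# and mean point-cloud intensity, with a species label per segmented tree.
df = pd.read_csv("tree_features.csv")               # assumed file layout
X = df.drop(columns=["species"])
y = df["species"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)
pred = rf.predict(X_te)

print(f"OA: {accuracy_score(y_te, pred):.4f}  Kappa: {cohen_kappa_score(y_te, pred):.4f}")

# Rank feature importance, mirroring the paper's importance analysis.
importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances.head(10))
```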
Figure 1: Location of the study area.
Figure 2: Technical route. The yellow blocks represent the data used, the pink blocks represent the processing steps, and the blue blocks marked with dotted lines represent the parameters or methods.
Figure 3: Schematic diagram of tree crown completion. The yellow part represents the trees in the plots. The red and green parts are the data collected after the tree crowns were shaded, and the dashed parts are the simulated actual complete trees. P1 is the highest point in the point cloud of the missing tree crown in the collected data, P2 is the vertex of the actual complete tree crown, H is the crown height, and h is the vertical distance between P1 and P2. O is the center point of the vertical projection plane of the tree crown. d is the distance between the vertical projection of point P1 and O. D is the radius of the fitted circle. A, B, and C are the edge points on the tree crown.
Figure 4: Point cloud completion network based on GAN. (a) Hybrid pooling. (b) Attention-based feature enhancement (AFE) module.
Figure 5: Visual comparisons on completion of tree point clouds by the traditional method (TM) and deep learning method (DL). (a) Pinus koraiensis, (b) Larix gmelinii, (c) Ulmus pumila, (d) Fraxinus mandshurica, (e) Juglans mandshurica, and (f) Betula platyphylla.
Figure 6: Results of RF-based tree species classification with plots 1–10.
Figure 7: Overall ranking of feature importance.
Figure 8: Distribution of the top 10 features in terms of importance by tree species.
23 pages, 4583 KiB  
Article
Research on Fine-Scale Terrain Construction in High Vegetation Coverage Areas Based on Implicit Neural Representations
by Yi Zhang, Peipei He, Haihang Jing, Bin He, Weibo Yin, Junzhen Meng, Yuntian Ma, Haifeng Zhang, Bo Zhang and Haoxiang Shen
Sustainability 2025, 17(3), 1320; https://doi.org/10.3390/su17031320 - 6 Feb 2025
Viewed by 429
Abstract
Due to the high-density coverage of vegetation, the complexity of terrain, and occlusion issues, ground point extraction faces significant challenges. Airborne Light Detection and Ranging (LiDAR) technology plays a crucial role in complex mountainous areas. This article proposes a method for constructing fine terrain in high vegetation coverage areas based on implicit neural representation. This method consists of data preprocessing, multi-scale and multi-feature high-difference point cloud initial filtering, and an upsampling module based on implicit neural representation. Firstly, the regional point cloud data are preprocessed; then, K-dimensional trees (K-d trees) are used to construct spatial indexes, and spherical neighborhood methods are applied to capture the geometric and physical information of point clouds for multi-feature fusion, enhancing the distinction between terrain and non-terrain elements. Subsequently, a differential model is constructed based on the DSM (Digital Surface Model) at different scales, and the elevation variation coefficient is calculated to determine the threshold for extracting the initial set of ground points. Finally, the upsampling module using implicit neural representation is used to finely process the initial ground point set, providing a complete and uniformly dense ground point set for the subsequent construction of fine terrain. To validate the performance of the proposed method, three sets of point cloud data from mountainous terrain with different features are selected as the experimental area. The experimental results indicate that, from a qualitative perspective, the proposed method significantly improves the classification of vegetation, buildings, and roads, with clear boundaries between different types of terrain. From a quantitative perspective, the Type I errors of the three selected regions are 4.3445%, 5.0623%, and 5.9436%, respectively; the Type II errors are 5.7827%, 6.8516%, and 7.3478%, respectively; and the overall errors are 5.3361%, 6.4882%, and 6.7168%, respectively. The Kappa coefficients of the measurement areas all exceed 80%, indicating that the proposed method performs well in complex mountainous environments. The method provides point cloud data support for the construction of wind and photovoltaic bases in China, reduces the potential damage to the ecological environment caused by construction activities, and contributes to the sustainable development of ecology and energy.
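The K-d tree and spherical-neighborhood step described above can be prototyped with SciPy; the 2 m radius and the particular per-point features below are illustrative assumptions, not the paper's parameterisation.

```python
import numpy as np
from scipy.spatial import cKDTree

def spherical_neighborhood_features(points, radius=2.0):
    """Per-point geometric features from a fixed-radius (spherical) neighbourhood.

    Returns neighbour count, local height range, and a planarity-like ratio of
    covariance eigenvalues -- simple cues for separating terrain from vegetation.
    """
    tree = cKDTree(points)                        # spatial index over the cloud
    neighbors = tree.query_ball_point(points, r=radius)
    feats = np.zeros((len(points), 3))
    for i, idx in enumerate(neighbors):
        nb = points[idx]
        z = nb[:, 2]
        feats[i, 0] = len(idx)                    # local point density
        feats[i, 1] = z.max() - z.min()           # local height variation
        if len(idx) >= 3:
            w = np.linalg.eigvalsh(np.cov(nb.T))  # ascending eigenvalues
            feats[i, 2] = w[0] / max(w.sum(), 1e-9)   # ~0 for planar ground patches
    return feats
```

Features of this kind would then be fused with the multi-scale DSM difference and elevation variation coefficient to pick the initial ground set before upsampling.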
Figure 1: Overview of the location of the wind and photovoltaic project in the experimental area.
Figure 2: Flowchart of the fine point cloud filtering method for dense vegetation coverage in complex mountainous areas.
Figure 3: Multi-feature neighborhood construction model. In the K-d tree, the red, green, and blue lines divide the space of the cube into two, four, and eight parts, respectively; the last eight subspaces are leaf nodes. In the spherical neighborhood map, black dots are the current point, blue dots are points within the neighborhood of the current point, and the remaining points are terrain points in the neighborhood of the previous point.
Figure 4: Application of the implicit neural representation upsampling module in the processing of point clouds in complex mountainous terrain.
Figure 5: Results obtained with different upsampling scales for the same input.
Figure 6: 4× upsampling point cloud data results.
Figure 7: Results of processing the point cloud data of Area c.
Figure 8: The DEM of the complex mountainous terrain generated after processing with the proposed method.
Figure 9: Point cloud image and DEM for Area b.
Figure 10: Maps of Areas c, d, and e, along with their corresponding DEMs.
18 pages, 14896 KiB  
Article
Deep Learning-Based Point Cloud Classification of Obstacles for Intelligent Vehicles
by Yiqi Xu, Dengke Wu, Mengfei Zhou and Jiafu Yang
World Electr. Veh. J. 2025, 16(2), 80; https://doi.org/10.3390/wevj16020080 - 5 Feb 2025
Viewed by 508
Abstract
Intelligent driving research has focused much attention on point cloud obstacles since they are a class of high-dimensional data that can adequately depict the shape and placement of obstacles, unlike picture data. Currently, deep learning technology is primarily employed for vehicle autonomy point cloud obstacle classification tasks. These techniques typically struggle with low classification accuracy, processing efficiency, and model stability. To tackle the abovementioned issues, this paper suggests a novel random forest algorithm that integrates the out-of-bag error theory and can consistently and accurately evaluate the influence of point cloud properties. Then, building on the novel algorithm, this paper suggests a modified PointNet network that incorporates the effects of both global and local features on the classification task, therefore increasing the conventional network’s classification accuracy. To assess the effectiveness of this novel approach in the experimental portion, we set up an evaluation system based on the metrics for average accuracy, overall accuracy, and a confusion matrix. According to the simulation results, the overall accuracy of the proposed network in terms of classification accuracy is 94.4% and the average accuracy is 84.9%, which are then compared to the prototype PointNet and its variants. The classification accuracies for the four types of obstacles are 97.6%, 63.6%, 92.5%, and 86.1%. In addition, the proposed method is effective at improving both the computational complexity and stability of the network. Full article
(This article belongs to the Special Issue Deep Learning Applications for Electric Vehicles)
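The out-of-bag component of the RF-OOB idea can be approximated with scikit-learn's built-in OOB machinery, as in this hedged sketch (impurity-based importance on an OOB-scored forest, not the authors' exact weighting); the feature matrix of point-cloud descriptors is synthetic stand-in data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: rows = point-cloud samples, columns = handcrafted descriptors
# (e.g. height statistics, intensity, eigenvalue-based shape features).
X, y = make_classification(n_samples=2000, n_features=12, n_informative=6,
                           n_classes=4, random_state=0)

# oob_score=True evaluates each tree on the samples it did not see during
# bootstrapping, giving the out-of-bag error that the RF-OOB assessment builds on.
rf = RandomForestClassifier(n_estimators=300, oob_score=True, bootstrap=True, random_state=0)
rf.fit(X, y)
print(f"OOB accuracy: {rf.oob_score_:.3f} (OOB error: {1 - rf.oob_score_:.3f})")

# Impurity-based importance as a proxy for deciding which point-cloud features
# to emphasise in the modified PointNet; the paper's RF-OOB ranking may differ.
ranking = np.argsort(rf.feature_importances_)[::-1]
for r in ranking[:5]:
    print(f"feature {r}: importance {rf.feature_importances_[r]:.3f}")
```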
Figure 1: Technology roadmap for the ITS.
Figure 2: The schematic diagram of the algorithm to generate a decision tree forest.
Figure 3: Schematic diagram of the RF-OOB algorithm.
Figure 4: The model architecture diagram of the conventional PointNet.
Figure 5: The structural model of the optimized OOB-PointNet.
Figure 6: Point cloud of four types of obstacles.
Figure 7: Histogram of feature importance assessment.
Figure 8: Confusion matrix plots for the three classifiers.
Figure 9: Plot of the experimental results of the overall classification accuracy for the three classifiers.
Figure 10: Experimental diagram of 200 epochs of training for five types of networks.
Figure 11: Diagram of each network categorizing each frame of the point cloud.
Figure 12: Classification accuracy of the networks for different point cloud densities.
Figure 13: Classification accuracy of the networks for each type of obstacle at different point cloud densities.
29 pages, 15780 KiB  
Article
Assessing Lightweight Folding UAV Reliability Through a Photogrammetric Case Study: Extracting Urban Village’s Buildings Using Object-Based Image Analysis (OBIA) Method
by Junyu Kuang, Yingbiao Chen, Zhenxiang Ling, Xianxin Meng, Wentao Chen and Zihao Zheng
Drones 2025, 9(2), 101; https://doi.org/10.3390/drones9020101 - 29 Jan 2025
Viewed by 522
Abstract
With the rapid advancement of drone technology, modern drones have achieved high levels of functional integration, alongside structural improvements that include lightweight, compact designs with foldable features, greatly enhancing their flexibility and applicability in photogrammetric applications. Nevertheless, limited research currently explores data collected by such compact UAVs, and whether they can balance a small form factor with high data quality remains uncertain. To address this challenge, this study acquired the remote sensing data of a peri-urban area using the DJI Mavic 3 Enterprise and applied Object-Based Image Analysis (OBIA) to extract high-density buildings. It was found that this drone offers high portability, a low operational threshold, and minimal regulatory constraints in practical applications, while its captured imagery provides rich textural details that clearly depict the complex surface features in urban villages. To assess the accuracy of the extraction results, the visual comparison between the segmentation outputs and airborne LiDAR point clouds captured by the DJI M300 RTK was performed, and classification performance was evaluated based on confusion matrix metrics. The results indicate that the boundaries of the segmented objects align well with the building edges in the LiDAR point cloud. The classification accuracy of the three selected algorithms exceeded 80%, with the KNN classifier achieving an accuracy of 91% and a Kappa coefficient of 0.87, which robustly demonstrate the reliability of the UAV data and validate the feasibility of the proposed approach in complex cases. As a practical case reference, this study is expected to promote the wider application of lightweight UAVs across various fields. Full article
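The accuracy assessment described above (confusion matrix, overall accuracy, and Kappa for the classified objects against reference samples) is straightforward to reproduce; the label vectors below are placeholders standing in for the reference data and the KNN predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

# Placeholder arrays: 1 = building, 0 = non-building, per validation object/sample.
reference = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 1])
predicted = np.array([1, 1, 0, 0, 0, 0, 1, 1, 1, 1])

cm = confusion_matrix(reference, predicted)
oa = accuracy_score(reference, predicted)        # overall accuracy (OA)
kappa = cohen_kappa_score(reference, predicted)  # chance-corrected agreement

print("confusion matrix:\n", cm)
print(f"overall accuracy: {oa:.2f}, Kappa: {kappa:.2f}")
```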
Figure 1: Framework of this study.
Figure 2: Overview of the study area: (a) Location of Guangzhou Higher Education Mega Center within Guangzhou City; (b) DOM of Guangzhou Higher Education Mega Center; (c) DOM of Beiting Village. Images (b) and (c) were sourced from remote sensing imagery collected by the research team using a fixed-wing UAV, with a resolution of 0.2 m.
Figure 3: Mavic 3 Enterprise with RTK module stored in a portable case, including six batteries, charging components, and spare parts.
Figure 4: DJI M300 RTK equipped with the GreenValley LiAir X3-H LiDAR system.
Figure 5: Flight path planning on DJI Pilot 2.
Figure 6: MRS workflow diagram.
Figure 7: Overall DOM and local detail of the study area.
Figure 8: Overall DSM and local comparison of the study area.
Figure 9: Overall VDVI and local comparison of the study area.
Figure 10: Overall LAS and local detail of the study area.
Figure 11: Determination of shape and compactness using the control variable method: (a) shape set to 0.7; (b) compactness set to 0.8.
Figure 12: ESP2 results.
Figure 13: Scale set to 320.
Figure 14: Comparison of MRS results with hybrid visualization of LAS; (a–d) illustrate the comparison results of four different high-density building areas.
Figure 15: Results of building extraction using the K-Nearest Neighbor (KNN) method.
Figure 16: Ground-based LiDAR equipment and point cloud data: (a) GreenValley LiGirp H120 handheld LiDAR scanning device; (b) overlay of airborne and handheld point cloud data, with the highlighted point cloud in the yellow box representing the range of data captured by the ground-based LiDAR.
Figure 17: Cross-sectional views of the same location: (a) airborne point cloud data slope map; (b) handheld LiDAR point cloud data slope map, with the red box highlighting the narrow alley where data acquisition is challenging.
26 pages, 6721 KiB  
Article
Advanced Detection and Classification of Kelp Habitats Using Multibeam Echosounder Water Column Point Cloud Data
by Amy W. Nau, Vanessa Lucieer, Alexandre C. G. Schimel, Haris Kunnath, Yoann Ladroit and Tara Martin
Remote Sens. 2025, 17(3), 449; https://doi.org/10.3390/rs17030449 - 28 Jan 2025
Viewed by 758
Abstract
Kelps are important habitat-forming species in shallow marine environments, providing critical habitat, structure, and productivity for temperate reef ecosystems worldwide. Many kelp species are currently endangered by myriad pressures, including changing water temperatures, invasive species, and anthropogenic threats. This situation necessitates advanced methods to detect kelp density, which would allow tracking density changes, understanding ecosystem dynamics, and informing evidence-based management strategies. This study introduces an innovative approach to detect kelp density with multibeam echosounder water column data. First, these data are filtered into a point cloud. Then, a range of variables are derived from these point cloud data, including average acoustic energy, volume, and point density. Finally, these variables are used as input to a Random Forest model in combination with bathymetric variables to classify sand, bare rock, sparse kelp, and dense kelp habitats. At 5 m resolution, we achieved an overall accuracy of 72.5% with an overall Area Under the Curve of 0.874. Notably, our method achieved high accuracy across the entire multibeam swath, with only a 1 percent point decrease in model accuracy for data falling within the part of the multibeam water column data impacted by sidelobe artefact noise, which significantly expands the potential of this data type for wide-scale monitoring of threatened kelp ecosystems. Full article
(This article belongs to the Section Ocean Remote Sensing)
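The per-cell water-column variables named in the abstract (mean amplitude, point count, and a volume-style measure) can be derived from a filtered point cloud by simple 2D binning before they are joined with bathymetric layers for the Random Forest. The sketch below uses pandas with hypothetical column names, random stand-in data, and a 5 m cell size.

```python
import numpy as np
import pandas as pd

# Hypothetical filtered water-column point cloud: easting, northing, depth, amplitude (dB).
wc = pd.DataFrame({
    "x": np.random.uniform(0, 100, 50_000),
    "y": np.random.uniform(0, 100, 50_000),
    "z": np.random.uniform(-15, -2, 50_000),
    "amplitude": np.random.normal(-60, 5, 50_000),
})

cell = 5.0  # grid resolution in metres
wc["ix"] = np.floor(wc["x"] / cell).astype(int)
wc["iy"] = np.floor(wc["y"] / cell).astype(int)

grid = wc.groupby(["ix", "iy"]).agg(
    wc_mean_amplitude=("amplitude", "mean"),
    wc_point_count=("amplitude", "size"),
    wc_height_range=("z", lambda z: z.max() - z.min()),  # crude stand-in for canopy volume
).reset_index()

# `grid` can now be joined with bathymetry-derived variables (depth, slope, rugosity)
# and towed-video labels to train a Random Forest for the four habitat classes.
print(grid.head())
```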
Figure 1: Overview of three study sites: The Gardens (a), North Freycinet (b), and Monroe 1 (c). The inset map of Tasmania in the top panel shows the relative locations of each site. MBES depth is displayed with a range of 0 to 40 m. Towed video tracks and classification results are shown in greyscale for sand (white), bare rock (light grey), sparse kelp (dark grey), and dense kelp (black).
Figure 2: Overview diagram of our proposed method, including data processing for towed video, multibeam bathymetry, and WCD, modelling, and model evaluation.
Figure 3: Examples of water column data variables generated for the three sites: (a) WC Mean amplitude variable at 5 m resolution, (b) WC Point count variable at 5 m resolution, (c) WC Volume variable at 5 m resolution, and (d) WC Volume variable at 1 m resolution. The units of volume for panels (c) and (d) were converted to volume per area for visual comparison between the different resolutions. For each panel, the sites correspond to The Gardens (top), North Freycinet (middle), and Monroe 1 (bottom).
Figure 4: Examples of towed video data for each class type: (a) Sand, (b) Bare rock (sea urchins present), (c) Sparse kelp, and (d) Dense kelp.
Figure 5: Average ROC curve across all CV folds for each class for the best performing Random Forest model (5 m resolution), including the AUC. Sand is shown as a dotted line, bare rock as a dot-dash line, sparse kelp as a dashed line, and dense kelp as a solid line.
Figure 6: Variable importance (Mean Decrease Gini) for the models at 5 m resolution (left), 3 m resolution (middle), and 1 m resolution (right). Higher values of Mean Decrease Gini indicate a higher importance ranking of those variables in the Random Forest model.
Figure 7: Box plots of selected water column variables by class at 5 m (top row) and 1 m (bottom row) grid resolutions. The horizontal line inside each box is the sample median. The top and bottom edges are the upper and lower quartiles, respectively. Outliers are shown as dots.
Figure 8: Box plots of selected water column variables falling within (top) and beyond (bottom) the minimum slant range (MSR). The top and bottom edges are the upper and lower quartiles, respectively. Outliers are shown as dots.
Figure 9: Classified maps based on the Random Forest model at 5 m resolution for three sites: (a) The Gardens, (b) North Freycinet, and (c) Monroe 1.
Figure 10: Percent of each reef class (bare rock, sparse kelp, or dense kelp) within each site (The Gardens (white), North Freycinet (grey), and Monroe 1 (black)). The percentage values are shown at the top of each bar.
19 pages, 1575 KiB  
Article
FIFA3D: Flow-Guided Feature Aggregation for Temporal Three-Dimensional Object Detection
by Ruiqi Ma, Chunwei Wang, Chi Chen, Yihan Zeng, Bijun Li, Qin Zou, Qingqiu Huang, Xinge Zhu and Hang Xu
Remote Sens. 2025, 17(3), 380; https://doi.org/10.3390/rs17030380 - 23 Jan 2025
Viewed by 566
Abstract
Detecting accurate 3D bounding boxes from LiDAR point clouds is crucial for autonomous driving. Recent studies have shown the superiority of the performance of multi-frame 3D detectors, yet eliminating the misalignment across frames and effectively aggregating spatiotemporal information are still challenging problems. In this paper, we present a novel flow-guided feature aggregation scheme for 3D object detection (FIFA3D) to align cross-frame information. FIFA3D first leverages optical flow with supervised signals to model the pixel-to-pixel correlations between sequential frames. Considering the sparse nature of bird’s-eye-view feature maps, an additional classification branch is adopted to provide explicit pixel-wise clues. Meanwhile, we utilize multi-scale feature maps and predict flow in a coarse-to-fine manner. With guidance from the estimated flow, historical features can be well aligned to the current situation, and a cascade fusion strategy is introduced to benefit the following detection. Extensive experiments show that FIFA3D surpasses the single-frame baseline with remarkable margins of +10.8% mAPH and +6.8% mAP on the Waymo and nuScenes validation datasets and performs well compared with state-of-the-art methods. Full article
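The core "warp historical BEV features with the estimated flow" step can be expressed with grid sampling. The sketch below is a generic flow-warping utility, written under the assumption that the flow is given in pixels from the previous frame to the current one; it is not FIFA3D's exact implementation.

```python
import torch
import torch.nn.functional as F

def warp_bev_features(feat_prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a historical BEV feature map to the current frame using optical flow.

    feat_prev: (B, C, H, W) features from frame i-1.
    flow:      (B, 2, H, W) per-pixel displacement (in pixels) from frame i-1 to i.
    """
    b, _, h, w = feat_prev.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    # Sample each current-frame location from where it came from in frame i-1.
    src_x = xs.unsqueeze(0) - flow[:, 0]
    src_y = ys.unsqueeze(0) - flow[:, 1]
    # Normalise coordinates to [-1, 1] for grid_sample, (x, y) order in the last dim.
    grid = torch.stack(
        (2.0 * src_x / (w - 1) - 1.0, 2.0 * src_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(feat_prev, grid, mode="bilinear",
                         padding_mode="zeros", align_corners=True)

# Example: zero flow should give an identity warp of the historical BEV map.
prev = torch.randn(1, 64, 128, 128)
flow = torch.zeros(1, 2, 128, 128)
assert torch.allclose(warp_bev_features(prev, flow), prev, atol=1e-5)
```

Aligned historical maps of this kind are what a cascade fusion step would then merge with the current-frame BEV features.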
Show Figures

Figure 1

Figure 1. Comparison of temporal aggregation mechanisms for sequential point clouds. (a) Point concatenation methods directly merge LiDAR points after removing the ego motion. (b) Learnable fusion methods generally apply convolution or transformer layers for feature aggregation on BEV maps, aiming to reduce the effect of object motion between sequential frames. (c) Our FIFA3D leverages an optical flow estimation module with supervised signals and further utilizes the flow for temporal alignment.
Figure 2. Overview of our flow-guided feature aggregation scheme for 3D object detection (FIFA3D) with sequential point clouds. It consists of four components: (a) a temporal feature encoder module that encodes sparse point clouds into BEV maps; (b) a class-aware flow estimation module that densifies the sparse BEV feature maps and predicts flow in a coarse-to-fine manner; (c) a flow-guided cascade temporal feature aggregation module; and (d) an RPN decoder and a center-based head that turn the aggregated features into 3D bounding boxes.
Figure 3. Network of progressive flow estimation. Combining the historical feature E_{i−1} and the current feature E_i, the optical flow f_{i−1→i} is predicted in a coarse-to-fine manner.
Figure 4. An example of the feature warping process. For a moving object (magnified in the white box) in consecutive frames, the corresponding historical features can be well aligned to the current frame under the guidance of the estimated optical flow (blue arrows).
Figure 5. The structure of flow-guided temporal aggregation. We use a cascade temporal feature aggregation strategy. For example, when three feature maps are used as input, the optical flow f_{i−2→i−1} first aggregates the feature maps B′_{i−2} and B′_{i−1}. The fused feature map O_{i−1} and B′_i are then aggregated under the guidance of f_{i−1→i} to obtain the output O_i.
Figure 6. Qualitative results on the Waymo dataset in urban scenarios (a,b). Blue boxes denote ground truth, and red boxes denote predictions. Green ovals mark objects that FIFA3D detected but the other two methods missed. * denotes using 2-frame point cloud concatenation as input.
Figure 7. Qualitative results on the Waymo dataset in rural (a) and highway-like (b) scenarios. Blue boxes denote ground truth, and red boxes denote predictions. Green ovals mark objects that FIFA3D detected but the other two methods missed. * denotes using 2-frame point cloud concatenation as input.
Figure 8">
Figure 8. Comparisons of different intervals between the two frames of point clouds. FC: feature concatenation along the channel dimension. The dashed lines indicate the trend of changes, and the dots represent individual data points.
">
17 pages, 3431 KiB  
Article
Interchangeability of Cross-Platform Orthophotographic and LiDAR Data in DeepLabV3+-Based Land Cover Classification Method
by Shijun Pan, Keisuke Yoshida, Satoshi Nishiyama, Takashi Kojima and Yutaro Hashimoto
Land 2025, 14(2), 217; https://doi.org/10.3390/land14020217 - 21 Jan 2025
Viewed by 482
Abstract
Riverine environmental information includes important data to collect, yet such data collection still relies on personnel conducting field surveys, and these on-site tasks face significant limitations (e.g., sites that are difficult or dangerous to access). In recent years, air-vehicle-based Light Detection and Ranging technologies have emerged as an efficient approach to data collection and have already been applied in global environmental research, e.g., land cover classification (LCC) and environmental monitoring. In this study, the authors focused on seven LCC types (i.e., bamboo, tree, grass, bare ground, water, road, and clutter) that can be parameterized for flood simulation. A validated airborne LiDAR bathymetry system (ALB) and a UAV-borne green LiDAR system (GLS) were applied for a cross-platform analysis of LCC. Furthermore, the LiDAR data were visualized using high-contrast color scales to improve the accuracy of land cover classification through image fusion techniques. If high-resolution aerial imagery is available, it must be downscaled to match the resolution of the low-resolution point clouds. Cross-platform data interchangeability was assessed using an interchangeability measure, defined as the absolute difference in overall accuracy (OA) or macro-F1 between cross-platform cases. It is noteworthy that relying solely on aerial photographs is inadequate for precise labeling, particularly under limited sunlight conditions that can lead to misclassification; in such cases, LiDAR plays a crucial role in facilitating target recognition. All the approaches (i.e., low-resolution digital imagery, LiDAR-derived imagery, and image fusion) achieve an OA above 0.65 and a macro-F1 of around 0.6. The authors found that the vegetation (bamboo, tree, grass) and road species perform comparatively better than the clutter and bare ground species. Under the stated conditions, differences between the species observed in different years (ALB from 2017 and GLS from 2020) are the main reason. Because the clutter species in this research includes all items other than the other defined species, its RGB-based features cannot be substituted easily across the 3-year gap, unlike those of the other species. The bare ground species, derived from on-site reconstruction, also shows a further color change between ALB and GLS that decreases interchangeability. For individual species, without considering seasons and platforms, image fusion classifies bamboo and trees with higher F1 scores than low-resolution digital imagery and LiDAR-derived imagery, which particularly demonstrates cross-platform interchangeability for the high vegetation types. In recent years, high-resolution photography (UAV), high-precision LiDAR measurement (ALB, GLS), and satellite imagery have all been used; however, LiDAR measurement equipment is expensive and measurement opportunities are limited. Given this, it would be desirable if ALB and GLS data could be classified continuously by artificial intelligence, and in this study the authors investigated such data interchangeability. A unique and crucial aspect of this study is exploring the interchangeability of land cover classification models across different LiDAR platforms. Full article
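The interchangeability measure described in this abstract, the absolute difference in OA or macro-F1 between cross-platform cases, is simple to compute. The sketch below is illustrative only: the labels are synthetic, the scikit-learn metric calls are an assumption, and nothing here is taken from the authors' code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def interchangeability(y_true_a, y_pred_a, y_true_b, y_pred_b):
    """Absolute difference in OA and macro-F1 between two classification cases
    (e.g., a model evaluated on ALB-derived versus GLS-derived imagery).
    Smaller differences indicate better cross-platform interchangeability."""
    d_oa = abs(accuracy_score(y_true_a, y_pred_a) - accuracy_score(y_true_b, y_pred_b))
    d_f1 = abs(f1_score(y_true_a, y_pred_a, average="macro")
               - f1_score(y_true_b, y_pred_b, average="macro"))
    return d_oa, d_f1

# Toy example: the seven land cover classes encoded as integers 0-6.
rng = np.random.default_rng(0)
truth = rng.integers(0, 7, size=1000)
pred_alb = np.where(rng.random(1000) < 0.80, truth, rng.integers(0, 7, size=1000))
pred_gls = np.where(rng.random(1000) < 0.70, truth, rng.integers(0, 7, size=1000))
print(interchangeability(truth, pred_alb, truth, pred_gls))
```

A small absolute difference means a model trained on one platform's data loses little accuracy when applied to the other, which is exactly the interchangeability the study evaluates.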
Figure 1. Perspective of the airborne LiDAR bathymetry and green LiDAR measurement area: (a) location of the Asahi River in Japan, with kilo post (KP) values representing the longitudinal distance (km) from the river mouth; (b) aerial-captured photographs based on the marked positions in (a); (c) drone-captured photographs based on the marked positions in (b).
Figure 2. Light Detection and Ranging (LiDAR) in overland and underwater surveys using near-infrared (NIR) and green laser (GL) from the ALB (left side, NIR and GL) and the GLS (right side, GL), respectively (laser points are shown in grayscale).
Figure 3. Processes of different data types and the corresponding operations (LR-TL, LR-DI, LiDAR-I, and image fusion).
Figure 4. Comparison of data-style-based averaged 2 m pixel⁻¹ resolution cross-platform interchangeability. Left vertical axis: reference for the OA and macro-F1 values; right vertical axis: reference for the absolute difference values.
Figure 5. Water areas that are not extractable using GLS alone (i.e., zoomed in from LiDAR-I, Oct. 2020). HC means high contrast.
">
19 pages, 2560 KiB  
Article
Evaluation of Rapeseed Leave Segmentation Accuracy Using Binocular Stereo Vision 3D Point Clouds
by Lili Zhang, Shuangyue Shi, Muhammad Zain, Binqian Sun, Dongwei Han and Chengming Sun
Agronomy 2025, 15(1), 245; https://doi.org/10.3390/agronomy15010245 - 20 Jan 2025
Viewed by 648
Abstract
Point cloud segmentation is necessary for obtaining highly precise morphological traits in plant phenotyping. Although point cloud segmentation has advanced considerably, segmenting point clouds of complex plant leaves remains challenging. Rapeseed leaves are critical in cultivation and breeding, yet traditional two-dimensional imaging is susceptible to reduced segmentation accuracy due to occlusions between plants. The current study proposes the use of binocular stereo-vision technology to obtain three-dimensional (3D) point clouds of rapeseed leaves at the seedling and bolting stages. The point clouds were colorized based on elevation values in order to better process the 3D point cloud data and extract rapeseed phenotypic parameters. Denoising methods were selected based on the source and type of point cloud noise: for ground point clouds, we combined plane fitting with pass-through filtering, while statistical filtering was used to remove outliers generated during scanning. We found that, during the seedling stage of rapeseed, a region-growing segmentation method was helpful in finding suitable parameter thresholds for leaf segmentation, and the Locally Convex Connected Patches (LCCP) clustering method was used for leaf segmentation at the bolting stage. The results show that combining plane fitting with pass-through filtering effectively removes ground point cloud noise, while statistical filtering successfully removes the outlier noise points generated during scanning. Finally, using the region-growing algorithm during the seedling stage with a normal angle threshold of 5.0/180.0*M_PI (i.e., 5°) and a curvature threshold of 1.5 helps to avoid under-segmentation and over-segmentation, achieving complete segmentation of rapeseed seedling leaves, while the LCCP clustering method fully segments rapeseed leaves at the bolting stage. The proposed method provides insights to improve the accuracy of subsequent point cloud phenotypic parameter extraction, such as rapeseed leaf area, and is beneficial for the 3D reconstruction of rapeseed. Full article
(This article belongs to the Special Issue Unmanned Farms in Smart Agriculture)
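The denoising pipeline summarized above (plane fitting plus pass-through filtering for the ground, then statistical filtering for scan outliers) can be sketched with a generic point cloud library, as below. Open3D is used here purely as a stand-in, since the paper's tooling is not stated; all thresholds are illustrative, and the region-growing and LCCP segmentation steps are omitted.

```python
import numpy as np
import open3d as o3d  # stand-in library; not necessarily what the authors used

def denoise_plot_cloud(pcd: o3d.geometry.PointCloud,
                       z_min: float = 0.02, z_max: float = 1.0,
                       nb_neighbors: int = 50, std_ratio: float = 1.0):
    """Sketch of the described denoising steps: plane fitting plus pass-through
    filtering for ground noise, then statistical filtering for scan outliers.
    All thresholds are illustrative, not the paper's values."""
    # 1. Fit the dominant (ground) plane with RANSAC and discard its inliers.
    _, ground_idx = pcd.segment_plane(distance_threshold=0.02,
                                      ransac_n=3, num_iterations=1000)
    non_ground = pcd.select_by_index(ground_idx, invert=True)

    # 2. Pass-through filter: keep only points whose z value lies in [z_min, z_max]
    #    (assumes the cloud has been leveled so that z is height above the plot).
    mn, mx = non_ground.get_min_bound(), non_ground.get_max_bound()
    box = o3d.geometry.AxisAlignedBoundingBox(np.array([mn[0], mn[1], z_min]),
                                              np.array([mx[0], mx[1], z_max]))
    cropped = non_ground.crop(box)

    # 3. Statistical outlier removal: k nearest neighbors and a std-dev multiple,
    #    mirroring the (k, alpha) parameters explored in the study.
    cleaned, _ = cropped.remove_statistical_outlier(nb_neighbors=nb_neighbors,
                                                    std_ratio=std_ratio)
    return cleaned
```

The subsequent region-growing and LCCP clustering steps are available in PCL (pcl::RegionGrowing and pcl::LCCPSegmentation); the (k, α) pair passed to the statistical filter corresponds to the nearest-neighbor number and standard deviation multiple examined in the paper.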
Figure 1. Diagram of the scanning process in the field: (a) a smart phenotyping platform; (b) a sidebar camera; (c) a data flow diagram.
Figure 2. The colored point cloud diagrams: (a) cross-section of the point clouds; (b) three-dimensional (3D) view of rapeseed.
Figure 3. The point cloud image after fitting the plane.
Figure 4. Illustration of the Extended Convexity Criterion (CC) theory.
Figure 5. Pass-through filtering effect diagram: (a) the original point cloud image of the rapeseed plot; (b) the point cloud image of rapeseed after pass-through filtering.
Figure 6. The relationship between the number of removed points and the standard deviation multiple under various nearest-neighbor numbers.
Figure 7. The denoising results for the point cloud image of rapeseed after statistical filtering: (a) k = 5, α = 0.01; (b) k = 100, α = 0.01; (c) k = 5, α = 0.5; (d) k = 5, α = 5.
Figure 8. Segmentation results of a single rapeseed plant based on region growing under curvature values of (a) 0.5, (b) 1.0, and (c) 1.5.
Figure 9. Evaluation of the leaf area accuracy of Huyou 039.
Figure 10. Segmentation results of rapeseed leaves at the bolting stage using (a) the region-growing algorithm and (b) the LCCP algorithm.
Figure 11. The point cloud of overlapping leaves: (a) the red circle highlights the overlapping region; (b) an enlarged view of this overlapping area.
">
15 pages, 18148 KiB  
Article
Fast 3D Transmission Tower Detection Based on Virtual Views
by Liwei Zhou, Jiaying Tan, Jing Fu and Guiwei Shao
Appl. Sci. 2025, 15(2), 947; https://doi.org/10.3390/app15020947 - 19 Jan 2025
Viewed by 541
Abstract
Advanced remote sensing technologies leverage extensive synthetic aperture radar (SAR) satellite data and high-resolution airborne light detection and ranging (LiDAR) data to swiftly capture comprehensive 3D information about electrical grid assets and their surrounding environments. This facilitates in-depth scene analysis for target detection and classification, allowing for the early recognition of potential hazards near transmission towers (TTs). These innovations present a groundbreaking strategy for the automated inspection of electrical grid assets. However, traditional 3D target detection techniques, which involve searching the entire 3D space, are marred by low accuracy and high computational demands. Although deep learning-based 3D target detection methods have significantly improved detection precision, they rely on a large volume of 3D target samples for training and are sensitive to point cloud data density. Moreover, these methods demonstrate low detection efficiency, constraining their application in the automated monitoring of electricity networks. This paper proposes a fast 3D target detection method using virtual views to overcome these challenges related to detection accuracy and efficiency. The method first utilizes cutting-edge 2D splatting technology to project 3D point clouds with diverse densities from a specific viewpoint, generating a 2D virtual image. Then, a novel local–global dual-path feature fusion network based on YOLO is applied to detect TTs on the virtual image, ensuring efficient and accurate identification of their positions and types. Finally, by leveraging the projection transformation between the virtual image and the 3D point cloud, combined with a 3D region growing algorithm, the 3D points belonging to the TTs are extracted from the whole 3D point cloud. The effectiveness of the proposed method in terms of target detection rate and efficiency is validated through experiments on synthetic datasets and outdoor LiDAR point clouds. Full article
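The first stage described above, turning a 3D point cloud into a 2D virtual image from a chosen viewpoint, can be approximated with a plain pinhole projection and a z-buffer, as in the sketch below. This is a simplified stand-in for the paper's 2D splatting renderer; the camera parameters and function signature are illustrative assumptions.

```python
import numpy as np

def render_virtual_view(points: np.ndarray, values: np.ndarray,
                        R: np.ndarray, t: np.ndarray,
                        f: float = 400.0, size: int = 512) -> np.ndarray:
    """Project a LiDAR point cloud into a 2D virtual image from a chosen viewpoint.

    points: (N, 3) XYZ coordinates in the world frame.
    values: (N,) per-point attribute used as the pixel value (e.g., intensity or height).
    R, t:   rotation (3x3) and translation (3,) of the virtual camera (world -> camera).
    A plain pinhole projection with a z-buffer stands in for the paper's 2D splatting;
    the focal length and image size are arbitrary illustrative choices.
    """
    cam = points @ R.T + t                      # world -> camera coordinates
    keep = cam[:, 2] > 0.1                      # keep points in front of the camera
    cam, vals = cam[keep], values[keep]
    u = np.round(f * cam[:, 0] / cam[:, 2] + size / 2).astype(int)
    v = np.round(f * cam[:, 1] / cam[:, 2] + size / 2).astype(int)
    inside = (u >= 0) & (u < size) & (v >= 0) & (v < size)
    u, v, z, vals = u[inside], v[inside], cam[inside, 2], vals[inside]

    image = np.zeros((size, size), dtype=np.float32)
    zbuf = np.full((size, size), np.inf, dtype=np.float32)
    for ui, vi, zi, val in zip(u, v, z, vals):  # keep the nearest point per pixel
        if zi < zbuf[vi, ui]:
            zbuf[vi, ui] = zi
            image[vi, ui] = val
    return image
```

Recording which point wins each pixel (alongside the z-buffer) yields the pixel-to-point mapping that allows 2D detections on the virtual image to be transferred back into the 3D cloud, where region growing can then complete the extraction.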
Figure 1. Overall pipeline of the proposed method. The detected TTs are highlighted in red rectangles in the virtual image and are rendered in red within the 3D point cloud.
Figure 2. Comparison of two projection strategies on the TT.
Figure 3. Illustration of a portion of a generated virtual view.
Figure 4. Local–global dual-path feature fusion network.
Figure 5. Illustration of a TT extracted from the LiDAR data.
Figure 6. Seven selected scenarios from TTPLA. All TTs are photographed from different angles.
Figure 7. Example of airborne LiDAR data.
Figure 8. Illustration of 2D detection results on airborne LiDAR data.
Figure 9. Illustration of 3D detection results on airborne LiDAR data.
">