
Search Results (692)

Search Parameters:
Keywords = point cloud classification

24 pages, 2264 KiB  
Review
Transforming Architectural Digitisation: Advancements in AI-Driven 3D Reality-Based Modelling
by Kai Zhang and Francesco Fassi
Heritage 2025, 8(2), 81; https://doi.org/10.3390/heritage8020081 - 18 Feb 2025
Abstract
The capture of 3D reality has demonstrated increased efficiency and consistently accurate outcomes in architectural digitisation. Nevertheless, despite advancements in data collection, 3D reality-based modelling still lacks full automation, especially in the post-processing and modelling phase. Artificial intelligence (AI) has been a significant focus, especially in computer vision, and tasks such as image classification and object recognition might be beneficial for the digitisation process and its subsequent utilisation. This study aims to examine the potential outcomes of integrating AI technology into the field of 3D reality-based modelling, with a particular focus on its use in architecture and cultural-heritage scenarios. The main methods used for data collection are laser scanning (static or mobile) and photogrammetry. As a result, image data, including RGB-D data (files containing both RGB colours and depth information) and point clouds, have become the most common raw datasets available for object mapping. This study comprehensively analyses the current use of 2D and 3D deep learning techniques in documentation tasks, particularly downstream applications. It also highlights the ongoing research efforts in developing real-time applications with the ultimate objective of achieving generalisation and improved accuracy. Full article
(This article belongs to the Section Architectural Heritage)
Figure 1: A typical pipeline of DL for object detection.
Figure 2: Chronological overview of 2D-image object detection algorithms, involving convolutional networks [6,16,17,18], hand-engineered features [31,32,33,34,35], two-stage and one-stage detectors [1,19,20,35,36,37,38,39], and attention-based detectors [4,40,41,42].
Figure 3: Chronological overview of 3D object detection algorithms, involving deep learning methods for point cloud [2,3,47–58] and RGB-D data [5,65,69–77] processing.
Figure 4: The proposed semantic photogrammetric pipeline in the work of Stathopoulou et al. [105].
Figure 5: MLMR classification levels (capital details) for the Milan Cathedral from Teruggi et al. [114].
28 pages, 25975 KiB  
Article
Analysis of the Qualitative Parameters of Mobile Laser Scanning for the Creation of Cartographic Works and 3D Models for Digital Twins of Urban Areas
by Ľudovít Kovanič, Patrik Peťovský, Branislav Topitzer, Peter Blišťan and Ondrej Tokarčík
Appl. Sci. 2025, 15(4), 2073; https://doi.org/10.3390/app15042073 - 16 Feb 2025
Viewed by 336
Abstract
This article focuses on the assessment of point clouds obtained by various laser scanning methods as a tool for 3D mapping and Digital Twin concepts. The presented research employed terrestrial and mobile laser scanning methods to obtain high-precision spatial data, enabling efficient spatial documentation of urban structures and infrastructure. As a reference method, static terrestrial laser scanning (TLS) was chosen. Mobile laser scanning (MLS) data obtained by devices such as Lidaretto, the Stonex X120GO laser scanning device, and an iPhone 13 Pro with an Emlid scanning kit and GNSS antenna Reach RX were evaluated. Analyses based on comparing methods of classification, differences in individual objects, detail/density, and noise were performed. The results confirm the high accuracy of the methods and their ability to support the development of digital twins and smart solutions that enhance the efficiency of infrastructure management and planning. Full article
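The comparisons described in this abstract (cloud-to-cloud differences against a TLS reference, point density, and noise) can be reproduced with standard open-source tooling. The sketch below is a minimal illustration using Open3D; the file names, the 1 m density grid, and the outlier-removal parameters are assumptions, not values from the study.

```python
import numpy as np
import open3d as o3d

# Hypothetical file names; the study's actual datasets are not distributed here.
reference = o3d.io.read_point_cloud("tls_reference.ply")   # static TLS scan (reference)
evaluated = o3d.io.read_point_cloud("mls_lidaretto.ply")   # one of the MLS point clouds

# Cloud-to-cloud differences: nearest-neighbour distance from each MLS point
# to the TLS reference approximates the per-object deviation analysis.
distances = np.asarray(evaluated.compute_point_cloud_distance(reference))
print(f"mean deviation: {distances.mean():.3f} m, RMS: {np.sqrt((distances**2).mean()):.3f} m")

# Rough point-density estimate: points per m^2 on a 1 m horizontal grid (top view).
xy = np.asarray(evaluated.points)[:, :2]
cells, counts = np.unique(np.floor(xy).astype(int), axis=0, return_counts=True)
print(f"median density: {np.median(counts):.0f} points/m^2 over {len(cells)} occupied cells")

# Simple noise indicator: statistical outlier removal reports how many points
# deviate strongly from their local neighbourhood (e.g., noise on a wall surface).
_, inlier_idx = evaluated.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
print(f"flagged as noise: {len(evaluated.points) - len(inlier_idx)} points")
```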
Figure 1: Map display of Slovakia showing the city of Žiar nad Hronom (a), display of the orthomosaic of the study area (b), representation of the 3D model of the study area (c) highlighted by red marks.
Figure 2: Surveying equipment used in the study.
Figure 3: Example of a GCP placement for the Leica RTC360 terrestrial laser scanner (a), a CP for the Lidaretto mobile laser scanner (b), and a CP and GCP for the Stonex X120GO mobile laser scanner (c).
Figure 4: Distribution of positions for the TLS survey.
Figure 5: Leica RTC360 terrestrial laser scanner (a), Lidaretto mobile laser scanner placed on various carriers (b), Stonex X120GO handheld laser scanner (c), and a combined setup consisting of an iPhone 13 Pro with an Emlid scanning kit and a GNSS antenna Reach RX (d).
Figure 6: Measurement trajectory using mobile laser scanners Stonex X120GO (a), Lidaretto (b), and iPhone 13 Pro with Emlid scanning kit and GNSS antenna Reach RX (c).
Figure 7: Diagram of the optimized workflow.
Figure 8: The resulting point clouds obtained by the methods under study—3D view and top view of the TLS (a), Lidaretto (b), Stonex X120GO (c), and iPhone 13 Pro with Emlid scanning kit and GNSS antenna Reach RX (d).
Figure 9: Viewing automatic and manual classification on an individual object. Legend: brown—Ground, green—Vegetation class, red—Buildings class, blue—Hardscape class, grey—Unclassified class.
Figure 10: Analysis of the differences in the point clouds—tree trunk.
Figure 11: Analysis of the differences in the point clouds—corners (A–D) of a building.
Figure 12: Analysis of the differences in the point clouds—cross-sections of the mast of a street lamp.
Figure 13: Density of the points per 1 m²—top view of the point clouds.
Figure 14: Histogram showing the point density in the point clouds obtained by different methods.
Figure 15: Noise in the point clouds obtained by the devices under study—an example on the wall of the building.
15 pages, 3658 KiB  
Article
A Hard Negatives Mining and Enhancing Method for Multi-Modal Contrastive Learning
by Guangping Li, Yanan Gao, Xianhui Huang and Bingo Wing-Kuen Ling
Electronics 2025, 14(4), 767; https://doi.org/10.3390/electronics14040767 - 16 Feb 2025
Viewed by 155
Abstract
Contrastive learning has emerged as a dominant paradigm for understanding 3D open-world environments, particularly in the realm of multi-modalities. However, due to the nature of self-supervised learning and the limited size of 3D datasets, pre-trained models in the 3D point cloud domain often suffer from overfitting in downstream tasks, especially in zero-shot classification. To tackle this problem, we design a module to mine and enhance hard negatives from datasets, which are useful to improve the discrimination of models. This module could be seamlessly integrated into cross-modal contrastive learning frameworks, addressing the overfitting issue by enhancing the mined hard negatives during the process of training. This module consists of two key components: mining and enhancing. In the process of mining, we identify hard negative samples by examining similarity relationships between vision–vision and vision–text modalities, locating hard negative pairs within the visual domain. In the process of enhancing, we compute weighting coefficients via the similarity differences of these mined hard negatives. By enhancing the mined hard negatives while leaving others unchanged, we improve the overall performance and discrimination of models. A series of experiments demonstrate that our module can be easily incorporated into various contrastive learning frameworks, leading to improved model performance in both zero-shot and few-shot tasks. Full article
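As a rough illustration of the mining-and-enhancing idea sketched in the abstract (not the authors' exact formulation), the snippet below up-weights the hardest in-batch negatives of a cross-modal InfoNCE-style loss. The top-k mining rule and the fixed boost factor are assumptions standing in for the paper's similarity-difference coefficients.

```python
import torch
import torch.nn.functional as F

def hnme_infonce(img_feat, pc_feat, tau=0.07, top_k=5, boost=1.5):
    """Cross-modal InfoNCE with a simple hard-negative mining/enhancing step.

    img_feat, pc_feat: (B, D) embeddings of paired samples. The mining rule
    (top-k most similar negatives) and the constant boost are illustrative.
    """
    img = F.normalize(img_feat, dim=-1)
    pc = F.normalize(pc_feat, dim=-1)
    sim = img @ pc.t() / tau                      # (B, B) similarity matrix
    b = sim.size(0)
    eye = torch.eye(b, dtype=torch.bool, device=sim.device)

    # Mine: for each anchor, pick the top-k most similar *negative* pairs.
    neg_sim = sim.masked_fill(eye, float("-inf"))
    hard_idx = neg_sim.topk(k=min(top_k, b - 1), dim=1).indices

    # Enhance: up-weight the mined hard negatives, leave the rest unchanged.
    weights = torch.ones_like(sim)
    weights.scatter_(1, hard_idx, boost)
    logits = sim + weights.log()                  # exp(logit + log w) = w * exp(logit)

    targets = torch.arange(b, device=sim.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Because the weighting acts only inside the softmax over negatives, positives are untouched, which mirrors the "enhance the mined hard negatives while leaving others unchanged" idea.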
Figure 1: The modified framework of CLIP2Point. We add a textual branch and apply HNME to cross-modal contrastive learning during pre-training.
Figure 2: The framework of OpenShape. Two cross-modal similarity matrices are computed, and our HNME method is applied to the cross-modalities between images and point clouds. Snowflakes and sparks respectively indicate that the encoder parameters are frozen and learnable during training. Pink, green, blue, and purple boxes represent image, text, and point cloud features and positive sample pair similarity, respectively.
Figure 3: Qualitative process of mining and enhancing hard negatives. (a) indicates the anchor and its positive, false, and true negative samples. After step 1, (b) circles the candidate hard negative samples with a dotted box, but they are not all true hard negatives. So we identify the true ones in step 2, as shown in (c); the final mined hard negatives are circled by solid boxes. After enhancing in (d), they become closer to the anchor in the feature space.
Figure 4: The judgment accuracy of the initial and trained models with different δ. The x axis is the value of δ; as δ decreases, accuracies gradually increase and tend to be stable.
Figure 5: (a) shows that the rate of similarity difference between positive and negative sample pairs varies with the similarity of the negative sample pair. After calculating the exponents, the coefficient is above 1, as shown in (b).
Figure 6: These heatmaps indicate sample similarity relationships in a batch before and after enhancing hard negatives. (a) shows the original cosine similarities between image–depth pairs. (b) indicates the similarity relationships after enhancing; the brighter parts are the similarities of enhanced negatives. The legend in the rightmost column indicates the colors of different similarities, which are expanded by the temperature coefficient τ.
24 pages, 11349 KiB  
Article
Multi-Size Voxel Cube (MSVC) Algorithm—A Novel Method for Terrain Filtering from Dense Point Clouds Using a Deep Neural Network
by Martin Štroner, Martin Boušek, Jakub Kučera, Hana Váchová and Rudolf Urban
Remote Sens. 2025, 17(4), 615; https://doi.org/10.3390/rs17040615 - 11 Feb 2025
Viewed by 336
Abstract
When filtering highly rugged terrain from dense point clouds (particularly in technical applications such as civil engineering), the most widely used filtering approaches yield suboptimal results. Here, we proposed and tested a novel ground-filtering algorithm, a multi-size voxel cube (MSVC), utilizing a deep neural network. This is based on the voxelization of the point cloud, the classification of individual voxels as ground or non-ground using surrounding voxels (a “voxel cube” of 9 × 9 × 9 voxels), and the gradual reduction in voxel size, allowing the acquisition of custom-level detail and highly rugged terrain from dense point clouds. The MSVC performance on two dense point clouds, capturing highly rugged areas with dense vegetation cover, was compared with that of the widely used cloth simulation filter (CSF) using manually classified terrain as the reference. MSVC consistently outperformed the CSF filter in terms of the correctly identified ground points, correctly identified non-ground points, balanced accuracy, and the F-score. Another advantage of this filter lay in its easy adaptability to any type of terrain, enabled by the utilization of machine learning. The only disadvantage lay in the necessity to manually prepare training data. On the other hand, we aim to account for this in the future by producing neural networks trained for individual landscape types, thus eliminating this phase of the work. Full article
(This article belongs to the Special Issue New Perspectives on 3D Point Cloud (Third Edition))
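To make the voxel-cube idea concrete, the following sketch (a simplified assumption-based illustration, not the released MSVC code) voxelises a point cloud and extracts the 9 × 9 × 9 block of per-voxel point counts around one voxel, which is the kind of input the deep network would classify as ground or non-ground before the voxel size is reduced in the next pass.

```python
import numpy as np

def voxelize_counts(points, voxel_size):
    """Return a dense 3D grid of per-voxel point counts and the grid origin."""
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / voxel_size).astype(int)
    dims = idx.max(axis=0) + 1
    grid = np.zeros(dims, dtype=np.int32)
    np.add.at(grid, tuple(idx.T), 1)               # count points per voxel
    return grid, origin

def voxel_cube(grid, center, size=9):
    """Extract the size^3 block of counts centred on `center` (zero-padded)."""
    half = size // 2
    padded = np.pad(grid, half, mode="constant")
    c = np.asarray(center) + half
    return padded[c[0]-half:c[0]+half+1,
                  c[1]-half:c[1]+half+1,
                  c[2]-half:c[2]+half+1]

# Example with random points and a 2 m voxel size (illustrative values only).
pts = np.random.rand(100_000, 3) * np.array([100.0, 100.0, 30.0])
grid, origin = voxelize_counts(pts, voxel_size=2.0)
cube = voxel_cube(grid, center=(10, 10, 5))        # 9x9x9 input for the classifier
print(grid.shape, cube.shape)
```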
Figure 1: A 2D illustration of the point cloud (profile) and its voxelization to 2 × 2 × 2 m voxels. Individual dots represent the centers of the voxels, color-coded to represent the number of points in the voxel (see the color bar). The central red square indicates the evaluated voxel, and the large orange square indicates the entire area used for its evaluation (2D representation of the voxel cube). (a) shows the overall view, (b) the detail.
Figure 2: A 2D illustration of the progressive reduction in vegetation with a gradual reduction in the voxel size (color-coding indicates the number of points in the voxel relative to the most populated voxel; grey indicates voxels with no points; and the greyed-out part of the point cloud indicates the points removed in previous steps). (a) A voxel size of 3.38 m; (b) a voxel size of 1.90 m; (c) a voxel size of 1.42 m; (d) a voxel size of 0.6 m.
Figure 3: (a) The misclassification of voxels with low numbers of points (marked with red arrows) as non-ground and (b) the solution to this problem through the use of the additional shifted grid (blue lines); voxels classified as ground in any of the grids (thick lines) are considered ground and carried forward to the next step.
Figure 4: Gradual filtering with stepwise reduction in the voxel size: (a) original point cloud; (b) step 2 (voxel size 4.5 m); (c) step 5 (voxel size 1.9 m); (d) step 15—final result (voxel size 0.11 m).
Figure 5: Flowchart of the multi-size voxel cube (MSVC) algorithm.
Figure 6: Data 1 with the vegetation color-coded according to the vegetation height: (a) training data, (b) test data; note that the training data contain all types of terrain as well as the vegetation character present in the test data.
Figure 7: Data 2—training area (a,b) and the testing areas Boulders (c,d), Tower (e,f), and Rugged (g,h).
Figure 8: Data 1—best classification results: (a) CSF (cloth resolution 2.5 cm; threshold 25 cm); (b) MSVC (voxel size 6 cm); (c) detail of the CSF classification; (d) detail of the same area classified by MSVC; the color-coded points indicate erroneously preserved vegetation, along with its height.
Figure 9: Classification success for Data 2—Boulders: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green.
Figure 10: Classification success for Data 2—Tower: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green. Blue ovals indicate areas with the biggest differences in the performance of the filters, where CSF identified more points falsely as ground.
Figure 11: Classification success for Data 2—Rugged: (a) CSF classification and (b) MSVC classification, with points erroneously classified as ground highlighted in red; (c) CSF classification, with points correctly identified by CSF but not by MSVC highlighted in green; (d) MSVC classification, with points correctly identified by MSVC but not by CSF highlighted in green. Colored ovals indicate areas with the biggest differences in the performance of the filters.
Figure 12: The terrain model of the Data 2—Tower area with buildings shown; note that no buildings were present in the training data.
Figure A1: Data 2—location of individual data in the area: (a) Data 2 Training, (b) Data 2 Boulders, (c) Data 2 Tower, (d) Data 2 Rugged.
21 pages, 16141 KiB  
Article
The Development of a Sorting System Based on Point Cloud Weight Estimation for Fattening Pigs
by Luo Liu, Yangsen Ou, Zhenan Zhao, Mingxia Shen, Ruqian Zhao and Longshen Liu
Agriculture 2025, 15(4), 365; https://doi.org/10.3390/agriculture15040365 - 8 Feb 2025
Viewed by 405
Abstract
As large-scale and intensive fattening pig farming has become mainstream, the increase in farm size has led to more severe issues related to the hierarchy within pig groups. Due to genetic differences among individual fattening pigs, those that grow faster enjoy a higher social rank. Larger pigs with greater aggression continuously acquire more resources, further restricting the survival space of weaker pigs. Therefore, fattening pigs must be grouped rationally, and the management of weaker pigs must be enhanced. This study, considering current fattening pig farming needs and actual production environments, designed and implemented an intelligent sorting system based on weight estimation. The main hardware structure of the partitioning equipment includes a collection channel, partitioning channel, and gantry-style collection equipment. Experimental data were collected, and the original scene point cloud was preprocessed to extract the back point cloud of fattening pigs. Based on the morphological characteristics of the fattening pigs, the back point cloud segmentation method was used to automatically extract key features such as hip width, hip height, shoulder width, shoulder height, and body length. The segmentation algorithm first calculates the centroid of the point cloud and the eigenvectors of the covariance matrix to reconstruct the point cloud coordinate system. Then, based on the variation characteristics and geometric shape of the consecutive horizontal slices of the point cloud, hip width and shoulder width slices are extracted, and the related features are calculated. Weight estimation was performed using Random Forest, Multilayer perceptron (MLP), linear regression based on the least squares method, and ridge regression models, with parameter tuning using Bayesian optimization. The mean squared error, mean absolute error, and mean relative error were used as evaluation metrics to assess the model’s performance. Finally, the classification capability was evaluated using the median and average weights of the fattening pigs as partitioning standards. The experimental results show that the system’s average relative error in weight estimation is approximately 2.90%, and the total time for the partitioning process is less than 15 s, which meets the needs of practical production. Full article
(This article belongs to the Special Issue Modeling of Livestock Breeding Environment and Animal Behavior)
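The coordinate-system reconstruction step described above (centroid plus covariance eigenvectors) is essentially a PCA alignment of the back point cloud, after which slices along the body axis can be profiled. The sketch below is a minimal interpretation under that assumption; the axis ordering, slice count, and width definition are illustrative, not the paper's exact conventions.

```python
import numpy as np

def reconstruct_coordinates(points):
    """Re-express a back point cloud in axes defined by its covariance eigenvectors.

    Assumption: the largest-variance axis is treated as body length, the second
    as body width, and the smallest as height, as in a standard PCA alignment.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = np.cov(centered, rowvar=False)                 # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending eigenvalues
    axes = eigvecs[:, ::-1]                              # length, width, height order
    return centered @ axes, centroid, axes

def slice_widths(aligned, n_slices=50):
    """Cross-axis span of consecutive slices along the body-length axis, the kind
    of profile used to locate the shoulder-width and hip-width slices."""
    length = aligned[:, 0]
    edges = np.linspace(length.min(), length.max(), n_slices + 1)
    widths = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        s = aligned[(length >= lo) & (length < hi)]
        widths.append(np.ptp(s[:, 1]) if len(s) else 0.0)
    return np.array(widths)
```

The resulting width profile, together with heights taken from the third axis, would feed the Random Forest or MLP regressors mentioned in the abstract.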
Figure 1: Experimental data collection device diagram.
Figure 2: Pass-through filtering.
Figure 3: Pig back point cloud division.
Figure 4: Coordinate system reconstruction result of the point cloud.
Figure 5: x-axis span of a slice.
Figure 6: Schematic diagram of the operation of the column equipment.
Figure 7: A 3D diagram of the column device.
Figure 8: Hardware connection diagram.
Figure 9: Relationship between 'eps' and 'min_points' and the number of running hours and categories.
Figure 10: DBSCAN clustering of different 'eps' and 'min_points' values.
Figure 11: DBSCAN clustering and voxel downsampling effect.
Figure 12: Scatter plot of redundant and normal samples.
Figure 13: Model test results and error comparison.
Figure 14: Operation display of sorting equipment and system platform.
27 pages, 3505 KiB  
Article
DeepDR: A Two-Level Deep Defect Recognition Framework for Meteorological Satellite Images
by Xiangang Zhao, Xiangyu Chang, Cunqun Fan, Manyun Lin, Lan Wei and Yunming Ye
Remote Sens. 2025, 17(4), 585; https://doi.org/10.3390/rs17040585 - 8 Feb 2025
Viewed by 290
Abstract
Raw meteorological satellite images often suffer from defects such as noise points and lines due to atmospheric interference and instrument errors. Current solutions typically rely on manual visual inspection to identify these defects. However, manual inspection is labor-intensive, lacks uniform standards, and is prone to both false positives and missed detections. To address these challenges, we propose DeepDR, a two-level deep defect recognition framework for meteorological satellite images. DeepDR consists of two modules: a transformer-based noise image classification module for the first level and a noise region segmentation module based on a pseudo-label training strategy for the second level. This framework enables the automatic identification of defective cloud images and the detection of noise points and lines, thereby significantly improving the accuracy of defect recognition. To evaluate the effectiveness of DeepDR, we have collected and released two satellite cloud image datasets from the FengYun-1 satellite, which include noise points and lines. Subsequently, we conducted comprehensive experiments to demonstrate the superior performance of our approach in addressing the satellite cloud image defect recognition problem. Full article
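The two-level design (image-level defect classification first, pixel-level noise segmentation only when a defect is flagged) can be wired together as in the sketch below. Both sub-models are hypothetical stand-ins for the paper's transformer classifier and pseudo-label-trained segmenter, and the 0.5 threshold is an assumption.

```python
import torch
import torch.nn as nn

class TwoLevelDefectRecognizer(nn.Module):
    """Level 1 decides whether a satellite cloud image is defective; level 2
    localises the noise points/lines. The sub-models passed in are placeholders."""

    def __init__(self, classifier: nn.Module, segmenter: nn.Module, threshold: float = 0.5):
        super().__init__()
        self.classifier = classifier   # assumed to return one defect logit per image
        self.segmenter = segmenter     # assumed to return per-pixel noise logits
        self.threshold = threshold

    @torch.no_grad()
    def forward(self, image: torch.Tensor):
        # image: (1, C, H, W) single cloud image
        p_defect = torch.sigmoid(self.classifier(image)).item()
        if p_defect < self.threshold:
            return {"defective": False, "mask": None}
        mask = torch.sigmoid(self.segmenter(image)) > self.threshold   # (1, 1, H, W)
        return {"defective": True, "mask": mask}
```

Skipping segmentation for images the classifier accepts as clean is what keeps the second, more expensive level off the common case.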
Figure 1: Some examples of satellite cloud images with noise points and lines. The noise points are marked with red boxes. In (a,b), satellite cloud images with noise points are displayed, while (c,d) show images with noise lines.
Figure 2: Illustration of the proposed framework DeepDR.
Figure 3: Illustration of the Transformer-based noise image classifier.
Figure 4: Illustration of the proposed pseudo-label-based noise region segmentation.
Figure 5: Some example images of the noise point dataset.
Figure 6: Some example images of the noise line dataset.
Figure 7: A selection of results from the noise points experiment. (a,b) are remote sensing satellite images containing noise, and (c,d) are normal remote sensing satellite images.
Figure 8: A selection of results from the noise lines experiment. (a,b) are remote sensing satellite images containing lines, and (c,d) are normal remote sensing satellite images.
Figure 9: The precision of all methods on normal and noise point images.
Figure 10: The recall of all methods on normal and noise point images.
Figure 11: The F1 score for all methods on normal and noise point images.
Figure 12: The precision of all methods on normal and noise line images.
Figure 13: The recall of all methods on normal and noise line images.
Figure 14: The F1 score for all methods on normal and noise line images.
Figure 15: The visualization results of image segmentation methods for meteorological satellite images containing noise points. The noise points are marked with red boxes.
Figure 16: The visualization results of image segmentation methods for meteorological satellite images containing noise lines.
25 pages, 6553 KiB  
Article
Tree Species Classification Based on Point Cloud Completion
by Haoran Liu, Hao Zhong, Guangqiang Xie and Ping Zhang
Forests 2025, 16(2), 280; https://doi.org/10.3390/f16020280 - 6 Feb 2025
Viewed by 341
Abstract
LiDAR is an active remote sensing technology widely used in forestry applications, such as forest resource surveys, tree information collection, and ecosystem monitoring. However, due to the resolution limitations of 3D-laser scanners and the canopy occlusion in forest environments, the tree point clouds obtained often have missing data. This can reduce the accuracy of individual tree segmentation, which subsequently affects the tree species classification. To address the issue, this study used point cloud data with RGB information collected by the UAV platform to improve tree species classification by completing the missing point clouds. Furthermore, the study also explored the effects of point cloud completion, feature selection, and classification methods on the results. Specifically, both a traditional geometric method and a deep learning-based method were used for point cloud completion, and their performance was compared. For the classification of tree species, five machine learning algorithms—Random Forest (RF), Support Vector Machine (SVM), Back Propagation Neural Network (BPNN), Quadratic Discriminant Analysis (QDA), and K-Nearest Neighbors (KNN)—were utilized. This study also ranked the importance of features to assess the impact of different algorithms and features on classification accuracy. The results showed that the deep learning-based completion method provided the best performance (avgCD = 6.14; avgF1 = 0.85), generating more complete point clouds than the traditional method. On the other hand, compared with SVM and BPNN, RF showed better performance in dealing with multi-classification tasks with limited training samples (OA-87.41%, Kappa-0.85). Among the six dominant tree species, Pinus koraiensis had the highest classification accuracy (93.75%), while that of Juglans mandshurica was the lowest (82.05%). In addition, the vegetation index and the tree structure parameter accounted for 50% and 30%, respectively, in the top 10 features in terms of feature importance. The point cloud intensity also had a high contribution to the classification results, indicating that the lidar point cloud data can also be used as an important basis for tree species classification. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
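The classification stage described above (Random Forest on per-tree features such as vegetation indices, structural parameters, and point-cloud intensity, plus an importance ranking) can be sketched with scikit-learn as below. The feature table, its column names, and the split ratio are hypothetical placeholders.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

# Hypothetical per-tree feature table: vegetation indices, structure metrics,
# and mean point-cloud intensity, with a species label per segmented tree.
df = pd.read_csv("tree_features.csv")               # assumed file layout
X = df.drop(columns=["species"])
y = df["species"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)
pred = rf.predict(X_te)

print(f"OA: {accuracy_score(y_te, pred):.4f}  Kappa: {cohen_kappa_score(y_te, pred):.4f}")

# Rank feature importance, mirroring the paper's importance analysis.
importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances.head(10))
```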
Figure 1: Location of the study area.
Figure 2: Technical route. The yellow blocks represent the data used, the pink blocks represent the processing steps, and the blue blocks marked with dotted lines represent the parameters or methods.
Figure 3: Schematic diagram of tree crown completion. The yellow part represents the trees in the plots. The red and green parts are the data collected after the tree crowns were shaded, and the dashed parts are the simulated actual complete trees. P1 is the highest point in the point cloud of the missing tree crown in the collected data, P2 is the vertex of the actual complete tree crown, H is the crown height, and h is the vertical distance between P1 and P2. O is the center point of the vertical projection plane of the tree crown. d is the distance between the vertical projection of point P1 and O. D is the radius of the fitted circle. A, B, and C are the edge points on the tree crown.
Figure 4: Point cloud completion network based on GAN. (a) Hybrid pooling. (b) Attention-based feature enhancement (AFE) module.
Figure 5: Visual comparisons on completion of tree point clouds by the traditional method (TM) and deep learning method (DL). (a) Pinus koraiensis, (b) Larix gmelinii, (c) Ulmus pumila, (d) Fraxinus mandshurica, (e) Juglans mandshurica, and (f) Betula platyphylla.
Figure 6: Results of RF-based tree species classification with plots 1–10.
Figure 7: Overall ranking of feature importance.
Figure 8: Distribution of the top 10 features in terms of importance by tree species.
23 pages, 4583 KiB  
Article
Research on Fine-Scale Terrain Construction in High Vegetation Coverage Areas Based on Implicit Neural Representations
by Yi Zhang, Peipei He, Haihang Jing, Bin He, Weibo Yin, Junzhen Meng, Yuntian Ma, Haifeng Zhang, Bo Zhang and Haoxiang Shen
Sustainability 2025, 17(3), 1320; https://doi.org/10.3390/su17031320 - 6 Feb 2025
Viewed by 429
Abstract
Due to the high-density coverage of vegetation, the complexity of terrain, and occlusion issues, ground point extraction faces significant challenges. Airborne Light Detection and Ranging (LiDAR) technology plays a crucial role in complex mountainous areas. This article proposes a method for constructing fine terrain in high vegetation coverage areas based on implicit neural representation. This method consists of data preprocessing, multi-scale and multi-feature high-difference point cloud initial filtering, and an upsampling module based on implicit neural representation. Firstly, the regional point cloud data are preprocessed; then, K-dimensional trees (K-d trees) are used to construct spatial indexes, and spherical neighborhood methods are applied to capture the geometric and physical information of point clouds for multi-feature fusion, enhancing the distinction between terrain and non-terrain elements. Subsequently, a differential model is constructed based on the DSM (Digital Surface Model) at different scales, and the elevation variation coefficient is calculated to determine the threshold for extracting the initial set of ground points. Finally, the upsampling module using implicit neural representation is used to finely process the initial ground point set, providing a complete and uniformly dense ground point set for the subsequent construction of fine terrain. To validate the performance of the proposed method, three sets of point cloud data from mountainous terrain with different features are selected as the experimental area. The experimental results indicate that, from a qualitative perspective, the proposed method significantly improves the classification of vegetation, buildings, and roads, with clear boundaries between different types of terrain. From a quantitative perspective, the Type I errors of the three selected regions are 4.3445%, 5.0623%, and 5.9436%, respectively; the Type II errors are 5.7827%, 6.8516%, and 7.3478%, respectively; and the overall errors are 5.3361%, 6.4882%, and 6.7168%, respectively. The Kappa coefficients of the measurement areas all exceed 80%, indicating that the proposed method performs well in complex mountainous environments. The method provides point cloud data support for the construction of wind and photovoltaic bases in China, reduces the potential damage to the ecological environment caused by construction activities, and contributes to the sustainable development of ecology and energy.
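The K-d tree and spherical-neighborhood step described above can be prototyped with SciPy; the 2 m radius and the particular per-point features below are illustrative assumptions, not the paper's parameterisation.

```python
import numpy as np
from scipy.spatial import cKDTree

def spherical_neighborhood_features(points, radius=2.0):
    """Per-point geometric features from a fixed-radius (spherical) neighbourhood.

    Returns neighbour count, local height range, and a planarity-like ratio of
    covariance eigenvalues -- simple cues for separating terrain from vegetation.
    """
    tree = cKDTree(points)                        # spatial index over the cloud
    neighbors = tree.query_ball_point(points, r=radius)
    feats = np.zeros((len(points), 3))
    for i, idx in enumerate(neighbors):
        nb = points[idx]
        z = nb[:, 2]
        feats[i, 0] = len(idx)                    # local point density
        feats[i, 1] = z.max() - z.min()           # local height variation
        if len(idx) >= 3:
            w = np.linalg.eigvalsh(np.cov(nb.T))  # ascending eigenvalues
            feats[i, 2] = w[0] / max(w.sum(), 1e-9)   # ~0 for planar ground patches
    return feats
```

Features of this kind would then be fused with the multi-scale DSM difference and elevation variation coefficient to pick the initial ground set before upsampling.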
Figure 1: Overview of the location of the wind and photovoltaic project in the experimental area.
Figure 2: Flowchart of the fine point cloud filtering method for dense vegetation coverage in complex mountainous areas.
Figure 3: Multi-feature neighborhood construction model. In the K-d tree, the red, green, and blue lines divide the space of the cube into two, four, and eight parts, respectively; the last eight subspaces are leaf nodes. In the spherical neighborhood map, black dots are the current point, blue dots are points within the neighborhood of the current point, and the remaining points are terrain points in the neighborhood of the previous point.
Figure 4: Application of the implicit neural representation upsampling module in the processing of point clouds in complex mountainous terrain.
Figure 5: Results obtained with different upsampling scales for the same input.
Figure 6: 4× upsampling point cloud data results.
Figure 7: Results of processing the point cloud data of Area c.
Figure 8: The DEM of the complex mountainous terrain generated after processing with the proposed method.
Figure 9: Point cloud image and DEM for Area b.
Figure 10: Maps of Areas c, d, and e, along with their corresponding DEMs.
18 pages, 14896 KiB  
Article
Deep Learning-Based Point Cloud Classification of Obstacles for Intelligent Vehicles
by Yiqi Xu, Dengke Wu, Mengfei Zhou and Jiafu Yang
World Electr. Veh. J. 2025, 16(2), 80; https://doi.org/10.3390/wevj16020080 - 5 Feb 2025
Viewed by 508
Abstract
Intelligent driving research has focused much attention on point cloud obstacles since they are a class of high-dimensional data that can adequately depict the shape and placement of obstacles, unlike picture data. Currently, deep learning technology is primarily employed for vehicle autonomy point cloud obstacle classification tasks. These techniques typically struggle with low classification accuracy, processing efficiency, and model stability. To tackle the abovementioned issues, this paper suggests a novel random forest algorithm that integrates the out-of-bag error theory and can consistently and accurately evaluate the influence of point cloud properties. Then, building on the novel algorithm, this paper suggests a modified PointNet network that incorporates the effects of both global and local features on the classification task, therefore increasing the conventional network’s classification accuracy. To assess the effectiveness of this novel approach in the experimental portion, we set up an evaluation system based on the metrics for average accuracy, overall accuracy, and a confusion matrix. According to the simulation results, the overall accuracy of the proposed network in terms of classification accuracy is 94.4% and the average accuracy is 84.9%, which are then compared to the prototype PointNet and its variants. The classification accuracies for the four types of obstacles are 97.6%, 63.6%, 92.5%, and 86.1%. In addition, the proposed method is effective at improving both the computational complexity and stability of the network. Full article
(This article belongs to the Special Issue Deep Learning Applications for Electric Vehicles)
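The out-of-bag component of the RF-OOB idea can be approximated with scikit-learn's built-in OOB machinery, as in this hedged sketch (impurity-based importance on an OOB-scored forest, not the authors' exact weighting); the feature matrix of point-cloud descriptors is synthetic stand-in data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: rows = point-cloud samples, columns = handcrafted descriptors
# (e.g. height statistics, intensity, eigenvalue-based shape features).
X, y = make_classification(n_samples=2000, n_features=12, n_informative=6,
                           n_classes=4, random_state=0)

# oob_score=True evaluates each tree on the samples it did not see during
# bootstrapping, giving the out-of-bag error that the RF-OOB assessment builds on.
rf = RandomForestClassifier(n_estimators=300, oob_score=True, bootstrap=True, random_state=0)
rf.fit(X, y)
print(f"OOB accuracy: {rf.oob_score_:.3f} (OOB error: {1 - rf.oob_score_:.3f})")

# Impurity-based importance as a proxy for deciding which point-cloud features
# to emphasise in the modified PointNet; the paper's RF-OOB ranking may differ.
ranking = np.argsort(rf.feature_importances_)[::-1]
for r in ranking[:5]:
    print(f"feature {r}: importance {rf.feature_importances_[r]:.3f}")
```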
Figure 1: Technology roadmap for the ITS.
Figure 2: The schematic diagram of the algorithm to generate a decision tree forest.
Figure 3: Schematic diagram of the RF-OOB algorithm.
Figure 4: The model architecture diagram of the conventional PointNet.
Figure 5: The structural model of the optimized OOB-PointNet.
Figure 6: Point cloud of four types of obstacles.
Figure 7: Histogram of feature importance assessment.
Figure 8: Confusion matrix plots for the three classifiers.
Figure 9: Plot of the experimental results of the overall classification accuracy for the three classifiers.
Figure 10: Experimental diagram of 200 epochs of training for five types of networks.
Figure 11: Diagram of each network categorizing each frame of the point cloud.
Figure 12: Classification accuracy of the networks for different point cloud densities.
Figure 13: Classification accuracy of the networks for each type of obstacle at different point cloud densities.
29 pages, 15780 KiB  
Article
Assessing Lightweight Folding UAV Reliability Through a Photogrammetric Case Study: Extracting Urban Village’s Buildings Using Object-Based Image Analysis (OBIA) Method
by Junyu Kuang, Yingbiao Chen, Zhenxiang Ling, Xianxin Meng, Wentao Chen and Zihao Zheng
Drones 2025, 9(2), 101; https://doi.org/10.3390/drones9020101 - 29 Jan 2025
Viewed by 522
Abstract
With the rapid advancement of drone technology, modern drones have achieved high levels of functional integration, alongside structural improvements that include lightweight, compact designs with foldable features, greatly enhancing their flexibility and applicability in photogrammetric applications. Nevertheless, limited research currently explores data collected by such compact UAVs, and whether they can balance a small form factor with high data quality remains uncertain. To address this challenge, this study acquired the remote sensing data of a peri-urban area using the DJI Mavic 3 Enterprise and applied Object-Based Image Analysis (OBIA) to extract high-density buildings. It was found that this drone offers high portability, a low operational threshold, and minimal regulatory constraints in practical applications, while its captured imagery provides rich textural details that clearly depict the complex surface features in urban villages. To assess the accuracy of the extraction results, the visual comparison between the segmentation outputs and airborne LiDAR point clouds captured by the DJI M300 RTK was performed, and classification performance was evaluated based on confusion matrix metrics. The results indicate that the boundaries of the segmented objects align well with the building edges in the LiDAR point cloud. The classification accuracy of the three selected algorithms exceeded 80%, with the KNN classifier achieving an accuracy of 91% and a Kappa coefficient of 0.87, which robustly demonstrate the reliability of the UAV data and validate the feasibility of the proposed approach in complex cases. As a practical case reference, this study is expected to promote the wider application of lightweight UAVs across various fields. Full article
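The accuracy assessment described above (confusion matrix, overall accuracy, and Kappa for the classified objects against reference samples) is straightforward to reproduce; the label vectors below are placeholders standing in for the reference data and the KNN predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

# Placeholder arrays: 1 = building, 0 = non-building, per validation object/sample.
reference = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 1])
predicted = np.array([1, 1, 0, 0, 0, 0, 1, 1, 1, 1])

cm = confusion_matrix(reference, predicted)
oa = accuracy_score(reference, predicted)        # overall accuracy (OA)
kappa = cohen_kappa_score(reference, predicted)  # chance-corrected agreement

print("confusion matrix:\n", cm)
print(f"overall accuracy: {oa:.2f}, Kappa: {kappa:.2f}")
```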
Figure 1: Framework of this study.
Figure 2: Overview of the study area: (a) Location of Guangzhou Higher Education Mega Center within Guangzhou City; (b) DOM of Guangzhou Higher Education Mega Center; (c) DOM of Beiting Village. Images (b) and (c) were sourced from remote sensing imagery collected by the research team using a fixed-wing UAV, with a resolution of 0.2 m.
Figure 3: Mavic 3 Enterprise with RTK module stored in a portable case, including six batteries, charging components, and spare parts.
Figure 4: DJI M300 RTK equipped with the GreenValley LiAir X3-H LiDAR system.
Figure 5: Flight path planning on DJI Pilot 2.
Figure 6: MRS workflow diagram.
Figure 7: Overall DOM and local detail of the study area.
Figure 8: Overall DSM and local comparison of the study area.
Figure 9: Overall VDVI and local comparison of the study area.
Figure 10: Overall LAS and local detail of the study area.
Figure 11: Determination of shape and compactness using the control variable method: (a) shape set to 0.7; (b) compactness set to 0.8.
Figure 12: ESP2 results.
Figure 13: Scale set to 320.
Figure 14: Comparison of MRS results with hybrid visualization of LAS; (a–d) illustrate the comparison results of four different high-density building areas.
Figure 15: Results of building extraction using the K-Nearest Neighbor (KNN) method.
Figure 16: Ground-based LiDAR equipment and point cloud data: (a) GreenValley LiGirp H120 handheld LiDAR scanning device; (b) overlay of airborne and handheld point cloud data, with the highlighted point cloud in the yellow box representing the range of data captured by the ground-based LiDAR.
Figure 17: Cross-sectional views of the same location: (a) airborne point cloud data slope map; (b) handheld LiDAR point cloud data slope map, with the red box highlighting the narrow alley where data acquisition is challenging.
26 pages, 6721 KiB  
Article
Advanced Detection and Classification of Kelp Habitats Using Multibeam Echosounder Water Column Point Cloud Data
by Amy W. Nau, Vanessa Lucieer, Alexandre C. G. Schimel, Haris Kunnath, Yoann Ladroit and Tara Martin
Remote Sens. 2025, 17(3), 449; https://doi.org/10.3390/rs17030449 - 28 Jan 2025
Viewed by 758
Abstract
Kelps are important habitat-forming species in shallow marine environments, providing critical habitat, structure, and productivity for temperate reef ecosystems worldwide. Many kelp species are currently endangered by myriad pressures, including changing water temperatures, invasive species, and anthropogenic threats. This situation necessitates advanced methods to detect kelp density, which would allow tracking density changes, understanding ecosystem dynamics, and informing evidence-based management strategies. This study introduces an innovative approach to detect kelp density with multibeam echosounder water column data. First, these data are filtered into a point cloud. Then, a range of variables are derived from these point cloud data, including average acoustic energy, volume, and point density. Finally, these variables are used as input to a Random Forest model in combination with bathymetric variables to classify sand, bare rock, sparse kelp, and dense kelp habitats. At 5 m resolution, we achieved an overall accuracy of 72.5% with an overall Area Under the Curve of 0.874. Notably, our method achieved high accuracy across the entire multibeam swath, with only a 1 percent point decrease in model accuracy for data falling within the part of the multibeam water column data impacted by sidelobe artefact noise, which significantly expands the potential of this data type for wide-scale monitoring of threatened kelp ecosystems. Full article
(This article belongs to the Section Ocean Remote Sensing)
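The per-cell water-column variables named in the abstract (mean amplitude, point count, and a volume-style measure) can be derived from a filtered point cloud by simple 2D binning before they are joined with bathymetric layers for the Random Forest. The sketch below uses pandas with hypothetical column names, random stand-in data, and a 5 m cell size.

```python
import numpy as np
import pandas as pd

# Hypothetical filtered water-column point cloud: easting, northing, depth, amplitude (dB).
wc = pd.DataFrame({
    "x": np.random.uniform(0, 100, 50_000),
    "y": np.random.uniform(0, 100, 50_000),
    "z": np.random.uniform(-15, -2, 50_000),
    "amplitude": np.random.normal(-60, 5, 50_000),
})

cell = 5.0  # grid resolution in metres
wc["ix"] = np.floor(wc["x"] / cell).astype(int)
wc["iy"] = np.floor(wc["y"] / cell).astype(int)

grid = wc.groupby(["ix", "iy"]).agg(
    wc_mean_amplitude=("amplitude", "mean"),
    wc_point_count=("amplitude", "size"),
    wc_height_range=("z", lambda z: z.max() - z.min()),  # crude stand-in for canopy volume
).reset_index()

# `grid` can now be joined with bathymetry-derived variables (depth, slope, rugosity)
# and towed-video labels to train a Random Forest for the four habitat classes.
print(grid.head())
```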
Figure 1: Overview of three study sites: The Gardens (a), North Freycinet (b), and Monroe 1 (c). The inset map of Tasmania in the top panel shows the relative locations of each site. MBES depth is displayed with a range of 0 to 40 m. Towed video tracks and classification results are shown in greyscale for sand (white), bare rock (light grey), sparse kelp (dark grey), and dense kelp (black).
Figure 2: Overview diagram of our proposed method, including data processing for towed video, multibeam bathymetry, and WCD, modelling, and model evaluation.
Figure 3: Examples of water column data variables generated for the three sites: (a) WC Mean amplitude variable at 5 m resolution, (b) WC Point count variable at 5 m resolution, (c) WC Volume variable at 5 m resolution, and (d) WC Volume variable at 1 m resolution. The units of volume for panels (c) and (d) were converted to volume per area for visual comparison between the different resolutions. For each panel, the sites correspond to The Gardens (top), North Freycinet (middle), and Monroe 1 (bottom).
Figure 4: Examples of towed video data for each class type: (a) Sand, (b) Bare rock (sea urchins present), (c) Sparse kelp, and (d) Dense kelp.
Figure 5: Average ROC curve across all CV folds for each class for the best performing Random Forest model (5 m resolution), including the AUC. Sand is shown as a dotted line, bare rock as a dot-dash line, sparse kelp as a dashed line, and dense kelp as a solid line.
Figure 6: Variable importance (Mean Decrease Gini) for the models at 5 m resolution (left), 3 m resolution (middle), and 1 m resolution (right). Higher values of Mean Decrease Gini indicate a higher importance ranking of those variables in the Random Forest model.
Figure 7: Box plots of selected water column variables by class at 5 m (top row) and 1 m (bottom row) grid resolutions. The horizontal line inside each box is the sample median. The top and bottom edges are the upper and lower quartiles, respectively. Outliers are shown as dots.
Figure 8: Box plots of selected water column variables falling within (top) and beyond (bottom) the minimum slant range (MSR). The top and bottom edges are the upper and lower quartiles, respectively. Outliers are shown as dots.
Figure 9: Classified maps based on the Random Forest model at 5 m resolution for three sites: (a) The Gardens, (b) North Freycinet, and (c) Monroe 1.
Figure 10: Percent of each reef class (bare rock, sparse kelp, or dense kelp) within each site (The Gardens (white), North Freycinet (grey), and Monroe 1 (black)). The percentage values are shown at the top of each bar.
19 pages, 1575 KiB  
Article
FIFA3D: Flow-Guided Feature Aggregation for Temporal Three-Dimensional Object Detection
by Ruiqi Ma, Chunwei Wang, Chi Chen, Yihan Zeng, Bijun Li, Qin Zou, Qingqiu Huang, Xinge Zhu and Hang Xu
Remote Sens. 2025, 17(3), 380; https://doi.org/10.3390/rs17030380 - 23 Jan 2025
Viewed by 566
Abstract
Detecting accurate 3D bounding boxes from LiDAR point clouds is crucial for autonomous driving. Recent studies have shown the superiority of the performance of multi-frame 3D detectors, yet eliminating the misalignment across frames and effectively aggregating spatiotemporal information are still challenging problems. In this paper, we present a novel flow-guided feature aggregation scheme for 3D object detection (FIFA3D) to align cross-frame information. FIFA3D first leverages optical flow with supervised signals to model the pixel-to-pixel correlations between sequential frames. Considering the sparse nature of bird’s-eye-view feature maps, an additional classification branch is adopted to provide explicit pixel-wise clues. Meanwhile, we utilize multi-scale feature maps and predict flow in a coarse-to-fine manner. With guidance from the estimated flow, historical features can be well aligned to the current situation, and a cascade fusion strategy is introduced to benefit the following detection. Extensive experiments show that FIFA3D surpasses the single-frame baseline with remarkable margins of +10.8% mAPH and +6.8% mAP on the Waymo and nuScenes validation datasets and performs well compared with state-of-the-art methods. Full article
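The core "warp historical BEV features with the estimated flow" step can be expressed with grid sampling. The sketch below is a generic flow-warping utility, written under the assumption that the flow is given in pixels from the previous frame to the current one; it is not FIFA3D's exact implementation.

```python
import torch
import torch.nn.functional as F

def warp_bev_features(feat_prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a historical BEV feature map to the current frame using optical flow.

    feat_prev: (B, C, H, W) features from frame i-1.
    flow:      (B, 2, H, W) per-pixel displacement (in pixels) from frame i-1 to i.
    """
    b, _, h, w = feat_prev.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    # Sample each current-frame location from where it came from in frame i-1.
    src_x = xs.unsqueeze(0) - flow[:, 0]
    src_y = ys.unsqueeze(0) - flow[:, 1]
    # Normalise coordinates to [-1, 1] for grid_sample, (x, y) order in the last dim.
    grid = torch.stack(
        (2.0 * src_x / (w - 1) - 1.0, 2.0 * src_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(feat_prev, grid, mode="bilinear",
                         padding_mode="zeros", align_corners=True)

# Example: zero flow should give an identity warp of the historical BEV map.
prev = torch.randn(1, 64, 128, 128)
flow = torch.zeros(1, 2, 128, 128)
assert torch.allclose(warp_bev_features(prev, flow), prev, atol=1e-5)
```

Aligned historical maps of this kind are what a cascade fusion step would then merge with the current-frame BEV features.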
Show Figures

Figure 1

Figure 1. Comparison of temporal aggregation mechanisms for sequential point clouds. (a) Point concatenation methods directly merge LiDAR points after removing the ego motion. (b) Learnable fusion methods generally apply convolution or transformer layers for feature aggregation on BEV maps, aiming to reduce the effect of object motion between sequential frames. (c) Our FIFA3D leverages an optical flow estimation module with supervised signals and further utilizes the flow for temporal alignment.
Figure 2. Overview of our flow-guided feature aggregation scheme for 3D object detection (FIFA3D) with sequential point clouds. It consists of four components: (a) a temporal feature encoder module that encodes sparse point clouds into BEV maps; (b) a class-aware flow estimation module that densifies the sparse BEV feature maps and predicts flow in a coarse-to-fine manner; (c) a flow-guided cascade temporal feature aggregation module; and (d) an RPN decoder and a center-based head that turn the aggregated features into 3D bounding boxes.
Figure 3. Network of progressive flow estimation. Combining the historical feature E_{i−1} and the current feature E_i, the optical flow f_{i−1→i} is predicted in a coarse-to-fine manner.
Figure 4. An example of the feature warping process. For a moving object (magnified in the white box) in consecutive frames, the corresponding historical features can be well aligned to the current frame under the guidance of the estimated optical flow (blue arrows).
Figure 5. The structure of flow-guided temporal aggregation. We use a cascade temporal feature aggregation strategy. For example, when three feature maps are used as input, the optical flow f_{i−2→i−1} first aggregates the feature maps B′_{i−2} and B′_{i−1}. The fused feature map O_{i−1} and B′_i are then aggregated under the guidance of f_{i−1→i} to obtain the output O_i.
Figure 6. Qualitative results on the Waymo dataset in urban scenarios (a,b). Blue boxes denote ground truth, and red boxes denote predictions. Green ovals mark objects that FIFA3D detected but the other two methods missed. * denotes using 2-frame point cloud concatenation as input.
Figure 7. Qualitative results on the Waymo dataset in rural (a) and highway-like (b) scenarios. Blue boxes denote ground truth, and red boxes denote predictions. Green ovals mark objects that FIFA3D detected but the other two methods missed. * denotes using 2-frame point cloud concatenation as input.
Figure 8">
Figure 8. Comparisons of different intervals between the two frames of point clouds. FC: feature concatenation along the channel dimension. The dashed lines indicate the trend of changes, and the dots represent individual data points.
">
17 pages, 3431 KiB  
Article
Interchangeability of Cross-Platform Orthophotographic and LiDAR Data in DeepLabV3+-Based Land Cover Classification Method
by Shijun Pan, Keisuke Yoshida, Satoshi Nishiyama, Takashi Kojima and Yutaro Hashimoto
Land 2025, 14(2), 217; https://doi.org/10.3390/land14020217 - 21 Jan 2025
Viewed by 482
Abstract
Riverine environmental information includes important data to collect, yet such data collection still relies on personnel conducting field surveys, and these on-site tasks face significant limitations (e.g., sites that are difficult or dangerous to access). In recent years, air-vehicle-based Light Detection and Ranging technologies have emerged as an efficient approach to data collection and have already been applied in global environmental research, e.g., land cover classification (LCC) and environmental monitoring. In this study, the authors focused on seven LCC types (i.e., bamboo, tree, grass, bare ground, water, road, and clutter) that can be parameterized for flood simulation. A validated airborne LiDAR bathymetry system (ALB) and a UAV-borne green LiDAR system (GLS) were applied for a cross-platform analysis of LCC. Furthermore, the LiDAR data were visualized using high-contrast color scales to improve the accuracy of land cover classification through image fusion techniques. If high-resolution aerial imagery is available, it must be downscaled to match the resolution of the low-resolution point clouds. Cross-platform data interchangeability was assessed using an interchangeability measure, defined as the absolute difference in overall accuracy (OA) or macro-F1 between cross-platform cases. It is noteworthy that relying solely on aerial photographs is inadequate for precise labeling, particularly under limited sunlight conditions that can lead to misclassification; in such cases, LiDAR plays a crucial role in facilitating target recognition. All the approaches (i.e., low-resolution digital imagery, LiDAR-derived imagery, and image fusion) achieve an OA above 0.65 and a macro-F1 of around 0.6. The authors found that the vegetation (bamboo, tree, grass) and road species perform comparatively better than the clutter and bare ground species. Under the stated conditions, differences between the species observed in different years (ALB from 2017 and GLS from 2020) are the main reason. Because the clutter species in this research includes all items other than the other defined species, its RGB-based features cannot be substituted easily across the 3-year gap, unlike those of the other species. The bare ground species, derived from on-site reconstruction, also shows a further color change between ALB and GLS that decreases interchangeability. For individual species, without considering seasons and platforms, image fusion classifies bamboo and trees with higher F1 scores than low-resolution digital imagery and LiDAR-derived imagery, which particularly demonstrates cross-platform interchangeability for the high vegetation types. In recent years, high-resolution photography (UAV), high-precision LiDAR measurement (ALB, GLS), and satellite imagery have all been used; however, LiDAR measurement equipment is expensive and measurement opportunities are limited. Given this, it would be desirable if ALB and GLS data could be classified continuously by artificial intelligence, and in this study the authors investigated such data interchangeability. A unique and crucial aspect of this study is exploring the interchangeability of land cover classification models across different LiDAR platforms. Full article
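The interchangeability measure described in this abstract, the absolute difference in OA or macro-F1 between cross-platform cases, is simple to compute. The sketch below is illustrative only: the labels are synthetic, the scikit-learn metric calls are an assumption, and nothing here is taken from the authors' code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def interchangeability(y_true_a, y_pred_a, y_true_b, y_pred_b):
    """Absolute difference in OA and macro-F1 between two classification cases
    (e.g., a model evaluated on ALB-derived versus GLS-derived imagery).
    Smaller differences indicate better cross-platform interchangeability."""
    d_oa = abs(accuracy_score(y_true_a, y_pred_a) - accuracy_score(y_true_b, y_pred_b))
    d_f1 = abs(f1_score(y_true_a, y_pred_a, average="macro")
               - f1_score(y_true_b, y_pred_b, average="macro"))
    return d_oa, d_f1

# Toy example: the seven land cover classes encoded as integers 0-6.
rng = np.random.default_rng(0)
truth = rng.integers(0, 7, size=1000)
pred_alb = np.where(rng.random(1000) < 0.80, truth, rng.integers(0, 7, size=1000))
pred_gls = np.where(rng.random(1000) < 0.70, truth, rng.integers(0, 7, size=1000))
print(interchangeability(truth, pred_alb, truth, pred_gls))
```

A small absolute difference means a model trained on one platform's data loses little accuracy when applied to the other, which is exactly the interchangeability the study evaluates.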
Figure 1. Perspective of the airborne LiDAR bathymetry and green LiDAR measurement area: (a) location of the Asahi River in Japan, with kilo post (KP) values representing the longitudinal distance (km) from the river mouth; (b) aerial-captured photographs based on the marked positions in (a); (c) drone-captured photographs based on the marked positions in (b).
Figure 2. Light Detection and Ranging (LiDAR) in overland and underwater surveys using near-infrared (NIR) and green laser (GL) from the ALB (left side, NIR and GL) and the GLS (right side, GL), respectively (laser points are shown in grayscale).
Figure 3. Processes of different data types and the corresponding operations (LR-TL, LR-DI, LiDAR-I, and image fusion).
Figure 4. Comparison of data-style-based averaged 2 m pixel⁻¹ resolution cross-platform interchangeability. Left vertical axis: reference for the OA and macro-F1 values; right vertical axis: reference for the absolute difference values.
Figure 5. Water areas that are not extractable using GLS alone (i.e., zoomed in from LiDAR-I, Oct. 2020). HC means high contrast.
">
19 pages, 2560 KiB  
Article
Evaluation of Rapeseed Leave Segmentation Accuracy Using Binocular Stereo Vision 3D Point Clouds
by Lili Zhang, Shuangyue Shi, Muhammad Zain, Binqian Sun, Dongwei Han and Chengming Sun
Agronomy 2025, 15(1), 245; https://doi.org/10.3390/agronomy15010245 - 20 Jan 2025
Viewed by 648
Abstract
Point cloud segmentation is necessary for obtaining highly precise morphological traits in plant phenotyping. Although point cloud segmentation has advanced considerably, segmenting point clouds of complex plant leaves remains challenging. Rapeseed leaves are critical in cultivation and breeding, yet traditional two-dimensional imaging is susceptible to reduced segmentation accuracy due to occlusions between plants. The current study proposes the use of binocular stereo-vision technology to obtain three-dimensional (3D) point clouds of rapeseed leaves at the seedling and bolting stages. The point clouds were colorized based on elevation values in order to better process the 3D point cloud data and extract rapeseed phenotypic parameters. Denoising methods were selected based on the source and type of point cloud noise: for ground point clouds, we combined plane fitting with pass-through filtering, while statistical filtering was used to remove outliers generated during scanning. We found that, during the seedling stage of rapeseed, a region-growing segmentation method was helpful in finding suitable parameter thresholds for leaf segmentation, and the Locally Convex Connected Patches (LCCP) clustering method was used for leaf segmentation at the bolting stage. The results show that combining plane fitting with pass-through filtering effectively removes ground point cloud noise, while statistical filtering successfully removes the outlier noise points generated during scanning. Finally, using the region-growing algorithm during the seedling stage with a normal angle threshold of 5.0/180.0*M_PI (i.e., 5°) and a curvature threshold of 1.5 helps to avoid under-segmentation and over-segmentation, achieving complete segmentation of rapeseed seedling leaves, while the LCCP clustering method fully segments rapeseed leaves at the bolting stage. The proposed method provides insights to improve the accuracy of subsequent point cloud phenotypic parameter extraction, such as rapeseed leaf area, and is beneficial for the 3D reconstruction of rapeseed. Full article
(This article belongs to the Special Issue Unmanned Farms in Smart Agriculture)
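The denoising pipeline summarized above (plane fitting plus pass-through filtering for the ground, then statistical filtering for scan outliers) can be sketched with a generic point cloud library, as below. Open3D is used here purely as a stand-in, since the paper's tooling is not stated; all thresholds are illustrative, and the region-growing and LCCP segmentation steps are omitted.

```python
import numpy as np
import open3d as o3d  # stand-in library; not necessarily what the authors used

def denoise_plot_cloud(pcd: o3d.geometry.PointCloud,
                       z_min: float = 0.02, z_max: float = 1.0,
                       nb_neighbors: int = 50, std_ratio: float = 1.0):
    """Sketch of the described denoising steps: plane fitting plus pass-through
    filtering for ground noise, then statistical filtering for scan outliers.
    All thresholds are illustrative, not the paper's values."""
    # 1. Fit the dominant (ground) plane with RANSAC and discard its inliers.
    _, ground_idx = pcd.segment_plane(distance_threshold=0.02,
                                      ransac_n=3, num_iterations=1000)
    non_ground = pcd.select_by_index(ground_idx, invert=True)

    # 2. Pass-through filter: keep only points whose z value lies in [z_min, z_max]
    #    (assumes the cloud has been leveled so that z is height above the plot).
    mn, mx = non_ground.get_min_bound(), non_ground.get_max_bound()
    box = o3d.geometry.AxisAlignedBoundingBox(np.array([mn[0], mn[1], z_min]),
                                              np.array([mx[0], mx[1], z_max]))
    cropped = non_ground.crop(box)

    # 3. Statistical outlier removal: k nearest neighbors and a std-dev multiple,
    #    mirroring the (k, alpha) parameters explored in the study.
    cleaned, _ = cropped.remove_statistical_outlier(nb_neighbors=nb_neighbors,
                                                    std_ratio=std_ratio)
    return cleaned
```

The subsequent region-growing and LCCP clustering steps are available in PCL (pcl::RegionGrowing and pcl::LCCPSegmentation); the (k, α) pair passed to the statistical filter corresponds to the nearest-neighbor number and standard deviation multiple examined in the paper.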
Figure 1. Diagram of the scanning process in the field: (a) a smart phenotyping platform; (b) a sidebar camera; (c) a data flow diagram.
Figure 2. The colored point cloud diagrams: (a) cross-section of the point clouds; (b) three-dimensional (3D) view of rapeseed.
Figure 3. The point cloud image after fitting the plane.
Figure 4. Illustration of the Extended Convexity Criterion (CC) theory.
Figure 5. Pass-through filtering effect diagram: (a) the original point cloud image of the rapeseed plot; (b) the point cloud image of rapeseed after pass-through filtering.
Figure 6. The relationship between the number of removed points and the standard deviation multiple under various nearest-neighbor numbers.
Figure 7. The denoising results for the point cloud image of rapeseed after statistical filtering: (a) k = 5, α = 0.01; (b) k = 100, α = 0.01; (c) k = 5, α = 0.5; (d) k = 5, α = 5.
Figure 8. Segmentation results of a single rapeseed plant based on region growing under curvature values of (a) 0.5, (b) 1.0, and (c) 1.5.
Figure 9. Evaluation of the leaf area accuracy of Huyou 039.
Figure 10. Segmentation results of rapeseed leaves at the bolting stage using (a) the region-growing algorithm and (b) the LCCP algorithm.
Figure 11. The point cloud of overlapping leaves: (a) the red circle highlights the overlapping region; (b) an enlarged view of this overlapping area.
">
15 pages, 18148 KiB  
Article
Fast 3D Transmission Tower Detection Based on Virtual Views
by Liwei Zhou, Jiaying Tan, Jing Fu and Guiwei Shao
Appl. Sci. 2025, 15(2), 947; https://doi.org/10.3390/app15020947 - 19 Jan 2025
Viewed by 541
Abstract
Advanced remote sensing technologies leverage extensive synthetic aperture radar (SAR) satellite data and high-resolution airborne light detection and ranging (LiDAR) data to swiftly capture comprehensive 3D information about electrical grid assets and their surrounding environments. This facilitates in-depth scene analysis for target detection and classification, allowing for the early recognition of potential hazards near transmission towers (TTs). These innovations present a groundbreaking strategy for the automated inspection of electrical grid assets. However, traditional 3D target detection techniques, which involve searching the entire 3D space, are marred by low accuracy and high computational demands. Although deep learning-based 3D target detection methods have significantly improved detection precision, they rely on a large volume of 3D target samples for training and are sensitive to point cloud data density. Moreover, these methods demonstrate low detection efficiency, constraining their application in the automated monitoring of electricity networks. This paper proposes a fast 3D target detection method using virtual views to overcome these challenges related to detection accuracy and efficiency. The method first utilizes cutting-edge 2D splatting technology to project 3D point clouds with diverse densities from a specific viewpoint, generating a 2D virtual image. Then, a novel local–global dual-path feature fusion network based on YOLO is applied to detect TTs on the virtual image, ensuring efficient and accurate identification of their positions and types. Finally, by leveraging the projection transformation between the virtual image and the 3D point cloud, combined with a 3D region growing algorithm, the 3D points belonging to the TTs are extracted from the whole 3D point cloud. The effectiveness of the proposed method in terms of target detection rate and efficiency is validated through experiments on synthetic datasets and outdoor LiDAR point clouds. Full article
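The first stage described above, turning a 3D point cloud into a 2D virtual image from a chosen viewpoint, can be approximated with a plain pinhole projection and a z-buffer, as in the sketch below. This is a simplified stand-in for the paper's 2D splatting renderer; the camera parameters and function signature are illustrative assumptions.

```python
import numpy as np

def render_virtual_view(points: np.ndarray, values: np.ndarray,
                        R: np.ndarray, t: np.ndarray,
                        f: float = 400.0, size: int = 512) -> np.ndarray:
    """Project a LiDAR point cloud into a 2D virtual image from a chosen viewpoint.

    points: (N, 3) XYZ coordinates in the world frame.
    values: (N,) per-point attribute used as the pixel value (e.g., intensity or height).
    R, t:   rotation (3x3) and translation (3,) of the virtual camera (world -> camera).
    A plain pinhole projection with a z-buffer stands in for the paper's 2D splatting;
    the focal length and image size are arbitrary illustrative choices.
    """
    cam = points @ R.T + t                      # world -> camera coordinates
    keep = cam[:, 2] > 0.1                      # keep points in front of the camera
    cam, vals = cam[keep], values[keep]
    u = np.round(f * cam[:, 0] / cam[:, 2] + size / 2).astype(int)
    v = np.round(f * cam[:, 1] / cam[:, 2] + size / 2).astype(int)
    inside = (u >= 0) & (u < size) & (v >= 0) & (v < size)
    u, v, z, vals = u[inside], v[inside], cam[inside, 2], vals[inside]

    image = np.zeros((size, size), dtype=np.float32)
    zbuf = np.full((size, size), np.inf, dtype=np.float32)
    for ui, vi, zi, val in zip(u, v, z, vals):  # keep the nearest point per pixel
        if zi < zbuf[vi, ui]:
            zbuf[vi, ui] = zi
            image[vi, ui] = val
    return image
```

Recording which point wins each pixel (alongside the z-buffer) yields the pixel-to-point mapping that allows 2D detections on the virtual image to be transferred back into the 3D cloud, where region growing can then complete the extraction.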
Figure 1. Overall pipeline of the proposed method. The detected TTs are highlighted in red rectangles in the virtual image and are rendered in red within the 3D point cloud.
Figure 2. Comparison of two projection strategies on the TT.
Figure 3. Illustration of a portion of a generated virtual view.
Figure 4. Local–global dual-path feature fusion network.
Figure 5. Illustration of a TT extracted from the LiDAR data.
Figure 6. Seven selected scenarios from TTPLA. All TTs are photographed from different angles.
Figure 7. Example of airborne LiDAR data.
Figure 8. Illustration of 2D detection results on airborne LiDAR data.
Figure 9. Illustration of 3D detection results on airborne LiDAR data.
">