Search Results (138)

Search Parameters:
Keywords = ORB SLAM

18 pages, 39910 KiB  
Article
DyGS-SLAM: Realistic Map Reconstruction in Dynamic Scenes Based on Double-Constrained Visual SLAM
by Fan Zhu, Yifan Zhao, Ziyu Chen, Chunmao Jiang, Hui Zhu and Xiaoxi Hu
Remote Sens. 2025, 17(4), 625; https://doi.org/10.3390/rs17040625 - 12 Feb 2025
Viewed by 446
Abstract
Visual SLAM is widely applied in robotics and remote sensing. The fusion of Gaussian radiance fields and Visual SLAM has demonstrated astonishing efficacy in constructing high-quality dense maps. While existing methods perform well in static scenes, they are prone to the influence of dynamic objects in real-world dynamic environments, thus making robust tracking and mapping challenging. We introduce DyGS-SLAM, a Visual SLAM system that employs dual constraints to achieve high-fidelity static map reconstruction in dynamic environments. We extract ORB features within the scene, and use open-world semantic segmentation models and multi-view geometry to construct dual constraints, forming a zero-shot dynamic information elimination module while recovering backgrounds occluded by dynamic objects. Furthermore, we select high-quality keyframes and use them for loop closure detection and global optimization, constructing a foundational Gaussian map through a set of determined point clouds and poses and integrating repaired frames for rendering new viewpoints and optimizing 3D scenes. Experimental results on the TUM RGB-D, Bonn, and Replica datasets, as well as real scenes, demonstrate that our method has excellent localization accuracy and mapping quality in dynamic scenes.
(This article belongs to the Special Issue 3D Scene Reconstruction, Modeling and Analysis Using Remote Sensing)
Show Figures

Figure 1. System framework of DyGS-SLAM. The tracking thread conducts dynamic object removal and background inpainting. The mapping thread reconstructs the Gaussian map and performs differentiable rendering using a set of determined poses and point clouds. Lastly, the 3D scene is optimized based on the repaired and rendered frames.
Figure 2. Open-world semantic segmentation model.
Figure 3. RGB images from the TUM RGB-D dataset. (a) Frame 690. (b) Frame 765. The red boxes indicate the chair being moved, which is semantically static but actually moving.
Figure 4. The feature point p on the keyframe projected onto the current frame is p′; O and O′ are the camera optical centers of the two frames, respectively. (a) Feature point p′ is static (d′ = d_proj). (b) Feature point p′ is dynamic (d′ ≪ d_proj).
Figure 5. Image frame comparison between DynaSLAM and DyGS-SLAM (ours) after repairing the walking_halfsphere sequence of the TUM RGB-D dataset. The red boxes show how the different methods repair the same frame.
Figure 6. Camera trajectories estimated by ORB-SLAM3 and DyGS-SLAM (ours) on the TUM dataset and their differences from the ground truth.
Figure 7. Comparison of mapping quality between NICE-SLAM, SplaTAM, and DyGS-SLAM (ours) on the walking_xyz sequence.
Figure 8. Detailed comparison of the original reconstructed scene provided by Bonn and the scene reconstructed by our method. The red boxes indicate reconstruction details. (a) Original reconstructed scene provided by Bonn. (b–d) Details of the scene reconstructed by our method.
Figure 9. Comparison of reconstruction quality between SplaTAM and DyGS-SLAM (ours) on the Bonn dataset. Our method demonstrates better reconstruction quality. (a) SplaTAM. (b) DyGS-SLAM.
Figure 10. Comparison of mapping quality between NICE-SLAM, SplaTAM, and DyGS-SLAM on the Replica dataset. The red boxes indicate reconstruction details. Our method also achieves excellent reconstruction quality in static scenes. (a) NICE-SLAM. (b) SplaTAM. (c) DyGS-SLAM. (d) Ground truth.
Figure 11. Experimental results in real scenes. The red boxes indicate the recovery of the static background during reconstruction. (a) Input image. (b) Segmentation. (c) Background repair. (d) Novel view synthesis.
Figure 12. Effect of background inpainting on DyGS-SLAM scene reconstruction. The red boxes indicate the reconstruction results of each variant. (a) Reconstruction without background inpainting. (b) Reconstruction with background inpainting.
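The geometric half of the dual constraint (Figure 4) flags a match as dynamic when the depth d′ measured at the projected pixel is much smaller than the depth d_proj predicted from the keyframe, i.e., something has moved in front of the expected static surface. The snippet below is a minimal NumPy sketch of that reprojection-depth test; the intrinsics K, the relative pose (R, t), and the 0.7 ratio threshold are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def is_dynamic(p_kf, depth_kf, depth_cur, K, R, t, ratio_thresh=0.7):
    """Flag a keyframe feature as dynamic via the d' vs d_proj depth check.

    p_kf      : (u, v) pixel of the feature in the keyframe
    depth_kf  : depth of the feature in the keyframe (metres)
    depth_cur : dense depth image of the current frame
    K         : 3x3 camera intrinsics
    R, t      : pose of the current frame w.r.t. the keyframe
    """
    # Back-project the keyframe pixel to a 3D point in keyframe coordinates.
    u, v = p_kf
    X_kf = depth_kf * np.linalg.inv(K) @ np.array([u, v, 1.0])

    # Transform into the current frame and project to pixel p'.
    X_cur = R @ X_kf + t
    d_proj = X_cur[2]                      # depth predicted by the static-world model
    uv_cur = K @ (X_cur / d_proj)
    u_c, v_c = int(round(uv_cur[0])), int(round(uv_cur[1]))

    h, w = depth_cur.shape
    if d_proj <= 0 or not (0 <= u_c < w and 0 <= v_c < h):
        return False                       # projection left the image: undecided

    d_obs = depth_cur[v_c, u_c]            # d' actually measured at p'
    # If the measured depth is much smaller than predicted, something moved
    # in front of the expected static surface -> treat the match as dynamic.
    return d_obs > 0 and d_obs < ratio_thresh * d_proj
```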
15 pages, 3120 KiB  
Article
Implementation of Visual Odometry on Jetson Nano
by Jakub Krško, Dušan Nemec, Vojtech Šimák and Mário Michálik
Sensors 2025, 25(4), 1025; https://doi.org/10.3390/s25041025 - 9 Feb 2025
Viewed by 461
Abstract
This paper presents the implementation of ORB-SLAM3 for visual odometry on a low-power ARM-based system, specifically the Jetson Nano, to track a robot’s movement using RGB-D cameras. Key challenges addressed include the selection of compatible software libraries, camera calibration, and system optimization. The ORB-SLAM3 algorithm was adapted for the ARM architecture and tested using both the EuRoC dataset and real-world scenarios involving a mobile robot. The testing demonstrated that ORB-SLAM3 provides accurate localization, with errors in path estimation ranging from 3 to 11 cm when using the EuRoC dataset. Real-world tests on a mobile robot revealed discrepancies primarily due to encoder drift and environmental factors such as lighting and texture. The paper discusses strategies for mitigating these errors, including enhanced calibration and the potential use of encoder data for tracking when camera performance falters. Future improvements focus on refining the calibration process, adding trajectory correction mechanisms, and integrating visual odometry data more effectively into broader systems.
(This article belongs to the Section Sensors and Robotics)
Show Figures

Figure 1. Chessboard pattern with highlighted corners [1].
Figure 2. EuRoC dataset sample [20].
Figure 3. Detail of the difference between the estimated and ground truth trajectories [1].
Figure 4. Comparison between robot and camera trajectories [1].
Figure 5. ORB-SLAM3 GUI [1].
Figure 6. State chart of the implemented algorithm [1].
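Figure 1 refers to the chessboard-based camera calibration step. As a rough illustration of how such intrinsics can be obtained with OpenCV before being written into the ORB-SLAM3 configuration file (this is generic calibration code, not the authors' procedure), the sketch below assumes a 9x6 inner-corner board, a 25 mm square size, and images under `calib/`.

```python
import glob
import cv2
import numpy as np

BOARD = (9, 6)        # inner corners per row/column (assumed board)
SQUARE = 0.025        # square edge length in metres (assumed)

# 3D reference corners of the board in its own plane (z = 0).
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
gray = None
for path in glob.glob("calib/*.png"):             # assumed image location
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Estimate the intrinsic matrix and distortion coefficients from all views.
rms, K, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
print("fx, fy, cx, cy:", K[0, 0], K[1, 1], K[0, 2], K[1, 2])
print("distortion:", dist.ravel())   # values of this kind go into the ORB-SLAM3 YAML
```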
68 pages, 11118 KiB  
Review
A Review of Simultaneous Localization and Mapping for the Robotic-Based Nondestructive Evaluation of Infrastructures
by Ali Ghadimzadeh Alamdari, Farzad Azizi Zade and Arvin Ebrahimkhanlou
Sensors 2025, 25(3), 712; https://doi.org/10.3390/s25030712 - 24 Jan 2025
Viewed by 829
Abstract
The maturity of simultaneous localization and mapping (SLAM) methods has now reached a significant level that motivates in-depth and problem-specific reviews. The focus of this study is to investigate the evolution of vision-based, LiDAR-based, and combined methods and evaluate their performance in enclosed and GPS-denied (EGD) conditions for infrastructure inspection. This paper categorizes and analyzes the SLAM methods in detail, considering the sensor fusion type and chronological order. The paper analyzes the performance of eleven open-source SLAM solutions, comprising two visual methods (VINS-Mono, ORB-SLAM 2), eight LiDAR-based methods (LIO-SAM, Fast-LIO 2, SC-Fast-LIO 2, LeGO-LOAM, SC-LeGO-LOAM, A-LOAM, LINS, F-LOAM), and one combined LiDAR-visual method (LVI-SAM). The benchmarking section analyzes accuracy and computational resource consumption using our collected dataset and a test dataset. According to the results, LiDAR-based methods performed well under EGD conditions. Contrary to common presumptions, some vision-based methods demonstrate acceptable performance in EGD environments. Additionally, combining vision-based techniques with LiDAR-based methods demonstrates superior performance compared to either vision-based or LiDAR-based methods individually.
(This article belongs to the Special Issue Feature Review Papers in Intelligent Sensors)
Show Figures

Figure 1. Classification of SLAM methods and previous surveys. Underlined methods are considered for benchmarking.
Figure 2. Triangulation and minimization of reprojection error in SfM.
Figure 3. Visual feature types, LiDAR point cloud feature types, and tracking loss. (a) SURF. (b) SIFT. (c) ORB. (d) Harris. (e) FAST. (f) Point cloud and detected features. (g) SURF, (h) SIFT, and (i) ORB feature matching. (j–l) Tracking loss under severe rotation for SURF, SIFT, and ORB feature matching, respectively. The colors of the points and lines differentiate independent points and lines in (a–e,g–l); in (f), green and yellow points represent edge and planar features, respectively, while the other points are red.
Figure 4. Drift and trajectory error during path traversal. (a) Drift error at the end of the traversed path. (b) Trajectory error during the path traversal.
Figure 5. Workflow of feature-based visual SLAM. (a) Feature tracking using the camera. (b) 3D mapping using the detected features.
Figure 6. Evolution of visual feature-based methods.
Figure 7. Tracking in a four-level pyramid in the PTAM method. The images include at most the 50 closest features in consecutive frames. (a) Original image. (b–e) First- to fourth-level pyramids.
Figure 8. Evolution of direct methods.
Figure 9. Evolution of RGB-D methods.
Figure 10. Visual-inertial SLAM workflow.
Figure 11. Evolution of loosely coupled methods.
Figure 12. Evolution of tightly coupled methods.
Figure 13. Evolution of (a) feature-based and (b) direct LiDAR SLAM methods.
Figure 14. Progress of (a) loosely and (b) tightly coupled LiDAR-inertial methods.
Figure 15. Evolution of combined SLAM methods: (a) loosely and (b) tightly coupled.
Figure 16. Distinct locations with EGD conditions in the Luleå tunnel dataset. The images demonstrate the poorly illuminated, feature-poor character of the underground tunnel environment, highlighting the challenging conditions for SLAM algorithms.
Figure 17. Top view of the map generated from the 3D LiDAR scans of the field test environment.
Figure 18. Traversed path and available ground truth data.
Figure 19. Trajectory results for the evaluated methods. (a) Trajectories of vision-based methods compared to ground truth and LVI-SAM. (b) Trajectories of LiDAR-based methods compared to ground truth.
Figure 20. Trajectory RMSE of the evaluated methods over 181 s (about 3 min) of traversing the mine environment (* ORB-SLAM fails at the corridor turn containing a quick rotation, resulting in feature tracking failure).
Figure 21. Translation error of the evaluated methods against ground truth (* ORB-SLAM failed to follow the turn in the corridor containing a quick rotation; the failure was due to feature tracking loss caused by blurry images).
Figure 22. Mean CPU usage over the 181 s (about 3 min) traversal time.
Figure 23. Mean memory usage over the 181 s (about 3 min) traversal time.
Figure 24. Data collection device.
Figure 25. Path traversed with the data collection device for the SLAM methods.
Figure 26. MSE values of the different SLAM methods compared to LVI-SAM as ground truth.
Figure 27. (a) LiDAR-based mapping and odometry results. (b) Feature tracking with the VINS-Mono SLAM method: red dots indicate newly detected features in the current frame, blue dots represent successfully tracked features across frames, and purple dots denote features that failed to track or were rejected as outliers.
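The benchmark reports trajectory RMSE against ground truth (Figures 20 and 21). A common way to compute this metric, shown here as an illustrative sketch rather than the authors' evaluation code, is to rigidly align the estimated trajectory to the ground truth with a least-squares (Umeyama/Kabsch) fit and then take the RMSE of the pointwise residuals.

```python
import numpy as np

def align_rigid(est, gt):
    """Least-squares rigid alignment (rotation + translation) of est onto gt.
    est, gt: (N, 3) arrays of time-associated trajectory positions."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    U, _, Vt = np.linalg.svd(G.T @ E / len(est))
    S = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:        # enforce a proper rotation (det = +1)
        S[2, 2] = -1
    R = U @ S @ Vt
    t = mu_g - R @ mu_e
    return R, t

def ate_rmse(est, gt):
    """Absolute trajectory error (RMSE) after rigid alignment."""
    R, t = align_rigid(est, gt)
    residuals = gt - (est @ R.T + t)
    return np.sqrt((residuals ** 2).sum(axis=1).mean())

# Toy example: a rotated, shifted, noisy copy of a short ground-truth path.
gt = np.cumsum(np.random.rand(100, 3), axis=0)
theta = np.deg2rad(10.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
est = gt @ Rz.T + np.array([5.0, -2.0, 0.3]) + 0.05 * np.random.randn(100, 3)
print("ATE RMSE [m]:", ate_rmse(est, gt))
```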
21 pages, 9794 KiB  
Article
Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes
by Zhiyong Yang, Kun Zhao, Shengze Yang, Yuhong Xiong, Changjin Zhang, Lielei Deng and Daode Zhang
Sensors 2025, 25(3), 622; https://doi.org/10.3390/s25030622 - 22 Jan 2025
Viewed by 582
Abstract
Visual SLAM relies on the motion information of static feature points in keyframes for both localization and map construction. Dynamic feature points interfere with inter-frame motion pose estimation, thereby affecting the accuracy of map construction and the overall robustness of the visual SLAM system. To address this issue, this paper proposes a method for eliminating feature mismatches between frames in visual SLAM under dynamic scenes. First, a spatial clustering-based RANSAC method is introduced. This method eliminates mismatches by leveraging the distribution of dynamic and static feature points, clustering the points, and separating dynamic from static clusters, retaining only the static clusters to generate a high-quality dataset. Next, the RANSAC method is introduced to fit the geometric model of feature matches, eliminating local mismatches in the high-quality dataset with fewer iterations. The accuracy of the DSSAC-RANSAC method in eliminating feature mismatches between frames is then tested on both indoor and outdoor dynamic datasets, and the robustness of the proposed algorithm is further verified on self-collected outdoor datasets. Experimental results demonstrate that the proposed algorithm reduces the average reprojection error by 58.5% and 49.2%, respectively, when compared to traditional RANSAC and GMS-RANSAC methods. The reprojection error variance is reduced by 65.2% and 63.0%, while the processing time is reduced by 69.4% and 31.5%, respectively. Finally, the proposed algorithm is integrated into the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 to validate its effectiveness in eliminating feature mismatches between frames in visual SLAM.
Show Figures

Figure 1. DSSAC-RANSAC algorithm flowchart; the images are derived from the KITTI dataset.
Figure 2. Flowchart of the DSSAC method.
Figure 3. Methodology flowchart of this study.
Figure 4. ORB feature matching in the TUM dataset using DSSAC-RANSAC with 1500 feature points. (a) Brute-force matching. (b,c) Clustering and dynamic cluster elimination process of DSSAC. (d) Results after RANSAC removes the few remaining mismatches and low-quality points.
Figure 5. Comparison of feature matching performance between RANSAC, GMS-RANSAC (G-R), GMS-ATRANSAC (G-ATR), and DSSAC-RANSAC (D-R) in TUM indoor dynamic scenes.
Figure 6. Comparison of feature matching performance between RANSAC, GMS-RANSAC (G-R), GMS-ATRANSAC (G-ATR), and DSSAC-RANSAC (D-R) in KITTI outdoor dynamic scenes.
Figure 7. Comparison of feature matching performance between RANSAC, GMS-RANSAC (G-R), GMS-ATRANSAC (G-ATR), and DSSAC-RANSAC (D-R) in self-collected outdoor scenes.
Figure 8. Comparison of the mean matching error (a) and error variance (b) of RANSAC, GMS-RANSAC, GMS-ATRANSAC, and DSSAC-RANSAC for different numbers of feature points.
Figure 9. Runtime of the four algorithms for different numbers of feature points.
Figure 10. ORB-SLAM2 default initialization method (a) and the proposed initialization method (b) on the TUM indoor dynamic dataset.
Figure 11. DSSAC-RANSAC applied to feature tracking between adjacent keyframes in ORB-SLAM3, as shown in (a,b). (c) Depth point cloud map of the indoor scene constructed using the default ORB-SLAM3 method. (d) Depth point cloud map of the indoor scene constructed with the proposed method integrated.
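The abstract describes a two-stage filter: density-based spatial clustering separates static from dynamic clusters, and RANSAC then removes the remaining local mismatches. The sketch below loosely approximates that idea with off-the-shelf components (scikit-learn's DBSCAN and OpenCV's fundamental-matrix RANSAC); the clustering parameters and the displacement-based test for "dynamic" clusters are assumptions for illustration, not the paper's DSSAC criterion.

```python
import numpy as np
import cv2
from sklearn.cluster import DBSCAN

def filter_matches(pts_prev, pts_cur, eps=40.0, min_samples=8, disp_factor=2.0):
    """Two-stage mismatch removal on matched keypoints (N x 2 float arrays).

    Stage 1: density-cluster the current-frame points and drop clusters whose
             median displacement is far from the global median (a crude proxy
             for 'dynamic' clusters).
    Stage 2: RANSAC on the remaining matches via the fundamental matrix.
    Returns a boolean mask over the original matches.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts_cur)
    disp = np.linalg.norm(pts_cur - pts_prev, axis=1)
    global_med = np.median(disp)

    keep = np.zeros(len(pts_cur), dtype=bool)
    for lbl in set(labels) - {-1}:                 # -1 marks DBSCAN noise points
        idx = labels == lbl
        if np.median(disp[idx]) < disp_factor * global_med:
            keep[idx] = True                       # treat this cluster as static

    p1, p2 = pts_prev[keep], pts_cur[keep]
    if len(p1) < 8:
        return keep                                # too few points for RANSAC
    _, inlier_mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 1.0, 0.999)
    final = keep.copy()
    final[np.where(keep)[0]] = inlier_mask.ravel().astype(bool)
    return final
```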
24 pages, 5261 KiB  
Article
Extended Study of a Multi-Modal Loop Closure Detection Framework for SLAM Applications
by Mohammed Chghaf, Sergio Rodríguez Flórez and Abdelhafid El Ouardi
Electronics 2025, 14(3), 421; https://doi.org/10.3390/electronics14030421 - 21 Jan 2025
Viewed by 701
Abstract
Loop Closure (LC) is a crucial task in Simultaneous Localization and Mapping (SLAM) for Autonomous Ground Vehicles (AGV). It is an active research area because it improves global localization efficiency. The consistency of the global map and the accuracy of the AGV’s location in an unknown environment are highly correlated with the efficiency and robustness of Loop Closure Detection (LCD), especially when facing environmental changes or data unavailability. We propose to introduce multimodal complementary data to increase the algorithms’ resilience. Various methods using different data sources have been proposed to achieve precise place recognition. However, integrating a multimodal loop-closure fusion process that combines multiple information sources within a SLAM system has been explored less. Additionally, existing multimodal place recognition techniques are often difficult to integrate into existing frameworks. In this paper, we propose a fusion scheme of multiple place recognition methods based on camera and LiDAR data for a robust multimodal LCD. The presented approach uses Similarity-Guided Particle Filtering (SGPF) to identify and verify candidates for loop closure. Based on the ORB-SLAM2 framework, the proposed method uses two perception sensors (camera and LiDAR) under two data representation models for each. Our experiments on both KITTI and a self-collected dataset show that our approach outperforms the state-of-the-art methods in terms of place recognition metrics or localization accuracy metrics. The proposed Multi-Modal Loop Closure (MMLC) framework enhances the robustness and accuracy of AGV’s localization by fusing multiple sensor modalities, ensuring consistent performance across diverse environments. Its real-time operation and early loop closure detection enable timely trajectory corrections, reducing navigation errors and supporting cost-effective deployment with adaptable sensor configurations.
(This article belongs to the Special Issue Image Analysis Using LiDAR Data)
Show Figures

Figure 1. Integration of N different perception modalities into the loop closure module, independent of the underlying pose-graph optimization-based SLAM system.
Figure 2. Step-by-step diagram of the SGPF.
Figure 3. SATIE Laboratory instrumented car. The vehicle is equipped with a stereo camera, a LiDAR, and a data logger. The ground truth trajectory is recorded using an RTK GNSS receiver.
Figure 4. GNSS traces of our self-collected data in the cities near the SATIE laboratory (Saclay and Saint-Aubin).
Figure 5. Normalized modality distance of each modality. Using all frames of sequence 00 of the KITTI dataset, each plotted point represents the normalized modality distance as a function of the ground truth distance for every possible pair of frames in the sequence.
Figure 6. ATE (m) on Seq. 00 of the KITTI dataset. On the x-axis, the first number is the total number of LCCs suggested by the perception modalities and the second is the number of sampled loop candidates; the total is the sum of all considered loop candidates.
Figure 7. Processing time (ms) on Seq. 00 of the KITTI dataset. On the x-axis, the first number is the total number of LCCs proposed by the perception modalities and the second is the number of sampled loop candidates; the total is the sum of all considered loop candidates.
Figure 8. Distributions of particles representing potential LCCs, identified by the four modalities over Seq. 00 of the KITTI dataset. (a) Gray particles indicate ground truth loop closures; yellow particles are sampled particles dropped during the update step; blue (magenta, orange, green) particles are based on DBOW (NetVLAD (NV), Scan Context (SC), PointNetVLAD (PNV)) similarity. (b) Distribution of particles across the sequence frames. (c) Distribution of particles across the sequence map; the black dots are the vehicle’s poses.
Figure 9. Evolution of cumulative error during ORB-SLAM2 [2] runtime, using different modality combinations for loop closure.
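SGPF maintains particles over loop-closure candidates and guides them with the similarity scores of the individual modalities. The sketch below is a heavily simplified, generic illustration of that idea (not the authors' formulation): particles over past frame indices are reweighted by a fused similarity score and resampled systematically; the averaging fusion, the frame counts, and all numbers are assumptions.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Standard systematic resampling: returns indices of surviving particles."""
    n = len(weights)
    cum = np.cumsum(weights)
    cum[-1] = 1.0                                   # guard against round-off
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(cum, positions)

def sgpf_step(particles, similarities, rng):
    """One similarity-guided update over loop-closure candidate frame ids.

    particles    : (N,) int array of candidate frame indices
    similarities : dict  modality -> (num_frames,) similarity scores
    """
    # Fuse modalities by simple averaging (an illustrative choice).
    fused = np.mean([s[particles] for s in similarities.values()], axis=0)
    weights = fused + 1e-12
    weights /= weights.sum()
    # Particles concentrate on frames that several modalities agree on.
    return particles[systematic_resample(weights, rng)]

# Toy run: 4 modalities voting over 500 past frames, true loop near frame 320.
rng = np.random.default_rng(1)
sims = {m: np.clip(rng.normal(0.2, 0.05, 500), 0, 1) for m in "ABCD"}
for m in sims:
    sims[m][315:325] += 0.7
particles = rng.integers(0, 500, size=200)
for _ in range(5):
    particles = sgpf_step(particles, sims, rng)
print("most supported candidate frame:", np.bincount(particles).argmax())
```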
20 pages, 3776 KiB  
Article
Parallelized SLAM: Enhancing Mapping and Localization Through Concurrent Processing
by Francisco J. Romero-Ramirez, Miguel Cazorla, Manuel J. Marín-Jiménez, Rafael Medina-Carnicer and Rafael Muñoz-Salinas
Sensors 2025, 25(2), 365; https://doi.org/10.3390/s25020365 - 9 Jan 2025
Viewed by 494
Abstract
Simultaneous Localization and Mapping (SLAM) systems face high computational demands, hindering their real-time implementation on low-end computers. An approach to addressing this challenge involves offline processing, i.e., a map of the environment is created offline on a powerful computer and then passed to a low-end computer, which uses it for navigation with fewer resources. However, even creating the map on a powerful computer is slow since SLAM is designed as a sequential process. This work proposes pSLAM, a parallel mapping method for speeding up the offline creation of maps. In pSLAM, a video sequence is partitioned into multiple subsequences, with each processed independently, creating individual submaps. These submaps are subsequently merged to create a unified global map of the environment. Our experiments across a diverse range of scenarios demonstrate an increase in the processing speed of up to 6 times compared to that of the sequential approach while maintaining the same level of robustness. Furthermore, we conducted comparative analyses against state-of-the-art SLAM methods, namely UcoSLAM, OpenVSLAM, and ORB-SLAM3, with our method outperforming these across all of the scenarios evaluated.
Show Figures

Figure 1. Main modules of our system. Initially, the video sequence is divided into multiple subsequences, with each processed by a dedicated Map Builder module. Finally, each module's resulting maps are merged to create a unified representation.
Figure 2. Methodology followed for merging the submaps. The merge modules are executed in parallel within the system, ensuring that each module executes only once the maps it will unify have been created.
Figure 3. (a,c) Maps generated by our method from two video sequences of the KITTI dataset, using m = 2 and m = 4, respectively. (b,d) Trajectories of the evaluated methods, depicted in various colors, along with the ground truth.
Figure 4. (a,c) Maps generated by our method for two video sequences of the 4Seasons dataset, using m = 4. (b,d) Trajectories of the different methods on the same sequences.
Figure 5. (a,c) Maps generated from two video sequences of the CUR dataset with different parameter configurations: (a) m = 16, (c) m = 8. (b,d) Trajectories followed by the evaluated methods.
Figure 6. Computing time comparison of UcoSLAM and the proposed pSLAM method for different values of m on the analyzed datasets.
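pSLAM's orchestration is naturally parallel: the sequence is split into m subsequences, each Map Builder runs independently, and the resulting submaps are merged. The following sketch only mimics that orchestration pattern with Python's multiprocessing and a sequential pairwise merge; `build_submap` and `merge` are placeholder stand-ins, not pSLAM code.

```python
from multiprocessing import Pool
from functools import reduce

def build_submap(frame_range):
    """Placeholder for a Map Builder module: processes one subsequence."""
    start, end = frame_range
    # ... run visual SLAM on frames[start:end] and return its submap ...
    return {"frames": (start, end), "keyframes": []}

def merge(map_a, map_b):
    """Placeholder for a merge module: unifies two adjacent submaps."""
    return {"frames": (map_a["frames"][0], map_b["frames"][1]),
            "keyframes": map_a["keyframes"] + map_b["keyframes"]}

def parallel_slam(num_frames, m=4):
    """Split the sequence into m chunks, map them in parallel, then reduce."""
    step = num_frames // m
    chunks = [(i * step, num_frames if i == m - 1 else (i + 1) * step)
              for i in range(m)]
    with Pool(processes=m) as pool:
        submaps = pool.map(build_submap, chunks)   # independent Map Builders
    return reduce(merge, submaps)                  # pairwise merge of submaps

if __name__ == "__main__":
    global_map = parallel_slam(num_frames=2000, m=4)
    print(global_map["frames"])                    # -> (0, 2000)
```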
21 pages, 4833 KiB  
Article
An Effective 3D Instance Map Reconstruction Method Based on RGBD Images for Indoor Scene
by Heng Wu, Yanjie Liu, Chao Wang and Yanlong Wei
Remote Sens. 2025, 17(1), 139; https://doi.org/10.3390/rs17010139 - 3 Jan 2025
Viewed by 544
Abstract
To enhance the intelligence of robots, constructing accurate object-level instance maps is essential. However, the diversity and clutter of objects in indoor scenes present significant challenges for instance map construction. To tackle this issue, we propose a method for constructing object-level instance maps based on RGBD images. First, we utilize the advanced visual odometry system ORB-SLAM3 to estimate the poses of image frames and extract keyframes. Next, we perform semantic and geometric segmentation on the color and depth images of these keyframes, respectively, using semantic segmentation to optimize the geometric segmentation results and address inaccuracies in the target segmentation caused by small depth variations. The segmented depth images are then projected into point cloud segments, which are assigned corresponding semantic information. We integrate these point cloud segments into a global voxel map, updating each voxel’s class using color, distance constraints, and Bayesian methods to create an object-level instance map. Finally, we construct an ellipsoid scene from this map to test the robot’s localization capabilities in indoor environments using semantic information. Our experiments demonstrate that this method accurately and robustly constructs the environment, facilitating precise object-level scene segmentation. Furthermore, compared to manually labeled ellipsoidal maps, generating ellipsoidal maps from extracted objects enables accurate global localization.
(This article belongs to the Special Issue 3D Scene Reconstruction, Modeling and Analysis Using Remote Sensing)
Show Figures

Figure 1. Overview of our pipeline. In the object-level voxel construction, different colors represent different instance objects.
Figure 2. Fusion of semantic and geometric segmentation. YOLOv11 performs semantic segmentation of the RGB image (a) to obtain the target mask (b). The depth image (c) is projected into 3D space for geometric segmentation (d). The mask extracted by semantic segmentation then compensates the geometric segmentation (e), and the segmented regions (f) are finally obtained.
Figure 3. Fusion of point cloud segments between different frames in the 061 sequence of the SceneNN dataset. The point cloud segment labels ➀, ➁, … of Frame_0 are global labels, while the labels "1", "2", … of Frame_191 are temporary labels. Global and temporary labels are not in one-to-one correspondence; they only indicate mutual relationships. For example, segment "2" is scored separately against the overlapping regions ➀ and ➁; when the score between "2" and ➁ is higher, label ➁ is assigned to segment "2" and integrated into the global map.
Figure 4. Erroneously reconstructed areas under different thresholds. The three methods are evaluated with error thresholds of 2 cm, 4 cm, and 8 cm on the 11_01 sequence of the ScanNet dataset. The red areas represent reconstruction errors exceeding the threshold.
Figure 5. Comparison of map accuracy on the 11_01 sequence of ScanNet. Top: ground truth maps and RGB images from different viewpoints. Bottom: images observed from three different viewpoints on the different maps.
Figure 6. Qualitative results on a selected scene from SceneNN.
Figure 7. Object-level instance map and ellipsoid scene constructed from the Chess sequences of the 7-Scenes dataset. (a) Mesh map of the scene. (b) Object-level instance map. (c) Ellipsoid scene. (d) Extracted objects in the environment.
Figure 8. Absolute translation errors for different sequences of the 7-Scenes dataset. Zins et al. [35] used manually labeled ellipsoid scenes, while Voxblox++ and our method use ellipsoid scenes constructed from instance objects.
Figure 9. Absolute rotation errors for different sequences of the 7-Scenes dataset. Zins et al. [35] used manually labeled ellipsoid scenes, while Voxblox++ and our method use ellipsoid scenes constructed from instance objects.
Figure 10. Object-level instance map and ellipsoid scene constructed in the real world. (a) Mesh map of the scene. (b) Object-level instance map. (c) Ellipsoid scene. (d) Extracted objects in the scene.
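The abstract mentions updating each voxel's class with Bayesian methods as new keyframe segmentations arrive. Below is a minimal sketch of such a recursive per-voxel Bayes update; the uniform prior, the fixed hit/miss confusion model, and the class indices are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

class VoxelLabel:
    """Recursive Bayesian class estimate for a single voxel."""

    def __init__(self, num_classes, hit_prob=0.8):
        self.p = np.full(num_classes, 1.0 / num_classes)   # uniform prior
        # Simple confusion model: the observed label is correct with
        # probability hit_prob, otherwise uniformly wrong.
        self.hit = hit_prob
        self.miss = (1.0 - hit_prob) / (num_classes - 1)

    def update(self, observed_class):
        likelihood = np.full_like(self.p, self.miss)
        likelihood[observed_class] = self.hit
        self.p *= likelihood          # Bayes rule: posterior ∝ prior * likelihood
        self.p /= self.p.sum()

    def best(self):
        return int(np.argmax(self.p)), float(np.max(self.p))

# Three keyframes vote class 2 ("chair"), one noisy frame votes class 1.
voxel = VoxelLabel(num_classes=5)
for obs in (2, 2, 1, 2):
    voxel.update(obs)
print(voxel.best())                   # -> class 2 with high posterior probability
```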
22 pages, 12893 KiB  
Article
Research on Visual–Inertial Measurement Unit Fusion Simultaneous Localization and Mapping Algorithm for Complex Terrain in Open-Pit Mines
by Yuanbin Xiao, Wubin Xu, Bing Li, Hanwen Zhang, Bo Xu and Weixin Zhou
Sensors 2024, 24(22), 7360; https://doi.org/10.3390/s24227360 - 18 Nov 2024
Viewed by 951
Abstract
As mining technology advances, intelligent robots in open-pit mining require precise localization and digital maps. Nonetheless, significant pitch variations, uneven roads, and rocky surfaces with minimal texture present substantial challenges to the precision of feature extraction and positioning in traditional visual SLAM systems, owing to the intricate terrain features of open-pit mines. This study proposes an improved SLAM technique that integrates visual and Inertial Measurement Unit (IMU) data to address these challenges. The method incorporates a point–line feature fusion matching strategy to enhance the quality and stability of line feature extraction. It integrates an enhanced Line Segment Detection (LSD) algorithm with short segment culling and approximate line merging techniques. The combination of IMU pre-integration and visual feature restrictions is executed inside a tightly coupled visual–inertial framework utilizing a sliding window approach for back-end optimization, enhancing system robustness and precision. Experimental results demonstrate that the suggested method improves RMSE accuracy by 36.62% and 26.88% on the MH and VR sequences of the EuRoC dataset, respectively, compared to ORB-SLAM3. The improved SLAM system significantly reduces trajectory drift in the simulated open-pit mining tests, improving localization accuracy by 40.62% and 61.32%. These results demonstrate the effectiveness of the proposed method.
(This article belongs to the Section Sensors and Robotics)
Show Figures

Figure 1. System framework diagram. The procedure encompasses data input, front-end visual–inertial odometry, loop-closure detection, back-end optimization, and mapping. The red box in the data input section marks the sparsely textured slope.
Figure 2. Example of the line feature extraction optimization method. Line segment detection is improved by short-line elimination and approximate line segment merging.
Figure 3. (a) Flowchart of the improved LSD line feature detection algorithm; (b) schematic diagram of similar line feature merging.
Figure 4. Visual observation model and IMU schematic diagram. Because the IMU acquisition frequency is significantly higher than that of the camera, the IMU data must be integrated in discrete time, and a unified data format is required for tight coupling. Hollow circles, hollow triangles, green stars, and black squares represent image frames, keyframes, IMU data, and the pre-integration process, respectively.
Figure 5. Marginalization model: the relationship between the camera and landmark locations during the marginalization process.
Figure 6. Performance comparison of the line feature extraction algorithms: (a) average time required to extract line features; (b) average number of extracted line features.
Figure 7. Line feature extraction comparison: (a) line features extracted by the LSD algorithm; (b) the enhanced LSD method. Short segment elimination and approximate line merging markedly remove redundant short line features while preserving the longer segments essential for localization precision. The red boxes highlight the compared regions; the green dots and lines represent the extracted point and line features, respectively.
Figure 8. Histogram of absolute trajectory error. In the MH sequences, the absolute trajectory error of the enhanced algorithm is lower than that of the other algorithms, whereas in the VR sequences the enhanced algorithm performs comparably to or better than the best of the other algorithms.
Figure 9. Trajectory error comparison: (a) trajectory comparison for Sequence MH_04_difficult; (b) trajectory comparison for Sequence V2_03_difficult. Black boxes and red arrows enlarge key areas and mark trajectory deviations, highlighting accuracy differences among the algorithms in these regions.
Figure 10. Absolute pose error results for each sequence: (a) MH_04_difficult; (b) MH_05_difficult; (c) V1_02_medium; (d) V1_03_difficult. The color-coded line represents varying levels of absolute pose error (APE) along the trajectory, with red indicating higher error and blue indicating lower error, highlighting accuracy differences across segments.
Figure 11. Three-dimensional point cloud maps: (a) Sequence MH_05_difficult; (b) Sequence V1_03_difficult. Green lines represent the estimated trajectory, red points the mapped features, and black points additional environmental points.
Figure 12. Experimental intelligent mining robot platform: (a) left view; (b) front view.
Figure 13. Scene 1: circular open-pit excavation. (a) Real-world scene; (b) movement trajectory. In (a), the red arrows represent the motion trajectory of the mapping robot; in (b), points A, B, and C are key checkpoints along the closed-loop path.
Figure 14. Comparison of trajectory errors in Scenario 1.
Figure 15. Experimental results in Scenario 1: (a) comparison of 2D planar trajectories; (b) absolute trajectory error of the sequences.
Figure 16. Scene 2: uneven road conditions in an open-pit mine. (a) Real-world scene; (b) movement trajectory. In (a), the red arrows represent the motion trajectory of the mapping robot; in (b), points A, B, and C are key checkpoints along the path.
Figure 17. Comparison of trajectory errors in Scenario 2.
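The enhanced LSD front end culls short segments and merges approximately collinear ones (Figures 2 and 3). The sketch below shows one plausible version of those two steps on segments given as (x1, y1, x2, y2); the length, angle, and endpoint-gap thresholds are arbitrary assumptions, not the paper's parameters.

```python
import numpy as np

def seg_length(s):
    return np.hypot(s[2] - s[0], s[3] - s[1])

def seg_angle(s):
    return np.arctan2(s[3] - s[1], s[2] - s[0]) % np.pi   # undirected angle

def cull_and_merge(segments, min_len=20.0, ang_tol=np.deg2rad(3), gap_tol=10.0):
    """Drop short segments, then greedily merge near-collinear neighbours."""
    segs = [s for s in segments if seg_length(s) >= min_len]   # short-segment culling
    merged, used = [], [False] * len(segs)
    for i, a in enumerate(segs):
        if used[i]:
            continue
        cur = list(a)
        for j in range(i + 1, len(segs)):
            b = segs[j]
            if used[j]:
                continue
            # Endpoint gap and angular difference between the two segments.
            close = min(np.hypot(cur[2] - b[0], cur[3] - b[1]),
                        np.hypot(cur[0] - b[2], cur[1] - b[3])) < gap_tol
            ang_diff = abs(seg_angle(cur) - seg_angle(b))
            ang_diff = min(ang_diff, np.pi - ang_diff)
            if close and ang_diff < ang_tol:
                # Merge: keep the two endpoints that are farthest apart.
                pts = [(cur[0], cur[1]), (cur[2], cur[3]),
                       (b[0], b[1]), (b[2], b[3])]
                i0, i1 = max(((p, q) for p in range(4) for q in range(p + 1, 4)),
                             key=lambda pq: np.hypot(pts[pq[0]][0] - pts[pq[1]][0],
                                                     pts[pq[0]][1] - pts[pq[1]][1]))
                cur = [*pts[i0], *pts[i1]]
                used[j] = True
        merged.append(tuple(cur))
    return merged

# Two collinear pieces get merged; the 4-pixel stub is culled.
print(cull_and_merge([(0, 0, 30, 0), (35, 0.5, 70, 1.0), (5, 5, 9, 5)]))
```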
18 pages, 2990 KiB  
Article
A GGCM-E Based Semantic Filter and Its Application in VSLAM Systems
by Yuanjie Li, Chunyan Shao and Jiaming Wang
Electronics 2024, 13(22), 4487; https://doi.org/10.3390/electronics13224487 - 15 Nov 2024
Viewed by 532
Abstract
Image matching-based visual simultaneous localization and mapping (vSLAM) extracts low-level pixel features to reconstruct camera trajectories and maps through the epipolar geometry method. However, it fails to achieve correct trajectories and mapping when there are low-quality feature correspondences in several challenging environments. Although the RANSAC-based framework can yield better results, it is computationally inefficient and unstable in the presence of a large number of outliers. In our previous work, a Faster R-CNN learning-based semantic filter was proposed to exploit the semantic information of inliers and remove low-quality correspondences, helping vSLAM localize accurately. However, that semantic filter learning method generalizes poorly to low-level and dense texture-rich scenes, making the semantic filter-based vSLAM unstable and degrading its geometry estimation. In this paper, a GGCM-E-based semantic filter using YOLOv8 is proposed to address these problems. Firstly, semantic patches of images are collected from the KITTI dataset, the TUM dataset provided by the Technical University of Munich, and real outdoor scenes. Secondly, the semantic patches are classified by our proposed GGCM-E descriptors to obtain the YOLOv8 neural network training dataset. Finally, several semantic filters for filtering low-level and dense texture-rich scenes are generated and combined into the ORB-SLAM3 system. Extensive experiments show that the semantic filter can detect and classify the semantic levels of different scenes effectively, filtering low-level semantic scenes to improve the quality of correspondences, thus achieving accurate and robust trajectory reconstruction and mapping. For the challenging autonomous driving benchmark and real environments, the vSLAM system with the GGCM-E-based semantic filter demonstrates its superiority in reducing the 3D position error, with the absolute trajectory error reduced by up to approximately 17.44%, showing its promise and good generalization.
(This article belongs to the Special Issue Application of Artificial Intelligence in Robotics)
Show Figures

Figure 1. ORB-SLAM3 framework with the proposed semantic filter module.
Figure 2. Framework of the proposed semantic filter approach.
Figure 3. Computation of GGCM-E features.
Figure 4. Semantic filtering on the KITTI frame.
Figure 5. Semantic filtering on our captured outdoor frame.
Figure 6. The trajectory of KITTI07 with respect to the ground truth using the GGCM-E semantic filter.
Figure 7. Comparison of trajectories between the proposed method and ground truth in the KITTI dataset.
Figure 8. Comparison of APEs with respect to ground truth of ORB-SLAM3 and the semantic filter.
Figure 9. Dense texture-rich sequences in the TUM dataset (DTR sequences).
Figure 10. Comparison of camera trajectories in DTR sequences.
Figure 11. Comparison of the trajectory with respect to the ground truth of DynaSLAM and GGCM-E+DynaSLAM on KITTI00 sequences.
Figure 12. Comparison of the APEs of semantic filter-based Structure-SLAM, LDSO, and DynaSLAM on KITTI07 sequences.
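GGCM-E classifies image patches by a gray-gradient co-occurrence statistic before the YOLOv8 training stage. The exact descriptor is defined in the paper; the NumPy sketch below only illustrates the general shape of such a feature, a joint histogram of quantized intensity and gradient magnitude followed by its entropy, and should be read as an assumption-laden stand-in rather than the authors' GGCM-E.

```python
import numpy as np

def gray_gradient_entropy(patch, levels=16):
    """Entropy of a gray/gradient co-occurrence histogram for one patch.

    patch: 2D uint8 array. Returns a scalar texture score; higher values
    loosely indicate richer, denser texture (an illustrative proxy only).
    """
    g = patch.astype(np.float64)
    gx, gy = np.gradient(g)
    grad = np.hypot(gx, gy)

    # Quantize intensity and gradient magnitude to a small number of levels.
    q_gray = np.clip((g / 256.0 * levels).astype(int), 0, levels - 1)
    q_grad = np.clip((grad / (grad.max() + 1e-9) * levels).astype(int),
                     0, levels - 1)

    # Joint (gray, gradient) co-occurrence counts over the patch.
    hist = np.zeros((levels, levels))
    np.add.at(hist, (q_gray.ravel(), q_grad.ravel()), 1)
    p = hist / hist.sum()

    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())   # Shannon entropy of the matrix

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128, np.uint8)                   # low-texture patch
busy = rng.integers(0, 256, (64, 64), dtype=np.uint8)     # texture-rich patch
print(gray_gradient_entropy(flat), gray_gradient_entropy(busy))
```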
20 pages, 1837 KiB  
Article
A Monocular Ranging Method for Ship Targets Based on Unmanned Surface Vessels in a Shaking Environment
by Zimu Wang, Xiunan Li, Peng Chen, Dan Luo, Gang Zheng and Xin Chen
Remote Sens. 2024, 16(22), 4220; https://doi.org/10.3390/rs16224220 - 12 Nov 2024
Viewed by 1011
Abstract
This paper addresses errors in the estimation of the position and attitude of an unmanned vessel, especially during vibration, when the rapid loss of feature point information hinders continuous attitude estimation and global trajectory mapping; to this end, the monocular ORB-SLAM framework is improved based on the characteristics of the marine environment. In general, we extract the location area of the artificial sea target in the video, build a virtual feature set for it, and filter the background features. When shaking occurs, GNSS information is combined and the target feature set is used to complete the map reconstruction task. Specifically, the sea target area of interest is first detected by YOLOv5, and the feature extraction and matching method is optimized in the front-end tracking stage to adapt to the sea environment. In the keyframe selection and local map optimization stage, the characteristics of the feature set are improved to further increase the positioning accuracy and to provide more accurate position and attitude information for the unmanned platform. We use GNSS information to provide the scale and world coordinates for the map. Finally, the target distance is measured by the beam ranging method. Marine unmanned platform data, GNSS data, and AIS position data are autonomously collected, and experiments are carried out using the proposed marine ranging system. Experimental results show that the maximum measurement error of this method is 9.2%, and the average error is 4.7%.
Show Figures

Figure 1. Overview of our method. The maize-yellow parallelograms represent input and output, the blue rectangles represent methods, and the red diamonds represent special cases.
Figure 2. YOLOv5 network structure.
Figure 3. Ship target extraction by the YOLOv5 network.
Figure 4. The three processes of video, mapping, and set update, and the strategy adopted when tracking loss occurs due to shaking. The red stars represent feature points that are not on the ship.
Figure 5. Triangulation ranging.
Figure 6. Video sequence acquisition route. The gray ship is the main positioning vessel and the red line is the camera route; the red arrow represents the camera path, and the black ship is the camera's main observation target, "Yihai 157".
Figure 7. The collected video sequence dataset contains many water scenes, and the ships are in a static anchored state. The route ensures that the video captures complete ship information, from the bow to the stern. The bottom-right image is the video frame at the moment when camera shaking occurs due to the natural undulation of the sea.
Figure 8. Ship detection.
Figure 9. Object-based feature set establishment. The green points are the extracted feature points.
Figure 10. Overview of our method and the three compared methods. The red lines represent the path of the ship and the blue points are the target feature points. (a) Ours. (b) Yolo_ORBSLAM. (c) Water segmentation SLAM mapping. (d) U-net dynamic SLAM.
Figure 11. Real distance and estimated distance of the sampled frames.
Figure 12. Top-view analysis of the USV movement path. Our method (a) effectively overcomes the shaking segment; Yolo_ORBSLAM (b) shows a large shift at the end; the water segmentation SLAM method (c) performs poorly; and U-net dynamic SLAM (d) is stable in the first part but ultimately stops at a shaking moment.
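The final range to the ship is obtained by triangulating the target from two camera poses whose scale comes from GNSS (Figure 5). The OpenCV sketch below triangulates one matched target point from two views and reports its distance from the second camera; the intrinsics, the 5 m baseline, and the pixel coordinates are made-up illustration values, not data from the paper.

```python
import numpy as np
import cv2

# Assumed intrinsics of the USV camera (fx, fy, cx, cy).
K = np.array([[1200.0, 0.0, 960.0],
              [0.0, 1200.0, 540.0],
              [0.0, 0.0, 1.0]])

# Two camera poses in a GNSS-scaled world frame: identity, then 5 m to starboard.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-5.0], [0.0], [0.0]])])

# Pixel observations of the same point on the target ship in both frames (made up).
x1 = np.array([[1105.0], [538.0]])
x2 = np.array([[985.0], [538.0]])

X_h = cv2.triangulatePoints(P1, P2, x1, x2)   # homogeneous 4x1 result
X = (X_h[:3] / X_h[3]).ravel()                # Euclidean 3D point, metres

cam2_center = np.array([5.0, 0.0, 0.0])       # from P2 = K [I | -C]
print("triangulated point:", X)
print("range from second camera [m]:", np.linalg.norm(X - cam2_center))
```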
7 pages, 1148 KiB  
Proceeding Paper
A Novel Method to Improve the Efficiency and Performance of Cloud-Based Visual Simultaneous Localization and Mapping
by Omar M. Salih, Hussam Rostum and József Vásárhelyi
Eng. Proc. 2024, 79(1), 78; https://doi.org/10.3390/engproc2024079078 - 11 Nov 2024
Viewed by 481
Abstract
Since Visual Simultaneous Localization and Mapping (VSLAM) inherently requires intensive computational operations and consumes many hardware resources, these limitations pose challenges to implementing the entire VSLAM architecture within limited processing power and battery capacity. This paper proposes a novel solution to improve the efficiency and performance of exchanging data between the unmanned aerial vehicle (UAV) and the cloud server. First, an adaptive ORB (oriented FAST and rotated BRIEF) method is proposed for precise tracking, mapping, and re-localization. Second, efficient visual data encoding and decoding methods are proposed for exchanging the data between the edge device and the UAV. The results show an improvement in the trajectory RMSE and accurate tracking using the adaptive ORB-SLAM. Furthermore, the proposed visual data encoding and decoding showed outstanding performance compared with the widely used standard JPEG-based system at high quantization ratios.
(This article belongs to the Proceedings of The Sustainable Mobility and Transportation Symposium 2024)
Show Figures

Figure 1. Adaptive ORB-SLAM architecture.
Figure 2. The proposed visual data algorithms for (a) encoding and (b) decoding.
Figure 3. Comparison between the estimated trajectory and the actual camera trajectory.
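The reported baseline is the standard JPEG-based system evaluated at high quantization ratios. Purely as a reference for that baseline (not the proposed encoder/decoder), the snippet below compresses and restores a frame with OpenCV at several quality settings and measures the size/PSNR trade-off before frames would be sent from the UAV to the cloud server; the input file name is a placeholder.

```python
import cv2
import numpy as np

def jpeg_roundtrip(frame, quality):
    """Encode/decode a frame with JPEG at the given quality (0-100)."""
    ok, buf = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, quality])
    assert ok
    return len(buf), cv2.imdecode(buf, cv2.IMREAD_COLOR)

def psnr(a, b):
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

frame = cv2.imread("frame_0001.png")          # placeholder UAV frame
for q in (90, 50, 20, 5):                     # lower quality = coarser quantization
    size, restored = jpeg_roundtrip(frame, q)
    print(f"quality={q:3d}  bytes={size:7d}  PSNR={psnr(frame, restored):5.2f} dB")
```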
17 pages, 3301 KiB  
Article
Stereo and LiDAR Loosely Coupled SLAM Constrained Ground Detection
by Tian Sun, Lei Cheng, Ting Zhang, Xiaoping Yuan, Yanzheng Zhao and Yong Liu
Sensors 2024, 24(21), 6828; https://doi.org/10.3390/s24216828 - 24 Oct 2024
Viewed by 999
Abstract
In many robotic applications, creating a map is crucial, and 3D maps provide a method for estimating the positions of other objects or obstacles. Most of the previous research processes 3D point clouds through projection-based or voxel-based models, but both approaches have certain limitations. This paper proposes a hybrid localization and mapping method using stereo vision and LiDAR. Unlike the traditional single-sensor systems, we construct a pose optimization model by matching ground information between LiDAR maps and visual images. We use stereo vision to extract ground information and fuse it with LiDAR tensor voting data to establish coplanarity constraints. Pose optimization is achieved through a graph-based optimization algorithm and a local window optimization method. The proposed method is evaluated using the KITTI dataset and compared against the ORB-SLAM3, F-LOAM, LOAM, and LeGO-LOAM methods. Additionally, we generate 3D point cloud maps for the corresponding sequences and high-definition point cloud maps of the streets in sequence 00. The experimental results demonstrate significant improvements in trajectory accuracy and robustness, enabling the construction of clear, dense 3D maps.
(This article belongs to the Section Navigation and Positioning)
Show Figures

Figure 1. Pose optimization based on ground information. T, p, and q represent the transformation matrix, points on the plane, and points off the plane, respectively.
Figure 2. The stereo sensor model and the coordinate systems used [34].
Figure 3. Region of interest extraction. (a) Left image. (b) Right image. (c) Disparity image. (d) v-disparity and (e) u-disparity, both derived from (c). (f) Large obstacles removed by suppressing peak values in (e). (g) v-disparity based on (f); the red line is the disparity profile of the ground plane. (h) Detected ground plane and region of interest (RoI), shown in the red box. (i) City 3D reconstruction; green represents the ground.
Figure 4. Graph-structure optimization. P represents the visual point nodes and X the frame poses; "Ground" denotes the ground information extracted from the 3D reconstruction.
Figure 5. Trajectory estimates on the KITTI dataset. (a) 00. (b) 01. (c) 05. (d) 07. (e) 08. (f) 09.
Figure 6. 3D point cloud maps for the corresponding KITTI sequences. (a) 00. (b) 01. (c) 05. (d) 07. (e) 08. (f) 09.
Figure 7. 3D reconstruction based on road constraints, where green represents the road. (a) 00. (b) 01. (c) 05. (d) 07. (e) 08. (f) 09.
Figure 8. High-definition display of point clouds for some streets in the 00 sequence. The image in the top-left corner is a 3D reconstruction of the entire city, and the other images (a–e) depict details of its streets.
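Figure 3 extracts the ground through the v-disparity representation: stacking each image row's disparity histogram produces a map in which the ground plane appears as a dominant slanted line. The sketch below builds such a v-disparity map from a disparity image and fits the ground profile by least squares on the per-row dominant bins; real pipelines typically use a Hough transform or RANSAC for this fit, so treat the fitting choice and thresholds as assumptions.

```python
import numpy as np

def v_disparity(disp, max_disp=128):
    """Histogram of disparities per image row: shape (rows, max_disp)."""
    rows = disp.shape[0]
    vmap = np.zeros((rows, max_disp), dtype=np.int32)
    for v in range(rows):
        d = disp[v]
        d = d[(d > 0) & (d < max_disp)].astype(int)   # ignore invalid pixels
        np.add.at(vmap[v], d, 1)
    return vmap

def fit_ground_profile(vmap, min_votes=20):
    """Fit disparity = a*v + b through the dominant bin of each lower row."""
    rows = vmap.shape[0]
    vs, ds = [], []
    for v in range(rows // 2, rows):                  # ground lies in the lower half
        d_best = int(np.argmax(vmap[v]))
        if vmap[v, d_best] >= min_votes:
            vs.append(v)
            ds.append(d_best)
    a, b = np.polyfit(vs, ds, deg=1)                  # least-squares line fit
    return a, b                                       # expected ground disparity per row

# Synthetic disparity image: a ground plane whose disparity grows with the row index.
H, W = 240, 320
disp = np.tile(np.linspace(1, 60, H)[:, None], (1, W)) + np.random.randn(H, W)
a, b = fit_ground_profile(v_disparity(disp))
print(f"ground profile: d(v) ≈ {a:.3f}*v + {b:.1f}")
```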
19 pages, 12975 KiB  
Article
Enhancing Real-Time Visual SLAM with Distant Landmarks in Large-Scale Environments
by Hexuan Dou, Xinyang Zhao, Bo Liu, Yinghao Jia, Guoqing Wang and Changhong Wang
Drones 2024, 8(10), 586; https://doi.org/10.3390/drones8100586 - 16 Oct 2024
Viewed by 1578
Abstract
The efficacy of visual Simultaneous Localization and Mapping (SLAM) diminishes in large-scale environments due to challenges in identifying distant landmarks, leading to a limited perception range and trajectory drift. This paper presents a practical method to enhance the accuracy of feature-based real-time visual SLAM for compact unmanned vehicles by constructing distant map points. By tracking consecutive image features across multiple frames, remote map points are generated with sufficient parallax angles, extending the mapping scope to the theoretical maximum range. Observations of these landmarks from preceding keyframes are supplemented accordingly, improving back-end optimization and, consequently, localization accuracy. The effectiveness of this approach is ensured by the introduction of the virtual map point, a proposed data structure that links relational features to an imaginary map point, thereby maintaining the constrained size of local optimization during triangulation. Based on the ORB-SLAM3 code, a SLAM system incorporating the proposed method is implemented and tested. Experimental results on drone and vehicle datasets demonstrate that the proposed method outperforms ORB-SLAM3 in both accuracy and perception range with negligible additional processing time, thus preserving real-time performance. Field tests using a UGV further validate the efficacy of the proposed method.
(This article belongs to the Section Drone Design and Development)
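The method's core move is to keep tracking a distant feature and defer triangulation until the parallax angle between its back-projected rays is large enough, which is the bookkeeping the virtual map points perform. The sketch below illustrates that test for two observations under assumed intrinsics, poses and threshold; it is not ORB-SLAM3 code, and midpoint triangulation is just one possible choice.

```python
# Hedged sketch: defer triangulation until the parallax angle between two
# observing rays exceeds a threshold. Intrinsics, poses and the threshold
# are illustrative placeholders.
import numpy as np

def back_projected_ray(K, T_wc, pixel):
    """Unit ray (world frame) through 'pixel' for camera pose T_wc (camera-to-world)."""
    x = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    ray_w = T_wc[:3, :3] @ x
    return ray_w / np.linalg.norm(ray_w)

def parallax_deg(K, T1, T2, px1, px2):
    r1, r2 = back_projected_ray(K, T1, px1), back_projected_ray(K, T2, px2)
    return np.degrees(np.arccos(np.clip(r1 @ r2, -1.0, 1.0)))

def maybe_triangulate(K, T1, T2, px1, px2, theta_th_deg=1.0):
    """Return a 3D point (midpoint method) once the parallax is sufficient, else None."""
    if parallax_deg(K, T1, T2, px1, px2) < theta_th_deg:
        return None                                   # too little parallax: keep tracking
    o1, o2 = T1[:3, 3], T2[:3, 3]
    d1, d2 = back_projected_ray(K, T1, px1), back_projected_ray(K, T2, px2)
    # Closest points on the two rays o1 + s*d1 and o2 + t*d2 (midpoint triangulation).
    A = np.array([[d1 @ d1, -d1 @ d2], [d1 @ d2, -d2 @ d2]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    s, t = np.linalg.solve(A, b)
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))

if __name__ == "__main__":
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    T1, T2 = np.eye(4), np.eye(4); T2[0, 3] = 1.0     # 1 m baseline between keyframes
    P = np.array([2.0, 0.0, 40.0])                    # a distant landmark
    px1 = (K @ (P / P[2]))[:2]
    px2 = (K @ ((P - T2[:3, 3]) / P[2]))[:2]
    print(maybe_triangulate(K, T1, T2, px1, px2))     # ≈ [2, 0, 40]
```

The paper's contribution is to keep this deferred state compact (the virtual map point) so that local optimization stays bounded while the observation history is preserved for later use.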
Figure 1
Triangulation of a spatial point. The uncertainty of triangulation is represented by the gray area. The case illustrated on the right has a shorter baseline b and/or a smaller parallax angle θ, resulting in greater uncertainty in the localization of P.
Figure 2
Extending the perception range to the maximum by utilizing features from keyframes beyond the local mapping range.
Figure 3
The relationship between the perception ranges of the camera and the structure of the matrix H_CP. The case in (a) has a shorter perception range, with H_CP represented as (b), while the case in (c) has a longer perception range, with H_CP augmented by the commonly observed landmarks P_abc, represented as (d). Note that many of the block matrices in dark blue remain zero due to the absence of observations between the corresponding camera pose C_i and landmark P_j.
Figure 4
A diagram of a virtual map point. (i) The virtual map point V is constructed from features matched from two adjacent frames. (ii) As the camera moves, features and back-projected rays in subsequent frames are associated with V, allowing for the calculation of the maximum parallax angle. (iii) The spatial coordinates of V are triangulated when the parallax angle exceeds the threshold θ_th. (iv) Subsequently, a map point P is constructed from V, inheriting the observation relationships with frames ranging from F_0 to F_i.
Figure 5
An overview of the proposed SLAM system. The management of virtual map points is integrated into the “Local Mapping” thread of ORB-SLAM3.
Figure 6
Keyframe trajectory comparison between ORB-SLAM3 (ORB3) and the proposed method (MOD) alongside groundtruth (GT) from the top view in sequences of (a) EuRoC Machine Hall 05, (b) KITTI 00, (c) KAIST 32, (d) KAIST 27, (e) KAIST 33, (f) KAIST 39 and (g) UZH MAV during the dataset test.
Figure 7
The distribution of triangulation depth of map points in dataset tests: (a) comparison between ORB-SLAM3 (ORB3) and the proposed method (MOD); (b) comparison between normal map points and map points converted from virtual map points within the proposed method.
Figure 8
The observation of map points in the 2776th frame of the sequence KAIST 26 by (a) ORB-SLAM3 and (b) the proposed method, and in the 2979th frame by (c) ORB-SLAM3 and (d) the proposed method, with normal map points denoted in green and converted virtual map points denoted in red.
Figure 9
Comparison of statistics on mean tracking time across the dataset and field tests between ORB-SLAM3 (ORB) and the proposed method (MOD).
Figure 10
(a) The experimental platform on a wheeled robot. An Intel D435i collects the monocular image sequence and a ComNav M100 provides groundtruth. (b) A sample of the collected image sequences, which contains some distant landmarks. The sequences are collected in campus scenes of a (c) yard, (d) road and (e) park. Location: Science Park, Harbin Institute of Technology.
Figure 11
Keyframe trajectory comparison between ORB-SLAM3 (ORB3) and the proposed method (MOD) alongside groundtruth (GT) from the top view in scenes of a (a) yard, (b) road and (c) park during the field test.
Figure 12
The distribution of triangulation depth of map points in field tests: (a) comparison between ORB-SLAM3 (ORB3) and the proposed method (MOD); (b) comparison between normal map points and map points converted from virtual map points within the proposed method.
20 pages, 6262 KiB  
Article
YPR-SLAM: A SLAM System Combining Object Detection and Geometric Constraints for Dynamic Scenes
by Xukang Kan, Gefei Shi, Xuerong Yang and Xinwei Hu
Sensors 2024, 24(20), 6576; https://doi.org/10.3390/s24206576 - 12 Oct 2024
Viewed by 948
Abstract
Traditional SLAM systems assume a static environment, but moving objects break this ideal assumption. In the real world, moving objects can greatly degrade the precision of image matching and camera pose estimation. To address these problems, the YPR-SLAM system is proposed. First, the system includes a lightweight YOLOv5 detection network that detects both dynamic and static objects, providing prior dynamic-object information to the SLAM system. Second, using this prior information together with the depth image, a geometric-constraint method for removing moving feature points is proposed: the Depth-PROSAC algorithm distinguishes dynamic from static feature points so that the dynamic ones can be removed. Finally, a dense point cloud map is constructed from the static feature points. YPR-SLAM tightly couples object detection with geometric constraints, eliminating moving feature points and minimizing their adverse effects on the SLAM system. Its performance was assessed on the public TUM RGB-D dataset, and the results show that YPR-SLAM is well suited to dynamic scenes. Full article
(This article belongs to the Section Sensing and Imaging)
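The pipeline combines YOLOv5 detection boxes with a depth-based check (the Depth-PROSAC step) so that only feature points actually lying on a moving object are discarded, while background points inside the box survive, which is the comparison Figure 4 of the paper makes. The sketch below shows one simple depth-gated variant of that idea; the box format, the median-depth heuristic and the thresholds are assumptions, not the authors' Depth-PROSAC algorithm.

```python
# Hedged sketch of depth-gated filtering of ORB keypoints inside detection
# boxes of potentially dynamic objects. This is a simplified stand-in for the
# paper's Depth-PROSAC step; thresholds and heuristics are assumptions.
import numpy as np

def filter_dynamic_keypoints(keypoints, depth, boxes, rel_gap=0.2):
    """Return the indices of keypoints treated as static.

    keypoints : (N, 2) array of (u, v) pixel coordinates
    depth     : (H, W) depth image in meters (0 = invalid)
    boxes     : list of (u_min, v_min, u_max, v_max) dynamic-object boxes
    rel_gap   : keypoints at least this much (relative) behind the object's
                median depth are treated as background and kept
    """
    keep = []
    for i, (u, v) in enumerate(np.asarray(keypoints, dtype=int)):
        z = depth[v, u]
        inside_dynamic = False
        for (u0, v0, u1, v1) in boxes:
            if u0 <= u <= u1 and v0 <= v <= v1:
                patch = depth[v0:v1, u0:u1]
                obj_depth = np.median(patch[patch > 0]) if np.any(patch > 0) else 0.0
                # Points clearly behind the detected object are background.
                if z > 0 and obj_depth > 0 and z > obj_depth * (1.0 + rel_gap):
                    continue
                inside_dynamic = True
                break
        if not inside_dynamic:
            keep.append(i)
    return keep

if __name__ == "__main__":
    depth = np.full((480, 640), 8.0)            # background wall at 8 m
    depth[120:280, 220:380] = 2.0               # a moving person at 2 m
    box = (200, 100, 400, 300)                  # YOLO-style box around the person
    kps = [(100, 50), (250, 150), (390, 120)]   # outside box / on person / background inside box
    print(filter_dynamic_keypoints(kps, depth, [box]))   # -> [0, 2]
```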
Figure 1
Framework of the YPR-SLAM system. The blue section is ORB-SLAM2, and the orange section is the addition of this paper.
Figure 2
The YOLOv5 network architecture.
Figure 3
Dynamic target detection and filtering thread. First, ORB feature points are extracted from the RGB image by the tracking thread. Next, the dynamic target detection thread identifies potential dynamic target areas, and then the Depth-PROSAC algorithm is applied to filter out dynamic feature points. Finally, the static feature points are retained for subsequent pose estimation.
Figure 4
Comparison between the target detection algorithm and the Depth-PROSAC algorithm in filtering out dynamic feature points. (a) The object detection method directly filters out dynamic feature points; (b) the Depth-PROSAC algorithm filters out dynamic feature points.
Figure 5
Dense point cloud construction workflow.
Figure 6
In the fr3_walking_halfsphere sequence, the YPR-SLAM and ORB-SLAM2 systems were used to estimate the 3D motion of the camera. (a) Camera trajectory estimated by ORB-SLAM2; (b) camera trajectory estimated by YPR-SLAM.
Figure 7
ATE and RPE of the ORB-SLAM2 system and the YPR-SLAM system on different datasets. (a1,a2,c1,c2,e1,e2,g1,g2) show the ATE and RPE obtained by the ORB-SLAM2 system on fre3_sitting_static, fre3_walking_static, fre3_walking_halfsphere, and fre3_walking_xyz, respectively. (b1,b2,d1,d2,f1,f2,h1,h2) show the ATE and RPE of the YPR-SLAM system on the same sequences. (a1,b1,c1,d1,e1,f1,g1,h1) are ATE plots; (a2,b2,c2,d2,e2,f2,g2,h2) are RPE plots.
Figure 8
Dense 3D point cloud maps constructed by ORB-SLAM2 and YPR-SLAM in the dynamic scene sequence fre3_walking_xyz. (a) Dense 3D point cloud map constructed by the ORB-SLAM2 system; (b) dense 3D point cloud map constructed by the YPR-SLAM system.
24 pages, 4712 KiB  
Article
Balancing Efficiency and Accuracy: Enhanced Visual Simultaneous Localization and Mapping Incorporating Principal Direction Features
by Yuelin Yuan, Fei Li, Xiaohui Liu and Jialiang Chen
Appl. Sci. 2024, 14(19), 9124; https://doi.org/10.3390/app14199124 - 9 Oct 2024
Viewed by 1121
Abstract
In visual Simultaneous Localization and Mapping (SLAM), operational efficiency and localization accuracy are equally crucial evaluation metrics. We propose an enhanced visual SLAM method that ensures stable localization accuracy while improving system efficiency, maintaining accuracy even after the number of feature pyramid levels is reduced by 50%. First, we incorporate the principal direction error, which represents the global geometric features of feature points, into the error function for pose estimation, using Pareto optimal solutions to improve localization accuracy. Second, for loop-closure detection, we construct a feature matrix by integrating the grayscale and gradient direction of an image; this matrix is dimensionally reduced through aggregation, and a multi-layer detection approach is employed to ensure both efficiency and accuracy. Finally, we optimize the feature extraction levels and integrate our method into the visual system to speed up the extraction process and mitigate the impact of the reduced levels. We comprehensively evaluate the proposed method on local and public datasets. Experiments show that our SLAM method maintained high localization accuracy while reducing the tracking time by 24% compared with ORB-SLAM3. Additionally, the proposed loop-closure-detection method demonstrated superior computational efficiency and detection accuracy compared with existing methods. Full article
(This article belongs to the Special Issue Mobile Robotics and Autonomous Intelligent Systems)
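The abstract folds a principal direction term, derived from the orientations of matched feature points, into the pose-estimation error function alongside the usual reprojection error, and balances the two objectives with Pareto optimal solutions. The exact definition of that term is not given in this listing, so the sketch below only shows the general shape of such a two-term cost; the direction residual and the fixed scalar weight standing in for the Pareto treatment are assumptions.

```python
# Illustrative two-term cost mixing reprojection error with a keypoint
# principal-direction consistency error. The direction residual and the
# weight w_dir are assumptions, not the paper's formulation.
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of world points X (N, 3) into pixel coordinates (N, 2)."""
    Xc = (R @ X.T).T + t
    uv = (K @ (Xc / Xc[:, 2:3]).T).T
    return uv[:, :2]

def direction_residuals(theta_ref, theta_cur, yaw):
    """Angle differences, wrapped to [-pi, pi], after removing the in-plane rotation."""
    d = theta_cur - (theta_ref + yaw)
    return np.arctan2(np.sin(d), np.cos(d))

def combined_cost(K, R, t, X, uv_obs, theta_ref, theta_obs, w_dir=0.1):
    """Reprojection term plus a weighted principal-direction term."""
    r_reproj = project(K, R, t, X) - uv_obs
    yaw = np.arctan2(R[1, 0], R[0, 0])                 # crude in-plane rotation proxy
    r_dir = direction_residuals(theta_ref, theta_obs, yaw)
    return float(np.sum(r_reproj**2) + w_dir * np.sum(r_dir**2))

if __name__ == "__main__":
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    X = np.array([[0.0, 0.0, 5.0], [1.0, -0.5, 6.0]])
    uv = project(K, np.eye(3), np.zeros(3), X)         # perfect observations
    print(combined_cost(K, np.eye(3), np.zeros(3), X, uv,
                        np.array([0.3, 1.2]), np.array([0.3, 1.2])))   # -> 0.0
```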
Figure 1
The visual SLAM framework. The main contributions of our work are highlighted within the yellow dashed box. (a) Pose estimation incorporating principal direction information; (b) descriptor extraction; (c) similarity calculation and loop-closure detection.
Figure 2
Pose estimation incorporating principal direction information. The small rectangles represent feature points, and the straight lines with arrows represent the principal directions of the feature points.
Figure 3
Loop-closure detection based on aggregation descriptors. The left side shows the aggregation feature descriptor extraction process, while the right side shows the loop-closure-detection process.
Figure 4
The illustration of principal direction feature projection. The purple rectangles represent feature points, while the pink lines indicate the principal directions of the feature points.
Figure 5
Aggregation feature descriptor generation process. Each image generates two feature vectors, aggregated from the feature matrix along the row and column directions.
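The Figure 5 caption above can be read as follows: a per-pixel feature matrix mixing grayscale and gradient direction is collapsed into one row-aggregated and one column-aggregated vector, and two images are compared through those vectors. The sketch below is a guess at that pipeline based on this listing alone; the mixing weight, the normalization and the cosine similarity are assumptions.

```python
# Hedged sketch of an aggregation descriptor for loop-closure detection:
# a grayscale + gradient-direction feature matrix reduced to a row vector
# and a column vector. Mixing weight and similarity measure are assumptions.
import numpy as np

def aggregation_descriptor(gray, alpha=0.5):
    """Return (row_vector, col_vector) aggregated from the per-pixel feature matrix."""
    gray = gray.astype(np.float64)
    gy, gx = np.gradient(gray)
    grad_dir = np.arctan2(gy, gx)                              # gradient direction, [-pi, pi]
    feat = alpha * (gray / 255.0) + (1 - alpha) * (grad_dir + np.pi) / (2 * np.pi)
    return feat.mean(axis=1), feat.mean(axis=0)                # aggregate along columns / rows

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def image_similarity(img_a, img_b):
    ra, ca = aggregation_descriptor(img_a)
    rb, cb = aggregation_descriptor(img_b)
    return 0.5 * (cosine(ra, rb) + cosine(ca, cb))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, (120, 160)).astype(np.float64)
    revisit = np.clip(img + rng.normal(0, 5, img.shape), 0, 255)   # near-duplicate view
    other = rng.integers(0, 256, (120, 160)).astype(np.float64)
    # The near-duplicate pair should score higher than the unrelated pair.
    print(image_similarity(img, revisit), image_similarity(img, other))
```

A multi-layer check as described in the abstract would use these cheap vectors to shortlist candidates before any finer (and slower) verification stage.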
Figure 6
Examples of scenes in the dataset. (a) Around-view image; (b) fast motion and uneven feature distribution; (c) low luminance; (d) high luminance; (e) image rotation.
Figure 7
Experimental trajectory and loop-closure ground truth. From left to right: sequences 00, 05, 06, 07 from KITTI.
Figure 8
Experimental trajectory and loop-closure ground truth in local datasets.
Figure 9
Precision–recall curves based on the 00 sequence (left) and the local dataset (right).
Figure 10
Maximum recall at 100% precision.
Figure 11
Schematic of image processing. The left image is the original, and the right image has luminance reduced to 15%.
Figure 12
Maximum recall at 100% precision in low-luminance scenarios.
Figure 13
Comparison of trajectory results. The red box indicates a locally magnified trajectory. The left and right images correspond to sequences 05 and 06, both from the KITTI dataset.