Virtual and Real Occlusion Processing Method of Monocular Visual Assembly Scene Based on ORB-SLAM3
Figure 1. Monocular vision-based occlusion processing flow framework.
Figure 2. Schematic diagram of occlusion relationships.
Figure 3. Epipolar geometry constraints of feature point pairs.
Figure 4. Key points and camera positions.
Figure 5. Comparison of tracking and positioning effects of different algorithms. (a) Comparison of trajectory translations; (b) comparison of trajectory rotations.
Figure 6. RGB image and depth map of the assembly scene. (a) Original RGB images; (b) densified depth image; (c–f) results of the densified depth maps for different assembly scenes.
Figure 7. Rendering process for virtual–real occlusion of assembly objects.
Figure 8. Fusion of the virtual model of the aero-engine external-accessory exhaust pipe bolt with the real scene.
Figure 9. Comparison of edge errors in depth images. (a) Schematic diagram of the accessory connecting tube and its contour extraction; (b) depth-map sampling-point error.
Abstract
1. Introduction
- Propose a novel method based on ORB-SLAM3 for handling virtual–real occlusion in MR environments, specifically tailored for the complex assembly scenes of aero-engines.
- Propose the MNSTF algorithm to optimize ORB-SLAM3 feature matching and to reconstruct sparse depth points of the assembly scene, enabling reliable feature extraction and matching in weakly textured or untextured regions (a baseline matching sketch follows this list).
- Propose a bicubic interpolation-based method to densify sparse depth maps and fuse them with the depth information of the 3D model in the digitized process model, generating a complete and accurate depth map of the real scene (see the densification sketch after this list).
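For context, the sketch below shows only the generic ORB extraction-and-matching step at the front end of ORB-SLAM3, i.e., the stage that the MNSTF algorithm refines; it does not reproduce MNSTF itself, whose details appear in Section 4.2. The function name and parameters are illustrative, with OpenCV's `cv2.ORB_create` and a brute-force Hamming matcher used as stand-ins.

```python
# Baseline ORB feature extraction and matching, as in ORB-SLAM3's tracking
# front end. Illustrative only; the paper's MNSTF refinement is not shown.
import cv2

def match_orb_features(img_a, img_b, n_features=1000):
    """Extract ORB keypoints in two grayscale frames and match descriptors."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)

    # Hamming distance suits ORB's binary descriptors; cross-checking keeps
    # only mutually best pairs, a cheap stand-in for stricter filtering.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    return kp_a, kp_b, matches
```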
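The following is a minimal densification sketch under stated assumptions: the paper's own scheme is bicubic interpolation fused with CAD-model depth (Section 4.3), whereas here SciPy's piecewise-cubic scattered-data interpolator stands in for that step, with nearest-neighbour fill outside the convex hull of the samples. The function name and the zero-means-missing convention are assumptions.

```python
# Illustrative densification of a sparse SLAM depth map; a stand-in for the
# paper's bicubic scheme, not the authors' implementation.
import numpy as np
from scipy.interpolate import griddata

def densify_sparse_depth(sparse_depth):
    """sparse_depth: (H, W) array with 0 where no depth sample exists."""
    h, w = sparse_depth.shape
    ys, xs = np.nonzero(sparse_depth)            # pixels with known depth
    values = sparse_depth[ys, xs]

    grid_y, grid_x = np.mgrid[0:h, 0:w]
    dense = griddata((ys, xs), values, (grid_y, grid_x), method="cubic")

    # Cubic interpolation is undefined outside the convex hull of samples;
    # fall back to nearest-neighbour there so every pixel gets a depth.
    nearest = griddata((ys, xs), values, (grid_y, grid_x), method="nearest")
    return np.where(np.isnan(dense), nearest, dense)
```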
2. Related Work
3. Monocular Vision-Based Occlusion Handling Method
3.1. Virtual and Real Occlusion Handling Framework
3.2. The Relationship Between Virtual and Real Object Occlusion
4. Depth Image-Based Voxel Occlusion Rendering
4.1. Assembly Scene Sparse Depth Point Reconstruction
4.2. Improved ORB-SLAM3 Feature Point Matching
4.3. Depth Map Densification
4.4. Assembly Scene Occlusion Rendering
5. Case Study
5.1. Virtual and Real Occlusion Effects
5.2. Accuracy Analysis of Virtual and Real Occlusion
5.3. Timeliness Analysis of Occlusion
6. Conclusions
- By incorporating the MNSTF algorithm, we enhanced the feature matching and optimization capabilities of ORB-SLAM3, enabling more accurate reconstruction of the assembly scene with sparse depth points. This improvement significantly reduces computational overhead while maintaining high precision in depth estimation.
- Our method compares the depth value of each pixel in the real and virtual scene depth maps to determine the spatial relationship between virtual and real objects, ensuring accurate occlusion handling and optimizing the visual fusion effect in MR-assisted assembly scenarios (a minimal compositing sketch follows this list).
- MR-assisted assembly guidance for aero-engine piping connectors was used for experimental validation, with comparisons against Holynski and Kopf’s densification method and an Azure Kinect depth camera. The results show that the proposed virtual–real occlusion processing method based on improved ORB-SLAM3 monocular vision effectively handles occlusion during MR-assisted assembly and performs well in terms of occlusion effect, accuracy, and timeliness.
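As a concrete illustration of the per-pixel comparison described in the second bullet, the sketch below draws the rendered virtual object over the real frame only where its depth is smaller (closer) than the densified real-scene depth. All names and the zero-means-no-coverage convention are illustrative assumptions, not the authors' implementation.

```python
# Per-pixel depth test between the densified real-scene depth map and the
# rendered virtual depth buffer; names are illustrative.
import numpy as np

def composite_virtual_real(real_rgb, real_depth, virtual_rgb, virtual_depth):
    """All inputs are (H, W[, 3]) arrays with depths in the same metric scale.

    A virtual pixel is drawn only where it lies in front of the real
    surface, so real parts correctly occlude the virtual model.
    """
    virtual_valid = virtual_depth > 0                    # rendered coverage
    in_front = virtual_valid & (virtual_depth < real_depth)
    mask = in_front[..., np.newaxis]                     # broadcast over RGB
    return np.where(mask, virtual_rgb, real_rgb)
```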
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ma, F.; Karaman, S. Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018. [Google Scholar]
- Fu, H.; Gong, M.; Wang, C.; Batmanghelich, K.; Tao, D. Deep Ordinal Regression Network for Monocular Depth Estimation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
- Raj, S.; Murthy, L.R.; Shanmugam, T.A.; Kumar, G.; Chakrabarti, A.; Biswas, P. Augmented reality and deep learning based system for assisting assembly process. J. Multimodal User Interfaces 2024, 18, 119–133. [Google Scholar] [CrossRef]
- Yuan, M.L.; Ong, S.K.; Nee, A.Y.C. Augmented reality for assembly guidance using a virtual interactive tool. Int. J. Prod. Res. 2008, 46, 1745–1767. [Google Scholar] [CrossRef]
- Subramanian, K.; Thomas, L.; Sahin, M.; Sahin, F. Supporting Human-Robot Interaction in Manufacturing with Augmented Reality and Effective Human–Computer Interaction: A Review and Framework. Machines 2024, 12, 706. [Google Scholar] [CrossRef]
- Mu, X.; Wang, Y.; Yuan, B.; Sun, W.; Liu, C.; Sun, Q. A new assembly precision prediction method of aeroengine high-pressure rotor system considering manufacturing error and deformation of parts. J. Manuf. Syst. 2021, 61, 112–124. [Google Scholar] [CrossRef]
- Li, J.; Wang, S.; Wang, G.; Zhang, J.; Feng, S.; Xiao, Y.; Wu, S. The effects of complex assembly task type and assembly experience on users’ demands for augmented reality instructions. Int. J. Adv. Manuf. Technol. 2024, 131, 1479–1496. [Google Scholar] [CrossRef]
- Patricio, A.; Valente, J.; Dehban, A.; Cadilha, I.; Reis, D.; Ventura, R. AI-Powered Augmented Reality for Satellite Assembly, Integration and Test. arXiv 2024, arXiv:2409.18101. [Google Scholar]
- Wolfartsberger, J.; Hallewell Haslwanter, J.D.; Lindorfer, R. Perspectives on Assistive Systems for Manual Assembly Tasks in Industry. Technologies 2019, 7, 12. [Google Scholar] [CrossRef]
- Li, W.; Wang, J.; Liu, M.; Zhao, S.; Ding, X. Integrated registration and occlusion handling based on deep learning for augmented-reality-assisted assembly instruction. IEEE Trans. Ind. Inform. 2022, 19, 6825–6835. [Google Scholar] [CrossRef]
- Tian, Y.; Long, Y.; Xia, D.; Yao, H.; Zhang, J. Handling occlusions in augmented reality based on 3D reconstruction method. Neurocomputing 2015, 156, 96–104. [Google Scholar] [CrossRef]
- Zhu, J.; Pan, Z.; Sun, C.; Chen, W. Handling occlusions in video-based augmented reality using depth information. Comput. Animat. Virtual Worlds 2010, 21, 509–521. [Google Scholar] [CrossRef]
- Hayashi, K.; Kato, H.; Nishida, S. Occlusion Detection of Real Objects Using Contour Based Stereo Matching. In Proceedings of the ICAT05: The International Conference on Augmented Tele-Existence, Christchurch, New Zealand, 5–8 December 2005; pp. 180–186. [Google Scholar]
- Walton, D.R.; Steed, A. Accurate Real-Time Occlusion for Mixed Reality. In Proceedings of the VRST ’17: 23rd ACM Symposium on Virtual Reality Software and Technology, Gothenburg, Sweden, 8–10 November 2017; pp. 1–10. [Google Scholar]
- Lee, J.H.; Kim, C.S. Monocular depth estimation using relative depth maps. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9729–9738. [Google Scholar]
- Liu, F.; Zhou, S.; Wang, Y.; Hou, G.; Sun, Z.; Tan, T. Binocular light-field: Imaging theory and occlusion-robust depth perception application. IEEE Trans. Image Process. 2019, 29, 1628–1640. [Google Scholar] [CrossRef] [PubMed]
- Kim, H.; Yang, S.J.; Sohn, K. 3D reconstruction of stereo images for interaction between real and virtual worlds. In Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality, Tokyo, Japan, 10 October 2003. [Google Scholar]
- Zheng, Y.; Liu, P.; Qian, L.; Qin, S.; Liu, X.; Ma, Y.; Cheng, G. Recognition and depth estimation of ships based on binocular stereo vision. J. Mar. Sci. Eng. 2022, 10, 1153. [Google Scholar] [CrossRef]
- Yang, Y.; Meng, X.; Gao, M. Vision system of mobile robot combining binocular and depth cameras. J. Sens. 2017, 2017, 4562934. [Google Scholar] [CrossRef]
- Luo, T.; Liu, Z.; Pan, Z.; Zhang, M. A virtual-real occlusion method based on GPU acceleration for MR. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1068–1069. [Google Scholar]
- Zhang, C.; Xu, R.C.; Han, C.; Zhai, H.Y. An Occlusion Consistency Processing Method Based on Virtual-Real Fusion. In Frontier Research and Innovation in Optoelectronics Technology and Industry; CRC Press: Boca Raton, FL, USA, 2018; pp. 47–56. [Google Scholar]
- Ibrahim, M.M.; Liu, Q.; Khan, R.; Yang, J.; Adeli, E.; Yang, Y. Depth map artefacts reduction: A review. IET Image Process. 2020, 14, 2630–2644. [Google Scholar] [CrossRef]
- Simon, N.; Majumdar, A. MonoNav: MAV Navigation via Monocular Depth Estimation and Reconstruction. In Proceedings of the 18th International Symposium on Experimental Robotics (ISER 2023), Chiang Mai, Thailand, 26–30 November 2023; Springer Nature: Cham, Switzerland, 2023; pp. 415–426. [Google Scholar]
- Chaplot, D.S.; Gandhi, D.; Gupta, S.; Gupta, A.; Salakhutdinov, R. Learning to Explore Using Active Neural SLAM. arXiv 2020, arXiv:2004.05155. [Google Scholar]
- Chaplot, D.S.; Salakhutdinov, R.; Gupta, A.; Gupta, S. Neural topological slam for visual navigation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 12875–12884. [Google Scholar]
- Muravyev, K.; Bokovoy, A.; Yakovlev, K. tx2_fcnn_node: An open-source ROS compatible tool for monocular depth reconstruction. SoftwareX 2022, 17, 100956. [Google Scholar] [CrossRef]
- Chang, M.F. Monocular Depth Reconstruction Using Geometry and Deep Convolutional Networks. Master’s Thesis, Carnegie Mellon University, Pittsburgh, PA, USA, May 2018. [Google Scholar]
- Luo, Y.; Liu, G.; Liu, H.; Liu, T.; Tian, G.; Ji, Z. Simultaneous Monocular Visual Odometry and Depth Reconstruction with Scale Recovery. In Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), Dali, China, 6–8 December 2019; pp. 682–687. [Google Scholar]
- Fink, L.; Franke, L.; Keinert, J.; Stamminger, M. Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering. arXiv 2024, arXiv:2410.03861. [Google Scholar]
- Zhang, X.; Zhao, B.; Yao, J.; Wu, G. Unsupervised monocular depth and camera pose estimation with multiple masks and geometric consistency constraints. Sensors 2023, 23, 5329. [Google Scholar] [CrossRef]
- Li, W.; Wang, J.; Liu, M.; Zhao, S. Real-time occlusion handling for augmented reality assistance assembly systems with monocular images. J. Manuf. Syst. 2022, 62, 561–574. [Google Scholar] [CrossRef]
- Holynski, A.; Kopf, J. Fast depth densification for occlusion-aware augmented reality. ACM Trans. Graph. 2018, 37, 1–11. [Google Scholar] [CrossRef]
- Han, X.; Chen, X.; Deng, H.; Wan, P.; Li, J. Point Cloud Deep Learning Network Based on Local Domain Multi-Level Feature. Appl. Sci. 2023, 13, 10804. [Google Scholar] [CrossRef]
- Dengwen, Z. An Edge-Directed Bicubic Interpolation Algorithm. In Proceedings of the 2010 3rd International Congress on Image and Signal Processing, Yantai, China, 16–18 October 2010; Volume 3, pp. 1186–1189. [Google Scholar]
- Kirkland, E.J. Bilinear Interpolation. In Advanced Computing in Electron Microscopy; Springer: Boston, MA, USA, 2010; pp. 261–263. [Google Scholar]
- Bai, Y.; Wang, D. On the comparison of trilinear, cubic spline, and fuzzy interpolation methods in the high-accuracy measurements. IEEE Trans. Fuzzy Syst. 2010, 18, 1016–1022. [Google Scholar]
- Azure Kinect DK. Available online: https://learn.microsoft.com/zh-cn/azure/kinect-dk/depth-camera (accessed on 15 May 2024).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, H.; Chen, C.; Yin, Q.; Ma, C.; Guo, F. Virtual and Real Occlusion Processing Method of Monocular Visual Assembly Scene Based on ORB-SLAM3. Machines 2025, 13, 212. https://doi.org/10.3390/machines13030212