SLAM-OR: Simultaneous Localization, Mapping and Object Recognition Using Video Sensors Data in Open Environments from the Sparse Points Cloud
Figure 1. The basic concept of the SLAM-OR algorithm.
Figure 2. Example results of object bounding-box calculation using the PCA-based approach from Equation (13). Values on the X and Y axes are point coordinates calculated by our algorithm; the scaling of the axes is irrelevant because PCA normalizes the variables. Red points are computed directly by the SLAM algorithm. The blue segment lies along the direction of highest variance of the red points (the first principal component $\bar{v}_1$). The orange segment is perpendicular to the blue one and lies along the direction of second-highest variance (the second principal component $\bar{v}_2$). The bounding box has four edges: violet ($\bar{b}_4$), green ($\bar{b}_3$), brown ($\bar{b}_2$) and red ($\bar{b}_1$).
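As a rough illustration of the caption above, the following NumPy sketch computes a PCA-oriented bounding box for a set of 2D map points. The function name `pca_bounding_box` and the corner ordering are our own assumptions for illustration; this is not the authors' implementation of Equation (13).

```python
import numpy as np

def pca_bounding_box(points: np.ndarray) -> np.ndarray:
    """Oriented bounding box of 2D map points via PCA.

    points: (N, 2) array of map-point coordinates (the red points
    in Figure 2). Returns the box's four corners as a (4, 2) array.
    """
    mean = points.mean(axis=0)
    centered = points - mean
    # Principal axes = eigenvectors of the covariance matrix;
    # eigh returns eigenvalues in ascending order, so the last
    # column is the direction of highest variance (v1 in Figure 2).
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    axes = eigvecs[:, ::-1]            # columns: v1, v2
    proj = centered @ axes             # coordinates in the PCA frame
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    corners_pca = np.array([[lo[0], lo[1]], [hi[0], lo[1]],
                            [hi[0], hi[1]], [lo[0], hi[1]]])
    # Rotate the corners back into the original frame
    # (axes is orthogonal, so its transpose is its inverse).
    return corners_pca @ axes.T + mean
```

Because the box is axis-aligned in the PCA frame and only rotated back at the end, its edges stay parallel to $\bar{v}_1$ and $\bar{v}_2$, matching the construction shown in the figure.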
Figure 3. Loss function (Equation (7)) values over the course of training on a subset of the OIDv4 dataset.
Figure 4. Influence of different DBSCAN parameters on the detected objects’ bounding boxes, visualized on a fragment of the KITTI 39 dataset. (a) DBSCAN eps = 0.5, min samples = 7. (b) DBSCAN eps = 1.5, min samples = 7. (c) DBSCAN eps = 1, min samples = 5.
Figure 5. Example results of the SLAM-OR algorithm with DBSCAN parameters eps = 1 and min samples = 5. The black path and blue rectangles are reference data; the red line and green rectangles are SLAM-OR results. (a) SLAM and bird’s-eye-view mapping of the KITTI 09 dataset. (b) SLAM and bird’s-eye-view mapping of the KITTI 64 dataset.
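Figures 4 and 5 hinge on the two DBSCAN parameters applied to the sparse SLAM point cloud: eps and min samples. A minimal sketch of that clustering step using scikit-learn follows; the `map_points` array is a synthetic placeholder, not KITTI data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Placeholder (x, z) ground-plane coordinates of sparse SLAM map
# points; in the paper these come from the visual SLAM front end.
rng = np.random.default_rng(0)
map_points = rng.uniform(0.0, 30.0, size=(500, 2))

# The two parameters swept in Figure 4: eps (neighborhood radius in
# map units) and min_samples (minimum points to form a dense region).
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(map_points)

# Label -1 marks noise; every other label is one object candidate,
# which can then be wrapped in a PCA bounding box (Figure 2).
for k in sorted(set(labels) - {-1}):
    print(f"cluster {k}: {(labels == k).sum()} points")
```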
Abstract
1. Introduction
1.1. The State of the Art
1.1.1. SLAM
1.1.2. Object Detection and Recognition
1.1.3. Localization, Mapping and Object Recognition
1.1.4. Point Cloud Clustering
1.2. Study Motivation
1.3. Comparison with Other OR-Based Location Recognition Algorithms
2. Materials and Methods
2.1. Simultaneous Localization and Mapping
2.1.1. Tracking
2.1.2. Mapping
2.1.3. Loop Closing
2.2. YOLO
2.3. RetinaNet
2.4. MobileNet
2.5. DBSCAN
2.6. OPTICS
2.7. SLAM-OR Algorithm
2.8. Data Set
3. Results
3.1. SLAM Algorithm Validation
3.2. Selection of Object Recognition Algorithm
3.3. SLAM-OR Evaluation
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Leonard, J.J.; Durrant-Whyte, H.F. Simultaneous map building and localization for an autonomous mobile robot. In Proceedings of the IROS ’91: IEEE/RSJ International Workshop on Intelligent Robots and Systems ’91, Osaka, Japan, 3–5 November 1991; Volume 3, pp. 1442–1447. [Google Scholar] [CrossRef]
- Di, K.; Wan, W.; Zhao, H.; Liu, Z.; Wang, R.; Zhang, F. Progress and Applications of Visual SLAM. Acta Geod. Cartogr. Sin. 2018, 47, 770–779. [Google Scholar] [CrossRef]
- Naseer, T.; Ruhnke, M.; Stachniss, C.; Spinello, L.; Burgard, W. Robust visual SLAM across seasons. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 2529–2535. [Google Scholar] [CrossRef]
- Bojarski, M.; Testa, D.; Dworakowski, D.; Firner, B.; Flepp, B.; Goyal, P.; Jackel, L.; Monfort, M.; Muller, U.; Zhang, J.; et al. End to End Learning for Self-Driving Cars. arXiv 2016, arXiv:1604.07316. [Google Scholar]
- Chen, Z.; Huang, X. End-to-end learning for lane keeping of self-driving cars. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Redondo Beach, CA, USA, 11–17 June 2017; pp. 1856–1860. [Google Scholar] [CrossRef]
- Pillai, S.; Leonard, J. Monocular SLAM Supported Object Recognition. arXiv 2015, arXiv:1506.01732. [Google Scholar]
- Li, P.; Zhang, G.; Zhou, J.; Yao, R.; Zhang, X.; Zhou, J. Study on Slam Algorithm Based on Object Detection in Dynamic Scene. In Proceedings of the 2019 International Conference on Advanced Mechatronic Systems (ICAMechS), Shiga, Japan, 26–28 August 2019; pp. 363–367. [Google Scholar] [CrossRef]
- Zhang, Z.; Zhang, J.; Tang, Q. Mask R-CNN Based Semantic RGB-D SLAM for Dynamic Scenes. In Proceedings of the 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Hong Kong, China, 8–12 July 2019; pp. 1151–1156. [Google Scholar] [CrossRef]
- Bavle, H.; De La Puente, P.; How, J.P.; Campoy, P. VPS-SLAM: Visual Planar Semantic SLAM for Aerial Robotic Systems. IEEE Access 2020, 8, 60704–60718. [Google Scholar] [CrossRef]
- Hess, W.; Kohler, D.; Rapp, H.; Andor, D. Real-time loop closure in 2D LIDAR SLAM. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 1271–1278. [Google Scholar] [CrossRef]
- Jung, D.W.; Lim, Z.S.; Kim, B.G.; Kim, N.K. Multi-channel ultrasonic sensor system for obstacle detection of the mobile robot. In Proceedings of the 2007 International Conference on Control, Automation and Systems, Seoul, Korea, 17–20 October 2007; pp. 2347–2351. [Google Scholar] [CrossRef]
- Kim, H.D.; Seo, S.W.; Jang, I.-H.; Sim, K.B. SLAM of mobile robot in the indoor environment with Digital Magnetic Compass and Ultrasonic Sensors. In Proceedings of the 2007 International Conference on Control, Automation and Systems, Seoul, Korea, 17–20 October 2007; pp. 87–90. [Google Scholar] [CrossRef]
- Xuexi, Z.; Guokun, L.; Genping, F.; Dongliang, X.; Shiliu, L. SLAM Algorithm Analysis of Mobile Robot Based on Lidar. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 4739–4745. [Google Scholar] [CrossRef]
- Holder, M.; Hellwig, S.; Winner, H. Real-Time Pose Graph SLAM based on Radar. In Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 9–12 June 2019; pp. 1145–1151. [Google Scholar] [CrossRef]
- Kiss-Illés, D.; Barrado, C.; Salamí, E. GPS-SLAM: An Augmentation of the ORB-SLAM Algorithm. Sensors 2019, 19, 4973. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Chong, T.; Tang, X.; Leng, C.; Yogeswaran, M.; Ng, O.; Chong, Y. Sensor Technologies and Simultaneous Localization and Mapping (SLAM). Procedia Comput. Sci. 2015, 76, 174–179. [Google Scholar] [CrossRef] [Green Version]
- Taketomi, T.; Uchiyama, H.; Ikeda, S. Visual SLAM algorithms: A survey from 2010 to 2016. IPSJ Trans. Comput. Vis. Appl. 2017, 9. [Google Scholar] [CrossRef]
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571. [Google Scholar] [CrossRef]
- Majdik, A.L.; Tamas, L.; Popa, M.; Szoke, I.; Lazea, G. Visual odometer system to build feature based maps for mobile robot navigation. In Proceedings of the 18th Mediterranean Conference on Control and Automation, MED’10, Marrakech, Morocco, 23–25 June 2010; pp. 1200–1205. [Google Scholar] [CrossRef]
- Mito, Y.; Morimoto, M.; Fujii, K. An Object Detection and Extraction Method Using Stereo Camera. In Proceedings of the 2006 World Automation Congress, Budapest, Hungary, 24–26 July 2006; pp. 1–6. [Google Scholar] [CrossRef]
- Chan, S.H.; Wu, P.T.; Fu, L.C. Robust 2D Indoor Localization Through Laser SLAM and Visual SLAM Fusion. In Proceedings of the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 7–10 October 2018; pp. 1263–1268. [Google Scholar] [CrossRef]
- Hui, C.; Shiwei, M. Visual SLAM based on EKF filtering algorithm from omnidirectional camera. In Proceedings of the 2013 IEEE 11th International Conference on Electronic Measurement Instruments, Harbin, China, 16–18 August 2013; Volume 2, pp. 660–663. [Google Scholar] [CrossRef]
- Aulinas, J.; Petillot, Y.; Salvi, J.; Llado, X. The SLAM problem: A survey. Artif. Intell. Res. Dev. 2008, 184, 363–371. [Google Scholar] [CrossRef]
- Fuentes-Pacheco, J.; Ascencio, J.; Rendon-Mancha, J. Visual Simultaneous Localization and Mapping: A Survey. Artif. Intell. Rev. 2015, 43. [Google Scholar] [CrossRef]
- Bresson, G.; Alsayed, Z.; Yu, L.; Glaser, S. Simultaneous Localization and Mapping: A Survey of Current Trends in Autonomous Driving. IEEE Trans. Intell. Veh. 2017, 2, 194–220. [Google Scholar] [CrossRef] [Green Version]
- Hachaj, T.; Miazga, J. Image Hashtag Recommendations Using a Voting Deep Neural Network and Associative Rules Mining Approach. Entropy 2020, 22, 1351. [Google Scholar] [CrossRef]
- Hachaj, T.; Bibrzycki, Ł.; Piekarczyk, M. Recognition of Cosmic Ray Images Obtained from CMOS Sensors Used in Mobile Phones by Approximation of Uncertain Class Assignment with Deep Convolutional Neural Network. Sensors 2021, 21, 1963. [Google Scholar] [CrossRef] [PubMed]
- Fang, M. Intelligent Processing Technology of Cross Media Intelligence Based on Deep Cognitive Neural Network and Big Data. In Proceedings of the 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 23–25 October 2020; pp. 505–508. [Google Scholar] [CrossRef]
- Ahmad, A.S. Brain inspired cognitive artificial intelligence for knowledge extraction and intelligent instrumentation system. In Proceedings of the 2017 International Symposium on Electronics and Smart Devices (ISESD), Yogyakarta, Indonesia, 17–19 October 2017; pp. 352–356. [Google Scholar] [CrossRef]
- Kim, J.; Kim, D. Fast Car/Human Classification Using Triple Directional Edge Property and Local Relations. In Proceedings of the 2009 11th IEEE International Symposium on Multimedia, San Diego, CA, USA, 14–16 December 2009; pp. 106–111. [Google Scholar] [CrossRef]
- Molchanov, V.; Vishnyakov, B.; Vizilter, Y.; Vishnyakova, O.; Knyaz, V. Pedestrian detection in video surveillance using fully convolutional YOLO neural network. Proc. SPIE 2017, 10334, 103340Q. [Google Scholar] [CrossRef]
- Lin, J.P.; Sun, M.T. A YOLO-Based Traffic Counting System. In Proceedings of the 2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI), Taichung, Taiwan, 30 November–2 December 2018; pp. 82–85. [Google Scholar] [CrossRef]
- Strbac, B.; Gostovic, M.; Lukac, Z.; Samardzija, D. YOLO Multi-Camera Object Detection and Distance Estimation. In Proceedings of the 2020 Zooming Innovation in Consumer Technologies Conference (ZINC), Novi Sad, Serbia, 26–27 May 2020; pp. 26–30. [Google Scholar] [CrossRef]
- Rajasekaran, C.; Jayanthi, K.B.; Sudha, S.; Kuchelar, R. Automated Diagnosis of Cardiovascular Disease Through Measurement of Intima Media Thickness Using Deep Neural Networks. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; Volume 2019, pp. 6636–6639. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014; pp. 580–587. [Google Scholar] [CrossRef] [Green Version]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Computer Vision—ECCV 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar]
- Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327. [Google Scholar] [CrossRef] [Green Version]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef] [Green Version]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar] [CrossRef] [Green Version]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef] [Green Version]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Bernuy, F.; Ruiz Del Solar, J. Semantic Mapping of Large-Scale Outdoor Scenes for Autonomous Off-Road Driving. In Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 7–13 December 2015; pp. 124–130. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
- Zhou, B.; Lapedriza, A.; Khosla, A.; Oliva, A.; Torralba, A. Places: A 10 Million Image Database for Scene Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 1. [Google Scholar] [CrossRef] [Green Version]
- Kang, R.; Shi, J.; Li, X.; Liu, Y.; Liu, X. DF-SLAM: A Deep-Learning Enhanced Visual SLAM System based on Deep Local Features. arXiv 2019, arXiv:1901.07223. [Google Scholar]
- Duan, C.; Junginger, S.; Huang, J.; Jin, K.; Thurow, K. Deep Learning for Visual SLAM in Transportation Robotics: A review. Transp. Saf. Environ. 2019, 1, 177–184. [Google Scholar] [CrossRef]
- Li, R.; Wang, S.; Gu, D. Ongoing Evolution of Visual SLAM from Geometry to Deep Learning: Challenges and Opportunities. Cogn. Comput. 2018, 10, 1–15. [Google Scholar] [CrossRef]
- Zhang, L.; Wei, L.; Shen, P.; Wei, W.; Zhu, G.; Song, J. Semantic SLAM Based on Object Detection and Improved Octomap. IEEE Access 2018, 6, 75545–75559. [Google Scholar] [CrossRef]
- Wang, M.; Long, X.; Chang, P.; Padlr, T. Autonomous Robot Navigation with Rich Information Mapping in Nuclear Storage Environments. In Proceedings of the 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Philadelphia, PA, USA, 6–8 August 2018; pp. 1–6. [Google Scholar] [CrossRef] [Green Version]
- Francis, Z.; Villagrasa, C.; Clairand, I. Simulation of DNA damage clustering after proton irradiation using an adapted DBSCAN algorithm. Comput. Methods Programs Biomed. 2011, 101, 265–270. [Google Scholar] [CrossRef] [PubMed]
- Mete, M.; Kockara, S.; Aydin, K. Fast density-based lesion detection in dermoscopy images. Comput. Med. Imaging Graph. Off. J. Comput. Med. Imaging Soc. 2011, 35, 128–136. [Google Scholar] [CrossRef] [PubMed]
- Wagner, T.; Feger, R.; Stelzer, A. Modification of DBSCAN and application to range/Doppler/DoA measurements for pedestrian recognition with an automotive radar system. In Proceedings of the 2015 European Radar Conference (EuRAD), Paris, France, 9–11 September 2015; pp. 269–272. [Google Scholar] [CrossRef]
- Sifa, R.; Bauckhage, C. Online k-Maxoids Clustering. In Proceedings of the 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Tokyo, Japan, 19–21 October 2017; pp. 667–675. [Google Scholar] [CrossRef]
- Bauckhage, C. NumPy/SciPy Recipes for Data Science: K-Medoids Clustering. ResearchGate 2015. [Google Scholar] [CrossRef]
- Wang, Z.; Huang, M.; Du, H.; Qin, H. A Clustering Algorithm Based on FDP and DBSCAN. In Proceedings of the 2018 14th International Conference on Computational Intelligence and Security (CIS), Hangzhou, China, 16–19 November 2018; pp. 145–149. [Google Scholar] [CrossRef]
- Ohadi, N.; Kamandi, A.; Shabankhah, M.; Fatemi, S.M.; Hosseini, S.M.; Mahmoudi, A. SW-DBSCAN: A Grid-based DBSCAN Algorithm for Large Datasets. In Proceedings of the 2020 6th International Conference on Web Research (ICWR), Tehran, Iran, 22–23 April 2020; pp. 139–145. [Google Scholar] [CrossRef]
- Weng, X.; Kitani, K. Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), IEEE Computer Society, Los Alamitos, CA, USA, 27–28 October 2019; pp. 857–866. [Google Scholar] [CrossRef] [Green Version]
- Qin, Z.; Wang, J.; Lu, Y. MonoGRNet: A Geometric Reasoning Network for 3D Object Localization. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, HI, USA, 27 January–1 February 2019. [Google Scholar]
- Bao, W.; Xu, B.; Chen, Z. MonoFENet: Monocular 3D Object Detection With Feature Enhancement Networks. IEEE Trans. Image Process. 2020, 29, 2753–2765. [Google Scholar] [CrossRef]
- Li, B.; Ouyang, W.; Sheng, L.; Zeng, X.; Wang, X. GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1019–1028. [Google Scholar] [CrossRef] [Green Version]
- Oeljeklaus, M.; Hoffmann, F.; Bertram, T. A Fast Multi-Task CNN for Spatial Understanding of Traffic Scenes. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 2825–2830. [Google Scholar] [CrossRef]
- Liu, L.; Lu, J.; Xu, C.; Tian, Q.; Zhou, J. Deep Fitting Degree Scoring Network for Monocular 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1057–1066. [Google Scholar] [CrossRef] [Green Version]
- Manhardt, F.; Kehl, W.; Gaidon, A. ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2064–2073. [Google Scholar] [CrossRef] [Green Version]
- Choi, H.M.; Kang, H.; Hyun, Y. Multi-View Reprojection Architecture for Orientation Estimation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea, 27–28 October 2019; pp. 2357–2366. [Google Scholar] [CrossRef]
- Jörgensen, E.; Zach, C.; Kahl, F. Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. arXiv 2019, arXiv:1906.08070. [Google Scholar]
- Mur-Artal, R.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. 2015, 31, 1147–1163. [Google Scholar] [CrossRef] [Green Version]
- Hachaj, T.; Mazurek, P. Modern UVC stereovision camera’s calibration and disparity maps generation: Mathematical basis, algorithms and implementations. Przegląd Elektrotechniczny 2020, 96, 168–173. [Google Scholar] [CrossRef]
- Rosten, E.; Drummond, T. Machine Learning for High-Speed Corner Detection. In Computer Vision—ECCV 2006; Leonardis, A., Bischof, H., Pinz, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 430–443. [Google Scholar]
- Calonder, M.; Lepetit, V.; Strecha, C.; Fua, P. BRIEF: Binary Robust Independent Elementary Features. In Computer Vision—ECCV 2010; Daniilidis, K., Maragos, P., Paragios, N., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 778–792. [Google Scholar]
- Rosin, P.L. Measuring Corner Properties. Comput. Vis. Image Underst. 1999, 73, 291–307. [Google Scholar] [CrossRef] [Green Version]
- Galvez-López, D.; Tardos, J.D. Bags of Binary Words for Fast Place Recognition in Image Sequences. IEEE Trans. Robot. 2012, 28, 1188–1197. [Google Scholar] [CrossRef]
- Liu, K.; Sun, H.; Ye, P. Research on bundle adjustment for visual SLAM under large-scale scene. In Proceedings of the 2017 4th International Conference on Systems and Informatics (ICSAI), Hangzhou, China, 11–13 November 2017; pp. 220–224. [Google Scholar] [CrossRef]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Li, Y.; Han, Z.; Xu, H.; Liu, L.; Li, X.; Zhang, K. YOLOv3-Lite: A Lightweight Crack Detection Network for Aircraft Structure Based on Depthwise Separable Convolutions. Appl. Sci. 2019, 9, 3781. [Google Scholar] [CrossRef] [Green Version]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y. Scaled-YOLOv4: Scaling Cross Stage Partial Network. arXiv 2020, arXiv:2011.08036v2. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944. [Google Scholar] [CrossRef] [Green Version]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML’15), Lille, France, 6–11 July 2015; pp. 448–456. [Google Scholar]
- Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering Points to Identify the Clustering Structure. ACM SIGMOD Rec. 1999, 28, 49–60. [Google Scholar] [CrossRef]
- Schubert, E.; Gertz, M. Improving the Cluster Structure Extracted from OPTICS Plots. In Proceedings of the Conference on Lernen, Wissen, Daten, Analysen (LWDA), Mannheim, Germany, 22–24 August 2018; pp. 318–329. [Google Scholar]
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets Robotics: The KITTI Dataset. Int. J. Robot. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef] [Green Version]
- Song, S.; Chandraker, M. Robust Scale Estimation in Real-Time Monocular SFM for Autonomous Driving. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1566–1573. [Google Scholar] [CrossRef] [Green Version]
- Ferrera, M.; Eudes, A.; Moras, J.; Sanfourche, M.; Le Besnerais, G. OV2SLAM: A Fully Online and Versatile Visual SLAM for Real-Time Applications. IEEE Robot. Autom. Lett. 2021, 6, 1399–1406. [Google Scholar] [CrossRef]
- Pan, Y.; Xiao, P.; He, Y.; Shao, Z.; Li, Z. MULLS: Versatile LiDAR SLAM via Multi-metric Linear Least Square. arXiv 2021, arXiv:2102.03771. [Google Scholar]
Method | Sensors | Translation Error (%) | Rotation Error (deg/m)
---|---|---|---
VISO2-M+GP [85] | Single camera | 7.46 | 0.0245
ORB SLAM [69] | Single camera | 6.23 | 0.07
OV2SLAM [86] | Stereo camera | 0.94 | 0.0023
MULLS [87] | LiDAR | 0.65 | 0.0019
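The translation and rotation columns follow the KITTI odometry convention of drift per travelled segment. A simplified sketch of the translation part of that metric, assuming frame-aligned 2D trajectories; the official benchmark averages over segments of 100–800 m and rotates each segment into the frame of its first pose, which this sketch omits.

```python
import numpy as np

def translation_drift(gt_xy: np.ndarray, est_xy: np.ndarray,
                      seg_len: float = 100.0) -> float:
    """Average translation drift over fixed-length segments.

    gt_xy, est_xy: (N, 2) frame-aligned trajectories in metres.
    Returns drift as a fraction of the segment length (x100 gives
    a percentage like the table above).
    """
    # Distance travelled along the ground-truth path.
    steps = np.linalg.norm(np.diff(gt_xy, axis=0), axis=1)
    dist = np.concatenate([[0.0], np.cumsum(steps)])
    errors = []
    for i in range(len(gt_xy)):
        # Index of the frame roughly seg_len metres further on.
        j = int(np.searchsorted(dist, dist[i] + seg_len))
        if j >= len(gt_xy):
            break
        gt_delta = gt_xy[j] - gt_xy[i]
        est_delta = est_xy[j] - est_xy[i]
        errors.append(np.linalg.norm(est_delta - gt_delta) / seg_len)
    return float(np.mean(errors)) if errors else float("nan")
```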
Method | YOLO v3 | YOLO v3 tiny | YOLO v4 | YOLO v4 tiny | MobileNet | RetinaNet
---|---|---|---|---|---|---
Loss | 12.01 | 10.28 | 4.44 | 9.29 | – | –
fps | 24.2 | 35.4 | 23.1 | 32.7 | 20.9 | 21.4
mAP (%) | 51.02 | 41.31 | 65.20 | 37.01 | 68.93 | 52.59
MIN | TPR | PPV | FNR | F1 | FM
---|---|---|---|---|---
5 | 0.30 ± 0.10 | 0.20 ± 0.09 | 0.70 ± 0.10 | 0.23 ± 0.09 | 0.24 ± 0.09
10 | 0.27 ± 0.08 | 0.24 ± 0.13 | 0.73 ± 0.08 | 0.25 ± 0.11 | 0.25 ± 0.10
15 | 0.32 ± 0.08 | 0.26 ± 0.13 | 0.68 ± 0.08 | 0.27 ± 0.10 | 0.28 ± 0.09
20 | 0.34 ± 0.10 | 0.28 ± 0.13 | 0.66 ± 0.10 | 0.29 ± 0.11 | 0.30 ± 0.11
25 | 0.37 ± 0.10 | 0.26 ± 0.13 | 0.63 ± 0.10 | 0.29 ± 0.12 | 0.30 ± 0.11
EPS | MIN | TPR | PPV | FNR | F1 | FM
---|---|---|---|---|---|---
0.5 | 3 | 0.34 ± 0.15 | 0.35 ± 0.11 | 0.66 ± 0.15 | 0.33 ± 0.11 | 0.34 ± 0.11
0.5 | 5 | 0.27 ± 0.13 | 0.40 ± 0.14 | 0.73 ± 0.13 | 0.31 ± 0.12 | 0.33 ± 0.12
0.5 | 7 | 0.22 ± 0.11 | 0.44 ± 0.15 | 0.78 ± 0.11 | 0.28 ± 0.11 | 0.30 ± 0.11
0.75 | 3 | 0.44 ± 0.15 | 0.31 ± 0.10 | 0.56 ± 0.15 | 0.35 ± 0.10 | 0.36 ± 0.11
0.75 | 5 | 0.38 ± 0.14 | 0.35 ± 0.11 | 0.62 ± 0.14 | 0.36 ± 0.11 | 0.36 ± 0.11
0.75 | 7 | 0.32 ± 0.14 | 0.36 ± 0.12 | 0.68 ± 0.14 | 0.33 ± 0.11 | 0.33 ± 0.11
1 | 3 | 0.50 ± 0.15 | 0.27 ± 0.09 | 0.50 ± 0.15 | 0.34 ± 0.09 | 0.36 ± 0.09
1 | 5 | 0.46 ± 0.15 | 0.31 ± 0.10 | 0.54 ± 0.15 | 0.37 ± 0.10 | 0.38 ± 0.10
1 | 7 | 0.42 ± 0.15 | 0.34 ± 0.11 | 0.58 ± 0.15 | 0.37 ± 0.10 | 0.37 ± 0.11
1.25 | 3 | 0.54 ± 0.15 | 0.25 ± 0.08 | 0.46 ± 0.15 | 0.33 ± 0.09 | 0.36 ± 0.09
1.25 | 5 | 0.51 ± 0.15 | 0.28 ± 0.09 | 0.49 ± 0.15 | 0.35 ± 0.09 | 0.37 ± 0.10
1.25 | 7 | 0.48 ± 0.15 | 0.31 ± 0.09 | 0.52 ± 0.15 | 0.37 ± 0.10 | 0.38 ± 0.10
1.5 | 3 | 0.58 ± 0.15 | 0.23 ± 0.07 | 0.42 ± 0.15 | 0.32 ± 0.08 | 0.36 ± 0.08
1.5 | 5 | 0.56 ± 0.15 | 0.25 ± 0.08 | 0.44 ± 0.15 | 0.34 ± 0.09 | 0.37 ± 0.09
1.5 | 7 | 0.53 ± 0.15 | 0.28 ± 0.08 | 0.47 ± 0.15 | 0.36 ± 0.09 | 0.38 ± 0.09
EPS | MIN | TPR | PPV | FNR | F1 | FM
---|---|---|---|---|---|---
0.5 | 3 | 0.34 ± 0.12 | 0.42 ± 0.12 | 0.66 ± 0.12 | 0.37 ± 0.10 | 0.37 ± 0.10
0.5 | 5 | 0.29 ± 0.10 | 0.50 ± 0.14 | 0.71 ± 0.10 | 0.36 ± 0.10 | 0.37 ± 0.10
0.5 | 7 | 0.25 ± 0.09 | 0.55 ± 0.13 | 0.75 ± 0.09 | 0.33 ± 0.10 | 0.36 ± 0.09
0.75 | 3 | 0.40 ± 0.12 | 0.38 ± 0.12 | 0.60 ± 0.12 | 0.38 ± 0.10 | 0.39 ± 0.10
0.75 | 5 | 0.37 ± 0.12 | 0.44 ± 0.13 | 0.63 ± 0.12 | 0.39 ± 0.10 | 0.40 ± 0.10
0.75 | 7 | 0.33 ± 0.10 | 0.49 ± 0.13 | 0.67 ± 0.10 | 0.39 ± 0.10 | 0.40 ± 0.10
1 | 3 | 0.45 ± 0.13 | 0.34 ± 0.11 | 0.55 ± 0.13 | 0.38 ± 0.10 | 0.39 ± 0.10
1 | 5 | 0.43 ± 0.12 | 0.39 ± 0.12 | 0.57 ± 0.12 | 0.41 ± 0.11 | 0.41 ± 0.11
1 | 7 | 0.40 ± 0.12 | 0.44 ± 0.13 | 0.60 ± 0.12 | 0.41 ± 0.11 | 0.41 ± 0.11
1.25 | 3 | 0.49 ± 0.14 | 0.31 ± 0.11 | 0.51 ± 0.14 | 0.37 ± 0.11 | 0.39 ± 0.11
1.25 | 5 | 0.47 ± 0.13 | 0.35 ± 0.12 | 0.53 ± 0.13 | 0.39 ± 0.11 | 0.40 ± 0.11
1.25 | 7 | 0.45 ± 0.13 | 0.39 ± 0.13 | 0.55 ± 0.13 | 0.41 ± 0.11 | 0.41 ± 0.11
1.5 | 3 | 0.52 ± 0.15 | 0.29 ± 0.11 | 0.48 ± 0.15 | 0.36 ± 0.11 | 0.38 ± 0.11
1.5 | 5 | 0.50 ± 0.14 | 0.32 ± 0.12 | 0.50 ± 0.14 | 0.38 ± 0.12 | 0.40 ± 0.12
1.5 | 7 | 0.49 ± 0.14 | 0.35 ± 0.13 | 0.51 ± 0.14 | 0.40 ± 0.12 | 0.41 ± 0.12
EPS | MIN | TPR | PPV | FNR | F1 | FM |
---|---|---|---|---|---|---
0.5 | 3 | 0.37 ± 0.10 | 0.41 ± 0.13 | 0.63 ± 0.10 | 0.37 ± 0.09 | 0.38 ± 0.09 |
0.5 | 5 | 0.31 ± 0.08 | 0.47 ± 0.13 | 0.69 ± 0.08 | 0.36 ± 0.08 | 0.37 ± 0.08 |
0.5 | 7 | 0.26 ± 0.08 | 0.52 ± 0.14 | 0.74 ± 0.08 | 0.34 ± 0.07 | 0.36 ± 0.07 |
0.75 | 3 | 0.44 ± 0.10 | 0.37 ± 0.12 | 0.56 ± 0.10 | 0.39 ± 0.09 | 0.39 ± 0.09 |
0.75 | 5 | 0.39 ± 0.09 | 0.42 ± 0.13 | 0.61 ± 0.09 | 0.39 ± 0.09 | 0.39 ± 0.09 |
0.75 | 7 | 0.35 ± 0.09 | 0.44 ± 0.13 | 0.65 ± 0.09 | 0.38 ± 0.09 | 0.39 ± 0.09 |
1 | 3 | 0.50 ± 0.11 | 0.34 ± 0.12 | 0.50 ± 0.11 | 0.39 ± 0.10 | 0.40 ± 0.10 |
1 | 5 | 0.45 ± 0.10 | 0.38 ± 0.13 | 0.55 ± 0.10 | 0.40 ± 0.10 | 0.41 ± 0.09 |
1 | 7 | 0.42 ± 0.10 | 0.42 ± 0.14 | 0.58 ± 0.10 | 0.41 ± 0.09 | 0.42 ± 0.09 |
1.25 | 3 | 0.55 ± 0.11 | 0.31 ± 0.12 | 0.45 ± 0.11 | 0.38 ± 0.11 | 0.41 ± 0.10 |
1.25 | 5 | 0.52 ± 0.11 | 0.35 ± 0.13 | 0.48 ± 0.11 | 0.41 ± 0.10 | 0.43 ± 0.09 |
1.25 | 7 | 0.49 ± 0.11 | 0.39 ± 0.14 | 0.51 ± 0.11 | 0.41 ± 0.10 | 0.43 ± 0.10 |
1.5 | 3 | 0.60 ± 0.10 | 0.29 ± 0.11 | 0.40 ± 0.10 | 0.38 ± 0.11 | 0.41 ± 0.10 |
1.5 | 5 | 0.57 ± 0.10 | 0.32 ± 0.12 | 0.43 ± 0.10 | 0.39 ± 0.11 | 0.42 ± 0.10 |
1.5 | 7 | 0.54 ± 0.10 | 0.35 ± 0.13 | 0.46 ± 0.10 | 0.41 ± 0.11 | 0.42 ± 0.10 |
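The TPR/PPV/FNR/F1/FM columns above can be reproduced from IoU matching between detected and reference bounding boxes. A sketch under two assumptions of ours: matching is greedy at a fixed IoU threshold, and FM denotes the Fowlkes–Mallows index (the geometric mean of precision and recall, consistent with the tabulated values); the paper's exact matching rule may differ.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def detection_metrics(pred, gt, thr=0.5):
    """TPR, PPV, FNR, F1 and Fowlkes-Mallows index via greedy IoU matching."""
    unmatched = list(gt)   # ground-truth boxes not yet claimed
    tp = 0
    for p in pred:
        hit = next((g for g in unmatched if iou(p, g) >= thr), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    fp, fn = len(pred) - tp, len(unmatched)
    tpr = tp / (tp + fn) if tp + fn else 0.0  # recall
    ppv = tp / (tp + fp) if tp + fp else 0.0  # precision
    f1 = 2 * tpr * ppv / (tpr + ppv) if tpr + ppv else 0.0
    fm = (tpr * ppv) ** 0.5                   # Fowlkes-Mallows index
    return tpr, ppv, 1.0 - tpr, f1, fm
```

Note that FNR is simply 1 − TPR, which is why the two columns in every table sum to one.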
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cite as: Mazurek, P.; Hachaj, T. SLAM-OR: Simultaneous Localization, Mapping and Object Recognition Using Video Sensors Data in Open Environments from the Sparse Points Cloud. Sensors 2021, 21, 4734. https://doi.org/10.3390/s21144734