Abstract
Simultaneous Localization and Mapping (SLAM) aims to estimate the pose of a mobile robot and reconstruct a map of its environment. Odometry is an essential component that computes, on the fly, the translations and rotations between consecutive frames of the sensors mounted on the vehicle. Visual-LiDAR Odometry (VLO) is a prominent approach that combines the low sensor cost of cameras with the robustness of LiDAR to environmental changes. In general, one of the critical tasks in odometry is selecting the important features between frames. In this paper, we propose an end-to-end visual-LiDAR odometry method named AdVLO that selects the important regions between frames via an attention-driven mechanism. A mask of the essential regions of the input frame is generated via the attention mechanism, and we fuse this attention mask with the corresponding frame to retain those regions. Instead of concatenating visual and LiDAR features as in previous VLO works, we fuse them using a guided-attention technique. The translation and rotation of the camera are then computed sequentially by an LSTM. Experimental results on the KITTI dataset show that our proposed method achieves promising results compared to other odometry methods.
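To make the pipeline the abstract describes concrete, here is a minimal PyTorch sketch of its four stages: an attention mask applied to the camera frame, separate visual and LiDAR feature encoders, guided-attention fusion (visual features as queries over LiDAR keys/values) in place of concatenation, and an LSTM that regresses per-frame translation and rotation. All module names (AttentionMask, GuidedAttentionFusion), layer sizes, and the range-image LiDAR representation are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of an AdVLO-style pipeline; layer choices are assumptions.
import torch
import torch.nn as nn


class AttentionMask(nn.Module):
    """Predicts a per-pixel importance mask and applies it to the input frame."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, frame):
        mask = self.net(frame)
        return frame * mask  # suppress unimportant regions, keep essential ones


class GuidedAttentionFusion(nn.Module):
    """Fuses visual and LiDAR features via cross-attention instead of concatenation:
    visual tokens act as queries, LiDAR tokens as keys and values."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vis_feat, lidar_feat):
        fused, _ = self.attn(query=vis_feat, key=lidar_feat, value=lidar_feat)
        return fused + vis_feat  # residual connection keeps the visual stream


class AdVLO(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.mask = AttentionMask()
        # Toy encoders standing in for the real feature extractors.
        self.vis_enc = nn.Sequential(nn.Conv2d(3, dim, 8, stride=8), nn.ReLU())
        self.lidar_enc = nn.Sequential(nn.Conv2d(1, dim, 8, stride=8), nn.ReLU())
        self.fuse = GuidedAttentionFusion(dim)
        self.lstm = nn.LSTM(dim, 128, batch_first=True)
        self.pose = nn.Linear(128, 6)  # 3 translation + 3 rotation parameters

    def forward(self, frames, range_images):
        # frames: (B, T, 3, H, W) camera clip;
        # range_images: (B, T, 1, H, W) LiDAR scans projected to 2D.
        B, T = frames.shape[:2]
        feats = []
        for t in range(T):
            v = self.vis_enc(self.mask(frames[:, t]))      # (B, D, h, w)
            l = self.lidar_enc(range_images[:, t])         # (B, D, h, w)
            v = v.flatten(2).transpose(1, 2)               # (B, h*w, D) tokens
            l = l.flatten(2).transpose(1, 2)
            feats.append(self.fuse(v, l).mean(dim=1))      # (B, D) per frame
        seq, _ = self.lstm(torch.stack(feats, dim=1))      # (B, T, 128)
        return self.pose(seq)                              # (B, T, 6) poses


model = AdVLO()
poses = model(torch.rand(2, 4, 3, 64, 256), torch.rand(2, 4, 1, 64, 256))
print(poses.shape)  # torch.Size([2, 4, 6])
```

The residual connection in the fusion step and the mean-pooling over tokens are design choices made here only to keep the sketch short; the guided-attention idea itself (one modality attending over the other rather than channel-wise concatenation) is the point being illustrated.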
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lam, H., Pho, K., Yoshitaka, A. (2023). AdVLO: Region Selection via Attention-Driven for Visual LiDAR Odometry. In: Nguyen, N.T., et al. (eds.) Intelligent Information and Database Systems. ACIIDS 2023. Lecture Notes in Computer Science, vol. 13995. Springer, Singapore. https://doi.org/10.1007/978-981-99-5834-4_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5833-7
Online ISBN: 978-981-99-5834-4