The Impact of LiDAR Configuration on Goal-Based Navigation within a Deep Reinforcement Learning Framework
Figure 1. LiDAR sensor parameter definition.
Figure 2. The robot’s $(x, y)$ position is obtained from the robot odometry, while the distance between the robot and the obstacle is calculated from the LiDAR data, which is passed to the neural network model as input. After training, a navigation policy is produced to command the robot’s velocities at each time step.
Figure 3. The Husky A200 UGV (left) is a differential-drive wheeled unmanned ground vehicle designed for robotics research. The right side illustrates the differential drive of a mobile robot.
Figure 4. Illustration of the viewing angle of an object. The viewing angle is obtained from the sensor width (h) and the distance to the obstacle (L).
Figure 5. Illustration of a LiDAR sensor with different FOVs and beam densities. FOVs $\theta_x, \theta_y, \theta_z$ have beam densities $n_x, n_y, n_z$, respectively.
Figure 6. Sample of the simulated environments used for training. The robot position $(x, y)$ is $(0, 0)$ in both environments. The goal points are $(4, -1)$ and $(-4, -4)$ for the left and right environments, respectively.
Figure 7. The actor network.
Figure 8. The critic network.
Figure 9. The average reward, maximum reward, and loss value for the three DRL models.
Figure 10. Sample of the simulated environments used for testing and evaluating the three models.
Figure 11. Visualization of some robot trajectories from the start point to the goal point for the three LiDAR-configured models (grey, orange, and gold for models 1, 2, and 3, respectively) in the simulation environment. The blue marker indicates the start point, the green marker the goal point, the red marker a collision, and the purple marker a timeout.
Figure 12. Performance of the three trained LiDAR configuration models over four testing environments. Each environment is run 100 times.
Figure 13. Husky robot navigation in the real-world environment. (a) The real-world testing environment with a static obstacle; the end of the white tape indicates the goal point. (b) The resulting representation of the real-world map and navigation trajectory.
Figure 14. Husky robot path in the real-world environment. (a) Environment state with one obstacle. (b) Environment state with two obstacles. The grey, orange, and gold lines represent models 1, 2, and 3, respectively. The blue marker indicates the start point, the green marker the goal point, and the red marker a collision.
Abstract
1. Introduction
- We design a DRL control policy based on goal-based exploration.
- We explore the effect of LiDAR beam density and FOV on the performance of the DRL model by learning the FOV and beam density appropriate for a static environment. This is essential when the application calls for a low-resolution LiDAR sensor.
- We demonstrate the performance of our model in a simulated environment to test the effect of different LiDAR sensor configurations on collision avoidance.
- We demonstrate our control policy on a Husky A200 robot by Clearpath Robotics using environment dynamics different from those used for training.
2. Related Work
3. Problem Formulation
3.1. Simulation Environment
3.2. Action Space and State Representation
3.3. Deep Reinforcement Learning
3.4. Reward Function
- Collision Penalty: A circular zone of fixed radius, called the restricted zone, is placed around each obstacle. A collision is considered to have occurred if the robot enters the restricted zone. The collision penalty is defined as:
- Goal Reward: Analogous to the restricted zone, a circular zone of fixed radius around the goal point is referred to as the success zone. If the robot enters the success zone, it is considered to have reached its goal. Equation (10) gives the calculated goal reward.
- Distance Penalty: This penalty is based on the distance between the robot and the target point relative to the initial distance to the target point. If the robot is close to the goal point, it receives a small penalty; if the distance is large, it receives a large penalty.
- Heading Penalty: To ensure that the robot heads toward the goal point, a penalty is placed on the robot’s orientation. Given the robot’s heading and the bearing to the goal point, the heading penalty is calculated as follows (see the combined sketch after this list):
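The following is a minimal sketch of how these four terms could be combined into a single per-step reward. The zone radii, penalty magnitudes, and helper names are illustrative assumptions rather than the paper’s exact values, and the distance and heading terms are written in a generic normalized form rather than the paper’s exact equations.

```python
import numpy as np

# Illustrative zone radii and weights -- placeholder assumptions, not the
# paper's exact values.
RESTRICTED_RADIUS = 0.5   # collision (restricted) zone radius in metres
SUCCESS_RADIUS = 0.5      # goal (success) zone radius in metres
COLLISION_PENALTY = -100.0
GOAL_REWARD = 100.0

def step_reward(robot_xy, goal_xy, min_obstacle_dist, initial_goal_dist,
                robot_heading, goal_bearing):
    """Combine collision, goal, distance, and heading terms into one reward."""
    # Collision penalty: the robot entered the restricted zone around an obstacle.
    if min_obstacle_dist < RESTRICTED_RADIUS:
        return COLLISION_PENALTY

    goal_dist = float(np.linalg.norm(np.asarray(goal_xy) - np.asarray(robot_xy)))

    # Goal reward: the robot entered the success zone around the goal point.
    if goal_dist < SUCCESS_RADIUS:
        return GOAL_REWARD

    # Distance penalty: remaining distance relative to the initial distance
    # to the goal (small near the goal, large far away).
    distance_penalty = -goal_dist / initial_goal_dist

    # Heading penalty: angular error between the robot's heading and the
    # bearing to the goal, wrapped to [-pi, pi] and normalized.
    heading_error = np.arctan2(np.sin(goal_bearing - robot_heading),
                               np.cos(goal_bearing - robot_heading))
    heading_penalty = -abs(heading_error) / np.pi

    return distance_penalty + heading_penalty
```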
4. Training Environment and Process
5. Results
5.1. Evaluation Metrics
Algorithm 1: Average Q-value and maximum Q-value
Algorithm 2: Loss Value
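A minimal sketch, assuming a TD3-style critic, of how the average Q-value, maximum Q-value, and loss metrics could be computed from a batch of critic outputs. The function names and the mean-squared Bellman-error form of the loss are illustrative assumptions, not the paper’s listings.

```python
import numpy as np

def q_value_metrics(q_values):
    """Average and maximum Q-value over a batch of critic outputs."""
    q = np.asarray(q_values, dtype=float)
    return q.mean(), q.max()

def td_loss(q_current, rewards, q_next, dones, gamma=0.99999):
    """Mean-squared TD error used as a critic loss (Bellman target)."""
    q_current = np.asarray(q_current, dtype=float)
    target = (np.asarray(rewards, dtype=float)
              + gamma * np.asarray(q_next, dtype=float)
              * (1.0 - np.asarray(dones, dtype=float)))
    return float(np.mean((q_current - target) ** 2))

# Usage (per evaluation episode): accumulate and log the three metrics.
# avg_q, max_q = q_value_metrics(batch_q)
# loss = td_loss(batch_q, batch_rewards, batch_q_next, batch_dones)
```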
5.2. Training Result
5.3. Simulation Performance Evaluation
5.4. Real-World Performance Evaluation
6. Limitations and Future Work
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Marr, B. Demand for These Autonomous Delivery Robots Is Skyrocketing during This Pandemic. Forbes, 29 May 2020. [Google Scholar]
- Xie, Z.; Dames, P. Drl-vo: Learning to navigate through crowded dynamic scenes using velocity obstacles. IEEE Trans. Robot. 2023, 39, 2700–2719. [Google Scholar] [CrossRef]
- Takleh, T.T.O.; Bakar, N.A.; Rahman, S.A.; Hamzah, R.; Aziz, Z. A brief survey on SLAM methods in autonomous vehicle. Int. J. Eng. Technol. 2018, 7, 38–43. [Google Scholar] [CrossRef]
- Kim, D.L.; Park, H.W.; Yeon, Y.M. Analysis of optimal detection range performance of LiDAR systems applying coaxial optics. Heliyon 2022, 8, e12493. [Google Scholar] [CrossRef] [PubMed]
- Ma, Z.; Postolache, O.; Yang, Y. Obstacle Avoidance for Unmanned Vehicle based on a 2D LIDAR. In Proceedings of the 2019 International Conference on Sensing and Instrumentation in IoT Era (ISSI), Lisbon, Portugal, 29–30 August 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Cai, P.; Wang, S.; Wang, H.; Liu, M. Carl-lead: Lidar-based end-to-end autonomous driving with contrastive deep reinforcement learning. arXiv 2021, arXiv:2109.08473. [Google Scholar]
- Tsai, J.; Chang, C.C.; Ou, Y.C.; Sieh, B.H.; Ooi, Y.M. Autonomous driving control based on the perception of a lidar sensor and odometer. Appl. Sci. 2022, 12, 7775. [Google Scholar] [CrossRef]
- Peng, Y.; Qu, D.; Zhong, Y.; Xie, S.; Luo, J.; Gu, J. The obstacle detection and obstacle avoidance algorithm based on 2-D lidar. In Proceedings of the 2015 IEEE International Conference on Information and Automation, Lijiang, China, 8–10 August 2015; pp. 1648–1653. [Google Scholar] [CrossRef]
- Ghorpade, D.; Thakare, A.D.; Doiphode, S. Obstacle Detection and Avoidance Algorithm for Autonomous Mobile Robot using 2D LiDAR. In Proceedings of the 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), Pune, India, 17–18 August 2017; pp. 1–6. [Google Scholar] [CrossRef]
- Dong, H.; Weng, C.Y.; Guo, C.; Yu, H.; Chen, I.M. Real-Time Avoidance Strategy of Dynamic Obstacles via Half Model-Free Detection and Tracking With 2D Lidar for Mobile Robots. IEEE/ASME Trans. Mechatron. 2021, 26, 2215–2225. [Google Scholar] [CrossRef]
- Chen, C.W.; Hsieh, P.H.; Lai, W.H. Application of decision tree on collision avoidance system design and verification for quadcopter. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 71–75. [Google Scholar] [CrossRef]
- Hu, M.; Liao, Y.; Wang, W.; Li, G.; Cheng, B.; Chen, F. Decision tree-based maneuver prediction for driver rear-end risk-avoidance behaviors in cut-in scenarios. J. Adv. Transp. 2017, 2017, 7170358. [Google Scholar] [CrossRef]
- Kim, Y.N.; Ko, D.W.; Suh, I.H. Confidence random tree-based algorithm for mobile robot path planning considering the path length and safety. Int. J. Adv. Robot. Syst. 2019, 16, 1729881419838179. [Google Scholar] [CrossRef]
- Xiong, J.; Duan, X. Path planning for UAV based on improved dynamic step RRT algorithm. J. Phys. Conf. Ser. 2021, 1983, 012034. [Google Scholar] [CrossRef]
- Hoy, M.; Matveev, A.S.; Savkin, A.V. Algorithms for collision-free navigation of mobile robots in complex cluttered environments: A survey. Robotica 2015, 33, 463–497. [Google Scholar] [CrossRef]
- Noh, S.; Park, J.; Park, J. Autonomous mobile robot navigation in indoor environments: Mapping, localization, and planning. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 21–23 October 2020; pp. 908–913. [Google Scholar]
- Hennes, D.; Claes, D.; Meeussen, W.; Tuyls, K. Multi-robot collision avoidance with localization uncertainty. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 4–8 June 2012; Volume 1, pp. 147–154. [Google Scholar]
- Mourllion, B.; Lambert, A.; Gruyer, D.; Aubert, D. Collaborative perception for collision avoidance. In Proceedings of the IEEE International Conference on Networking, Sensing and Control, Taipei, Taiwan, 21–23 March 2004; pp. 880–885. [Google Scholar]
- Balakrishnan, K.; Narayanan, P.; Lakehal-ayat, M. Automatic Navigation Using Deep Reinforcement Learning. U.S. Patent 11,613,249, 28 March 2023. [Google Scholar]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef] [PubMed]
- Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489. [Google Scholar] [CrossRef] [PubMed]
- Weerakoon, K.; Sathyamoorthy, A.J.; Patel, U.; Manocha, D. Terp: Reliable planning in uneven outdoor environments using deep reinforcement learning. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 9447–9453. [Google Scholar]
- Xue, X.; Li, Z.; Zhang, D.; Yan, Y. A deep reinforcement learning method for mobile robot collision avoidance based on double dqn. In Proceedings of the 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, BC, Canada, 12–14 June 2019; pp. 2131–2136. [Google Scholar]
- Ruan, X.; Ren, D.; Zhu, X.; Huang, J. Mobile robot navigation based on deep reinforcement learning. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; pp. 6174–6178. [Google Scholar]
- Grando, R.B.; de Jesus, J.C.; Kich, V.A.; Kolling, A.H.; Bortoluzzi, N.P.; Pinheiro, P.M.; Neto, A.A.; Drews, P.L.J. Deep Reinforcement Learning for Mapless Navigation of a Hybrid Aerial Underwater Vehicle with Medium Transition. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 1088–1094. [Google Scholar] [CrossRef]
- Lee, M.F.R.; Yusuf, S.H. Mobile Robot Navigation Using Deep Reinforcement Learning. Processes 2022, 10, 2748. [Google Scholar] [CrossRef]
- Cimurs, R.; Lee, J.H.; Suh, I.H. Goal-oriented obstacle avoidance with deep reinforcement learning in continuous action space. Electronics 2020, 9, 411. [Google Scholar] [CrossRef]
- Choi, J.; Lee, G.; Lee, C. Reinforcement learning-based dynamic obstacle avoidance and integration of path planning. Intell. Serv. Robot. 2021, 14, 663–677. [Google Scholar] [CrossRef] [PubMed]
- Wang, H.C.; Huang, S.C.; Huang, P.J.; Wang, K.L.; Teng, Y.C.; Ko, Y.T.; Jeon, D.; Wu, I.C. Curriculum Reinforcement Learning From Avoiding Collisions to Navigating Among Movable Obstacles in Diverse Environments. IEEE Robot. Autom. Lett. 2023, 8, 2740–2747. [Google Scholar] [CrossRef]
- Fang, M.; Zhou, T.; Du, Y.; Han, L.; Zhang, Z. Curriculum-guided hindsight experience replay. Adv. Neural Inf. Process. Syst. 2019, 32, 12623–12634. [Google Scholar]
- Li, B.; Wu, Y. Path Planning for UAV Ground Target Tracking via Deep Reinforcement Learning. IEEE Access 2020, 8, 29064–29074. [Google Scholar] [CrossRef]
- Miranda, V.R.F.; Neto, A.A.; Freitas, G.M.; Mozelli, L.A. Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping. IEEE Trans. Ind. Electron. 2023, 1–8. [Google Scholar] [CrossRef]
- Tai, L.; Paolo, G.; Liu, M. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 31–36. [Google Scholar]
- Han, Y.; Zhan, I.H.; Zhao, W.; Pan, J.; Zhang, Z.; Wang, Y.; Liu, Y.J. Deep reinforcement learning for robot collision avoidance with self-state-attention and sensor fusion. IEEE Robot. Autom. Lett. 2022, 7, 6886–6893. [Google Scholar] [CrossRef]
- Xie, L.; Wang, S.; Rosa, S.; Markham, A.; Trigoni, N. Learning with training wheels: Speeding up training with a simple controller for deep reinforcement learning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 6276–6283. [Google Scholar]
- Vödisch, N.; Unal, O.; Li, K.; Van Gool, L.; Dai, D. End-to-end optimization of LiDAR beam configuration for 3D object detection and localization. IEEE Robot. Autom. Lett. 2022, 7, 2242–2249. [Google Scholar] [CrossRef]
- Zhang, W.; Liu, N.; Zhang, Y. Learn to navigate maplessly with varied LiDAR configurations: A support point-based approach. IEEE Robot. Autom. Lett. 2021, 6, 1918–1925. [Google Scholar] [CrossRef]
- Liu, L.; Dugas, D.; Cesari, G.; Siegwart, R.; Dubé, R. Robot Navigation in Crowded Environments Using Deep Reinforcement Learning. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 5671–5677. [Google Scholar] [CrossRef]
- Choi, J.; Park, K.; Kim, M.; Seok, S. Deep reinforcement learning of navigation in a complex and crowded environment with a limited field of view. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 5993–6000. [Google Scholar]
- Gu, D.; Hu, H. Receding horizon tracking control of wheeled mobile robots. IEEE Trans. Control. Syst. Technol. 2006, 14, 743–749. [Google Scholar]
- Leena, N.; Saju, K. Modelling and trajectory tracking of wheeled mobile robots. Procedia Technol. 2016, 24, 538–545. [Google Scholar] [CrossRef]
- Thai, N.H.; Ly, T.T.K.; Dzung, L. Trajectory tracking control for differential-drive mobile robot by a variable parameter PID controller. Int. J. Mech. Eng. Robot. Res. 2022, 11, 614–621. [Google Scholar] [CrossRef]
- Zhang, S.; Shan, J.; Liu, Y. Variational Bayesian estimator for mobile robot localization with unknown noise covariance. IEEE/ASME Trans. Mechatron. 2022, 27, 2185–2193. [Google Scholar] [CrossRef]
- Wang, S.; Gao, R.; Han, R.; Chen, S.; Li, C.; Hao, Q. Adaptive Environment Modeling Based Reinforcement Learning for Collision Avoidance in Complex Scenes. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 9011–9018. [Google Scholar]
- He, C.; Gong, J.; Yang, Y.; Bi, D.; Lan, J.; Qie, L. Real-time Track Obstacle Detection from 3D LIDAR Point Cloud. J. Phys. Conf. Ser. 2021, 1910, 012002. [Google Scholar] [CrossRef]
- Brunke, L.; Greeff, M.; Hall, A.W.; Yuan, Z.; Zhou, S.; Panerati, J.; Schoellig, A.P. Safe learning in robotics: From learning-based control to safe reinforcement learning. Annu. Rev. Control. Robot. Auton. Syst. 2022, 5, 411–444. [Google Scholar] [CrossRef]
- Kiran, B.R.; Sobh, I.; Talpaert, V.; Mannion, P.; Al Sallab, A.A.; Yogamani, S.; Pérez, P. Deep reinforcement learning for autonomous driving: A survey. IEEE Trans. Intell. Transp. Syst. 2021, 23, 4909–4926. [Google Scholar] [CrossRef]
- Fan, Z.; Su, R.; Zhang, W.; Yu, Y. Hybrid actor-critic reinforcement learning in parameterized action space. arXiv 2019, arXiv:1903.01344. [Google Scholar]
- Mochizuki, D.; Abiko, Y.; Saito, T.; Ikeda, D.; Mineno, H. Delay-tolerance-based mobile data offloading using deep reinforcement learning. Sensors 2019, 19, 1674. [Google Scholar] [CrossRef] [PubMed]
- Srinivas, A.; Sharma, S.; Ravindran, B. Dynamic frame skip deep q network. arXiv 2016, arXiv:1605.05365. [Google Scholar]
- Feng, S.; Sebastian, B.; Ben-Tzvi, P. A collision avoidance method based on deep reinforcement learning. Robotics 2021, 10, 73. [Google Scholar] [CrossRef]
| Reference | Sensor | FOV | Beam | RL Model | Comments |
|---|---|---|---|---|---|
| [33] | LiDAR | 180° | 10 | DDPG | A small number of beams over a wide FOV yields a low point-cloud density, limiting the level of detail. |
| [34] | LiDAR + Camera | 180° | - | PPO | Prior knowledge of the environment is required. |
| [35] | LiDAR | 270° | 512 | PID + CNN | The model performs well; however, this LiDAR configuration seems costly for such a task. |
| [36] | LiDAR | 13.34° | Learned | ε-Greedy Search | The optimization targets only the beam count, not the FOV. The method performs well for localization and object detection. |
| [37] | LiDAR | Varies | Varies | SAC | It remains unclear whether a 240° FOV is optimal for all environment types and scenarios. |
| [38] | LiDAR | 25° | 72 | A3C | A sensor with more beams complicates the mechanical design and is more costly. |
| [39] | Camera | 90° | 18 | LSTM-LMC | The camera data must be processed into point-cloud data. |
| Our Approach | LiDAR | Calculated from h and L | 20 | TD3-AC | The required FOV is calculated from the LiDAR sensor width h and the minimum distance L set between the obstacle and the robot. |
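The final row states that the required FOV follows from the sensor width h and the minimum obstacle distance L (cf. Figure 4). Below is a minimal sketch of one standard viewing-angle relation, 2·arctan(h/2L); this geometric form is an assumption and may differ from the paper’s exact expression.

```python
import math

def required_fov_deg(h, L):
    """Viewing angle (degrees) subtended by a width h at distance L,
    using the standard 2*arctan(h / (2L)) relation (assumed form)."""
    return math.degrees(2.0 * math.atan(h / (2.0 * L)))

# Example: a 0.67 m wide footprint seen from 1.0 m away subtends roughly 37 degrees.
print(required_fov_deg(0.67, 1.0))
```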
| Learning Rate | Discount Factor | Update Rate | Policy Noise | Batch Size |
|---|---|---|---|---|
| 0.0005 | 0.99999 | 0.005 | 2 | 200 |
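For reproducibility, these hyperparameters could be gathered into a single configuration object, as in the sketch below. The field names, and the reading of the update rate as a Polyak (soft target update) coefficient, are assumptions rather than details from the authors’ code.

```python
from dataclasses import dataclass

@dataclass
class TD3Config:
    # Values as listed in the hyperparameter table above.
    learning_rate: float = 0.0005
    discount_factor: float = 0.99999
    target_update_rate: float = 0.005  # assumed to be the soft-update coefficient tau
    policy_noise: float = 2
    batch_size: int = 200

config = TD3Config()
```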
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| FOV | | | |
| Number of beams | 10 | 20 | 30 |

N.B.: ϕ = 10°.
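To illustrate how such configurations can be emulated in simulation, the sketch below generates evenly spaced beam angles for a given FOV and downsamples a dense scan to the desired beam count. The helper functions are illustrative assumptions, not the paper’s implementation, and the FOV values of the three models are not reproduced here.

```python
import numpy as np

def beam_angles(fov_deg, n_beams):
    """Evenly spaced beam angles (degrees) spanning the sensor FOV."""
    return np.linspace(-fov_deg / 2.0, fov_deg / 2.0, n_beams)

def downsample_scan(ranges, n_beams):
    """Pick n_beams evenly spaced range readings from a dense LiDAR scan,
    e.g. to emulate a lower-resolution sensor in simulation."""
    ranges = np.asarray(ranges)
    idx = np.linspace(0, len(ranges) - 1, n_beams).astype(int)
    return ranges[idx]

# Example: emulate Model 2 (20 beams) from a dense 720-point scan.
dense = np.random.uniform(0.2, 10.0, size=720)
model2_obs = downsample_scan(dense, 20)
```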
| Scenario | Model | Distance (m) | Time (s) |
|---|---|---|---|
| 1 | 1 | 2.79 | 9.52 |
| 1 | 2 | 1.81 | 8.05 |
| 1 | 3 | 2.39 | 8.92 |
| 2 | 1 | Timeout | Timeout |
| 2 | 2 | 2.79 | 9.5 |
| 2 | 3 | 3.40 | 10.44 |
| 3 | 1 | Collision | Collision |
| 3 | 2 | 1.83 | 8.08 |
| 3 | 3 | 2.63 | 9.28 |