Deep Reinforcement Learning for Parameter Tuning of Robot Visual Servoing

Published: 21 February 2023

Abstract

Robot visual servoing controls the motion of a robot through real-time visual observations. Kinematics is a key approach to achieving visual servoing. One key challenge of kinematics-based visual servoing is that its parameters must be reconfigured over time throughout a single task. Parameters must also be retuned when the approach is applied to a different task. Existing work on parameter tuning either lacks adaptation or cannot automate the tuning of all parameters, and existing methods transfer poorly from one task to another. This work develops a Deep Reinforcement Learning (DRL) framework for robot visual servoing that automates the tuning of all parameters, both within one task and across tasks. In visual servoing, forward kinematics focuses on motion speed, while inverse kinematics focuses on the smoothness of motion. We therefore develop two separate modules in the proposed DRL framework: one tunes the time-varying forward kinematics parameters to accelerate the motion, and the other tunes the inverse kinematics parameters to ensure smoothness. Moreover, we customize a knowledge transfer method to generalize the proposed DRL models to various robot tasks without reconstructing the neural network. We verify the proposed method on simulated robot tasks. The experimental results show that the proposed method outperforms state-of-the-art methods and manual parameter configuration in movement speed and smoothness, both within one task and across tasks.
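
As a rough illustration of why a time-varying parameter can help within a single task, the sketch below compares a fixed gain with a gain that changes as the error shrinks. It is not the method of this article: the proportional law, the integrator error model, the speed limit, and the names `servo_step`, `fixed_gain`, `scheduled_gain`, and `steps_to_converge` are all assumptions made for this example, and the hand-written schedule merely stands in for whatever a learned DRL policy would output.

```python
import numpy as np


def servo_step(error, gain, dt=0.05, v_max=0.5):
    """Apply one step of the proportional law v = -gain * error.

    Assumes a toy integrator model (error changes by dt * v) and a
    speed limit v_max; neither comes from the article.
    """
    v = -gain * error
    speed = np.linalg.norm(v)
    if speed > v_max:
        v = v * (v_max / speed)  # saturate the commanded speed
    return error + dt * v


def fixed_gain(error, k=0.5):
    """Manual configuration: one constant gain for the whole task."""
    return k


def scheduled_gain(error, k_far=0.5, k_near=4.0, scale=0.2):
    """Hypothetical hand-written schedule standing in for a learned policy.

    The gain equals k_near at zero error and decays toward k_far as the
    error norm grows.
    """
    x = np.linalg.norm(error)
    return k_far + (k_near - k_far) * np.exp(-x / scale)


def steps_to_converge(gain_fn, error0, tol=1e-3, max_steps=10_000):
    """Count servo steps until the error norm drops below tol."""
    error = np.asarray(error0, dtype=float)
    for step in range(max_steps):
        if np.linalg.norm(error) < tol:
            return step
        error = servo_step(error, gain_fn(error))
    return max_steps


e0 = np.array([1.0, -0.5])  # arbitrary initial feature error
print("fixed gain steps:", steps_to_converge(fixed_gain, e0))
print("scheduled gain steps:", steps_to_converge(scheduled_gain, e0))
```

In this toy setting the scheduled gain never falls below the fixed one, so it reaches the tolerance in fewer steps. A learned tuner would replace `scheduled_gain` with a policy that maps the observed state to parameter values.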

    Information & Contributors

    Information

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 2
    April 2023
    430 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/3582879
    • Editor: Huan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 February 2023
    Online AM: 12 January 2023
    Accepted: 11 December 2022
    Revised: 25 October 2022
    Received: 04 April 2022
    Published in TIST Volume 14, Issue 2

    Author Tags

    1. Robot visual servoing
    2. kinematics
    3. parameter tuning
    4. Deep Reinforcement Learning
    5. knowledge transfer

    Qualifiers

    • Research-article

    Funding Sources

    • Science and Technology Innovation Committee Foundation of Shenzhen
    • Hong Kong Research Grant Council under RIF

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 549
    • Downloads (last 6 weeks): 25
    Reflects downloads up to 03 Jan 2025

    Cited By

    • (2024) Strengthening Cooperative Consensus in Multi-Robot Confrontation. ACM Transactions on Intelligent Systems and Technology 15:2 (1-27). DOI: 10.1145/3639371. Online publication date: 22-Feb-2024
    • (2024) Time-Varying Weights in Multi-Reward Architecture for Deep Reinforcement Learning. IEEE Transactions on Emerging Topics in Computational Intelligence 8:2 (1865-1881). DOI: 10.1109/TETCI.2024.3359039. Online publication date: Apr-2024
    • (2023) Dynamic Weights and Prior Reward in Policy Fusion for Compound Agent Learning. ACM Transactions on Intelligent Systems and Technology 14:6 (1-28). DOI: 10.1145/3623405. Online publication date: 14-Nov-2023
    • (no date) Robust Recommender Systems with Rating Flip Noise. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3641285
