Abstract
Vascular robotic systems, which have gained popularity in clinical practice, provide a platform for potentially semi-automated surgery. Reinforcement learning (RL) is an appealing skill-learning method for facilitating automatic instrument delivery. However, the notorious sample inefficiency of RL has limited its application in this domain. To address this issue, this paper proposes a novel RL framework, Distributed Reinforcement learning with Adaptive Conservatism (DRAC), that learns manipulation skills with a modest amount of interaction. DRAC pretrains skills from rule-based interactions before online fine-tuning, exploiting prior knowledge to improve sample efficiency. Moreover, DRAC uses adaptive conservatism to explore safely during online fine-tuning and a distributed structure to shorten training time. Experiments in a pre-clinical environment demonstrate that DRAC delivers a guidewire to the target with less dangerous exploration and better performance than prior methods (success rate of 96.00% and mean backward steps of 9.54) within 20k interactions. These results indicate that the proposed algorithm is a promising approach to learning skills for vascular robotic systems.
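The offline-pretrain / online-fine-tune scheme with adaptive conservatism described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the names (`adaptive_alpha`, `conservative_loss`), the linear annealing schedule, and the CQL-style penalty on policy actions are all illustrative assumptions.

```python
import numpy as np

def adaptive_alpha(step, total_steps, alpha_start=1.0, alpha_end=0.1):
    """Anneal the conservatism weight: strong early in online fine-tuning
    (close to the offline-pretrained behavior), weaker as trusted online
    experience accumulates."""
    frac = min(step / total_steps, 1.0)
    return alpha_start + frac * (alpha_end - alpha_start)

def conservative_loss(q_pred, q_target, q_policy_actions, alpha):
    """TD error plus an alpha-weighted penalty that pushes down Q-values of
    actions proposed by the current policy relative to dataset actions,
    discouraging dangerous out-of-distribution exploration."""
    td_error = np.mean((q_pred - q_target) ** 2)
    penalty = np.mean(q_policy_actions) - np.mean(q_pred)
    return td_error + alpha * penalty

# Example: the conservatism weight decays from 1.0 toward 0.1 over the
# 20k online interactions reported in the abstract.
alphas = [adaptive_alpha(s, 20_000) for s in (0, 10_000, 20_000)]
print(alphas)
```

As the weight shrinks, the objective smoothly shifts from conservative, dataset-anchored updates toward standard TD learning, which is one common way to trade safety for exploration during fine-tuning.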
This work was supported in part by the National Natural Science Foundation of China under Grant 62003343, Grant 62222316, Grant U1913601, Grant 62073325, Grant U20A20224, and Grant U1913210; in part by the Beijing Natural Science Foundation under Grant M22008; in part by the Youth Innovation Promotion Association of Chinese Academy of Sciences (CAS) under Grant 2020140; in part by the CIE-Tencent Robotics X Rhino-Bird Focused Research Program.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Li, H. et al. (2024). Effective Skill Learning on Vascular Robotic Systems: Combining Offline and Online Reinforcement Learning. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1969. Springer, Singapore. https://doi.org/10.1007/978-981-99-8184-7_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8183-0
Online ISBN: 978-981-99-8184-7
eBook Packages: Computer Science (R0)