Article

Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems

Authors:

Radu Calinescu,

Colin Paterson,

Daniel Kudenko,

Alec Banks

Agents and Artificial Intelligence: 13th International Conference, ICAART 2021, Virtual Event, February 4–6, 2021, Revised Selected Papers

Pages 158 - 180

https://doi.org/10.1007/978-3-031-10161-8_8

Published: 04 February 2021

Abstract

Using multi-agent reinforcement learning to find solutions to complex decision-making problems in shared environments has become standard practice in many scenarios. However, this is not the case in safety-critical scenarios, where the reinforcement learning process, which uses stochastic mechanisms, could lead to highly unsafe outcomes. We proposed a novel, safe multi-agent reinforcement learning approach named Assured Multi-Agent Reinforcement Learning (AMARL) to address this issue. Unlike other safe multi-agent reinforcement learning approaches, AMARL uses quantitative verification, a model checking technique that guarantees agent compliance with safety, performance, and other non-functional requirements, both during and after the learning process. We have previously evaluated AMARL in patrolling domains with various multi-agent reinforcement learning algorithms for both homogeneous and heterogeneous systems. In this work, we extend AMARL through the use of deep multi-agent reinforcement learning. This approach is particularly appropriate for systems in which rewards are sparse, and hence extends the applicability of AMARL. We evaluate our approach in a new search and collection domain, where it shows promising results in safety and performance compared to algorithms that do not use AMARL.



Published In

Agents and Artificial Intelligence: 13th International Conference, ICAART 2021, Virtual Event, February 4–6, 2021, Revised Selected Papers

Feb 2021

352 pages

ISBN:978-3-031-10160-1

DOI:10.1007/978-3-031-10161-8

Editors:

Ana Paula Rocha (LIACC, University of Porto, Porto, Portugal),

Luc Steels (ICREA, Institute of Evolutionary Biology, Barcelona, Barcelona, Spain),

Jaap van den Herik (Leiden University, Leiden, The Netherlands)

© Springer Nature Switzerland AG 2022.

Publisher

Springer-Verlag

Berlin, Heidelberg
