Article
DOI: 10.1007/978-3-031-10161-8_8

Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems

Published: 04 February 2021

Abstract

Using multi-agent reinforcement learning to solve complex decision-making problems in shared environments has become standard practice in many scenarios. This is not the case in safety-critical scenarios, however, where the stochastic mechanisms used by the reinforcement learning process could lead to highly unsafe outcomes. To address this issue, we proposed a novel safe multi-agent reinforcement learning approach named Assured Multi-Agent Reinforcement Learning (AMARL). Unlike other safe multi-agent reinforcement learning approaches, AMARL uses quantitative verification, a model checking technique that guarantees agent compliance with safety, performance, and non-functional requirements, both during and after the learning process. We have previously evaluated AMARL in patrolling domains with various multi-agent reinforcement learning algorithms, for both homogeneous and heterogeneous systems. In this work, we extend AMARL through the use of deep multi-agent reinforcement learning. This approach is particularly appropriate for systems in which rewards are sparse, and hence extends the applicability of AMARL. We evaluate our approach in a new search and collection domain, where it shows promising safety and performance results compared with algorithms that do not use AMARL.

Published In

Agents and Artificial Intelligence: 13th International Conference, ICAART 2021, Virtual Event, February 4–6, 2021, Revised Selected Papers
Feb 2021
352 pages
ISBN: 978-3-031-10160-1
DOI: 10.1007/978-3-031-10161-8

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Reinforcement Learning
2. Multi-Agent Systems
3. Quantitative verification
4. Assurance
5. Multi-Agent Reinforcement Learning
6. Safety-critical scenarios
7. Safe Multi-Agent Reinforcement Learning
8. Assured Multi-Agent Reinforcement Learning
9. Deep Reinforcement Learning
