DOI: 10.5555/3545946.3599126

Counterfactual Explanations for Reinforcement Learning Agents

Published: 30 May 2023

Abstract

Reinforcement learning (RL) algorithms often use neural networks to represent an agent's policy, making the agent's behaviour difficult to interpret. Counterfactual explanations are human-friendly explanations that offer users actionable advice on how to change their features in order to obtain a desired output from a black-box model. However, existing methods for generating counterfactuals in RL ignore the stochastic and sequential nature of RL tasks and can produce counterfactuals that are difficult to obtain, which affects user effort and trust. My dissertation focuses on developing methods that take into account the complexities of the RL framework and provide counterfactual explanations that are easy to reach and confidently produce the desired output.
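
To make the setting concrete, the following is a minimal, illustrative sketch (not the method developed in this dissertation) of the kind of single-state counterfactual search that existing approaches adapt from supervised learning, in the spirit of Wachter et al.: perturb the current state until the policy's greedy action flips to a desired target action, and prefer the closest such state. The policy interface, perturbation radius, and distance measure below are assumptions made purely for illustration.

    import torch

    def counterfactual_state(policy_net, state, target_action,
                             n_samples=5000, radius=0.5, seed=0):
        # Naive random-search baseline: sample perturbed states within an
        # L-infinity ball around the original state and keep the closest one
        # (L1 distance) whose greedy action under the policy becomes
        # `target_action`. `policy_net` is assumed to map a batch of states
        # to action logits or Q-values.
        torch.manual_seed(seed)
        state = torch.as_tensor(state, dtype=torch.float32)
        best, best_dist = None, float("inf")
        for _ in range(n_samples):
            candidate = state + torch.empty_like(state).uniform_(-radius, radius)
            with torch.no_grad():
                action = policy_net(candidate.unsqueeze(0)).argmax(dim=-1).item()
            if action == target_action:
                dist = (candidate - state).abs().sum().item()
                if dist < best_dist:
                    best, best_dist = candidate, dist
        return best, best_dist  # best is None if no counterfactual was found

A search of this kind treats each state in isolation: it says nothing about whether the perturbed state is reachable under the environment dynamics, how much effort reaching it would cost the user, or how reliably it produces the desired action under stochastic transitions. Addressing exactly these gaps is the focus of the proposed work.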

Index Terms

  1. Counterfactual Explanations for Reinforcement Learning Agents

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information

      Published In

      AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
      May 2023
      3131 pages
      ISBN: 9781450394321
      • General Chairs: Noa Agmon, Bo An
      • Program Chairs: Alessandro Ricci, William Yeoh

      Publisher

      International Foundation for Autonomous Agents and Multiagent Systems

      Richland, SC

      Author Tags

      1. causality
      2. contrastive explanations
      3. counterfactual explanations
      4. explainability
      5. reinforcement learning

      Qualifiers

      • Research-article

      Funding Sources

      • Science Foundation Ireland

      Conference

      AAMAS '23

      Acceptance Rates

      Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%

