DOI: 10.1145/3512290.3528767
research-article

Pittsburgh learning classifier systems for explainable reinforcement learning: comparing with XCS

Published: 08 July 2022 Publication History

Abstract

Interest in reinforcement learning (RL) has recently surged due to the application of deep learning techniques, but these connectionist approaches are opaque compared with symbolic systems. Learning Classifier Systems (LCSs) are evolutionary machine learning systems that can be categorised as eXplainable AI (XAI) due to their rule-based nature. Michigan LCSs are commonly used in RL domains because the alternative Pittsburgh systems (e.g. SAMUEL) suffer from complex algorithmic design and high computational requirements; however, Pittsburgh systems can produce more compact and interpretable solutions than Michigan systems. We aim to develop two novel Pittsburgh LCSs to address RL domains: PPL-DL and PPL-ST. The former acts as a "zeroth-level" system, and the latter revisits SAMUEL's core Monte Carlo learning mechanism for estimating rule strength. We compare our two Pittsburgh systems to the Michigan system XCS across deterministic and stochastic FrozenLake environments. Results show that PPL-ST performs on par with or better than PPL-DL and outperforms XCS in the presence of high levels of environmental uncertainty. Rulesets evolved by PPL-ST can achieve higher performance than those evolved by XCS while being more parsimonious and therefore more interpretable, albeit at higher computational cost. This indicates that PPL-ST is an LCS well-suited to producing explainable policies in RL domains.
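As a rough illustration of what episodic Monte Carlo estimation of rule strength can look like, the sketch below runs a small hand-written ruleset on a 4x4 FrozenLake-style grid and averages the episode return into the strength of every rule that fired. It is not the PPL-ST algorithm from the paper. The rule encoding (row and column intervals), the function names, the first-match rule selection, and the slip model (a random action replaces the chosen one with probability `slip_prob`) are all assumptions made for this example.

```python
import random

# Standard 4x4 FrozenLake layout: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
MOVES = {"L": (0, -1), "D": (1, 0), "R": (0, 1), "U": (-1, 0)}


def step(pos, action, slip_prob, rng):
    """Apply one move. Assumed slip model: with probability slip_prob a
    uniformly random action replaces the chosen one."""
    if rng.random() < slip_prob:
        action = rng.choice(list(MOVES))
    dr, dc = MOVES[action]
    r = min(max(pos[0] + dr, 0), 3)  # clamp to the grid
    c = min(max(pos[1] + dc, 0), 3)
    cell = GRID[r][c]
    reward = 1.0 if cell == "G" else 0.0
    return (r, c), reward, cell in "HG"  # holes and the goal end the episode


def make_rule(rows, cols, action):
    """Hypothetical rule: inclusive row/column intervals plus an action."""
    return {"rows": rows, "cols": cols, "action": action, "strength": 0.0, "n": 0}


def matches(rule, pos):
    (r_lo, r_hi), (c_lo, c_hi) = rule["rows"], rule["cols"]
    return r_lo <= pos[0] <= r_hi and c_lo <= pos[1] <= c_hi


def run_episode(ruleset, slip_prob, rng, max_steps=100):
    """Run one episode, then move each fired rule's strength toward the
    episode return using an incremental mean (a Monte Carlo estimate)."""
    pos, fired, ret = (0, 0), [], 0.0
    for _ in range(max_steps):
        # First matching rule decides the action (an assumption of this sketch).
        rule = next((r for r in ruleset if matches(r, pos)), None)
        if rule is None:
            break  # no rule covers this state; the episode ends with no reward
        if not any(rule is f for f in fired):
            fired.append(rule)
        pos, reward, done = step(pos, rule["action"], slip_prob, rng)
        ret += reward
        if done:
            break
    for rule in fired:
        rule["n"] += 1
        rule["strength"] += (ret - rule["strength"]) / rule["n"]
    return ret


def demo_ruleset():
    """Four rules that reach the goal when the environment is deterministic."""
    return [
        make_rule((0, 1), (0, 0), "D"),
        make_rule((2, 2), (0, 0), "R"),
        make_rule((2, 2), (1, 1), "D"),
        make_rule((3, 3), (1, 2), "R"),
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    rules = demo_ruleset()
    for _ in range(200):
        run_episode(rules, slip_prob=0.3, rng=rng)
    for rule in rules:
        print(rule["rows"], rule["cols"], rule["action"], round(rule["strength"], 3))
```

With `slip_prob=0.0` every rule in the demo ruleset fires in each episode and its strength settles at the goal reward. With a non-zero slip probability the strengths fall below that value, reflecting how often an episode that used the rule ended in a hole or in a state no rule covers.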



Cited By

  • (2024) A Survey on Learning Classifier Systems from 2022 to 2024. Proceedings of the Genetic and Evolutionary Computation Conference Companion, 1797-1806. DOI: 10.1145/3638530.3664165. Online publication date: 14-Jul-2024
  • (2023) Modern Applications of Evolutionary Rule-based Machine Learning. Proceedings of the Companion Conference on Genetic and Evolutionary Computation, 1301-1330. DOI: 10.1145/3583133.3595047. Online publication date: 15-Jul-2023

        Published In

        GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference
        July 2022
        1472 pages
        ISBN:9781450392372
        DOI:10.1145/3512290
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. SAMUEL
        2. XAI
        3. XCS
        4. learning classifier systems
        5. reinforcement learning

        Qualifiers

        • Research-article

        Conference

        GECCO '22
        Acceptance Rates

        Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
