research-article

Pittsburgh learning classifier systems for explainable reinforcement learning: comparing with XCS

Authors:

Jordan T. Bishop,

Marcus Gallagher,

Will N. Browne

GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference

Pages 323 - 331

https://doi.org/10.1145/3512290.3528767

Published: 08 July 2022

Abstract

Interest in reinforcement learning (RL) has recently surged due to the application of deep learning techniques, but these connectionist approaches are opaque compared with symbolic systems. Learning Classifier Systems (LCSs) are evolutionary machine learning systems that can be categorised as eXplainable AI (XAI) due to their rule-based nature. Michigan LCSs are commonly used in RL domains because the alternative, Pittsburgh systems (e.g. SAMUEL), suffer from complex algorithmic design and high computational requirements; however, Pittsburgh systems can produce more compact and interpretable solutions than Michigan systems. We aim to develop two novel Pittsburgh LCSs to address RL domains: PPL-DL and PPL-ST. The former acts as a "zeroth-level" system, and the latter revisits SAMUEL's core Monte Carlo learning mechanism for estimating rule strength. We compare our two Pittsburgh systems to the Michigan system XCS across deterministic and stochastic FrozenLake environments. Results show that PPL-ST performs on par with or better than PPL-DL and outperforms XCS in the presence of high levels of environmental uncertainty. Rulesets evolved by PPL-ST can achieve higher performance than those evolved by XCS while being more parsimonious and therefore more interpretable, albeit at higher computational cost. This indicates that PPL-ST is an LCS well suited to producing explainable policies in RL domains.

Cited By

Siddique A, Heider M, Iqbal M, Shiraishi H, Li X, Handl J (2024). A Survey on Learning Classifier Systems from 2022 to 2024. Proceedings of the Genetic and Evolutionary Computation Conference Companion, 1797-1806. Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1145/3638530.3664165
Siddique A, Browne W, Urbanowicz R, Silva S, Paquete L (2023). Modern Applications of Evolutionary Rule-based Machine Learning. Proceedings of the Companion Conference on Genetic and Evolutionary Computation, 1301-1330. Online publication date: 15-Jul-2023.
https://dl.acm.org/doi/10.1145/3583133.3595047

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Reinforcement learning
    2. Machine learning approaches
      1. Bio-inspired approaches
        Genetic algorithms
      2. Rule learning

Published In

GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference

July 2022

1472 pages

ISBN: 9781450392372

DOI: 10.1145/3512290

Editor: Jonathan E. Fieldsend, University of Exeter

General Chair: Markus Wagner, The University of Adelaide

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Data Availability

Supplemental material. https://dl.acm.org/doi/10.1145/3512290.3528767#p323-bishop-suppl.pdf

Conference

GECCO '22

Sponsor:

SIGEVO

GECCO '22: Genetic and Evolutionary Computation Conference

July 9 - 13, 2022

Boston, Massachusetts

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Article Metrics

Total Citations: 2
Total Downloads: 164
Downloads (Last 12 months): 29
Downloads (Last 6 weeks): 1

Reflects downloads up to 16 Feb 2025
