Imitation guided learning in learning classifier systems

Marc Métivier¹ & Claude Lattaud¹

Abstract

In this paper, we study how an imitation process can be developed to improve learning in the framework of learning classifier systems. We present three approaches to taking an observed behavior into account through a guidance interaction: two that use a model of this behavior, and one that does not. These approaches are evaluated and compared in different environments when applied to three major classifier systems: ZCS, XCS and ACS. The results are analyzed and discussed. They highlight the importance of using a model of the observed behavior to enable efficient imitation. Moreover, they show the advantages of taking this model into account through a specialized internal action. Finally, they provide new comparative results for ZCS, XCS and ACS.

Notes

Butz uses the following parameters: γ = 0.95, b_r = 0.05, b_q = 0.05, θ_r = 0.9 and θ_i = 0.1.
A system that applies the policy of M1 around obstacles and food but performs a different action in the empty situation would achieve the following performance: 3.70 with the E or W action, 3.35 with SE or NW, and 3.80 with SW or NE. The N and S actions may lead to cycling behavior.
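
For orientation, the parameter values in the first note can be gathered into a small configuration object. The sketch below is illustrative only and is not taken from the article. It assumes the usual ACS reading of the symbols: γ as the discount factor, b_r and b_q as the learning rates for a classifier's reward prediction and quality, and θ_r and θ_i as the quality thresholds above which a classifier counts as reliable and below which it counts as inadequate. The names `ACSParams`, `update_quality` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ACSParams:
    """Parameter values quoted in the first note (field meanings are assumed)."""
    gamma: float = 0.95    # discount factor (assumed meaning)
    b_r: float = 0.05      # learning rate for reward prediction (assumed meaning)
    b_q: float = 0.05      # learning rate for quality (assumed meaning)
    theta_r: float = 0.9   # reliability threshold on quality (assumed meaning)
    theta_i: float = 0.1   # inadequacy threshold on quality (assumed meaning)


def update_quality(q: float, anticipation_correct: bool, p: ACSParams) -> float:
    """Move quality q toward 1 after a correct anticipation, toward 0 otherwise."""
    target = 1.0 if anticipation_correct else 0.0
    return q + p.b_q * (target - q)


def classify(q: float, p: ACSParams) -> str:
    """Label a classifier by comparing its quality with the two thresholds."""
    if q > p.theta_r:
        return "reliable"
    if q < p.theta_i:
        return "inadequate"
    return "undecided"


if __name__ == "__main__":
    params = ACSParams()
    q = 0.5
    for _ in range(3):
        q = update_quality(q, True, params)
    print(round(q, 4), classify(q, params))
```

With learning rates as small as 0.05, quality moves slowly, so under these assumptions a classifier needs many consistent anticipations before it crosses either threshold.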

References

Atkeson CG, Schaal S (1997) Robot learning from demonstration. In: Proceedings of the 14th international conference on machine learning. Morgan Kaufmann, pp 12–20
Bakker P, Kuniyoshi Y (1996) Robot see, robot do: an overview of robot imitation. In: AISB’96 workshop on learning in robots and animals. Brighton, UK, pp 3–11
Bull L, Hurst J (2002) ZCS redux. Evol Comput 10(2):185–205
Butz M, Goldberg DE, Stolzmann W (1999) New challenges for an ACS: hard problems and possible solutions. Technical report 99019, University of Illinois at Urbana-Champaign, Urbana, IL, October 1999
Butz M, Goldberg DE, Stolzmann W (2000) The anticipatory classifier system and genetic generalization. Technical report 2000032, Illinois Genetic Algorithms Laboratory, 2000
Butz MV, Wilson SW (2002) An algorithmic description of XCS. Soft Comput 6:144–153
Cliff D, Ross S (1994) Adding temporary memory to ZCS. Adapt Behav 3(2):101–150
Damer BF (1998) Avatars, exploring and building Virtual Worlds on the Internet. Addison-Wesley Longman/Peachpit Press, USA
Dorigo M, Colombetti M (1994) Robot shaping: developing autonomous agents through learning. Artif Intell 71:321–370
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, Mass
Heudin J-C (1998) Virtual Worlds. In: Heudin JC (ed) Virtual Worlds: synthetic universes, digital life and complexity. Perseus Books, pp 1–28
Hoffmann J (1993) Vorhersage und Erkenntnis [Anticipation and Cognition]. Hogrefe
Holland JH (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Mitchell, Michalski, Carbonell (eds) Machine learning, an artificial intelligence approach, vol II, chapter 20. Morgan Kaufmann, pp 593–623
Kovacs T (1996) Evolving optimal populations with XCS classifier systems. Master’s thesis, School of Computer Science, University of Birmingham, Birmingham, UK, 1996. Also technical report CSR-96-17 and CSRP-96-17
Kovacs T, Kerber M (2001) What makes a problem hard for XCS? In: Lanzi PL, Stolzmann W, Wilson SW (eds) Advances in learning classifier systems, volume 1996 of LNAI. Springer–Verlag, Berlin, pp 80–99
Lanzi PL (1999) An analysis of generalization in the XCS classifier system. Evol Comput 7(2):125–149
Lanzi PL (2002) Learning classifier systems from a reinforcement learning perspective. Soft Comput 6(3):162–170
Lin L-J (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach Learn J 8:293–321
Mataric MJ (1998) Using communication to reduce locality in distributed multiagent learning. J Exp Theor Artif Intell 10(3):357–369
Mitchell RW (1989) A comparative-developmental approach to understanding imitation. Perspect Ethol 7:183–215
Ng AY, Russell S (2000) Algorithms for inverse reinforcement learning. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 663–670
Price B (2003) Accelerating reinforcement learning with imitation. PhD thesis, University of British Columbia
Price B, Boutilier C (1999) Implicit imitation in multiagent reinforcement learning. In: Proceedings of the 16th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 325–334
Sammut C, Hurst S, Kedzier D, Michie D (1992) Learning to fly. In: Proceedings of the ninth international conference on machine learning, July 1992. Morgan Kaufmann, Aberdeen
Stolzmann W (1998) Anticipatory classifier systems. In: Proceedings of the third annual genetic programming conference. Morgan Kaufmann, pp 658–664
Šuc D, Bratko I (1997) Skill reconstruction as induction of LQ controllers with subgoals. In: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), August 23–29 1997. Morgan Kaufmann Publishers, San Francisco, pp 914–919
Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. The MIT Press, Cambridge
Urbančič T, Bratko I (1994) Reconstructing human skill with machine learning. In: Cohn AG (ed) Proceedings of the eleventh European conference on artificial intelligence, August 8–12 1994. John Wiley and Sons, Chichester, pp 498–502
Utgoff PE, Clouse JA (1991) Two kinds of training information for evaluation function learning. In: Dean, McKeown (eds) Proceedings of the 9th national conference on artificial intelligence. MIT Press, July, pp 596–600
Watkins CJCH, Dayan P (1992) Q-learning. Mach Learn 8(3):279–292
Whitehead SD (1991) A complexity analysis of cooperative mechanisms in reinforcement learning. In: Dean, McKeown (eds) Proceedings of the 9th national conference on artificial intelligence. MIT Press, pp 607–613
Wilson SW (1994) ZCS: a zeroth level classifier system. Evol Comput 2(1):1–18
Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175

Author information

Authors and Affiliations

LIAP5—UFR de mathématiques et d’informatique, Centre Universitaire des Saints-Pères, Université René Descartes (Paris 5), 45, rue des Saints-Pères, Paris, 75006, France
Marc Métivier & Claude Lattaud

Corresponding author

Correspondence to Marc Métivier.

About this article

Cite this article

Métivier, M., Lattaud, C. Imitation guided learning in learning classifier systems. Nat Comput 8, 29–56 (2009). https://doi.org/10.1007/s11047-007-9054-8

Received: 19 October 2006
Accepted: 18 July 2007
Published: 15 August 2007
Issue Date: March 2009
DOI: https://doi.org/10.1007/s11047-007-9054-8
