Imitation guided learning in learning classifier systems

Natural Computing

Abstract

In this paper, we study how to develop an imitation process that improves learning in the framework of learning classifier systems. We present three approaches to taking an observed behavior into account through a guidance interaction: two that use a model of this behavior and one that uses no model. These approaches are evaluated and compared in different environments when applied to three major classifier systems: ZCS, XCS and ACS. The results are analyzed and discussed. They highlight the importance of using a model of the observed behavior to enable efficient imitation. Moreover, they show the advantages of taking this model into account through a specialized internal action. Finally, they provide new comparison results between ZCS, XCS and ACS.

Notes

  1. Butz uses the following parameters: γ = 0.95, b_r = 0.05, b_q = 0.05, θ_r = 0.9 and θ_i = 0.1.

  2. A system applying the policy of M1 around the obstacles and food but performing another action in the empty situation would have the following performance: 3.70 with E or W actions, 3.35 with SE or NW, and 3.80 with SW or NE. N and S actions may lead to cycling behaviors.

References

  • Atkeson CG, Schaal S (1997) Robot learning from demonstration. In: Proceedings of the 14th international conference on machine learning. Morgan Kaufmann, pp 12–20

  • Bakker P, Kuniyoshi Y (1996) Robot see, robot do: an overview of robot imitation. In: AISB’96 workshop on learning in robots and animals. Brighton, UK, pp 3–11

  • Bull L, Hurst J (2002) ZCS redux. Evol Comput 10(2):185–205

  • Butz M, Goldberg DE, Stolzmann W (1999) New challenges for an ACS: hard problems and possible solutions. Technical report 99019, University of Illinois at Urbana-Champaign, Urbana, IL, October 1999

  • Butz M, Goldberg DE, Stolzmann W (2000) The anticipatory classifier system and genetic generalization. Technical report 2000032, Illinois Genetic Algorithms Laboratory, 2000

  • Butz MV, Wilson SW (2002) An algorithmic description of XCS. Soft Comput 6:144–153

  • Cliff D, Ross S (1994) Adding temporary memory to ZCS. Adapt Behav 3(2):101–150

  • Damer BF (1998) Avatars, exploring and building Virtual Worlds on the Internet. Addison-Wesley Longman/Peachpit Press, USA

  • Dorigo M, Colombetti M (1994) Robot shaping: developing autonomous agents through learning. Artif Intell 71:321–370

  • Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, Mass

  • Heudin J-C (1998) Virtual Worlds. In: Heudin JC (ed) Virtual Worlds: synthetic universes, digital life and complexity. Perseus Books, pp 1–28

  • Hoffmann J (1993) Vorhersage und Erkenntnis [Anticipation and Cognition]. Hogrefe

  • Holland JH (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Mitchell, Michalski, Carbonell (eds) Machine learning, an artificial intelligence approach, vol II, chapter 20. Morgan Kaufmann, pp 593–623

  • Kovacs T (1996) Evolving optimal populations with XCS classifier systems. Master’s thesis, School of Computer Science, University of Birmingham, Birmingham, UK, 1996. Also technical report CSR-96-17 and CSRP-96-17

  • Kovacs T, Kerber M (2001) What makes a problem hard for XCS? In: Lanzi PL, Stolzmann W, Wilson SW (eds) Advances in learning classifier systems, volume 1996 of LNAI. Springer–Verlag, Berlin, pp 80–99

  • Lanzi PL (1999) An analysis of generalization in the XCS classifier system. Evol Comput 7(2):125–149

  • Lanzi PL (2002) Learning classifier systems from a reinforcement learning perspective. Soft Comput 6(3):162–170

  • Lin L-J (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach Learn 8:293–321

  • Mataric MJ (1998) Using communication to reduce locality in distributed multiagent learning. J Exp Theor Artif Intell 10(3):357–369

  • Mitchell RW (1989) A comparative-developmental approach to understanding imitation. Perspect Ethol 7:183–215

  • Ng AY, Russell S (2000) Algorithms for inverse reinforcement learning. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 663–670

  • Price B (2003) Accelerating reinforcement learning with imitation. PhD thesis, University of British Columbia

  • Price B, Boutilier C (1999) Implicit imitation in multiagent reinforcement learning. In: Proceedings of the 16th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 325–334

  • Sammut C, Hurst S, Kedzier D, Michie D (1992) Learning to fly. In: Proceedings of the ninth international conference on machine learning, July 1992. Morgan Kaufmann, Aberdeen

  • Stolzmann W (1998) Anticipatory classifier systems. In: Proceedings of the third annual genetic programming conference. Morgan Kaufmann, pp 658–664

  • Šuc D, Bratko I (1997) Skill reconstruction as induction of LQ controllers with subgoals. In: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), August 23–29 1997. Morgan Kaufmann Publishers, San Francisco, pp 914–919

  • Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. The MIT Press, Cambridge

  • Urbančič T, Bratko I (1994) Reconstructing human skill with machine learning. In: Cohn AG (ed) Proceedings of the eleventh European conference on artificial intelligence, August 8–12 1994. John Wiley and Sons, Chichester, pp 498–502

  • Utgoff PE, Clouse JA (1991) Two kinds of training information for evaluation function learning. In: Dean, McKeown (eds) Proceedings of the 9th national conference on artificial intelligence. MIT Press, July, pp 596–600

  • Watkins CJCH, Dayan P (1992) Q-learning. Mach Learn 8(3):279–292

  • Whitehead SD (1991) A complexity analysis of cooperative mechanisms in reinforcement learning. In: Dean, McKeown (eds) Proceedings of the 9th national conference on artificial intelligence. MIT Press, pp 607–613

  • Wilson SW (1994) ZCS: a zeroth level classifier system. Evol Comput 2(1):1–18

  • Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175

Author information

Corresponding author

Correspondence to Marc Métivier.

About this article

Cite this article

Métivier, M., Lattaud, C. Imitation guided learning in learning classifier systems. Nat Comput 8, 29–56 (2009). https://doi.org/10.1007/s11047-007-9054-8

  • DOI: https://doi.org/10.1007/s11047-007-9054-8
