Abstract
Learning provides a useful tool for the automatic design of autonomous robots. Recent research on learning robot control has predominantly focused on learning single tasks studied in isolation. If robots encounter a multitude of control learning tasks over their entire lifetime, however, there is an opportunity to transfer knowledge between them. To do so, robots may learn the invariants and regularities of their individual tasks and environments. This task-independent knowledge can be employed to bias generalization when learning control, which reduces the need for real-world experimentation. We argue that knowledge transfer is essential if robots are to learn control with moderate learning times in complex scenarios. Two approaches to lifelong robot learning, both of which capture invariant knowledge about the robot and its environments, are presented. Both approaches have been evaluated using a HERO-2000 mobile robot. Learning tasks included navigation in unknown indoor environments and a simple find-and-fetch task.
This paper is also available as Technical report IAI-TR-93-7, University of Bonn, Dept. of Computer Science III, March 1993.
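The abstract's central claim, that invariants learned across earlier tasks can bias generalization and reduce the need for real-world experimentation, can be illustrated with a toy sketch. The following is not the paper's method: it is a minimal numpy illustration under assumed synthetic tasks that share a low-dimensional invariant subspace, where features recovered from five source tasks let a new task be learned from only five examples.

```python
# Hypothetical illustration of lifelong-learning transfer (not the authors'
# implementation): extract an invariant subspace from several source tasks,
# then use it to bias learning of a new task from very little data.
import numpy as np

rng = np.random.default_rng(0)
d = 8  # number of (hypothetical) sensor inputs

def make_task(w_true, n=50):
    """Synthetic control-learning task: targets depend only on a
    low-dimensional invariant subspace of the input (the shared regularity)."""
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    return X, y

# All tasks share the same two relevant input directions (the invariant).
basis = rng.normal(size=(d, 2))
source_tasks = [make_task(basis @ rng.normal(size=2)) for _ in range(5)]

# Task-independent knowledge: stack the per-task solutions and keep the
# top singular directions they span.
W = np.stack([np.linalg.lstsq(X, y, rcond=None)[0] for X, y in source_tasks])
U, _, _ = np.linalg.svd(W.T, full_matrices=False)
shared = U[:, :2]  # learned invariant feature directions

# New lifelong-learning task: only 5 real-world examples are available.
w_new = basis @ rng.normal(size=2)
X_new, y_new = make_task(w_new, n=5)

# Biased generalization: fit a 2-parameter head on the shared features.
head = np.linalg.lstsq(X_new @ shared, y_new, rcond=None)[0]
# Isolated learning: fit all 8 parameters from the same 5 examples.
w_scratch = np.linalg.lstsq(X_new, y_new, rcond=None)[0]

X_test, y_test = make_task(w_new, n=200)
print("transfer MSE:", np.mean((X_test @ shared @ head - y_test) ** 2))
print("scratch  MSE:", np.mean((X_test @ w_scratch - y_test) ** 2))
```

On this synthetic setup, the transferred two-parameter fit generalizes markedly better than the eight-parameter fit learned in isolation, mirroring the abstract's argument that invariant knowledge reduces the experimentation a new task requires.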
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
Cite this paper
Thrun, S., Mitchell, T.M. (1995). Lifelong Robot Learning. In: Steels, L. (eds) The Biology and Technology of Intelligent Autonomous Agents. NATO ASI Series, vol 144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-79629-6_7
DOI: https://doi.org/10.1007/978-3-642-79629-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-79631-9
Online ISBN: 978-3-642-79629-6