DOI: 10.5555/3091125.3091280
research-article

Simultaneously Learning and Advising in Multiagent Reinforcement Learning

Published: 08 May 2017

Abstract

Reinforcement Learning has long been employed to solve sequential decision-making problems with minimal input data. However, the classical approach requires a large number of interactions with the environment to learn a suitable policy. This problem is intensified when multiple autonomous agents learn simultaneously in the same environment. The teacher-student approach aims to alleviate this problem by integrating an advising procedure into the learning process, in which an experienced agent (human or not) advises a student to guide its exploration. Although previous works have reported that an agent can learn faster when receiving advice, their proposals require the teacher to be an expert in the learning task. Sharing successful episodes can also accelerate learning, but this procedure requires a lot of communication between agents, which is infeasible in domains where communication is limited. We therefore propose a multiagent advising framework in which multiple agents can advise each other while learning in a shared environment. If an agent is unsure about what to do in a given state, it can ask other agents for advice and may receive answers from agents that are more confident in their actions for that state. We perform experiments in a simulated Robot Soccer environment and show that incorporating this kind of advice improves the learning process.
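
The ask-and-answer idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not say how confidence is measured, so the sketch assumes that confidence is approximated by how often an agent has visited a state, and that conflicting answers are resolved by majority vote. The class `Agent`, the thresholds `ask_below` and `give_above`, and the action names are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class Agent:
    """Toy tabular agent. Confidence = state visit count (an assumption)."""

    def __init__(self, name, actions, ask_below=3, give_above=10):
        self.name = name
        self.actions = list(actions)
        self.ask_below = ask_below    # hypothetical: ask when visits < this
        self.give_above = give_above  # hypothetical: advise when visits >= this
        self.visits = defaultdict(int)  # state -> number of visits
        self.q = defaultdict(float)     # (state, action) -> value estimate

    def greedy_action(self, state):
        # Ties are broken in favour of the first action listed.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def is_unsure(self, state):
        return self.visits[state] < self.ask_below

    def advise(self, state):
        """Return a suggested action if confident in this state, else None."""
        if self.visits[state] >= self.give_above:
            return self.greedy_action(state)
        return None

    def choose_action(self, state, peers):
        """Return (action, was_advised) and record one visit to the state."""
        advised_action = None
        if self.is_unsure(state):
            answers = [a for a in (p.advise(state) for p in peers) if a is not None]
            if answers:
                # Assumed conflict resolution: majority vote among the answers.
                advised_action = Counter(answers).most_common(1)[0][0]
        self.visits[state] += 1
        if advised_action is not None:
            return advised_action, True
        return self.greedy_action(state), False


if __name__ == "__main__":
    novice = Agent("novice", ["pass", "shoot"])
    veteran = Agent("veteran", ["pass", "shoot"])
    veteran.visits["s0"] = 50
    veteran.q[("s0", "shoot")] = 1.0
    print(novice.choose_action("s0", [veteran]))
```

In this sketch the novice follows the veteran's suggestion only while its own visit count for the state is below `ask_below`; afterwards it falls back on its own value estimates, so no agent needs to be an expert in the whole task.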

Information & Contributors

Information

Published In

AAMAS '17: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems
May 2017
1914 pages

Sponsors

  • IFAAMAS

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Author Tags

  1. autonomous advice taking
  2. cooperative learning
  3. multiagent reinforcement learning
  4. transfer learning

Qualifiers

  • Research-article

Funding Sources

  • CNPq
  • CAPES
  • Google Latin America Research Award
  • São Paulo Research Foundation (FAPESP)

Acceptance Rates

AAMAS '17 paper acceptance rate: 127 of 457 submissions (28%)
Overall acceptance rate: 1,155 of 5,036 submissions (23%)

Article Metrics

  • Downloads (last 12 months): 10
  • Downloads (last 6 weeks): 0

Download counts reflect activity up to 20 Nov 2024.

Cited By

  • (2022) Learning to Advise and Learning from Advice in Cooperative Multiagent Reinforcement Learning. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pages 1645-1647. DOI: 10.5555/3535850.3536063. Online publication date: 9-May-2022.
  • (2021) A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint. ACM Transactions on Autonomous and Adaptive Systems, 15(2):1-28. DOI: 10.1145/3447268. Online publication date: 19-Apr-2021.
  • (2020) Learning by Reusing Previous Advice in Teacher-Student Paradigm. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pages 1674-1682. DOI: 10.5555/3398761.3398953. Online publication date: 5-May-2020.
  • (2020) Learning Hierarchical Teaching Policies for Cooperative Agents. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pages 620-628. DOI: 10.5555/3398761.3398836. Online publication date: 5-May-2020.
  • (2019) Integrating Agent Advice and Previous Task Solutions in Multiagent Reinforcement Learning. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pages 2447-2448. DOI: 10.5555/3306127.3332142. Online publication date: 8-May-2019.
  • (2019) A Q-values Sharing Framework for Multiple Independent Q-learners. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pages 2324-2326. DOI: 10.5555/3306127.3332099. Online publication date: 8-May-2019.
  • (2019) A survey on transfer learning for multiagent reinforcement learning systems. Journal of Artificial Intelligence Research, 64(1):645-703. DOI: 10.1613/jair.1.11396. Online publication date: 1-Jan-2019.
  • (2018) Autonomously reusing knowledge in multiagent reinforcement learning. Proceedings of the 27th International Joint Conference on Artificial Intelligence, pages 5487-5493. DOI: 10.5555/3304652.3304788. Online publication date: 13-Jul-2018.
  • (2018) Object-Oriented Curriculum Generation for Reinforcement Learning. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 1026-1034. DOI: 10.5555/3237383.3237850. Online publication date: 9-Jul-2018.
  • (2018) Efficient Convention Emergence through Decoupled Reinforcement Social Learning with Teacher-Student Mechanism. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 795-803. DOI: 10.5555/3237383.3237501. Online publication date: 9-Jul-2018.
