

If multi-agent learning is the answer, what is the question?

Published: 01 May 2007

Abstract

The area of learning in multi-agent systems is today one of the most fertile grounds for interaction between game theory and artificial intelligence. We focus on the foundational questions in this interdisciplinary area, and identify several distinct agendas that, we argue, ought to be kept separate. The goal of this article is to start a discussion in the research community that will result in firmer foundations for the area.



Published In

Artificial Intelligence, Volume 171, Issue 7 (May 2007), 91 pages.

Publisher: Elsevier Science Publishers Ltd., United Kingdom.
