

If multi-agent learning is the answer, what is the question?

Published: 01 May 2007

Abstract

The area of learning in multi-agent systems is today one of the most fertile grounds for interaction between game theory and artificial intelligence. We focus on the foundational questions in this interdisciplinary area, and identify several distinct agendas that, we argue, ought to be kept separate. The goal of this article is to start a discussion in the research community that will result in firmer foundations for the area.



Published In

Artificial Intelligence, Volume 171, Issue 7 (May 2007), 91 pages.

Publisher: Elsevier Science Publishers Ltd., United Kingdom.
