Cyclic equilibria in Markov games

Authors: Martin Zinkevich, Amy Greenwald, Michael L. Littman

NIPS'05: Proceedings of the 18th International Conference on Neural Information Processing Systems

Pages 1641 - 1648

Published: 05 December 2005
Abstract

Although variants of value iteration have been proposed for finding Nash or correlated equilibria in general-sum Markov games, these variants have not been shown to be effective in general. In this paper, we demonstrate by construction that existing variants of value iteration cannot find stationary equilibrium policies in arbitrary general-sum Markov games. Instead, we propose an alternative interpretation of the output of value iteration based on a new (non-stationary) equilibrium concept that we call "cyclic equilibria." We prove that value iteration identifies cyclic equilibria in a class of games in which it fails to find stationary equilibria. We also demonstrate empirically that value iteration finds cyclic equilibria in nearly all examples drawn from a random distribution of Markov games.
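The idea of reading the output of value iteration as a repeating sequence of policies, rather than a single stationary one, can be pictured with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: the game encoding, the function names `value_iteration_policies` and `tail_cycle_length`, and the simplification that exactly one player acts in each state (so the stage-game equilibrium reduces to the acting player maximizing its own Q-value). The sketch records the greedy joint policy at each iteration and then checks whether the tail of that sequence repeats with some period.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding (an assumption, not the paper's notation):
# game[state] = (acting_player, {action: ((reward_p0, reward_p1), next_state)})
Game = Dict[str, Tuple[int, Dict[str, Tuple[Tuple[float, float], str]]]]
Policy = Dict[str, str]


def value_iteration_policies(game: Game, gamma: float, iterations: int) -> List[Policy]:
    """Run value iteration and return the greedy policy chosen at each iteration."""
    values = {s: (0.0, 0.0) for s in game}
    history: List[Policy] = []
    for _ in range(iterations):
        new_values = {}
        policy: Policy = {}
        for s, (player, actions) in game.items():
            best_action, best_q = None, None
            for a in sorted(actions):  # sorted for deterministic tie-breaking
                rewards, nxt = actions[a]
                # Q-value vector: one entry per player.
                q = tuple(rewards[i] + gamma * values[nxt][i] for i in range(2))
                # Only the acting player's own Q-value decides the choice.
                if best_q is None or q[player] > best_q[player]:
                    best_action, best_q = a, q
            policy[s] = best_action
            new_values[s] = best_q
        values = new_values
        history.append(policy)
    return history


def tail_cycle_length(history: List[Policy], max_period: int) -> Optional[int]:
    """Smallest p <= max_period such that the last p policies equal the p before them."""
    for p in range(1, max_period + 1):
        if len(history) >= 2 * p and history[-p:] == history[-2 * p:-p]:
            return p
    return None
```

A period of 1 corresponds to the sequence settling on a stationary policy, while a period greater than 1 is the kind of behavior a cyclic interpretation would have to account for. Matching tails are only a heuristic signal: they do not by themselves show that the repeating policies form an equilibrium.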

Cited By

Srinivasan S, Lanctot M, Zambaldi V, Pérolat J, Tuyls K, Munos R, Bowling M (2018). Actor-critic policy optimization in partially observable multiagent environments. Proceedings of the 32nd International Conference on Neural Information Processing Systems (pp. 3426-3439). doi:10.5555/3327144.3327261. Online publication date: 3-Dec-2018.
https://dl.acm.org/doi/10.5555/3327144.3327261
Foerster J, Chen R, Al-Shedivat M, Whiteson S, Abbeel P, Mordatch I (2018). Learning with Opponent-Learning Awareness. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (eds. Andre E, Koenig S, Dastani M, Sukthankar G) (pp. 122-130). doi:10.5555/3237383.3237408. Online publication date: 9-Jul-2018.
https://dl.acm.org/doi/10.5555/3237383.3237408
Sun F, Chang Y, Wu Y, Lin S (2018). Designing Non-greedy Reinforcement Learning Agents with Diminishing Reward Shaping. Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (eds. Furman J, Marchant G, Price H, Rossi F) (pp. 297-302). doi:10.1145/3278721.3278759. Online publication date: 27-Dec-2018.
https://dl.acm.org/doi/10.1145/3278721.3278759
Leibo J, Zambaldi V, Lanctot M, Marecki J, Graepel T (2017). Multi-agent Reinforcement Learning in Sequential Social Dilemmas. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (eds. Larson K, Winikoff M, Das S, Durfee E) (pp. 464-473). doi:10.5555/3091125.3091194. Online publication date: 8-May-2017.
https://dl.acm.org/doi/10.5555/3091125.3091194

Information

Published in: NIPS'05: Proceedings of the 18th International Conference on Neural Information Processing Systems, December 2005, 1656 pages

Publisher: MIT Press, Cambridge, MA, United States
