DOI: 10.1145/1329125.1329390
research-article

Q-value functions for decentralized POMDPs

Published: 14 May 2007

Abstract

Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-) optimal Q-value function is computed in a recursive manner by dynamic programming, and then a policy is extracted from this value function. In this paper we study whether similar Q-value functions can be defined in decentralized POMDP models (Dec-POMDPs), what the cost of computing such value functions is, and how policies can be extracted from such value functions. Using the framework of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy search. Then we analyze various approximate Q-value functions that allow efficient computation. Finally, we describe a family of algorithms for extracting policies from such Q-value functions.
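
As a rough illustration of the single-agent procedure the abstract starts from (compute a Q-value function recursively by dynamic programming, then extract a policy from it), here is a minimal sketch for a finite-horizon MDP. The two-state model and the names `STATES`, `ACTIONS`, `TRANSITIONS`, `REWARDS`, `compute_q` and `extract_policy` are all hypothetical and chosen only for this example. It covers the plain MDP case, not the Dec-POMDP Q-value functions studied in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MDP, for illustration only (not taken from the paper).
STATES: List[str] = ["s0", "s1"]
ACTIONS: List[str] = ["stay", "move"]

# TRANSITIONS[(state, action)] -> list of (next_state, probability)
TRANSITIONS: Dict[Tuple[str, str], List[Tuple[str, float]]] = {
    ("s0", "stay"): [("s0", 1.0)],
    ("s0", "move"): [("s1", 1.0)],
    ("s1", "stay"): [("s1", 1.0)],
    ("s1", "move"): [("s0", 1.0)],
}

# REWARDS[(state, action)] -> immediate reward
REWARDS: Dict[Tuple[str, str], float] = {
    ("s0", "stay"): 0.0,
    ("s0", "move"): 0.0,
    ("s1", "stay"): 1.0,
    ("s1", "move"): 0.0,
}


def compute_q(horizon: int) -> List[Dict[Tuple[str, str], float]]:
    """Backward induction: q[t][(s, a)] is the value of taking action a in
    state s at stage t and acting optimally for the remaining stages."""
    q: List[Dict[Tuple[str, str], float]] = [dict() for _ in range(horizon)]
    next_v = {s: 0.0 for s in STATES}  # no reward after the last stage
    for t in reversed(range(horizon)):
        for s in STATES:
            for a in ACTIONS:
                expected_future = sum(p * next_v[s2] for s2, p in TRANSITIONS[(s, a)])
                q[t][(s, a)] = REWARDS[(s, a)] + expected_future
        # The optimal stage-t value is the maximum over actions of Q.
        next_v = {s: max(q[t][(s, a)] for a in ACTIONS) for s in STATES}
    return q


def extract_policy(q: List[Dict[Tuple[str, str], float]]) -> List[Dict[str, str]]:
    """Greedy extraction: at each stage, pick an action maximizing Q in each state."""
    return [
        {s: max(ACTIONS, key=lambda a: q_t[(s, a)]) for s in STATES}
        for q_t in q
    ]


if __name__ == "__main__":
    q_values = compute_q(horizon=3)
    policy = extract_policy(q_values)
    for t, stage_policy in enumerate(policy):
        print(t, stage_policy)
```

In the decentralized setting no agent observes the state, or the other agents' observations, so a greedy maximization of this kind cannot be applied directly; that gap is what the paper's questions about defining, computing and using Q-value functions for Dec-POMDPs address.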

Published In

AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
May 2007
1585 pages
ISBN: 9788190426275
DOI: 10.1145/1329125
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • IFAAMAS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cooperative multiagent systems
  2. decentralized POMDPs
  3. planning under uncertainty

Qualifiers

  • Research-article

Funding Sources

  • ICIS

Conference

AAMAS07

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 0

Reflects downloads up to 18 Nov 2024.

Cited By

  • (2024) An Effective Training Method for Counterfactual Multi-Agent Policy Network Based on Differential Evolution Algorithm. Applied Sciences 14:18 (8383). DOI: 10.3390/app14188383. Online publication date: 18-Sep-2024
  • (2022) Task-Oriented Communication Design in Cyber-Physical Systems: A Survey on Theory and Applications. IEEE Access 10 (133842-133868). DOI: 10.1109/ACCESS.2022.3231039. Online publication date: 2022
  • (2020) Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication. Machine Learning. DOI: 10.1007/s10994-019-05864-5. Online publication date: 23-Jan-2020
  • (2012) Exploiting symmetries for single- and multi-agent Partially Observable Stochastic Domains. Artificial Intelligence 182-183 (32-57). DOI: 10.1016/j.artint.2012.01.003. Online publication date: 1-May-2012
  • (2012) Decentralized POMDPs. Reinforcement Learning (471-503). DOI: 10.1007/978-3-642-27645-3_15. Online publication date: 2012
  • (2011) Online planning for multi-agent systems with bounded communication. Artificial Intelligence 175:2 (487-511). DOI: 10.1016/j.artint.2010.09.008. Online publication date: 1-Feb-2011
  • (2010) Point-based policy generation for decentralized POMDPs. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, Volume 1 (1307-1314). DOI: 10.5555/1838206.1838377. Online publication date: 10-May-2010
  • (2008) Optimal and approximate Q-value functions for decentralized POMDPs. Journal of Artificial Intelligence Research 32:1 (289-353). DOI: 10.5555/1622673.1622680. Online publication date: 1-May-2008
  • (2008) Exploiting locality of interaction in factored Dec-POMDPs. Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems, Volume 1 (517-524). DOI: 10.5555/1402383.1402457. Online publication date: 12-May-2008
  • (2008) A Cross-Entropy Approach to Solving Dec-POMDPs. Advances in Intelligent and Distributed Computing (145-154). DOI: 10.1007/978-3-540-74930-1_15. Online publication date: 2008
