A Value Equivalence Approach for Solving Interactive Dynamic Influence Diagrams

Authors:

Yinghui Pan

AAMAS '16: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems

Pages 1162 - 1170

Published: 09 May 2016

Abstract

Interactive dynamic influence diagrams (I-DIDs) are recognized graphical models for sequential multiagent decision making under uncertainty. They represent the problem of how a subject agent acts in a common setting shared with other agents who may act in sophisticated ways. The difficulty in solving I-DIDs is mainly due to the space of candidate models ascribed to other agents, which grows exponentially over time. To minimize the model space, previous I-DID techniques prune behaviorally equivalent models. In this paper, we challenge the minimal set of models and propose a value equivalence approach to further compress the model space. The new method reduces the space by additionally pruning behaviorally distinct models that result in the same expected value of the subject agent's optimal policy. To achieve this, we propose to learn the value from available data, particularly in practical applications of real-time strategy games. We demonstrate the performance of the new technique in two problem domains.
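The difference between the two pruning criteria can be illustrated with a toy sketch. It is not the algorithm from the paper: the payoff table, the candidate models, and the helper names (`PAYOFF`, `MODELS`, `best_response_value`, `prune`) are all hypothetical, and the sketch assumes the subject agent's value against a model is the sum of its per-step best-response payoffs to that model's predicted actions. The paper proposes learning the value from data, whereas the sketch simply computes it from the made-up table.

```python
# Toy illustration only: all names and numbers below are assumptions,
# not taken from the paper.

# Hypothetical payoff of the subject agent i: PAYOFF[a_i][a_j]
PAYOFF = {
    "listen": {"left": 1.0, "right": 1.0, "wait": 0.0},
    "open": {"left": 0.0, "right": 0.0, "wait": 2.0},
}

# Hypothetical candidate models of the other agent j, each reduced to the
# action sequence (policy) it predicts over two time steps.
MODELS = {
    "m1": ("left", "wait"),
    "m2": ("left", "wait"),   # same behavior as m1
    "m3": ("right", "wait"),  # different behavior, same value for i
    "m4": ("wait", "wait"),
}


def best_response_value(policy):
    """Value of i's best response to j's predicted actions, summed over steps."""
    return sum(max(PAYOFF[a_i][a_j] for a_i in PAYOFF) for a_j in policy)


def prune(models, key):
    """Keep the first model seen for each distinct key; return kept names."""
    representatives = {}
    for name, policy in models.items():
        representatives.setdefault(key(policy), name)
    return sorted(representatives.values())


# Behavioral equivalence: models with identical predicted behavior merge.
behavioral = prune(MODELS, key=lambda policy: policy)

# Value equivalence: models yielding the same value for agent i merge,
# even when their predicted behavior differs.
value_equivalent = prune(MODELS, key=best_response_value)

print(behavioral)        # ['m1', 'm3', 'm4']
print(value_equivalent)  # ['m1', 'm4']
```

In this toy setting, behavioral pruning keeps three representatives, while value-based pruning also merges `m3` into `m1`, because both lead to the same value for the subject agent.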

Cited By

Conroy R, Zeng Y, Tang J (2016). Approximating value equivalence in interactive dynamic influence diagrams using behavioral coverage. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 201-207. DOI: 10.5555/3060621.3060650. Online publication date: 9-Jul-2016
https://dl.acm.org/doi/10.5555/3060621.3060650

Index Terms

A Value Equivalence Approach for Solving Interactive Dynamic Influence Diagrams
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Multi-agent systems


Information & Contributors

Information

Published In

AAMAS '16: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems

May 2016

1580 pages

ISBN:9781450342391

General Chairs: Catholijn M. Jonker (TU Delft, Netherlands), Stacy Marsella (University of Southern California, USA)

Program Chairs: John Thangarajah (RMIT University, Australia), Karl Tuyls (University of Liverpool, UK)

Sponsors

IFAAMAS

In-Cooperation

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Qualifiers

Research-article

Conference

AAMAS '16

Sponsor:

AAMAS '16: International Conference on Agents and Multiagent Systems

May 9 - 13, 2016

Singapore, Singapore

Acceptance Rates

AAMAS '16 Paper Acceptance Rate 137 of 550 submissions, 25%;

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

Bibliometrics

Total Citations: 1
Total Downloads: 73
Downloads (last 12 months): 3
Downloads (last 6 weeks): 0

Reflects downloads up to 13 Nov 2024
