research-article

A Value Equivalence Approach for Solving Interactive Dynamic Influence Diagrams

Published: 09 May 2016 Publication History

Abstract

Interactive dynamic influence diagrams (I-DIDs) are recognized graphical models for sequential multiagent decision making under uncertainty. They represent the problem of how a subject agent acts in a common setting shared with other agents who may act in sophisticated ways. The difficulty in solving I-DIDs is mainly due to the space of candidate models ascribed to other agents, which grows exponentially over time. To minimize the model space, previous I-DID techniques prune behaviorally equivalent models. In this paper, we challenge the minimal set of models and propose a value equivalence approach to further compress the model space. The new method reduces the space by additionally pruning behaviorally distinct models that result in the same expected value of the subject agent's optimal policy. To achieve this, we propose learning the value from available data, particularly in practical applications involving real-time strategy games. We demonstrate the performance of the new technique in two problem domains.
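
The difference between the two pruning criteria described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the model names, the policies, the expected values, and both function names are hypothetical, and the values are written in by hand here, whereas the paper proposes learning them from data. The sketch only shows that grouping candidate models by the subject agent's expected value can leave fewer representatives than grouping them by predicted behavior.

```python
# Toy illustration only: model names, policies, and values are made up.

def prune_behaviorally_equivalent(models, policy_of):
    """Keep the first model seen for each distinct predicted policy."""
    representatives = {}
    for model in models:
        representatives.setdefault(policy_of[model], model)
    return list(representatives.values())


def prune_value_equivalent(models, value_of, tol=1e-9):
    """Keep one model per distinct expected value of the subject agent.

    Models are sorted by value, and a model is kept only if its value
    differs from the last kept representative by more than tol.
    """
    representatives = []
    for model in sorted(models, key=lambda m: value_of[m]):
        if (not representatives
                or abs(value_of[model] - value_of[representatives[-1]]) > tol):
            representatives.append(model)
    return representatives


# Hypothetical candidate models of the other agent.
models = ["m1", "m2", "m3", "m4"]

# Assumed policy each model predicts (a sequence of actions).
policy_of = {
    "m1": ("listen", "listen"),
    "m2": ("listen", "listen"),      # same behavior as m1
    "m3": ("listen", "open_left"),   # distinct behavior
    "m4": ("open_right", "listen"),  # distinct behavior
}

# Assumed expected value of the subject agent's optimal policy
# when the other agent follows each model.
value_of = {"m1": 2.0, "m2": 2.0, "m3": 2.0, "m4": 3.5}

behavioral = prune_behaviorally_equivalent(models, policy_of)
by_value = prune_value_equivalent(behavioral, value_of)

print(behavioral)  # ['m1', 'm3', 'm4']
print(by_value)    # ['m1', 'm4']
```

In this toy case m3 behaves differently from m1 but yields the same expected value for the subject agent, so value-based grouping removes it while behavior-based grouping does not.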

Cited By

  • (2016) Approximating value equivalence in interactive dynamic influence diagrams using behavioral coverage. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 201-207. DOI: 10.5555/3060621.3060650. Online publication date: 9 July 2016.

    Information & Contributors

    Information

    Published In

    AAMAS '16: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems
    May 2016
    1580 pages
    ISBN:9781450342391

    Sponsors

    • IFAAMAS

    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC

    Author Tags

    1. decision making
    2. intelligent agents
    3. probabilistic graphical models

    Conference

    AAMAS '16
    Acceptance Rates

    AAMAS '16 Paper Acceptance Rate 137 of 550 submissions, 25%;
    Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
