
DOI: 10.5555/1402821.1402880 · AAMAS Conference Proceedings · Research article

Social reward shaping in the prisoner's dilemma

Published: 12 May 2008

Abstract

Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward shaping, which is reward shaping applied in the multiagent-learning framework. We present preliminary experiments in the iterated Prisoner's Dilemma setting showing that agents that use social reward shaping appropriately can behave more effectively than other classical learning and non-learning strategies. In particular, we show that these agents can both lead (encourage adaptive opponents to cooperate stably) and follow (adopt a best-response strategy when paired with a fixed opponent), where better-known approaches achieve only one of these objectives.
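The idea in the abstract can be sketched concretely. The following is a minimal, hypothetical illustration, not the authors' implementation: a tabular Q-learner plays the iterated Prisoner's Dilemma against a fixed Tit-for-Tat opponent, and its environment reward is augmented with a potential-based shaping term r + gamma*Phi(s') - Phi(s), the policy-invariant form due to Ng, Harada, and Russell (1999). The potential function `Phi`, the payoff values, the opponent, and all hyperparameters here are illustrative choices and are not taken from the paper.

```python
import random

# Row player's payoffs in a standard Prisoner's Dilemma (illustrative values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ["C", "D"]
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1

def potential(state):
    """Hypothetical potential Phi: value states where the opponent just cooperated."""
    return 3.0 if state != "start" and state[1] == "C" else 0.0

def shaped_reward(reward, s, s_next):
    """Potential-based shaping: r + gamma*Phi(s') - Phi(s)."""
    return reward + GAMMA * potential(s_next) - potential(s)

class QAgent:
    def __init__(self):
        self.Q = {}

    def q(self, s, a):
        return self.Q.get((s, a), 0.0)

    def act(self, s):
        # Epsilon-greedy action selection over the two PD actions.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q(s, a))

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning backup on the shaped reward.
        target = r + GAMMA * max(self.q(s_next, b) for b in ACTIONS)
        self.Q[(s, a)] = self.q(s, a) + ALPHA * (target - self.q(s, a))

def play(rounds=5000, seed=0):
    """Shaped Q-learner vs. Tit-for-Tat; the state is the last round's joint action."""
    random.seed(seed)
    agent = QAgent()
    s, my_prev = "start", "C"
    for _ in range(rounds):
        a = agent.act(s)
        b = "C" if s == "start" else my_prev  # Tit-for-Tat echoes our last move
        s_next = (a, b)
        agent.update(s, a, shaped_reward(PAYOFF[(a, b)], s, s_next), s_next)
        s, my_prev = s_next, a
    return agent
```

Because the shaping term telescopes over any trajectory, it changes how quickly the agent learns but not which policies are optimal, which is why a shaped agent can still best-respond to a fixed opponent while being nudged toward states that sustain cooperation.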





    Published In

    AAMAS '08: Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3
    May 2008
    503 pages
    ISBN: 9780981738123

    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC


    Author Tags

    1. game theory
    2. iterated prisoner's dilemma
    3. leader/follower strategies
    4. reinforcement learning
    5. subgame perfect equilibrium

    Acceptance Rates

    Overall acceptance rate: 1,155 of 5,036 submissions (23%)


    Cited By

    • (2022) Emergent Cooperation from Mutual Acknowledgment Exchange. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pp. 1047-1055. DOI: 10.5555/3535850.3535967. Published online: 9 May 2022.
    • (2020) Gaussian Processes as Multiagent Reward Models. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pp. 330-338. DOI: 10.5555/3398761.3398804. Published online: 5 May 2020.
    • (2019) Coordinating the Crowd. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pp. 386-394. DOI: 10.5555/3306127.3331718. Published online: 8 May 2019.
    • (2019) Learning Existing Social Conventions via Observationally Augmented Self-Play. Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 107-114. DOI: 10.1145/3306618.3314268. Published online: 27 Jan 2019.
    • (2018) Value-Decomposition Networks for Cooperative Multi-Agent Learning Based on Team Reward. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2085-2087. DOI: 10.5555/3237383.3238080. Published online: 9 Jul 2018.
    • (2018) Prosocial Learning Agents Solve Generalized Stag Hunts Better than Selfish Ones. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2043-2044. DOI: 10.5555/3237383.3238065. Published online: 9 Jul 2018.
    • (2017) An Exploration Strategy for Non-Stationary Opponents. Autonomous Agents and Multi-Agent Systems, 31(5):971-1002. DOI: 10.1007/s10458-016-9347-3. Published online: 1 Sep 2017.
    • (2014) Potential-Based Difference Rewards for Multiagent Reinforcement Learning. Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems, pp. 165-172. DOI: 10.5555/2615731.2615761. Published online: 5 May 2014.
    • (2013) Organizational Design Principles and Techniques for Decision-Theoretic Agents. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems, pp. 463-470. DOI: 10.5555/2484920.2484994. Published online: 6 May 2013.
    • (2013) Combining Dynamic Reward Shaping and Action Shaping for Coordinating Multi-agent Learning. Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Volume 02, pp. 321-328. DOI: 10.1109/WI-IAT.2013.127. Published online: 17 Nov 2013.
