IMED-RL: regret optimal learning of ergodic Markov decision processes
Abstract
Supplementary Material: Download (1.42 MB)
Information
Published In
- Editors:
- S. Koyejo,
- S. Mohamed,
- A. Agarwal,
- D. Belgrave,
- K. Cho,
- A. Oh
Publisher
Curran Associates Inc.
Red Hook, NY, United States
Qualifiers
- Research-article
- Research
- Refereed limited