Foresight Distribution Adjustment for Off-policy Reinforcement Learning
Abstract
Information
Published In
- General Chairs: Mehdi Dastani, Jaime Simão Sichman
- Program Chairs: Natasha Alechina, Virginia Dignum
Publisher
International Foundation for Autonomous Agents and Multiagent Systems
Richland, SC
Qualifiers
- Research-article
Funding Sources
- National Science Foundation of China