Mern et al., 2021 - Google Patents

Bayesian optimized Monte Carlo planning

Document ID: 10307001334830791624
Author: Mern J; Yildiz A; Sunberg Z; Mukerji T; Kochenderfer M
Publication year: 2021
Publication venue: Proceedings of the AAAI Conference on Artificial Intelligence

Snippet

Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/50—Computer-aided design
          - G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N7/00—Computer systems based on specific mathematical models
        - G06N7/005—Probabilistic networks
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
  - G05—CONTROLLING; REGULATING
    - G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
      - G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
        - G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
          - G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
            - G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/04—Inference methods or devices
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/10—Complex mathematical operations
          - G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
            - G06F17/30386—Retrieval requests
              - G06F17/30424—Query processing
                - G06F17/30533—Other types of queries
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/10—Complex mathematical operations
          - G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
- G—PHYSICS
  - G05—CONTROLLING; REGULATING
    - G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
      - G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
        - G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
          - G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators

Similar Documents

Publication	Publication Date	Title
Mern et al.	2021	Bayesian optimized Monte Carlo planning
Wachi et al.	2020	Safe reinforcement learning in constrained Markov decision processes
Suttle et al.	2020	A multi-agent off-policy actor-critic algorithm for distributed reinforcement learning
Ouyang et al.	2014	Multi-robot active sensing of non-stationary Gaussian process-based environmental phenomena
Sahai et al.	2012	Hearing the clusters of a graph: A distributed algorithm
Moss et al.	2021	Gibbon: General-purpose information-based Bayesian optimisation
Lim et al.	2021	Voronoi progressive widening: efficient online solvers for continuous state, action, and observation POMDPs
Rumí et al.	2006	Estimating mixtures of truncated exponentials in hybrid Bayesian networks
Katz et al.	2019	Learning an urban air mobility encounter model from expert preferences
Sun et al.	2020	Stochastic motion planning under partial observability for mobile robots with continuous range measurements
Mern et al.	2021	Improved POMDP tree search planning with prioritized action branching
Park et al.	2019	A distributed ADMM approach to non-myopic path planning for multi-target tracking
Park et al.	2014	Variational Bayesian inference for forecasting hierarchical time series
von Rohr et al.	2021	Probabilistic robust linear quadratic regulators with Gaussian processes
Lim et al.	2023	Optimality guarantees for particle belief approximation of POMDPs
Hanawal et al.	2015	Cheap bandits
Gabor et al.	2019	Subgoal-Based Temporal Abstraction in Monte-Carlo Tree Search
Sariff et al.	2009	Comparative study of genetic algorithm and ant colony optimization algorithm performances for robot path planning in global static environments of different complexities
Bialas et al.	2022	Coverage path planning for unmanned aerial vehicles in complex 3D environments with deep reinforcement learning
Jang et al.	2023	Improved socialtaxis for information-theoretic source search using cooperative multiple agents in turbulent environments
Ono et al.	2009	Multi-objective particle swarm optimization for robust optimization and its hybridization with gradient search
Bacon	2014	On the bottleneck concept for options discovery: Theoretical underpinnings and extension in continuous state spaces
Kuhlman et al.	2014	Physics-aware informative coverage planning for autonomous vehicles
Choi et al.	2010	Coordinated targeting of mobile sensor networks for ensemble forecast improvement
Cao et al.	2007	Partially observable Markov decision processes with reward information: Basic ideas and models