Mern et al., 2021 - Google Patents
Bayesian optimized monte carlo planning
Mern et al., 2021
- Document ID
- 10307001334830791624
- Author
- Mern J
- Yildiz A
- Sunberg Z
- Mukerji T
- Kochenderfer M
- Publication year
- 2021
- Publication venue
- Proceedings of the AAAI Conference on Artificial Intelligence
Snippet
Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search …
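The snippet's description of progressive widening can be illustrated with a minimal sketch. It assumes the commonly used widening rule, in which a new action is sampled only while the number of child actions is at most k * N^alpha for a node visited N times, and UCB1 is used otherwise. This is a generic illustration, not the Bayesian-optimized method of the paper; the names `should_widen`, `Node`, `select_action`, `update`, and the parameters `k`, `alpha`, and `c` are hypothetical.

```python
import math
import random


def should_widen(num_children: int, num_visits: int,
                 k: float = 1.0, alpha: float = 0.5) -> bool:
    """Widening test (assumed form): allow a new action while the number
    of children is at most k * N**alpha, where N is the node visit count."""
    return num_children <= k * (num_visits ** alpha)


class Node:
    """Search-tree node holding per-action statistics."""

    def __init__(self):
        self.visits = 0
        self.children = {}  # action -> [visit count, running mean return]


def select_action(node, sample_action, c=1.0, k=1.0, alpha=0.5):
    """Sample a new action if widening is allowed, else pick by UCB1."""
    if should_widen(len(node.children), node.visits, k, alpha):
        action = sample_action()
        node.children.setdefault(action, [0, 0.0])
        return action

    def ucb(item):
        _, (n, q) = item
        if n == 0:
            return math.inf  # try unvisited children first
        return q + c * math.sqrt(math.log(max(node.visits, 1)) / n)

    return max(node.children.items(), key=ucb)[0]


def update(node, action, ret):
    """Back up one sampled return into the node statistics."""
    node.visits += 1
    stats = node.children[action]
    stats[0] += 1
    stats[1] += (ret - stats[1]) / stats[0]


if __name__ == "__main__":
    rng = random.Random(0)
    root = Node()
    for _ in range(100):
        a = select_action(root, rng.random)
        # Toy return: actions near 0.7 are best (illustrative only).
        update(root, a, -abs(a - 0.7))
    print(len(root.children), "actions expanded after", root.visits, "visits")
```

With `alpha` below 1 the number of expanded actions grows sublinearly in the visit count, which is what lets the search concentrate simulations on a limited set of sampled actions.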
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Mern et al. | Bayesian optimized monte carlo planning | |
Wachi et al. | Safe reinforcement learning in constrained markov decision processes | |
Suttle et al. | A multi-agent off-policy actor-critic algorithm for distributed reinforcement learning | |
Ouyang et al. | Multi-robot active sensing of non-stationary Gaussian process-based environmental phenomena | |
Sahai et al. | Hearing the clusters of a graph: A distributed algorithm | |
Moss et al. | Gibbon: General-purpose information-based bayesian optimisation | |
Lim et al. | Voronoi progressive widening: efficient online solvers for continuous state, action, and observation POMDPs | |
Rumí et al. | Estimating mixtures of truncated exponentials in hybrid Bayesian networks | |
Katz et al. | Learning an urban air mobility encounter model from expert preferences | |
Sun et al. | Stochastic motion planning under partial observability for mobile robots with continuous range measurements | |
Mern et al. | Improved POMDP tree search planning with prioritized action branching | |
Park et al. | A distributed ADMM approach to non-myopic path planning for multi-target tracking | |
Park et al. | Variational Bayesian inference for forecasting hierarchical time series | |
von Rohr et al. | Probabilistic robust linear quadratic regulators with Gaussian processes | |
Lim et al. | Optimality guarantees for particle belief approximation of POMDPs | |
Hanawal et al. | Cheap bandits | |
Gabor et al. | Subgoal-Based Temporal Abstraction in Monte-Carlo Tree Search | |
Sariff et al. | Comparative study of genetic algorithm and ant colony optimization algorithm performances for robot path planning in global static environments of different complexities | |
Bialas et al. | Coverage path planning for unmanned aerial vehicles in complex 3D environments with deep reinforcement learning | |
Jang et al. | Improved socialtaxis for information-theoretic source search using cooperative multiple agents in turbulent environments | |
Ono et al. | Multi-objective particle swarm optimization for robust optimization and its hybridization with gradient search | |
Bacon | On the bottleneck concept for options discovery: Theoretical underpinnings and extension in continuous state spaces | |
Kuhlman et al. | Physics-aware informative coverage planning for autonomous vehicles | |
Choi et al. | Coordinated targeting of mobile sensor networks for ensemble forecast improvement | |
Cao et al. | Partially observable Markov decision processes with reward information: Basic ideas and models |