Mern et al., 2021 - Google Patents

Bayesian optimized Monte Carlo planning


Document ID: 10307001334830791624
Authors: Mern J, Yildiz A, Sunberg Z, Mukerji T, Kochenderfer M
Publication year: 2021
Publication venue: Proceedings of the AAAI Conference on Artificial Intelligence

Snippet

Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search …
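
The progressive widening idea mentioned in the snippet can be illustrated with a minimal sketch. It is not the authors' implementation: the widening rule `|children| <= k * N**alpha` is the commonly used form, and the names `should_widen`, `select_action`, `sample_action`, and the defaults for `k`, `alpha`, and `c` are assumptions chosen for illustration. New actions are drawn by a caller-supplied sampler, and existing actions are ranked with a standard UCB1 score.

```python
import math


def should_widen(num_children, num_visits, k=2.0, alpha=0.5):
    """Progressive widening test (assumed form): allow a new action
    while the child count is at most k * N**alpha."""
    return num_children <= k * (num_visits ** alpha)


def select_action(children, num_visits, sample_action, c=1.0, k=2.0, alpha=0.5):
    """Pick an action at a tree node.

    children: dict mapping action -> [visit_count, mean_value]
    sample_action: hypothetical zero-argument callable that draws an
        action from the (possibly continuous) action space.
    """
    if should_widen(len(children), num_visits, k, alpha):
        # Widen: sample a new action and register it as a child.
        action = sample_action()
        children.setdefault(action, [0, 0.0])
        return action

    # Otherwise choose among existing children with UCB1.
    log_n = math.log(max(num_visits, 1))

    def ucb(item):
        visits, mean_value = item[1]
        if visits == 0:
            return math.inf  # always try unvisited children first
        return mean_value + c * math.sqrt(log_n / visits)

    return max(children.items(), key=ucb)[0]
```

Because the number of children grows sublinearly in the visit count, the search concentrates simulations on a limited set of sampled actions instead of spreading them across the whole action space.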

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G06F17/5009 Computer-aided design using simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30424 Query processing
    • G06F17/30533 Other types of queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators

Similar Documents

Publication Publication Date Title
Mern et al. Bayesian optimized Monte Carlo planning
Wachi et al. Safe reinforcement learning in constrained Markov decision processes
Suttle et al. A multi-agent off-policy actor-critic algorithm for distributed reinforcement learning
Ouyang et al. Multi-robot active sensing of non-stationary Gaussian process-based environmental phenomena
Sahai et al. Hearing the clusters of a graph: A distributed algorithm
Moss et al. Gibbon: General-purpose information-based Bayesian optimisation
Lim et al. Voronoi progressive widening: efficient online solvers for continuous state, action, and observation POMDPs
Rumí et al. Estimating mixtures of truncated exponentials in hybrid Bayesian networks
Katz et al. Learning an urban air mobility encounter model from expert preferences
Sun et al. Stochastic motion planning under partial observability for mobile robots with continuous range measurements
Mern et al. Improved POMDP tree search planning with prioritized action branching
Park et al. A distributed ADMM approach to non-myopic path planning for multi-target tracking
Park et al. Variational Bayesian inference for forecasting hierarchical time series
von Rohr et al. Probabilistic robust linear quadratic regulators with Gaussian processes
Lim et al. Optimality guarantees for particle belief approximation of POMDPs
Hanawal et al. Cheap bandits
Gabor et al. Subgoal-Based Temporal Abstraction in Monte-Carlo Tree Search
Sariff et al. Comparative study of genetic algorithm and ant colony optimization algorithm performances for robot path planning in global static environments of different complexities
Bialas et al. Coverage path planning for unmanned aerial vehicles in complex 3D environments with deep reinforcement learning
Jang et al. Improved socialtaxis for information-theoretic source search using cooperative multiple agents in turbulent environments
Ono et al. Multi-objective particle swarm optimization for robust optimization and its hybridization with gradient search
Bacon On the bottleneck concept for options discovery: Theoretical underpinnings and extension in continuous state spaces
Kuhlman et al. Physics-aware informative coverage planning for autonomous vehicles
Choi et al. Coordinated targeting of mobile sensor networks for ensemble forecast improvement
Cao et al. Partially observable Markov decision processes with reward information: Basic ideas and models