Golowich et al., 2022 - Google Patents

Planning in observable POMDPs in quasipolynomial time

Document ID: 783343790480394217
Author: Golowich N; Moitra A; Rohatgi D
Publication year: 2022
Publication venue: arXiv preprint arXiv:2201.04735

Snippet

Partially Observable Markov Decision Processes (POMDPs) are a natural and general model in reinforcement learning that take into account the agent's uncertainty about its current state. In the literature on POMDPs, it is customary to assume access to a planning …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/002—Quantum computers, i.e. information processing by using quantum superposition, coherence, decoherence, entanglement, nonlocality, teleportation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models

Similar Documents

Publication	Publication Date	Title
Golowich et al.	2022	Planning in observable POMDPs in quasipolynomial time
Curi et al.	2020	Efficient model-based reinforcement learning through optimistic policy search and planning
Xu et al.	2023	Offline RL with no OOD actions: In-sample learning via implicit value regularization
Zhang et al.	2020	GenDICE: Generalized offline estimation of stationary values
Wang et al.	2022	Policy gradient method for robust reinforcement learning
Rosencrantz et al.	2004	Learning low dimensional predictive representations
Zhang et al.	2015	CORPP: Commonsense reasoning and probabilistic planning, as applied to dialog with a mobile robot
Chen et al.	2020	On computation and generalization of generative adversarial imitation learning
Mao et al.	2020	Information state embedding in partially observable cooperative multi-agent reinforcement learning
Gao et al.	2017	An efficient quantum algorithm for generative machine learning
Pan et al.	2020	Frequency-based search-control in Dyna
Goo et al.	2022	Know your boundaries: The necessity of explicit behavioral cloning in offline RL
Golowich et al.	2023	Planning and learning in partially observable systems via filter stability
Li et al.	2021	Bayesian distributional policy gradients
Zhang et al.	2022	Latent state marginalization as a low-cost approach for improving exploration
Zhang et al.	2020	Learning retrospective knowledge with reverse reinforcement learning
Wang et al.	2023	Mobile agent path planning under uncertain environment using reinforcement learning and probabilistic model checking
Fan et al.	2023	FedHQL: Federated heterogeneous Q-learning
Campbell et al.	2013	Multiagent allocation of Markov decision process tasks
Beikmohammadi et al.	2024	Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge
Rohatgi	2023	Computationally Efficient Reinforcement Learning under Partial Observability
Wei	2024	Value of Information and Reward Specification in Active Inference and POMDPs
Levchuk et al.	2018	Active learning and structure adaptation in teams of heterogeneous agents: designing organizations of the future
Watanabe et al.	2011	Loopy belief propagation, Bethe free energy and graph zeta function
Heng	2016	On the use of transport and optimal control methods for Monte Carlo simulation