Golowich et al., 2022 - Google Patents

Planning in observable POMDPs in quasipolynomial time

Document ID
783343790480394217
Author
Golowich N
Moitra A
Rohatgi D
Publication year
2022
Publication venue
arXiv preprint arXiv:2201.04735

Snippet

Partially Observable Markov Decision Processes (POMDPs) are a natural and general model in reinforcement learning that take into account the agent's uncertainty about its current state. In the literature on POMDPs, it is customary to assume access to a planning …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/002 Quantum computers, i.e. information processing by using quantum superposition, coherence, decoherence, entanglement, nonlocality, teleportation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models

Similar Documents

Publication Publication Date Title
Golowich et al. Planning in observable POMDPs in quasipolynomial time
Curi et al. Efficient model-based reinforcement learning through optimistic policy search and planning
Xu et al. Offline RL with no OOD actions: In-sample learning via implicit value regularization
Zhang et al. GenDICE: Generalized offline estimation of stationary values
Wang et al. Policy gradient method for robust reinforcement learning
Rosencrantz et al. Learning low dimensional predictive representations
Zhang et al. CORPP: Commonsense reasoning and probabilistic planning, as applied to dialog with a mobile robot
Chen et al. On computation and generalization of generative adversarial imitation learning
Mao et al. Information state embedding in partially observable cooperative multi-agent reinforcement learning
Gao et al. An efficient quantum algorithm for generative machine learning
Pan et al. Frequency-based search-control in Dyna
Goo et al. Know your boundaries: The necessity of explicit behavioral cloning in offline RL
Golowich et al. Planning and learning in partially observable systems via filter stability
Li et al. Bayesian distributional policy gradients
Zhang et al. Latent state marginalization as a low-cost approach for improving exploration
Zhang et al. Learning retrospective knowledge with reverse reinforcement learning
Wang et al. Mobile agent path planning under uncertain environment using reinforcement learning and probabilistic model checking
Fan et al. FedHQL: Federated heterogeneous Q-learning
Campbell et al. Multiagent allocation of Markov decision process tasks
Beikmohammadi et al. Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge
Rohatgi Computationally Efficient Reinforcement Learning under Partial Observability
Wei Value of Information and Reward Specification in Active Inference and POMDPs
Levchuk et al. Active learning and structure adaptation in teams of heterogeneous agents: designing organizations of the future
Watanabe et al. Loopy belief propagation, Bethe free energy and graph zeta function
Heng On the use of transport and optimal control methods for Monte Carlo simulation