Computer Science > Machine Learning

arXiv:2206.12020 (cs)

[Submitted on 24 Jun 2022]

Title:Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

Authors:Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun

View PDF

Abstract:We study Reinforcement Learning for partially observable dynamical systems using function approximation. We propose a new \textit{Partially Observable Bilinear Actor-Critic framework}, that is general enough to include models such as observable tabular Partially Observable Markov Decision Processes (POMDPs), observable Linear-Quadratic-Gaussian (LQG), Predictive State Representations (PSRs), as well as a newly introduced model Hilbert Space Embeddings of POMDPs and observable POMDPs with latent low-rank transition. Under this framework, we propose an actor-critic style algorithm that is capable of performing agnostic policy learning. Given a policy class that consists of memory based policies (that look at a fixed-length window of recent observations), and a value function class that consists of functions taking both memory and future observations as inputs, our algorithm learns to compete against the best memory-based policy in the given policy class. For certain examples such as undercomplete observable tabular POMDPs, observable LQGs and observable POMDPs with latent low-rank transition, by implicitly leveraging their special properties, our algorithm is even capable of competing against the globally optimal policy without paying an exponential dependence on the horizon in its sample complexity.

Subjects:	Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME); Machine Learning (stat.ML)
Cite as:	arXiv:2206.12020 [cs.LG]
	(or arXiv:2206.12020v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.12020

Submission history

From: Masatoshi Uehara [view email]
[v1] Fri, 24 Jun 2022 00:27:42 UTC (90 KB)

Computer Science > Machine Learning

Title:Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators