Computer Science > Machine Learning

arXiv:2103.02650 (cs)

[Submitted on 3 Mar 2021 (v1), last revised 15 Mar 2021 (this version, v2)]

Title: Successor Feature Sets: Generalizing Successor Representations Across Policies

Authors: Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon

Abstract: Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards. However, successor-style representations are not optimized to generalize across policies: typically, we maintain a limited-length list of policies, and share information among them by representation learning or GPI. Successor-style representations also typically make no provision for gathering information or reasoning about latent variables. To address these limitations, we bring together ideas from predictive state representations, belief space value iteration, successor features, and convex analysis: we develop a new, general successor-style representation, together with a Bellman equation that connects multiple sources of information within this representation, including different latent states, policies, and reward functions. The new representation is highly expressive: for example, it lets us efficiently read off an optimal policy for a new reward function, or a policy that imitates a new demonstration. For this paper, we focus on exact computation of the new representation in small, known environments, since even this restricted setting offers plenty of interesting questions. Our implementation does not scale to large, unknown environments -- nor would we expect it to, since it generalizes POMDP value iteration, which is difficult to scale. However, we believe that future work will allow us to extend our ideas to approximate reasoning in large, unknown environments.
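
As background for the baseline the abstract contrasts against (a limited-length list of policies combined by GPI), the following is a minimal sketch of ordinary successor features with generalized policy improvement on a toy problem. It is not the successor feature set representation or the Bellman equation proposed in the paper. The three-state ring environment, the one-hot features, the discount of 0.5, and every name in the code (`step`, `phi`, `successor_features`, `gpi_policy`, `w_new`) are assumptions made up for illustration.

```python
GAMMA = 0.5          # assumed discount factor
N_STATES = 3         # assumed toy environment: three states on a ring
STAY, MOVE = 0, 1    # two hypothetical actions


def step(s, a):
    """Deterministic toy dynamics: stay put, or move one step around the ring."""
    return s if a == STAY else (s + 1) % N_STATES


def phi(s):
    """One-hot state features, so any reward can be written r(s) = phi(s) . w."""
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def successor_features(policy, iters=200):
    """Fixed-point iteration on psi(s) = phi(s) + GAMMA * psi(step(s, policy[s]))."""
    psi = [[0.0] * N_STATES for _ in range(N_STATES)]
    for _ in range(iters):
        psi = [
            [f + GAMMA * p for f, p in zip(phi(s), psi[step(s, policy[s])])]
            for s in range(N_STATES)
        ]
    return psi


def gpi_policy(psis, w):
    """In each state, act greedily on the best action value among stored policies."""
    def q(psi, s, a):
        # Take action a once, then follow the policy that psi was computed for.
        return dot(phi(s), w) + GAMMA * dot(psi[step(s, a)], w)

    return [
        max((STAY, MOVE), key=lambda a: max(q(psi, s, a) for psi in psis))
        for s in range(N_STATES)
    ]


# The "limited-length list of policies": always stay, and always move.
psi_stay = successor_features([STAY] * N_STATES)
psi_move = successor_features([MOVE] * N_STATES)

# A new reward function is just a new weight vector; no re-learning is needed.
w_new = [0.0, 0.0, 1.0]  # only state 2 pays off
v_stay = [dot(psi_stay[s], w_new) for s in range(N_STATES)]
v_move = [dot(psi_move[s], w_new) for s in range(N_STATES)]
improved = gpi_policy([psi_stay, psi_move], w_new)
```

In this toy setting, `improved` moves in states 0 and 1 and stays in state 2, which is better than either stored policy under `w_new`. In general, though, GPI only guarantees a policy at least as good as the ones in the list, which is the limitation on generalizing across policies that the abstract points to.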

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2103.02650 [cs.LG]
	(or arXiv:2103.02650v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2103.02650

Submission history

From: Kianté Brantley
[v1] Wed, 3 Mar 2021 19:36:44 UTC (893 KB)
[v2] Mon, 15 Mar 2021 22:09:04 UTC (726 KB)
