Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer

Alegre, Lucas N.; Bazzan, Ana L. C.; da Silva, Bruno C.

arXiv:2206.11326 (cs)

[Submitted on 22 Jun 2022]

Abstract: In many real-world applications, reinforcement learning (RL) agents might have to solve multiple tasks, each one typically modeled via a reward function. If reward functions are expressed linearly, and the agent has previously learned a set of policies for different tasks, successor features (SFs) can be exploited to combine such policies and identify reasonable solutions for new problems. However, the identified solutions are not guaranteed to be optimal. We introduce a novel algorithm that addresses this limitation. It allows RL agents to combine existing policies and directly identify optimal policies for arbitrary new problems, without requiring any further interactions with the environment. We first show (under mild assumptions) that the transfer learning problem tackled by SFs is equivalent to the problem of learning to optimize multiple objectives in RL. We then introduce an SF-based extension of the Optimistic Linear Support algorithm to learn a set of policies whose SFs form a convex coverage set. We prove that policies in this set can be combined via generalized policy improvement to construct optimal behaviors for any new linearly-expressible tasks, without requiring any additional training samples. We empirically show that our method outperforms state-of-the-art competing algorithms both in discrete and continuous domains under value function approximation.
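As an illustration of the mechanism the abstract refers to, the following is a minimal sketch of generalized policy improvement (GPI) over successor features in a tabular setting. It is not the authors' implementation and does not cover the Optimistic Linear Support extension. It assumes rewards are linear in some features with weight vector w, so that each stored policy's action value is the dot product of its SFs with w. The function name `gpi_action`, the array layout of `psi`, and the toy numbers are all hypothetical choices made for this example.

```python
import numpy as np


def gpi_action(psi, w, state):
    """Pick an action by GPI over a set of stored policies.

    psi:   array of shape (n_policies, n_states, n_actions, d) holding
           each policy's successor features (assumed layout).
    w:     array of shape (d,), the reward weights of the new task.
    state: integer index of the current state.
    """
    # Under linear rewards, Q_i(s, a) = psi_i(s, a) . w for each policy i.
    q = psi[:, state] @ w  # shape (n_policies, n_actions)
    # GPI: take the best value over policies, then act greedily.
    return int(np.argmax(q.max(axis=0)))


# Toy data (made up): 2 policies, 1 state, 2 actions, 2 reward features.
psi = np.array([
    [[[1.0, 0.0], [0.0, 0.2]]],  # SFs of policy 0
    [[[0.2, 0.0], [0.0, 1.0]]],  # SFs of policy 1
])

a_first = gpi_action(psi, np.array([1.0, 0.0]), 0)   # task rewarding feature 0
a_second = gpi_action(psi, np.array([0.0, 1.0]), 0)  # task rewarding feature 1
a_mixed = gpi_action(psi, np.array([0.3, 0.7]), 0)   # a new mixed task
```

In this sketch a new task is specified only by its weight vector, so selecting an action needs no further environment interaction. Whether the resulting behavior is optimal depends on which policies are stored, which is the question the paper addresses.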

Comments:	Proceedings of the 39th International Conference on Machine Learning (ICML'22)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2206.11326 [cs.LG]
	(or arXiv:2206.11326v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.11326

Submission history

From: Lucas N. Alegre
[v1] Wed, 22 Jun 2022 19:00:08 UTC (606 KB)
