Expected Eligibility Traces

van Hasselt, Hado; Madjiheurem, Sephora; Hessel, Matteo; Silver, David; Barreto, André; Borsa, Diana

arXiv:2007.01839 (cs)

[Submitted on 3 Jul 2020 (v1), last revised 8 Feb 2021 (this version, v2)]

Abstract: The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected eligibility traces. Expected traces make it possible, with a single update, to update states and actions that could have preceded the current state, even if they did not do so on this occasion. We discuss when expected traces provide benefits over classic (instantaneous) traces in temporal-difference learning, and show that sometimes substantial improvements can be attained. We provide a way to smoothly interpolate between instantaneous and expected traces by a mechanism similar to bootstrapping, which ensures that the resulting algorithm is a strict generalisation of TD($\lambda$). Finally, we discuss possible extensions and connections to related ideas, such as successor features.
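To make the idea in the abstract concrete, the following is a minimal tabular sketch, not the paper's algorithm. It assumes a table `z[s]` that tracks a running average of the accumulating trace observed whenever state `s` is visited, and it uses that average in place of the instantaneous trace in the TD update. The function name, the step size `beta`, and the simple convex mix controlled by `eta` are all assumptions made for illustration; the abstract describes the paper's interpolation only as "similar to bootstrapping", and it may differ from the mix used here.

```python
def td_with_expected_traces(episodes, n_states, alpha=0.5, beta=0.5,
                            gamma=1.0, lam=1.0, eta=0.0):
    """Tabular TD learning with a hypothetical expected-trace table.

    episodes: list of trajectories; each is a list of
              (state, reward, next_state), with next_state=None at the end.
    eta=1.0 uses only the instantaneous trace (ordinary accumulating-trace
    TD(lambda)); eta=0.0 uses only the learned expected trace.
    """
    v = [0.0] * n_states
    # z[s][i] is a running estimate of E[e_t[i] | S_t = s] (assumed form).
    z = [[0.0] * n_states for _ in range(n_states)]
    for episode in episodes:
        e = [0.0] * n_states  # instantaneous trace, reset each episode
        for s, r, s_next in episode:
            # Decay the accumulating trace and mark the current state.
            e = [gamma * lam * x for x in e]
            e[s] += 1.0
            # Move the expected trace for s toward the observed trace.
            z[s] = [zi + beta * (ei - zi) for zi, ei in zip(z[s], e)]
            # One-step TD error; terminal states have value zero.
            bootstrap = gamma * v[s_next] if s_next is not None else 0.0
            delta = r + bootstrap - v[s]
            # Illustrative convex mix of instantaneous and expected traces.
            mix = [eta * ei + (1.0 - eta) * zi for ei, zi in zip(e, z[s])]
            v = [vi + alpha * delta * mi for vi, mi in zip(v, mix)]
    return v


# Two routes into state 2: from state 0 (reward 0) and from state 1 (reward 1).
episodes = [
    [(0, 0.0, 2), (2, 0.0, None)],
    [(1, 0.0, 2), (2, 1.0, None)],
]
print(td_with_expected_traces(episodes, 3, eta=0.0))  # [0.125, 0.25, 0.375]
print(td_with_expected_traces(episodes, 3, eta=1.0))  # [0.0, 0.5, 0.5]
```

In this toy run the reward arrives in the second episode, which never visits state 0. With `eta=1.0` state 0 receives no update, while with `eta=0.0` it receives some credit, because state 0 preceded state 2 in the earlier episode.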

Comments:	AAAI, distinguished paper award
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2007.01839 [cs.LG]
	(or arXiv:2007.01839v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2007.01839

Submission history

From: Hado van Hasselt
[v1] Fri, 3 Jul 2020 17:46:16 UTC (129 KB)
[v2] Mon, 8 Feb 2021 13:02:30 UTC (142 KB)
