Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Filos, Angelos; Vértes, Eszter; Marinho, Zita; Farquhar, Gregory; Borsa, Diana; Friesen, Abram; Behbahani, Feryal; Schaul, Tom; Barreto, André; Osindero, Simon

Computer Science > Machine Learning

arXiv:2112.04153 (cs)

[Submitted on 8 Dec 2021 (v1), last revised 29 Jun 2022 (this version, v3)]

Abstract: Using a model of the environment and a value function, an agent can construct many estimates of a state's value by unrolling the model for different lengths and bootstrapping with its value function. Our key insight is that one can treat this set of value estimates as a type of ensemble, which we call an implicit value ensemble (IVE). Consequently, the discrepancy between these estimates can be used as a proxy for the agent's epistemic uncertainty; we term this signal model-value inconsistency, or self-inconsistency for short. Unlike prior work, which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function that are already being learned in most model-based reinforcement learning algorithms. We provide empirical evidence, in both tabular settings and function-approximation settings from pixels, that self-inconsistency is useful (i) as a signal for exploration, (ii) for acting safely under distribution shifts, and (iii) for robustifying value-based planning with a learned model.
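
The construction described in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation. It assumes a fixed policy, so that the learned model reduces to a reward vector and a transition matrix, and it assumes the standard deviation across ensemble members as the discrepancy measure (the abstract only says "discrepancy"). All function names and the two-state toy problem are hypothetical.

```python
from statistics import pstdev


def model_backup(values, rewards, transitions, gamma):
    """One model step followed by bootstrapping: r(s) + gamma * sum_s' P(s'|s) v(s')."""
    return [
        rewards[s] + gamma * sum(p * values[s2] for s2, p in enumerate(transitions[s]))
        for s in range(len(values))
    ]


def implicit_value_ensemble(values, rewards, transitions, gamma, max_unroll):
    """Return [v^0, ..., v^n]; v^k unrolls the model k steps, then bootstraps with `values`."""
    ensemble = [list(values)]
    for _ in range(max_unroll):
        ensemble.append(model_backup(ensemble[-1], rewards, transitions, gamma))
    return ensemble


def self_inconsistency(ensemble):
    """Per-state standard deviation across ensemble members (assumed discrepancy measure)."""
    num_states = len(ensemble[0])
    return [pstdev([member[s] for member in ensemble]) for s in range(num_states)]


# Hypothetical toy model: state 0 moves to state 1, state 1 is absorbing.
rewards = [0.0, 1.0]
transitions = [[0.0, 1.0], [0.0, 1.0]]
gamma = 0.5

# A value function that agrees with the model yields zero inconsistency.
consistent = implicit_value_ensemble([1.0, 2.0], rewards, transitions, gamma, max_unroll=2)
print(self_inconsistency(consistent))

# A value function that overestimates state 0 disagrees with the model there.
inconsistent = implicit_value_ensemble([5.0, 2.0], rewards, transitions, gamma, max_unroll=2)
print(self_inconsistency(inconsistent))
```

In this toy case the signal is nonzero only in state 0, where the value function and the model disagree. No extra models or value functions are trained, which is the point the abstract makes about reusing the single model and value function.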

Comments:	The first three authors contributed equally. Accepted at ICML 2022
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2112.04153 [cs.LG]
	(or arXiv:2112.04153v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2112.04153

Submission history

From: Angelos Filos
[v1] Wed, 8 Dec 2021 07:53:41 UTC (10,392 KB)
[v2] Thu, 10 Feb 2022 12:38:19 UTC (15,185 KB)
[v3] Wed, 29 Jun 2022 21:34:51 UTC (15,190 KB)
