Guided Policy Exploration for Markov Decision Processes using an Uncertainty-Based Value-of-Information Criterion

Sledge, Isaac J.; Emigh, Matthew S.; Principe, Jose C.

doi:10.1109/TNNLS.2018.2812709


arXiv:1802.01518 (cs)

[Submitted on 5 Feb 2018]


Abstract: Reinforcement learning in environments with many action-state pairs is challenging. At issue is the number of episodes needed to thoroughly search the policy space. Most conventional heuristics address this search problem in a stochastic manner. This can leave large portions of the policy space unvisited during the early training stages. In this paper, we propose an uncertainty-based, information-theoretic approach for performing guided stochastic searches that more effectively cover the policy space. Our approach is based on the value of information, a criterion that provides the optimal trade-off between expected costs and the granularity of the search process. The value of information yields a stochastic routine for choosing actions during learning that can explore the policy space in a coarse-to-fine manner. We augment this criterion with a state-transition uncertainty factor, which guides the search process into previously unexplored regions of the policy space.
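
The abstract does not give the action-selection rule itself, so the following is only an illustrative sketch of the general idea it describes, not the paper's algorithm. It assumes a Boltzmann-style (soft-max) distribution over actions, a hypothetical `temperature` parameter standing in for the search granularity, and a hypothetical count-based bonus `bonus_weight / sqrt(1 + n)` standing in for the state-transition uncertainty factor. All function and parameter names are invented for illustration.

```python
import math
import random


def action_probabilities(q_values, visit_counts, temperature, bonus_weight=1.0):
    """Soft-max distribution over actions with a count-based exploration bonus.

    Assumption: the bonus term is a stand-in for an uncertainty factor; the
    abstract does not specify its form. Rarely tried actions get a larger bonus.
    """
    scores = [
        (q + bonus_weight / math.sqrt(1.0 + n)) / temperature
        for q, n in zip(q_values, visit_counts)
    ]
    max_score = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_values, visit_counts, temperature, rng=random):
    """Sample an action index from the distribution above."""
    probs = action_probabilities(q_values, visit_counts, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 0.5, 0.0]    # hypothetical action-value estimates for one state
    counts = [10, 0, 0]    # hypothetical visit counts for each action
    # Lowering the temperature moves from a coarse (near-uniform) search
    # to a fine (near-greedy) one.
    for temperature in (10.0, 1.0, 0.1):
        probs = action_probabilities(q, counts, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

In this sketch, a high temperature spreads probability almost evenly across actions, while a low temperature concentrates it on the action with the best bonus-adjusted score. With the bonus enabled, the untried second action is preferred over the frequently tried first one, even though the first has the higher value estimate.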

Comments:	IEEE Transactions on Neural Networks and Learning Systems
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:1802.01518 [cs.AI]
	(or arXiv:1802.01518v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1802.01518
Related DOI:	https://doi.org/10.1109/TNNLS.2018.2812709

Submission history

From: Isaac Sledge
[v1] Mon, 5 Feb 2018 17:24:13 UTC (2,143 KB)
