An Efficient Off-Policy Reinforcement Learning Algorithm for the Continuous-Time LQR Problem

Lopez, Victor G.; Müller, Matthias A.

Electrical Engineering and Systems Science > Systems and Control

arXiv:2303.17819 (eess)

[Submitted on 31 Mar 2023]

Title: An Efficient Off-Policy Reinforcement Learning Algorithm for the Continuous-Time LQR Problem

Authors: Victor G. Lopez, Matthias A. Müller

Abstract: In this paper, an off-policy reinforcement learning algorithm is designed to solve the continuous-time LQR problem using only input-state data measured from the system. In contrast to other algorithms in the literature, we propose the use of a specific persistently exciting input as the exploration signal during the data collection step. We then show that, using this persistently excited data, the solution of the matrix equation in our algorithm is guaranteed to exist and to be unique at every iteration. Convergence of the algorithm to the optimal control input is also proven. Moreover, we formulate the policy evaluation step as the solution of a Sylvester-transpose equation, which makes this step more efficient to solve. Finally, a method to determine a stabilizing policy to initialize the algorithm using only measured data is proposed.
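
The abstract does not state the algorithm's equations, so the following is background only and not the authors' method. It is a minimal sketch of the classical model-based policy iteration for continuous-time LQR (Kleinman's algorithm), which alternates a policy evaluation step and a policy improvement step starting from a stabilizing gain. Unlike the data-driven setting described in the abstract, this sketch assumes the system matrices A and B are known, and it solves the evaluation step as an ordinary Lyapunov equation rather than a Sylvester-transpose equation. The helper names, the double-integrator example, and the initial gain K0 are illustrative assumptions.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac^T P + P Ac + M = 0 for symmetric P via Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec identities:
    #   vec(Ac^T P) = (I kron Ac^T) vec(P),  vec(P Ac) = (Ac^T kron I) vec(P)
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(L, -M.flatten(order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration; K0 must make A - B K0 stable."""
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        # Policy evaluation: cost matrix of the current gain K
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)
        # Policy improvement: K = R^{-1} B^T P
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Illustrative example (assumption): double integrator with unit weights
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # A - B K0 has characteristic polynomial s^2 + s + 1

P, K = policy_iteration(A, B, Q, R, K0)
# Residual of the algebraic Riccati equation at the returned P
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("K =", K)
print("max |ARE residual| =", np.abs(residual).max())
```

For this example the Riccati equation can be solved by hand, giving the optimal gain K = [1, sqrt(3)], which the iteration should approach. A data-driven variant would replace the explicit use of A and B with quantities estimated from measured input-state trajectories.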

Comments:	7 pages
Subjects:	Systems and Control (eess.SY); Machine Learning (cs.LG)
Cite as:	arXiv:2303.17819 [eess.SY]
	(or arXiv:2303.17819v1 [eess.SY] for this version)
	https://doi.org/10.48550/arXiv.2303.17819

Submission history

From: Victor Lopez
[v1] Fri, 31 Mar 2023 06:30:23 UTC (68 KB)
