Integral Temporal Difference Learning for Continuous-Time Linear Quadratic Regulations
Tae Yoon Chun, Jae Young Lee, Jin Bae Park, and Yoon Ho Choi*
International Journal of Control, Automation, and Systems, vol. 15, no. 1, pp. 226-238, 2017
Abstract: "In this paper, we propose a temporal difference (TD) learning method, called integral TD learning that
efficiently finds solutions to continuous-time (CT) linear quadratic regulation (LQR) problems in an online fashion
where system matrix A is unknown. The idea originates from a computational reinforcement learning method
known as TD(0), which is the simplest TD method in a finite Markov decision process. For the proposed integral TD
method, we mathematically analyze the positive definiteness of the updated value functions, monotone convergence
conditions, and stability properties concerning the locations of the closed-loop poles in terms of the learning rate
and the discount factor. The proposed method includes the existing value iteration method for CT LQR problems
as a special case. Finally, numerical simulations are carried out to verify the effectiveness of the proposed method
and further investigate the aforementioned mathematical properties."
Keywords: "Adaptive optimal control, linear quadratic regulation, reinforcement learning, temporal difference, value iteration."