A Reinforcement Learning Approach to On-line Optimal Control
An, P.E., Aslam-Mir, S., Brown, M. and Harris, C.J. (1994) A Reinforcement Learning Approach to On-line Optimal Control. Int. Conf. on Neural Networks, Orlando, FL, pp. 2465-2471.
Record type: Conference or Workshop Item (Other)
Abstract
This paper presents a hybrid control architecture for solving on-line optimal control problems. In this architecture, the control law is dynamically scheduled between a reinforcement controller and a stabilizing controller, so that the closed-loop behavior is smoothly transformed from reactive to predictive. Based on a modified Q-learning technique, the reinforcement controller comprises two components: a policy function and a Q function. The policy function is explicitly incorporated so as to bypass the minimum operator normally required for selecting actions and updating the Q function. This architecture is then applied to a repetitive operation on a second-order linear time-variant plant with a nonlinear control structure. In this operation, the reinforcement signals are based on set-point errors, and the reinforcement controller is generalized using second-order B-spline networks. This example illustrates how, for a non-optimally tuned stabilizing controller, the closed-loop performance can be bootstrapped through reinforcement learning. Results show that the set-point performance of the hybrid controller improves on that of the fixed-structure controller by discovering better control strategies which compensate for the non-optimal gains and the nonlinear control structure.
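To make the ideas in the abstract concrete, the sketch below gives one plausible reading of the architecture: a fixed proportional stabilizing law blended with a learned law, both generalized over second-order (piecewise-linear) B-spline features, with an explicit policy supplying actions and bootstrap targets so that no minimum over candidate actions is needed. The paper's exact update equations are not reproduced in this record, so the learning step here is a standard actor-critic-style temporal-difference update, and every name (bspline2_features, HybridController, the gains and learning rates) is illustrative rather than taken from the paper.

```python
import numpy as np

def bspline2_features(x, knots):
    """Second-order B-spline (piecewise-linear 'hat') basis over a knot grid.

    Exactly two basis functions are active for any input, and their weights
    sum to one, giving the local generalization the abstract attributes to
    the B-spline networks.
    """
    phi = np.zeros(len(knots))
    x = float(np.clip(x, knots[0], knots[-1]))
    i = int(np.searchsorted(knots, x, side="right")) - 1
    i = min(max(i, 0), len(knots) - 2)
    t = (x - knots[i]) / (knots[i + 1] - knots[i])
    phi[i], phi[i + 1] = 1.0 - t, t
    return phi

class HybridController:
    """Blend of a fixed stabilizing law and a learned reinforcement law.

    The learned part keeps an explicit policy alongside the value estimate,
    so action selection and bootstrapping use the policy output directly,
    with no minimum taken over candidate actions.
    """
    def __init__(self, knots, k_p=1.0, gamma=0.9, lr_v=0.1, lr_pi=0.02,
                 noise_std=0.1, seed=0):
        self.knots = np.asarray(knots, dtype=float)
        self.w_v = np.zeros(len(knots))    # cost-to-go (critic) weights
        self.w_pi = np.zeros(len(knots))   # policy weights
        self.k_p, self.gamma = k_p, gamma
        self.lr_v, self.lr_pi = lr_v, lr_pi
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)

    def action(self, err, blend):
        """blend in [0, 1]: 0 = pure stabilizing law, 1 = pure learned law."""
        phi = bspline2_features(err, self.knots)
        u_stab = -self.k_p * err                       # fixed stabilizing control
        u_rl = float(self.w_pi @ phi)                  # learned policy output
        u_rl += self.rng.normal(0.0, self.noise_std)   # exploration noise
        return (1.0 - blend) * u_stab + blend * u_rl, u_rl

    def update(self, err, u_rl, cost, next_err):
        phi = bspline2_features(err, self.knots)
        phi_next = bspline2_features(next_err, self.knots)
        # Temporal-difference error on the estimated cost-to-go.
        delta = (cost + self.gamma * float(self.w_v @ phi_next)
                 - float(self.w_v @ phi))
        self.w_v += self.lr_v * delta * phi
        # Push the policy toward explored actions that lowered the cost
        # (delta < 0) and away from those that raised it (delta > 0).
        u_mean = float(self.w_pi @ phi)
        self.w_pi -= self.lr_pi * delta * (u_rl - u_mean) * phi
```

On a repetitive set-point task, blend would be scheduled from 0 toward 1 over successive runs, so that control authority passes smoothly from the stabilizing law to the reinforcement controller as its policy improves, in the spirit of the scheduling described in the abstract.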
This record has no associated files available for download.
More information
Published date: 1994
Additional Information:
Organisation: IEEE. Address: Orlando, FL
Venue - Dates:
Int. Conf. on Neural Networks, 1994-01-01
Organisations:
Southampton Wireless Group
Identifiers
Local EPrints ID: 250209
URI: http://eprints.soton.ac.uk/id/eprint/250209
PURE UUID: 6908bbea-13cb-45d6-b110-5af913e12e0f
Catalogue record
Date deposited: 04 May 1999
Last modified: 10 Dec 2021 20:07
Contributors
Author: P.E. An
Author: S. Aslam-Mir
Author: M. Brown
Author: C.J. Harris