Q-learning based tracking control with novel finite-horizon performance index

Published: 01 October 2024

Abstract

In this paper, a data-driven method based on Q-learning is designed to realize model-free finite-horizon optimal tracking control (FHOTC) of unknown linear discrete-time systems. First, a novel finite-horizon performance index (FHPI) that depends only on the next-step tracking error is introduced. Then, an augmented system that incorporates both the system model and the trajectory model is formulated. Based on the novel FHPI, a derivation of the augmented time-varying Riccati equation (ATVRE) is provided. We present a data-driven FHOTC method that uses Q-learning to optimize the defined time-varying Q-function, which allows the solutions of the ATVRE to be estimated without knowledge of the system dynamics. Finally, the validity and features of the proposed Q-learning-based FHOTC method are demonstrated through comparative simulation studies.
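The general scheme the abstract describes — augmenting the plant state with a reference-trajectory model, defining a time-varying quadratic Q-function over the augmented state, and fitting it stage by stage from data instead of solving a Riccati equation from known dynamics — can be sketched as follows. This is an illustrative sketch only, not the paper's algorithm: the plant, reference model, weights, horizon, and the quadratic stage cost on the next-step tracking error are all assumptions chosen for the demo, and the system matrices are used solely to generate data, never inside the backward recursion.

```python
import numpy as np

# Hypothetical scalar plant x+ = A x + B u and reference model r+ = F r
# (assumptions for illustration; the learner never reads A, B, F directly).
A, B = np.array([[0.9]]), np.array([[1.0]])
F = np.array([[1.0]])
Qe, R = 10.0, 1.0        # weights on next-step tracking error and input (assumed)
N = 20                   # horizon length (assumed)
nz, nu = 2, 1            # augmented state z = [x; r] and input dimensions
rng = np.random.default_rng(0)

def step(x, r, u):
    return A @ x + B @ u, F @ r

def stage_cost(x_next, r_next, u):
    e = x_next - r_next  # cost penalizes the NEXT-step tracking error
    return Qe * float(e.T @ e) + R * float(u.T @ u)

def phi(z, u):
    """Quadratic basis: upper-triangular entries of [z; u][z; u]^T."""
    w = np.concatenate([z.ravel(), u.ravel()])
    return np.array([w[i] * w[j] for i in range(w.size) for j in range(i, w.size)])

def unpack(theta, n):
    """Rebuild the symmetric Q-function matrix H from basis weights."""
    H = np.zeros((n, n)); idx = 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx] / (1 if i == j else 2)
            idx += 1
    return H

# Collect exploratory data tuples (z, u, z_next, cost) with random states/inputs.
data = []
for _ in range(200):
    x, r, u = (rng.normal(size=(1, 1)) for _ in range(3))
    xn, rn = step(x, r, u)
    data.append((np.vstack([x, r]), u, np.vstack([xn, rn]), stage_cost(xn, rn, u)))
Phi = np.array([phi(z, u) for z, u, _, _ in data])

# Backward pass: fit a time-varying quadratic Q-function at each stage by
# least squares on Q_k(z, u) = cost + V_{k+1}(z_next), zero terminal cost.
P_next = np.zeros((nz, nz))
gains = [None] * N
for k in reversed(range(N)):
    y = np.array([c + float(zn.T @ P_next @ zn) for _, _, zn, c in data])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    H = unpack(theta, nz + nu)
    Hzz, Hzu, Huu = H[:nz, :nz], H[:nz, nz:], H[nz:, nz:]
    gains[k] = -np.linalg.solve(Huu, Hzu.T)   # greedy policy u_k = K_k z_k
    P_next = Hzz + Hzu @ gains[k]             # value matrix for stage k

# Roll out the learned time-varying policy: x should track r.
x, r = np.array([[0.0]]), np.array([[1.0]])
for k in range(N):
    u = gains[k] @ np.vstack([x, r])
    x, r = step(x, r, u)
print(f"final tracking error: {abs(float(x - r)):.4f}")
```

Because the cost and dynamics are linear-quadratic and the data are noise-free, the least-squares fit recovers each stage's Q-function matrix exactly, so the recursion mirrors a backward Riccati sweep performed purely from data.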


Published In

Information Sciences: an International Journal, Volume 681, Issue C, October 2024, 1022 pages

Publisher

Elsevier Science Inc.

United States


Author Tags

  1. Optimal tracking control
  2. Model-free control
  3. Q-function
  4. Finite-horizon

Qualifiers

  • Research-article
