
Q-learning based tracking control with novel finite-horizon performance index

Published: 18 October 2024

Abstract

This paper presents a data-driven, model-free method for finite-horizon optimal tracking control (FHOTC) of unknown linear discrete-time systems based on Q-learning. First, a novel finite-horizon performance index (FHPI) that depends only on the next-step tracking error is introduced. Then, an augmented system is formulated that incorporates both the system model and the reference-trajectory model. Based on the novel FHPI, a derivation of the augmented time-varying Riccati equation (ATVRE) is provided. We present a data-driven FHOTC method that uses Q-learning to optimize the defined time-varying Q-function, which allows the solutions of the ATVRE to be estimated without knowledge of the system dynamics. Finally, the validity and features of the proposed Q-learning-based FHOTC method are demonstrated through comparative simulation studies.
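The abstract's pipeline (augment the plant with a reference model, run a backward time-varying Riccati recursion, and replace the model-based recursion with a Q-function fitted from data) can be sketched in miniature. The code below is an illustration only, not the paper's algorithm: the toy plant, reference model, quadratic stage cost, and horizon are all invented stand-ins, and the paper's specific next-step-error FHPI is replaced by a generic quadratic tracking cost. It fits a time-varying Q-function kernel `H_k` by least squares from `(z, u, z_next)` samples, so the dynamics enter only through the data, and checks the resulting feedback gains against the model-based Riccati recursion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy plant x_{k+1} = A x_k + B u_k and reference r_{k+1} = F r_k
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
F = np.array([[1.0]])

# Augmented state z = [x; r]
Az = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), F]])
Bz = np.vstack([B, np.zeros((1, 1))])
n, m = Az.shape[0], Bz.shape[1]

# Generic quadratic tracking cost (stand-in for the paper's FHPI):
# tracking error e = x1 - r, stage cost e'e + u'Ru
C = np.array([[1.0, 0.0, -1.0]])
Qc = C.T @ C
R = np.array([[0.1]])
N = 20  # horizon length (arbitrary)

def sym_from_lstsq(V, y, d):
    """Recover symmetric M with y_i = v_i' M v_i from samples, via least squares."""
    Phi = np.stack([np.kron(v, v) for v in V])
    M = np.linalg.lstsq(Phi, y, rcond=None)[0].reshape(d, d)
    return (M + M.T) / 2

# Model-based backward Riccati recursion (ground truth for comparison)
P = Qc.copy()
gains_true = []
for k in range(N - 1, -1, -1):
    K = -np.linalg.solve(R + Bz.T @ P @ Bz, Bz.T @ P @ Az)
    gains_true.append(K)
    P = Qc + Az.T @ P @ Az + Az.T @ P @ Bz @ K

# Data-driven backward pass: fit the time-varying Q-function kernel H_k
# from (z, u, z_next) tuples only; the model is never used in the fit.
P_hat = Qc.copy()
gains_q = []
for k in range(N - 1, -1, -1):
    Z = rng.standard_normal((60, n))
    U = rng.standard_normal((60, m))
    feats, targets = [], []
    for z, u in zip(Z, U):
        z_next = Az @ z + Bz @ u  # plays the role of a measured next state
        cost = z @ Qc @ z + u @ R @ u
        feats.append(np.concatenate([z, u]))
        targets.append(cost + z_next @ P_hat @ z_next)
    H = sym_from_lstsq(np.array(feats), np.array(targets), n + m)
    Hzz, Hzu, Huu = H[:n, :n], H[:n, n:], H[n:, n:]
    gains_q.append(-np.linalg.solve(Huu, Hzu.T))
    P_hat = Hzz - Hzu @ np.linalg.solve(Huu, Hzu.T)  # Schur complement

# Should print True: with noise-free data the fitted gains match Riccati
print(np.allclose(gains_q[-1], gains_true[-1], atol=1e-6))
```

Because the regression targets here are exact quadratics, the data-driven gains coincide with the model-based ones; in the paper's setting the same backward structure is retained while the kernel estimation copes with an unknown plant.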


    Published In

Information Sciences: an International Journal, Volume 681, Issue C, October 2024, 1022 pages

    Publisher

Elsevier Science Inc., United States


    Author Tags

    1. Optimal tracking control
    2. Model-free control
    3. Q-function
    4. Finite-horizon

    Qualifiers

    • Research-article
