research-article

Nonzero-sum differential games of continuous-Time nonlinear systems with uniformly ultimately ε-bounded by adaptive dynamic programming

Published: 01 October 2022

Highlights

The proposed control scheme solves the coupled HJ equations online and forward in time without requiring initial stabilizing control policies. This is achieved by adding a novel term, which depends only on the value function V(x), to the weight tuning law of the critic NN for each player.
By introducing the Osgood condition, which is more general than the Lipschitz condition, the uniqueness of the solution of the system is obtained. The requirement on the nonlinear term f(x) is thereby greatly relaxed: f(x) is not limited to a function of the first degree, but can be quadratic, cubic, or of degree N.
Compared with the works in [23], [30] and [31], the assumptions that g(x) and k(x) are bounded functions are removed, which is also reflected in the subsequent simulation.
In [23], the system state and the weight estimation errors of the critic NNs are uniformly ultimately bounded (UUB), where the bound contains constants and the errors generated by the NNs. In this paper, however, the bound depends only on the errors generated by the NNs themselves. Further, when these errors are small enough or tend to zero, the system is asymptotically stable.
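
The Osgood condition named in the highlights is not written out on this page. As a point of reference, a common textbook form is sketched below. The symbols F, r and r_1 are notation assumed here, and the paper's own statement may differ in detail.

```latex
% Osgood condition (textbook form; notation assumed, not taken from the paper).
% f is continuous on a region D and, for all x, y in D,
\[
  \| f(x) - f(y) \| \le F\bigl( \| x - y \| \bigr),
\]
% where F(r) > 0 is continuous for r > 0 and, for some r_1 > 0,
\[
  \int_{0}^{r_1} \frac{dr}{F(r)} = \infty .
\]
% Lipschitz special case: F(r) = L r with a constant L > 0,
% because \int_0^{r_1} dr / (L r) diverges.
% An Osgood modulus that is not Lipschitz: F(r) = r |\ln r| for 0 < r < 1/e.
% Substituting s = -\ln r gives \int_{-\ln r_1}^{\infty} ds / s, which diverges.
```

Under this condition the initial value problem for the system has at most one solution through each initial state, which is the uniqueness property the highlights refer to.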

Abstract

In this paper, a single-network adaptive dynamic programming (ADP) control method is presented to obtain nearly optimal control policies for the non-zero-sum (NZS) differential game problem of an autonomous nonlinear system. The Osgood condition, rather than the traditional Lipschitz condition, is introduced into policy iteration for the first time to guarantee the existence and uniqueness of the solution of the nonlinear dynamic system and to relax the restrictions on the nonlinear dynamic functions f(x), g(x) and k(x). Moreover, this adaptive control scheme finds approximations of the optimal value and the NZS Nash equilibrium in real time, while also ensuring the uniform ultimate ε-boundedness of the closed-loop system. Further, as the number of hidden-layer neurons tends to infinity, the approximation errors converge to zero, so that the closed-loop system is asymptotically stable. Finally, the effectiveness of the proposed near-optimal control scheme is verified by a simulation example.
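
The abstract refers to the NZS game, its Nash equilibrium and the coupled HJ equations without giving formulas. A minimal sketch for two players is given below, following the standard formulation used in works such as [23] and [30]. The cost terms Q_i and R_{ij}, the value functions V_i, the inputs u and w, and the critic quantities W_i, \phi_i and \varepsilon_i are notation assumed here, with R_{11} and R_{22} taken to be symmetric positive definite.

```latex
% Two-player NZS differential game (standard form; notation assumed,
% not copied from the paper). Requires the amsmath package.
\begin{align}
  \dot{x} &= f(x) + g(x)\,u + k(x)\,w, \qquad x(0) = x_0, \\
  J_i(x_0, u, w) &= \int_{0}^{\infty}
      \bigl( Q_i(x) + u^{\top} R_{i1} u + w^{\top} R_{i2} w \bigr)\, dt,
      \qquad i = 1, 2 .
\end{align}
% Nash equilibrium (u^*, w^*): neither player can lower its own cost
% by deviating unilaterally.
\begin{align}
  J_1(x_0, u^*, w^*) &\le J_1(x_0, u, w^*), &
  J_2(x_0, u^*, w^*) &\le J_2(x_0, u^*, w) .
\end{align}
% Coupled HJ equations for the value functions V_i, with V_i(0) = 0:
\begin{align}
  0 &= Q_i(x) + u^{*\top} R_{i1} u^* + w^{*\top} R_{i2} w^*
       + \nabla V_i^{\top} \bigl( f(x) + g(x)\,u^* + k(x)\,w^* \bigr),
       \qquad i = 1, 2, \\
  u^* &= -\tfrac{1}{2} R_{11}^{-1} g(x)^{\top} \nabla V_1, \qquad
  w^* = -\tfrac{1}{2} R_{22}^{-1} k(x)^{\top} \nabla V_2 .
\end{align}
% Critic NN representation of each value function (assumed form):
\begin{equation}
  V_i(x) = W_i^{\top} \phi_i(x) + \varepsilon_i(x), \qquad i = 1, 2 .
\end{equation}
```

In this form, "single network" means that each player uses only a critic NN, and the policies u^* and w^* are computed from the critic gradients rather than from separate actor networks.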

References

[1]
H. Mukaidani, Newton’s method for solving cross-coupled sign-indefinite algebraic Riccati equations for weakly coupled large-scale systems, Appl. Math. Comput. 188 (1) (2007) 103–115.
[2]
V. Shah, Power Control for Wireless Data Services Based on Utility and Pricing, M.S. thesis, Rutgers Univ., Piscataway, NJ, 1998.
[3]
S. Tijs, Introduction to Game Theory, Hindustan Book Agency, India, 2003.
[4]
T. Basar, G.J. Olsder, Dynamic Noncooperative Game Theory (2nd ed.), SIAM, Philadelphia, PA, 1999.
[5]
G. Freiling, G. Jank, H. Abou-Kandil, On global existence of solutions to coupled matrix Riccati equations in closed loop Nash games, IEEE Trans. Automat. Contr. 41 (2) (2002) 264–269.
[6]
H. Abou-Kandil, G. Freiling, V. Ionescu, G. Jank, Matrix Riccati Equations in Control and Systems Theory, Birkhäuser, 2003.
[7]
M. Jungers, E. De Pieri, H. Abou-Kandil, Solving coupled Riccati equations for closed-loop Nash strategy, by lack of trust approach, Int. J. Tomogr. Stat. 7 (F07) (2007) 49–54.
[8]
Z. Gajic, T.Y. Li, Simulation results for two new algorithms for solving coupled algebraic Riccati equations, in: Third Int. Symp. on Differential Games, Sophia Antipolis, France, 1988.
[9]
R.E. Bellman, Dynamic Programming, Princeton Univ. Press, Princeton, NJ, 1957.
[10]
H. Liang, X. Guo, Y. Pan, T. Huang, Event-triggered fuzzy bipartite tracking control for network systems based on distributed reduced-order observers, IEEE Trans. Fuzzy Syst., to be published, doi:10.1109/TFUZZ.2020.2982618.
[11]
Y. Pan, P. Du, H. Xue, H. Lam, Singularity-free fixed-time fuzzy control for robotic systems with user-defined performance, IEEE Trans. Fuzzy Syst., to be published, doi:10.1109/TFUZZ.2020.2999746.
[12]
P. Du, Y. Pan, H. Li, H. Lam, Nonsingular finite-time event-triggered fuzzy control for large-scale nonlinear systems, IEEE Trans. Fuzzy Syst., to be published, doi:10.1109/TFUZZ.2020.2992632.
[13]
C. Peng, M. Wu, X. Xie, Y. Wang, Event-triggered predictive control for networked nonlinear systems with imperfect premise matching, IEEE Trans. Fuzzy Syst. 26 (5) (2018) 2797–2806.
[14]
C. Peng, D. Yue, M. Fei, Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership, IEEE Trans. Fuzzy Syst. 22 (5) (2014) 1101–1112.
[15]
X. Xie, D. Yue, J.H. Park, Enhanced switching stabilization of discrete-time Takagi-Sugeno fuzzy systems: reducing the conservatism and alleviating the on-line computational burden, IEEE Trans. Fuzzy Syst., doi:10.1109/TFUZZ.2020.2986670.
[16]
H. Zhang, Z. Liu, G. Huang, Z. Wang, Novel weighting-delay-based stability criteria for recurrent neural networks with time-varying delay, IEEE Trans. Neural Netw. 21 (1) (2010) 91–106.
[17]
Q. Wei, D. Liu, H. Lin, Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems, IEEE Trans. Cybern. 46 (3) (2016) 840–853.
[18]
Q. Wei, D. Liu, Q. Lin, R. Song, Adaptive dynamic programming for discrete-time zero-sum games, IEEE Trans. Neural Netw. Learn. Syst. 29 (4) (2018) 957–969.
[19]
H. Zhang, Y. Luo, D. Liu, Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints, IEEE Trans. Neural Netw. 20 (9) (2009) 1490–1503.
[20]
H. Zhang, Q. Wei, D. Liu, An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games, Automatica 47 (1) (2011) 207–214.
[21]
F.-Y. Wang, H. Zhang, D. Liu, Adaptive dynamic programming: an introduction, IEEE Comput. Intell. Mag. 4 (2) (2009) 39–47.
[22]
Q. Wei, F.L. Lewis, D. Liu, R. Song, H. Lin, Discrete-time local value iteration adaptive dynamic programming: convergence analysis, IEEE Trans. Syst., Man, Cybern., Syst. 48 (6) (2018) 875–891.
[23]
K.G. Vamvoudakis, F.L. Lewis, Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations, Automatica 47 (8) (2011) 1556–1569.
[24]
H. Zhang, C. Qing, B. Jiang, Y. Luo, Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems, IEEE Trans. Cybern. 44 (12) (2014) 2706–2718.
[25]
H. Zhang, L. Cui, X. Zhang, Y. Luo, Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method, IEEE Trans. Neural Netw. 22 (12) (2011) 2226–2236.
[26]
H. Zhang, C. Liu, H. Su, K. Zhang, Echo state network-based decentralized control of continuous-time nonlinear large-scale interconnected systems, IEEE Trans. Syst., Man, Cybern., Syst., to be published, doi:10.1109/TSMC.2019.2958484.
[27]
H. Zhang, J. Zhang, G. Yang, Y. Luo, Leader-based optimal coordination control for the consensus problem of multi-agent differential games via fuzzy adaptive dynamic programming, IEEE Trans. Fuzzy Syst. 23 (1) (2015) 152–163.
[28]
C. Mu, K. Wang, Aperiodic adaptive control for neural-network-based nonzero-sum differential games: a novel event-triggering strategy, ISA Trans. 92 (2019) 1–13.
[29]
R. Song, J. Li, F.L. Lewis, Robust optimal control for disturbed nonlinear zero-sum differential games based on single NN and least squares, IEEE Trans. Syst., Man, Cybern., Syst. 50 (11) (2020).
[30]
H. Zhang, L. Cui, Y. Luo, Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP, IEEE Trans. Cybern. 43 (1) (2013) 206–216.
[31]
K.G. Vamvoudakis, F.L. Lewis, Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem, Automatica 46 (5) (2010) 878–888.
[32]
H.K. Khalil, Nonlinear Systems, Prentice-Hall, Englewood Cliffs, NJ, 1996.
[33]
M. Abu-Khalaf, F.L. Lewis, Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach, Automatica 41 (5) (2005) 779–791.
[34]
T. Ding, C. Li, Ordinary Differential Equations Tutorial, Higher Education Press, Beijing, 2012.
[35]
H. Su, H. Zhang, H. Jiang, Y. Wen, Decentralized event-triggered adaptive control of discrete-time nonzero-sum games over wireless sensor-actuator networks with input constraints, IEEE Trans. Neural Netw. Learn. Syst. 99 (2020) 1–13.
[36]
H.S. Zhang, H. Zhang, D. Gao, Y. Luo, Adaptive dynamics programming for H∞ control of continuous-time unknown nonlinear systems via generalized fuzzy hyperbolic models, IEEE Trans. Syst., Man, Cybern., Syst. (2019) 1–13.
[37]
J. Zhang, H. Zhang, Y. Luo, Y. Liu, Adaptive event-triggered leader-follower consensus of linear multiagent systems under directed graph with nonzero leader input, IEEE Trans. Circuits Syst. II Express Briefs, doi:10.1109/TCSII.2021.3115487.
[38]
J. Zhang, H. Zhang, K. Zhang, Y. Cai, Observer-based output feedback event-triggered adaptive control for linear multiagent systems under switching topologies, IEEE Trans. Neural Netw. Learn. Syst., doi:10.1109/TNNLS.2021.3084317.
[39]
Y. Li, W. Gao, W. Yan, S. Huang, D. Gao, Data-driven optimal control strategy for virtual synchronous generator via deep reinforcement learning approach, J. Mod. Power Syst. Clean Energy (2021).

Cited By

  • (2024) Hierarchical approximate optimal interaction control of human-centered modular robot manipulator systems, Neurocomputing 585:C, doi:10.1016/j.neucom.2024.127573. Online publication date: 7-Jun-2024


        Published In

        Applied Mathematics and Computation  Volume 430, Issue C
        Oct 2022
        850 pages

        Publisher

        Elsevier Science Inc.

        United States

        Publication History

        Published: 01 October 2022

        Author Tags

        1. Adaptive dynamic programming (ADP)
        2. Coupled Hamilton-Jacobi equations
        3. Osgood condition
        4. Neural networks (NNs)

        Qualifiers

        • Research-article
