Adaptive Dynamic Programming Algorithm For Uncertain Nonlinear Switched Systems

International Journal of Power Electronics and Drive Systems (IJPEDS)
Vol. 12, No. 1, March 2021, pp. 551∼557

ISSN: 2088-8694, DOI: 10.11591/ijpeds.v12.i1.pp551-557
Dao Phuong Nam^1, Nguyen Hong Quang^2, Nguyen Nhat Tung^3, Tran Thi Hai Yen^4
^1 School of Electrical Engineering, Hanoi University of Science and Technology, Bách Khoa, Hai Bà Trung, Hà Noi, Vietnam
^2,4 Thai Nguyen University of Technology, So 666 D. 3/2, P, Thành pho Thái Nguyên, Thái Nguyên, Vietnam
^3 Electric Power University, 235 Hoàng Quoc Viet, Co Nhue, Tu Liêm, Hà Noi 129823, Vietnam
Article Info

Article history:
Received Feb 2, 2020
Revised Dec 15, 2020
Accepted Jan 10, 2021

Keywords:
Adaptive dynamic programming
HJB equation
Lyapunov stability
Neural networks
Nonlinear switched systems

ABSTRACT

This paper studies an approximate dynamic programming (ADP) strategy for a class of nonlinear switched systems subject to external disturbances. Neural networks (NNs) are used to estimate the unknown parts of the actor and the critic for the corresponding nominal system. The networks are trained simultaneously by minimizing the squared error of the Hamiltonian function. The tracking error of the closed-loop system is shown to converge to a neighborhood of the origin in the uniformly ultimately bounded (UUB) sense. Simulation results demonstrate the effectiveness of the ADP-based controller.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Nguyen Hong Quang
Thai Nguyen University of Technology,
So 666 D. 3/2, P, Thành pho Thái Nguyên, Thái Nguyên, Vietnam
Email: quang.nguyenhong@tnut.edu.vn
1. INTRODUCTION
Many industrial systems can be described as switched systems, such as DC-DC converters [1]-[3], H-bridge inverters [4], multilevel inverters [5], and photovoltaic inverters [6]. Although many different approaches for switched systems have been proposed, e.g., switching-delay tolerant control [7] and classical nonlinear control [8]-[12], optimization approaches, which have the advantage of handling input/state constraints, have received little attention. Fuzzy logic, artificial neural network (ANN), and particle swarm optimization (PSO) techniques have been investigated for several different systems such as photovoltaic inverters and transmission lines [13]-[17].
Adaptive dynamic programming has been considered in many situations, such as nonlinear continuous-time systems [18], actuator saturation [19], linear systems [20]-[22], and output constraints [23]. For nonlinear systems, the algorithm is implemented based on neural networks (NNs), whereas the Kronecker product is employed for linear systems. Furthermore, data-driven techniques should be mentioned as a way to compute the actor/critic precisely. It should be noted that robotic systems have also been controlled by ADP algorithms [24]-[25].
This work proposes an adaptive dynamic programming solution for perturbed nonlinear switched systems based on neural networks. Considering the Hamiltonian function enables us to obtain the learning rule for these neural networks. The UUB stability of the closed-loop system is analyzed, and simulation results illustrate the effectiveness of the proposed controller.
Journal homepage: http://ijpeds.iaescore.com

2. PROBLEM STATEMENTS
Consider the following uncertain nonlinear continuous-time switched system:
$$\frac{d}{dt}\xi(t) = f_i(\xi(t)) + g_i(\xi(t))\,\bigl(u + \Delta(\xi,t)\bigr) \tag{1}$$
where $\xi(t) \in \Omega_x \subset \mathbb{R}^n$ denotes the state variables and $u(t) \in \Omega_u \subset \mathbb{R}^m$ denotes the control variables. The function $\beta : [0,+\infty) \to \Omega = \{1, 2, \dots, l\}$ is the switching signal, a piecewise continuous function of time, with $i = \beta(t)$ indexing the active subsystem, and $l$ is the number of subsystems. The $f_i(\xi)$ are uncertain smooth vector functions with $f_i(0) = 0$, and the $g_i(\xi)$ are smooth vector functions with the property $G_{\min} \leq \|g_i(\xi)\| \leq G_{\max}$. The switching index $\beta(t)$ is unknown.
Assumption 1: $\Delta(\xi,t)$ is bounded by a certain function $\varrho(\xi)$ as $\|\Delta(\xi,t)\| \leq \varrho(\xi)$.
Consider the cost function associated with the uncertain switched system (1):
$$J(\xi,u) = \int_t^\infty r\bigl(\xi(\tau),u(\tau)\bigr)\,d\tau \tag{2}$$
where $r(\xi,u) = \xi^T Q \xi + u^T R u$ and $Q = Q^T > 0$; $R = R^T > 0$.
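To make the cost (2) concrete, the following is a minimal sketch of how it could be evaluated numerically on a sampled trajectory. The helper names (`quad_form`, `running_cost`, `truncated_cost`), the finite truncation horizon, and the scalar test trajectory are illustrative assumptions and do not come from the paper.

```python
import math


def quad_form(M, v):
    """Return v^T M v for a square matrix M (list of rows) and a vector v."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))


def running_cost(xi, u, Q, R):
    """Integrand of (2): r(xi, u) = xi^T Q xi + u^T R u."""
    return quad_form(Q, xi) + quad_form(R, u)


def truncated_cost(xis, us, Q, R, dt):
    """Trapezoidal approximation of (2) over the sampled horizon only."""
    r = [running_cost(x, u, Q, R) for x, u in zip(xis, us)]
    return dt * (sum(r) - 0.5 * (r[0] + r[-1]))


# Hypothetical scalar trajectory xi(t) = exp(-t) with u = 0 and Q = R = 1.
# The exact infinite-horizon cost is the integral of exp(-2t), i.e. 0.5.
dt = 1e-3
ts = [k * dt for k in range(20001)]  # horizon truncated at t = 20
xis = [[math.exp(-t)] for t in ts]
us = [[0.0] for _ in ts]
J = truncated_cost(xis, us, Q=[[1.0]], R=[[1.0]], dt=dt)
```

Truncating the horizon is only reasonable when the running cost has decayed by the last sample, as it has in this toy trajectory.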

The main purpose is to design a state feedback controller and to give an upper bound term that guarantees that the closed-loop system under this controller is robustly stable. Additionally, the performance index (2) is bounded as $J \leq K(\xi,u) \leq M$.
Definition: The term $K(\xi,u)$ is called an appropriate performance index. Accordingly, the control input $u^* = \arg\min_{u \in \Omega_u} K(\xi,u)$ is called the optimal appropriate performance index method.
3. CONTROL DESIGN
The nominal system obtained by eliminating the disturbance in the switched system (1) is described by:
$$\frac{d}{dt}\xi = f_i(\xi) + g_i(\xi)\,u \tag{3}$$
The performance index of system (3) is modified as (4):
$$Q_1(\xi,u) = \int_t^\infty \left[\, r(\xi,u) + \gamma\,\varrho^2(\xi) \,\right] d\tau \tag{4}$$
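We prove that $Q_1(\xi,u)$ with $\gamma \geq \|R\|$ is one of the appropriate performance indexes of the dynamical system (1). Define $V^*(t) = \min_{u \in \Omega_u} Q_1(\xi,u)$; then we have (5):
$$V^*(t) = \min_{u \in \Omega_u} \int_t^\infty \left[\, r(\xi,u) + \gamma\,\varrho^2(\xi) \,\right] d\tau \tag{5}$$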
Based on the nominal system (3) and the cost function (4), the Hamiltonian function is given by (6):
$$H(\xi,u,V^*) = r(\xi,u) + \gamma\,\varrho^2(\xi) + \left(\frac{\partial V^*}{\partial \xi}\right)^T \bigl(f_i(\xi) + g_i(\xi)\,u\bigr) \tag{6}$$
By using the optimality principle, the optimal control input is obtained as (7):
$$u^*(\xi) = -\frac{1}{2} R^{-1} \bigl(g_i(\xi)\bigr)^T \frac{\partial V^*}{\partial \xi} \tag{7}$$
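The step from (6) to (7) can be made explicit. The following is a short sketch, assuming that $H$ is differentiable in $u$ and that the minimizer lies in the interior of $\Omega_u$ (neither assumption is stated in the text):

$$\frac{\partial H}{\partial u} = 2Ru + g_i(\xi)^T \frac{\partial V^*}{\partial \xi} = 0 \;\Longrightarrow\; u^*(\xi) = -\frac{1}{2} R^{-1} g_i(\xi)^T \frac{\partial V^*}{\partial \xi}, \qquad \frac{\partial^2 H}{\partial u^2} = 2R > 0.$$

Because $R = R^T > 0$, this stationary point is the unique minimizer of $H$ with respect to $u$.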
We continue to use the control law (7) for the nonlinear continuous-time switched system (1) and obtain the following result.
Theorem 1: The system (1) under the controller $u^*(\xi) = -\frac{1}{2} R^{-1} \bigl(g_i(\xi)\bigr)^T \frac{\partial V^*}{\partial \xi}$ is stable with the associated Lyapunov function candidate:
$$V(t) = \int_t^\infty \left[\, r(\xi,u) + \gamma\,\varrho^2(\xi) \,\right] d\tau \tag{8}$$
where $\gamma \geq \|R\|$.
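Proof: Taking the derivative of $V$ under the control input $u(\xi) = -\frac{1}{2} R^{-1} \bigl(g_i(\xi)\bigr)^T \nabla V^*$, we obtain (9):
$$\frac{d}{dt}V = -\xi^T Q \xi - \left(\gamma\,\varrho^2(\xi) - \Delta(\xi,t)^T R\,\Delta(\xi,t)\right) - \bigl(u + \Delta(\xi,t)\bigr)^T R\,\bigl(u + \Delta(\xi,t)\bigr) \tag{9}$$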
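The identity (9) can be checked by completing the square. The following is a sketch, assuming that $V = V^*$ along the closed-loop trajectory and that the HJB equation $H(\xi,u^*,V^*) = 0$ holds for the Hamiltonian (6), with $u = u^*$ taken from (7):

$$\begin{aligned}
(\nabla V^*)^T g_i &= -2\,u^T R \quad \text{(from (7))},\\
\dot V &= (\nabla V^*)^T (f_i + g_i u) + (\nabla V^*)^T g_i \Delta \\
&= -\xi^T Q \xi - u^T R u - \gamma\,\varrho^2(\xi) - 2\,u^T R \Delta \\
&= -\xi^T Q \xi - \left(\gamma\,\varrho^2(\xi) - \Delta^T R \Delta\right) - (u + \Delta)^T R\,(u + \Delta).
\end{aligned}$$

The third line uses $H(\xi,u^*,V^*) = 0$, and the last line adds and subtracts $\Delta^T R \Delta$.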
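Since $\gamma \geq \|R\|$ and $\|\Delta(\xi,t)\| \leq \varrho(\xi)$ by Assumption 1, we have $\Delta^T R \Delta \leq \gamma\,\varrho^2(\xi)$, and it follows that (10):
$$\dot V(t) \leq -\xi^T Q \xi \tag{10}$$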
Therefore, the system (1) is robustly stable. However, the HJB equation cannot be solved directly. Hence, the optimal performance index $V^*$ for system (3) is represented by a NN as (11):
$$V^* = w^T \sigma(\xi) + \varepsilon(\xi) \tag{11}$$

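where $\sigma(\xi) : \mathbb{R}^n \to \mathbb{R}^N$ with $\sigma(0) = 0$ is the vector of basis (activation) functions and $w \in \mathbb{R}^N$ is the constant NN weight vector. $\sigma(\xi)$ can be chosen to guarantee that $\varepsilon(\xi) \to 0$ and $\nabla\varepsilon(\xi) \to 0$ as $N \to \infty$, so for fixed $N$ we can assume that:
Assumption 2: $\|\varepsilon(\xi)\| \leq \varepsilon_{\max}$; $\|\nabla\varepsilon(\xi)\| \leq \nabla\varepsilon_{\max}$; $\nabla\sigma_{\min} \leq \|\nabla\sigma(\xi)\| \leq \nabla\sigma_{\max}$; $\|w\| \leq w_{\max}$.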
Substituting (7) into (6), we obtain (12):
$$H(\xi,u^*,V^*) = \xi^T Q \xi + \gamma\,\varrho^2(\xi) + (\nabla V^*)^T f_i(\xi) - \frac{1}{4} (\nabla V^*)^T g_i(\xi) R^{-1} g_i(\xi)^T (\nabla V^*) = 0 \tag{12}$$
Formula (11) leads to (13):
$$\nabla V^* = \bigl(\nabla\sigma(\xi)\bigr)^T w + \nabla\varepsilon(\xi) \tag{13}$$
We obtain the following expression (14):
$$e_{NN} = -\nabla\varepsilon(\xi)^T \bigl(f_i(\xi) + g_i(\xi)\,u^*\bigr) + \frac{1}{4} \nabla\varepsilon(\xi)^T g_i(\xi) R^{-1} g_i(\xi)^T \nabla\varepsilon(\xi) \tag{14}$$
It follows that $e_{NN}$ converges uniformly to zero as $N \to \infty$. For each $N$, $e_{NN}$ is bounded on a region as $|e_{NN}| \leq e_{\max}$. Under the structure of the ADP-based controller, a critic NN is computed as (15):
$$\hat V = \hat w^T \sigma(\xi) = \sigma(\xi)^T \hat w; \qquad \hat u = -\frac{1}{2} R^{-1} \bigl(g_i(\xi)\bigr)^T \nabla \hat V \tag{15}$$
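The resulting HJB approximation error is (16):
$$e_{HJB} = \xi^T Q \xi + \gamma\,\varrho^2(\xi) + \hat w^T \nabla\sigma(\xi) f_i(\xi) - \frac{1}{4} \hat w^T \nabla\sigma(\xi) g_i(\xi) R^{-1} g_i(\xi)^T \nabla\sigma(\xi)^T \hat w \tag{16}$$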
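As a sanity check on the residual (16), the sketch below evaluates it for a case where the exact value function is known. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic basis $\sigma = [x_1^2, x_1 x_2, x_2^2]$, a scalar input, a single linear double-integrator subsystem, and a zero disturbance bound. For that toy problem the Riccati solution gives the weights $[\sqrt{3}, 2, \sqrt{3}]$, so the residual should vanish at every state.

```python
import math


def sigma_grad(xi):
    """Jacobian (3x2) of the assumed basis sigma = [x1^2, x1*x2, x2^2]."""
    x1, x2 = xi
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]


def hjb_residual(w_hat, xi, f, g, Q, r_inv, gamma, varrho):
    """Residual (16) for a scalar input, where r_inv stands for R^{-1}."""
    ds = sigma_grad(xi)
    # Gradient of the critic: (grad sigma)^T w_hat
    grad_v = [sum(ds[k][j] * w_hat[k] for k in range(3)) for j in range(2)]
    fx, gx = f(xi), g(xi)
    state_cost = sum(xi[a] * Q[a][b] * xi[b] for a in range(2) for b in range(2))
    drift = sum(grad_v[j] * fx[j] for j in range(2))
    gv = sum(grad_v[j] * gx[j] for j in range(2))  # g^T grad_V (a scalar)
    return state_cost + gamma * varrho(xi) ** 2 + drift - 0.25 * r_inv * gv * gv


def f(xi):
    """Toy drift of a double integrator (illustrative only)."""
    return [xi[1], 0.0]


def g(xi):
    """Toy input vector (illustrative only)."""
    return [0.0, 1.0]


def no_disturbance(xi):
    """Zero disturbance bound, so the gamma term drops out."""
    return 0.0


Q = [[1.0, 0.0], [0.0, 1.0]]
w_star = [math.sqrt(3.0), 2.0, math.sqrt(3.0)]  # from the Riccati solution
states = ([1.0, 1.0], [0.7, -1.3], [-2.0, 0.5])
residuals = [hjb_residual(w_star, xi, f, g, Q, 1.0, 5.0, no_disturbance)
             for xi in states]
```

With any other weight vector the residual is generally nonzero, which is what the training law acts on.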
The training law is based on the steepest descent method:
$$\frac{d}{dt}\hat w = -\alpha \frac{\partial E}{\partial \hat w} \tag{17}$$
with $E = \frac{1}{2} e_{HJB}^T e_{HJB}$.
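The update law (17) can be illustrated on a scalar problem where the fixed point is known in closed form. This is a minimal sketch under stated assumptions, not the authors' implementation: a single scalar subsystem $\dot\xi = a\xi + bu$, the one-neuron critic $\hat V = \hat w\,\xi^2$, a zero disturbance bound, a fixed set of sample states, and a forward-Euler discretization with step `h`. The function name `train_scalar_critic` is hypothetical.

```python
import math


def train_scalar_critic(a, b, q, r, alpha, samples, h, steps, w0=0.0):
    """Euler-discretized steepest descent (17) for the critic V_hat = w * xi^2."""
    w = w0
    for _ in range(steps):
        grad_E = 0.0
        for xi in samples:
            # Residual (16) with sigma = xi^2, f = a*xi, g = b, varrho = 0
            e = q * xi**2 + 2.0 * a * w * xi**2 - (b * b / r) * w * w * xi**2
            # Derivative of the residual with respect to the weight
            de_dw = 2.0 * a * xi**2 - 2.0 * (b * b / r) * w * xi**2
            grad_E += e * de_dw  # dE/dw for E = 0.5 * e^2, summed over samples
        w -= h * alpha * grad_E  # discrete form of w_dot = -alpha * dE/dw
    return w


# For a = -1, b = q = r = 1 the scalar Riccati equation p^2 + 2p - 1 = 0
# has the positive root p = sqrt(2) - 1, which the weight should approach.
w_final = train_scalar_critic(a=-1.0, b=1.0, q=1.0, r=1.0, alpha=0.1,
                              samples=[0.5, 1.0, 1.5], h=0.01, steps=20000)
p_exact = math.sqrt(2.0) - 1.0
```

In this scalar case the descent direction always points toward the positive Riccati root when starting from `w0 = 0`; no such guarantee is claimed here for the general multi-neuron case.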
Remark 1: The weight $\hat w$ is trained to minimize the network error $G = \frac{1}{2} e_{HJB}^T e_{HJB}$. This result is obtained from (18):
$$\frac{\partial G}{\partial t} = -\alpha \left\| \frac{\partial G}{\partial \hat w} \right\|^2 \tag{18}$$
Theorem 2: Consider the feedback controller in (15) with the critic weight updated by (17). Then the weight estimation error $\tilde w = w - \hat w$ and the closed-loop state vector $\xi(t)$ are uniformly ultimately bounded (UUB).
Proof: Choose the Lyapunov function:
$$V(t) = V_1(t) + V_2(t), \quad \text{where } V_1(t) = \frac{1}{2\alpha} \tilde w(t)^T \tilde w(t), \quad V_2(t) = V^* \tag{19}$$
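Using Assumption 3: $\|f_i(\xi) + g_i(\xi)\,u^*\| \leq \rho_{\max}$, and the definitions $\mu_i = f_i(\xi) + g_i(\xi)\,u^*$; $G_i = g_i(\xi) R^{-1} g_i(\xi)^T$; $\nabla\sigma = \nabla\sigma(\xi)$; $\nabla\varepsilon = \nabla\varepsilon(\xi)$, and taking the derivative of $V_1(t)$, we obtain:
$$\dot V_1(t) = -\tilde w^T \left( -e_{NN} + \tilde w^T \nabla\sigma\,\mu_i + \frac{1}{2} \tilde w^T \nabla\sigma\,G_i \nabla\varepsilon + \frac{1}{4} \tilde w^T \nabla\sigma\,G_i \nabla\sigma^T \tilde w \right) \nabla\sigma \left( \mu_i + \frac{1}{2} G_i \left( \nabla\sigma^T \tilde w + \nabla\varepsilon \right) \right) \tag{20}$$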
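It leads to the estimate $\dot V_1(t) \leq -\pi_1$. For the term $V_2(t)$, from (12), (13) and (15) we have (21):
$$\dot V_2 = (\nabla V^*)^T \bigl(f_i + g_i(\hat u + \Delta)\bigr) = -\left(\xi^T Q \xi + \gamma\,\varrho^2(\xi)\right) - \frac{1}{4} (\nabla V^*)^T g_i R^{-1} g_i^T (\nabla V^*) + \frac{1}{2} (\nabla V^*)^T g_i R^{-1} g_i^T \left( \nabla\sigma(\xi)^T \tilde w + \nabla\varepsilon(\xi) \right) + (\nabla V^*)^T g_i \Delta \tag{21}$$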
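Assume that $\varrho(\xi) = \varpi \|\xi\|$. From (21) we have (22):
$$\dot V_2 \leq -\left(\lambda_{\min}(Q) + \gamma\,\varpi^2\right) \|\xi\|^2 + \theta_2 \tag{22}$$
with $\theta_2 = -\frac{1}{4} (\nabla V^*)^T g_i R^{-1} g_i^T (\nabla V^*) + \frac{1}{2} (\nabla V^*)^T g_i R^{-1} g_i^T \left( \nabla\sigma(\xi)^T \tilde w + \nabla\varepsilon(\xi) \right) + (\nabla V^*)^T g_i \Delta$.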
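Based on the above assumptions, we have (23):
$$\theta_2 \leq \frac{1}{4} \lambda_{\max}(R^{-1}) \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right)^2 G_{\max}^2 + \frac{1}{2} \left( \vartheta \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) G_{\max}^2\, \lambda_{\max}(R^{-1}) \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) + \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) G_{\max}\, \varpi \|\xi\| \tag{23}$$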

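Since the bound (23) grows at most linearly in $\|\xi\|$ while the first term of (22) is quadratic, for $\|\xi\|$ sufficiently large we have $\left(\lambda_{\min}(Q) + \gamma\,\varpi^2\right) \|\xi\|^2 - \theta_2 \geq \pi_2$ with $\pi_2 > 0$, and we obtain (24):
$$\dot V_2(t) \leq -\pi_2 \tag{24}$$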
Remark 2: The coefficients $\vartheta_1$; $\vartheta_2$ can be tuned by redesigning the NN that approximates the optimal performance index. Moreover, for an arbitrary switching index, after the time $\frac{V(0)}{\min(\pi_1;\pi_2)}$ the variables $\|\xi\|$ and $\|\tilde w\|$ enter the corresponding bounded domains. The ADP controller $\hat u$ proposed in (15) tends to a neighborhood of $u^*$.
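Proof: The deviation of the control input is estimated as (25):
$$\|\hat u - u^*\| = \left\| \frac{1}{2} R^{-1} \bigl(g_i(\xi)\bigr)^T \left( \bigl(\nabla\sigma(\xi)\bigr)^T \tilde w + \nabla\varepsilon(\xi) \right) \right\| \leq \frac{1}{2} \lambda_{\max}(R^{-1})\, G_{\max} \left( \nabla\sigma_{\max}\, \upsilon_1 + \nabla\varepsilon_{\max} \right) = \vartheta_3 \tag{25}$$
Thus the proof is completed.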
4. SIMULATION RESULTS
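In this section, we present simulations to validate the performance of the proposed control scheme. Let $N = 2$ and let the subsystems of the switched system be (26) and (27):
$$\begin{cases} \dot x_1 = -x_1^3 - 2x_2 + \bigl(u + \Delta_1(x,t)\bigr) \\ \dot x_2 = x_1 + 0.5\cos\left(x_1^2\right)\sin\left(x_2^3\right) - \bigl(u + \Delta_1(x,t)\bigr) \end{cases} \tag{26}$$
$$\begin{cases} \dot x_1 = -x_1^5 \sin(x_2) + \bigl(u + \Delta_2(x,t)\bigr) \\ \dot x_2 = \frac{1}{2} x_1 - \cos(x_1)\cos\left(x_2^3\right) - \bigl(u + \Delta_2(x,t)\bigr) \end{cases} \tag{27}$$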
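The initial state vector is chosen as (28):
$$x(0) = \begin{bmatrix} -5 & 5 \end{bmatrix}^T \tag{28}$$
The parameters are chosen as:
$$R = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}; \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}; \quad \alpha = 0.1; \quad \gamma = 5.$$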
The simulation results shown in Figure 1 and Figure 2 validate the effectiveness of the proposed algorithm.
Figure 1. The response of x2
Figure 2. The response of x2
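For readers who want to reproduce the setup, the sketch below encodes the two subsystems as functions and selects between them with a switching signal. It reads the garbled subsystem equations as $\dot x_1 = -x_1^3 - 2x_2 + (u+\Delta_1)$, $\dot x_2 = x_1 + 0.5\cos(x_1^2)\sin(x_2^3) - (u+\Delta_1)$ for (26) and $\dot x_1 = -x_1^5\sin(x_2) + (u+\Delta_2)$, $\dot x_2 = \tfrac12 x_1 - \cos(x_1)\cos(x_2^3) - (u+\Delta_2)$ for (27), which is an interpretation of the extracted text. The paper does not specify the switching signal or the disturbances, so the periodic signal `beta`, its `dwell` parameter, and the `delta` argument are hypothetical placeholders.

```python
import math


def f1(x):
    """Drift of subsystem (26), as read from the extracted text."""
    x1, x2 = x
    return [-x1**3 - 2.0 * x2, x1 + 0.5 * math.cos(x1**2) * math.sin(x2**3)]


def f2(x):
    """Drift of subsystem (27), as read from the extracted text."""
    x1, x2 = x
    return [-x1**5 * math.sin(x2), 0.5 * x1 - math.cos(x1) * math.cos(x2**3)]


def g(x):
    """Input vector shared by both subsystems: u enters as +u and -u."""
    return [1.0, -1.0]


def beta(t, dwell=1.0):
    """Hypothetical periodic switching signal: alternate every `dwell` seconds."""
    return 1 if int(t // dwell) % 2 == 0 else 2


def x_dot(t, x, u, delta=0.0):
    """Right-hand side of (1) for the active subsystem i = beta(t)."""
    f = f1(x) if beta(t) == 1 else f2(x)
    return [fj + gj * (u + delta) for fj, gj in zip(f, g(x))]


# Derivative at the initial state (28) with u = 0 and no disturbance
initial_rate = x_dot(0.0, [-5.0, 5.0], 0.0)
```

A full closed-loop run would additionally need the controller (15) with a chosen basis and a stiff-aware integration step, neither of which is fixed by the text.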
5. CONCLUSION
This paper has investigated the ADP problem for switched nonlinear systems under external disturbances. We first consider the nominal system obtained by eliminating the disturbance and then apply a classical nonlinear control technique. Neural networks have been designed to estimate the actor and the critic. This makes it possible to develop a learning algorithm with simultaneous tuning. Finally, the UUB property of the closed-loop system is guaranteed under this design.
ACKNOWLEDGEMENT
This research was supported by Research Foundation funded by Thai Nguyen University of Technology.
REFERENCES
[1] Vu, Tran Anh and Nam, Dao Phuong and Huong, Pham Thi Viet, “Analysis and control design of
transformerless high gain, high efficient buck-boost DC-DC converters,” in 2016 IEEE International
Conference on Sustainable Energy Technologies (ICSET), Hanoi, 2016, pp. 72-77,
doi: 10.1109/ICSET.2016.7811759.
[2] Nam, Dao Phuong and Thang, Bui Minh and Thanh, Nguyen Truong, “Adaptive Tracking Control for
a Boost DC–DC Converter: A Switched Systems Approach,” in 2018 4th International Conference on
Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, 2018, pp. 702-705, doi:
10.1109/GTSD.2018.8595580.
[3] Thanh, Nguyen Truong and Sam, Pham Ngoc and Nam, Dao Phuong, “An Adaptive Backstepping Control
for Switched Systems in presence of Control Input Constraint,” in 2019 International Conference
on System Science and Engineering (ICSSE), Dong Hoi, Vietnam, 2019, pp. 196-200,
doi: 10.1109/ICSSE.2019.8823125.
[4] Panigrahi, Swetapadma and Thakur, Amarnath, “Modeling and simulation of three phases cascaded H-
bridge grid-tied PV inverter,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 8, no. 1,
pp. 1-9, 2019, doi: 10.11591/eei.v8i1.1225.
[5] Devarajan, N and Reena, A, “Reduction of switches and DC sources in Cascaded Multilevel Inverter,”
Bulletin of Electrical Engineering and Informatics (BEEI), vol. 4, no. 3, pp. 186-195, 2015, doi:
10.11591/eei.v4i3.320.
[6] Venkatesan, M and Rajeshwari, R and Deverajan, N and Kaliyamoorthy, M, “Comparative study of three
phase grid connected photovoltaic inverter using pi and fuzzy logic controller with switching losses
calculation,” International Journal of Power Electronics and Drive Systems (IJPEDS), vol. 7, no. 2,
pp. 543-550, 2016.
[7] Zhang, Lixian and Xiang, Weiming, “Mode-identifying time estimation and switching-delay tolerant
control for switched systems: An elementary time unit approach,” Automatica, vol. 64, pp. 174-181, 2016,
doi: 10.1016/j.automatica.2015.11.010.
[8] Yuan, Shuai and Zhang, Lixian and De Schutter, Bart and Baldi, Simone, “A novel Lyapunov function for
a non-weighted L2 gain of asynchronously switched linear systems,” Automatica, vol. 87, pp. 310-317,
2018, doi: 10.1016/j.automatica.2017.10.018.
[9] Xiang, Weiming and Lam, James and Li, Panshuo, “On stability and H∞ control of switched
systems with random switching signals,” Automatica, vol. 95, pp. 419-425, 2018,
doi: 10.1016/j.automatica.2018.06.001.
[10] Lin, Jinxing and Zhao, Xudong and Xiao, Min and Shen, Jingjin, “Stabilization of discrete-time switched
singular systems with state, output and switching delays,” Journal of the Franklin Institute, vol. 356, pp.
2060-2089, 2019, doi: 10.1016/j.jfranklin.2018.11.034.
[11] Briat, Corentin, “Convex conditions for robust stabilization of uncertain switched systems with guaranteed
minimum and mode-dependent dwell-time,” Systems & Control Letters, vol. 78, pp. 63-72, 2015, doi:
10.1016/j.sysconle.2015.01.012.
[12] Lian, Jie and Li, Can, “Event-triggered control for a class of switched uncertain nonlinear systems,”
Systems & Control Letters, vol. 135, pp. 1-5, 2020, doi: 10.1016/j.sysconle.2019.104592.
[13] Anyaka, Boniface O and Manirakiza, J Felix and Chike, Kenneth C and Okoro, Prince A, “Optimal
unit commitment of a power plant using particle swarm optimization approach,” International
Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 2, pp. 1135-1141, 2020,
doi: 10.11591/ijece.v10i2.pp1135-1141.
[14] Devi, Palakaluri Srividya and Santhi, R Vijaya, “Introducing LQR-fuzzy for a dynamic multi area LFC-
DR model,” International Journal of Electrical & Computer Engineering, vol. 9, no. 2, pp. 861-874, 2019,
doi: 10.11591/ijece.v9i2.pp861-874.
[15] Omar, Othman AM and Badra, Niveen M and Attia, Mahmoud A, “Enhancement of on-grid pv system
under irradiance and temperature variations using new optimized adaptive controller,” International
Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 5, pp. 2650-2660, 2018,
doi: 10.11591/ijece.v8i5.2650-2660.
[16] Sharma, Purva and Saini, Deepak and Saxena, Akash, “Fault detection and classification in transmission
line using wavelet transform and ANN,” Bulletin of Electrical Engineering and Informatics (BEEI), vol.
5, no. 3, pp. 284-295, 2016.
[17] Ilamathi, P and Selladurai, V and Balamurugan, K, “Predictive modelling and optimization of nitrogen
oxides emission in coal power plant using Artificial Neural Network and Simulated Annealing,” IAES
International Journal of Artificial Intelligence (IJ-AI), vol. 1, no. 1, pp. 11-18, 2012.
[18] Vamvoudakis, Kyriakos G and Vrabie, Draguna and Lewis, Frank L, “Online adaptive algorithm for
optimal control with integral reinforcement learning,” International Journal of Robust and Nonlinear
Control, vol. 24, no. 17, pp. 2686-2710, 2013, doi: 10.1002/rnc.3018.
[19] Bai, Weiwei and Zhou, Qi and Li, Tieshan and Li, Hongyi, “Adaptive reinforcement learning neural
network control for uncertain nonlinear system with input saturation,” IEEE transactions on cybernetics,
vol. 50, no. 8, pp. 3433-3443, Aug. 2020, doi: 10.1109/TCYB.2019.2921057.
[20] Chen, Ci and Modares, Hamidreza and Xie, Kan and Lewis, Frank L and Wan, Yan and Xie, Shengli,
“Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown
dynamics,” in IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4423-4438, Nov. 2019,
doi: 10.1109/TAC.2019.2905215.
[21] Vamvoudakis, Kyriakos G and Ferraz, Henrique, “Model-free event-triggered control algorithm for
continuous-time linear systems with optimal performance,” in Automatica, vol. 87, pp. 412-420, 2018,
doi: 10.1016/j.automatica.2017.03.013.
[22] Gao, Weinan and Jiang, Yu and Jiang, Zhong-Ping and Chai, Tianyou, “Output-feedback adaptive optimal
control of interconnected systems based on robust adaptive dynamic programming,” Automatica, vol. 72,
pp. 37-45, 2016, doi: 10.1016/j.automatica.2016.05.008.
[23] Zhang, Tianping and Xu, Haoxiang, “Adaptive optimal dynamic surface control of strict-feedback
nonlinear systems with output constraints,” International Journal of Robust and Nonlinear Control, vol. 30,
no. 5, pp. 2059–2078, 2020, doi: 10.1002/rnc.4864.
[24] Wang, Ding and Mu, Chaoxu, “Adaptive-critic-based robust trajectory tracking of uncertain dynamics
and its application to a spring–mass–damper system,” IEEE Transactions on Industrial Electronics, vol.
65, no. 1, pp. 654-663, Jan. 2018, doi: 10.1109/TIE.2017.2722424.
[25] Wen, Guoxing and Ge, Shuzhi Sam and Chen, CL Philip and Tu, Fangwen and Wang, Shengnan,
“Adaptive tracking control of surface vessel using optimized backstepping technique,” IEEE transactions on
cybernetics, vol. 49, no. 9, pp. 3420-3431, Sept. 2019, doi: 10.1109/TCYB.2018.2844177.
