Neural Comput & Applic (1999) 8: 86–92
© 1999 Springer-Verlag London Limited

Neural Network Implementation of Nonlinear Receding-Horizon Control*

L. Cavagnari, L. Magni and R. Scattolini
Dipartimento di Informatica e Sistemistica, Università di Pavia, Pavia, Italy

Abstract. The Receding Horizon (RH) approach is an effective way to derive control algorithms for nonlinear systems with stabilising properties also in the presence of state and control constraints. However, RH methods imply a heavy computational burden for on-line optimisation; therefore, they are not suitable for the control of fast systems, for example mechanical ones, which call for the use of short sampling periods. The aim of this paper is to show through an experimental study how a Nonlinear RH (NRH) control law can be computed off-line, and subsequently approximated by means of a neural network, which is effectively used for the on-line implementation. The proposed design procedure is applied to synthesise a neural NRH controller for a seesaw apparatus. The experimental results reported here demonstrate the feasibility of the method.

Keywords: Mechanical systems; Neural network; Nonlinear control; Receding-Horizon Control

1. Introduction

In recent years, many Nonlinear Receding Horizon (NRH) control algorithms have been proposed with guaranteed local stability properties even when constraints are imposed on the evolution of the control and state variables [1–6]. In particular, the technique presented by De Nicolao et al. [5] provides exponential stability of the equilibrium under the mild assumption of stabilisability of the associated linearised system. The corresponding NRH control law is computed through the solution of a Finite Horizon (FH) optimisation problem with optimisation horizon N and terminal state penalty equal to the cost that would be incurred by applying a local stabilising linear control law thereafter. It is remarkable, however, that the linear control law is never applied in practice, but is just used to compute the terminal state penalty. Moreover, the region of attraction of the equilibrium grows with N, and tends to that of the associated Infinite Horizon (IH) optimisation problem. For this reason, one should select long horizons N in order to improve the overall performance and to enlarge the exponential stability region. On the other hand, as N increases the optimisation problem becomes more and more difficult to solve, and it is surely intractable for an on-line implementation in fast applications, i.e. when the dynamics of the system under control force the use of a small sampling period.

In these cases, the procedure first suggested by Parisini and Zoppoli [3] can be followed. Specifically, one can compute off-line the optimal NRH control law γRH(x) for a (large) set of admissible values of the initial state x ∈ X. Then, it is possible to approximate γRH(x) with any suitable interpolation technique, for example by means of a Neural Network (NN). Finally, the approximating function so obtained is effectively implemented for on-line computations.

Although the above implementation procedure is conceptually very attractive and has been used in simulation experiments [3], to the authors' knowledge, no real applications have been presented so far.

The aim of this paper is to present some experimental results obtained in the control of a seesaw apparatus with a NN approximation of the NRH control law presented by De Nicolao et al. [5].

Correspondence and offprint requests to: L. Magni, Dipartimento di Informatica e Sistemistica, Università di Pavia, Via Ferrata 1, 27100 Pavia, Italy. Email: magni@conpro.unipv.it
* This paper has been partially supported by the MURST Project "Model Identification, System Control, Signal Processing".

To reduce the computational burden required to determine the control sequences used for the training of the NN, the NRH control law is computed for initial values of the system state which are far from the equilibrium, while the standard optimal Linear Quadratic (LQ) technique is applied in its neighbourhood.

The paper is organised as follows. In Section 2, the state feedback NRH control law [5] is briefly summarised together with its properties. Moreover, since the state of the seesaw equipment is not completely accessible, the extension of the NRH control law to the case of output feedback is presented together with the associated stabilising results, recently presented by Magni et al. [7]. Section 3 describes the guidelines followed to derive the NN approximation of the NRH control law. In Section 4, the experimental apparatus is presented; the implemented NRH/NN control law is described, and some experimental results are reported to witness the applicability of the approach. Finally, some concluding remarks are reported in Section 5.

2. Nonlinear Receding-Horizon Control

In this section, we briefly review the results in De Nicolao et al. [5] and Magni et al. [7] on state feedback and output feedback RH control of nonlinear systems, which form the basis of all the subsequent developments.

The nonlinear discrete-time dynamic system is assumed to be described by

x(k+1) = f(x(k), u(k)), x(t) = x̄, k ≥ t   (1)
y(k) = h(x(k))   (2)

where x ∈ Rⁿ is the state, y ∈ Rᵖ is the output, and u ∈ Rᵐ is the input. The functions f(·,·) and h(·) are C¹ functions of their arguments and f(0,0) = 0, h(0) = 0.

For the system (1), we search for a NRH control law u = γRH(x) which regulates the state to the origin, subject to the input and state constraints

x(k) ∈ X, u(k) ∈ U, k ≥ t   (3)

where X and U are closed subsets of Rⁿ and Rᵐ, respectively, both containing the origin as an interior point.

To derive the NRH control law, first let

x(k+1) = Ax(k) + Bu(k)
y(k) = Cx(k)   (4)

be the linearisation of system (1)–(2) around the equilibrium point (x̄,ū) = (0,0), i.e.

A = ∂f/∂x (0,0), B = ∂f/∂u (0,0), C = ∂h/∂x (0)

Assuming that the pair (A,B) is stabilisable, well-known results of linear control theory state that it is possible to find a matrix K such that the eigenvalues of (A − BK) are inside the unit circle in the complex plane. Note that K is the gain matrix of a linear state feedback control law u(k) = −Kx(k) which stabilises the linear system (4). Hence, K can be computed by means of standard synthesis methods for linear systems, for example LQ [8] or pole-placement [9] techniques.

Now, for a given stabilising matrix K, at any time instant t let x̄ = x(t) and minimise with respect to u_{t,t+N−1} := [u(t) u(t+1) … u(t+N−1)], N ≥ 1, the cost function

J(x̄, u_{t,t+N−1}, N, K) = Σ_{i=0}^{N−1} [x(t+i)′Qx(t+i) + u(t+i)′Ru(t+i)] + Vf(x(t+N), K)   (5)

subject to (1) and (3), with Q > 0, R > 0 and the terminal state penalty Vf defined as

Vf(x̄, K) = Σ_{i=0}^{∞} x(t+i)′(Q + K′RK)x(t+i)

where x(t+i), i ≥ 0, satisfies (1) with u(k) = −Kx(k) and x(t) = x̄.

The optimal control sequence u°_{t,t+N−1} solving the above optimisation problem is termed admissible if, when applied to system (1),

x(k) ∈ X, u(k) ∈ U, t ≤ k ≤ t + N
x(t+N) ∈ X̄(K)

where X̄(K) stands for the exponential stability region (see [1] and [5]) of the nonlinear closed-loop system composed by the nonlinear system (1) and the linear control law u(k) = −Kx(k). In other words, x ∈ X̄(K) implies the fulfillment of the constraints (3), i.e. x(k) ∈ X, −Kx(k) ∈ U, ∀k ≥ t.

Finally, the state-feedback NRH control law u = γRH(x) is obtained by applying at any time instant t the control u(t) = u°(x̄), where u°(x̄) is the first column of u°_{t,t+N−1}. Letting X0(N,K) be the set of states x̄ such that an admissible control sequence u°_{t,t+N−1} exists, the following result holds:

Theorem 1 [5]. Assume that (A,B) is stabilisable and let K be such that the eigenvalues of (A − BK) are inside the unit circle in the complex plane. Then, if the NRH control law u = γRH(x) is applied to the nonlinear system (1), the origin is an exponentially stable equilibrium point of the resulting closed-loop system having X0(N,K) as exponential stability region.
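To make the FH problem concrete, the sketch below minimises (5) by direct single shooting over the control sequence, approximating the infinite sum defining Vf by truncation along the nonlinear closed loop under u = −Kx. This is only an illustrative reading of the procedure: the function and parameter names (`nrh_control`, `tail`) are ours, a general-purpose SciPy optimiser stands in for whatever solver the authors used, and the constraints (3) are omitted for brevity.

```python
import numpy as np
from scipy.optimize import minimize

def nrh_control(f, x_bar, K, Q, R, N, tail=100):
    """Solve the FH problem (5) by direct single shooting and return the
    first element of the optimal sequence, i.e. the receding-horizon move.
    Vf is approximated by truncating its infinite sum along the nonlinear
    closed loop under u = -Kx."""
    m = K.shape[0]

    def terminal_penalty(x):
        # Vf(x,K) = sum_i x(t+i)'(Q + K'RK)x(t+i) with u(k) = -Kx(k)
        QK = Q + K.T @ R @ K
        cost = 0.0
        for _ in range(tail):
            cost += x @ QK @ x
            x = f(x, -K @ x)
        return cost

    def J(useq):
        # N-step stage cost plus terminal penalty, simulated forward from x_bar
        u = useq.reshape(N, m)
        x, cost = np.asarray(x_bar, dtype=float), 0.0
        for i in range(N):
            cost += x @ Q @ x + u[i] @ R @ u[i]
            x = f(x, u[i])
        return cost + terminal_penalty(x)

    res = minimize(J, np.zeros(N * m), method="Powell")
    return res.x.reshape(N, m)[0]
```

For instance, with a toy discrete-time model `f` and any K that places the eigenvalues of A − BK inside the unit circle, `nrh_control(f, x0, K, Q, R, N)` returns the control to apply at the current state, after which the optimisation is repeated at the next sample.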
A very practical procedure for stabilising the nonlinear system (1) is to first design a linear control law u = γ_IH^L(x) = −Kx by minimising an IH performance index subject to the linearised state dynamics (4). In this respect, well-established tools are available for the tuning of the weighting matrices Q and R in a standard LQ control problem so as to achieve the desired specifications for the linearised closed-loop system x(k+1) = (A − BK)x(k). Then, the same Q and R are used to implement the nonlinear RH controller. Under regularity assumptions, as x → 0 it turns out that γRH(x) → γ_IH^L(x) = −Kx. Moreover, ∂γRH(x)/∂x|_{x=0} = −K, so that the NRH control law can be regarded as a consistent nonlinear extension of the linear control law u = −Kx.

In this procedure, once Q and R have been selected, the only free parameter is the optimisation horizon N, which can be tuned to trade computational complexity (which grows with N) for performance (γRH(x) → γIH(x) as N → ∞, where γIH(x) is the unknown optimal control law for the IH nonlinear problem). Furthermore, as N grows, the stability region enlarges and X0(N,K) → X_IH, where X_IH is the region of attraction of the optimal IH nonlinear control law γIH(x). It can also be proven that X0(N+1,K) ⊇ X0(N,K) ⊇ X̄(K), ∀N ≥ 0 [5].

The NRH state-feedback control law previously introduced assumes the knowledge of the system state x at any time instant t. In many practical cases, for example in the application described in Section 4, only the system outputs are measured. Then, the state-feedback control law must be combined with a suitable state observer producing at any time instant t the estimate x̂(t) of the state vector x(t) from the measures of the system inputs and outputs. Finally, the truly implemented NRH output-feedback control law is computed on the estimated state, that is u = γRH(x̂). However, this procedure raises an important theoretical issue: to what extent do the stabilising properties of the state feedback control law (see Theorem 1) still hold in the output feedback case? This problem has been analysed in depth by Magni et al. [7], where it has been shown that, under mild observability properties of the original nonlinear system (1)–(2), combining the NRH state-feedback control law with popular observer methods, such as the Kalman filter, the asymptotic (or exponential) stability of the equilibrium is still guaranteed.

3. Neural Network Implementation of the NRH Control Law

The main drawback of the RH approach is the necessity to solve a nonlinear optimisation problem on-line. This is possible for slow systems such as the one considered by Magni et al. [10], or chemical and petrochemical plants, where NRH control is already widely applied industrially. However, notwithstanding the improvements of hardware technology, a direct on-line implementation of NRH control may still be quite impossible for fast systems.

To apply the NRH approach also when a short sampling interval must be used, one can consider solving the FH optimisation problem off-line for many different initial states x̄ and storing the computed sequences of control variables, i.e. the realisations of γRH(x), in the computer's memory. Then, in the real application, one has to select or interpolate the computed control sequence that best fits the current state of the plant. Clearly, this strategy has the disadvantage that an excessive amount of computer memory may be required to store the closed-loop control law. Moreover, some interpolating procedure must be implemented in practice to effectively compute the control variable.

An interesting way to solve these implementation problems is to resort to a functional approximation γ̂RH(x) of the optimal control law γRH(x) [3]. More specifically, we search for a function γ̂RH(x,w), with a given structure, which depends upon a vector of parameters w. The values of w must be optimised with respect to the approximation error

E(w) = Σ_{i=1}^{Nc} ‖γRH(x(i)) − γ̂RH(x(i),w)‖²

where x(i), i = 1, …, Nc, are the states of the sequences which have been computed off-line, and which form the training set.

Among the many different approximation strategies nowadays available, in this work we have concentrated on multilayer feedforward neural networks [11]. In particular, we assume that the approximating neural function γ̂RH(x,w) contains only one hidden layer composed of neural units (perceptrons) with a sigmoidal activation function

fs(y) = (1 + e^{−λy})^{−1}

and that the output layer is composed of linear activation units. It is well known that continuous functions can be approximated to any degree of accuracy on a given compact set by feedforward neural networks based on sigmoidal functions, provided that the number of perceptrons is sufficiently large.
However, the a priori choice of the number of perceptrons to use is an open problem of neural network approximation theory. In Parisini and Zoppoli [3], a theoretical study of the approximating properties of the receding-horizon neural regulator is reported.

4. NRH Control of a Seesaw

The design procedure described in the previous sections has been followed for the synthesis of a neural controller for a seesaw. The apparatus, schematically shown in Fig. 1, consists of two long arms hinged onto a triangular support. The axis is coupled to a potentiometer which allows one to measure the seesaw angle. A cart slides on a ground stainless steel shaft. The cart is equipped with a motor and a potentiometer. These are coupled to a rack and pinion mechanism to input the driving force to the system and to measure the cart position, respectively. The control objective is to design an output feedback control law that controls the position of the cart so as to maintain the seesaw in the horizontal position.

Fig. 1. Mechanical scheme of the seesaw.

4.1. Nonlinear Continuous Time Model of the Seesaw

Denoting by p and v the cart position and velocity and by θ and ω the angle position and velocity (see Fig. 1), the nonlinear model of the plant is given by the following differential equations:

ṗ = v

v̇ = ((mp² + J + mh²)/(mp² + J))(F/m + ω²p)
    + (g/(mp² + J))[(Mhc − mp² − J)sin θ − mhp cos θ] − 2mhvpω/(mp² + J)

θ̇ = ω

ω̇ = (mh/(mp² + J))(F/m + ω²p) − 2mvpω/(mp² + J)
    + [Mgc sin θ − mgp cos θ]/(mp² + J)   (6)

where M and m are the seesaw and cart masses, J is the moment of inertia, h is the height of the track from the pivot point, c is the height of the seesaw centre of mass from the pivot point, g is the acceleration due to gravity and F is the force applied to the cart. The force is provided by a DC motor coupled to the track via a rack and pinion mechanism. It depends upon the input voltage u in the following way:

F = (kg·Km/(r·Ra))·u − (kg²·Km²/(r²·Ra))·v   (7)

where Ra is the armature resistance, kg is the built-in gear ratio of the motor, r is the radius of the output pinion and Km is the torque constant. In the experimental apparatus used in this study, the values of the parameters are: m = 0.455 kg, M = 3.3 kg, h = 0.1397 m, c = 0.058 m, J = 0.427 kg·m², g = 9.81 m/s², kg = 3.7, Km = 0.00767 N·m/A, r = 0.0064 m, Ra = 2.6 Ω. The maximum allowed value of θ is about 13° because of physical constraints.

Note that model (6) does not account for the presence of friction, which can be viewed in the control synthesis phase as an unmodelled dynamics. The robustness of the controller will also be tested against this uncertainty.

In the (unstable) equilibrium x = [p v θ ω]′ = [0 0 0 0]′, the seesaw is horizontal and the cart is located at the centre of the triangular support. The system is assumed to be controlled with a digital controller with sampling period equal to 5 ms.

4.2. NRH Control Law and Neural Approximation

According to the procedure outlined in Section 2, the linearisation of model (6)–(7) around the origin has first been computed and discretised. The obtained discrete-time linear model is defined by the matrices A and B reported in the Appendix, and is able to describe with accuracy the seesaw dynamics for angle positions roughly ranging in the interval (−8°, 8°). On the contrary, for angles beyond 8° the nonlinear behaviour dominates.

With reference to the discrete-time linearised system (4), the stabilising gain K (see again the Appendix) of the linear control law u = −Kx has been determined by means of the LQ method [8] with
state and control weighting matrices Q and R given by

Q = diag{3000, 0.1, 3000, 0.1}, R = 2   (8)

Note that, since the seesaw behaviour is almost linear for θ0 ≤ 8°, the control value u computed with the linear control law is almost equal to the one which could be determined through the minimisation of the performance index (5) referred to the nonlinear system (6)–(7), i.e. u = −Kx ≈ γRH(x). For this reason, starting from different initial conditions x0 = [0 0 θ0 0]′, θ0 ≤ 8°, the control law u = −Kx has been used to generate, with a negligible computational effort, the control sequences to be subsequently used for the training of the neural net.

Fig. 2. Control scheme.

To enlarge the stability region and to improve the control performance, the NRH control algorithm of Section 2 has been used to compute off-line the optimal control sequences corresponding to various initial conditions x0 = [0 0 θ0 0]′, with 8° ≤ θ0 ≤ 13°. The optimisation horizon N = 10 has been used, together with the Q and R matrices again given by Eq. (8). As discussed in Section 3, the computed sequences are realisations of the truly optimal NRH control law u = γRH(x), and have been subsequently used in the training of the approximating net. Observe that the solution of the optimisation problem requires a significant computational burden.

Finally, a multilayer feedforward neural net with 30 perceptrons and sigmoidal activation function with λ = 1 has been used to approximate the state-feedback NRH control law. As already stated, the training set has been composed both of the sequences computed with the linear control law (θ0 ≤ 8°) and of those determined through the optimisation phase. In so doing, a smooth passage from the nonlinear control law to the linear LQ one has been guaranteed by the training process itself.
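As an illustration of this approximation step, the sketch below fits a single-hidden-layer sigmoidal network of the structure described above (30 perceptrons, λ = 1, linear output unit) to precomputed (state, control) pairs by batch gradient descent on E(w). The class and method names are ours, and gradient descent is only one of many ways the weights w could be optimised; the training data in the usage note below stands in for the sequences computed off-line.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(y, lam=1.0):
    # fs(y) = 1 / (1 + exp(-lam * y)), the activation used for the perceptrons
    return 1.0 / (1.0 + np.exp(-lam * y))

class OneHiddenLayerNet:
    """One hidden layer of sigmoidal perceptrons, linear output unit:
    the structure assumed for the approximator gamma_hat(x, w)."""

    def __init__(self, n_in, n_hidden=30):
        self.W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, n_hidden)
        self.b2 = 0.0

    def __call__(self, X):
        # Forward pass for a batch of states X, one row per state
        return sigmoid(X @ self.W1.T + self.b1) @ self.w2 + self.b2

    def train(self, X, u, lr=0.05, epochs=2000):
        # Minimise E(w) = sum_i (u_i - net(x_i))^2 by batch gradient descent
        n = len(u)
        for _ in range(epochs):
            H = sigmoid(X @ self.W1.T + self.b1)
            err = H @ self.w2 + self.b2 - u          # residuals on the training set
            gH = np.outer(err, self.w2) * H * (1 - H)  # backprop through the sigmoid
            self.w2 -= lr * H.T @ err / n
            self.b2 -= lr * err.mean()
            self.W1 -= lr * gH.T @ X / n
            self.b1 -= lr * gH.mean(axis=0)
```

For example, fitting samples of a linear law u = −Kx, the regime in which γRH is nearly linear, should drive E(w) down; the optimisation-based samples for larger initial angles would be handled identically, which is what yields the smooth passage between the two regimes.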

4.3. Output Feedback Control Law

In the seesaw experimental apparatus, only the cart and angle positions are directly measured and coincide with the outputs of the system, while the cart and angle velocities must be reconstructed with an observer. In particular, a standard Kalman filter [12] has been derived for the discrete-time linearised system, and used for on-line closed-loop control. This filter has been designed assuming that two Gaussian and mutually independent white noises ξ ~ WGN(0,Q*), Q* = diag[0.0001 0 0 0], and η ~ WGN(0,R*), R* = diag[0.0004 0.0004], act on the state and output vectors, respectively. Correspondingly, the filter gain L reported in the Appendix has been computed.

Fig. 3. Experimental results.
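The observer recursion described above can be sketched in a few lines, assuming the steady-state filter form with a precomputed gain L (as in the Appendix); the function name and signature are illustrative:

```python
import numpy as np

def kalman_observer_step(xhat, u, y, A, B, C, L):
    """One iteration of the steady-state Kalman observer used to
    reconstruct the unmeasured velocities: predict with the linear
    model (4), then correct with the measured outputs."""
    xpred = A @ xhat + B @ u              # model-based prediction
    return xpred + L @ (y - C @ xpred)    # innovation correction
```

Run at every sampling instant, the estimate x̂(t) produced this way is what the NRH/NN control law is evaluated on, u = γRH(x̂).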
In summary, the scheme used for on-line control is shown in Fig. 2. It is composed of the plant, the Kalman observer, the NRH neural network (NRH/NN) state-feedback control law and a saturation block. This control scheme has been implemented on a 133 MHz PC using a C-language program and a commercial data acquisition system.

4.4. Experimental Results

To test the performance of the NRH/NN control law, some perturbations have been imposed on the seesaw. In particular, impulse-type forces have been applied to one extremum of the apparatus, so that the task of the control system has been to bring the seesaw back to the horizontal position. Some of the results achieved are presented in Fig. 3, where the transients of the cart position, of the angle position and of the input voltage are reported. Concerning these figures, two facts have to be noted. First, the NRH/NN control law guarantees good results also for perturbations of the angle position greater than 8°, where the nonlinear control synthesis procedure is effective. Second, there is a small steady-state error both in the cart and in the angle positions. Correspondingly, in the steady state the input voltage is different from zero. This is caused by the presence of friction forces that for small values of the input u prevent the cart from moving. To eliminate these errors, one should include suitable integral action in the feedback loop.

5. Conclusion

The results reported in this paper clearly illustrate that the Receding Horizon approach is a practical way to solve nonlinear control problems also for fast systems. In these cases, the true nonlinear control law must be computed off-line first. Then, it can be approximated through the use of nowadays standard tools, such as Neural Nets. The approach followed here has also been used to control a laboratory inverted pendulum, with excellent results [13].

References

1. Keerthi S, Gilbert E. Optimal, infinite-horizon feedback laws for a general class of constrained discrete-time systems. J Optimiz Th Appl 1988; 57: 265–293
2. Mayne D, Michalska H. Receding horizon control of nonlinear systems. IEEE Trans Automatic Control 1990; 35: 814–824
3. Parisini T, Zoppoli R. A receding-horizon regulator for nonlinear systems and a neural approximation. Automatica 1995; 31: 1443–1451
4. Chen H, Allgöwer F. A quasi-infinite horizon nonlinear predictive control. European Control Conference, 1997
5. De Nicolao G, Magni L, Scattolini R. Stabilizing receding-horizon control of nonlinear time-varying systems. IEEE Trans Automatic Control 1998; AC-43: 1030–1036
6. De Nicolao G, Magni L, Scattolini R. Stabilizing predictive control of nonlinear ARX models. Automatica 1997; 33: 1691–1697
7. Magni L, De Nicolao G, Scattolini R. Output feedback receding-horizon control of discrete-time nonlinear systems. IFAC Nonlinear Control Systems Design Symposium, Enschede, The Netherlands, 1998
8. Anderson B, Moore J. Optimal Control: Linear Quadratic Methods. Prentice-Hall, 1990
9. Franklin G, Powell J, Workman M. Digital Control of Dynamic Systems. Addison-Wesley, 1990
10. Magni L, Bastin G, Wertz V. Multivariable nonlinear predictive control of cement mills. IEEE Trans Control Systems Technology 1998 (to appear)
11. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Networks 1989; 2: 359–366
12. Anderson B, Moore J. Optimal Filtering. Prentice-Hall, 1979
13. Targhetti W. Modellizzazione e controllo di un pendolo inverso [Modelling and control of an inverted pendulum]. Thesis, Dip. di Informatica e Sistemistica, University of Pavia, Italy, 1996/97 (in Italian)

Appendix

Locally stabilising control law

K = [92.63 8.67 97.53 36.50]

designed by solving an LQ problem with the Q and R matrices given by Eq. (8), based on the discrete-time linearised system

x(k+1) = Ax(k) + Bu(k)

with

A = [1.0000 0.0048 0.0001 0.0000
     0.0070 0.9187 0.0441 0.0001
     0.0001 0.0000 1.0001 0.0050
     0.0522 0.0120 0.0223 1.0001]

B = [0.0000
     0.0185
     0.0000
     0.0027]

Kalman filter gain
L = [0.3904 0.0004
     0.0037 0.0130
     0.0004 0.0288
     0.0320 0.0843]

derived from the following noisy discrete-time linearised system:

x(k+1) = Ax(k) + Bu(k) + ξ(k), x(0) = x̄
y(k) = Cx(k) + η(k)

where

C = [1 0 0 0
     0 0 1 0]

and x̄, ξ(k) and η(k) are assumed jointly Gaussian and mutually independent. Furthermore, x̄ ~ N(0,I), ξ(k) ~ WN(0,Q*), η(k) ~ WN(0,R*), with Q* = diag[0.0001 0 0 0] and R* = diag[0.0004 0.0004].
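Gains of this kind can be reproduced with standard tools. A minimal sketch using SciPy's discrete algebraic Riccati solver is given below; `dlqr` is our name for the helper, and it is shown on an illustrative second-order system rather than on the seesaw matrices. Applying it to the discretised model with the Q and R of Eq. (8) is the route [8] by which a gain such as K above is obtained.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def dlqr(A, B, Q, R):
    """Discrete-time LQ gain for u = -Kx minimising the IH quadratic cost:
    solve the discrete algebraic Riccati equation for P, then
    K = (R + B'PB)^{-1} B'PA."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

The Kalman gain solves the dual problem on (A′, C′) with the covariances Q* and R*; the exact expression for L depends on the predictor/filter convention adopted, so it is not repeated here.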
