Identification and Control of Nonlinear Systems Using Neural
Network Models: Design and Stability Analysis

Marios M. Polycarpou and Petros A. Ioannou
Department of Electrical Engineering-Systems
University of Southern California, MC-2563
Los Angeles, CA 90089-2563, U.S.A.
email: polycarp@bode.usc.edu, ioannou@bode.usc.edu

Technical Report 91-09-01

September 1991

Abstract
The feasibility of applying neural network learning techniques in problems of system identification and control has been demonstrated through several empirical studies. These studies are based for the most part on gradient techniques for deriving parameter adjustment laws. While such schemes perform well in many cases, in general, problems arise in attempting to prove stability of the overall system, or convergence of the output error to zero. This paper presents a stability theory approach to synthesizing and analyzing identification and control schemes for nonlinear dynamical systems using neural network models. The nonlinearities of the dynamical system are assumed to be unknown and are modelled by neural network architectures. Multilayer networks with sigmoidal activation functions and radial basis function networks are the two types of neural network models that are considered. These static network architectures are combined with dynamical elements, in the form of stable filters, to construct a type of recurrent network configuration which is shown to be capable of approximating a large class of dynamical systems. Identification schemes based on neural network models are developed using two different techniques, namely, the Lyapunov synthesis approach and the gradient method. Both identification schemes are shown to guarantee stability, even in the presence of modelling errors. A novel network architecture, referred to as dynamic radial basis function network, is derived and shown to be useful in problems dealing with learning in dynamic environments. For a class of nonlinear systems, a stable neural network based control configuration is presented and analyzed.
1 Introduction
Adaptive identification and control of dynamical systems has been an active area of research for the last three decades. Although methods for controlling linear time-invariant plants with unknown parameters had been pursued since the 1960s, it was not until the last decade that stable adaptive laws were established [1]-[4]. Recent advances in nonlinear control theory and, in particular, feedback linearization techniques [5, 6], have initiated activity aimed at developing adaptive control schemes for nonlinear plant models [7]-[10]. This area, which came to be known as adaptive nonlinear control, deals with systems where the uncertainty is due to unknown parameters which appear linearly with respect to known nonlinearities. Therefore, adaptive control research so far has been directed towards systems with a special class of parametric uncertainties.
The emergence of the neural network paradigm as a powerful tool for learning complex mappings from a set of examples has generated a great deal of excitement in using neural network models for identification and control of dynamical systems with unknown nonlinearities [11, 12]. Due to their approximation capabilities as well as their inherent adaptivity features, artificial neural networks present a potentially appealing alternative for modelling nonlinear systems. Furthermore, from a practical perspective, the massive parallelism and fast adaptability of neural network implementations provide more incentives for further investigating the connectionist approach in problems involving dynamical systems with unknown nonlinearities.
The feasibility of applying neural network architectures in identification and control has been demonstrated, through simulations, by several studies, including [13]-[17]. The problem was originally formulated by Narendra et al. [13], whose work instigated further research in this area. For the most part, these studies are based on first replacing unknown functions in the difference equation by static neural networks and then, according to a quadratic cost function, deriving update laws using optimization methods. A particular optimization procedure that has attracted a lot of attention is the steepest descent, or gradient method, which leads to static backpropagation or dynamic backpropagation-type algorithms [18]-[20], depending on whether or not the dynamical behavior of the system is taken into account. While such schemes perform well in many cases, in general, problems arise in attempting to prove stability of the overall system, or convergence of the output error to zero. Interestingly, methods based on straightforward application of optimization techniques (such as sensitivity models [21] and the M.I.T. rule [22]), which dominated the early adaptive linear control literature, exhibited similar stability problems. The fact that, even for linear systems, such methods can lead to instability was shown in [23, 24] (see also [25]).
In this paper we present a design procedure, based on stability theory, for modelling, identification and adaptive control of continuous-time nonlinear dynamical systems using neural network architectures. Particular emphasis is placed on the synthesis and stability analysis of the proposed schemes. The techniques developed here share some fundamental features with the parametric methods of both adaptive nonlinear control and adaptive linear control theory. Therefore, like earlier works [26, 13, 20, 27], this study also serves as an attempt to unify learning methods used by connectionists and adaptive control theorists. More specifically, one of the objectives of the paper is the development of stable neural network learning techniques for dynamic environments.
Our approach in this paper is based on continuous-time system representations. In particular, the plant dynamics are considered to evolve in continuous time and, furthermore, adaptive laws based on continuous adjustment of the weights are developed and analyzed. Due to the convenient "tapped delay line" representation, most of the current literature on identification and control of nonlinear systems using neural networks is developed in a discrete-time framework with iterative weight update rules. However, as also pointed out in [28], studying the problem in discrete time has several drawbacks. It is known that discretization of physical continuous-time nonlinear systems yields highly complex discrete-time models [6]. More precisely, discretization of affine (in the input) continuous-time systems results in non-affine discrete-time models which are almost impossible to analyze. The approach usually taken by nonlinear control theorists is to design a control system based on a continuous-time plant model and controller and then study the effect of sampling on the overall system [29]. Furthermore, from a practical perspective, if the neural network approach to control is to achieve its full potential, then it will be through parallel hardware realizations of neural network architectures implemented for real-time control systems. The current trend towards analog realizations of neural networks suggests considering identification and control schemes based on continuous-time systems with continuous-time adaptation.
From a theoretic point of view, neural networks can be considered as versatile mappings whose response to a specific input or pattern is determined by the values of adjustable weights. In our analysis we consider two types of neural network architectures: 1) multilayer neural network models with sigmoidal nonlinearities and 2) radial basis function networks with Gaussian activation functions. Multilayer neural networks [30] are by far the most widely used neural network models. They have been applied successfully to a wide range of problems including pattern and speech recognition, image compression, signal prediction and classification [31]. Due to some desirable features such as local adjustment of the weights and mathematical tractability, radial basis function networks have recently also attracted considerable attention, especially in applications dealing with prediction and classification [32]-[34]. The importance of radial basis function networks has also greatly benefitted from the work of Poggio et al. [35]-[37], where the relationship between regularization theory and radial basis function networks is explored.
The approximation capabilities of static sigmoidal-type networks and of radial basis function networks have been studied by several research groups (see, for example, [38]-[40]). In Section 2 we use these results to show that a proposed combination of static neural networks with dynamical components, such as stable filters, forms a type of recurrent network capable of approximating a large class of dynamical systems. More precisely, it is shown that there exists a set of weights such that for a given input, the outputs of the real system and the proposed recurrent neural network model remain arbitrarily close over a finite interval of time.
System identification consists of first choosing an appropriate identification model and then adjusting the parameters of the model according to some adaptive law such that the response of the model to an input signal approximates the response of the real system to the same input. Since a mathematical characterization of a system is often a prerequisite to analysis and controller design, system identification is important not only for understanding and predicting the behavior of the process, but also for designing effective control laws. In Section 3 we develop and analyze two different neural network based identification schemes. The first scheme relies on the Lyapunov synthesis approach while the second is derived based on optimization methods. In this context, we introduce a novel architecture, referred to as dynamic radial basis function network, which is shown to be useful for applying optimization techniques in problems dealing with learning in dynamic environments. In Section 4, we consider the problem of controlling simple classes of nonlinear systems. Due to the intricacies of the problem, we first discuss in detail three types of mechanisms that can lead to instability, namely, the parameter drift, controllability and transient behavior problems, and propose modifications to the standard control and adaptive laws for dealing with these problems. Based on these modifications, we prove stability of the overall control system. Finally, in Section 5 we discuss the contribution of the paper and draw some final conclusions. Due to the length of the paper, simulation results are omitted.

2 Modelling of Dynamical Systems by Neural Networks


One of the basic requirements in using neural network¹ architectures to represent, identify and control nonlinear dynamical systems is the capability of these architectures to accurately model the behavior of a large class of dynamical systems that are encountered in science and engineering problems. This leads to the question of whether a given neural network configuration is able to approximate, in some appropriate sense, the input-output response of a class of dynamical systems. In order to be able to approximate the behavior of a dynamical system, it is clear that any proposed neural network configuration must have some feedback connections. In the neural network literature such networks are known as recurrent networks. The input-output response of neural networks, whether static or recurrent, is determined by the values of a set of parameters which are referred to as weights. Therefore the representational capabilities of a given network depend on whether there exists a set of weight values such that the neural network configuration approximates the behavior of a given dynamical system. The terms "weights" and "parameters" are used interchangeably throughout the paper.

In this section we consider the problem of constructing a neural network architecture that is capable of approximating the behavior of continuous-time dynamical systems, whose input-state-output representation is described by

$$\dot{x}(t) = f(x(t), u(t)), \qquad x(0) = x_0 \tag{2.1}$$
$$y(t) = h(x(t), u(t))$$

¹The term "neural network" is used very loosely since some of the architectures considered in this paper have very little, if any, relation to biological neurons. In this context, "neural networks" are better interpreted as versatile mappings represented by the composition of many basic functions structured in a parallel fashion.
Figure 1: A recurrent neural network configuration for modelling the general dynamical system described by (2.1).

where $u \in \mathbb{R}^m$ is the input, $x \in \mathbb{R}^n$ is the state, $y \in \mathbb{R}^p$ is the output and $t \in \mathbb{R}^+$ is the temporal variable. The input $u$ belongs to a class $\mathcal{U}$ of (piecewise continuous) admissible inputs. By adding and subtracting $Ax$, where $A$ is a Hurwitz or stability matrix (i.e., has all of its eigenvalues in the open left-half complex plane), (2.1) becomes

$$\dot{x} = Ax + g(x, u), \qquad y = h(x, u) \tag{2.2}$$

where $g(x, u) := f(x, u) - Ax$. Based on (2.2), we construct a recurrent network model by replacing the mappings $g$ and $h$ by feedforward (static) neural network architectures, denoted by $N_1$ and $N_2$ respectively. Therefore we consider the model

$$\dot{\hat{x}} = A\hat{x} + \hat{g}(\hat{x}, u; \theta_g), \qquad \hat{x}(0) = \hat{x}_0 \tag{2.3}$$
$$\hat{y} = \hat{h}(\hat{x}, u; \theta_h)$$

where $\hat{g}$ and $\hat{h}$ are the outputs of the static neural networks $N_1$ and $N_2$ respectively, while $\theta_g$ and $\theta_h$ denote the adjustable weights of these networks. In (2.3), $\hat{x}$ and $\hat{y}$ denote the state and output respectively of the recurrent network model.

Corresponding to the Hurwitz matrix $A$, we let $W(s) := (sI - A)^{-1}$ be an $n \times n$ matrix whose elements are stable transfer functions, where $s$ denotes the differential (Laplace) operator. Based on this definition of $W(s)$ as a stable filter, a block diagram representation of the recurrent network model described by (2.3) is depicted in Figure 1. This interconnection of static neural nets and dynamic components is proposed for modelling the input-output response of the general dynamical system described by (2.1). If we suppose that the real system and the proposed model are initially at the same state (i.e., $\hat{x}_0 = x_0$), then the natural question to ask is whether there exist weights $\theta_g^*$, $\theta_h^*$ such that the input-output behavior ($u \mapsto \hat{y}$) of the neural network model (2.3) approximates, in some sense, the input-output behavior ($u \mapsto y$) of the real system (2.1). This leads to the validity of the proposed model.
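As a concrete illustration of this configuration, the following minimal sketch (our addition, not part of the original report) simulates the model (2.3) by forward-Euler integration; the callables `N1` and `N2` are hypothetical stand-ins for any static networks satisfying (N1), (N2), and `A` must be chosen Hurwitz.

```python
import numpy as np

def simulate_recurrent_model(N1, N2, A, u_seq, x_hat0, dt):
    """Forward-Euler simulation of the recurrent model (2.3):
       x_hat' = A x_hat + N1(x_hat, u),   y_hat = N2(x_hat, u).
    N1, N2: callables (x, u) -> array; A: Hurwitz matrix; u_seq: input samples.
    """
    x_hat = np.asarray(x_hat0, dtype=float)
    y_hat = []
    for u in u_seq:
        y_hat.append(N2(x_hat, u))                       # output network N2
        x_hat = x_hat + dt * (A @ x_hat + N1(x_hat, u))  # state fed through W(s)
    return np.array(y_hat)

# Example choice: A = -np.eye(n) is Hurwitz, so W(s) = (sI - A)^{-1} is stable.
```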
In examining this question, we will impose the following mild assumptions on the system
to be approximated:
Figure 2: Block diagram representation of a two-layer sigmoidal neural network.

(S1) Given a class $\mathcal{U}$ of admissible inputs, then for any $u \in \mathcal{U}$ and any finite initial condition $x_0$, the state and output trajectories do not escape to infinity in finite time, i.e., for any finite $T > 0$ we have $|x(T)| + |y(T)| < \infty$.

(S2) The vector fields $f : \mathbb{R}^{n+m} \mapsto \mathbb{R}^n$ and $h : \mathbb{R}^{n+m} \mapsto \mathbb{R}^p$ are continuous with respect to their arguments. Furthermore, $f$ satisfies a local Lipschitz condition so that the solution $x(t)$ to the differential equation (2.1) is unique for any finite initial condition $x_0$ and $u \in \mathcal{U}$.

The above assumptions are required in order to guarantee that the solution to the system described by (2.1) exists and is unique for any finite initial condition $x_0$ and any admissible input $u \in \mathcal{U}$.

We will also assume that the static neural network topologies $N_i$, $i = 1, 2$, that are used to represent the mappings $g$ and $h$ satisfy the following conditions:

(N1) Given a positive constant $\varepsilon$ and a continuous function $f : \mathcal{C} \mapsto \mathbb{R}^p$, where $\mathcal{C} \subset \mathbb{R}^q$ is a compact set, there exists a weight vector $\theta = \theta^*$ such that the output $\hat{f}(X; \theta)$ of the neural network architecture $N_i$ with $n^*$ nodes (where $n^*$ may depend on $\varepsilon$ and $f$) satisfies
$$\max_{X \in \mathcal{C}} \left|\hat{f}(X; \theta^*) - f(X)\right| \le \varepsilon$$

(N2) The output $\hat{f}(X; \theta)$ of the neural network architecture $N_i$ is continuous with respect to its arguments for all finite $(X; \theta)$.
We next describe two popular neural network architectures that satisfy conditions (N1), (N2).
(a) Multilayer sigmoidal neural networks: Multilayer neural networks with sigmoidal type of nonlinearities are by far the most widely used neural network models [30]. From a theoretic point of view, multilayer neural networks may be considered as versatile maps whose response to a specific input is determined by the values of adjustable weights. The input-output behavior ($x \mapsto y$) of a two-layer sigmoidal neural network² (shown in Figure 2) with $m$ inputs, $n$ outputs and $n^*$ hidden units or neurons is described (through the intermediate $n^*$-dimensional vectors $z$, $\bar{z}$) by

²In this paper we use the notational convention that a $k$-layer network consists of $k - 1$ hidden layers.

Figure 3: Block diagram representation of a Radial Basis Function neural network.

$$z = A_1 x + b$$
$$\bar{z}_i = \sigma(z_i), \qquad i = 1, 2, \ldots, n^*$$
$$y = A_2 \bar{z}$$

where $x \in \mathbb{R}^m$ is the network input, $z \in \mathbb{R}^{n^*}$ is the input to the sigmoidal nonlinearities, $\bar{z} \in \mathbb{R}^{n^*}$ is the output of the hidden layer and $y \in \mathbb{R}^n$ is the network output; the adjustable weights of the network are the elements of $A_1 \in \mathbb{R}^{n^* \times m}$, $A_2 \in \mathbb{R}^{n \times n^*}$, $b \in \mathbb{R}^{n^*}$. The weight vector $b$ is known as the offset or bias weight vector. In order to satisfy (N2), the sigmoidal nonlinearity $\sigma : \mathbb{R} \mapsto \mathbb{R}$ should be continuous. The hyperbolic tangent and the logistic function are examples of continuous sigmoids that are often used in neural network applications [31]. It is well known (see e.g., [38, 39, 41] and references therein) that if $n^*$ is large enough then there exist weight values $A_1^*$, $A_2^*$, $b^*$ such that the above two-layer neural network can approximate any continuous function $f(x)$ to any degree of accuracy over a compact set. This holds for any function $\sigma$ that is nonconstant, bounded and nondecreasing. Therefore multilayer neural networks with as few as two layers satisfy (N1) and hence can be used to model continuous nonlinearities in dynamical systems. Furthermore, it has been shown [42] that three-layer neural networks (i.e., with two hidden layers) are capable of approximating discontinuous functions. Therefore, increasing the number of layers results in a natural enlargement of the class of functions that can be approximated.
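The map above is straightforward to state in code. The following sketch (our addition, not from the report) implements the two-layer network with the logistic function, one of the continuous sigmoids mentioned:

```python
import numpy as np

def two_layer_net(x, A1, b, A2):
    """y = A2 * sigma(A1 x + b): two-layer sigmoidal network (one hidden layer).
    Shapes: A1 (n_star, m), b (n_star,), A2 (n, n_star)."""
    z = A1 @ x + b                       # input to the sigmoidal nonlinearities
    z_bar = 1.0 / (1.0 + np.exp(-z))     # hidden-layer output, z_bar = sigma(z)
    return A2 @ z_bar                    # network output
```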
(b) Radial basis function neural networks: Radial Basis Function (RBF) networks were introduced to the neural network literature by Broomhead et al. [43] and have since gained significance in the field due to several application and theoretical results [32, 33, 35]. Recently, RBF networks have also been considered in adaptive control of nonlinear dynamical systems [28]. The input-output response ($x \mapsto y$) of a RBF neural network (shown in Figure 3) with $m$ inputs, $n$ outputs and $n^*$ hidden, or kernel units, is characterized by

$$\xi_i = g(|x - c_i| / \sigma_i), \qquad i = 1, 2, \ldots, n^*$$
$$y = A\xi$$

where $x \in \mathbb{R}^m$ is the input, $\xi \in \mathbb{R}^{n^*}$ is the output of the hidden layer, $y \in \mathbb{R}^n$ is the output of the network; $A \in \mathbb{R}^{n \times n^*}$ is the weight matrix, while $c_i \in \mathbb{R}^m$ and $\sigma_i > 0$ are the center and width (or smoothing factor) of the $i$th kernel unit respectively. The Euclidean or a weighted Euclidean norm $|\cdot|$ is often used. The continuous function $g : [0, \infty) \mapsto \mathbb{R}$ is the activation function, which is usually chosen to be the Gaussian function $g(\zeta) := e^{-\zeta^2}$. Depending on the application, the centers $c_i$ and/or widths $\sigma_i$ of the network can either be adjustable (during learning) or they can be fixed. For example, in control applications it is very crucial for analytical purposes to choose $c_i$, $\sigma_i$ a priori according to some preliminary training or an ad-hoc procedure and keep these values fixed during the learning phase. By doing so, the nonlinearities $g(\cdot)$ appear linearly with respect to the adjustable weights, which, as we will see later, simplifies the analysis considerably. It has recently been shown that under mild assumptions RBF neural networks are capable of universal approximation, i.e., approximation of any continuous function over a compact set to any degree of accuracy [40, 44]. Therefore, Gaussian RBF networks also satisfy (N1), (N2) and hence they are candidates for modelling nonlinearities of dynamical systems.
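For comparison, a minimal sketch (again our addition) of the Gaussian RBF map with fixed, pre-selected centers and widths, so that the output is linear in the adjustable weight matrix $A$:

```python
import numpy as np

def rbf_net(x, centers, widths, A):
    """y = A xi(x) with xi_i(x) = exp(-|x - c_i|^2 / sigma_i^2).
    centers: (n_star, m) fixed c_i; widths: (n_star,) fixed sigma_i;
    A: (n, n_star) adjustable weights."""
    r = np.linalg.norm(centers - x, axis=1)   # Euclidean distances |x - c_i|
    xi = np.exp(-(r / widths) ** 2)           # Gaussian kernel outputs
    return A @ xi                             # output is linear in A
```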
Remark 2.1: In addition to sigmoidal and RBF neural networks, there are, of course, other representations that can be used for approximating static maps. Considerably more studies, both of a theoretic and of a practical nature, need to be performed before it is clear which architecture is best for approximating different classes of functions. Practical issues such as cost of hardware implementation, flexibility in increasing the number of nodes, and speed of weight adjustment, as well as theoretic considerations such as the number of parameters required in the approximation and robustness, are very important in choosing an appropriate network design.
Using Assumptions (S1)-(S2) and (N1)-(N2), the following theorem establishes the capability of the proposed recurrent network architecture depicted in Figure 1 to approximate the behavior of the real system over a finite interval of time.

Theorem 1 Suppose $x(0) = \hat{x}(0) = x_0$ and $u \in \mathcal{U} \subset \mathbb{R}^m$, where $\mathcal{U}$ is some compact set. Then given $\varepsilon > 0$ and a finite $T > 0$, there exist weight values $\theta_g^*$, $\theta_h^*$ such that for all $u \in \mathcal{U}$ the outputs of the real system and the recurrent neural network model satisfy
$$\max_{t \in [0, T]} |y(t) - \hat{y}(t)| \le \varepsilon$$

The proof of Theorem 1 is given in Appendix A and relies on standard techniques from the theory of ordinary differential equations (see, for example, [45]).
Based on the above result, we will assume in the subsequent sections that the nonlinear dynamical system to be identified and/or controlled is represented by a recurrent network configuration with static neural networks replacing the unknown nonlinearities. In this framework, the real system is parametrized by neural network models with known underlying structure and unknown parameters or weights. In order to accommodate modelling inaccuracies arising, for example, from having an insufficient number of adjustable weights, we will allow the presence of modelling errors, which appear as additive disturbances in the differential equation representing the system model.
Figure 4: A general configuration for identification of nonlinear dynamical systems based on the series-parallel model.

3 Identification
In this section we consider the identification of nonlinear systems of the form

$$\dot{x} = f(x) + g(x)u \tag{3.1}$$

where $u \in \mathbb{R}$ is the input, $x \in \mathbb{R}^n$ is the state, which is assumed to be available for measurement, and $f$, $g$ are smooth vector fields defined on an open set of $\mathbb{R}^n$. The above class of continuous-time nonlinear systems are called affine systems because in Equation (3.1) the control input $u$ appears linearly with respect to $g$. There are several reasons for considering this class of nonlinear systems. First, most of the systems encountered in engineering are by nature (or design) affine systems. Secondly, most of the nonlinear control techniques, including feedback linearization, are developed for affine systems. Finally, we note that non-affine systems (as described by (2.1)) can be converted to affine systems by passing the input through integrators [6]. This procedure is known as dynamic extension.
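To illustrate dynamic extension with a brief worked example (our addition, not in the original text): for the scalar non-affine system $\dot{x} = f(x, u)$, treating the input $u$ as an additional state driven by a new input $v := \dot{u}$ yields the augmented system

$$\begin{bmatrix} \dot{x} \\ \dot{u} \end{bmatrix} = \begin{bmatrix} f(x, u) \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} v$$

which is affine in the new control $v$, at the cost of one additional state introduced by the integrator.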
The problem of identification consists of choosing an appropriate identification model and adjusting the parameters of the model according to some adaptive law such that the response $\hat{x}$ of the model to an input signal $u$ (or a class of input signals) approximates the response $x$ of the real system to the same input. Since a mathematical characterization of a system is often a prerequisite to analysis and controller design, system identification is important not only for understanding and predicting the behavior of the system, but also for obtaining an effective control law. In this paper we consider identification schemes that are based on the setting shown in Figure 4, which is known as the series-parallel configuration [13].

As is common in identification procedures, we will assume that the state $x(t)$ is bounded for all admissible bounded inputs $u(t)$. Note that even though the real system is bounded-input bounded-state (BIBS) stable, there is no a priori guarantee that the output $\hat{x}$ of the identification model or that the adjustable parameters in the model will remain bounded.
Stability of the overall scheme depends on the particular identification model that is used as well as on the parameter adjustment rules that are chosen. This section is concerned with the development of identification models, based on sigmoidal and RBF neural networks, and the derivation of adaptive laws that guarantee stability of the overall identification structure.
Following the results of Section 2, the unknown nonlinearities $f(x)$ and $g(x)$ are parametrized by static neural networks with outputs $\hat{f}(x; \theta_f)$ and $\hat{g}(x; \theta_g)$ respectively, where $\theta_f \in \mathbb{R}^{n_f}$, $\theta_g \in \mathbb{R}^{n_g}$ are the adjustable weights and $n_f$, $n_g$ denote the number of weights in the respective neural network approximation of $f$ and $g$. By adding and subtracting the terms $\hat{f}$ and $\hat{g}u$, the nonlinear system described by (3.1) is rewritten as

$$\dot{x} = \hat{f}(x; \theta_f^*) + \hat{g}(x; \theta_g^*)u + \left[f(x) - \hat{f}(x; \theta_f^*)\right] + \left[g(x) - \hat{g}(x; \theta_g^*)\right]u \tag{3.2}$$

where $\theta_f^*$, $\theta_g^*$ denote the optimal weight values (in the $L_\infty$-norm sense) in the approximation of $f(x)$ and $g(x)$ respectively, for $x$ belonging to a compact set $\mathcal{X} \subset \mathbb{R}^n$. For a given class of bounded input signals $u$, the set $\mathcal{X}$ is such that it contains all possible trajectories $x(t)$. While being aware of its existence, in our analysis we do not need to know the region $\mathcal{X}$.

Although the "optimal" weights $\theta_f^*$, $\theta_g^*$ in (3.2) could take arbitrarily large values, from a practical perspective we are interested only in weights that belong to a (large) compact set. Therefore we will consider "optimal" weights $\theta_f^*$, $\theta_g^*$ that belong to the convex compact sets $\mathcal{B}(M_f)$, $\mathcal{B}(M_g)$ respectively, where $M_f$, $M_g$ are design constants and $\mathcal{B}(M) := \{\theta : |\theta| \le M\}$ denotes a ball of radius $M$. In the adaptive law, the estimates of $\theta_f^*$, $\theta_g^*$, which are the adjustable weights in the approximation networks, are also restricted to $\mathcal{B}(M_f)$, $\mathcal{B}(M_g)$ respectively, through the use of a projection algorithm. By doing so, we avoid any numerical problems that may arise due to having weight values that are too large; furthermore, the projection algorithm prevents the weights from drifting to infinity, which, as will be apparent later, is a phenomenon that may occur with standard adaptive laws.
To summarize, the optimal weight vector $\theta_f^*$ is defined as the element in $\mathcal{B}(M_f)$ that minimizes $|f(x) - \hat{f}(x; \theta_f)|$ for $x \in \mathcal{X} \subset \mathbb{R}^n$; i.e.,

$$\theta_f^* := \arg\min_{\theta_f \in \mathcal{B}(M_f)} \left\{ \sup_{x \in \mathcal{X}} \left| f(x) - \hat{f}(x; \theta_f) \right| \right\} \tag{3.3}$$

Similarly, $\theta_g^*$ is defined as

$$\theta_g^* := \arg\min_{\theta_g \in \mathcal{B}(M_g)} \left\{ \sup_{x \in \mathcal{X}} \left| g(x) - \hat{g}(x; \theta_g) \right| \right\} \tag{3.4}$$

Finally, it is noted that if the optimal weights are not unique then $\theta_f^*$ (and correspondingly $\theta_g^*$) denotes an arbitrary (but fixed) element of the set of optimal weights.
Equation (3.2) is now expressed in compact form as

$$\dot{x} = \hat{f}(x; \theta_f^*) + \hat{g}(x; \theta_g^*)u + \nu(t) \tag{3.5}$$

where $\nu(t)$ denotes the modelling error, defined as

$$\nu(t) := \left[f(x(t)) - \hat{f}(x(t); \theta_f^*)\right] + \left[g(x(t)) - \hat{g}(x(t); \theta_g^*)\right]u(t)$$

The modelling error $\nu(t)$ is bounded by a constant $\nu_0$, where

$$\nu_0 := \sup_{t \ge 0} \left| \left[f(x(t)) - \hat{f}(x(t); \theta_f^*)\right] + \left[g(x(t)) - \hat{g}(x(t); \theta_g^*)\right]u(t) \right|$$
Since by assumption $u(t)$ and $x(t)$ are bounded, the constant $\nu_0$ is finite. The value of $\nu_0$ depends on many factors, such as the type of neural network that is used, the number of weights and layers, as well as the "size" of the compact sets $\mathcal{X}$, $\mathcal{B}(M_f)$, $\mathcal{B}(M_g)$. For example, the constraint that the optimal weights $\theta_f^*$, $\theta_g^*$ belong to the sets $\mathcal{B}(M_f)$ and $\mathcal{B}(M_g)$ respectively may increase the value of $\nu_0$. However, if the constants $M_f$ and $M_g$ are large then any increase will be very small. In general, if the networks $\hat{f}$ and $\hat{g}$ are constructed appropriately then $\nu_0$ will be a small number. Unfortunately, at present, the factors that influence how well a network is constructed, such as the number of weights and number of layers, are chosen for the most part by trial and error or other ad-hoc techniques. Therefore, a very attractive feature of our synthesis and analysis procedure is that we do not need to know the value of $\nu_0$.

By replacing the unknown nonlinearities with feedforward neural network models, we have essentially rewritten the system (3.1) in the form (3.5), where the parameters $\theta_f^*$, $\theta_g^*$ and the modelling error $\nu(t)$ are unknown, but the underlying structure of $\hat{f}$ and $\hat{g}$ is known. Based on (3.5), we next develop and analyze various types of identification schemes using both Gaussian RBF networks and multilayer network models with sigmoidal nonlinearities.
3.1 RBF Network Models
We first consider the case where the network architectures employed for modelling $f$ and $g$ are RBF networks. Therefore the functions $\hat{f}$ and $\hat{g}$ in (3.5) take the form

$$\hat{f} = W_1^* \xi(x), \qquad \hat{g} = W_2^* \zeta(x) \tag{3.6}$$

where $W_1^*$, $W_2^*$ are $n \times n_1$ and $n \times n_2$ matrices respectively, representing, in the spirit of (3.3), (3.4), the optimal weight values, subject to the constraints $\|W_1^*\|_F \le M_1$, $\|W_2^*\|_F \le M_2$. The norm $\|\cdot\|_F$ denotes the Frobenius matrix norm [46], defined as $\|A\|_F^2 := \sum_{ij} |a_{ij}|^2 = \mathrm{tr}\{AA^T\}$, where $\mathrm{tr}\{\cdot\}$ denotes the trace of a matrix. The constants $n_1$, $n_2$ are the number of kernel units in each approximation, and the vector fields $\xi(x) \in \mathbb{R}^{n_1}$, $\zeta(x) \in \mathbb{R}^{n_2}$, which we refer to as regressors, are Gaussian type of functions, defined element-wise as

$$\xi_i(x) = e^{-|x - c_{1i}|^2 / \sigma_{1i}^2}, \qquad i = 1, 2, \ldots, n_1$$
$$\zeta_j(x) = e^{-|x - c_{2j}|^2 / \sigma_{2j}^2}, \qquad j = 1, 2, \ldots, n_2$$

For the analysis, it is crucial that the centers $c_{1i}$, $c_{2j}$ and widths $\sigma_{1i}$, $\sigma_{2j}$, $i = 1, \ldots, n_1$, $j = 1, \ldots, n_2$, are chosen a priori. By doing so, the only adjustable weights are $W_1$, $W_2$, which appear linearly with respect to the nonlinearities $\xi$ and $\zeta$ respectively. Based on "local tuning" training techniques, several researchers have suggested methods for appropriately choosing the centers and widths of the radial basis functions [32, 47]. In this paper, we will simply assume that $c_{1i}$, $c_{2j}$, $\sigma_{1i}$, $\sigma_{2j}$ are chosen a priori and kept fixed during adaptation of $W_1$, $W_2$.
Generally, in the problem of identification of nonlinear dynamical systems one is usually interested in obtaining an accurate model in a (possibly large) neighborhood $\mathcal{N}(x_0)$ of an equilibrium point $x = x_0$, and therefore it is intuitively evident that the centers should be clustered around $x_0$ in this neighborhood. Clearly, the number of kernel units and the position of the centers and widths will affect the approximation capability of the model and consequently the value of the modelling error $\nu(t)$. Hence, current and future research dealing with effectively choosing these quantities is also relevant to the topics discussed in this paper.

By substituting (3.6) in (3.5) we obtain

$$\dot{x} = W_1^* \xi(x) + W_2^* \zeta(x)u + \nu \tag{3.7}$$

Based on the RBF network model described by (3.7), we next develop parameter update laws for stable identification using various techniques derived from the Lyapunov synthesis approach and also basic optimization methods.
3.1.1 Lyapunov Synthesis Method
The RBF network model (3.7) is rewritten in the form

$$\dot{x} = -\alpha x + \alpha x + W_1^* \xi(x) + W_2^* \zeta(x)u + \nu \tag{3.8}$$

where $\alpha > 0$ is a scalar (design) constant. Based on (3.8) we consider the identification model

$$\dot{\hat{x}} = -\alpha \hat{x} + \alpha x + W_1 \xi(x) + W_2 \zeta(x)u \tag{3.9}$$

where $W_1$, $W_2$ are the estimates of $W_1^*$, $W_2^*$ respectively, while $\hat{x}$ is the output of the identification model. The identification model (3.9), which we refer to as the RBF error filtering model, is similar to estimation schemes developed in [9]. The RBF error filtering model is depicted in Figure 5. As can be seen from the figure, this identification model consists of two RBF network architectures in parallel and $n$ first-order stable filters $h(s) = 1/(s + \alpha)$.

If we define $e_x := \hat{x} - x$, the state error, and $\phi_1 := W_1 - W_1^*$, $\phi_2 := W_2 - W_2^*$, the weight estimation errors, then from (3.8), (3.9) we obtain the error equation

$$\dot{e}_x = -\alpha e_x + \phi_1 \xi(x) + \phi_2 \zeta(x)u - \nu \tag{3.10}$$

The Lyapunov synthesis method consists of choosing an appropriate Lyapunov function candidate $V$ and selecting weight adaptive laws so that the time derivative $\dot{V}$ satisfies $\dot{V} \le 0$. The Lyapunov method as a technique for deriving stable adaptive laws can be traced as far back as the 1960s, in the early literature of adaptive control theory for linear systems [23, 48].
Figure 5: A block diagram representation of the error filtering identification model developed using RBF networks.

In our case, an adaptive law for generating the parameter estimates $W_1(t)$, $W_2(t)$ is developed by considering the Lyapunov function candidate

$$V(e_x, \phi_1, \phi_2) = \frac{1}{2}|e_x|^2 + \frac{1}{2\gamma_1}\|\phi_1\|_F^2 + \frac{1}{2\gamma_2}\|\phi_2\|_F^2 = \frac{1}{2}e_x^T e_x + \frac{1}{2\gamma_1}\mathrm{tr}\left\{\phi_1\phi_1^T\right\} + \frac{1}{2\gamma_2}\mathrm{tr}\left\{\phi_2\phi_2^T\right\} \tag{3.11}$$

where $\gamma_1$, $\gamma_2$ are positive constants. These constants will appear in the adaptive laws and are referred to as learning rates or adaptive gains.

Using (3.10), the time derivative of $V$ in (3.11) is expressed as

$$\dot{V} = -\alpha e_x^T e_x + \xi^T\phi_1^T e_x + \frac{1}{\gamma_1}\mathrm{tr}\left\{\dot{\phi}_1\phi_1^T\right\} + \zeta^T\phi_2^T e_x u + \frac{1}{\gamma_2}\mathrm{tr}\left\{\dot{\phi}_2\phi_2^T\right\} - e_x^T\nu$$

Using properties of the trace, such as

$$\xi^T\phi_1^T e_x = \mathrm{tr}\left\{\xi^T\phi_1^T e_x\right\} = \mathrm{tr}\left\{e_x\xi^T\phi_1^T\right\}$$

we obtain

$$\dot{V} = -\alpha|e_x|^2 + \mathrm{tr}\left\{\left(e_x\xi^T + \frac{1}{\gamma_1}\dot{\phi}_1\right)\phi_1^T\right\} + \mathrm{tr}\left\{\left(e_x u\,\zeta^T + \frac{1}{\gamma_2}\dot{\phi}_2\right)\phi_2^T\right\} - e_x^T\nu \tag{3.12}$$

Since $W_1^*$, $W_2^*$ are constant, we have that $\dot{W}_1 = \dot{\phi}_1$ and $\dot{W}_2 = \dot{\phi}_2$. Therefore it is clear from (3.12) that if the parameter estimates $W_1$, $W_2$ are generated according to the adaptive laws

$$\dot{W}_1 = -\gamma_1 e_x\xi^T, \qquad \dot{W}_2 = -\gamma_2 e_x u\,\zeta^T \tag{3.13}$$
then (3.12) becomes

$$\dot{V} = -\alpha|e_x|^2 - e_x^T\nu \le -\alpha|e_x|^2 + \nu_0|e_x| \tag{3.14}$$

If there is no modelling error (i.e., $\nu_0 = 0$), then from (3.14) we have that $\dot{V}$ is negative semidefinite; hence stability of the overall identification scheme is guaranteed. However, in the presence of modelling error, if $|e_x| < \nu_0/\alpha$ then it is possible that $\dot{V} > 0$, which implies that the weights $W_1(t)$, $W_2(t)$ may drift to infinity with time. This problem, which is referred to as parameter drift [49], is well known in the adaptive control literature. Parameter drift has also been encountered in empirical studies of neural network learning, where it is usually referred to as weight saturation.
In order to avoid parameter drift, $W_1(t)$, $W_2(t)$ are confined to the sets $\{W_1 : \|W_1\|_F \le M_1\}$, $\{W_2 : \|W_2\|_F \le M_2\}$ respectively, through the use of a projection algorithm [50, 54]. In particular, the standard adaptive laws described by (3.13) are modified to

$$\dot{W}_1 = \begin{cases} -\gamma_1 e_x\xi^T & \text{if } \|W_1\|_F < M_1 \text{ or } \{\|W_1\|_F = M_1 \text{ and } e_x^T W_1\xi \ge 0\} \\ \mathcal{P}\left\{-\gamma_1 e_x\xi^T\right\} & \text{if } \|W_1\|_F = M_1 \text{ and } e_x^T W_1\xi < 0 \end{cases} \tag{3.15}$$

$$\dot{W}_2 = \begin{cases} -\gamma_2 e_x u\,\zeta^T & \text{if } \|W_2\|_F < M_2 \text{ or } \{\|W_2\|_F = M_2 \text{ and } e_x^T W_2\zeta u \ge 0\} \\ \mathcal{P}\left\{-\gamma_2 e_x u\,\zeta^T\right\} & \text{if } \|W_2\|_F = M_2 \text{ and } e_x^T W_2\zeta u < 0 \end{cases} \tag{3.16}$$

where $\mathcal{P}\{\cdot\}$ denotes the projection onto the supporting hyperplane, defined as

$$\mathcal{P}\left\{-\gamma_1 e_x\xi^T\right\} := -\gamma_1 e_x\xi^T + \gamma_1 \frac{e_x^T W_1\xi}{\|W_1\|_F^2} W_1 \tag{3.17}$$

$$\mathcal{P}\left\{-\gamma_2 e_x u\,\zeta^T\right\} := -\gamma_2 e_x u\,\zeta^T + \gamma_2 \frac{e_x^T W_2\zeta u}{\|W_2\|_F^2} W_2 \tag{3.18}$$
Therefore, if the initial weights are chosen such that $\|W_1(0)\|_F \le M_1$, $\|W_2(0)\|_F \le M_2$, then we have $\|W_1(t)\|_F \le M_1$, $\|W_2(t)\|_F \le M_2$ for all $t \ge 0$. This can be readily established by noting that whenever $\|W_1\|_F = M_1$ (and correspondingly for $\|W_2\|_F = M_2$), then

$$\frac{d}{dt}\left\{\|W_1(t)\|_F^2 - M_1^2\right\} \le 0$$

which implies that the parameter estimate is directed towards the inside or the surface of the ball $\{W_1 : \|W_1\|_F \le M_1\}$. It is worth noting that the projection modification causes the adaptive law to be discontinuous. However, the trajectory behavior on the discontinuity hypersurface is "smooth" and hence existence of a solution, in the sense of Carathéodory [45], is assured. The issue of existence and uniqueness of solutions in adaptive systems is treated in detail in [51].
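To make the scheme concrete, the following minimal sketch (our addition, not from the report) discretizes the error filtering model (3.9) and the projection-modified laws (3.15)-(3.18) by forward Euler; the callables `xi` and `zeta` and all step sizes are hypothetical choices, and the boundary test uses `>=` to absorb discretization overshoot.

```python
import numpy as np

def w_dot(W, e_x, reg, gamma, M):
    """Update direction for W1 or W2, cf. (3.15)-(3.18).
    reg is xi(x) for W1, or zeta(x)*u for W2."""
    d = -gamma * np.outer(e_x, reg)            # standard adaptive law (3.13)
    inner = e_x @ (W @ reg)                    # e_x^T W reg
    if np.linalg.norm(W) >= M and inner < 0:   # on the ball boundary, moving out
        d += gamma * inner / np.linalg.norm(W) ** 2 * W   # projection term (3.17)
    return d

def identify(x_traj, u_traj, xi, zeta, alpha, g1, g2, M1, M2, dt):
    """Euler simulation of model (3.9) with the adaptive laws (3.15), (3.16)."""
    n, n1, n2 = x_traj.shape[1], xi(x_traj[0]).size, zeta(x_traj[0]).size
    x_hat, W1, W2 = x_traj[0].copy(), np.zeros((n, n1)), np.zeros((n, n2))
    for x, u in zip(x_traj, u_traj):
        e_x = x_hat - x                        # state error
        W1 = W1 + dt * w_dot(W1, e_x, xi(x), g1, M1)
        W2 = W2 + dt * w_dot(W2, e_x, zeta(x) * u, g2, M2)
        x_hat = x_hat + dt * (-alpha * x_hat + alpha * x
                              + W1 @ xi(x) + (W2 @ zeta(x)) * u)
    return W1, W2
```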
With the adaptive laws (3.15), (3.16), Equation (3.12) becomes

$$\dot{V} = -\alpha|e_x|^2 - e_x^T\nu + I_1\,\mathrm{tr}\left\{\frac{e_x^T W_1\xi}{\|W_1\|_F^2} W_1\phi_1^T\right\} + I_2\,\mathrm{tr}\left\{\frac{e_x^T W_2\zeta u}{\|W_2\|_F^2} W_2\phi_2^T\right\}$$
$$\le -\alpha|e_x|^2 - e_x^T\nu + I_1 \frac{e_x^T W_1\xi}{\|W_1\|_F^2}\,\mathrm{tr}\left\{W_1\phi_1^T\right\} + I_2 \frac{e_x^T W_2\zeta u}{\|W_2\|_F^2}\,\mathrm{tr}\left\{W_2\phi_2^T\right\} \tag{3.19}$$
where $I_1$, $I_2$ are indicator functions defined as $I_1 = 1$ if the conditions $\|W_1\|_F = M_1$ and $e_x^T W_1\xi < 0$ are satisfied, and $I_1 = 0$ otherwise (and correspondingly for $I_2$). The following lemma establishes that the additional terms introduced by the projection can only make $\dot{V}$ more negative, which implies that the projection modification guarantees boundedness of the weights without affecting the rest of the stability properties established in the absence of projection.

Lemma 1 Based on the adaptive laws (3.15), (3.16) the following inequalities hold:

(i) $I_1 \left( e_x^T W_1\xi / \|W_1\|_F^2 \right) \mathrm{tr}\left\{W_1\phi_1^T\right\} \le 0$.

(ii) $I_2 \left( e_x^T W_2\zeta u / \|W_2\|_F^2 \right) \mathrm{tr}\left\{W_2\phi_2^T\right\} \le 0$.

The proof of Lemma 1 is given in Appendix B. Now, using Lemma 1, (3.19) becomes

$$\dot{V} \le -\alpha|e_x|^2 - e_x^T\nu \le -\alpha|e_x|^2 + \nu_0|e_x| \tag{3.20}$$
Based on (3.20), we next summarize the properties of the weight adaptive laws (3.15), (3.16). It is pointed out that the proof of the following theorem employs well-known techniques from the adaptive control literature. In the sequel, the notation $z \in L_2$ means $\int_0^\infty |z(t)|^2 dt < \infty$, while $z \in L_\infty$ means $\sup_{t \ge 0} |z(t)| < \infty$.

Theorem 2 Consider the error filtering identification scheme (3.9). The weight adaptive laws given by (3.15), (3.16) guarantee the following properties:

(a) For $\nu_0 = 0$ (no modelling error), we have
• $e_x, \hat{x}, \phi_1, \phi_2 \in L_\infty$, $e_x \in L_2$.
• $\lim_{t \to \infty} e_x(t) = 0$, $\lim_{t \to \infty} \dot{\phi}_1(t) = 0$, $\lim_{t \to \infty} \dot{\phi}_2(t) = 0$.

(b) For $\sup_{t \ge 0} |\nu(t)| \le \nu_0$ we have
• $e_x, \hat{x}, \phi_1, \phi_2 \in L_\infty$.
• there exist constants $k_1$, $k_2$ such that
$$\int_0^t |e_x(\tau)|^2 d\tau \le k_1 + k_2 \int_0^t |\nu(\tau)|^2 d\tau$$
Proof:
(a) With $\nu_0 = 0$, Equation (3.20) becomes

$$\dot{V} \le -\alpha|e_x|^2 \le 0 \tag{3.21}$$

Hence $V \in L_\infty$, which from (3.11) implies $e_x, \phi_1, \phi_2 \in L_\infty$. Furthermore, $\hat{x} = e_x + x$ is also bounded. Since $V$ is a non-increasing function of time and bounded from below, the limit $\lim_{t \to \infty} V(t) = V_\infty$ exists. Therefore, by integrating (3.21) from $0$ to $\infty$ we have

$$\int_0^\infty |e_x(\tau)|^2 d\tau \le \frac{1}{\alpha}\left[V(0) - V_\infty\right] < \infty$$

which implies that $e_x \in L_2$. By the definition of the Gaussian radial basis function, the regressor vectors $\xi(x)$ and $\zeta(x)$ are bounded for all $x$, and by assumption $u$ is also bounded. Hence from (3.10) we have that $\dot{e}_x \in L_\infty$. Since $e_x \in L_2 \cap L_\infty$ and $\dot{e}_x \in L_\infty$, using Barbalat's Lemma [3] we conclude that $\lim_{t \to \infty} e_x(t) = 0$. Now, using the boundedness of $\xi(t)$ and the convergence of $e_x(t)$ to zero, we have that $\dot{\phi}_1 = \dot{W}_1$ also converges to zero. Similarly, $\dot{\phi}_2 \to 0$ as $t \to \infty$.

(b) With the projection algorithm, it is guaranteed that $\|W_1\|_F \le M_1$, $\|W_2\|_F \le M_2$. Therefore the weight estimation errors are also bounded, i.e., $\phi_1, \phi_2 \in L_\infty$. From (3.20) it is clear that if $|e_x| > \nu_0/\alpha$ then $\dot{V} < 0$, which implies that $e_x \in L_\infty$ and consequently $\hat{x} \in L_\infty$. In order to prove the second part, we proceed to complete the square in (3.20):

$$\dot{V} \le -\frac{\alpha}{2}|e_x|^2 - \left[\frac{\alpha}{2}|e_x|^2 + e_x^T\nu\right] \le -\frac{\alpha}{2}|e_x|^2 + \frac{1}{2\alpha}|\nu|^2$$

Therefore, by integrating both sides and using the fact that $V \in L_\infty$ we obtain

$$\int_0^t |e_x(\tau)|^2 d\tau \le \frac{2}{\alpha}\left[V(0) - V(t)\right] + \frac{1}{\alpha^2}\int_0^t |\nu(\tau)|^2 d\tau \le k_1 + k_2 \int_0^t |\nu(\tau)|^2 d\tau$$

where $k_1 := \frac{2}{\alpha}\left[V(0) - \inf_{t \ge 0} V(t)\right]$ and $k_2 := 1/\alpha^2$. □
Remark 3.1: For notational simplicity the above identification scheme was developed with the filter pole $\alpha$ and the learning rates $\gamma_1$, $\gamma_2$ being scalars. It can be easily verified that the analysis is still valid if $-\alpha$ in (3.9) is replaced by a Hurwitz matrix $A \in \mathbb{R}^{n \times n}$ and $\gamma_1$, $\gamma_2$ in the parameter update laws are replaced by positive definite learning rate matrices $\Gamma_1 \in \mathbb{R}^{n_1 \times n_1}$, $\Gamma_2 \in \mathbb{R}^{n_2 \times n_2}$ respectively.

Remark 3.2: To ensure robustness of the identification scheme with respect to modelling errors, we have considered a projection algorithm, which prevents the weights from drifting to infinity. The stability of the proposed identification scheme in the presence of modelling errors can also be achieved by other modifications to the standard adaptive laws, such as the fixed or switching $\sigma$-modification [49, 52], the $\varepsilon$-modification [53] and the dead-zone [2, 3]. A comprehensive exposition of robust adaptive control theory for linear systems is given in [54].

Remark 3.3: Under the assumptions of Theorem 2 we cannot conclude anything about the convergence of the weights to their optimal values. In order to guarantee convergence, $\xi(x)$, $\zeta(x)u$ need to satisfy a persistency of excitation condition. A signal $z(t) \in \mathbb{R}^n$ is persistently exciting in $\mathbb{R}^n$ if there exist positive constants $\alpha_0$, $\alpha_1$, $T$ such that

$$\alpha_0 I \le \int_t^{t+T} z(\tau) z^T(\tau)\, d\tau \le \alpha_1 I \qquad \forall t \ge 0$$

In contrast to linear systems, where the persistency of excitation condition has been transformed into a condition on the input signal [2, 3], in nonlinear systems this condition cannot be verified a priori because the regressors are nonlinear functions of the state $x$.
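Although the condition cannot be verified a priori, it can be checked numerically along a recorded trajectory. A small sketch (our addition) estimating the tightest constants $\alpha_0$, $\alpha_1$ over sliding windows of length $T$:

```python
import numpy as np

def pe_bounds(z, T, dt):
    """Estimate alpha_0, alpha_1 in the PE condition of Remark 3.3 from samples
    z: (N, n) of the regressor. PE holds on the data if lo stays above zero."""
    k = int(round(T / dt))
    lo, hi = np.inf, 0.0
    for t in range(len(z) - k):
        G = dt * sum(np.outer(z[i], z[i]) for i in range(t, t + k))  # Gram integral
        w = np.linalg.eigvalsh(G)               # eigenvalues, ascending order
        lo, hi = min(lo, w[0]), max(hi, w[-1])
    return lo, hi
```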
3.1.2 Optimization Methods
In order to derive optimization-based weight adjustment rules that also guarantee stability of the overall scheme, we need to develop an identification model in which the output error $e_x$ is related to the weight estimation errors $\phi_1$, $\phi_2$ in a simple algebraic fashion. To achieve this objective we consider filtered forms of the regressors $\xi$, $\zeta$ and the state $x$. We start by rewriting (3.8) in the filter form

$$x = \frac{\alpha}{s + \alpha}[x] + W_1^* \frac{1}{s + \alpha}[\xi(x)] + W_2^* \frac{1}{s + \alpha}[\zeta(x)u] + \frac{1}{s + \alpha}[\nu] \tag{3.22}$$

where $s$ denotes the differential (Laplace) operator.³ As in Section 2, the notation $h(s)[z]$ is to be interpreted as the output of the filter $h(s)$ with $z$ as the input. Equation (3.22) is now expressed in the compact form

$$x = x_f + W_1^* \xi_f + W_2^* \zeta_f + \nu_f \tag{3.23}$$

where $x_f$, $\xi_f$, $\zeta_f$ are generated by filtering $x$, $\xi$, $\zeta u$ respectively:

$$\dot{x}_f = -\alpha x_f + \alpha x, \qquad x_f(0) = 0$$
$$\dot{\xi}_f = -\alpha \xi_f + \xi, \qquad \xi_f(0) = 0$$
$$\dot{\zeta}_f = -\alpha \zeta_f + \zeta u, \qquad \zeta_f(0) = 0$$

Since $\nu(t)$ is bounded by $\nu_0$, the filtered modelling error $\nu_f := \frac{1}{s+\alpha}[\nu]$ is another bounded disturbance signal, i.e., $|\nu_f(t)| \le \nu_0/\alpha$.

Based on (3.23), we consider the identification model

$$\hat{x} = x_f + W_1 \xi_f + W_2 \zeta_f \tag{3.24}$$

This model, which will be referred to as the regressor filtering identification model [9], is shown in Figure 6 in block diagram representation. The regressor filtering scheme requires $n + n_1 + n_2$ first-order filters, which is considerably more than the $n$ filters required in the error filtering scheme. As can be seen from Figure 6, the filters appear inside the RBF networks, forming a dynamic RBF network. This neural network architecture can be directly applied to modelling of dynamical systems in the same way that RBF networks are used as models of static mappings.
Using the regressor filtering scheme, the output error $e_x := \hat{x} - x$ satisfies

$$e_x = W_1 \xi_f + W_2 \zeta_f + x_f - x = \phi_1 \xi_f + \phi_2 \zeta_f - \nu_f \tag{3.25}$$

where $\phi_1 = W_1 - W_1^*$, $\phi_2 = W_2 - W_2^*$ are the weight estimation errors. Hence, by filtering the regressors, we obtain an algebraic relationship between the output error $e_x$ and the weight estimation errors $\phi_1$, $\phi_2$.

³In deriving (3.22) we have assumed (without loss of generality) zero initial condition for the state, i.e., $x_0 = 0$. Note that if $x_0 \ne 0$ then the initial condition will appear in the identification model so that it gets cancelled in the error equation. This is possible because $x$ is available for measurement.
Figure 6: A block diagram representation of the regressor filtering identification model developed using RBF networks.

In the framework of the optimization approach [55], adaptive laws for $W_1$, $W_2$ are obtained by minimizing an appropriate cost functional with respect to each element of $W_1$, $W_2$. Here we consider an instantaneous cost functional with a constraint on the possible values that $W_1$, $W_2$ can take. This leads to the following constrained minimization problem:

$$\begin{aligned} \text{minimize} \quad & J(W_1, W_2) = \tfrac{1}{2} e_x^T e_x \\ \text{subject to} \quad & \|W_1\|_F \le M_1 \\ & \|W_2\|_F \le M_2 \end{aligned} \tag{3.26}$$

Using the gradient projection method [55], we obtain the following adaptive laws for continuous adjustment of the weights $W_1$, $W_2$:

$$\dot{W}_1 = \begin{cases} -\gamma_1 e_x\xi_f^T & \text{if } \|W_1\|_F < M_1 \text{ or } \{\|W_1\|_F = M_1 \text{ and } e_x^T W_1\xi_f \ge 0\} \\ \mathcal{P}\left\{-\gamma_1 e_x\xi_f^T\right\} & \text{if } \|W_1\|_F = M_1 \text{ and } e_x^T W_1\xi_f < 0 \end{cases} \tag{3.27}$$

$$\dot{W}_2 = \begin{cases} -\gamma_2 e_x\zeta_f^T & \text{if } \|W_2\|_F < M_2 \text{ or } \{\|W_2\|_F = M_2 \text{ and } e_x^T W_2\zeta_f \ge 0\} \\ \mathcal{P}\left\{-\gamma_2 e_x\zeta_f^T\right\} & \text{if } \|W_2\|_F = M_2 \text{ and } e_x^T W_2\zeta_f < 0 \end{cases} \tag{3.28}$$

where $\mathcal{P}\{\cdot\}$ denotes the projection operation defined in Section 3.1.1. The weight adaptive laws (3.27), (3.28) have the same form as (3.15), (3.16), which were derived by the Lyapunov method, with the exception that the output error $e_x$ and the regressor vectors $\xi$, $\zeta$ are defined in a different way.

The counterpart of Theorem 2 concerning the stability properties of the regressor filtering identification scheme with the adaptive laws (3.27), (3.28), obtained using the gradient projection method, is described by the following result.
Theorem 3 Consider the regressor filtering identification scheme described by (3.24). The weight adaptive laws given by (3.27), (3.28) guarantee the following properties:

(a) For $\nu(t) = 0$ (no modelling error), we have
• $e_x, \hat{x}, \phi_1, \phi_2 \in L_\infty$, $e_x \in L_2$.
• $\lim_{t \to \infty} e_x(t) = 0$, $\lim_{t \to \infty} \dot{\phi}_1(t) = 0$, $\lim_{t \to \infty} \dot{\phi}_2(t) = 0$.

(b) For $\sup_{t \ge 0} |\nu(t)| \le \nu_0$, we have
• $e_x, \hat{x}, \phi_1, \phi_2 \in L_\infty$.
• there exist constants $k_1$, $k_2$ such that
$$\int_0^t |e_x(\tau)|^2 d\tau \le k_1 + k_2 \int_0^t |\nu(\tau)|^2 d\tau$$
Proof: Consider the Lyapunov function candidate

$$V(\phi_1, \phi_2) = \frac{1}{2\gamma_1}\|\phi_1\|_F^2 + \frac{1}{2\gamma_2}\|\phi_2\|_F^2 = \mathrm{tr}\left\{\frac{1}{2\gamma_1}\phi_1\phi_1^T + \frac{1}{2\gamma_2}\phi_2\phi_2^T\right\} \tag{3.29}$$

Using (3.25), Lemma 1 and the fact that $\dot{\phi}_1 = \dot{W}_1$, $\dot{\phi}_2 = \dot{W}_2$, the time derivative of $V$ along (3.27), (3.28) can be expressed as

$$\begin{aligned} \dot{V} &= \mathrm{tr}\left\{-e_x\xi_f^T\phi_1^T - e_x\zeta_f^T\phi_2^T\right\} + I_1 \frac{e_x^T W_1\xi_f}{\|W_1\|_F^2}\mathrm{tr}\left\{W_1\phi_1^T\right\} + I_2 \frac{e_x^T W_2\zeta_f}{\|W_2\|_F^2}\mathrm{tr}\left\{W_2\phi_2^T\right\} \\ &\le -\mathrm{tr}\left\{e_x\left(\xi_f^T\phi_1^T + \zeta_f^T\phi_2^T\right)\right\} \\ &= -\mathrm{tr}\left\{e_x\left(e_x^T + \nu_f^T\right)\right\} = -|e_x|^2 - \nu_f^T e_x \le -|e_x|^2 + \frac{\nu_0}{\alpha}|e_x| \end{aligned} \tag{3.30}$$

(a) If $\nu(t) = 0$ then

$$\dot{V} \le -|e_x|^2 \le 0 \tag{3.31}$$

Therefore $V \in L_\infty$, which from (3.29) implies that $\phi_1, \phi_2 \in L_\infty$. Using this, together with the boundedness of $\xi_f$, $\zeta_f$, (3.25) gives $e_x, \hat{x} \in L_\infty$. Furthermore, by integrating both sides of (3.31) from $0$ to $\infty$ it can be shown that $e_x \in L_2$. Now, by taking the time derivative of $e_x$ in (3.25) we obtain

$$\dot{e}_x = \dot{\phi}_1\xi_f + \phi_1\dot{\xi}_f + \dot{\phi}_2\zeta_f + \phi_2\dot{\zeta}_f \tag{3.32}$$

Since $\dot{\phi}_1, \xi_f, \phi_1, \dot{\xi}_f, \dot{\phi}_2, \zeta_f, \phi_2, \dot{\zeta}_f \in L_\infty$, (3.32) implies that $\dot{e}_x \in L_\infty$, and thus using Barbalat's Lemma we conclude that $\lim_{t \to \infty} e_x(t) = 0$. Using the boundedness of $\xi_f$ and $\zeta_f$, it can be readily verified that $\dot{\phi}_1$ and $\dot{\phi}_2$ also converge to zero.

(b) Suppose $\sup_{t \ge 0} |\nu(t)| \le \nu_0$. The projection algorithm guarantees that $W_1$, $W_2$ are bounded, which implies $\phi_1, \phi_2 \in L_\infty$. Since $\xi_f$, $\zeta_f$ are also bounded, from (3.25) we obtain $e_x \in L_\infty$ and also $\hat{x} \in L_\infty$. The proof of the second part follows directly along the same lines as its counterpart in Theorem 2. □
Remark 3.4: As in the Lyapunov method, the scalar learning rates $\gamma_1$, $\gamma_2$ in the parameter update laws (3.27), (3.28) can be replaced by positive definite matrices $\Gamma_1$, $\Gamma_2$.

Remark 3.5: Minimization of an appropriate integral cost using Newton's method yields the recursive least-squares algorithm. In the least-squares algorithm the learning rate matrices $\Gamma_1$, $\Gamma_2$ are time-varying and are adjusted concurrently with the weights. Unfortunately, the least-squares algorithm is computationally very expensive, and especially in neural network modelling, where the numbers of units $n_1$, $n_2$ are usually large, updating the matrices $\Gamma_1$, $\Gamma_2$, which consist of $n_1^2$, $n_2^2$ entries respectively, makes this algorithm impractical.
3.2 Multilayer Network Models
In this section we consider the case where the network structures employed in the approximation of $f(x)$ and $g(x)$ are multilayer neural networks with sigmoidal-type activation functions. Although this is the most commonly used class of neural network models in empirical studies, there are very few analytical results concerning the stability properties of such networks in problems dealing with learning in dynamic environments. The main difficulty in analyzing the behavior of recurrent network architectures with feedforward multilayer neural networks as subsystems arises due to the fact that the adjustable weights appear non-affinely with respect to the nonlinearities of the network structure.

Our approach relies on developing an error filtering identification scheme, based on the same procedure as in Section 3.1.1. The analysis proceeds through the use of a Taylor series expansion around the optimal weights. The adaptive law is designed based on the first-order (linear) approximation of the Taylor series expansion. In this framework, the RBF error filtering scheme presented earlier constitutes a special case of the analysis in this section, with the higher-order terms not present and the regressor being independent of the weight values. As a consequence of the presence of higher-order terms, the results obtained here are weaker, in the sense that, even if there is no modelling error, it cannot be guaranteed that the output error will converge to zero. The significance of this analysis is based on proving that all the signals in the proposed identification scheme remain bounded, as well as on developing a unified approach to synthesizing and analyzing stable dynamic learning configurations using different types of neural network architectures.

Consider the system (3.5), where $\theta_f^*$, $\theta_g^*$ are the optimal weights in the minimization of $|f(x) - \hat{f}(x; \theta_f)|$ and $|g(x) - \hat{g}(x; \theta_g)|$ respectively, for $x \in \mathcal{X}$ and subject to the constraints $|\theta_f| \le M_f$, $|\theta_g| \le M_g$, where $M_f$, $M_g$ are (large) design constants. The functions $\hat{f}$ and $\hat{g}$ are the outputs of multilayer neural networks with sigmoidal nonlinearities in-between layers. We start by adding and subtracting $\alpha x$ in (3.5), where $\alpha$ is a positive design constant. This gives

$$\dot{x} = -\alpha x + \alpha x + \hat{f}(x; \theta_f^*) + \hat{g}(x; \theta_g^*)u + \nu \tag{3.33}$$

Based on (3.33) we consider the error filtering identification model

$$\dot{\hat{x}} = -\alpha \hat{x} + \alpha x + \hat{f}(x; \theta_f) + \hat{g}(x; \theta_g)u \tag{3.34}$$
Figure 7: A block diagram representation of the error filtering identification model developed using multilayer sigmoidal neural networks.

where $\theta_f \in \mathbb{R}^{n_f}$, $\theta_g \in \mathbb{R}^{n_g}$ are the estimates of the optimal weights $\theta_f^*$, $\theta_g^*$ respectively. The constants $n_f$, $n_g$ are the number of weights in each approximation. The error filtering identification scheme described by (3.34) is shown in block diagram representation in Figure 7.

From (3.33), (3.34), the output error $e_x = \hat{x} - x$ satisfies the differential equation:

$$\dot{e}_x = -\alpha e_x + \left[\hat{f}(x; \theta_f) - \hat{f}(x; \theta_f^*)\right] + \left[\hat{g}(x; \theta_g) - \hat{g}(x; \theta_g^*)\right]u - \nu \tag{3.35}$$

In order to obtain an adaptive law for the weights $\theta_f$, it is convenient to consider the first-order approximation of the difference $\hat{f}(x; \theta_f) - \hat{f}(x; \theta_f^*)$. Using the Taylor series expansion⁴ of $\hat{f}(x; \theta_f^*)$ around the point $(x; \theta_f)$ we obtain

$$\hat{f}(x; \theta_f) - \hat{f}(x; \theta_f^*) = \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f)\left(\theta_f - \theta_f^*\right) + \hat{f}_0(x; \phi_f) \tag{3.36}$$

where $\hat{f}_0(x; \phi_f)$ represents the higher-order terms (with respect to $\theta_f$) of the expansion. If we define the weight estimation error as $\phi_f := \theta_f - \theta_f^*$, then from (3.36) we have that

$$\hat{f}_0(x; \phi_f) = \hat{f}(x; \theta_f) - \hat{f}(x; \theta_f^*) - \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f)\,\phi_f$$

⁴Throughout the analysis we require that the network outputs $\hat{f}(x; \theta_f)$, $\hat{g}(x; \theta_g)$ are smooth functions of their arguments. This can easily be achieved if the sigmoid used is a smooth function. The logistic function and the hyperbolic tangent are examples of popular sigmoids that also satisfy the smoothness condition.
By the Mean Value Theorem, there exists $\bar{\theta}_f \in [\theta_f^*, \theta_f]$, i.e., $\bar{\theta}_f = \lambda\theta_f + (1 - \lambda)\theta_f^*$ for some $\lambda \in [0, 1]$, such that

$$\hat{f}_0(x; \phi_f) = \frac{\partial \hat{f}}{\partial \theta_f}(x; \bar{\theta}_f)\,\phi_f - \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f)\,\phi_f = \left[\frac{\partial \hat{f}}{\partial \theta_f}(x; \bar{\theta}_f) - \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f)\right]\phi_f$$

Now let

$$\alpha_f := \sup_{x \in \mathcal{X},\ \theta_f \in \mathcal{B}(M_f),\ \bar{\theta}_f \in [\theta_f^*, \theta_f]} \left\| \frac{\partial \hat{f}}{\partial \theta_f}(x; \bar{\theta}_f) - \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f) \right\| \tag{3.37}$$

where $\mathcal{B}(M_f)$ denotes a ball of radius $M_f$. Therefore, the higher-order terms satisfy

$$|\hat{f}_0(x; \phi_f)| \le \alpha_f |\phi_f| \qquad \forall x \in \mathcal{X},\ \forall \theta_f \in \mathcal{B}(M_f) \tag{3.38}$$

Using the same procedure we have

$$\hat{g}(x; \theta_g) - \hat{g}(x; \theta_g^*) = \frac{\partial \hat{g}}{\partial \theta_g}(x; \theta_g)\,\phi_g + \hat{g}_0(x; \phi_g) \tag{3.39}$$

where $\phi_g := \theta_g - \theta_g^*$ and $\hat{g}_0(x; \phi_g)$ satisfies

$$|\hat{g}_0(x; \phi_g)| \le \alpha_g |\phi_g| \qquad \forall x \in \mathcal{X},\ \forall \theta_g \in \mathcal{B}(M_g) \tag{3.40}$$

and the constant $\alpha_g$ is defined as

$$\alpha_g := \sup_{x \in \mathcal{X},\ \theta_g \in \mathcal{B}(M_g),\ \bar{\theta}_g \in [\theta_g^*, \theta_g]} \left\| \frac{\partial \hat{g}}{\partial \theta_g}(x; \bar{\theta}_g) - \frac{\partial \hat{g}}{\partial \theta_g}(x; \theta_g) \right\| \tag{3.41}$$

From now on, for notational simplicity we define the regressors

$$\xi(x; \theta_f) := \frac{\partial \hat{f}}{\partial \theta_f}(x; \theta_f), \qquad \zeta(x; \theta_g) := \frac{\partial \hat{g}}{\partial \theta_g}(x; \theta_g)$$

where $\xi$, $\zeta$ are $n \times n_f$ and $n \times n_g$ matrices respectively, representing the sensitivity functions between the output of the network and the adjustable weights. It is worth noting that these sensitivity functions are exactly the functions that are calculated by the standard (static) backpropagation algorithm.
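When an analytic backpropagation routine is not available, the sensitivity matrices can be approximated numerically. A sketch (our addition; `f_hat` is any smooth network implementation) using central differences:

```python
import numpy as np

def sensitivity(f_hat, x, theta, eps=1e-6):
    """Numerical regressor xi(x; theta_f) = d f_hat / d theta_f: the (n, n_f)
    sensitivity matrix that static backpropagation computes analytically."""
    cols = []
    for j in range(theta.size):
        d = np.zeros_like(theta)
        d[j] = eps
        # central difference in the j-th weight direction
        cols.append((f_hat(x, theta + d) - f_hat(x, theta - d)) / (2 * eps))
    return np.stack(cols, axis=1)
```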
Now, by replacing (3.36) and (3.39) in (3.35) we obtain

$$\dot{e}_x = -\alpha e_x + \xi(x; \theta_f)\,\phi_f + \zeta(x; \theta_g)\,\phi_g u + \hat{f}_0(x; \phi_f) + \hat{g}_0(x; \phi_g)u - \nu \tag{3.42}$$

Once expressed in the form (3.42), it is clear that this error equation is a generalization of the corresponding error equation for RBF networks. In particular, if the higher-order terms $\hat{f}_0$, $\hat{g}_0$ are identically zero and the regressors $\xi$, $\zeta$ are independent of $\theta_f$ and $\theta_g$ respectively,
then (3.42) becomes of the same form as the error equation for RBF networks. In deriving adaptive laws for adjusting the weights $\theta_f$ and $\theta_g$, we neglect the higher-order terms $\hat{f}_0$ and $\hat{g}_0$. This is consistent with linearization techniques, whose main principle is that for $\phi_f$, $\phi_g$ sufficiently small, the linear terms $\xi\phi_f$ and $\zeta\phi_g$ dominate the higher-order terms $\hat{f}_0$, $\hat{g}_0$. Based on these observations we consider the following adaptive laws for adjusting the weights $\theta_f$, $\theta_g$:

$$\dot{\theta}_f = \begin{cases} -\gamma_f \xi^T e_x & \text{if } |\theta_f| < M_f \text{ or } \{|\theta_f| = M_f \text{ and } e_x^T \xi \theta_f \ge 0\} \\ \mathcal{P}\left\{-\gamma_f \xi^T e_x\right\} & \text{if } |\theta_f| = M_f \text{ and } e_x^T \xi \theta_f < 0 \end{cases} \tag{3.43}$$

$$\dot{\theta}_g = \begin{cases} -\gamma_g \zeta^T e_x u & \text{if } |\theta_g| < M_g \text{ or } \{|\theta_g| = M_g \text{ and } e_x^T \zeta \theta_g u \ge 0\} \\ \mathcal{P}\left\{-\gamma_g \zeta^T e_x u\right\} & \text{if } |\theta_g| = M_g \text{ and } e_x^T \zeta \theta_g u < 0 \end{cases} \tag{3.44}$$

where $\mathcal{P}\{\cdot\}$ again denotes the projection operation, which is computed as follows:

$$\mathcal{P}\left\{-\gamma_f \xi^T e_x\right\} := -\gamma_f \xi^T e_x + \gamma_f \frac{e_x^T \xi \theta_f}{|\theta_f|^2}\,\theta_f$$

$$\mathcal{P}\left\{-\gamma_g \zeta^T e_x u\right\} := -\gamma_g \zeta^T e_x u + \gamma_g \frac{e_x^T \zeta \theta_g u}{|\theta_g|^2}\,\theta_g$$

The weight adjustment laws (3.43), (3.44) are the usual adaptive laws obtained by the Lyapunov synthesis method, with the projection algorithm modification for preventing the weights from drifting to infinity.
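A compact sketch (our addition) of the update direction common to (3.43) and (3.44); for $\theta_g$ the caller passes the scalar input $u$, while for $\theta_f$ the default $u = 1$ applies:

```python
import numpy as np

def theta_dot(theta, reg, e_x, gamma, M, u=1.0):
    """Projected gradient direction, cf. (3.43), (3.44).
    reg: (n, n_theta) sensitivity matrix xi or zeta at the current weights."""
    d = -gamma * (reg.T @ e_x) * u                # first-order (linearized) law
    inner = (e_x @ (reg @ theta)) * u             # e_x^T (reg theta) u
    if np.linalg.norm(theta) >= M and inner < 0:  # projection branch
        d += gamma * inner / (theta @ theta) * theta
    return d
```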
Remark 3.6: An intuitive interpretation of the above adaptive laws, based on more familiar optimization techniques, can be obtained as follows: by setting $\dot{e}_x = 0$ in the error equation (3.35) and solving for the quasi-steady-state response $e_{ss}$ of $e_x$ we have

$$e_{ss} = \frac{1}{\alpha}\left[\left(\hat{f}(x; \theta_f) - \hat{f}(x; \theta_f^*)\right) + \left(\hat{g}(x; \theta_g) - \hat{g}(x; \theta_g^*)\right)u - \nu\right] \tag{3.45}$$

Based on (3.45), if we minimize the quadratic cost functional

$$J_{ss}(\theta_f, \theta_g) = \frac{\alpha^2}{2}|e_{ss}|^2$$

subject to $|\theta_f| \le M_f$, $|\theta_g| \le M_g$, by using the gradient projection method, we obtain weight adjustment laws of the form (3.43), (3.44), with the exception that $e_x$ is replaced by its quasi-steady-state response $e_{ss}$. This indicates the close relationship between dynamic backpropagation-type algorithms [18]-[20] and the adaptive laws described in this paper.
The next theorem establishes stability of the proposed identification scheme.

Theorem 4 Consider the error filtering identification model (3.34). The adaptive laws (3.43), (3.44) guarantee that
• $e_x, \hat{x}, \phi_f, \phi_g \in L_\infty$.
• there exist constants $k_1$, $k_2$ such that
$$\int_0^t |e_x(\tau)|^2 d\tau \le k_1 + k_2 \int_0^t \left(\alpha_f + \alpha_g + |\nu(\tau)|\right)^2 d\tau$$
Proof: Consider the Lyapunov function candidate

$$V(e_x, \phi_f, \phi_g) = \frac{1}{2}e_x^T e_x + \frac{1}{2\gamma_f}\phi_f^T\phi_f + \frac{1}{2\gamma_g}\phi_g^T\phi_g \tag{3.46}$$

Using (3.42), (3.43), (3.44), the time derivative of $V$ in (3.46) can be expressed as

$$\dot{V} = -\alpha|e_x|^2 + e_x^T\hat{f}_0 + e_x^T\hat{g}_0 u - e_x^T\nu + I_1 \frac{e_x^T\xi\theta_f}{|\theta_f|^2}\,\theta_f^T\phi_f + I_2 \frac{e_x^T\zeta\theta_g u}{|\theta_g|^2}\,\theta_g^T\phi_g \tag{3.47}$$

where the last two terms in (3.47) are due to the projection modification. Using the same procedure as in Lemma 1, it can be shown that these terms are non-positive. Therefore (3.47) becomes

$$\begin{aligned} \dot{V} &\le -\alpha|e_x|^2 + |e_x||\hat{f}_0| + u_0|e_x||\hat{g}_0| + \nu_0|e_x| \\ &\le -\alpha|e_x|^2 + \alpha_f|e_x||\phi_f| + \alpha_g u_0|e_x||\phi_g| + \nu_0|e_x| \end{aligned} \tag{3.48}$$

where $u_0$, $\nu_0$ are the bounds for $u(t)$ and $\nu(t)$ respectively, and $\alpha_f$, $\alpha_g$ are as defined in Equations (3.37), (3.41) respectively. Since $\theta_f \in \mathcal{B}(M_f)$, $\theta_g \in \mathcal{B}(M_g)$, it is clear that $|\phi_f| \le 2M_f$, $|\phi_g| \le 2M_g$; hence, (3.48) can be written in the form

$$\dot{V} \le -\alpha|e_x|^2 + \beta|e_x| \tag{3.49}$$

where $\beta := 2\alpha_f M_f + 2u_0\alpha_g M_g + \nu_0$. Therefore, for $|e_x| > \beta/\alpha$, we have $\dot{V} < 0$, which implies that $e_x \in L_\infty$ and consequently $\hat{x} \in L_\infty$. The proof of the second part follows along the same lines as its counterpart in Theorem 2. □
Remark 3.7: To retain the generality of the result, in the above analysis we have not assumed any special structure for $\hat{f}(x; \theta_f)$ and $\hat{g}(x; \theta_g)$, except that these functions are sufficiently smooth. Given a specific number of layers and type of sigmoidal nonlinearity, estimates of $\alpha_f$, $\alpha_g$ can be computed.

Remark 3.8: As a consequence of the weights appearing non-affinely, the regressor filtering and optimization techniques described in Section 3.1.2 for RBF networks cannot be applied, at least directly, in the case where the identification model is based on multilayer networks.
4 Control
The control of dynamical systems deals with the problem of designing a controller that causes the system to display some desired behavior. This desired behavior is specified in terms of the control objectives, which may include: maintaining the output of the system around some desired constant value (regulation); ensuring that the outputs follow some desired trajectory (tracking); forcing the outputs of the system to follow the outputs of a prespecified reference model (model reference control); or assuring that the overall system minimizes some performance criterion (optimal control). The problem of controlling a general nonlinear system is a formidable task even when the dynamics of the system are completely known. This difficulty is reflected in the fact that it was not until recently that any systematic procedure for the synthesis and analysis of a reasonably large class of nonlinear control systems was found. This class of nonlinear systems, which is referred to as feedback linearizable, or simply, linearizable systems, has attracted considerable attention among system and control theorists [5, 6].

In many areas of science and engineering we encounter systems for which the determination of an accurate model, based on the physical properties of the system, is not possible. In this section we are concerned with the problem of controlling such systems. In particular, we consider nonlinear dynamical systems with unknown, or partially unknown, nonlinearities. The objective is to approximate the unknown nonlinear functions by neural networks and develop adaptive laws for on-line adjustment of the weights of these networks, such that the stability of the overall system is guaranteed and, furthermore, the control objective is asymptotically achieved. This problem is closely related to the areas of adaptive linear control [2] and adaptive nonlinear control [56], where the uncertainty in the system is due to some unknown parameters. Allowing unknown nonlinearities in the system to be controlled extends the class of uncertain systems that adaptive control techniques can be applied to, but at the same time increases considerably the complexity in the analysis of the overall control system behavior. Due to the intricacies of the problem, it is very crucial to first understand what are the possible circumstances that may lead to instability and how these can be prevented.
4.1 Instability Mechanisms

In this subsection we investigate various types of instability mechanisms that may arise in the control of nonlinear systems using neural network approximators and propose modifications to the standard adaptive and control laws for dealing with these problems. The instability discussion is motivated by the first order system
\[
\dot{x} = f(x) + g(x)u \tag{4.1}
\]
where $u \in \mathbb{R}$ is the control input, $x \in \mathbb{R}$ is the state, which is available for measurement, and $f$, $g$ are unknown smooth functions. It will be assumed (without loss of generality) that $x = 0$ is an equilibrium point of the undriven system, i.e., $f(0) = 0$. The control objective is to force the plant output $x$ to follow the output $x_m$ of a reference model
\[
\dot{x}_m = -a_m x_m + b_m r \tag{4.2}
\]
where $a_m > 0$, $b_m$ are constants and $r(t)$ is a bounded reference input. Although simple, this first order system captures many of the difficulties that are encountered in the problem of controlling nonlinear systems using neural networks or, more generally, using approximations for the unknown nonlinearities.
If $f$ and $g$ were known and $g(x) \neq 0$ for all $x$ in some region $\mathcal{N}_c(0)$ around the equilibrium $x = 0$, then it is easy to verify that the control law
\[
u = \frac{1}{g(x)} \left[ -a_m x + b_m r - f(x) \right] \tag{4.3}
\]
would achieve the control objective asymptotically. More precisely, in this case the tracking error $e = x_m - x$ satisfies $e(t) \equiv 0$ if $x(0) = x_m(0)$, or satisfies $\lim_{t\to\infty} e(t) = 0$ if $x(0) \neq x_m(0)$. The requirement $g(x) \neq 0$, $\forall x \in \mathcal{N}_c(0)$ is a controllability condition [57]; i.e., for $g(x(t)) = 0$ the system is no longer controllable.
Since $f$, $g$ are unknown, we will approximate these functions by neural network models with outputs $\hat{f}(x, \theta_f)$, $\hat{g}(x, \theta_g)$ respectively, where $\theta_f$ and $\theta_g$ are weight vectors which are to be adjusted according to some adaptive rule. A straightforward choice for the control law is the certainty equivalence controller [4], obtained by replacing the unknown functions $f(x)$ and $g(x)$ in (4.3) by their estimates $\hat{f}(x, \theta_f)$ and $\hat{g}(x, \theta_g)$ respectively. This gives the control law
\[
u = u_c = \frac{1}{\hat{g}(x, \theta_g)} \left[ -a_m x + b_m r - \hat{f}(x, \theta_f) \right] \tag{4.4}
\]
where, for the time being, it is assumed that the weights $\theta_g$ are such that $\hat{g}(x, \theta_g)$ is bounded away from zero. Using the control law (4.4) in (4.1), it can be shown after some manipulation that $e = x_m - x$ satisfies the following differential equation:
\[
\dot{e} = -a_m e + \left[ \hat{f}(x, \theta_f) - f(x) \right] + \left[ \hat{g}(x, \theta_g) - g(x) \right] u_c, \qquad e(0) = x_m(0) - x(0) \tag{4.5}
\]
Following the same procedure as in Section 3, we will let $\theta_f^*$, $\theta_g^*$ be the optimal weights, in the sense that they minimize $|f(x) - \hat{f}(x, \theta_f)|$ and $|g(x) - \hat{g}(x, \theta_g)|$ respectively, for $x \in \mathcal{X}$ and subject to the constraints $|\theta_f| \le M_f$, $|\theta_g| \le M_g$, where $M_f$, $M_g$ are design constants and $\mathcal{X} \subset \mathbb{R}$ is some compact set. More conditions on the properties of $\mathcal{X}$ are discussed later. Therefore, (4.5) is rewritten in terms of the (unknown) optimal weights as
\[
\dot{e} = -a_m e + \left[ \hat{f}(x, \theta_f) - \hat{f}(x, \theta_f^*) \right] + \left[ \hat{g}(x, \theta_g) - \hat{g}(x, \theta_g^*) \right] u_c + \nu(t) \tag{4.6}
\]
where $\nu(t)$ represents the modelling error, which is given by
\[
\nu(t) = \left[ \hat{f}(x(t), \theta_f^*) - f(x(t)) \right] + \left[ \hat{g}(x(t), \theta_g^*) - g(x(t)) \right] u_c(t)
\]
It is worth noting that (4.6) is of the same form as the error equation of the identification model developed in Section 3, based on the error filtering scheme. In identification, the constant $-\lambda$ was the pole location of the stable filter, while in control, the corresponding constant $-a_m$ is the pole location of the reference model. A major difference between the two schemes is the role of the input signal. In identification, $u$ is an external bounded signal that is chosen by the designer and fed to both the identification model and the real system, which is stable. On the other hand, in the control procedure the signal $u_c$ is the certainty equivalence control law given by (4.4), which is composed of signals from the output of the system (feedback), as well as the output of the neural network estimation model (adaptation).
Based on the error equation (4.6), one may consider the adaptive laws developed in Section 3 for continuous adjustment of the weights. However, in order to guarantee stability of the overall scheme, we have to impose additional assumptions or consider modifications to the adaptive laws. In identification the problem was simplified in two respects. First, the input signal was chosen by the designer; hence it could be readily assumed that only bounded input signals were used. In control, the boundedness of $u_c$ cannot be guaranteed a priori. Second, in the identification problem we considered only stable systems, thus ensuring that the output of the system was bounded. In control problems, due to the feedback control law, the boundedness of the output cannot be guaranteed even if the system is originally stable.

Based on these observations, we are faced with the following three possible types of instability problems.
(a) Parameter Drift: This kind of problem was also encountered (and remedied) in the identification schemes described in Section 3. The characteristic of parameter drift is the tendency of the weights to drift to infinity with time. In the case of neural network models, parameter drift is caused by the modelling error, which enters the identification or control scheme through the inability of a neural network to match exactly the nonlinear function that it tries to approximate, even if an optimal set of weights were to be chosen. This mismatch between the output of an "optimal" static network model and a given nonlinear function could be due to many factors but a principal cause is an insufficient number of weights. Of course, for special classes of functions, it is possible that no modelling error is present. For example, the scalar function
\[
f(x) = 3 e^{-(x-2)^2}
\]
can be modelled exactly for all $x \in \mathbb{R}$ by a Gaussian RBF network with a single unit, simply by choosing the center $c_1 = 2$, the width $\sigma_1 = 1$ and the weight $w_1 = 3$. However, for a general function this will not be the case and therefore the modelling error is something that has to be dealt with.
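The exact-match example above is easy to verify numerically; the following sketch (assuming the Gaussian unit has the form $w_1 \exp(-|x - c_1|^2/\sigma_1^2)$) confirms that the single-unit network reproduces $f$ with zero modelling error:

    import numpy as np

    # target function from the text: f(x) = 3 exp(-(x - 2)^2)
    f = lambda x: 3.0 * np.exp(-(x - 2.0) ** 2)

    # single Gaussian RBF unit with weight w1 = 3, center c1 = 2, width sigma1 = 1
    def rbf(x, w1=3.0, c1=2.0, sigma1=1.0):
        return w1 * np.exp(-((x - c1) ** 2) / sigma1 ** 2)

    x = np.linspace(-5.0, 5.0, 1001)
    print(np.max(np.abs(f(x) - rbf(x))))   # prints 0.0: no modelling error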
In Section 3 we described the projection algorithm modification, one method of preventing parameter drift. The basic idea behind the projection algorithm is as follows: first choose the initial weight values to belong to some (large) compact set that is chosen a priori; then, the weight adjustment proceeds according to the standard adaptive law (obtained by the Lyapunov or optimization method), unless the weights are directed towards leaving the compact set. In this case, the weight values are projected back onto the compact set by a projection algorithm.

Although there are other techniques for preventing parameter drift, the projection algorithm has the natural and intuitive interpretation of confining the weights to some compact set that is chosen by the designer. Another major advantage of using the projection algorithm for dealing with the parameter drift problem is the fact that no a priori information regarding the bound on the modelling error is required. Furthermore, from a practical viewpoint, restricting the weight values from becoming too large can prevent many problems that may otherwise occur due to numerical errors.
(b) Controllability Problem: In defining the certainty equivalence controller $u_c$, we have made the crucial assumption that the weights $\theta_g$ are such that $\hat{g}(x, \theta_g)$ is bounded away from zero. It is clear that a standard adaptive law for generating $\theta_g(t)$ (obtained by the Lyapunov or optimization method) does not guarantee a priori that this will be true. Therefore some kind of modification in the update law for $\theta_g$ is required in order to ensure that controllability of the estimation model is retained. In general, this is far from being a trivial task; adaptive control theorists are aware, from the linear case, that this is one of the major problems with indirect-type control schemes [58].

In this paper we take the approach of attempting to find a convex subset $\mathcal{K}_g$ of the $\theta_g$ parameter space such that $\theta_g^* \in \mathcal{K}_g$ and for every $\theta_g \in \mathcal{K}_g$ we have $|\hat{g}(x, \theta_g)| \ge \epsilon > 0$, where $\epsilon$ is a small constant. If such a set can be found, then controllability based on the estimated parameters $\theta_g(t)$ at each time $t$ is ensured by incorporating a projection algorithm [50, 54] (of the type described in Section 3) in the adaptive law, to guarantee that $\theta_g(t) \in \mathcal{K}_g$ for all $t \ge 0$. Following this procedure, the usual properties of the adaptive law (e.g., stability) are not altered. This approach, although simple, relies on the strong requirement that the set $\mathcal{K}_g$ is known. We next describe how to obtain such a set $\mathcal{K}_g$ for the two cases that $\hat{g}(x, \theta_g)$ is the output of: 1) a Gaussian RBF network, and 2) a sigmoidal multilayer network. Without loss of generality, we will assume that $g(x)$ is positive⁵ for all $x \in \mathcal{N}_c(0)$.

In the case of Gaussian RBF network models, the output of the network has the form $\hat{g}(x, \theta_g) = \theta_g^T \eta(x)$, where $\eta \in \mathbb{R}^{n_g}$ is a vector of Gaussian functions; therefore each element of the vector $\eta(x)$ is positive for all $x$. As a consequence, if each element of $\theta_g$ is positive then $\theta_g^T \eta(x)$ will also be positive. Based on this observation, $\mathcal{K}_g$ is chosen as
\[
\mathcal{K}_g := \left\{ \theta \in \mathbb{R}^{n_g} : \theta_i > \epsilon, \; i = 1, \ldots, n_g \right\} \tag{4.7}
\]
where $\epsilon$ is a small constant. Even though it is not necessarily true (most of the time it will be true) that $\theta_g^* \in \mathcal{K}_g$, based on the same technique as in Section 3, we can define the optimal weights $\theta_g^*$ under the additional constraint that $\theta_g^* \in \mathcal{K}_g$. Since $g(x)$ is positive, any modelling error introduced by restricting $\theta_g^*$ to be in $\mathcal{K}_g$ will be small.
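For the RBF case, keeping $\theta_g(t)$ inside $\mathcal{K}_g$ is computationally trivial, since $\mathcal{K}_g$ is a box-like set. A sketch of one way to realize this (a discrete-time clipping stand-in for the hyperplane projection used in the analysis; the names are illustrative) is:

    import numpy as np

    def project_onto_Kg(theta_g, eps):
        """Keep the RBF weight vector inside K_g = {theta : theta_i >= eps}.

        Since every Gaussian basis output eta_i(x) is positive, any
        theta_g in K_g yields g_hat(x, theta_g) = theta_g @ eta(x) > 0,
        so the certainty equivalence control law never divides by zero.
        """
        return np.maximum(theta_g, eps)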
In the case of sigmoidal multilayer neural network models, we can obtain a "controllability set" $\mathcal{K}_g$ in the same way by expressing $\hat{g}(x, \theta_g)$ as
\[
\hat{g}(x, \theta_g) = \bar{\theta}_g^T z(x, \tilde{\theta}_g)
\]
where $z$ is the output of the last hidden layer, the vector $\bar{\theta}_g$ represents the weights of the last layer and $\tilde{\theta}_g$ represents the weights of the rest of the layers. In this formulation, $\theta_g \in \mathbb{R}^{n_1 + n_2}$ is broken up into $\bar{\theta}_g \in \mathbb{R}^{n_1}$ and $\tilde{\theta}_g \in \mathbb{R}^{n_2}$. If the sigmoidal nonlinearity $\sigma(\cdot)$ that is used is such that $\sigma(p)$ is strictly positive for all $p$, then each element of the vector $z(x, \tilde{\theta}_g)$ will be positive for all $(x, \tilde{\theta}_g)$. A particular type of sigmoid whose output is always positive is the logistic function $\sigma(p) := (1 + e^{-p})^{-1}$. Following the same procedure as in the case of RBF networks, the set $\mathcal{K}_g$ can therefore be defined as
\[
\mathcal{K}_g := \left\{ \theta \in \mathbb{R}^{n_1 + n_2} : \bar{\theta}_i > \epsilon, \; i = 1, \ldots, n_1 \right\}
\]
where $\bar{\theta}$ is interpreted as representing the weights of the last layer and $\epsilon$ is again a small constant.

⁵ In general, we need to know the sign of $g(x)$. This is a requirement that is needed even in the adaptive control of linear systems [2].
(c) Transient Behavior Problem: A common characteristic of adaptive systems is that large initial parameter estimation errors result in bad transient behavior. Intuitively, this is to be expected, since large initial parameter estimation errors in general imply that (initially) there is a large discrepancy between the functions $f(x)$, $g(x)$ and their approximations $\hat{f}(x, \theta_f)$ and $\hat{g}(x, \theta_g)$ respectively, which in turn may cause large tracking errors in the early stages of adaptation.

In the context of using neural network architectures to approximate the unknown functions $f(x)$ and $g(x)$, bad transient behavior may cause additional problems. In particular, adaptive laws for adjustment of the weights are developed under the assumption that $x(t)$ remains in some compact set $\mathcal{X}$ for all $t \ge 0$, where $\mathcal{X}$ represents the region in which the approximation of the functions $f(x)$ and $g(x)$ is considered. Hence, if $x(t)$ leaves the region $\mathcal{X}$ due to bad transient behavior, then this would, in general, increase the modelling error, which could force $x(t)$ even further away from $\mathcal{X}$, eventually causing instability. It is noted that in the identification schemes developed in Section 3 there was no such problem, because the system was assumed to be stable and adaptation was performed off-line.

One way of dealing with this problem is to first use some identification procedure so that during the control phase the estimates $\hat{f}(x, \theta_f)$, $\hat{g}(x, \theta_g)$ represent "good" approximations of the functions $f(x)$ and $g(x)$ respectively. However, this would only work for systems that are already stable, or systems for which we have enough a priori information to stabilize them before we employ approximation techniques for better performance.

Another way of dealing with the transient behavior problem was suggested by Sanner et al. [28]. According to [28], an auxiliary component is added in the control law, whose purpose is to introduce a sliding control term for confining $x(t)$ to the region $\mathcal{X}$. The sliding control methodology [59], which is sometimes used as an alternative to adaptive control in systems with parametric uncertainty, represents a rather simple approach to robust control with applications in robot manipulators, power systems, and others [60]. The principal idea behind using sliding control in the context of this paper is very similar to that of the projection algorithm, which was introduced earlier; in both cases the objective is to restrict a signal or a parameter value from entering an undesired region. Since the designer can directly influence the weight values, the projection algorithm can be employed for confining the weights to a certain region. On the other hand, if the objective is to restrict the output $x$ to a certain region, then this has to be done indirectly through the control signal $u$, which is exactly what the sliding control term attempts to do.

Therefore, from this point of view, projection provides a direct way of confining weight values, while sliding control provides an indirect method for confining the output signal. Due to its indirect way of affecting the output signal, the sliding technique requires considerable a priori information concerning the unknown plant, in the form of bounds on $f(x)$ and $g(x)$. Another drawback of the sliding control design is the extremely high control effort that it requires. Typically, high control activity is not only expensive but it may also excite any unmodelled dynamics of the system [60]. In fact, sliding control in this context is analogous to high-gain feedback [61], which is a classical control tool for reducing the effects of unknown disturbances. A detailed design procedure of the sliding control term is given in the next subsection.
4.2 Stable Control of First Order Systems

Using the modifications discussed in Section 4.1 for dealing with the parameter drift, controllability and transient behavior problems, we now consider the design of stable control laws for first order systems of the form described by (4.1). These results are extended to a class of higher order systems in the next subsection. For simplicity, we examine the case where the unknown nonlinearities $f(x)$ and $g(x)$ are modelled by RBF networks. As in Section 3, the same approach can be applied to multilayer networks with sigmoidal type nonlinearities; however, the results obtained will be weaker, in the sense that, even if there is no modelling error, it cannot be guaranteed that the tracking error converges to zero.

Following the discussion of Section 4.1, the certainty equivalence control $u_c$, given by (4.4), is augmented with the sliding control term $u_s$. Therefore we have
\[
u = u_c + u_s = \frac{1}{\theta_g^T \eta(x)} \left[ -a_m x + b_m r - \theta_f^T \zeta(x) \right] + u_s \tag{4.8}
\]
where $\theta_f \in \mathbb{R}^{n_f}$, $\theta_g \in \mathbb{R}^{n_g}$ represent the adjustable weights and $\zeta$, $\eta$ are again the outputs of the Gaussian basis functions in the approximation of $f(x)$ and $g(x)$ respectively. This controller structure leads to the scheme shown in Figure 8.

Figure 8: A block diagram representation of the control scheme given by (4.8), with neural network models as estimators of the unknown nonlinearities. (The diagram shows the reference model producing $x_m$ from $r$, the tracking error $e$ driving the control law, the plant with state $x$, and the two neural network estimators $\hat{f}(x, \theta_f)$ and $\hat{g}(x, \theta_g)$ driven by $x$.)
Design of $u_s$: Since $u_s$ is to be used for confining $x(t)$ to $\mathcal{X}$ for all $t \ge 0$, it is important that $\mathcal{X}$ be chosen appropriately. From a theoretical viewpoint, the choice of $\mathcal{X}$ is required only for the design of $u_s$. However, from a practical perspective, $\mathcal{X}$ also represents the region in which the approximation is to be performed. In the case of RBF networks, this is the region where the centers of the basis functions are distributed. For ease of discussion we take $\mathcal{X}$ to be a ball of radius $M_x$, i.e., $\mathcal{X} = B(M_x)$. From (4.2), the output $x_m$ of the reference model satisfies
\[
|x_m(t)| \le e^{-a_m t} |x_m(0)| + \int_0^t e^{-a_m (t-\tau)} |b_m r(\tau)| \, d\tau \le |x_m(0)| + \frac{|b_m| r_0}{a_m} := M_r
\]
where $r_0$ is an upper bound for $|r(t)|$. Since the control objective is to track $x_m$, it is clear that $M_x$ should be greater than $M_r$. We will let $M_x = 3 M_r$. At this stage it has to be assumed that $B(M_x) \subset \mathcal{N}_c(0)$, where, it is recalled, $\mathcal{N}_c(0)$ is the controllability region of the real system. It is also assumed that $|x(0)| \le \frac{1}{3} M_x$. This last assumption is made without any loss of generality, since $M_x$ can be chosen large enough such that $|x(0)| \le \frac{1}{3} M_x$ is satisfied.
Using the composite control law (4.8), it can be shown after some straightforward rearrangement of terms that the tracking error $e = x_m - x$ satisfies the differential equation
\[
\dot{e} = -a_m e + \left[ \theta_f^T \zeta - f(x) \right] + \left[ \theta_g^T \eta - g(x) \right] u_c - g(x) u_s \tag{4.9}
\]
Based on (4.9), consider the Lyapunov function $V_e(t) = \frac{1}{2} e^2(t)$. Then
\[
\begin{aligned}
\dot{V}_e &= -a_m e^2 + e \left[ \theta_f^T \zeta - f(x) \right] + e \left[ \theta_g^T \eta - g(x) \right] u_c - e g(x) u_s \\
&\le -a_m e^2 + \left[ |\theta_f^T \zeta| + |f| + |\theta_g^T \eta \, u_c| + |g u_c| \right] |e| - g e u_s
\end{aligned} \tag{4.10}
\]
Suppose $u_s$ is of the form $u_s = q(t)\, \mathrm{sgn}\{e\}$, where $\mathrm{sgn}\{e\} = 1$ if $e \ge 0$ and $\mathrm{sgn}\{e\} = -1$ if $e < 0$. Equation (4.10) can now be expressed as
\[
\begin{aligned}
\dot{V}_e &\le -a_m e^2 + \left[ |\theta_f^T \zeta| + |f| + |\theta_g^T \eta \, u_c| + |g u_c| \right] |e| - g q e \, \mathrm{sgn}\{e\} \\
&\le -a_m e^2 + \left[ |\theta_f^T \zeta| + |f| + |\theta_g^T \eta \, u_c| + |g u_c| - g q \right] |e|
\end{aligned} \tag{4.11}
\]
From (4.11), if $q(t)$ is chosen such that the inequality
\[
q(t) \ge \frac{1}{g(x(t))} \left[ \left| \theta_f^T(t) \zeta(x(t)) \right| + |f(x(t))| + \left| \theta_g^T(t) \eta(x(t)) u_c(t) \right| + |g(x(t)) u_c(t)| \right] \tag{4.12}
\]
is satisfied for all $t \ge 0$, then $\dot{V}_e \le -a_m e^2$, which implies that the tracking error $e(t)$ is decreasing. In order to choose a function $q(t)$ that satisfies the inequality (4.12), we need to assume that we know upper bounds on $f(x)$ and $g(x)$, and a lower bound on $g(x)$; these bounds should be valid for all $x \in B(M_x)$. Let $f^u(x)$, $g^u(x)$ be known state dependent upper bounds on $f(x)$ and $g(x)$ respectively, and also let $g_l(x)$ be a known lower bound on $g(x)$; i.e., $0 < g_l(x) \le g(x)$ for all $x \in B(M_x)$.

Using state dependent bounds is less restrictive than requiring a fixed bound for all $x \in B(M_x)$. Nevertheless, having knowledge of these bounds is a very restrictive assumption that limits the applicability of these techniques. It is recalled that these assumptions are required in the design of the sliding controller, whose purpose is to confine $x(t)$ within the region $\mathcal{X}$. Therefore, these requirements are more of a safeguard than an integral part of the design. Based on the above bounds, $q(t)$ is chosen as
\[
q(t) := \frac{1}{g_l(x(t))} \left[ \left| \theta_f^T(t) \zeta(x(t)) \right| + |f^u(x(t))| + \left| \theta_g^T(t) \eta(x(t)) u_c(t) \right| + |g^u(x(t)) u_c(t)| \right] \tag{4.13}
\]
The action of the sliding control is required only when there is a danger that $x$ might leave the region $B(M_x)$. Let $I_{u_s}$ be the indicator function defined as $I_{u_s} = 1$ if $|e| \ge \frac{2}{3} M_x$ and $I_{u_s} = 0$ if $|e| < \frac{2}{3} M_x$. The sliding control term is chosen as
\[
u_s = I_{u_s} q(t) \, \mathrm{sgn}\{e\} \tag{4.14}
\]
To verify that this choice for $u_s$ guarantees that $|x(t)| \le M_x$ for all $t \ge 0$, we first note that since by assumption $|x(0)| \le \frac{1}{3} M_x$, we have
\[
|e(0)| \le |x(0)| + |x_m(0)| \le \tfrac{1}{3} M_x + M_r = \tfrac{2}{3} M_x
\]
Now, if $|e(t)| = \frac{2}{3} M_x$ then, from (4.11) and (4.13), we have that $|e(t)|$ is decreasing. This implies that $|e(t)| \le \frac{2}{3} M_x$ for all $t \ge 0$. Therefore, using the inequality
\[
|x(t)| \le |e(t)| + |x_m(t)| \le \tfrac{2}{3} M_x + M_r = M_x
\]
we conclude that $x(t) \in B(M_x)$ for all $t \ge 0$.

One potential problem is the fact that $u_s$, as given by (4.14), is discontinuous with respect to $x$. Discontinuous control laws not only create problems of existence and uniqueness of solutions [51], but are also known to display chattering phenomena [59] and to excite high frequency unmodelled dynamics [60]. One way of avoiding these problems is to approximate the discontinuous terms in (4.14) by sigmoidal functions. As pointed out in [28], if these sigmoidal functions are made "steep enough" then they represent good approximations of the discontinuous functions, thus resulting in similar behavior.
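A compact numerical sketch of this design is given below. It is illustrative only: the smooth tanh replacement for $\mathrm{sgn}\{\cdot\}$, the `steepness` parameter, and the callables supplying the bounds $f^u$, $g^u$, $g_l$ are assumptions of the example, not part of the analysis.

    import numpy as np

    def sliding_term(e, x, theta_f, theta_g, zeta, eta, u_c,
                     f_upper, g_upper, g_lower, M_x, steepness=50.0):
        """Smoothed sliding control term u_s of (4.13)-(4.14).

        f_upper, g_upper, g_lower : callables giving the assumed state
            dependent bounds f^u(x), g^u(x), g_l(x).
        """
        # gain (4.13), computed from the known bounds
        q = (abs(theta_f @ zeta) + abs(f_upper(x))
             + abs(theta_g @ eta * u_c) + abs(g_upper(x) * u_c)) / g_lower(x)
        # indicator: act only when |e| reaches the threshold 2/3 M_x
        I_us = 1.0 if abs(e) >= (2.0 / 3.0) * M_x else 0.0
        # tanh(steepness * e) is a steep, continuous stand-in for sgn{e}
        return I_us * q * np.tanh(steepness * e)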
Design of the adaptive laws: We now turn our attention towards designing weight adjustment laws for $\theta_f$ and $\theta_g$. Let the weight estimation errors be defined as $\phi_1(t) := \theta_f(t) - \theta_f^*$ and $\phi_2(t) := \theta_g(t) - \theta_g^*$. The optimal weight set $\theta_f^*$ is equal to the value of $\theta$ that minimizes $|f(x) - \theta^T \zeta(x)|$ for $x \in \mathcal{X}$ and subject to the constraint $\theta \in B(M_f)$, while $\theta_g^*$ is equal to the value of $\theta$ that minimizes $|g(x) - \theta^T \eta(x)|$ for $x \in \mathcal{X}$ and subject to the constraint $\theta \in B(M_g) \cap \mathcal{K}_g$. The set $\mathcal{K}_g$, which is given by (4.7), characterizes a subset of the $\theta_g$ weight space for which controllability of the estimation model is ensured.

Based on the above definitions for the optimal weights $\theta_f^*$, $\theta_g^*$, we start by adding and subtracting $\theta_f^{*T} \zeta + \theta_g^{*T} \eta \, u_c$ in the error equation (4.9). This yields
\[
\dot{e} = -a_m e + \phi_1^T \zeta + \phi_2^T \eta \, u_c - \nu - g u_s \tag{4.15}
\]
where the modelling error $\nu$ is given by
\[
\nu(t) = \left[ f(x(t)) - \theta_f^{*T} \zeta(x(t)) \right] + \left[ g(x(t)) - \theta_g^{*T} \eta(x(t)) \right] u_c(t) \tag{4.16}
\]
In developing adaptive laws for $\theta_f$ and $\theta_g$, it is convenient to define the indicator functions $I_f$ and $I_g$ as follows: if $\theta_f$ is on the boundary of $B(M_f)$ and directed outwards, then $I_f = 1$; otherwise $I_f = 0$. Correspondingly, $I_g = 1$ if $\theta_g$ is on the boundary of the set $B(M_g) \cap \mathcal{K}_g$ and directed towards leaving the set, and $I_g = 0$ otherwise. Utilizing these indicator functions, the weights are continuously adjusted according to the following adaptive laws:
\[
\dot{\theta}_f = \begin{cases} -\gamma_1 e \zeta & \text{if } I_f = 0 \\ \mathcal{P}\{-\gamma_1 e \zeta\} & \text{if } I_f = 1 \end{cases} \tag{4.17}
\]
\[
\dot{\theta}_g = \begin{cases} -\gamma_2 e \eta \, u_c & \text{if } I_g = 0 \\ \mathcal{P}\{-\gamma_2 e \eta \, u_c\} & \text{if } I_g = 1 \end{cases} \tag{4.18}
\]
The adaptive gains $\gamma_1$, $\gamma_2$ are positive constants and $\mathcal{P}\{-\gamma_1 e \zeta\}$ denotes the projection of $-\gamma_1 e \zeta$ onto the supporting hyperplane to the convex set $B(M_f)$ at $\theta_f$ (and correspondingly for $\theta_g$). The initial weights $\theta_f(0)$, $\theta_g(0)$ are chosen to belong to the sets $B(M_f)$ and $B(M_g) \cap \mathcal{K}_g$ respectively.
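Putting the pieces together, one forward-Euler step of the overall scheme might look as follows. This is a sketch under the assumptions of this section; the indicator and projection logic of (4.17), (4.18) is omitted for brevity (see the projection sketch given earlier), and the names are ours.

    import numpy as np

    def control_step(x, x_m, r, theta_f, theta_g, zeta, eta, u_s,
                     a_m, b_m, gamma1, gamma2, dt):
        """One Euler step of the scheme (4.8), (4.17), (4.18).

        zeta, eta : Gaussian basis function outputs evaluated at x.
        """
        e = x_m - x                                       # tracking error
        u_c = (-a_m * x + b_m * r - theta_f @ zeta) / (theta_g @ eta)
        u = u_c + u_s                                     # composite law (4.8)
        theta_f = theta_f - dt * gamma1 * e * zeta        # adaptive law (4.17)
        theta_g = theta_g - dt * gamma2 * e * u_c * eta   # adaptive law (4.18)
        x_m = x_m + dt * (-a_m * x_m + b_m * r)           # reference model (4.2)
        return u, theta_f, theta_g, x_m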
Theorem 5 Consider the first order system (4.1) with the composite control law (4.8), (4.14) and the adaptive laws (4.17), (4.18). The overall control scheme guarantees the following properties:

• $e, x, u, \phi_1, \phi_2 \in L_\infty$.
• there exist constants $k_1$, $k_2$ such that
\[
\int_0^t |e(\tau)|^2 \, d\tau \le k_1 + k_2 \int_0^t |\nu(\tau)|^2 \, d\tau
\]

Proof: Consider the Lyapunov function candidate
\[
V(e, \phi_1, \phi_2) = \frac{1}{2} e^2 + \frac{1}{2\gamma_1} \phi_1^T \phi_1 + \frac{1}{2\gamma_2} \phi_2^T \phi_2 \tag{4.19}
\]
It can be readily verified that the time derivative of $V$ satisfies
\[
\dot{V} = -a_m e^2 - e\nu - I_{u_s} g(x) q(t) |e| + I_f \Psi_f(e, \zeta, \theta_f) + I_g \Psi_g(e, \eta, u_c, \theta_g) \tag{4.20}
\]
where $I_f \Psi_f$ and $I_g \Psi_g$ are additional terms that enter (4.20) due to the projection algorithm modification in the adaptive laws. Using the same procedure as in Lemma 1, it can be shown that these terms are non-positive. Furthermore, since $g(x)$ and $q(t)$ are positive, the term $-I_{u_s} g(x) q(t) |e|$ is also non-positive. Therefore (4.20) becomes
\[
\dot{V} \le -a_m e^2 - e\nu \tag{4.21}
\]
The projection modification in the adaptive laws ensures that $|\theta_f| \le M_f$, $|\theta_g| \le M_g$. Therefore $\phi_1, \phi_2 \in L_\infty$. Also, the sliding mode control term $u_s$ in (4.8) guarantees that $|x(t)| \le M_x$ for all $t \ge 0$. This, together with the fact that $\theta_g(t) \in \mathcal{K}_g$ for $t \ge 0$, implies that $u_c \in L_\infty$. Therefore there exists a finite constant $\nu_0$ such that $\sup_{t \ge 0} |\nu(t)| \le \nu_0$. Hence from (4.21) we obtain that $e \in L_\infty$. Thus all the signals in the adaptive control scheme remain bounded. The second part of the theorem follows directly from (4.21) along the same lines as in Theorem 2. □
Remark 4.1: Whether or not the tracking error converges to zero depends on how large the modelling error is. In the ideal case that $\nu(t) \equiv 0$, then $e \in L_2$ and hence, using the fact that $\dot{e} \in L_\infty$, we obtain (using Barbalat's Lemma) that $\lim_{t\to\infty} e(t) = 0$. If $\nu(t)$ is not identically zero but is square integrable (i.e., $\nu \in L_2$), then from the second part of Theorem 5 we have that $e \in L_2$ and hence it is concluded again that $e(t)$ converges to zero.
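The precise fact being invoked here is the standard corollary of Barbalat's Lemma (see, e.g., [2]): if $e \in L_2$ and $\dot{e} \in L_\infty$, then $\lim_{t\to\infty} e(t) = 0$.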
4.3 Extension to nth Order Systems

The results obtained in Section 4.2 for the first order system (4.1) can be extended in a straightforward manner to nth order systems of the form
\[
\begin{aligned}
\dot{x}_1 &= x_2 \\
\dot{x}_2 &= x_3 \\
&\ \vdots \\
\dot{x}_n &= f(x_1, x_2, \ldots, x_n) + g(x_1, x_2, \ldots, x_n) u \\
y &= x_1
\end{aligned} \tag{4.22}
\]
or equivalently of the form
\[
x^{(n)} - f(x, \dot{x}, \ddot{x}, \ldots, x^{(n-1)}) = g(x, \dot{x}, \ddot{x}, \ldots, x^{(n-1)}) u, \qquad y = x \tag{4.23}
\]
We will refer to systems of the form (4.22), or equivalently (4.23), as being in canonical form. In the spirit of the feedback linearization literature (see [5, 6]), these systems are in normal form and have no zero dynamics.

We consider the case where the control objective is to force $y$ to follow the output $y_m$ of the reference model
\[
\dot{x}_m = A x_m + b r, \qquad y_m = c^T x_m \tag{4.24}
\]
where $A$ is a Hurwitz $n \times n$ matrix. In order to have a well-posed problem, it is assumed that the relative degree of the reference model is equal to $n$.⁶ This implies [57]
\[
c^T b = c^T A b = \cdots = c^T A^{n-2} b = 0 \tag{4.25}
\]
If we let $X = [x_1, x_2, \ldots, x_n]^T$, then using (4.22), (4.25), it can be readily verified that the tracking error $e = y_m - y$ satisfies
\[
e^{(n)} = y_m^{(n)} - y^{(n)} = c^T A^n x_m + c^T A^{n-1} b r - f(X) - g(X) u \tag{4.26}
\]
Let $k = [k_n, k_{n-1}, \ldots, k_1]^T$, where the constants $\{k_i : i = 1, \ldots, n\}$ are such that all the roots of the polynomial $h(s) = s^n + k_1 s^{n-1} + \cdots + k_n$ are in the open left half plane. Also let $\varepsilon := \left[ e, \dot{e}, \ldots, e^{(n-1)} \right]^T$. Using (4.25), each element of the vector $\varepsilon$ can be measured according to
\[
e^{(i)} = c^T A^i x_m - x_{i+1}, \qquad i = 0, 1, \ldots, n-1
\]
First suppose that the functions $f$ and $g$ are known and $g(X) \neq 0$ for $X \in \mathcal{N}_c$, where $\mathcal{N}_c \subset \mathbb{R}^n$ is the controllability region. The control law
\[
u = \frac{1}{g(X)} \left[ -f(X) + c^T A^n x_m + c^T A^{n-1} b r + k^T \varepsilon \right] \tag{4.27}
\]
applied to the error system (4.26) results in
\[
e^{(n)} = -k^T \varepsilon
\]
or equivalently
\[
h(s)[e] = 0
\]
This implies $\lim_{t\to\infty} e(t) = 0$, which achieves the control objective.

⁶ The relative degree of the reference model should be equal to $n$ because the relative degree of the plant (4.22) is equal to $n$. For details concerning the concepts of zero dynamics and relative degree of nonlinear systems, the reader is referred to [5].
Since $f$ and $g$ are unknown, we replace these functions in the control law (4.27) by neural network models with outputs $\hat{f}(X, \theta_f)$ and $\hat{g}(X, \theta_g)$ respectively. Therefore the certainty equivalence controller $u = u_c$ is given by
\[
u_c = \frac{1}{\hat{g}(X, \theta_g)} \left[ -\hat{f}(X, \theta_f) + c^T A^n x_m + c^T A^{n-1} b r + k^T \varepsilon \right] \tag{4.28}
\]
By applying the control law (4.28) in (4.26), we obtain after some straightforward manipulation of terms the error equation
\[
e^{(n)} = -k^T \varepsilon + \left[ \hat{f}(X, \theta_f) - f(X) \right] + \left[ \hat{g}(X, \theta_g) - g(X) \right] u_c \tag{4.29}
\]
or
\[
h(s)[e] = \left[ \hat{f}(X, \theta_f) - f(X) \right] + \left[ \hat{g}(X, \theta_g) - g(X) \right] u_c
\]
Equation (4.29) is now rewritten as
\[
\dot{\varepsilon} = \Lambda_c \varepsilon + b_c \left[ \left( \hat{f}(X, \theta_f) - f(X) \right) + \left( \hat{g}(X, \theta_g) - g(X) \right) u_c \right] \tag{4.30}
\]
where $\det(sI - \Lambda_c) = h(s)$, $b_c := [\, 0 \; \cdots \; 0 \; 1 \,]^T$ and $(\Lambda_c, b_c)$ is in controllable canonical form [57]. Once expressed in the form (4.30), it is clear that we have the same error equation as (4.5), with the exception that in (4.5) the error $e$ was a scalar, while in (4.30) the error $\varepsilon$ is a vector. Therefore, based on (4.30), one can design control and adaptive laws using the same procedure as in Section 4.2. The stability analysis follows along the same lines as that of Section 4.2 and is therefore omitted.
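As an illustration of how the vector $\varepsilon$ is assembled from measurable quantities, the following sketch computes the known-dynamics control law (4.27); it assumes $f$ and $g$ are supplied as callables (in the certainty equivalence case (4.28) they would be replaced by the network outputs), and the names are ours.

    import numpy as np

    def control_nth_order(x, x_m, r, f, g, A, b, c, k):
        """Control law (4.27) for the canonical-form system (4.22).

        x   : plant state [x_1, ..., x_n];  x_m : reference model state
        f,g : callables f(X), g(X);  k = [k_n, ..., k_1]
        """
        n = len(x)
        # e^(i) = c^T A^i x_m - x_{i+1},  i = 0, ..., n-1
        eps = np.array([c @ np.linalg.matrix_power(A, i) @ x_m - x[i]
                        for i in range(n)])
        u = (-f(x) + c @ np.linalg.matrix_power(A, n) @ x_m
             + c @ np.linalg.matrix_power(A, n - 1) @ b * r + k @ eps) / g(x)
        return u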
Remark 4.2: The control procedure of this section was applied to a relatively simple class of nonlinear systems. Although similar techniques based on the certainty equivalence principle can be applied to a wider class of systems, the analysis of such control schemes becomes considerably more complicated. These difficulties are also present (to a lesser degree) in adaptive nonlinear control, where two types of control schemes have been considered [10]: uncertainty-constrained schemes impose restrictions (matching conditions) on the location of the unknown parameters, while nonlinearity-constrained schemes impose restrictions on the type of nonlinearities in the original system. We believe that in the future similar conditions will also be developed for systems with unknown nonlinearities, in an attempt to enlarge the class of systems to which stable neural network techniques can be applied.
5 Concluding Remarks

The principal contribution of this paper is the synthesis and stability analysis of neural network based schemes for identifying and controlling dynamical systems with unknown nonlinearities. Two wide classes of neural network architectures have been considered: 1) multilayer neural networks with sigmoidal type activation functions, and 2) RBF networks. Although these two classes of networks are evidently constructed differently, we have examined them in a common framework, as approximators of the unknown nonlinearities of the system. Our results are intended to complement the numerous empirical results concerning learning and control that have appeared in the recent neural network literature. Furthermore, we have attempted to unify the learning techniques used by connectionists and adaptive control theorists. Bridging the gap between the two areas would certainly be beneficial to both fields.
APPENDIX A: Proof of Theorem 1

By Assumptions (S1), (S2), there exists a unique solution $x(t)$ to the differential equation (2.1) (with the initial condition $x(0) = x_0$) that satisfies $|x(t) - x_0| \le k$ for all $t \in [0, T]$, where $k$ is some positive constant. Let $\mathcal{K}$ be the compact set defined as
\[
\mathcal{K} := \left\{ (x, u) \in \mathbb{R}^{n+m} : |x - x_0| \le k + \epsilon, \; u \in \mathcal{U} \right\}
\]
By Assumption (N2), $\hat{g}$, $\hat{h}$ are continuous functions and therefore they satisfy a Lipschitz condition on the compact domain $\mathcal{K}$; i.e., there exist constants $l_g$, $l_h$ such that for all $(x_1, u), (x_2, u) \in \mathcal{K}$
\[
|\hat{g}(x_1, u, \theta_g) - \hat{g}(x_2, u, \theta_g)| \le l_g |x_1 - x_2| \tag{A.1}
\]
\[
|\hat{h}(x_1, u, \theta_h) - \hat{h}(x_2, u, \theta_h)| \le l_h |x_1 - x_2| \tag{A.2}
\]
If we let $e_x := x - \hat{x}$, then from (2.1) and (2.3) we obtain
\[
\dot{e}_x = A e_x + g(x, u) - \hat{g}(\hat{x}, u, \theta_g), \qquad e_x(0) = 0 \tag{A.3}
\]
Based on Assumption (N1), the weight set $\theta_g$ in Equation (A.3) is chosen such that
\[
\max_{(x,u) \in \mathcal{K}} \left| g(x, u) - \hat{g}(x, u, \theta_g) \right| \le \epsilon_g \tag{A.4}
\]
where $\epsilon_g > 0$ is a constant to be chosen later. The solution to the differential equation (A.3) can be expressed as
\[
\begin{aligned}
e_x(t) &= \int_0^t e^{A(t-\tau)} \left[ g(x(\tau), u(\tau)) - \hat{g}(\hat{x}(\tau), u(\tau), \theta_g) \right] d\tau \\
&= \int_0^t e^{A(t-\tau)} \left[ g(x(\tau), u(\tau)) - \hat{g}(x(\tau), u(\tau), \theta_g) \right] d\tau \\
&\quad + \int_0^t e^{A(t-\tau)} \left[ \hat{g}(x(\tau), u(\tau), \theta_g) - \hat{g}(\hat{x}(\tau), u(\tau), \theta_g) \right] d\tau
\end{aligned} \tag{A.5}
\]
Since $A$ is a Hurwitz matrix, there exist positive constants $c$, $\lambda$ (that depend on $A$) such that $\|e^{At}\| \le c e^{-\lambda t}$ for all $t \ge 0$. Based on the constants $c$, $\lambda$, $l_g$, $l_h$, $\epsilon$, let $\epsilon_g$ in (A.4) be chosen as
\[
\epsilon_g := \frac{\epsilon \lambda}{2 c l_h} e^{-c l_g / \lambda} > 0 \tag{A.6}
\]
First consider the case that $|\hat{x}(t) - x_0| < k + \epsilon$ for all $t \in [0, T]$, which implies that $(\hat{x}, u) \in \mathcal{K}$ in that interval. In this case, starting from (A.5), taking norms on both sides and using (A.1), (A.4), (A.6), the following inequalities hold for $t \in [0, T]$:
\[
\begin{aligned}
|e_x(t)| &\le \int_0^t \|e^{A(t-\tau)}\| \left| g(x(\tau), u(\tau)) - \hat{g}(x(\tau), u(\tau), \theta_g) \right| d\tau \\
&\quad + \int_0^t \|e^{A(t-\tau)}\| \left| \hat{g}(x(\tau), u(\tau), \theta_g) - \hat{g}(\hat{x}(\tau), u(\tau), \theta_g) \right| d\tau \\
&\le \int_0^t c e^{-\lambda(t-\tau)} \frac{\epsilon \lambda}{2 c l_h} e^{-c l_g / \lambda} d\tau + \int_0^t c e^{-\lambda(t-\tau)} l_g |e_x(\tau)| d\tau \\
&\le \frac{\epsilon}{2 l_h} e^{-c l_g / \lambda} + c l_g \int_0^t e^{-\lambda(t-\tau)} |e_x(\tau)| d\tau
\end{aligned}
\]
Therefore, by using the Bellman-Gronwall Lemma [45] we obtain
\[
|e_x(t)| \le \frac{\epsilon}{2 l_h} e^{-c l_g / \lambda} \, e^{c l_g \int_0^t e^{-\lambda(t-\tau)} d\tau} \le \frac{\epsilon}{2 l_h} \tag{A.7}
\]
Note that from (A.2) it can be assumed without loss of generality that $l_h \ge 1$, and therefore (A.7) implies that $|\hat{x}(t) - x_0| \le k + \epsilon/2$. Now suppose (for the sake of contradiction) that $(\hat{x}, u)$ does not belong to $\mathcal{K}$ for all $t \in [0, T]$. Then, by the continuity of $\hat{x}(t)$, there exists a $T^*$, where $0 < T^* < T$, such that $|\hat{x}(T^*) - x_0| = k + \epsilon$. Therefore, if we carry out the same analysis for $t \in [0, T^*]$ we obtain that in this interval $|\hat{x}(t) - x_0| \le k + \epsilon/2$, which is clearly a contradiction. Hence (A.7) holds for all $t \in [0, T]$.

Now consider the difference in outputs. By taking norms we obtain
\[
|y(t) - \hat{y}(t)| = |h(x, u) - \hat{h}(\hat{x}, u, \theta_h)| \le |h(x, u) - \hat{h}(x, u, \theta_h)| + |\hat{h}(x, u, \theta_h) - \hat{h}(\hat{x}, u, \theta_h)|
\]
Again by Assumption (N1) we can choose $\theta_h$ such that
\[
\max_{(x,u) \in \mathcal{K}} \left| h(x, u) - \hat{h}(x, u, \theta_h) \right| \le \frac{\epsilon}{2}
\]
Therefore, using (A.2) and (A.7), we obtain
\[
|y(t) - \hat{y}(t)| \le \frac{\epsilon}{2} + l_h |e_x(t)| \le \epsilon \qquad \forall t \in [0, T]
\]
which concludes the proof. □

APPENDIX B: Proof of Lemma 1

We will prove here only part (i), since the proof of part (ii) follows from the same reasoning. Suppose $\|W_1\|_F = M_1$ and $e_x^T W_1 \zeta_1 < 0$; if that is not the case, then $I_1 = 0$ and the inequality holds trivially. The term $\mathrm{tr}\{W_1^T \Phi_1\}$ can be expressed as
\[
\begin{aligned}
\mathrm{tr}\{W_1^T \Phi_1\} &= \mathrm{tr}\{(\Phi_1 + W_1^*)^T \Phi_1\}
= \|\Phi_1\|_F^2 + \mathrm{tr}\{W_1^{*T} \Phi_1\} \\
&= \frac{1}{2} \|\Phi_1\|_F^2 + \frac{1}{2} \|W_1\|_F^2 - \frac{1}{2} \|W_1^*\|_F^2
\end{aligned}
\]
where the last step uses $\|W_1\|_F^2 = \|\Phi_1 + W_1^*\|_F^2 = \|\Phi_1\|_F^2 + 2\,\mathrm{tr}\{W_1^{*T} \Phi_1\} + \|W_1^*\|_F^2$. Therefore
\[
I_1 \frac{e_x^T W_1 \zeta_1}{\|W_1\|_F^2} \, \mathrm{tr}\{W_1^T \Phi_1\} = \frac{1}{2 M_1^2} \, e_x^T W_1 \zeta_1 \left( \|\Phi_1\|_F^2 + M_1^2 - \|W_1^*\|_F^2 \right) \tag{B.1}
\]
Since the optimal weights $W_1^*$ satisfy $\|W_1^*\|_F \le M_1$, the term in parentheses in (B.1) is non-negative; since $e_x^T W_1 \zeta_1 < 0$, it follows that
\[
I_1 \frac{e_x^T W_1 \zeta_1}{\|W_1\|_F^2} \, \mathrm{tr}\{W_1^T \Phi_1\} \le 0
\]
□

References
[1] G.C. Goodwin and K.S. Sin, Adaptive Filtering Prediction and Control, Englewood Cliffs, NJ, Prentice-Hall, 1984.
[2] K.S. Narendra and A.M. Annaswamy, Stable Adaptive Systems, Englewood Cliffs, NJ, Prentice-Hall, 1989.
[3] S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Englewood Cliffs, NJ, Prentice-Hall, 1989.
[4] K.J. Astrom and B. Wittenmark, Adaptive Control, Reading, MA, Addison-Wesley, 1989.
[5] A. Isidori, Nonlinear Control Systems, Berlin, Springer-Verlag, 1989.
[6] H. Nijmeijer and A.J. van der Schaft, Nonlinear Dynamical Control Systems, New York,
NY, Springer-Verlag, 1990.
[7] D.G. Taylor, P.V. Kokotovic, R. Marino, and I. Kanellakopoulos, \Adaptive regulation
of nonlinear systems with unmodelled dynamics", IEEE Trans. Aut. Control, vol. AC-34,
pp. 405-412, April 1989.
[8] S.S. Sastry and A. Isidori, \Adaptive control of linearizable systems", IEEE Trans. Aut.
Control, vol. AC-34, pp. 1123-1131, November 1989.
[9] L. Praly, G. Bastin, J.-B. Pomet and Z.P. Jiang, \Adaptive Stabilization of Nonlinear
Systems", in Foundations of Adaptive Control, P.V. Kokotovic ed., pp. 347-433, Springer-
Verlag, Berlin, 1991.
[10] P.V. Kokotovic, I. Kanellakopoulos, and A.S. Morse, \Adaptive Feedback Linearization
of Nonlinear Systems", in Foundations of Adaptive Control, P.V. Kokotovic ed., pp.
311-346, Springer-Verlag, Berlin, 1991.
[11] T.W. Miller, R.S. Sutton III, and P.J. Werbos, Eds, Neural Networks for Control, Cam-
bridge, MA, The MIT Press, 1990.
[12] P.J. Antsaklis, Editor, Special Issue on Neural Networks in Control Systems, IEEE Con-
trol Systems Magazine, vol. 10, No.3, April 1990.
[13] K.S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks", IEEE Trans. on Neural Networks, vol. 1, pp. 4-27, 1990.
[14] M.I. Jordan and D.E. Rumelhart, \Forward models: supervised learning with a distal
teacher", Occ. Paper #40, Center for Cognitive Science, M.I.T., 1990.
[15] M.S. Lan, \Adaptive control of unknown dynamical systems via neural network ap-
proach", Proc. Automatic Control Conf., pp. 910-915, 1989.
[16] F.C. Chen and H.K. Khalil, \Adaptive control of nonlinear systems using neural net-
works", Proc. IEEE Conf. on Decision and Control, pp. 1707-1712, 1990.
[17] B.E. Ydstie, \Forecasting and control using adaptive connectionist networks", Computers
chem. Engng., vol. 14, pp. 583-599, 1990.
[18] R.J. Williams and D. Zipser, \A learning algorithm for continuously running fully recur-
rent neural networks", Neural Computation, vol. 1, pp. 270-280, 1989.
[19] P. J. Werbos, \Backpropagation through time: what it does and how to do it", Proc.
IEEE, vol. 78, pp. 1550-1560, 1990.
[20] K.S. Narendra and K. Parthasarathy, \Gradient methods for the optimization of dynam-
ical systems containing neural networks", IEEE Trans. on Neural Networks, vol. 2, pp.
252-262, 1991.
[21] J.B. Cruz., Jr., Ed., System Sensitivity Analysis, (Benchmark papers in Electrical Engi-
neering and Computer Science), Stroudsburg, PA, Dowden, Hutchinson and Ross, 1973.
[22] H.P. Whitaker, J. Yamron, and A. Kezer, \Design of model-reference adaptive control
systems for aircraft", Report R-164, Instrumentation Lab., MIT, 1958.
[23] P.C. Parks, \Lyapunov redesign of model reference adaptive control systems", IEEE
Trans. Aut. Control, vol. AC-11, pp. 362-367, 1966.
[24] D.J. James, \Stability of a model reference control system", AIAA Journal, vol. 9, no.
5, 1971.
[25] I.M.Y. Mareels, B.D.O. Anderson, R.R. Bitmead, M. Bodson, and S.S. Sastry, \Revis-
iting the MIT rule for adaptive control", Proc. of the 2nd IFAC Workshop on Adaptive
Systems in Control and Signal Processing, Lund, Sweden, 1986.
[26] A.G. Barto, \Connectionist learning for control: an overview", in Neural Networks for
Control, T.W. Miller, R.S. Sutton III, and P.J. Werbos, Eds, pp. 5-58, Cambridge, MA,
The MIT Press, 1990.
[27] M.M. Polycarpou and P.A. Ioannou, \Learning and convergence analysis of neural-type
structured networks", to appear in IEEE Trans. on Neural Networks, vol. 3, no. 1, 1992.
[28] R.M. Sanner and J.-J.E. Slotine, "Gaussian networks for direct adaptive control", Proc. Autom. Control Conf., pp. 2153-2159, 1991.

[29] A. Arapostathis, B. Jakubczyk, H.G. Lee, S.I. Marcus, and E.D. Sontag, "The effect of
sampling on linear equivalence and feedback linearization", Systems & Control Letters,
vol. 13, pp. 373-381, 1989.
[30] D.E. Rumelhart, J.L. McClelland and the PDP Research group, Parallel Distributed
Processing: Exploration in the Microstructure of Cognition. Volume 1: Foundations,
Cambridge, MA, The MIT Press, 1986.
[31] J. Hertz, A. Krogh, and R.G. Palmer, Introduction to the Theory of Neural Computation,
Addison-Wesley Publ. Co., 1991.
[32] J. Moody and C.J. Darken, \Fast learning in networks of locally-tuned processing units",
Neural Computation, vol. 1, pp. 281-294, 1989.
[33] S. Renals and R. Rohwer, "Phoneme classification experiments using radial basis func-
tions", Proc. Intern. Joint Conf. on Neural Networks, vol. 1, pp. 461-467, 1989.
[34] M. Niranjan and F. Fallside, \Neural networks and radial basis functions for classifying
static speech patterns", Computer Speech and Language, vol. 4, pp. 275-289, 1990.
[35] T. Poggio and F. Girosi \A theory of networks for approximation and learning", Tech.
Rep. Artif. Intel. Lab, Memo No. 1140, M.I.T., 1989.
[36] T. Poggio and F. Girosi \Extensions of a theory of networks for approximation and
learning: dimensionality reduction and clustering", Tech. Rep. Artif. Intel. Lab, Memo
No. 1167, M.I.T., 1990.
[37] T. Poggio and F. Girosi \Regularization algorithms for learning that are equivalent to
multilayer networks", Science, vol. 247, pp. 978-982, 1990.
[38] G. Cybenko, \Approximation by superpositions of a sigmoidal function", Mathematics
of Control, Signals, and Systems, vol. 2, pp. 303-314, 1989.
[39] K. Hornik, M. Stinchcombe, and H. White, \Multilayer feedforward networks are uni-
versal approximators", Neural Networks, vol. 2, pp. 359-366, 1989.
[40] E.J. Hartman, J.D. Keeler, and J.M. Kowalski, "Layered neural networks with Gaussian
hidden units as universal approximations", Neural Computation, vol. 2, pp. 210-215,
1990.
[41] K. Funahashi, \On the approximate realization of continuous mappings by neural net-
works", Neural Networks, vol. 2, pp.183-192, 1989.
[42] E.D. Sontag, \Feedback stabilization using two-hidden-layer nets", Proc. Autom. Control
Conf. pp. 815-820, 1991.
[43] D.S. Broomhead and D. Lowe, \Multivariable functional interpolation and adaptive net-
works", Complex Systems, vol. 2, pp. 321-355, 1988.
[44] J. Park and I.W. Sandberg, \Universal approximation using radial-basis-function net-
works", Neural Computation, vol. 3, pp. 246-257, 1990.
[45] J.K. Hale, Ordinary Differential Equations, New York, NY, Wiley-InterScience, 1969.
[46] G.H. Golub and C.F. Van Loan, Matrix Computations, The John Hopkins Univ. Press,
2nd Edition, 1989.
[47] T. Holcomb and M. Morari, \Local training of radial basis function networks: towards
solving the hidden unit problem", Proc. Autom. Control Conf., pp. 2331-2336, 1991.
[48] P. Kudva and K.S. Narendra, \Synthesis of an adaptive observer using Lyapunov's direct
method", Inter. Journal of Control, vol. 18, pp. 1201-1210, 1973.
[49] P.A. Ioannou and P.V. Kokotovic, Adaptive Systems with Reduced Models, Springer-
Verlag, New York, NY, 1983.
[50] G.C. Goodwin and D.Q. Mayne, \A parameter estimation perspective of continuous time
model reference adaptive control", Automatica, vol. 23, pp. 57-70, Jan. 1987.
[51] M.M. Polycarpou and P.A. Ioannou, \On the existence and uniqueness of solutions in
adaptive control systems", Tech. Rep. No. 90-05-01, Dept. Elec. Eng. - Systems, Univ.
of Southern Cal., 1990.
[52] P.A. Ioannou and K.S. Tsakalis, \A robust direct adaptive controller", IEEE Trans. Aut.
Control, vol. AC-31, pp. 1033-1043, Nov. 1986.
[53] K.S. Narendra and A.M. Annaswamy, \A new adaptive law for robust adaptation with
persistent excitation", IEEE Trans. on Aut. Control, vol. AC-32, no. 2, pp. 134-145,
1987.
[54] P.A. Ioannou and A. Datta, \Robust Adaptive Control: Design, Analysis and Robustness
Bounds", in Foundations of Adaptive Control, P.V. Kokotovic ed., pp. 71-152, Springer-
Verlag, Berlin, 1991.
[55] D.G. Luenberger, Linear and Nonlinear programming, Addison-Wesley Publ. Co., 1984.
[56] P.V. Kokotovic, I. Kanellakopoulos, \Adaptive nonlinear control: a critical appraisal",
Proc. 6th Yale Workshop on Adaptive and Learning Systems, New Haven, CT, 1990.
[57] T. Kailath, Linear Systems, Englewood Cli s, NJ, Prentice-Hall, 1980.
[58] P.A. Ioannou, F. Giri, and F. Ahmed-Zaid, \Stable indirect adaptive control: the
switched-excitation approach", Tech. Rep. No. 91-05-02, Dept. Elec. Eng. - Systems,
Univ. of Southern Cal., 1991.
[59] V.I. Utkin, Sliding Modes and their Application to Variable Structure Systems, Moscow,
MIR Publishers, 1978.
[60] J.-J.E. Slotine and W. Li, Applied Nonlinear Control, Englewood Cli s, NJ, Prentice-
Hall, 1991.
[61] K.D. Young, P.V. Kokotovic, and V.I. Utkin, \A singular perturbation analysis of high-
gain feedback systems", IEEE Trans. on Autom. Control, vol. AC-22, no. 6, pp. 931-938,
1977.
