Identification of Nonlinear Dynamic Systems - Classical Methods Versus RBF Networks - Nelles & Isermann ACC 1995
1. Introduction

In recent years the capabilities of neural networks for system identification have been studied by many authors. Most papers concentrate their analysis on the performance of one kind of neural network. Some others compare different kinds of neural networks with their advantages and disadvantages. But neural networks and classical methods have hardly been compared.

The term "classical methods" refers to the widely known and applied parameter estimation methods that are based on Volterra kernels. Usually the simple Hammerstein, Wiener or combined models are applied. These models assume a special structure of the system, e.g. a nonlinear static function separated from the linear dynamic system. Also the more general NDE models have become quite popular, since they suit many models that arise directly or by approximation from physical laws.

Neural networks, on the other hand, are not structured in this sense and try to learn any given nonlinear mapping. Most types of neural networks are static. However, a dynamic system can be modeled in discrete time by feeding previous process inputs and outputs into the network. Thus the following nonlinear mapping f(.) can be learned:

    y(k) = f(u(k-1), u(k-2), ..., u(k-nu),
             y(k-1), y(k-2), ..., y(k-ny))                    (1)

Thus information about the dynamic orders nu and ny of the process is required. For linear systems f(.) describes a hyperplane, and only the slopes in each direction, -a_i and b_i, have to be identified to obtain a complete model description. For nonlinear systems, however, the function f(.) can be of arbitrary shape, and the problem is to approximate it. Thus in nonlinear system identification we have to deal with both estimation and approximation errors.

The following two sections give an introduction to the examined classical and neural methods. In section 4 the designed test processes and the excitation signal are described. Section 5 discusses the results and section 6 ends with conclusions.

2. Classical Methods

The parametric Volterra model can describe any nonlinear process of the following form [3]:

    L1(D){y(t)} + L2(D){y^2(t)} + ... + Lm(D){y^m(t)} = F{u(t)}          (3)

where the Li(D) are differential operators with D = d/dt and F(.) is a linear combination of a finite number of products and powers of the variable u(t) and its derivatives. The Hammerstein and Wiener models can be considered as special cases of (3), while the NDE (nonlinear differential equation) model is as general as the parametric Volterra model.

The parametric Volterra model, the generalized Hammerstein and Wiener models and the NDE model are shown in Fig. 1. The series connection of a static nonlinearity and a linear dynamic system is called the simple Hammerstein model. The series connection of a linear dynamic system and a static nonlinearity is called the simple Wiener model. For system orders higher than one they are special cases of the corresponding generalized models; otherwise they are identical.

The Hammerstein and NDE models are linear in their parameters, but the Wiener model is not. Thus for identification with the simple Wiener model, first the static nonlinearity and its inverse are approximated from a quasi-static test data set with a polynomial of order p.
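This preprocessing step can be sketched as follows; the quasi-static data and the tanh nonlinearity are illustrative stand-ins, not the paper's actual test processes:

```python
import numpy as np

# Fit the static nonlinearity and its inverse with polynomials of
# order p from quasi-static data (stand-in data; in the real
# procedure these are measurements taken after the dynamics
# have settled).
p = 3
u_static = np.linspace(-2.0, 2.0, 41)
y_static = np.tanh(u_static)                  # stand-in nonlinearity

coef_fwd = np.polyfit(u_static, y_static, p)  # y ~ g(u)
coef_inv = np.polyfit(y_static, u_static, p)  # u ~ g^(-1)(y)

# Applying the fitted inverse to the measured output linearizes the
# output side, so the remaining dynamics can be estimated with
# linear least-squares methods.
x_hat = np.polyval(coef_inv, y_static)
```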
3. RBF Networks

The output of a radial basis function network (Fig. 2) with Gaussian basis functions is

    y(x) = sum_{i=1..N} w_i * exp( -1/2 * sum_{j=1..n} (x_j - c_ij)^2 / sigma_ij^2 )          (4)
where c_ij is the j-th component of the i-th centre, sigma_ij is the j-th component of the i-th standard deviation, N is the number of RBFs and n = nu + ny is the number of dimensions. Equation (4) differs from the one mainly used in most standard papers in the possibility to choose different widths in each dimension.

Fig. 1: b) NDE model; c) generalized Hammerstein model

This paper only handles the case of fixed centres c_ij and standard deviations sigma_ij. Thus learning the weights w_i is a linear optimization problem, because the error is linear in the parameters w_i. The following approach has been taken. First the minimum and maximum values of the network inputs, naturally given e.g. by actuator boundaries, are determined or estimated. Then the RBF centres are positioned on a regular lattice (see Fig. 3), with the number of RBFs in each dimension as a free
parameter. After that the standard deviations are determined by

    sigma_ij = k_j * (distance of two neighbouring RBFs in dimension j)          (5)

with the k_j as free parameters. So all N RBFs have the same set of standard deviations, and usually all k_j are chosen equal too. If different standard deviations are chosen in each dimension, the Gaussians are elliptic and not radial any more.

The network inputs are x1 = u(k-1) and x2 = y(k-1), and y = y(k) has to be predicted by the network. In this paper the weights have been optimized by directly solving the linear equation system. Due to the bad numerical condition when using larger values for the standard deviations, the pseudo-inverse was evaluated by singular value decomposition.

4. Test Processes and Excitation

Five representative processes out of many examined will
be considered in this paper. The block diagrams of these processes are shown in Fig. 4. The first process is of Hammerstein type, the second one is of Wiener type, the third process suits an NDE model, and the fourth and fifth processes do not match any of these structures. It was expected that the RBF network would outperform the classical methods with processes 4 and 5, while the classical models with the appropriate structure would lead to the best results for processes 1, 2 and 3. In practice, however, usually no a priori knowledge about the structure is available. Special models are often merely assumed for simplicity, and one cannot expect the process to match all underlying assumptions exactly.
Fig. 2: Radial basis function network with two input
nodes, N Gaussians and one output node
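Putting the pieces of section 3 together — regressors arranged as in (1), Gaussian basis functions (4), lattice centres, widths from (5), and weights from an SVD-based pseudo-inverse — a minimal sketch looks as follows. The first-order toy process, the sizes and the k_j value are illustrative stand-ins, not the paper's test processes:

```python
import numpy as np

def narx_regressors(u, y):
    """Arrange the data as in (1) for nu = ny = 1:
    inputs x = [u(k-1), y(k-1)], target y(k)."""
    X = np.column_stack([u[:-1], y[:-1]])
    return X, y[1:]

def rbf_design_matrix(X, centres, sigmas):
    """Evaluate the Gaussian basis functions of (4); one column
    per RBF, with an individual width in each dimension."""
    d = (X[:, None, :] - centres[None, :, :]) / sigmas[None, :, :]
    return np.exp(-0.5 * np.sum(d ** 2, axis=2))

# toy first-order process as stand-in identification data
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 201)
y = np.zeros(201)
for k in range(1, 201):
    y[k] = 0.5 * y[k - 1] + np.tanh(u[k - 1])
X, t = narx_regressors(u, y)

# centres on a regular lattice over the input range (cf. Fig. 3)
n_per_dim = 5                    # number of RBFs per dimension (free)
gu = np.linspace(u.min(), u.max(), n_per_dim)
gy = np.linspace(y.min(), y.max(), n_per_dim)
cu, cy = np.meshgrid(gu, gy)
centres = np.column_stack([cu.ravel(), cy.ravel()])   # N = 25

# widths from (5): k_j times the lattice spacing in each dimension
k_j = 1.0
sigmas = np.tile([k_j * (gu[1] - gu[0]), k_j * (gy[1] - gy[0])],
                 (len(centres), 1))

# weights by linear least squares; np.linalg.pinv computes the
# pseudo-inverse via SVD, which tolerates bad conditioning
Phi = rbf_design_matrix(X, centres, sigmas)
w = np.linalg.pinv(Phi) @ t
y_hat = Phi @ w                  # one-step-ahead prediction
```

Larger k_j gives smoother interpolation between the lattice points but worsens the conditioning and the noise sensitivity, which is why the SVD-based pseudo-inverse is used.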
(Figure: output signals over Time [s]; traces include the Wiener model)
Process 4: The RBF network led to by far the best results. The Hammerstein model was worse (10 times larger error), but still quite good. The Wiener and NDE models failed.

Process 5: The RBF network performed best, but the Hammerstein model did only slightly worse for large input amplitudes. The Wiener and NDE models failed.

Roughly speaking, the expected results have been confirmed. The RBF network performed very well in all cases. If the assumed structure was different from the actually existing one, the parametric models failed in most cases. However, there are some very remarkable outcomes. First, the Wiener model led to worse results than the RBF network, although all made structure assumptions were valid. This is obviously due to problems with approximating the inverse static nonlinearity. Second, for large input amplitudes the Hammerstein model performed quite well with process 5, although the structure assumptions were not valid. The reason for this astonishing result is the dominating nonlinearity u^3 (the feedback gain is very small), which causes a structure close to a Hammerstein model. However, for small input amplitudes between -1 and 1 the time constant is highly dependent on the process output, and only the RBF network gives a good approximation.

To examine the sensitivity to noise, a (nearly) white noise signal was added to the process output. No major differences to the previous results have been observed. But there is one important and interesting exception. For larger variances, e.g. k_j = 3, the RBF network tends to fail with process 2 (see Fig. 7). This may be explained as follows. The network could only be trained for output values between -1.57 and 1.57 (= arctan(infinity)), and all values outside this interval are due to the added noise signal. Although interpolation between two trained areas improves with larger values of sigma, generalisation performance for areas outside decreases. This phenomenon is probably due to the very high positive and negative weights learned for large sigma-values. It is not observed for smaller sigma-values.

Fig. 7: RBF network with large sigma fails with process 2, if the output signal is spoiled with noise (plot: desired and network output over Time [s])

6. Conclusions

For identification of nonlinear dynamic systems, radial basis function networks and classical parameter estimation methods were compared. The RBF networks led to very good results for all examined processes. In most cases classical methods performed better if the structure assumptions are valid, but they failed if the assumptions are invalid.

It was pointed out that large RBF variances lead to better but more noise-sensitive results and destroy the local approximation properties. Therefore a compromise has to be found. Since the variances and centres were fixed before learning, the calculation of a pseudo-inverse could be used for weight determination. Thus the RBF network learning was very fast (about 1 minute on an average PC). Therefore both approaches are perfectly comparable.

In our further work we intend to expand the results to systems of higher order and multi-input/output processes. It must be seen clearly that RBF networks suffer from the "curse of dimensionality" and so might have problems with high-order systems. On the other hand, when using classical methods the probability of structure assumptions being valid also decreases with increasing order.

Acknowledgements

We want to thank Peter Damm, who performed many of the simulations presented above during his diploma thesis.

References

[1] He X., Asada H.: "A New Method for Identifying Orders of Input-Output Models for Nonlinear Dynamic Systems", ACC, 1993
[2] Narendra K.S., Parthasarathy K.: "Identification and Control of Dynamical Systems Using Neural Networks", IEEE Transactions on Neural Networks, Vol. 1, No. 1, March 1990
[3] Isermann R., Lachmann K.-H., Matko D.: "Adaptive Control Systems", Prentice Hall, 1992
[4] Lachmann K.-H.: "Parameteradaptive Regelalgorithmen für bestimmte Klassen nichtlinearer Prozesse mit eindeutigen Nichtlinearitäten", Fortschrittberichte der VDI Zeitschriften, Reihe 8, Nr. 66, VDI-Verlag, 1983
[5] Poggio T., Girosi F.: "A Theory of Networks for Approximation and Learning", MIT A.I. Memo No. 1140, C.B.I.P. Paper No. 31, July 1989
[6] Moody J., Darken C.J.: "Fast Learning in Networks of Locally-Tuned Processing Units", Neural Computation 1, pp. 281-294, 1989
[7] Sanner R.M., Slotine J.-J.E.: "Gaussian Networks for Direct Adaptive Control", IEEE Transactions on Neural Networks, Vol. 3, No. 6, November 1992
[8] Sanner R.M., Slotine J.-J.E.: "Stable Recursive Identification Using Radial Basis Function Networks", ACC, 1992
[9] Gorinevsky D.: "On the Persistency of Excitation in RBF Network Identification", ACC, 1994
[10] Haber R., Unbehauen H.: "Structure Identification of Nonlinear Dynamic Systems - A Survey on Input/Output Approaches", Automatica, Vol. 26, No. 4, pp. 651-677, 1990
[11] Kecman V., Reiffer B.-M.: "Exploiting the Structural Equivalence of Learning Fuzzy Systems and Radial Basis Function Neural Networks", EUFIT, Aachen, 1994