J. Levesley
I.J. Anderson
J.C. Mason
(Eds)
University of Huddersfield
Proceedings Published 2002
Form Approved OMB No. 0704-0188
REPORT DOCUMENTATION PAGE
Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and
maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information,
including suggestions for reducing the burden, to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson
Davis Highway, Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply
with a collection of information if it does not display a currently valid OMB control number.
PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.
1. REPORT DATE (DD-MM-YYYY)
14-10-2002
2. REPORT TYPE
Conference Proceedings
3. DATES COVERED (From - To)
16 July 2001 - 20 July 2001
4. TITLE AND SUBTITLE
Algorithms for Approximation IV (A4A4)
5a. CONTRACT NUMBER
F61775-00-WF078
5b. GRANT NUMBER
5c. PROGRAM ELEMENT NUMBER
5d. PROJECT NUMBER
6. AUTHOR(S)
Conference Committee
(Organizer, Professor John C Mason)
5e. TASK NUMBER
5f. WORK UNIT NUMBER
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)
University of Huddersfield
Queensgate
Huddersfield HD1 3DH
United Kingdom
8. PERFORMING ORGANIZATION
REPORT NUMBER
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)
EOARD
PSC 802 BOX 14
FPO 09499-0014
10. SPONSOR/MONITOR'S ACRONYM(S)
N/A
11. SPONSOR/MONITOR'S REPORT NUMBER(S)
CSP 00-5078
12. DISTRIBUTION/AVAILABILITY STATEMENT
Approved for public release; distribution is unlimited.
13. SUPPLEMENTARY NOTES
14. ABSTRACT
The Final Proceedings for Algorithms for Approximation IV (A4A4), 16 July 2001 - 20 July 2001,
a multidisciplinary conference addressing many areas of interest to the Air Force. Of primary interest are the potential
applications to Modeling and Simulation. Specifically, the topics covered fall into the following four major
areas: Algorithms, Efficiency, Software, and Applications. Each major topic is divided into subtopics as follows:
Algorithms - Approximation of Functions, Data Fitting, Geometric and Surface Modelling, Splines, Wavelets, Radial
Basis Functions, Support Vector Machines, Norms and Metrics, Errors in Data, Uncertainty Estimation,
Efficiency - Numerical Analysis, Parallel Processing,
Software - Standards, Libraries, New Routines, World Wide Web,
Applications - Metrology (Science of Measurement), Data Fusion, Neural Networks and Intelligent Systems, Spherical
Data and Geodetics, Medical Data.
15. SUBJECT TERMS
EOARD, Software, Mathematics, Intelligent Systems, Computational Mathematics
16. SECURITY CLASSIFICATION OF:
a. REPORT
UNCLAS
b. ABSTRACT
UNCLAS
c. THIS PAGE
UNCLAS
17. LIMITATION OF
ABSTRACT
UL
18. NUMBER OF PAGES
492 (plus front matter)
19a. NAME OF RESPONSIBLE PERSON
Christopher Reuter, Ph. D.
19b. TELEPHONE NUMBER (include area code)
+44(0)20 7514 4474
Standard Form 298 (Rev. 8/98)
Prescribed by ANSI Std. Z39-18
Algorithms for Approximation
IV
The proceedings of the Fourth International Symposium on Algorithms
for Approximation, held at the University of Huddersfield, July 2001.
Edited by
Jeremy Levesley
Department of Mathematics and Computer Science
University of Leicester
Leicester LE1 7RH, UK.
Iain Anderson
Analyticon Ltd
Elopak House
Rutherford Close
Meadway Technology Park
Stevenage, SG1 2EF, UK.
DISTRIBUTION STATEMENT A
Approved for Public Release
Distribution Unlimited
John C. Mason
School of Computing and Mathematics
The University of Huddersfield
Queensgate
Huddersfield, HD1 3DH, UK.
Published by The University of Huddersfield
First published in 2002 by the University of Huddersfield, Queensgate, Huddersfield HD1 3DH.
Printed in Great Britain by The Charlesworth Group, 254, Deighton Road,
Huddersfield HD2 1JJ, UK.
ISBN 186218 040 7
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.
Contents
Contributors
Preface
Chapter 1
Computer Aided Geometric Design ... 1
An automatic control point choice in algebraic numerical grid generation
C. Conti, R. Morandi, and D. Scaramelli ... 2
Shape-measure method for introducing the nearly optimal domain
A. Fakharzadeh and J. E. Rubio ... 10
Convex combination maps
M. Floater ... 18
Shape preserving interpolation by curves
T. N. T. Goodman ... 24
CAGD techniques for differentiable manifolds
A. Lin and M. Walker ... 36
Parametric shape-preserving spatial interpolation and ν-splines
C. Manni ... 44
On the q-Bernstein polynomials
H. Oruç and N. Tuncer ... 52
Uniform Powell-Sabin splines for the polygonal hole problem
J. Windmolders and P. Dierckx ... 60
Chapter 2
Differential Equations ... 69
Iterative refinement schemes for an ill-conditioned transfer equation in astrophysics
M. Ahues, F. d'Almeida, A. Largillier, O. Titaud, and P. Vasconcelos ... 70
Geometric symmetry in the symmetric Galerkin BEM
A. Aimi and M. Diligenti ... 78
The numerical simulation of the qualitative behaviour of Volterra integro-differential equations
J. T. Edwards, N. J. Ford, and J. A. Roberts ... 86
Systems of delay equations with small solutions: a numerical approach
N. J. Ford and P. M. Lumb ... 94
On an adaptive mesh algorithm with minimal distance control
K. Shanazari and K. Chen ... 102
An alternative approach for solving Maxwell equations
W. Sproessig and E. Venturino ... 110
Chapter 3
Metrology ... 121
Orthogonal distance fitting of parametric curves and surfaces
S. J. Ahn, E. Westkämper, and W. Rauh ... 122
Template matching in the ℓ1 norm
I. J. Anderson and C. Ross ... 130
A bootstrap method for mixture models and interval data in inter-comparisons
P. Ciarlini, G. Regoliosi, and F. Pavese ... 138
Efficient algorithms for structured self-calibration problems
A. Forbes ... 146
On measurement uncertainties derived from "metrological statistics"
M. Grabe ... 154
ℓ1 and ℓ∞ fitting of geometric elements
H.-P. Helfrich and D. S. Zwick ... 162
Evaluation of measurements by the method of least squares
L. Nielsen ... 170
An overview of the relationship between approximation theory and filtration
P. J. Scott, X. Q. Jiang, and L. A. Blunt ... 188
Chapter 4
Radial Basis Functions ... 197
Applications of radial basis functions: Sobolev-orthogonal functions, radial basis functions and spectral methods
M. D. Buhmann, A. Iserles, and S. P. Nørsett ... 198
Approximation with the radial basis functions of Lewitt
J. J. Green ... 212
Computing with radial basic functions the Beatson-Light way!
W. A. Light ... 220
Application of orthogonalisation procedures for Gaussian radial basis functions and Chebyshev polynomials
J. C. Mason and A. Crampton ... 236
Geometric knot selection for radial basis scattered data approximation
R. Morandi and A. Sestini ... 244
On the boundary over distance preconditioner for radial basis function interpolation
C. T. Mouat and R. K. Beatson ... 252
What are 'good' points for local interpolation by radial basis functions?
R. P. Tong, A. Crampton, and A. E. Trefethen ... 260
Chapter 5
Regression ... 269
Generalised Gauss-Markov regression
A. Forbes, P. M. Harris, and I. M. Smith ... 270
Nonparametric regression subject to a given number of local extreme values
A. Majidi and L. Davies ... 278
Model fitting using the least volume criterion
C. Tofallis ... 286
Some problems in orthogonal and non-orthogonal distance regression
G. A. Watson ... 294
Chapter 6
Splines and Wavelets ... 305
Nonlinear multiscale transformations: from synchronisation to error control
F. Arandiga and R. Donat ... 306
Splines: a new contribution to wavelet analysis
A. Z. Averbuch and V. A. Zheludev ... 314
Knot removal for tensor product splines
T. Brenna ... 322
Fixed- and free-knot univariate least-squares data approximation by polynomial splines
M. Cox, P. Harris, and P. Kenward ... 330
On the approximation power of local least squares polynomials
O. Davydov ... 346
A wavelet-based preconditioning method for dense matrices with block structure
J. M. Ford and K. Chen ... 354
Some properties of the perturbed Haar wavelets
A. L. Gonzalez and R. A. Zalik ... 362
An example concerning the Lp-stability of piecewise linear B-wavelets
P. Oja and E. Quak ... 370
How many holes can locally linearly independent refinable vector functions have?
G. Plonka ... 378
The correlation between the convergence of subdivision processes and solvability of refinement equations
V. Protasov ... 394
Accurate approximation of functions with discontinuities using low order Fourier coefficients
R. K. Wright ... 402
Chapter 7
General Approximation ... 411
Remarks on delay approximations based on feedback
A. Beghi, A. Lepschy, W. Krajewski, and U. Viaro ... 412
Point shifts in rational interpolation with optimized denominator
J.-P. Berrut and H. D. Mittelmann ... 420
An application of a mathematical blood flow model
M. Breuss, A. Meister, and B. Fischer ... 428
Zeros of the hypergeometric polynomial F(-n, b; c; z)
K. Driver and K. Jordaan ... 436
Approximation error maps
A. Gomide and J. Stolfi ... 446
Approximation by perceptron networks
V. Kurková ... 454
Eye-ball rebuilding using splines with a view to refractive surgery simulation
M. Lamard, B. Cochener, and A. Le Méhauté ... 462
A robust algorithm for least absolute deviation curve fitting
D. Lei, I. J. Anderson, and M. G. Cox ... 470
Tomographic reconstruction using Cesàro-means and Newman-Shapiro operators
U. Maier ... 478
A unified approach to fast algorithms of discrete trigonometric transforms
M. Tasche and H. Zeuner ... 486
Contributors
Invited Speakers
Martin Buhmann
Maurice Cox
Kathy Driver
Michael Floater
Tim Goodman
Will Light
Lars Nielsen
Gerlind Plonka
Tomaso Poggio
Larry Schumaker
Alistair Watson
Lehrstuhl Numerische Mathematik, Mathematisches
Institut, Justus-Liebig-University, 35392 Giessen,
Germany.
National Physical Laboratory, Teddington,
Middlesex, TW11 0LW, UK.
School of Mathematics, University of the Witwatersrand,
Private Bag 3, WITS, 2050, South Africa.
SINTEF, Applied Mathematics, P.O. Box 124,
Blindern, 0314 Oslo, NORWAY.
Department of Mathematics,
The University of Dundee, Dundee DD1 4HN, Scotland.
Department of Mathematics, The University
of Leicester, Leicester LE1 7RH, UK.
Danish Institute of Fundamental Metrology
DK-2800 Lyngby, Denmark.
Gerhard-Mercator-Universitat Duisburg
Institut für Mathematik, D-47048 Duisburg, Germany.
Massachusetts Institute of Technology
Department of Brain and Cognitive Sciences
77 Massachusetts Avenue, E25-406 Cambridge,
MA 02139-4307, USA.
Vanderbilt University, Department of Mathematics,
1326 Stevenson Center, Nashville TN 37240-0001, USA.
Department of Mathematics,
The University of Dundee, Dundee DD1 4HN, Scotland.
Contributing Speakers
S. J. Ahn
A. Aimi
F. D. d'Almeida
I. J. Anderson
F. Arandiga
A. A. Badr
R. K. Beatson
Fraunhofer Institute for Manufacturing Engineering
and Automation (IPA), 70569 Stuttgart, Germany.
Department of Mathematics, University of Parma, Italy.
University of Porto, Faculty of Engineering, 4200-468 Porto,
Portugal.
Analyticon Ltd, Elopak House, Meadway Technology Park,
Stevenage, SG1 2EF, UK.
Dept. Matematica Aplicada, University of Valencia, Spain.
Alexandria University, Dept. of Mathematics, Faculty of Science,
Alexandria, Egypt.
Dept. of Mathematics and Statistics, Univ. of Canterbury,
Christchurch, New Zealand.
E. Belinsky
J.-P. Berrut
K. Bittner
T. Brenna
M. Breuss
C. Brezinski
A. Chunovkina
P. Cross
M. P. Dainton
O. Davydov
A. Fakharzadeh
A. B. Forbes
J. M. Ford
N. Ford
D. J. Gavaghan
A. J. P. Gomes
A. Gomide
M. Grabe
P. R. Graves-Morris
J. J. Green
H.-P. Helfrich
H. O. Kim
W. Krajewski
V. Kurkova
M. Lamard
D. Lei
S. Li
J. Lippus
P. M. Lumb
T. Lyche
University of the West Indies, Dept. of Computer Science,
Maths and Physics, P O Box 64, Bridgetown, Barbados.
Dept. de Mathematiques, Universite de Fribourg, Switzerland.
University of Missouri - St Louis, Dept. of Maths
and Computing Science, St Louis, MO 63121, USA.
Dept. of Informatics, University of Oslo, Oslo, Norway.
Dept. of Mathematics, University of Hamburg, Hamburg, Germany.
University of Lille, Lille, France.
VNIIM, St Petersburg, Russia.
University College London, Dept. of Geomatic Engineering,
London WC1E 6BT, UK.
National Physical Laboratory, Teddington, Middlesex,
TW11 0LW, UK.
Universitat Giessen, Mathematisches Institut, D-35392 Giessen,
Germany.
Dept. of Mathematics, Shahid Chamran University of Ahvaz,
Ahvaz, Iran.
National Physical Laboratory, Middlesex TW11 0LW, UK.
Dept. of Mathematical Sciences, University of Liverpool,
Liverpool L69 7ZL, UK.
Chester College, Parkgate Road, Chester, CH1 4BJ, UK.
Oxford University, Computing Laboratory, Oxford, OX1 3QD, UK.
University Beira Interior, Dept. Informatica, 6201-001 Covilha,
Portugal.
Institute of Computing, University of Campinas, Brazil.
PTB, Am Hasselteich 5, 38104 Braunschweig, Germany.
University of Bradford, Dept. of Maths and Computing Science,
Bradford, BD7 1DP, UK.
University of Sheffield, Dept. of Applied Mathematics, Sheffield, UK.
Mathematisches Seminar der Landwirtschaftlichen Fakultat
der Universitat Bonn, Bonn, Germany.
KAIST, Division of Applied Mathematics, Taejon, Korea.
Systems Research Institute, Polish Academy of Sciences, Warsaw,
Poland.
Academy of Sciences of the Czech Republic, Institute of
Computer Science, PO Box 5, 182 07 Prague 8, Czech Republic.
LATIM - INSERM ERM 0102, 29609 Brest Cedex, France.
School of Computing and Mathematics, University of Huddersfield,
Huddersfield, UK.
Southeastern Louisiana University, USA.
Tallinn Tech. University, Institute of Cybernetics, 12618 Tallinn,
Estonia.
Chester College, Parkgate Road, Chester, CH1 4BJ, UK.
University of Oslo, Institute for Informatics, P O Box 1080,
Blindern, 0316 Oslo, Norway.
U. Maier
A. Majidi
C. Manni
J. C. Mason
G. W. Morgan
A. Palomares
F. Pavese
M. J. D. Powell
A. Prymak
E. Quak
M. Rogina
C. Ross
D. Scaramelli
C. Schneider
P. J. Scott
S. Serra Capizzano
A. Sestini
K. Shanazari
I. M. Smith
A. Sommariva
W. Sproessig
K. Strom
M. Tasche
C. Tofallis
R. P. Tong
L. Traversoni
N. Tuncer
M. Walker
J. Windmolders
R. K. Wright
R. A. Zalik
V. A. Zheludev
D. S. Zwick
Justus-Liebig Universitat, Mathematisches Institut, D-35392 Giessen,
Germany.
Dept. of Mathematics and Computer Science, University of Essen,
Germany.
Dept. of Mathematics, University of Torino, Italy.
School of Computing and Mathematics, University of Huddersfield,
Huddersfield, UK.
Numerical Algorithms Group, Oxford, UK.
Universidad de Granada, Facultad de Ciencias, 18071 Granada, Spain.
Istituto di Metrologia "G.Colonnetti", Torino, Italy.
University of Cambridge, DAMTP, Cambridge, CB3 9EW, UK.
National Taras Shevchenko University of Kyiv, Mech-Math Faculty,
Kyiv 01033, Ukraine.
SINTEF Applied Mathematics, P.O. Box 124 Blindern, 0314 Oslo,
Norway.
University of Zagreb, Dept. of Mathematics, 10002 Zagreb, Croatia.
School of Computing and Mathematics, University of Huddersfield,
Huddersfield, UK.
Dipartimento di Energetica, 50134 Firenze, Italy.
Johannes Gutenberg Universitat, FB 17, D-55099 Mainz, Germany.
Taylor Hobson Ltd, New Star Road, Leicester LE4 9JQ, UK.
University Insubria Como, 22100 Como, Italy.
Dipartimento di Energetica, Universita di Firenze, Italy.
Dept. of Mathematical Sciences, The University of Liverpool,
Liverpool L69 7ZL, UK.
National Physical Laboratory, Middlesex, TW11 0LW, UK.
Universita di Padova, Dipartimento di Matematica Pura e Applicata,
Padova, Italy.
Freiberg University of Mining and Technology, 09596 Freiberg, Germany.
SimSurgery, Sognsveien 75B, N-0855 Oslo, Norway.
University of Rostock, Dept. of Mathematics,
D-18051 Rostock, Germany.
University of Hertfordshire Business School, Hertford, SG13 8QF, UK.
The Numerical Algorithms Group Ltd, Oxford, OX2 8DR, UK.
Universidad Autonoma Metropolitana, D.F. Mexico CP 09340.
Dept. of Mathematics, Dokuz Eylül University, 35160 Buca Izmir,
Turkey
York University, Toronto M3J 1P3, Canada.
Dept. of Computer Sciences, Kath. University Leuven, Belgium
Dept. of Mathematics and Statistics, UVM, Burlington,
VT, 05445 USA.
Dept. of Mathematics, Auburn University, AL 36849-5310, USA.
School of Computer Science, Tel Aviv University, Israel.
Wilcox Associates, Inc. Phoenix, AZ 85310 USA.
Chairs
CAGD: T. N. T. Goodman
Data Approximation: H.-P. Helfrich
Metrology: A. B. Forbes
Neural Networks: V. Kurková
Orthogonal Polynomials and Padé Approximation: P. R. Graves-Morris
Radial Basis Functions: J. C. Mason
Radial Basis Functions and Wavelets: M. J. D. Powell
Shape Preserving Methods: C. Manni
Spline Functions: C. Brezinski
Spline Functions: T. Morton
Preface
This book contains the proceedings of the International Symposium Algorithms for Approximation Four (A4A4), held at the University of Huddersfield from July 15th to 20th, 2001, and attended by 106 people from no fewer than 32 countries. The accommodation base was the attractive University Park at Storthes Hall, where social events were centred. There was a very friendly atmosphere, helped by the presence of a significant number of younger people to balance the stalwarts. Food was excellent and the weather was generally good.
This was the fourth, after a pause of nine years, in the series of "Algorithms for Approximation" meetings held previously in Oxfordshire in 1985, 1988 and 1992. Once again it was run under the sponsorship of the US Air Force (European Office of Aerospace Research and Development), this time with grants from the London Mathematical Society and the National Physical Laboratory (NPL) (Software Support for Metrology).
The Organising Committee consisted of Iain Anderson, John Mason, David Turner (Huddersfield), Maurice Cox and Alistair Forbes (NPL), and Jeremy Levesley and Will Light (Leicester). In addition to them, the Programme Committee included Claude Brezinski (Lille), Martin Buhmann (Giessen), Tim Goodman (Dundee), Tom Lyche (Oslo), Alistair Watson (Dundee) and Larry Schumaker (Vanderbilt). In support of the committee, the Symposium Secretary, Ros Hawkins, was extremely efficient, and was helped by Karen Mitchell.
Moving to the academic programme, there were 11 invited speakers. From the UK were Maurice Cox, Tim Goodman, Alistair Watson and Will Light; from other parts of Europe were Martin Buhmann (Giessen), Michael Floater (SINTEF Oslo), Lars Nielsen (Danish Institute of Fundamental Metrology) and Gerlind Plonka (Duisburg); from the USA were Tomaso Poggio (MIT) and Larry Schumaker (Vanderbilt); and from South Africa, Kathy Driver (Witwatersrand).
In addition there were 74 submitted papers given at the meeting, of which
a good proportion were offered in Special Sessions in Metrology-Maths (run
by David Turner), Metrology-Stats (Alistair Forbes), Orthogonal Polynomials and Pade Approximation (Claude Brezinski and Peter Graves-Morris
(Bradford)), Spline Functions (Tom Lyche), Mathematical Modelling in
Medicine (Ewald Quak), Integrals and Integral Equations (Ezio Venturino
(Torino)) and Wavelets (Richard Zalik (Auburn)).
The current volume contains a substantial portion of the papers from
the conference, which were provided by the speakers, so that this is a solid
and broad contribution to the area. The book has been organised in topics
to suit the final selection of papers.
All submitted papers were refereed and significant modifications were
made to a number of papers. In general, there was a high standard of
submissions.
We cannot conclude this preface without mentioning the celebration of
three 60th birthdays of 2001 at the meeting, namely those of Claude Brezinski, Maurice Cox, and John Mason. All played major parts in the Symposium.
We must finish by offering thanks to all the staff at the University of Huddersfield, NPL, USAF-EOARD, the London Mathematical Society, and the publishers, who contributed to this most successful and memorable symposium.
Thanks also go to Jeremy Levesley and Iain Anderson and the publishers,
who worked so hard on the proceedings, and to all authors without whom
the volume would not exist.
John Mason
Huddersfield
Chapter 1
Computer Aided Geometric Design
An automatic control point choice in algebraic
numerical grid generation
C. Conti, R. Morandi, and D. Scaramelli
Dipartimento di Energetica, via C. Lombroso 6/17, 50134 Firenze, Italy
costanza@sirio.de.unifi.it, morandi@de.unifi.it, scaramel@math.unipd.it
Abstract
A strategy to construct a grid conforming to the boundaries of a prescribed domain by
using transfinite interpolation methods is discussed. A transfinite interpolation procedure
is combined with a B-spline tensor product scheme defined by using suitable control
points. Their choice is performed by taking into account a quality measure parameter
based on the condition number of matrices linked to the covariant metric tensors.
1 Introduction
The algebraic grid generation approach relies on the construction of a coordinate transformation from the computational domain into the physical domain. In particular, this can be obtained through transfinite interpolating operators, allowing the generation of grids with boundary conformity. Furthermore, using a Hermite-type transfinite interpolating scheme we can obtain orthogonal grid lines emanating from the boundary. This can be very important for practical reasons, since the grid point distribution in the immediate neighborhood of the boundaries has a strong influence on the accuracy of the numerical solution of partial differential equations [5]. Furthermore, in case a domain decomposition is necessary, the orthogonality guarantees smoother grids. In order to obtain a grid with other specified properties, e.g. the control of the shape and position of the coordinate curves, transfinite interpolating methods can be combined with tensor product schemes using suitably chosen control points (see for instance [1, 2, 6, 7, 8]). Even though this type of algebraic method is computationally efficient, a significant amount of user interaction is required to select the control points involved in the tensor product and so define workable meshes. To overcome this drawback, an automatic strategy for choosing the control points turns out to be desirable. Here, following the approach first discussed in [1], we present an algebraic Hermite-type transfinite method to construct a grid interpolating the boundary and its normal derivatives. In fact, given a "quadrilateral" domain $\Omega \subset \mathbb{R}^2$, a transformation $G : R = [0,1] \times [0,1] \to \Omega$ is defined as
$$G(s,t) := T_P(s,t) + (P_1 \oplus P_2)([\phi,\psi] - T_P)(s,t) \qquad (1.1)$$

where $T_P$ is a tensor product surface, i.e. $T_P(s,t) := \sum_{i=1}^{m}\sum_{j=1}^{n} Q_{ij} B_{i,3}(s) B_{j,3}(t)$, with $B_{i,3}$ denoting the usual cubic B-splines, $\phi$ and $\psi$ are boundary curves, and $(P_1 \oplus P_2)$ is the Boolean sum of Hermite-type blending function linear operators. The set $Q = \{Q_{ij},\ i = 1,\dots,m,\ j = 1,\dots,n\}$ is the set of control points.
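The tensor product surface $T_P$ can be sketched numerically. A minimal Python illustration, assuming a clamped uniform knot vector and Cox-de Boor evaluation (the paper does not spell out its knot construction, so this is one plausible choice):

```python
import numpy as np

def bspline(i, k, t, knots):
    """Cox-de Boor recursion: i-th B-spline of order k (degree k-1) at t."""
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    v = 0.0
    if knots[i + k - 1] > knots[i]:
        v += (t - knots[i]) / (knots[i + k - 1] - knots[i]) * bspline(i, k - 1, t, knots)
    if knots[i + k] > knots[i + 1]:
        v += (knots[i + k] - t) / (knots[i + k] - knots[i + 1]) * bspline(i + 1, k - 1, t, knots)
    return v

def T_P(Q, s, t):
    """T_P(s,t) = sum_ij Q[i,j] B_{i,3}(s) B_{j,3}(t); Q has shape (m, n, 2)."""
    m, n, _ = Q.shape
    ks = np.r_[[0.0] * 3, np.linspace(0.0, 1.0, m - 2), [1.0] * 3]  # clamped, uniform
    kt = np.r_[[0.0] * 3, np.linspace(0.0, 1.0, n - 2), [1.0] * 3]
    Bs = np.array([bspline(i, 4, s, ks) for i in range(m)])
    Bt = np.array([bspline(j, 4, t, kt) for j in range(n)])
    return np.einsum("i,j,ijk->k", Bs, Bt, Q)

# With all control points equal, partition of unity gives T_P == that point.
Q = np.tile(np.array([2.0, 5.0]), (5, 6, 1))
print(T_P(Q, 0.3, 0.7))  # -> [2. 5.]
```

The partition-of-unity check at the end is also the mechanism behind the convex-hull property the paper invokes later: each $T_P(s,t)$ is a convex combination of the control points.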
As already noted, the choice of the control points is a crucial matter. In this paper we take into account a grid quality measure parameter for their selection. In particular, the proposed automatic procedure relies on the fact that some grid properties can be described in terms of the condition number of matrices linked to the covariant metric tensors [4]. Therefore, the control points are chosen by minimizing their condition number.
The outline of this paper is as follows. In Section 2, the transformation (1.1) is given in detail and its properties are investigated. In Section 3, a way of choosing the control points is proposed, relying on a particular quality measure parameter. Finally, in Section 4 some numerical results are presented to illustrate the features of the proposed strategy.
2 The transformation
In this section the transformation (1.1) is characterized. Let us consider a "quadrilateral" domain $\Omega \subset \mathbb{R}^2$ such that $\partial\Omega = \cup_{i=1}^{4}\partial\Omega_i$, with $\partial\Omega_1, \partial\Omega_2, \partial\Omega_3, \partial\Omega_4$ being the supports of four regular curves $\gamma_i : [0,1] \to \partial\Omega_i$, $i = 1,\dots,4$, taken counterclockwise. Furthermore, let us suppose that $\partial\Omega_1 \cap \partial\Omega_3 = \emptyset$ and $\partial\Omega_2 \cap \partial\Omega_4 = \emptyset$, with any other intersection occurring only at the end points of the boundary curves. In particular, the following compatibility conditions are assumed:

$$\gamma_1(0) = \gamma_4(1), \quad \gamma_1(1) = \gamma_2(0), \quad \gamma_2(1) = \gamma_3(0), \quad \gamma_4(0) = \gamma_3(1).$$

For later convenience, we set $\phi_1(s) := \gamma_1(s)$, $\phi_2(s) := \gamma_3(1-s)$, denoting by $s$ the curve parameter running on $[0,1]$, and we set $\psi_1(t) := \gamma_4(1-t)$, $\psi_2(t) := \gamma_2(t)$, denoting by $t$ the curve parameter running on $[0,1]$. In addition, the components of the $\phi$-curves and $\psi$-curves are denoted by $\phi^x, \phi^y$ and $\psi^x, \psi^y$ respectively.
Next, we define four additional curves $\phi_{i+2}$, $\psi_{i+2}$, $i = 1,2$, by computing the derivatives of the $\phi$- and $\psi$-curves, i.e.,

$$\qquad (2.1)$$

with $C$ a constant value also depending on the curve orientations and with $\|\cdot\|_2$ the Euclidean norm. Then, we introduce the linear operators
$$P_1[\phi](s,t) := \sum_{i=1}^{4} \alpha_i(t)\,\phi_i(s), \qquad P_2[\psi](s,t) := \sum_{i=1}^{4} \alpha_i(s)\,\psi_i(t), \qquad (2.2)$$

$$P_1P_2[\phi,\psi](s,t) := \sum_{i=1}^{2}\left(\alpha_i(t)\,P_2[\psi](s,u_i) + \alpha_{i+2}(t)\,\frac{\partial P_2[\psi]}{\partial t}(s,u_i)\right),$$

where $u_1 = 0$, $u_2 = 1$. The functions $\alpha_i$, $i = 1,\dots,4$, are the dilated versions of the classical Hermite bases, with support on $[0,u]$ and on $[1-u,1]$, $0 < u \le 1$, i.e.

$$\alpha_1(s) := (1 + 2s/u)(1 - s/u)^2, \qquad \alpha_3(s) := s\,(1 - s/u)^2, \qquad s \in [0,u], \qquad (2.3)$$

with $\alpha_2(s) := \alpha_1(1-s)$ and $\alpha_4(s) := -\alpha_3(1-s)$ supported on $[1-u,1]$, and all four functions vanishing elsewhere.
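The cardinal behaviour of the Hermite blending functions in (2.3) can be checked numerically. The dilated form below is a reconstruction (the scan is unclear at this point), so treat the exact expressions as assumptions; the properties tested, value 1 at the matched endpoint and unit slope for the derivative bases, are what the transfinite scheme actually needs:

```python
# Assumed dilated cubic Hermite blending functions with support [0, u]
# (alpha_1, alpha_3) and [1 - u, 1] (alpha_2, alpha_4).
def alpha(i, s, u=0.5):
    if i == 1:  # value basis at s = 0: alpha_1(0) = 1, alpha_1'(0) = 0
        return (1 + 2 * s / u) * (1 - s / u) ** 2 if 0 <= s <= u else 0.0
    if i == 3:  # derivative basis at s = 0: alpha_3(0) = 0, alpha_3'(0) = 1
        return s * (1 - s / u) ** 2 if 0 <= s <= u else 0.0
    if i == 2:  # mirror of alpha_1 at s = 1
        return alpha(1, 1 - s, u)
    if i == 4:  # mirror of alpha_3 at s = 1 (sign flipped so alpha_4'(1) = 1)
        return -alpha(3, 1 - s, u)
    raise ValueError(i)

h = 1e-7
print(alpha(1, 0.0), alpha(2, 1.0))                 # cardinal values: 1.0 1.0
print(round((alpha(3, h) - alpha(3, 0.0)) / h, 4))  # alpha_3'(0) ~ 1.0
```

Because each basis vanishes outside an interval of width $u$, the blending, and hence the boundary's influence, is local, which is exactly the property exploited later when the interior of the grid is controlled by $T_P$ alone.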
The Boolean sum operator $(P_1 \oplus P_2) = P_1 + P_2 - P_1P_2$ provides the blending function surface

$$B(s,t) := (P_1 \oplus P_2)[\phi,\psi](s,t) = P_1[\phi](s,t) + P_2[\psi](s,t) - P_1P_2[\phi,\psi](s,t). \qquad (2.4)$$

It is known that $B$ satisfies

$$B(u_i,t) = \psi_i(t),\ i = 1,2, \qquad \frac{\partial B}{\partial s}(u_j,t) = \psi_j(t),\ j = 3,4,$$
$$B(s,w_i) = \phi_i(s),\ i = 1,2, \qquad \frac{\partial B}{\partial t}(s,w_j) = \phi_j(s),\ j = 3,4, \qquad (2.5)$$

where $u_1 = u_3 = 0$, $u_2 = u_4 = 1$ and $w_1 = w_3 = 0$, $w_2 = w_4 = 1$. It is worthwhile to remark that, as we are dealing with orthogonal grid lines emanating from the boundary of the domain, the intersecting boundary curves must also be orthogonal. Thus, the following additional conditions are assumed:
$$\phi_{i+2}(0) = \psi_1'(w_i), \quad \phi_{i+2}(1) = \psi_2'(w_i), \quad \psi_{i+2}(0) = \phi_1'(u_i),$$
$$\psi_{i+2}(1) = \phi_2'(u_i), \quad \phi_i''(0) = \psi_1''(w_i), \quad \phi_i''(1) = \psi_2''(w_i), \quad i = 1,2. \qquad (2.6)$$
Now, in order to define a suitable grid, following the approach given in [1], we use the linear transformation $G$,

$$G(s,t) := T_P(s,t) + (P_1 \oplus P_2)([\phi,\psi] - T_P)(s,t), \qquad (2.7)$$

where $T_P(s,t) := \sum_{i=1}^{m}\sum_{j=1}^{n} Q_{ij} B_{i,3}(s) B_{j,3}(t)$, with $B_{i,3}$ denoting the usual cubic B-splines with uniform knots. The set $Q = \{Q_{ij},\ i = 1,\dots,m,\ j = 1,\dots,n\}$ is a suitable set of control points whose definition is discussed in Section 3. It should be noted that in (2.7) the Boolean sum operator is also acting on the surface $T_P(s,t)$. In this case (2.2) is used taking the eight boundary curves $T_P(0,t)$, $T_P(1,t)$, $T_P(s,0)$, $T_P(s,1)$, $\frac{\partial T_P}{\partial s}(0,t)$, $\frac{\partial T_P}{\partial s}(1,t)$, $\frac{\partial T_P}{\partial t}(s,0)$, $\frac{\partial T_P}{\partial t}(s,1)$.
It is easy to show that $G$ still satisfies $G(u_i,t) = \psi_i(t)$, $i = 1,2$, $\frac{\partial G}{\partial s}(u_i,t) = \psi_i(t)$, $i = 3,4$, $G(s,w_j) = \phi_j(s)$, $j = 1,2$, and $\frac{\partial G}{\partial t}(s,w_j) = \phi_j(s)$, $j = 3,4$. Furthermore, because of the locality of the blending functions $\alpha_i$, $i = 1,\dots,4$, the control of the coordinate lines obtained by means of the evaluation of $G$ over a parameter set in the interior of the domain is mainly based on the contribution of $T_P$. This fact and the use of B-splines ensure the convex-hull property in the interior of the domain. This property is of importance in numerical grid generation to locate the grid with respect to the position of control points.
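The boundary-reproduction property of the Boolean sum construction is easy to verify numerically in the simpler bilinearly blended (Lagrange) case, which drops the derivative terms of the Hermite scheme. The quarter-annulus domain below is an assumed toy example, not one of the paper's test domains:

```python
import numpy as np

# Boundary curves of a quarter annulus (radii 1 and 2); the corner
# compatibility conditions phi1(0) = psi1(0), etc., hold by construction.
def phi1(s): a = 0.5 * np.pi * s; return np.array([np.cos(a), np.sin(a)])  # t = 0
def phi2(s): return 2.0 * phi1(s)                                          # t = 1
def psi1(t): return np.array([1.0 + t, 0.0])                               # s = 0
def psi2(t): return np.array([0.0, 1.0 + t])                               # s = 1

def coons(s, t):
    """Bilinearly blended Boolean-sum surface: P1 + P2 - P1P2."""
    P1 = (1 - t) * phi1(s) + t * phi2(s)
    P2 = (1 - s) * psi1(t) + s * psi2(t)
    P12 = ((1 - s) * (1 - t) * phi1(0) + s * (1 - t) * phi1(1)
           + (1 - s) * t * phi2(0) + s * t * phi2(1))
    return P1 + P2 - P12

# The map reproduces all four boundary curves exactly:
print(np.allclose(coons(0.3, 0.0), phi1(0.3)),
      np.allclose(coons(1.0, 0.7), psi2(0.7)))  # -> True True
```

The same cancellation argument carries over to the Hermite version (2.4): on each boundary, $P_1P_2$ reproduces the contribution of the transversal operator, so only the prescribed curve (or derivative) survives.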
3 Grid quality measure
It is well known that grid generation techniques sensitive to grid quality features are particularly attractive. Thus, in this section, we discuss a strategy to choose the set $Q$ of control points based on a suitable grid quality measure parameter.
Given a set of grid points $\mathcal{G} = \{G_{i,j}\}_{i,j=1}^{M,N}$ defining the quadrilateral cells $\{C_{i,j}\}_{i,j=1}^{M-1,N-1}$, quality measures commonly include: grid "skewness", measuring the departure of $C_{i,j}$ from a rectangle, grid "aspect ratio", measuring the departure of $C_{i,j}$ from a rhombus, or grid "conformality", measuring the departure of $C_{i,j}$ from a square (see for instance [5]).
Here, as done in [4] for the case of unstructured grids, we define a grid quality measure taking into account the condition number of particular matrices derived from the grid. As explained below, this quality parameter somehow measures the departure of $C_{i,j}$ from a square.
The strategy starts with a set $Q^i$ of control points obtained by evaluating, on a coarse parameter set $S_c = \{(s_i,t_j)\}_{i,j=1}^{\bar{M},\bar{N}}$, a Lagrange blending function surface (for details related to Lagrange blending function methods we refer, for instance, to [3]) by working only with the four boundary curves of the given domain. Then, using $Q^i$, a first grid is obtained by evaluating the surface $G$ in (2.7) on a fine parameter set $S_f = \{(s_i,t_j)\}_{i,j=1}^{M,N}$, obtaining the grid points

$$\mathcal{G} := \{G_{i,j} = (G^x_{i,j}, G^y_{i,j}) = G(s_i,t_j),\ i = 1,\dots,M,\ j = 1,\dots,N\}.$$
The set $\mathcal{G}$ is then used to define $(M-1)\times(N-1)$ two-dimensional matrices associated with the $(M-1)\times(N-1)$ quadrilateral cells $C_{i,j}$, $i = 1,\dots,M-1$, $j = 1,\dots,N-1$. These matrices are defined as

$$A_{i,j} := \begin{pmatrix} G^x_{i,j+1} - G^x_{i,j} & G^x_{i+1,j} - G^x_{i,j} \\ G^y_{i,j+1} - G^y_{i,j} & G^y_{i+1,j} - G^y_{i,j} \end{pmatrix}, \quad i = 1,\dots,M-1,\ j = 1,\dots,N-1, \qquad (3.1)$$

and their condition number $\kappa(A_{i,j})$ is related to the stretch of the cells. In fact, it is easy to prove that $\kappa(A_{i,j}) := \|A_{i,j}\|_2 \cdot \|A_{i,j}^{-1}\|_2 = 1$ if and only if we are dealing with a cell $C_{i,j}$ where the three points $G_{i,j+1}, G_{i,j}, G_{i+1,j}$ generate half a square [9]. On the other hand, in order to involve all the grid points in the quality measure it is also convenient to define the boundary matrices
to define the boundary matrices
■^i.N-\
/ /~<x
/~ix
r<x
/~fx
■~ \ nv
nv
nv
ny
((~ix
rfx
rix
rix
^M,j
'^M-l,j
^M,j /
^M,j + l
■ri.l\^_^
w„i
^M-1,N-
.— 1
f^x
rix
r^x
jrix
^y
^y
^y
_ ^y
'~^M,N
^M-1,JV
^M,N
\
i , < — i,... ,-iw
±,
\
'^M,N-1
so that the boundary points are also taken into account.
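The cell matrices and their condition numbers are straightforward to compute. In this sketch the column-vector form of the $2\times 2$ matrices is reconstructed from the surrounding text (the displayed formulas are hard to read in this copy), so treat it as an assumption:

```python
import numpy as np

def cell_condition_numbers(G):
    """Spectral condition numbers kappa(A_ij) of the 2x2 matrices whose
    columns are the cell edge vectors G[i,j+1]-G[i,j] and G[i+1,j]-G[i,j]."""
    M, N, _ = G.shape
    K = np.empty((M - 1, N - 1))
    for i in range(M - 1):
        for j in range(N - 1):
            A = np.column_stack((G[i, j + 1] - G[i, j], G[i + 1, j] - G[i, j]))
            K[i, j] = np.linalg.cond(A, 2)
    return K

x = np.linspace(0.0, 1.0, 5)
square = np.stack(np.meshgrid(x, x, indexing="ij"), axis=-1)  # uniform square grid
sheared = square.copy()
sheared[..., 0] += 0.5 * sheared[..., 1]                      # shear the x-coordinate
print(np.allclose(cell_condition_numbers(square), 1.0))       # -> True (ideal cells)
print((cell_condition_numbers(sheared) > 1.1).all())          # -> True (stretched cells)
```

This matches the characterization in the text: a cell whose two edge vectors are orthogonal and of equal length (half a square) gives $\kappa = 1$; any shear or stretch pushes $\kappa$ above 1.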
Next, we modify the initial set $Q^i$ of control points by minimizing the following objective function

$$\qquad (3.3)$$

The minimization is done with respect to the control points, under suitable constraints on their coordinates depending on the geometry of the domain $\Omega$. This is the only user interaction required.
Obviously, since ideal inner cells are characterized by an associated matrix $A_{i,j}$ having a condition number close to one, the optimal distribution of the control points should guarantee $\min_Q f_{ob} \approx 1$. On the other hand, $\min_Q f_{ob}$ strongly depends on the geometry of the domain (for example, in the case of a square domain the optimal value is $\min_Q f_{ob} = 1$, while in general this value is not reached).
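The minimization step can be sketched on a toy problem. Since the displayed form of $f_{ob}$ in (3.3) is not legible in this copy, the mean of the cell condition numbers is used below as an assumed stand-in, and a one-parameter family of grids with a crude parameter sweep replaces the paper's constrained SQP over the full control-point set $Q$:

```python
import numpy as np

def f_ob(G):
    """Assumed objective: mean condition number of the 2x2 cell matrices."""
    M, N, _ = G.shape
    ks = [np.linalg.cond(np.column_stack((G[i, j + 1] - G[i, j],
                                          G[i + 1, j] - G[i, j])), 2)
          for i in range(M - 1) for j in range(N - 1)]
    return float(np.mean(ks))

def make_grid(c, M=9, N=9):
    """Toy family of grids on the unit square: c distorts the interior
    (a hypothetical stand-in for moving the control points Q_ij)."""
    s, t = np.meshgrid(np.linspace(0, 1, M), np.linspace(0, 1, N), indexing="ij")
    bump = np.sin(np.pi * s) * np.sin(np.pi * t)  # vanishes on the boundary
    return np.stack((s + c * bump, t), axis=-1)

# Crude parameter sweep in place of the SQP routine used in the paper.
cs = np.linspace(-0.3, 0.3, 61)
best = min(cs, key=lambda c: f_ob(make_grid(c)))
print(abs(best) < 1e-6, round(f_ob(make_grid(best)), 3))  # -> True 1.0
```

As expected for a square domain, the sweep recovers the undistorted grid, where the objective attains its theoretical minimum of 1.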
Summary of the Method
(1) Compute the initial set of control points $Q^i$ by means of a Lagrange blending function method using the four given boundary curves.
(2) Compute the initial grid $\mathcal{G}^i = \{G(s_i,t_j),\ i = 1,\dots,M,\ j = 1,\dots,N\}$, with $G$ given in (2.7), by using the set of control points $Q^i$.
(3) Minimize the objective function (3.3), so defining a new set of control points $Q^f$.
(4) Compute the final grid $\mathcal{G}^f = \{G(\hat{s}_i,\hat{t}_j),\ i = 1,\dots,\hat{M},\ j = 1,\dots,\hat{N}\}$, with $G$ given in (2.7), by using the set of control points $Q^f$, with $\hat{M} \gg M$, $\hat{N} \gg N$.
Remark 3.1 We note that, in order to reduce the computational cost of the minimization procedure, the integers $M$ and $N$ are chosen less than $\hat{M}$ and $\hat{N}$.
4 Numerical Results
We conclude the paper by giving some numerical results testing the properties of the transformation $G$ and showing the performance of the proposed approach.
Three domains are considered. For each of them we present the initial grid obtained by the transformation $G$ using the initial set of control points $Q^i$, and the final grid obtained using the set of control points $Q^f$ resulting from the minimization procedure. In all the figures the control points are denoted by the symbol '*'. The minimization problem is solved by using a sequential quadratic programming method, i.e. by using the routine constr of the Optimization toolbox of the Matlab package. In the minimization procedure, the constraints on the control points $Q^f$ are chosen so that some geometric properties of the domain, such as symmetry and convexity, are preserved. Furthermore, in all the examples $\hat{M}$ and $\hat{N}$ are equal to $M$ and $N$. The values of the objective function before the minimization ($f^i_{ob}$) and after the minimization ($f^f_{ob}$) are also given in the figure captions.
The first and the second tests display a "waterway" grid and a W-shaped grid with their control points before and after the minimization procedure. The effectiveness of the method is evident.
An automatic control point choice
FIG. 1. Initial grid (left) and final grid (right), f_ob^I = 3.74, f_ob^F = 1.65.
FIG. 2. Initial grid (left) and final grid (right), f_ob^I = 1.45, f_ob^F = 1.22.
FIG. 3. Initial grid (left) and final grid (right), f_ob^I = 1.89/6.36, f_ob^F = 1.59/2.20.
C. Conti, R. Morandi and D. Scaramelli
Figure 3 shows a grid composed of six sub-grids, obtained via a domain decomposition approach. In this case, the Hermite-type interpolation method guarantees a C^1 connection among the patches. Here, the two values of f_ob^I and f_ob^F in the figure caption refer to the "horizontal" and "slanted" grids, respectively.
Bibliography
1. P. R. Eiseman, High Level Continuity for Coordinate Generation with Precise Controls, Journal of Computational Physics 47 (1982), 352-374.
2. P. R. Eiseman, Control Point Grid Generation, Computers Math. Applic. 5 (1992), 57-67.
3. W. J. Gordon and L. C. Thiel, Transfinite Mappings and their Application to Grid Generation, Appl. Math. and Comp. Vol. 10-11 on Numerical Grid Generation, J. F. Thompson ed., 171-192, 1982.
4. P. M. Knupp, Matrix Norms & the Condition Number, Proceedings, 8th International Meshing Roundtable, South Lake Tahoe, CA, U.S.A., 13-22, 1999.
5. V. D. Liseikin, Grid Generation Methods, Springer, 1999.
6. C. W. Mastin, Three-dimensional Bézier interpolation in solid modeling and grid generation, Comp. Aid. Geom. Des. 14 (1997), 797-805.
7. R. Morandi and A. Sestini, Precise Controls in Numerical Grid Generation, Advanced Topics in Multivariate Approximation, edited by F. Fontanella, K. Jetter, and P. J. Laurent, 243-258, 1996.
8. B. V. Saunders and P. W. Smith, Grid generation and optimization using tensor product B-splines, Approx. Theory & its Appl. 3 (1987), 120-452.
9. D. Scaramelli, Ph.D. Thesis, in preparation.
Shape-measure method for introducing the nearly optimal domain
A. Fakharzadeh
Department of Mathematics, Shahid Chamran University of Ahvaz, Ahvaz, Iran.
a_fakharzadeh@hotmail.com
J. E. Rubio
Department of Applied Mathematical Studies, University of Leeds, Leeds, LS2 9JT, UK.
Abstract
We introduce a new algorithm for solving optimal shape problems which are defined with respect to a pair of geometrical elements. The problem is to find, approximately, the optimal domain for a given functional that involves the solution of a linear or nonlinear elliptic equation with a boundary condition over a domain. The Shape-Measure method, in Cartesian coordinates, will be used to find the nearly optimal solution in two steps. By transferring the problem into a measure-theoretical form, we first find the solution of the elliptic problem for a given domain by using the embedding method. Then the Shape-Measure method will be applied to find, approximately, the best domain. An example will be given.
1 Introduction and Problem
Consider the optimal shape (optimal shape design) problems which are defined with respect to a pair of geometrical elements; this pair consists of a measurable set (in ℝ²) which can be regarded as a domain, and a simple closed curve containing a given point, which is the boundary of the set. Because the desired curves are required to be simple, the problem depends on the geometry which is used. In polar coordinates, we solved the similar problem in [1]. But in Cartesian coordinates, it is difficult to introduce a linear condition which guarantees that a closed curve is simple. Thus here we impose some limitations on the shape in order to make sure that it is simple. The problem will be solved in two stages. First, by use of measures, the value of the objective function will be calculated for any given domain. Then the optimal domain will be obtained by use of optimization techniques.
Let D ⊂ ℝ² be a bounded domain with a piecewise-smooth, closed and simple boundary ∂D. We assume that some part of ∂D is fixed and the rest, Γ, with the given initial and final points A and B respectively, is not fixed. Here we suppose that the fixed part of ∂D is made up of three segments, parts of the lines y = 0, x = 0 and y = 1 between the points A(1,0), (0,0), (0,1), B(1,1) (see Figure 1).
Thus Γ is chosen as an appropriate variable curve joining A and B so that D is well-defined. Let u(X) (X = (x, y) ∈ ℝ²) be a bounded solution of the following elliptic
FIG. 1. An admissible domain D under the assumptions of the numerical work.
equation:
Δu(X) + f(X, u) = v(X),   u|_∂D = 0,   (1.1)
where v : X ∈ D → v(X) ∈ ℝ is a bounded real function (v can also be considered as a fixed control function); the function f is assumed to be a bounded and continuous real-valued function on D × ℝ. Moreover, the above domain D is called admissible if the equation (1.1) has a bounded solution on D; we denote by 𝒟 the set of all such admissible domains. We are going to solve the problem of minimizing the functional I(D) = ∫_D f_0(X, u) dX on the set 𝒟, where f_0 is a given continuous, nonnegative, real-valued function on D × ℝ. To calculate the value of I(D) for a given domain D, it is necessary first to identify the solution of (1.1).
2 Weak solution and metamorphosis
In general, it is difficult to identify a classical solution for a problem like (1.1), and usually one tries to find a weak (generalized) solution of it. Hence the variational form of (1.1) is introduced in the following; we remind the reader that H_0^1(D) = {ψ ∈ H^1(D) : ψ|_∂D = 0}, where H^1(D) is the Sobolev space of order 1.
Proposition 2.1 Let u be the classical solution of (1.1); then we have the following equality:

∫_D (u Δψ + ψ f) dX = ∫_D ψ v dX,   ∀ψ ∈ H_0^1(D).   (2.1)
Proof: Multiplying (1.1) by the function ψ ∈ H_0^1(D) and then integrating over D, with use of Green's formula (see [3]), gives ∫_D (u Δψ + ψ f − ψ v) dX = ∫_∂D (u ∂ψ/∂n − ψ ∂u/∂n) ds, where n is the unit vector normal to the boundary ∂D and directed outward with respect to D. Because ψ|_∂D = 0 and u|_∂D = 0, (2.1) is satisfied. □
Definition 2.2 A function u ∈ H^1(D) is called a bounded weak solution of the problem (1.1) when it is bounded and satisfies the equality (2.1) for all ψ ∈ H_0^1(D) (the conditions for the existence of a weak solution of the problem (1.1), and also its boundedness, have been considered in many references, like [3] and [2]).
Now we apply our new method, which is called the Shape-Measure method. Let Ω = U × D, where U ⊂ ℝ is the smallest bounded set in which the bounded weak solution u(·) takes its values. Then, by applying the Riesz Representation Theorem ([6]), a bounded weak solution can be represented by a positive Radon measure; the proof of the following Proposition is similar to that of Proposition 3.1 in [1].
Proposition 2.3 Let u(X) be a bounded generalized solution of (1.1); there exists a unique positive Radon measure, say μ_u, in M⁺(Ω) such that:

μ_u(F) = ∫_Ω F dμ_u = ∫_D F(X, u) dX,   ∀F ∈ C(Ω).   (2.2)
Thus the equality (2.1) can be changed to μ_u(F_ψ) = γ_ψ, ∀ψ ∈ H_0^1(D), where F_ψ = u Δψ + f ψ and γ_ψ = ∫_D ψ v dX. Also, I(D) = μ_u(f_0). Because the measure μ_u projects on the (x, y)-space as the respective Lebesgue measure, we should have μ_u(ξ) = a_ξ, where ξ : Ω → ℝ depends only on the variable X (i.e. ξ ∈ C_1(Ω)), and a_ξ is the Lebesgue integral of ξ over D. Therefore the original problem can be described as follows: to find a measure μ_u ∈ M⁺(Ω) so that it satisfies the following constraints:

μ_u(F_ψ) = γ_ψ,   ∀ψ ∈ H_0^1(D);
μ_u(ξ) = a_ξ,   ∀ξ ∈ C_1(Ω).   (2.3)
As Rubio did in [5], to be sure that we do not miss any solution, we extend the underlying space; instead of finding a measure μ_u ∈ M⁺(Ω), introduced by Proposition 2.3 and equalities (2.3), we seek a measure μ ∈ M⁺(Ω) which satisfies just the conditions:

μ(F_ψ) = γ_ψ,   ∀ψ ∈ H_0^1(D);
μ(ξ) = a_ξ,   ∀ξ ∈ C_1(Ω).   (2.4)

3 Approximation
The system (2.4) is linear, because all the functionals appearing in the equations are linear in their argument μ. But the number of equations and the underlying space are not finite. We shall discretize this system by requiring that only a finite number of the constraints be satisfied. This will be achieved by choosing countable sets of functions whose linear combinations are dense in the appropriate spaces. But first we should approximate the unknown part of the boundary by just a finite number of its
points. This idea comes from the approximation of a curve by broken lines. For the given D, and hence for the given Γ, let A_m = (x_m, y_m), m = 0, 1, 2, ..., M, be a finite number of points on Γ (where A_0 = A). We link together each pair of consecutive points A_m and A_{m+1} for m = 0, 1, ..., M − 1, and close this curve by joining the points A_M and B together. Now the resulting shape, which is denoted by ∂D_M, is an approximation for ∂D; we also call the domain introduced by its boundary ∂D_M as D_M (see Figure 1).
By increasing M, the curve ∂D_M becomes closer and closer (in the Euclidean metric) to the curve ∂D, and hence one may conclude that the minimizer of I over 𝒟_M, if it exists, tends to the minimizer of I over 𝒟, if it exists. But some difficulties could arise (too oscillatory a curve may cause problems). Thus, we will fix the number of points. For a given M, let the values of the components y_1, y_2, ..., y_M be fixed. Because x_m is a free term, the point A_m could be anywhere on the line y = Y_m, x > 0, for every m (see Figure 1). Therefore the points A_m and A_{m+1} can be chosen so that they belong to Γ, and hence the part of Γ between the lines y = Y_m and y = Y_{m+1} can be approximated by the segment A_m A_{m+1}. Hence, we do not lose generality. Thus, we fix the components y_1, y_2, ..., y_M at the values Y_1, Y_2, ..., Y_M, respectively.
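With the ordinates fixed, the approximate boundary ∂D_M is fully determined by the free abscissae x_1, ..., x_M. A minimal sketch of assembling the closed polygon of Figure 1 (the function name and the traversal order are ours):

```python
def boundary_polygon(x, Y):
    """Vertices of the closed polygonal boundary dD_M, traversed once.

    The free part Gamma is approximated by the broken line through
    A = (1, 0), A_1 = (x_1, Y_1), ..., A_M = (x_M, Y_M), B = (1, 1);
    the fixed part consists of the segments B -> (0, 1) -> (0, 0) -> A.
    """
    assert len(x) == len(Y)
    free = [(float(xm), float(ym)) for xm, ym in zip(x, Y)]
    return [(1.0, 0.0)] + free + [(1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

# example: M = 8 points with equally spaced ordinates, all abscissae 1.0
poly = boundary_polygon([1.0] * 8, [0.15 + 0.1 * m for m in range(8)])
```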
Now we introduce a set {ψ_l ∈ H_0^1(D) : l = 1, 2, ...} such that the linear combinations of the functions {ψ_l} are uniformly dense (that is, dense in the topology of uniform convergence) in H_0^1(D). We know that the vector space of polynomials in the variables x and y, P(x, y), is dense in C^∞(D); therefore the set P_0(x, y) = {p(x, y) ∈ P(x, y) | p(x, y) = 0, ∀(x, y) ∈ ∂D} is dense (uniformly) in {h ∈ C^∞(D) : h|_∂D = 0} = C_0^∞(D). Since the set

Q(x, y) = {1, x, y, x², xy, y², x³, x²y, xy², y³, ...}

is a countable basis for the vector space P(x, y), each element of P(x, y), and also of P_0(x, y), is a linear combination of the elements in Q(x, y). By Theorem 3 of Mikhailov [3], page 131, the space C^∞(D) is dense in H^1(D); thus the space C_0^∞(D) is dense in H_0^1(D). Consequently, the space P_0(x, y) is uniformly dense in H_0^1(D). We define

ψ_i(x, y) = x y (y − 1) ∏_{l=1}^{M} (x − x_l + y − Y_l) q_i(x, y),   (3.1)

where q_i ∈ Q(x, y). Therefore ψ_i|_Γ = 0, and the set {ψ_i(x, y) : i = 1, 2, ...} is total (dense in the topology of uniform convergence) in H_0^1(D).
For the second set of functions, let L be a given positive integer and divide D into L (not necessarily equal) parts D_1, D_2, ..., D_L, such that by increasing L the area of each D_s, s = 1, 2, ..., L, decreases. Then, for each s, we define:

ξ_s(x, y, u) = 1 if (x, y) ∈ D_s,  and  ξ_s(x, y, u) = 0 otherwise.
These functions are not continuous, but each of them is the limit of an increasing sequence of positive continuous functions {ξ_{s_k}}; then, if μ is any positive Radon measure on Ω, μ(ξ_s) = lim_{k→∞} μ(ξ_{s_k}). The linear combinations of the functions {ξ_s : s = 1, 2, ..., L}, for all positive integers L, can approximate a function in C_1(Ω) arbitrarily well (see [5], Chapter 5).
By selecting just a finite number of functions in the mentioned spaces, the problem (2.4) can be replaced by another one in which we are looking for a measure μ_{M1,M2} ∈ M⁺(Ω) satisfying the following constraints:

μ_{M1,M2}(F_i) = γ_i,   i = 1, 2, ..., M_1;
μ_{M1,M2}(ξ_j) = a_j,   j = 1, 2, ..., M_2,   (3.2)

where M_1 and M_2 are two positive integers and F_i = F_{ψ_i}, γ_i = γ_{ψ_i}, a_j = a_{ξ_j}. If
we denote by Q(M_1, M_2) the set of positive Radon measures in M⁺(Ω) which satisfy the equalities (3.2), and also denote by Q the set of positive Radon measures in M⁺(Ω) which satisfy the equalities (2.4), one can easily prove the following Proposition by considering the proof of Proposition III.1 in [5].

Proposition 3.1 If M_1, M_2 → ∞ then Q(M_1, M_2) → Q; hence for large enough numbers M_1 and M_2 the set Q can be identified with Q(M_1, M_2).

But even if the number of equations in (3.2) is finite, the underlying space Q(M_1, M_2)
is still infinite-dimensional. By Theorem A.5 in the Appendix of [5], μ_{M1,M2} in (3.2) can be characterized as μ_{M1,M2} = Σ_{n=1}^{M_1+M_2} α_n δ(Z_n), with triples Z_n ∈ Ω and coefficients α_n ≥ 0 for n = 1, 2, ..., M_1 + M_2, where δ(z) ∈ M⁺(Ω) is a unitary atomic measure with support the singleton set {z}. Thus the measure problem is equivalent to a nonlinear one in which the unknowns are the coefficients α_n and the supports {Z_n}. Proposition III.3 of [5], Chapter 3, states that the measure μ_{M1,M2} has the following form:

μ_{M1,M2} = Σ_{n=1}^{N} α_n δ(Z_n),   (3.3)
where Z_n, n = 1, 2, ..., N, belongs to a dense subset of Ω. Now let us put a discretization on Ω, with nodes Z_n = (x_n, y_n, u_n) in a dense subset of Ω; then we can set up the following linear system in which the unknowns are the coefficients α_n:

α_n ≥ 0,   n = 1, 2, ..., N;
Σ_{n=1}^{N} α_n F_i(Z_n) = γ_i,   i = 1, 2, ..., M_1;
Σ_{n=1}^{N} α_n ξ_j(Z_n) = a_j,   j = 1, 2, ..., M_2.   (3.4)
The solution of (3.4) is not necessarily unique (even if the problem (1.1) satisfies the necessary conditions for having a unique bounded weak solution), because of the approximation scheme.
4 The optimal solution
The main aim of the present section is to find an optimal domain D* ∈ 𝒟_M so that the value of I(D*) is the minimum over the set 𝒟_M. By applying the results of the previous section, a solution of (1.1) can be found. Indeed, it is approximated by a solution of the linear system (3.4) in terms of the variables x_m, m = 1, 2, ..., M. As mentioned,
this solution is not necessarily unique. Let us specify one by solving the following linear programming problem:
Minimize:   Σ_{n=1}^{N} α_n f_0(Z_n)
Subject to:   α_n ≥ 0,   n = 1, 2, ..., N;
Σ_{n=1}^{N} α_n F_i(Z_n) = γ_i,   i = 1, 2, ..., M_1;
Σ_{n=1}^{N} α_n ξ_j(Z_n) = a_j,   j = 1, 2, ..., M_2.   (4.1)
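Problem (4.1) is a standard-form LP in the nonnegative coefficients α_n, so any LP solver applies. A hedged sketch using SciPy (not the authors' code; the function name and data layout are ours, with the moment matrices supplied by the caller):

```python
import numpy as np
from scipy.optimize import linprog

def solve_measure_lp(F_rows, xi_rows, gam, a, f0_vals):
    """Solve the finite LP (4.1).

    F_rows  : (M1, N) array with entries F_i(Z_n)
    xi_rows : (M2, N) array with entries xi_j(Z_n)
    gam, a  : right-hand sides gamma_i and a_j
    f0_vals : (N,) array with entries f0(Z_n), the objective coefficients
    """
    A_eq = np.vstack([F_rows, xi_rows])
    b_eq = np.concatenate([np.atleast_1d(gam), np.atleast_1d(a)])
    # the constraints alpha_n >= 0 are expressed through the bounds
    return linprog(c=f0_vals, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
```

The returned result carries the coefficients α_n in `res.x` and the optimal value Σ α_n f_0(Z_n) in `res.fun`.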
Thus, for each D, the value I(D) = ∫_D f_0(X, u) dX = μ(f_0) ≈ μ_{M1,M2}(f_0) is defined uniquely in terms of the variables x_m, m = 1, 2, ..., M. So, we set up a function J on 𝒟_M, defined by D ∈ 𝒟_M → J(D) = μ_{M1,M2}(f_0), where μ_{M1,M2}(f_0) = Σ_{n=1}^{N} α_n f_0(Z_n). Clearly J can be regarded as a vector function:

J : (x_1, x_2, ..., x_M) ∈ ℝ^M → μ_{M1,M2}(f_0) ∈ ℝ.   (4.2)
Since J is a real-valued function which is bounded below, and is defined on a compact set (since constraints are put on the variables), it is possible to find a sequence of points so that the value of the function along the sequence tends to the (finite) infimum of the function. The coordinate values corresponding to the points in the sequence are of course finite. Now, suppose that (x_1*, x_2*, ..., x_M*) is the minimizer of the vector function J; it can be identified by using one of the related minimization methods. The domain introduced by the minimizer is denoted by D*. We assume in the following theoretical result that the minimization algorithm which is used is perfect; that is, it comes out with the global minimum of J on its (compact) domain.
Theorem 4.1 Let M, M_1 and M_2 be the given positive integers defined in Section 3, and let D* be the minimizer of (4.2) as mentioned above. Then D* is the minimizing domain of the functional I over 𝒟_M, and the value of I(D*) can be approximated by J(D*); moreover, J(D*) → I(D*) as M_1 and M_2 tend to infinity.
Proof: Suppose D* is not the minimizer of I; then there exists a domain, call it D′, in 𝒟_M such that I(D′) < I(D*). Proposition 2.3 shows that there is a unique measure, call it μ′, in M⁺(Ω) such that I(D′) = μ′(f_0), and Proposition 3.1 states that, for sufficiently large numbers M_1 and M_2, μ′(f_0) can be approximated by μ′_{M1,M2}(f_0) in Q(M_1, M_2). Thus, I(D′) ≈ μ′_{M1,M2}(f_0) = J(D′). In the same way, one can show that J(D*) approximates I(D*); so I(D*) ≈ μ*_{M1,M2}(f_0) = J(D*). Hence J(D′) < J(D*), which contradicts the fact that D* is the minimizer of J. Moreover, from Proposition 3.1 it follows that J(D*) tends to I(D*) as M_1, M_2 → ∞. □
5 Numerical example
We consider the elliptic equation (1.1) with

v(x, y) = 1 if (x, y) ∈ D ∩ C,  and  v(x, y) = 0 otherwise,

where C is the square [1/4, 3/4] × [1/4, 3/4] (see Figure 1). We also take M = 8, M_1 = 10, M_2 = 8, N = 740 (36 of the nodes are chosen so that u|_∂D = 0), and suppose Y_1, Y_2, ..., Y_8 are 0.15, 0.25, ..., 0.85, respectively. With the extra constraints x_m > 1/4, m = 2, 3, ..., 7, the value of γ_i for any D ∈ 𝒟_M is defined as γ_i = ∫∫_D ψ_i(x, y) v dx dy, i = 1, 2, ..., 10. We also assume that the function u takes values in U = [−1, 1], and consider the polynomials q_i(x, y) as 1, x, y, x², xy, y², x³, x²y, xy², y³. The function f_0 is chosen as f_0 = (u − 0.1)². This function can be considered as a distribution of heat in the surface for a system governed by an elliptic equation.
In the minimization, we apply the Downhill Simplex Method in Multidimensions, using the subroutine AMOEBA (see [4]), and we also consider an upper bound for the variables (we suppose they are not higher than 2). These conditions are applied by means of the penalty method (see [7]). For the nonlinear case of the partial differential equation (1.1), we have taken f(x, y, u) = 0.25u², used the initial values x_m = 1.0, m = 1, 2, ..., 8, and set the stopping tolerance for the program (variable ftol in the subroutine AMOEBA) to 10⁻⁷. We remind the reader that the functions F_i and the values of γ_i, i = 1, 2, ..., 10, have been calculated with the package "Maple V.3". The results are: the optimal value of I = 0.45467920356379; the number of iterations = 502; and the values of the variables in the final step are x_1 = 1.05019, x_2 = 1.08521, x_3 = 0.750001, x_4 = 0.768701, x_5 = 1.12986, x_6 = 1.13775, x_7 = 0.97783, x_8 = 1.61566, which represent the optimal domain shown in Figure 2.
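The outer loop just described (simplex search on the abscissae, with bounds enforced by penalties) can be reproduced with any derivative-free simplex implementation. A hedged sketch with SciPy's Nelder-Mead in place of AMOEBA; the stand-in objective J below is ours (in the paper each evaluation of J solves the LP (4.1) for the domain determined by x), and the bound constants are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def J(x):
    # stand-in objective; the paper evaluates mu_{M1,M2}(f0) via the LP (4.1)
    return float(np.sum((x - 1.0) ** 2))

def penalised_J(x, lam=1.0e3):
    # penalty method (cf. [7]) enforcing illustrative bounds 0 <= x_m <= 2
    too_low = np.maximum(-x, 0.0)
    too_high = np.maximum(x - 2.0, 0.0)
    return J(x) + lam * float(np.sum(too_low ** 2 + too_high ** 2))

x0 = np.full(8, 1.0)                # the paper's initial values x_m = 1.0
res = minimize(penalised_J, x0, method="Nelder-Mead",
               options={"fatol": 1e-7, "xatol": 1e-7, "maxiter": 20000})
```

The options `fatol`/`xatol` play the role of AMOEBA's `ftol` stopping tolerance.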
Bibliography
1. Fakharzadeh J., A. and Rubio, J. E., Shapes and Measures, IMA Journal of Mathematical Control and Information, vol. 16, pp. 207-220, 1999.
2. Ladyzhenskaya, O. A. and Uraltseva, N. N., Linear and Quasilinear Elliptic Equations, vol. 46, Academic Press, Mathematics in Science and Engineering, 1968.
3. Mikhailov, V. P., Partial Differential Equations, MIR Publishers, Moscow, 1978.
4. Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T., Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, 1986.
5. Rubio, J. E., Control and Optimization: The Linear Treatment of Nonlinear Problems, Manchester University Press, Manchester, 1986.
6. Rudin, W., Real and Complex Analysis, Tata McGraw-Hill, New Delhi, second edition, 1983.
7. Walsh, G. R., Methods of Optimization, John Wiley and Sons Ltd., 1975.
FIG. 2. The initial and the optimal domain for the nonlinear case of the elliptic equation (panels: "Initial Domain" and "Optimal Domain").
Convex combination maps
Michael S. Floater
SINTEF, Postbox 124 Blindern, 0314 Oslo, NORWAY.
mif@math.sintef.no
Abstract
Piecewise linear maps over triangulations are used extensively in geometric modelling and
computer graphics. This short note surveys recent progress on the important question of
when such maps are one-to-one, central to which are convex combination maps.
1 Introduction
Piecewise linear maps over triangulations have several applications in geometric modelling and computer graphics. For example, Figure 1a shows a surface triangulation T of a set of points (x_i, y_i, z_i) sampled from some unknown surface in ℝ³. A standard approach to fitting a smooth parametric surface s(u, v) to these points is to first parameterize them, i.e., compute planar points (u_i, v_i) corresponding to the data points (x_i, y_i, z_i). Then, using some scattered data method, we find a parametric surface s : Ω → ℝ³, defined over some suitable domain Ω containing the points (u_i, v_i), such that

s(u_i, v_i) ≈ (x_i, y_i, z_i).

A choice of parameterization is shown in Figure 1b and a least squares surface approximation using bicubic B-splines is shown in Figure 1c.
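The fitting step can be sketched with SciPy's FITPACK wrappers: given parameter values (u_i, v_i) and one coordinate of the data, `bisplrep` fits a bicubic tensor-product B-spline (for a full parametric surface one would fit each of x, y, z over (u, v) in the same way; the sample data below are synthetic, not the surface of the figures):

```python
import numpy as np
from scipy.interpolate import bisplrep, bisplev

# synthetic parameterized data: z sampled from a smooth function of (u, v)
u, v = np.meshgrid(np.linspace(0, 1, 12), np.linspace(0, 1, 12), indexing="ij")
z = u * v
# bicubic (kx = ky = 3) smoothing/least squares fit; s controls the smoothing
tck = bisplrep(u.ravel(), v.ravel(), z.ravel(), kx=3, ky=3, s=1e-6)
z_mid = bisplev(0.5, 0.5, tck)   # evaluate the fitted spline at (0.5, 0.5)
```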
Notice that the choice of parameter points (u_i, v_i) uniquely determines a piecewise linear map φ : D_T → ℝ², where D_T is the union of the triangles in T. In practice, a necessary requirement on φ to ensure adequate quality of the subsequent surface approximation s(u, v) is that φ should be injective. In Figure 1b the mapping φ was taken to be a so-called convex combination map, which, as we will see later, is guaranteed to be one-to-one since the boundary of T is mapped to a rectangle. Put another way, none of the triangles in Figure 1b are 'folded over'. In fact further properties of the map are important, such as linear precision, and this was achieved in Figure 1b by using the so-called shape-preserving weights (the coefficients in the convex combinations). For a discussion of that, see [3].
Another application of piecewise linear maps is to image morphing. Image morphing can be carried out by continuously transforming one planar triangulation T⁰ (whose vertices represent feature points in the image) into another, T¹. Here we assume that there is a one-to-one correspondence between the vertices, edges, and triangles of T⁰ and T¹. We can view each intermediate triangulation T(t), 0 ≤ t ≤ 1 (where T(0) = T⁰ and T(1) = T¹), as the image of a piecewise linear map φ(t) : D_{T⁰} → D_{T(t)}. As with parameterizations, it is again important that φ(t) is one-to-one. Figure 2 shows a so-called convex combination morph of [4] of two given planar triangulations: T⁰ appears on the left and T¹ on the right. The two triangulations in the middle are T(1/3) and T(2/3). This morph ensures that φ(t) is one-to-one for all t in [0, 1] and therefore T(t) has no 'folded' triangles at any time instant t.
Piecewise linear maps also arise in: texture mapping; numerical grid generation; and
in setting up multiresolution frameworks (nested spaces of piecewise linear functions)
for manifold surface triangulations in computer graphics.
FIG. 1. Spatial triangulation (1a), convex combination parameterization (1b), bicubic spline approximation (1c).
FIG. 2. Convex combination morph.
2 Convex combination maps
For the sake of simplicity we will only discuss convex combination maps defined over planar triangulations, even though all the results hold equally well when the domain of the map is a spatial triangulation such as that in Figure 1a. Thus let T = {T_1, ..., T_M} be a simply-connected planar triangulation, with closed triangles T_i, and let D_T = ∪_{T ∈ T} T, as in Figure 3. We will call a mapping φ : D_T → ℝ² a convex combination map if it is piecewise linear over T and, for every interior vertex v of T, there exist weights λ_vw > 0, for w ∈ N_v, such that

φ(v) = Σ_{w ∈ N_v} λ_vw φ(w)   and   Σ_{w ∈ N_v} λ_vw = 1,   (1)

where N_v is the set of neighbours of v; see Figure 3.
FIG. 3. Convex combination map.
In applications, the mapped boundary vertices φ(v) are chosen first. Then the weights λ_vw are all specified according to some chosen strategy. Finally, the mapped interior vertices are found by treating the equations in (1) as a linear system.
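Written out, (1) with the boundary images fixed is a sparse linear system for the interior images. A minimal dense sketch (the function name and data layout are ours; a real implementation would use a sparse solver):

```python
import numpy as np

def convex_combination_map(n, neighbours, weights, boundary_pos):
    """Solve the equations (1) for the images of the interior vertices.

    neighbours[v]   : list of the neighbours N_v of interior vertex v
    weights[v][w]   : lambda_vw > 0 with sum over w in N_v equal to 1
    boundary_pos[v] : prescribed image phi(v) of each boundary vertex
    Returns an (n, 2) array whose row v is phi(v).
    """
    A = np.zeros((n, n))
    b = np.zeros((n, 2))
    for v in range(n):
        A[v, v] = 1.0
        if v in boundary_pos:                 # phi(v) is prescribed
            b[v] = boundary_pos[v]
        else:                                 # phi(v) - sum lambda_vw phi(w) = 0
            for w in neighbours[v]:
                A[v, w] = -weights[v][w]
    return np.linalg.solve(A, b)
```

For example, with the four boundary vertices of a unit square fixed at its corners and one interior vertex given equal weights 1/4, the interior image lands at the centroid of the corners.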
Example 2.1 If an interior vertex v of T has five neighbours v_1, ..., v_5, then we might set
Until recently, the only theory behind convex combination maps was that of Tutte [8]. Working from a purely graph-theoretic point of view, Tutte proposed a so-called barycentric mapping for constructing straight line drawings of 3-connected graphs (which include triangulations). A barycentric mapping in our context is simply a convex combination map in which all the weights at each vertex are equal, i.e., λ_vw = 1/d_v, where d_v is the degree or valency of the vertex v. Thus for v in Example 2.1 we must have

φ(v) = (1/5)φ(v_1) + (1/5)φ(v_2) + (1/5)φ(v_3) + (1/5)φ(v_4) + (1/5)φ(v_5).
Tutte showed that a valid straight line drawing, i.e. one with no edge crossings, results from a barycentric mapping if the 'boundary' of the graph, a so-called 'cycle', is mapped to a convex polygon. However, as argued in [3], convex combination maps share all those properties of barycentric maps necessary for Tutte's proof. Thus, when interpreted in the right way and suitably generalized, Tutte's theorem can be expressed in the following way.

Theorem 2.2 Suppose φ : D_T → ℝ² is a convex combination mapping which maps the n boundary vertices of T cyclically onto the n vertices of some n-sided convex polygon in the plane. Then φ is one-to-one.
Despite this generalization, however, there are still two aspects of it which need to be improved from the point of view of applications and future research.
The first is that we would like to extend the theorem so that we can allow some, and indeed many, of the mapped boundary vertices to be collinear. Indeed, in the application to parameterization for surface fitting, it might be convenient to map all the boundary vertices of the given triangulation into the four sides of a rectangle, as in Figure 1b. This is because tensor-product spline surfaces are defined over rectangular domains. Collinearity will also often be desirable in morphing, as in Figure 2, and in most other applications. Thus a drawback of Theorem 2.2 is that it does not allow collinear vertices in the image boundary.
The second aspect is that we would like to simplify the proof in order to have some hope of establishing the injectivity of piecewise linear maps in even more general situations, such as when mapping to non-convex regions, or when some of the mapped vertices are constrained, for example. The fact that Tutte's proof relies on the non-existence of the Kuratowski subgraphs K_5 and K_{3,3} in a planar graph illustrates its complexity.
It is these two improvements that are the focus of [5]. The main idea of [5] is the observation that Theorem 2.2 is very similar to a theorem on harmonic maps, referred to by Duren and Hengartner [2] as the Rado-Kneser-Choquet theorem, which was established in [7, 6, 1]. Recall that a mapping φ : D → ℝ², with D ⊂ ℝ² and φ = (u, v), is harmonic if both its components u(x, y) and v(x, y) satisfy the Laplace equation in D, i.e.,

u_xx + u_yy = 0,   v_xx + v_yy = 0;

see Figure 4.

Rado-Kneser-Choquet Theorem. Suppose φ : D → ℝ² is a harmonic mapping which maps the boundary ∂D homeomorphically onto the boundary ∂Ω of some convex region Ω ⊂ ℝ². Then φ is one-to-one.
FIG. 4. Harmonic map.
This suggested that a proof of Theorem 2.2 might be based on a proof of the Rado-Kneser-Choquet theorem, in particular the short proof of Kneser [6]. Kneser's proof begins by showing that φ is locally one-to-one in the sense that the Jacobian of φ,

u_x v_y − u_y v_x,

never vanishes. Kneser establishes this by supposing that the Jacobian is zero at some point (x_0, y_0). In that case there must be a straight line ax + by + c = 0 passing through the point φ(x_0, y_0) such that both partial derivatives of the function h(x, y) = au(x, y) + bv(x, y) + c are zero at (x_0, y_0). At the same time, the function h : D → ℝ is zero at (x_0, y_0) and has just two zeros along the boundary of D. Noting that h(x, y) is a harmonic
function, Kneser then uses the Nodal Lines theorem of Courant to argue that there are at least four zero contours of h emanating from (x_0, y_0), and, due to the maximum principle for h, these four curves can never self-intersect nor intersect one another. Therefore all four curves must reach the boundary of D, which is a contradiction.
These ideas were used in [5] to establish a much simpler proof of Theorem 2.2 than that of Tutte. No graph theory is needed at all. Instead, the discrete maximum principle for convex combination functions plays the role of the maximum principle for harmonic functions. Similarly to Kneser's proof, we show first that φ is locally one-to-one, except that we understand this to mean that the restriction of φ to any quadrilateral in T is one-to-one, a quadrilateral being the union of two triangles sharing a common edge.
FIG. 5. Dividing edges.
Moreover, Theorem 2.2 is generalized in [5] to allow collinear mapped boundary vertices. We call an edge [v, w] of T a dividing edge if both endpoints v and w are boundary vertices yet the edge [v, w] itself is not contained in the boundary. For example, in Figure 5, the only dividing edge in the triangulation is [v_1, v_2]. Dividing edges play a critical role because they partition the triangulation into subtriangulations, in each of which every convex combination function satisfies a discrete maximum principle in its strong form. The main result of [5] was the following.
Theorem 2.3 Suppose T is any triangulation and that φ : D_T → ℝ² is a convex combination mapping which maps ∂D_T homeomorphically onto the boundary ∂Ω of some convex region Ω ⊂ ℝ². Then φ is one-to-one if and only if no dividing edge [v, w] of T is mapped by φ into ∂Ω.
3 Future research
Here is a list of topics for future research.
• A triangulation is a special (maximal) kind of planar graph. Can one extend Theorem 2.3 to other planar graphs, for example, rectangular grids? This seems likely, because Tutte's theory already holds for all 3-connected graphs.
• In what way can the theorem be extended from bivariate maps to trivariate ones?
• Can similar one-to-one maps be guaranteed when mapping closed surfaces of various topology? For example, we would like to map a closed manifold triangulation, homeomorphic to a sphere, into a unit sphere injectively. Here each triangle in the triangulation would be mapped to a spherical triangle on the surface of the sphere.
• Can one find sufficient conditions for the injectivity of constrained maps, i.e., piecewise linear maps in which the image of certain interior points is specified in advance?
• Can one remove the requirement of having to map the boundary to a convex polygon
and still ensure a one-to-one mapping under some weaker condition?
• Can the Rado-Kneser-Choquet theorem and Theorem 2.3 be combined as part of a single more general theorem?
Bibliography
1. G. Choquet, Sur un type de transformation analytique généralisant la représentation conforme et définie au moyen de fonctions harmoniques, Bull. Sci. Math. 69 (1945), 156-165.
2. P. Duren and W. Hengartner, Harmonic mappings of multiply connected domains, Pac. J. Math. 180 (1997), 201-220.
3. M. S. Floater, Parametrization and smooth approximation of surface triangulations, Comp. Aided Geom. Design 14 (1997), 231-250.
4. M. S. Floater and C. Gotsman, How to morph tilings injectively, J. Comp. Appl. Math. 101 (1999), 117-129.
5. M. S. Floater, One-to-one piecewise linear mappings over triangulations, to appear in Math. Comp.
6. H. Kneser, Lösung der Aufgabe 41, Jahresber. Deutsch. Math.-Verein. 35 (1926), 123-124.
7. T. Radó, Aufgabe 41, Jahresber. Deutsch. Math.-Verein. 35 (1926), 49.
8. W. T. Tutte, How to draw a graph, Proc. London Math. Soc. 13 (1963), 743-768.
Shape preserving interpolation by curves
T. N. T. Goodman
Department of Mathematics, University of Dundee, Dundee DD1 4HN.
tgoodman@maths.dundee.ac.uk
Abstract
A survey is given of algorithms for passing a curve through data points so as to preserve
the shape of the data.
1 Introduction
We consider the problem of passing a curve through a finite sequence of points. We want
the curve to preserve in some sense the shape of the data, i.e. the shape of the curve
gained by joining the data by straight line segments (which we call the 'piecewise linear
interpolant'). We do not consider the important problems of approximating the data
by a curve, or of shape-preserving interpolation by a surface. The short length of the
paper forces it to be selective. So we concentrate on actual algorithms for solving the
problem rather than related theory. Also we consider only algorithms where the curve
is defined explicitly, not implicitly either as the zero set of a function or as the limit of
a subdivision process (though there are, to our knowledge, extremely few such implicit
shape-preserving schemes).
In Section 2, we consider planar curves given by a function y = f(x), often rather misleadingly referred to as 'functional interpolation'. There are numerous such schemes,
dating from 1966, with most of them prior to 1990. Our treatment is therefore very
selective. Section 3 deals with parametrically defined planar curves, for which the schemes
are fewer and more recent. Finally, in Section 4, we consider curves in three dimensions,
often called 'space curves'. Here the work is much more limited, dating only from 1997.
We note that in shape-preserving interpolation, the map from the data to the function
describing the curve must be non-linear. In what we call 'tension methods' the curve
can be constructed by a linear scheme for any choice of certain 'tension parameters'.
These parameters are then varied so as to 'pull' the curve towards the piecewise linear
interpolant until the shape criteria are satisfied. Though there are a few variations on
this theme, there is generally a clear distinction between tension methods and other
schemes, which we shall term 'direct methods'.
2 Functional interpolation
Given data

(x_i, y_i) ∈ R^2,   i = 0, ..., N,   x_0 < x_1 < ... < x_N,   (2.1)
we consider a function f : [x_0, x_N] → R satisfying

f(x_i) = y_i,   i = 0, ..., N.   (2.2)
For various reasons, perhaps arising from the physical situation which f is intended to model, we
may wish the graph of / to inherit certain shape properties of the data. We now describe
these and other properties which it may be desirable for / to possess.
2.1 Desirable properties
Monotonicity. Here we require f to be increasing (respectively decreasing) if (y_i) is increasing (respectively decreasing). More generally we may require the scheme to be 'co-monotone', i.e. for i = 0, ..., N-1, f is increasing (decreasing) on [x_i, x_{i+1}] if y_i < y_{i+1} (y_i > y_{i+1}). Co-monotonicity has the consequence that the local extrema of f occur exactly at the local extrema of (y_i). Moreover if y_i = y_{i+1}, then f is constant on [x_i, x_{i+1}]. These properties may be too restrictive and a weaker alternative is what we call 'local monotonicity': for i = 1, ..., N-2, f is increasing on [x_i, x_{i+1}] if y_{i-1} < y_i < y_{i+1} < y_{i+2} (and similarly for decreasing). Although this is not generally stated, it is also desirable that for i = 0, ..., N-1, f has at most one local extremum on (x_i, x_{i+1}).
Convexity. Here we require f to be convex (concave) if the piecewise linear interpolant is convex (concave). More generally we call the scheme 'co-convex' if for i = 1, ..., N-2, f is convex (concave) on [x_i, x_{i+1}] if the piecewise linear interpolant is convex (concave) on [x_{i-1}, x_{i+2}]. It is also desirable in a co-convex scheme for f to have at most one inflection in (x_i, x_{i+1}), 0 ≤ i ≤ N-1.
Smoothness. By definition, the piecewise linear interpolant is shape-preserving, and so the problem is trivial unless we require f to have greater smoothness than continuity, i.e. C^k for k ≥ 1. Since all the schemes use piecewise analytic functions, the C^k condition needs to be checked only at a finite number of 'knots', which generally include the data points. We remark that smoothness and shape-preservation may not be compatible; e.g. if for i = 0, ..., 4, x_i = i - 2, y_i = |x_i|, and f is convex on [x_0, x_4], then f(x) = |x|, -2 ≤ x ≤ 2, and so is not C^1 at 0.
Approximation order. It is generally supposed that the data arise as values of some unknown 'smooth' function g, i.e. y_i = g(x_i), i = 0, ..., N. Then we can consider how fast the interpolant f converges to g as we increase the density of data values x_i in the fixed interval [a, b]. A scheme has approximation order O(h^m) if ||f - g|| = O(h^m), where h = max{x_{i+1} - x_i : i = 0, ..., N-1} and the usual norm is ||F|| = sup{|F(x)| : a ≤ x ≤ b}.
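As a hedged numerical aside (not from the survey; the function and its parameter names are our own), approximation order can be estimated empirically by measuring the sup-norm error on successively refined uniform knots. For the piecewise linear interpolant the error should fall by a factor of about 4 when h is halved, consistent with approximation order O(h^2):

```python
def sup_error_piecewise_linear(g, a, b, n, samples=2000):
    """Sup-norm error of the piecewise linear interpolant of g
    at n+1 uniformly spaced knots in [a, b], estimated by sampling."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [g(x) for x in xs]
    err = 0.0
    for j in range(samples + 1):
        x = a + (b - a) * j / samples
        i = min(int((x - a) / (b - a) * n), n - 1)   # interval containing x
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        err = max(err, abs(g(x) - ((1 - t) * ys[i] + t * ys[i + 1])))
    return err
```

Doubling the number of knots should divide the error by roughly 4, so the estimated order is log2(e1/e2) ≈ 2.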
Locality. In a 'global' scheme, the value f(x), for any x, generally depends on all the data. In contrast, for a 'local' scheme, f(x) depends on the data values (x_i, y_i) only for x_i 'near' x. There may be advantages in local schemes, e.g. when data are modified or inserted.
Fairness. It is often desirable that the curve is 'fair', i.e. pleasing to the eye, see Section
3.
Other desirable properties are invariance under scaling or reflection in x or y, and stability, i.e. small changes in the data produce small changes in f. There may also be other constraints on f, e.g. f ≥ 0 when y_i ≥ 0, i = 0, ..., N.
2.2 Tension methods
Many tension methods are a modification of cubic spline interpolation, which we now describe. Given data (2.1), there is a unique function f satisfying (2.2), where f is C^2, is a cubic polynomial on [x_i, x_{i+1}], i = 0, ..., N-1, and satisfies suitable boundary conditions at x_0 and x_N. The function f minimises ∫_{x_0}^{x_N} (g'')^2 over a suitable class of functions g and this energy minimisation property is generally considered to give a fair curve. Determining f requires solving a global, strictly diagonally dominant tridiagonal system of linear equations.
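As an illustration (a minimal sketch, not code from the survey), the classical construction can be carried out by solving the tridiagonal system for the second derivatives M_i = f''(x_i), here with natural boundary conditions M_0 = M_N = 0 as the 'suitable boundary conditions':

```python
import numpy as np

def natural_cubic_spline(x, y):
    """Return an evaluator for the C^2 natural cubic spline interpolant.

    Solves the strictly diagonally dominant tridiagonal system for the
    second derivatives M_i = f''(x_i), with M_0 = M_N = 0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x) - 1
    h = np.diff(x)
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0                       # natural boundary conditions
    for i in range(1, n):
        A[i, i - 1] = h[i - 1] / 6.0
        A[i, i] = (h[i - 1] + h[i]) / 3.0         # dominates the off-diagonals
        A[i, i + 1] = h[i] / 6.0
        b[i] = (y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1]
    M = np.linalg.solve(A, b)

    def f(t):
        i = int(np.clip(np.searchsorted(x, t) - 1, 0, n - 1))
        hi = h[i]
        u, v = x[i + 1] - t, t - x[i]
        return ((M[i] * u**3 + M[i + 1] * v**3) / (6 * hi)
                + (y[i] - M[i] * hi**2 / 6) * u / hi
                + (y[i + 1] - M[i + 1] * hi**2 / 6) * v / hi)
    return f
```

Since the right-hand side vanishes for data taken from a linear function, the scheme reproduces linear functions exactly.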
Since cubic spline interpolation is not shape-preserving, in 1966 Schweikert [67] modified the scheme by replacing cubic polynomials on each interval [x_i, x_{i+1}] by solutions of

f^{(4)} - λ_i f'' = 0,

where λ_i ≥ 0. When λ_i = 0, f will reduce to a cubic, while as λ_i → ∞, f approaches a linear polynomial. Thus λ_i acts as a tension parameter and by making appropriate choices of λ_i large enough the function will preserve monotonicity and/or convexity globally or locally.
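On a single interval, normalised to [0,1], the tension effect can be illustrated as follows (a sketch under our own conventions, not code from [67]; for λ > 0 the solution space of f'''' - λf'' = 0 is spanned by 1, t, exp(-√λ t), exp(-√λ(1-t)), a numerically safer basis than cosh/sinh; the case λ = 0, which gives a cubic, would need separate handling):

```python
import numpy as np

def tension_hermite(y0, y1, d0, d1, lam):
    """Hermite interpolant on [0,1] from the solution space of
    f'''' - lam*f'' = 0 (lam > 0), i.e. span{1, t, exp(-r t), exp(-r(1-t))}
    with r = sqrt(lam).  Large lam pulls the interior towards the chord."""
    r = np.sqrt(lam)
    e = np.exp(-r)
    # rows: conditions f(0)=y0, f(1)=y1, f'(0)=d0, f'(1)=d1
    A = np.array([
        [1.0, 0.0,  1.0,   e],
        [1.0, 1.0,  e,     1.0],
        [0.0, 1.0, -r,     r * e],
        [0.0, 1.0, -r * e, r],
    ])
    c = np.linalg.solve(A, [y0, y1, d0, d1])
    return lambda t: (c[0] + c[1] * t
                      + c[2] * np.exp(-r * t) + c[3] * np.exp(-r * (1.0 - t)))
```

With end slopes much steeper than the chord, the cubic Hermite interpolant overshoots badly, while for large λ the tension interpolant stays close to the chord away from thin boundary layers at the ends.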
Many papers have been written on Schweikert's tension splines giving, for example, ways of choosing the values of the tension parameters, e.g. [68,57,46,60]. However the fact that the method uses exponential functions can be seen as a drawback. An alternative was introduced by Nielson in 1974 [55] by adjusting the minimisation property of cubic splines to a minimisation problem involving also the first derivative. The resulting function, called a ν-spline, is also cubic on each interval [x_i, x_{i+1}] but only C^1. However the form of the C^1 continuity gives extra 'smoothness' for parametrically defined curves and so we discuss ν-splines further in Section 3. By generalising the minimisation problem still further one can gain a C^1 piecewise cubic interpolant with further parameters for gaining shape properties [22].
The idea of using rational functions in tension methods was introduced by Späth [69], also in 1974, and put in a general setting of tension methods in [57]. From 1982-1988, Gregory and/or Delbourgo produced a series of algorithms using rational functions, e.g. [19,36,20,21,18]. We illustrate the ideas with an algorithm from [37]. Here f is C^2 and on each interval [x_i, x_{i+1}] it has the form, for some a, b, c, d,

f(x) = (a + bt + ct^2 + dt^3) / (1 + λ_i t(1 - t)),   t = (x - x_i)/(x_{i+1} - x_i).

For λ_i > -1, i = 0, ..., N-1, f can be determined as the solution of a strictly diagonally dominant tridiagonal linear system (and hence the scheme is global). When all λ_i = 0, f reduces to the usual cubic spline interpolant, while as λ_i → ∞, f converges uniformly to the linear interpolant on [x_i, x_{i+1}]. In general the approximation order is O(h^2) for
data from a C^2 function. In the special case of monotone data, choosing

λ_i = μ_i + (f'(x_i) + f'(x_{i+1})) (x_{i+1} - x_i)/(y_{i+1} - y_i),   μ_i > -3,   i = 0, ..., N-1,

ensures that f is correspondingly monotone, and for the choice μ_i = -2, f reduces to a rational quadratic which gives optimal approximation order O(h^4). Similarly for convex data, f is also convex provided that each λ_i satisfies an inequality involving f'(x_i), f'(x_{i+1}), and choosing λ_i appropriately (which requires solving a non-linear equation) further ensures approximation order O(h^4).
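A single-interval sketch of such a rational cubic with tension, normalised to t ∈ [0,1] (names and normalisation are ours, not code from [37]):

```python
import numpy as np

def rational_cubic(y0, y1, d0, d1, h, lam):
    """Evaluator for f(t) = (a + b t + c t^2 + d t^3) / (1 + lam*t*(1-t))
    on [0,1], matching end values y0, y1 and end slopes d0, d1 (slopes
    taken with respect to x, hence scaled by the interval length h)."""
    A = np.array([
        [1.0,  0.0,       0.0,       0.0],        # f(0)  = y0
        [1.0,  1.0,       1.0,       1.0],        # f(1)  = y1
        [-lam, 1.0,       0.0,       0.0],        # f'(0) = h*d0
        [lam,  1.0 + lam, 2.0 + lam, 3.0 + lam],  # f'(1) = h*d1
    ])
    a, b, c, d = np.linalg.solve(A, [y0, y1, h * d0, h * d1])
    return lambda t: (a + b * t + c * t**2 + d * t**3) / (1.0 + lam * t * (1.0 - t))
```

With lam = 0 the solve returns the cubic Hermite coefficients, while the monotone tension choice with μ_i = -2 makes the t^3 coefficient of the numerator vanish, leaving a rational quadratic.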
There are some more recent methods involving rationals, e.g. [58].
The idea of using variable degree to preserve shape was introduced by McAllister, Passow and Roulier in 1977 [47,56]. They produce monotone, convex schemes of arbitrarily high smoothness by constructing a shape-preserving piecewise linear interpolant l with one knot between any two data points (and no knots at the data points) and then defining the final interpolant on each interval [x_i, x_{i+1}] as the Bernstein polynomial of l of some degree m_i. The idea was extended from 1986 by Costantini [8-10]. For k ≥ 1, m_i ≥ 2k + 1, i = 0, ..., N-1, he constructs a shape-preserving piecewise linear interpolant l with knots at x_i + k(x_{i+1} - x_i)/m_i and x_{i+1} - k(x_{i+1} - x_i)/m_i, i = 0, ..., N-1. The final interpolant f coincides on each interval [x_i, x_{i+1}] with the Bernstein polynomial of l of degree m_i and is hence C^k (with f^{(j)}(x_i) = 0, j = 2, ..., k). In [10] there is a co-monotone, co-convex scheme in which the degrees m_i can either be chosen a priori or computed automatically according to the data.
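The Bernstein construction can be sketched as follows (a hedged illustration on a single interval scaled to [0,1]; l stands for the shape-preserving piecewise linear interpolant):

```python
from math import comb

def bernstein(l, m):
    """Degree-m Bernstein polynomial of l on [0,1].  It matches l at the
    endpoints and inherits the monotonicity and convexity of l, which is
    what makes the construction shape-preserving."""
    coeffs = [l(k / m) for k in range(m + 1)]
    def B(t):
        return sum(c * comb(m, k) * t**k * (1 - t)**(m - k)
                   for k, c in enumerate(coeffs))
    return B
```

The Bernstein operator only approximates l in the interior, so interpolation of the data is arranged through the endpoint values and the placement of the knots of l.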
The above schemes using variable degree are not strictly tension schemes in our sense but in 1990, Kaklis and Pandelis [40] introduced a tension method by using the above form for k = 1, i.e. on each interval [x_i, x_{i+1}] the interpolant has the form

f(x) = f(x_i)(1 - t) + f(x_{i+1}) t + c_i t(1 - t)^{m_i} + d_i t^{m_i}(1 - t),   t = (x - x_i)/(x_{i+1} - x_i).

Here m_i ≥ 2 is an integer and for each choice of m_0, ..., m_{N-1}, the numbers c_i, d_i are chosen so that f is C^2, which requires the solution of a strictly diagonally dominant tridiagonal linear system. When all m_i = 2, this reduces to the usual cubic spline interpolant, while as m_i → ∞, f converges uniformly to the linear interpolant on [x_i, x_{i+1}] with order O(m_i^{-1}) (or O(m_i^{-2}) if m_{i-1}, m_{i+1} remain bounded). For further discussion of variable degree shape-preserving functional interpolation, see [11].
Our final type of tension method was introduced by Manni [50] in 1996. The general idea is to define f on [x_i, x_{i+1}] as

f(x) = p_i(q_i^{-1}(x)),

where p_i, q_i are cubic polynomials on [x_i, x_{i+1}] and q_i is strictly increasing from [x_i, x_{i+1}] onto itself, so that the inverse q_i^{-1} is well-defined on [x_i, x_{i+1}]. For f'(x_i) = d_i, i = 0, ..., N, we require

p_i'(x_i) = λ_i d_i,   q_i'(x_i) = λ_i,   p_i'(x_{i+1}) = μ_i d_{i+1},   q_i'(x_{i+1}) = μ_i,

for parameters λ_i > 0, μ_i > 0. For λ_i = μ_i = 1, we have q_i(x) = x and f reduces to a cubic on [x_i, x_{i+1}], while as λ_i = μ_i → 0, f becomes linear on [x_i, x_{i+1}].
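A one-interval sketch of this construction (helper names are ours, not code from [50]; since q_i is strictly increasing it can be inverted by bisection, and for 0 < λ_i, μ_i ≤ 3 the cubic q_i is guaranteed increasing):

```python
def hermite(x0, x1, v0, v1, s0, s1):
    """Cubic on [x0,x1] with end values v0, v1 and end derivatives s0, s1."""
    h = x1 - x0
    def p(x):
        t = (x - x0) / h
        return (v0 * (2 * t**3 - 3 * t**2 + 1) + h * s0 * (t**3 - 2 * t**2 + t)
                + v1 * (-2 * t**3 + 3 * t**2) + h * s1 * (t**3 - t**2))
    return p

def manni_segment(x0, x1, y0, y1, d0, d1, lam, mu):
    """f = p o q^{-1} on [x0,x1]: p has end slopes lam*d0, mu*d1 and q
    (strictly increasing from [x0,x1] onto itself) has end slopes lam, mu."""
    p = hermite(x0, x1, y0, y1, lam * d0, mu * d1)
    q = hermite(x0, x1, x0, x1, lam, mu)
    def f(x):
        lo, hi = x0, x1
        for _ in range(60):              # bisection for q(s) = x
            mid = 0.5 * (lo + hi)
            if q(mid) < x:
                lo = mid
            else:
                hi = mid
        return p(0.5 * (lo + hi))
    return f
```

For lam = mu = 1 the reparameterisation q is the identity and f is just the cubic Hermite interpolant, as in the text.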
In [50], the values d_0, ..., d_N are assumed known (or estimated from the data values) and the scheme is local and C^1; [50] gives necessary and sufficient conditions on the values of the parameters λ_i, μ_i for co-monotonicity, and the scheme has approximation order O(h^3) when g is C^3 and generally O(h^2) when g is C^2.
Manni and co-workers have written a series of papers using the same idea [51,53,54]. For example in [45], the values d_i are not assumed given but are chosen to ensure that the function is C^2, thus providing a locally monotone, co-convex global scheme which generalises usual cubic spline interpolation; while in [52] two further knots are inserted in each interval [x_i, x_{i+1}] to produce a C^2, locally monotone, co-convex local scheme which interpolates values of f^{(j)}(x_i), j = 1, 2, i = 0, ..., N.
2.3 Direct methods
In 1967, Young [71] considered shape-preserving interpolation by polynomials and a number of papers have appeared since on this topic, e.g. [59] gives a constructive proof of
the existence of a co-monotone interpolant with an upper bound on the degree required.
However for a practical algorithm, using a piecewise polynomial offers much more flexibility than a single polynomial. Numerous papers have been written using such polynomial
splines and we mention briefly only a few.
By inserting extra knots between data points, a convexity preserving scheme with C^2 cubics was given by de Boor [4, p. 303], and co-monotone, co-convex schemes with C^1 quadratics in [48,49,66]. C^1 cubic splines with knots at the data points are used for co-monotonicity in [25,5,24,70] (the last of these using a variational approach), and for both co-monotonicity and co-convexity in [16,17]. We also recall the methods using spline functions of variable degree with knots between the data points to obtain interpolants with arbitrarily high smoothness which were discussed under tension methods.
Finally we note that, following the paper [62] which was as early as 1973, Schaback [63] gives a C^2 co-monotone, co-convex scheme which uses a cubic polynomial on any interval [x_i, x_{i+1}] where an inflection is needed, and on other intervals employs a rational function of quadratic/linear form.
3 Planar curves
Given data

I_i ∈ R^2,   i = 0, ..., N,

we consider a curve r : [a, b] → R^2 satisfying

r(t_i) = I_i,   i = 0, ..., N,   (3.1)

for values a = t_0 < t_1 < ... < t_N = b. For a closed curve the situation is extended periodically so that

I_{i+N} = I_i,   t_{i+N} = t_i + b - a,   i ∈ Z,   r(t + b - a) = r(t),   t ∈ R.
3.1 Desirable properties
Shape. For this case it is not usually relevant to consider preservation of monotonicity.
We say a scheme is 'co-convex' if the curve r has the minimum number of inflections
consistent with the data. In practice, schemes satisfy the somewhat stronger condition that for any 0 ≤ i ≤ j - 2 ≤ N - 2, r is positively (negatively) locally convex on [t_{i+1}, t_{j-1}] if the polygonal arc joining I_i, ..., I_j is positively (negatively) locally convex. For more details on this and other desirable properties, see [29].
Smoothness. We shall call the interpolating curve C^k for k ≥ 0 if the function r is C^k. A C^0 curve r we shall call G^1 if the unit tangent vector is continuous, and G^2 if, in addition, the curvature is continuous. A C^k curve r is G^k, k = 1, 2, provided that the parameterisation is regular, i.e. r'(t) ≠ (0,0), which is generally desirable. It is usually sufficient to have G^k, rather than C^k, continuity if only the appearance of the curve is important and the choice of parameter t is not significant.
Fairness. Planar curves often arise in computer-aided design where it may be particularly important that the curve is pleasing to the eye. Though this is subjective, various criteria have been suggested to be relevant, such as magnitude, rate of change or monotonicity of the curvature. Some schemes include 'shape parameters' which can be manipulated by the designer to modify the shape of the curve.
Approximation order is not important in the context of design when the data are not considered to be taken from some unknown curve. Approximation order is related to reproduction of polynomial curves, and a related property for planar curves is reproduction of arcs of circles (or more generally conics); this cannot be done exactly by polynomials but it can be achieved by using rationals.
Locality and other desirable properties are similar to the functional case as described in Section 2.1, though it is generally more appropriate that the invariance is under a rotation and the same scaling in both x and y.
3.2 Tension methods
In Section 2.2 we mentioned Nielson's ν-splines [55]. Applying this scheme for both components of r gives a function r which is cubic on each interval [t_i, t_{i+1}], is C^1 and satisfies

r''(t_i^+) = r''(t_i^-) + ν_i r'(t_i),   i = 1, ..., N-1,

where ν_i ≥ 0. This condition is sufficient for G^2 continuity of r (assuming regular parameterisation). When all ν_i = 0, r will reduce to the usual C^2 cubic spline interpolant. As ν_i → ∞, the curve is 'pulled tight' at I_i and as ν_i, ν_{i+1} → ∞, it approaches the linear interpolant on [t_i, t_{i+1}].
The scheme in [37] by Gregory which was mentioned in Section 2.2 was adapted to the planar case in [38]. Other schemes using rationals were proposed by Clements in [6,7], where r is a G^2 curve which on each interval [t_i, t_{i+1}] has the form, for some a, b, c, d ∈ R^2,

r(t) = a(1 - s)^3/(w_i s + 1) + ... + d s^3/(w_i(1 - s) + 1),   s = (t - t_i)/(t_{i+1} - t_i),

where w_i ≥ 0 are the tension parameters.
The variable degree tension method of [40], also mentioned in Section 2.2, was adapted
to the planar case in [41], and extended in [27] to allow the designer to obtain a 'fair'
curve by minimising the number of changes in the monotonicity of the curvature.
3.3 Direct methods
The papers [34,35,28,23] give local, G^2 co-convex schemes, e.g. in [28], a rational cubic/cubic is used on each interval [t_i, t_{i+1}] and the tangent vectors and curvatures are stipulated by the algorithm to ensure that the convexity conditions are satisfied and circular arcs are reproduced, with the possibility of modifying the tangent vectors and curvatures further as shape parameters.
Following an earlier scheme in [64], Schaback in [65] gives a global G^2 co-convex scheme which uses a cubic polynomial on any interval [t_i, t_{i+1}] where an inflection is needed, and on other intervals employs quadratic polynomials.
Sapidis and Kaklis [61] give a G^2 co-convex scheme by interpolating, by a piecewise quintic curve, tangent directions and curvatures gained by their tension method [41].
In [1] a local, co-convex G^2 scheme is given which uses polynomials of degree six and which attempts to obtain a fair curve by imposing conditions on the curvature to minimise measures of fairness. Finally we note that in [12] Costantini gives an abstract theory and general purpose code.
4 Space curves
Given data

I_i ∈ R^3,   i = 0, ..., N,

we consider a curve r : [a, b] → R^3 satisfying condition (3.1) as before.
4.1 Desirable properties
What is meant by 'shape-preserving' is not so clear for space curves as for the planar
case. Criteria were introduced by Kaklis and Karavelas [39] and extended by Ong and
the author in [31]. We shall sketch these below. They are discussed in further detail in
[30], where some further extensions are suggested. We write, for appropriate indices i:

L_i = I_{i+1} - I_i,   Δ_i = det[L_{i-1}, L_i, L_{i+1}],   N_i = L_{i-1} × L_i.

Torsion. We ensure that the curve is 'twisting' in the same manner as the piecewise linear interpolant by requiring that if Δ_i ≠ 0, then the torsion of r has the same sign as Δ_i on (t_i, t_{i+1}).
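These quantities are straightforward to compute from the data; a minimal sketch (names are ours):

```python
import numpy as np

def shape_quantities(points):
    """Given data points I_0..I_N in R^3, return the chord vectors L_i,
    the determinants Delta_i = det[L_{i-1}, L_i, L_{i+1}], and the
    normals N_i = L_{i-1} x L_i, for the indices where they are defined."""
    I = np.asarray(points, float)
    L = I[1:] - I[:-1]                              # L_i = I_{i+1} - I_i
    Delta = [float(np.linalg.det(np.column_stack((L[i - 1], L[i], L[i + 1]))))
             for i in range(1, len(L) - 1)]
    N = [np.cross(L[i - 1], L[i]) for i in range(1, len(L))]
    return L, Delta, N
```

For data sampled uniformly from a circular helix the determinants Δ_i are all equal and nonzero, reflecting the screw symmetry of the data and the constant sign of the torsion.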
Convexity. Let

K(t) = r'(t) × r''(t),   a ≤ t ≤ b.

We require that for 1 ≤ i ≤ N-1, K(t_i)·N_i > 0, which means that the projection of the curve r onto the plane of I_{i-1}, I_i, I_{i+1} has the same sign of local convexity at I_i as the polygonal arc I_{i-1} I_i I_{i+1}. Moreover if N_i·N_{i+1} > 0, we require

K(t)·N_j > 0,   j = i, i+1,   t_i ≤ t ≤ t_{i+1},

which implies that the curve r has the same sign of local convexity on [t_i, t_{i+1}] when projected in any direction λN_i + (1 - λ)N_{i+1} for 0 ≤ λ ≤ 1. Finally we require that if N_i·N_{i+1} < 0, then for j = i, i+1, K(t)·N_j has exactly one sign change in [t_i, t_{i+1}], which implies that each of the above projections of r has just one inflection.
Smoothness. This is as for planar curves, except that we call the curve G^3 if it is G^2 and, in addition, the torsion is continuous. Other desirable properties are similar to the planar case.
4.2 Tension methods
Although interpolation by space curves with a special shape is considered in [44], the first specific shape-preserving interpolation scheme by space curves was due to Kaklis and Karavelas [39], who adapted the variable degree tension method of [40] to give a C^2 method which was also G^2, but at the expense of zero torsion at the data points. In [42] the same authors adapted Nielson's ν-splines to the three dimensional case to give a curve which is C^1 and G^2. The paper [14] also uses variable degree for tension parameters but gives a C^3 scheme in which the limiting curve as the tension goes to infinity is not the piecewise linear interpolant but the shape-preserving interpolant given by either of the above two schemes. In [15] a C^2 scheme is also given but here the components of r on each interval [t_i, t_{i+1}] lie in the linear span of the functions

1,   s,   (1 - s)^{m_i},   s^{m_{i+1}},   s = (t - t_i)/(t_{i+1} - t_i).

When m_i = m_{i+1} = 5, this reduces to a quintic polynomial. As m_i, m_{i+1} → ∞, it tends to a linear polynomial and then the curve r approaches the piecewise linear interpolant on [t_i, t_{i+1}].
The paper [26] also uses variable degree splines with degree on each interval at least five, and the curve r also converges to the piecewise linear interpolant as the degrees go to infinity. However here the curve is C^4, which the authors feel may give extra fairness to the curve due, for example, to lowering the maximum absolute value of the curvature. Variable degree polynomial splines are also used in [13].
4.3 Direct methods
Following an earlier scheme in [31], Ong and the author gave a local G^2 scheme in [32] which employed a rational cubic/cubic between data points, extending the ideas of the planar scheme in [28]. This was further extended to a local G^3 scheme using a rational quartic/quartic in [43]. In [33], the degrees of freedom inherent in the scheme in [32] were used to optimise a fairness measure. Finally we mention the papers [2,3] which give local G^2 schemes using a piecewise polynomial of degree six, also allowing optimisation of a fairness measure.
It will be noted that many of the above papers are extremely recent and it is hoped that the unavoidable lack of detail here will serve to tantalise readers to discover for themselves more of this rapidly developing field.
Bibliography
1. S. Asaturyan, P. Costantini and C. Manni, G^2 shape preserving parametric planar curve interpolation, in Creating Fair and Shape-Preserving Curves and Surfaces, H. Nowacki, P. D. Kaklis (eds.), B. G. Teubner, Stuttgart (1998), 89-98.
2. S. Asaturyan, P. Costantini and C. Manni, Shape-preserving interpolating curves in R^3: a local approach, in Creating Fair and Shape-Preserving Curves and Surfaces, H. Nowacki, P. D. Kaklis (eds.), B. G. Teubner, Stuttgart (1998), 99-108.
3. S. Asaturyan, P. Costantini and C. Manni, Local shape-preserving interpolation by
space curves, IMA J. Numer. Anal. 21 (2001), 301-325.
4. C. de Boor, A Practical Guide to Splines, Springer, New York (1978).
5. J. Butland, A method of interpolating reasonable-shaped curves through any data, Proc. Computer Graphics 80, Online Publ. Ltd., Northwood Hills, Middlesex, U.K. (1980), 409-422.
6. J. C. Clements, Convexity-preserving piecewise rational cubic interpolation, SIAM J. Numer. Anal. 27 (1990), 1016-1023.
7. J. C. Clements, A convexity-preserving C^2 parametric rational cubic interpolant, Numer. Math. 63 (1992), 165-171.
8. P. Costantini, On monotone and convex spline interpolation, Math. Comp. 46
(1986), 203-214.
9. P. Costantini, Co-monotone interpolating splines of arbitrary degree - a local approach, SIAM J. Sci. Stat. Comput. 8 (1987), 1026-1034.
10. P. Costantini, An algorithm for computing shape-preserving interpolating splines
of arbitrary degree, J. Comput. Appl. Math. 22 (1988), 89-136.
11. P. Costantini, Abstract schemes for functional shape-preserving interpolation, in
Advanced Course on Fairshape, J. Hoschek, P. Kaklis (eds.), B. G. Teubner, Stuttgart (1996), 185-199.
12. P. Costantini, Boundary-valued shape-preserving interpolating splines, ACM Trans. Math. Software 23 (1997), 229-251.
13. P. Costantini, Curve and surface construction using variable degree polynomial splines, Computer Aided Geometric Design 17 (2000), 419-446.
14. P. Costantini, T. N. T. Goodman and C. Manni, Constructing C^3 shape preserving interpolating space curves, Adv. Comput. Math. 14 (2001), 103-127.
15. P. Costantini and C. Manni, Shape-preserving C^2 interpolation: the curve case, to appear.
16. P. Costantini and R. Morandi, Monotone and convex cubic spline interpolation,
Calcolo 21 (1984), 281-294.
17. P. Costantini and R. Morandi, An algorithm for computing shape-preserving cubic
spline interpolation to data, Calcolo 21 (1984), 295-305.
18. R. Delbourgo, Shape preserving interpolation to convex data by rational functions
with quadratic numerator and linear denominator, IMA J. Numer. Anal. 9 (1989),
123-136.
19. R. Delbourgo and J. A. Gregory, C^2 rational quadratic spline interpolation to monotonic data, IMA J. Numer. Anal. 3 (1983), 141-152.
20. R. Delbourgo and J. A. Gregory, The determination of derivative parameters for a
monotonic rational quadratic interpolant, IMA J. Numer. Anal. 5 (1985), 397-406.
21. R. Delbourgo and J. A. Gregory, Shape preserving piecewise rational interpolation,
SIAM J. Sci. Stat. Comput. 6 (1985), 967-976.
22. T. A. Foley, A shape preserving interpolant with tension controls, Computer Aided Geometric Design 5 (1988).
23. T. A. Foley, T. N. T. Goodman and K. Unsworth, An algorithm for shape-preserving parametric interpolating curves with G^2 continuity, in Mathematical Methods in CAGD, T. Lyche, L. L. Schumaker (eds.), Academic Press, Boston (1989), 249-259.
24. F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise
cubic interpolants, SIAM J. Sci. Stat. Comput. 5 (1984), 300-304.
25. F. N. Fritsch and R. E. Carlson, Monotone piecewise cubic interpolation, SIAM J.
Numer. Anal. 17 (1980), 238-246.
26. N. C. Gabrielides and P. D. Kaklis, C^4 interpolatory shape-preserving polynomial splines of variable degree, Computing 65 (2001), to appear.
27. A. Ginnis, P. Kaklis and N. S. Sapidis, Polynomial splines of non-uniform degree: controlling convexity and fairness, in Designing Fair Curves and Surfaces, N. S. Sapidis (ed.), SIAM Series on Geometric Design, Philadelphia (1994), Part 3, Chapter 10.
28. T. N. T. Goodman, Shape preserving interpolation by parametric rational cubic
splines, in Numerical Mathematics Singapore 1988, R. P. Agarwal, Y. M. Chow, S.
J. Wilson (eds.), International Series of Numerical Mathematics Vol. 86, Birkhauser
Verlag, Basel (1988), 149-158.
29. T. N. T. Goodman, Shape preserving interpolation by planar curves, in Advanced
Course on Fairshape, J. Hoschek, P. Kaklis (eds.), B. G. Teubner, Stuttgart (1996),
29-38.
30. T. N. T. Goodman and B. H. Ong, Shape preserving interpolation by curves in
three dimensions, in Advanced Course on Fairshape, J. Hoschek, P. Kaklis (eds.),
B. G. Teubner, Stuttgart (1996), 39-48.
31. T. N. T. Goodman and B. H. Ong, Shape preserving interpolation by space curves, Computer Aided Geometric Design 15 (1997), 1-17.
32. T. N. T. Goodman and B. H. Ong, Shape preserving interpolation by G^2 curves in three dimensions, in Curves and Surfaces with Applications in CAGD, A. LeMehaute, C. Rabut, L. L. Schumaker (eds.), Vanderbilt Univ. Press, Nashville (1997), 151-158.
33. T. N. T. Goodman, B. H. Ong and M. L. Sampoli, Automatic interpolation by fair, shape preserving, G^2 space curves, Computer-Aided Design 30 (1998), 813-822.
34. T. N. T. Goodman and K. Unsworth, Shape preserving interpolation by parametrically defined curves, SIAM J. Numer. Anal. 25 (1988), 1453-1465.
35. T. N. T. Goodman and K. Unsworth, Shape preserving interpolation by curvature continuous parametric curves, Computer Aided Geometric Design 5 (1988), 323-340.
36. J. A. Gregory, Shape preserving rational spline interpolation, in Rational Approximation and Interpolation, Graves-Morris, Saff and Varga (eds.), Springer-Verlag (1984), 431-441.
37. J. A. Gregory, Shape preserving spline interpolation, Computer-Aided Design 18 (1986), 53-58.
38. J. A. Gregory and M. Sarfraz, A rational cubic spline with tension, Computer Aided Geometric Design 7 (1990), 1-13.
39. P. D. Kaklis and M. I. Karavelas, Shape preserving interpolation in R^3, IMA J. Numer. Anal. 17 (1997), 373-419.
40. P. D. Kaklis and D. G. Pandelis, Convexity-preserving polynomial splines of non-uniform degree, IMA J. Numer. Anal. 10 (1990), 223-234.
41. P. D. Kaklis and N. S. Sapidis, Convexity-preserving interpolating parametric splines of non-uniform polynomial degree, Computer Aided Geometric Design 12 (1995), 1-26.
42. M. I. Karavelas and P. D. Kaklis, Spatial shape-preserving interpolation using ν-splines, Numerical Algorithms 23 (2000), 217-250.
43. V. P. Kong and B. H. Ong, Shape preserving interpolation using Frenet frame continuous curves of order 3, to appear.
44. C. Labenski and B. Piper, Coils, Computer Aided Geometric Design 20 (1996),
1-29.
45. P. Lamberti and C. Manni, Shape preserving C^2 functional interpolation via parametric cubics, Numerical Algorithms, to appear.
46. R. W. Lynch, A method for choosing a tension factor for spline under tension
interpolation, M.Sc. Thesis, Univ. of Texas at Austin (1982).
47. D. F. McAllister, E. Passow and J. A. Roulier, Algorithms for computing shape
preserving spline interpolation to data. Math. Comp. 31 (1977), 717-725.
48. D. F. McAllister and J. A. Roulier, An algorithm for computing a shape preserving
osculating quadratic spline, ACM Trans. Math. Software 7 (1981), 331-347.
49. D. F. McAllister and J. A. Roulier, Algorithm 574: Shape preserving osculating quadratic splines, ACM Trans. Math. Software 7 (1981), 384-386.
50. C. Manni, C^1 comonotone Hermite interpolation via parametric cubics, J. Comp. Appl. Math. 69 (1996), 143-157.
51. C. Manni, Parametric shape-preserving Hermite interpolation by piecewise quadratics, in Advanced Topics in Multivariate Approximation, F. Fontanella, K. Jetter, P. J. Laurent (eds.), World Scientific (1996), 211-228.
52. C. Manni, On shape preserving C^2 Hermite interpolation, BIT 41 (2001), 127-148.
53. C. Manni and P. Sablonniere, Monotone interpolation of order 3 by C^2 cubic splines, IMA J. Numer. Anal. 17 (1997), 305-320.
54. C. Manni and M. L. Sampoli, Comonotone parametric Hermite interpolation, in
Mathematical Methods for Curves and Surfaces II, M. Daehlen, T. Lyche, L. L.
Schumaker (eds.), Vanderbilt Univ. Press, Nashville (1998), 343-350.
55. G. M. Nielson, Some piecewise polynomial alternatives to splines under tension, in Computer Aided Geometric Design, R. E. Barnhill, R. F. Riesenfeld (eds.), Academic Press (1974), 209-235.
56. E. Passow and J. A. Roulier, Monotone and convex interpolation, SIAM J. Numer.
Anal. 14 (1977), 904-909.
57. S. Pruess, Properties of splines in tension, J. Approx. Theory 17 (1976), 86-96.
58. R. Qu and M. Sarfraz, Efficient method for curve interpolation with monotonicity preservation and shape control, Neural, Parallel and Scientific Computations 5 (1997), 275-288.
59. L. Raymon, Piecewise monotone interpolation of polynomial type, SIAM J. Math. Anal. 12 (1981), 110-114.
60. N. S. Sapidis, P. D. Kaklis and T. A. Loukakis, A method for computing the tension parameters in convexity preserving spline-in-tension interpolation, Numer. Math. 54 (1988), 179-192.
61. N. S. Sapidis and P. D. Kaklis, A hybrid method for shape-preserving interpolation with curvature-continuous quintic splines, Computing Suppl. 10 (1995), 285-301.
62. R. Schaback, Spezielle rationale Splinefunktionen, J. Approx. Theory 7 (1973), 281-292.
63. R. Schaback, Adaptive rational splines, NAM-Bericht Nr. 60, Universität Göttingen (1988).
64. R. Schaback, Interpolation in R^2 by piecewise quadratic visually C^2 Bezier polynomials, Computer Aided Geometric Design 6 (1989), 219-233.
65. R. Schaback, On global GC^2 convexity preserving interpolation of planar curves by piecewise Bezier polynomials, in Mathematical Methods in CAGD, T. Lyche, L. L. Schumaker (eds.), Academic Press, Boston (1989), 539-548.
66. L. L. Schumaker, On shape preserving quadratic spline interpolation, SIAM J. Numer. Anal. 20 (1983), 854-864.
67. D. G. Schweikert, An interpolation curve using a spline in tension, J. Math. Phys. 45 (1966), 312-317.
68. H. Spath, Exponential spline interpolation, Computing 4 (1969), 225-233.
69. H. Spath, Spline algorithms for curves and surfaces, Utilitas Mathematica Pub. Inc.,
Winnipeg (1974).
70. F. I. Utreras and V. Celis, Piecewise cubic monotone interpolation: a variational
approach, Departamento de Matematicas, Universidad de Chile, Tech. Report MA-83-B-281 (1983).
71. S. W. Young, Piecewise monotone polynomial interpolation, Bull. Amer. Math. Soc.
73 (1967), 642-643.
CAGD techniques for differentiable manifolds
Achan Lin and Marshall Walker
York University, Toronto M3J 1P3, Canada.
lin@yorku.ca, walker@yorku.ca
Abstract
The paper outlines procedures for extending the de Casteljau, de Boor and Aitken algorithms in such a way as to allow the construction on a Riemannian manifold of curves
analogous to Bezier, B-spline, and Lagrange curves. These curves lie in the manifold and
respect intrinsic geometry.
1 Introduction
Given a sequence of points in a Riemannian manifold M we describe methods for extending the de Casteljau, de Boor, and Aitken algorithms. These methods allow construction of corresponding interpolating or approximating curves that lie in the manifold and respect its intrinsic geometry. In the case that the manifold is a sphere, opportunities for application exist in the domain of geological and geographical mapping, for instance the creation of topographical contour lines or isotherms, and in the field of video production, where it is desirable to have smooth camera trajectories interpolating fixed camera positions. For higher dimensional manifolds there are applications in the field of data analysis. For the case of a sphere, there is an extensive literature dealing with the general problem of data fitting, and a superb review can be found in Fasshauer and Schumaker [2]. Shoemake [8] uses properties of quaternion arithmetic to describe curves on the unit quaternion sphere, and Levesley and Ragozin [5], using techniques different from those presented in this paper, describe methods for Lagrange interpolation in differentiable manifolds.
The techniques described in this paper come from the simple observation that in the de Casteljau, de Boor, and Aitken algorithms one may formally substitute appropriately parametrized geodesic arcs for straight line segments. These ideas are introduced in detail in the next section in the context of the blossoming paradigm, [7] and [3]. Unfortunately many of the useful properties of blossoms depend on the affine structure of Euclidean space, which in general has no counterpart in a Riemannian manifold. In particular, geodesic blossoms may be neither symmetric nor multi-affine, and in general they do not possess the uniqueness characteristics common to the Euclidean blossom.
For an arbitrary Riemannian manifold [1], or indeed an arbitrary differentiable 2-manifold embedded in E^3, it may not be possible to construct unique shortest geodesic arcs between two points. However, if the manifold is compact, or in the case that the two points lie in a sufficiently small neighborhood, such arcs are known to exist. But even
then, there appears to be no general method that allows explicit construction. So, the
task of constructing geodesic blossoms becomes a study of special cases in which specific
methods can be set forth. For the general case, a discrete variational method can be
used to obtain good approximations.
In Section 3 a few specific examples are discussed. The case in which the manifold is a sphere is given special attention. There we introduce a variation which allows the discussion of Archimedian curves, which are constructed by substituting Archimedian spirals for geodesics. This variation allows the natural construction of curves that lie off the sphere. Although the spherical geodesic blossoms are neither symmetric nor multi-affine, a simple reparametrization of the geodesic arcs results in spherical blossoms that have all the desirable characteristics. Section 3 also contains a brief discussion of the problem of finding geodesics in developable surfaces and in surfaces of revolution.
2 Preliminaries
Let M be a C^∞ Riemannian manifold. The following theorem guarantees the local existence of geodesics.
Theorem 2.1 Let M be a Riemannian manifold and x_0 ∈ M. Then there exist a neighborhood V of x_0 and ε > 0 so that, if x ∈ V and v is a non-zero tangent vector at x with ‖v‖ < ε, then there is a unique C^∞ geodesic α : (−2, 2) → M defined on the open interval (−2, 2) such that α(0) = x and (dα/dt)|_{t=0} = v.
For compact Riemannian manifolds there is the Hopf-Rinow theorem that tells us
that points can be connected by geodesic arcs.
Theorem 2.2 (Hopf and Rinow) If a connected Riemannian manifold M is compact,
then any pair of points x and y may be joined by a geodesic whose length corresponds to
the distance in the manifold from x to y.
We also need the notion of geodesic convexity and the result of J. H. C. Whitehead that geodesically convex neighborhoods exist for all x ∈ M.
Definition 2.3 Given a subset X of M and a point x_0 ∈ X, X is star shaped with respect to the point x_0 if for every x ∈ X there is a unique shortest geodesic connecting x_0 with x which lies in X.
Definition 2.4 A subset X of M is geodesically convex if it is star shaped with respect
to each of its points.
Definition 2.5 Given a subset A of a geodesically convex set X, the geodesic convex hull of A is the smallest geodesically convex set which contains A.
Theorem 2.6 (J. H. C. Whitehead) Let V be an open subset of a Riemannian manifold M and let x ∈ V. Then there is a geodesically convex open neighborhood U of x such that U ⊂ V.
Let M be a Riemannian manifold and let X be a geodesically convex subset of M. Given points P_i in M we describe extensions of the de Casteljau, de Boor, and Aitken algorithms.
2.1 Riemannian Lagrange curves
Let M be a Riemannian manifold, and let A = {P_0, P_1, …, P_n} be a subset of a geodesically convex subset X. Given parameter points t_0 < t_1 < ⋯ < t_n, assume that A is contained in a sufficiently small neighborhood in which the specified geodesics exist. For 0 ≤ i ≤ n − 1, define γ_i^1 : [t_0, t_n] → X to be the unique geodesic parametrized so that γ_i^1(t_i) = P_i and γ_i^1(t_{i+1}) = P_{i+1}. For 1 < r ≤ n and 0 ≤ i ≤ n − r define γ_i^r : [t_0, t_n]^r → X so that γ_i^r(u_1, u_2, …, u_{r−1}, ·) is the unique geodesic parametrized so that γ_i^r(u_1, u_2, …, u_{r−1}, t_i) = γ_i^{r−1}(u_1, u_2, …, u_{r−1}) and γ_i^r(u_1, u_2, …, u_{r−1}, t_{i+r}) = γ_{i+1}^{r−1}(u_1, u_2, …, u_{r−1}). The function γ_0^n : [t_0, t_n]^n → X is called the geodesic Aitken blossom associated with the points P_i ∈ X, 0 ≤ i ≤ n, and the parameter points t_0 < t_1 < ⋯ < t_n. If Δ : [t_0, t_n] → [t_0, t_n]^n is the diagonal map defined by Δ(u) = (u, u, …, u) (n copies), the geodesic Lagrange curve associated with X and the points P_i is the function Γ_0^n = γ_0^n ∘ Δ.
Theorem 2.7 If Γ_0^n : [t_0, t_n] → M is the geodesic Lagrange curve associated with the points P_i ∈ M, 0 ≤ i ≤ n, as defined above, then Γ_0^n(t_i) = P_i.

Proof: Observe that for 1 ≤ r ≤ n and 0 ≤ i ≤ n − r, γ_i^r depends for its definition only on the points P_j, where i ≤ j ≤ i + r. If n = 1, and we are given points P_0 and P_1, the result follows from the definition of γ_0^1. Inductively assume the result is true for k < n. For k = n, if i = 0, by definition

Γ_0^n(t_0) = γ_0^n(t_0, t_0, …, t_0) = γ_0^{n−1}(t_0, t_0, …, t_0) = ⋯ = γ_0^1(t_0) = P_0,

and likewise, if i = n, Γ_0^n(t_n) = γ_0^n(t_n, t_n, …, t_n) = γ_1^{n−1}(t_n, t_n, …, t_n) = ⋯ = γ_{n−1}^1(t_n) = P_n. For i ≠ 0 and i ≠ n, observe that the geodesics used in the construction of γ_0^{n−1} and γ_1^{n−1} may be restricted respectively to the intervals [t_0, t_{n−1}] and [t_1, t_n], so that γ_0^{n−1} becomes the geodesic Aitken blossom associated with the points P_0, P_1, …, P_{n−1} and the parameter points t_0 < t_1 < ⋯ < t_{n−1}, and γ_1^{n−1} becomes the geodesic Aitken blossom associated with the points P_1, P_2, …, P_n and the parameter points t_1 < t_2 < ⋯ < t_n. By the inductive assumption, γ_0^{n−1}(t_i, t_i, …, t_i) = P_i = γ_1^{n−1}(t_i, t_i, …, t_i) (n − 1 copies of t_i), and consequently γ_0^n(t_i, t_i, …, t_i, ·) is the geodesic connecting γ_0^{n−1}(t_i, …, t_i) with γ_1^{n−1}(t_i, …, t_i), and is thus the constant function: γ_0^n(t_i, t_i, …, t_i, u) = P_i for all u ∈ [t_0, t_n]. Thus in particular, γ_0^n(t_i, t_i, …, t_i) = Γ_0^n(t_i) = P_i. □
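The recursion defining γ_0^n can be sketched in code once a geodesic evaluator is available. In the following Python sketch the function names are ours, not the authors'; with the Euclidean default (straight-line "geodesics") the scheme reduces to Neville's classical algorithm, so it can be checked against ordinary polynomial interpolation.

```python
def lerp(p, q, w):
    """Straight-line 'geodesic' in Euclidean space: the affine point (1-w)p + wq."""
    return tuple((1 - w) * a + w * b for a, b in zip(p, q))

def geodesic_aitken(points, params, t, geodesic=lerp):
    """Evaluate the geodesic Lagrange curve Gamma_0^n at t.

    Level r replaces the level-(r-1) values gamma_i^{r-1}, gamma_{i+1}^{r-1}
    by the point of the connecting geodesic whose parameter on [t_i, t_{i+r}]
    equals t.  With the Euclidean lerp this is exactly Neville's algorithm."""
    g = list(points)
    n = len(points) - 1
    for r in range(1, n + 1):
        g = [geodesic(g[i], g[i + 1],
                      (t - params[i]) / (params[i + r] - params[i]))
             for i in range(n - r + 1)]
    return g[0]
```

On the sphere, the lerp argument would be replaced by a great-circle evaluator such as the one written out explicitly in Section 3.1.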
2.2 Riemannian Bezier curves
Following the previous format we introduce a Riemannian version of the de Casteljau algorithm. Accordingly, let X be a geodesically convex subset of a Riemannian manifold M. Let A = {P_0, P_1, …, P_n} be a subset of X. Define γ_i^0 : [0, 1] → X by γ_i^0(u) = P_i. For 1 ≤ r ≤ n and 0 ≤ i ≤ n − r define γ_i^r : [0, 1]^r → X to be the unique geodesic with the property that γ_i^r(u_1, u_2, …, u_{r−1}, 0) = γ_i^{r−1}(u_1, u_2, …, u_{r−1}) and γ_i^r(u_1, u_2, …, u_{r−1}, 1) = γ_{i+1}^{r−1}(u_1, u_2, …, u_{r−1}). The function γ_0^n : [0, 1]^n → X is called the geodesic de Casteljau blossom associated with the set A. If Δ : [0, 1] → [0, 1]^n is the diagonal map, the geodesic Bezier curve associated with X and the set A is the function Γ_0^n = γ_0^n ∘ Δ.
2.3 Riemannian B-Spline curves
Given A = {P_0, P_1, …, P_n} contained in a geodesically convex subset X of a Riemannian manifold M, and given knots t_1 < t_2 < ⋯ < t_{2n}, define γ_i^0 : [t_1, t_{2n}] → X by γ_i^0(t) = P_i, for 0 ≤ i ≤ n. For 1 ≤ r ≤ n and r ≤ i ≤ n, define γ_i^r : [t_i, t_{i+n+1−r}]^r → X to be the unique geodesic with the property that γ_i^r(u_1, u_2, …, u_{r−1}, t_i) = γ_{i−1}^{r−1}(u_1, u_2, …, u_{r−1}) and γ_i^r(u_1, u_2, …, u_{r−1}, t_{i+n+1−r}) = γ_i^{r−1}(u_1, u_2, …, u_{r−1}). The function γ_n^n : [t_n, t_{n+1}]^n → X is called the geodesic de Boor blossom associated with the set A. If Δ : [t_n, t_{n+1}] → [t_n, t_{n+1}]^n is the diagonal map, the geodesic B-Spline curve associated with X and the points P_i is the function Γ_n^n = γ_n^n ∘ Δ.
We have the following results, which follow from the fact that both the geodesic de Casteljau and the geodesic de Boor blossoms are constructed from successive geodesic combinations beginning with the set A = {P_0, P_1, …, P_n}.

Theorem 2.8 Given A = {P_0, P_1, …, P_n} contained in a geodesically convex subset X of a Riemannian manifold, if γ_0^n : [0, 1]^n → X is the geodesic de Casteljau blossom of A, then γ_0^n([0, 1]^n) is contained in the geodesic convex hull of the set A.

Theorem 2.9 Given A = {P_0, P_1, …, P_n} contained in a geodesically convex subset X of a Riemannian manifold, if γ_n^n : [t_n, t_{n+1}]^n → X is the geodesic de Boor blossom of A relative to a knot sequence t_1 < t_2 < ⋯ < t_{2n}, then γ_n^n([t_n, t_{n+1}]^n) is contained in the geodesic convex hull of the set A.
Since each of the three blossoms is constructed successively from C^∞ geodesics, it follows that the blossoms and their restrictions to the diagonal are also of class C^∞.
Theorem 2.10 The geodesic Lagrange, Bezier, and B-spline curves are of class C^∞, as are each of their corresponding blossoms.
3 Examples
The impediments to implementation of these ideas depend on the manifold in question. In all cases it is necessary that the points P_i should lie in a region in which it is possible to construct geodesic arcs between points. The problem then reduces to that of finding methods for such constructions. Even in cases for which this is possible, there is the additional problem that many of the desirable properties associated with B-spline or Bezier curves in R^3 may have no direct analogs. Many properties, such as the ability
to subdivide a curve, depend on the blossom being symmetric or multi-affine, and for the generalizations presented here this is seldom true. For the case of an orientable 2-manifold embedded in R^3, there are in many cases good solutions to the problem of finding geodesics, but different classes of surfaces lead to different solutions. In this section we mention a few. In the case that the manifold M is the 2-sphere S^2, a preliminary version of our results is reported in [5].
3.1 The sphere
In the case that M = S^2, a small alteration to the methods presented so far allows the consideration of curves that lie off the sphere. Given points P and Q that lie off the sphere, consider their radial projections to points P̄ and Q̄ on the sphere, and let γ : [a, b] → S^2 be a geodesic with the property that γ(a) = P̄ and γ(b) = Q̄. The curve γ̄ : [a, b] → R^3 defined by

γ̄(t) = ( ((b − t)/(b − a)) ‖P‖ + ((t − a)/(b − a)) ‖Q‖ ) γ(t)

is called the Archimedian spiral connecting the points P and Q. To explicitly describe the curve γ, set P̄ = v_1, Q̄ = v_2, and for simplicity consider the parameter interval [a, b] to be the unit interval [0, 1]. For ⟨·,·⟩ the standard inner product on R^3, set

v_3 = (v_2 − ⟨v_1, v_2⟩ v_1) / ‖v_2 − ⟨v_1, v_2⟩ v_1‖,

so that v_3 is orthogonal to v_1 and lies in the plane containing v_1 and v_2. Letting θ denote the angle between v_1 and v_2 (so that cos θ = ⟨v_1, v_2⟩), the geodesic γ connecting v_1 with v_2 is given by

γ(t) = cos(tθ) v_1 + sin(tθ) v_3
     = ( cos(tθ) − sin(tθ) ⟨v_1, v_2⟩ / ‖v_2 − ⟨v_1, v_2⟩ v_1‖ ) v_1 + ( sin(tθ) / ‖v_2 − ⟨v_1, v_2⟩ v_1‖ ) v_2.

The corresponding Archimedian Lagrange, Bezier and B-spline curves may now be constructed with the general algorithms of Section 2.
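The formulas above translate directly into code; in the following sketch the function names are ours. sphere_geodesic evaluates the great-circle arc between two unit vectors, and archimedian_spiral scales it by the linearly interpolated radii of the off-sphere endpoints.

```python
import math

def sphere_geodesic(v1, v2):
    """gamma(t) = cos(t*theta) v1 + sin(t*theta) v3, with v3 the unit vector
    orthogonal to v1 in the plane of v1, v2 (v1, v2 unit, not (anti)parallel)."""
    c = sum(a * b for a, b in zip(v1, v2))
    theta = math.acos(max(-1.0, min(1.0, c)))
    w = tuple(b - c * a for a, b in zip(v1, v2))
    nw = math.sqrt(sum(x * x for x in w))
    v3 = tuple(x / nw for x in w)
    return lambda t: tuple(math.cos(t * theta) * a + math.sin(t * theta) * b
                           for a, b in zip(v1, v3))

def archimedian_spiral(P, Q, a=0.0, b=1.0):
    """Spiral connecting off-sphere points P, Q: the radius is interpolated
    linearly while the direction follows the geodesic between the radial
    projections of P and Q."""
    nP = math.sqrt(sum(x * x for x in P))
    nQ = math.sqrt(sum(x * x for x in Q))
    gamma = sphere_geodesic(tuple(x / nP for x in P), tuple(x / nQ for x in Q))
    def curve(t):
        r = ((b - t) / (b - a)) * nP + ((t - a) / (b - a)) * nQ
        return tuple(r * x for x in gamma((t - a) / (b - a)))
    return curve
```

By construction curve(a) = P and curve(b) = Q, and the radial projection of the spiral is the geodesic γ.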
One of the difficulties that arises with Archimedian curves is that geodesic blossoms are not necessarily symmetric or multi-affine; it is not even clear what these concepts might mean in a geodesic context. Consequently, certain results for ordinary Bezier or B-spline curves that depend on these properties are no longer valid. In particular, analogs of the subdivision algorithms that allow one to determine the control points of a portion of a given Bezier or B-spline curve are not valid. However, it can be shown that a simple non-linear change in the parametrization of the geodesic arcs makes it possible to recapture most of what is needed.
Definition 3.1 Given two points A and B on the sphere, let C be the smaller arc of the spherical geodesic joining A with B. The barycentric parametrization of C on the parameter interval [a, b] is the function α : [a, b] → C defined by

α(t) = q(x(t)),

where x(t) = ((b − t)/(b − a)) A + ((t − a)/(b − a)) B and q : R^3 \ {0} → S^2 is the radial projection q(x) = x/‖x‖.

In the following we prove a spherical version of the Menelaus theorem.
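The barycentric parametrization is immediate to implement (function names ours): move along the Euclidean chord from A to B and project radially back onto the sphere.

```python
import math

def radial_projection(x):
    """q(x) = x / ||x||, defined for x != 0."""
    n = math.sqrt(sum(c * c for c in x))
    return tuple(c / n for c in x)

def barycentric_arc(A, B, a=0.0, b=1.0):
    """alpha(t) = q(x(t)) with x(t) = ((b-t)/(b-a)) A + ((t-a)/(b-a)) B;
    valid when A and B are not antipodal, so x(t) never vanishes."""
    def alpha(t):
        w = (b - t) / (b - a)
        return radial_projection(tuple(w * p + (1 - w) * q
                                       for p, q in zip(A, B)))
    return alpha
```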
Theorem 3.2 Given three points P_0, P_1, P_2 on S^2, let γ : [0, 1] × [0, 1] → S^2 be the geodesic de Casteljau blossom in which all geodesic arcs are given the barycentric parametrization. Then γ(s, t) = γ(t, s).

Proof: An elementary geometric argument tells us that

γ(s, t) = γ_0^2(s, t) = q((1 − t) γ_0^1(s) + t γ_1^1(s)) = q((1 − t)[(1 − s)P_0 + sP_1] + t[(1 − s)P_1 + sP_2])

and

γ(t, s) = γ_0^2(t, s) = q((1 − s) γ_0^1(t) + s γ_1^1(t)) = q((1 − s)[(1 − t)P_0 + tP_1] + s[(1 − t)P_1 + tP_2]),

and the result follows from the affine properties of R^3. □
As an immediate consequence we have

Theorem 3.3 Given points P_0, P_1, …, P_n on S^2, the associated de Casteljau blossom, in which the geodesic arcs are given the barycentric parametrization, is symmetric.
The conventional blossoming description of subdivision can now be employed. From the blossom construction we can conclude that γ_0^n(0, 0, …, 0, 1, 1, …, 1) = P_i, where the argument contains i ones. In particular, it follows that, for 0 ≤ u ≤ 1, the points Q_i = γ_0^n(0, 0, …, 0, u, u, …, u), with i copies of u, determine a geodesic de Casteljau blossom which is parametrized to the interval [0, u] and which, because of the uniqueness of geodesic arcs, equals the restriction of γ_0^n to [0, u]^n. Likewise, for the interval [u, 1], the points P̄_i = γ_0^n(u, u, …, u, 1, 1, …, 1), with i ones, determine a geodesic de Casteljau blossom which is parametrized to the interval [u, 1] and which equals the restriction of γ_0^n to [u, 1]^n. Therefore, if g : [0, 1] → S^2 is the geodesic Bezier curve determined by P_0, P_1, …, P_n, so that g = γ_0^n ∘ Δ, it follows that g|[0,u] is the geodesic Bezier curve on [0, u] determined by the points Q_i, and g|[u,1] is the geodesic Bezier curve on [u, 1] determined by the points P̄_i, for 0 ≤ u ≤ 1.
More generally, and along the lines of the proof above, we have the following theorem, which allows all familiar properties of Bezier and B-spline curves that have descriptions in terms of their corresponding blossoms to carry over to the spherical case.

Theorem 3.4 Let f : [0, 1]^n → R^3 be the Euclidean blossom generated by the de Casteljau algorithm using points P_i ∈ S^2, 0 ≤ i ≤ n. Then γ_0^n = q ∘ f.
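Theorem 3.4 gives a practical recipe for evaluating the barycentrically parametrized spherical blossom (function names ours): compute the ordinary multi-affine de Casteljau blossom in R^3 and project radially; symmetry is then inherited from the Euclidean blossom.

```python
import math

def euclidean_blossom(points, args):
    """Multi-affine de Casteljau blossom f(u_1, ..., u_n): at step r the r-th
    argument is used in every affine combination of consecutive points."""
    pts = [tuple(p) for p in points]
    for u in args:
        pts = [tuple((1 - u) * a + u * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def spherical_blossom(points, args):
    """q o f: radial projection of the Euclidean blossom (Theorem 3.4)."""
    x = euclidean_blossom(points, args)
    n = math.sqrt(sum(c * c for c in x))
    return tuple(c / n for c in x)
```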
3.2 Other surfaces
We briefly discuss two examples in which explicit descriptions of geodesics between points are possible.
A developable surface S [4], described as the image of a function f : U → E^3 for U an open subset of R^2, possesses the characteristic, among others, that distances are
preserved by the function f. Therefore, a geodesic in the surface f(U) may be considered as the image of a straight line in the plane. If P_0, P_1, …, P_n are points in S, let Q_i = f^{-1}(P_i), 0 ≤ i ≤ n. If C ⊂ U is the Lagrange, Bezier, or B-spline curve obtained from the standard Euclidean versions of the algorithms applied to the points Q_i, then it follows that f(C) is the corresponding geodesic curve in S that would have been obtained using the geodesic versions of the algorithms that we have described.
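As a concrete developable example (our choice, not the authors'), the cylinder with an arc-length parametrization is a local isometry of the plane, so the Euclidean algorithm can be run in the parameter plane and the result mapped through f:

```python
import math

def f_cylinder(u, v, R=1.0):
    """Arc-length parametrization of the radius-R cylinder; a local isometry
    from the (u, v) plane, so plane lines map to cylinder geodesics (helices)."""
    return (R * math.cos(u / R), R * math.sin(u / R), v)

def bezier_plane(ctrl, t):
    """Standard planar de Casteljau evaluation."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [((1 - t) * a[0] + t * b[0], (1 - t) * a[1] + t * b[1])
               for a, b in zip(pts, pts[1:])]
    return pts[0]

def cylinder_bezier(ctrl_uv, t, R=1.0):
    """Geodesic Bezier curve on the cylinder: run the Euclidean algorithm in
    the parameter plane and map the result through the isometry f."""
    u, v = bezier_plane(ctrl_uv, t)
    return f_cylinder(u, v, R)
```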
For surfaces of revolution the description of geodesics between two points is rather more involved. Let C be a curve in the yz-plane described implicitly by

f(y) = z,  x = 0,

for (y, z) belonging to some open set U contained in the upper half of the yz-plane. The surface S obtained by rotating C about the z-axis may be expressed as g^{-1}(0), where g : R × U → R is defined by g(x, y, z) = f(√(x² + y²)) − z. In polar coordinates, letting u = √(x² + y²), we express S in the form

x = u cos θ,  y = u sin θ,  z = f(u).

Let P = (u_1 cos θ_1, u_1 sin θ_1, f(u_1)) and Q = (u_2 cos θ_2, u_2 sin θ_2, f(u_2)) be two points on S. Then it may be shown that the geodesic connecting P with Q is the function α : [u_1, u_2] → S such that α(u) = (u cos θ(u), u sin θ(u), f(u)), where, for fixed u_0,

θ(u) = ∫_{u_0}^{u} (c/s) √( (1 + f′(s)²) / (s² − c²) ) ds + c′,

and the constants c and c′ satisfy the equations

θ_2 − θ_1 = ∫_{u_1}^{u_2} (c/u) √( (1 + f′(u)²) / (u² − c²) ) du,

θ_1 = ∫_{u_0}^{u_1} (c/u) √( (1 + f′(u)²) / (u² − c²) ) du + c′.

For complete details see [6].
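The integral for θ(u) rarely has a closed form, but it is straightforward to approximate numerically; the sketch below (function names ours) uses trapezoidal quadrature and assumes u stays strictly above the constant c along the path.

```python
import math

def theta_of_u(f_prime, c, c_prime, u0, u, n=2000):
    """Approximate theta(u) for a geodesic on a surface of revolution by
    trapezoidal quadrature of the integrand (c/s) sqrt((1+f'(s)^2)/(s^2-c^2));
    requires s > c on [u0, u]."""
    def integrand(s):
        return (c / s) * math.sqrt((1.0 + f_prime(s) ** 2) / (s * s - c * c))
    total = 0.0
    h = (u - u0) / n
    for k in range(n):
        a = u0 + k * h
        total += 0.5 * (integrand(a) + integrand(a + h)) * h
    return total + c_prime

def geodesic_point(f, f_prime, c, c_prime, u0, u):
    """Point alpha(u) = (u cos(theta(u)), u sin(theta(u)), f(u)) on the geodesic."""
    th = theta_of_u(f_prime, c, c_prime, u0, u)
    return (u * math.cos(th), u * math.sin(th), f(u))
```

In practice c and c′ would be fitted to the boundary data θ_1, θ_2 by a one-dimensional root search on the first constraint above.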
4 Conclusion and future research
We have outlined a procedure by which conventional computer aided design constructions may be extended to arbitrary Riemannian manifolds. In practice, there are difficulties. In a given manifold the points to be interpolated or approximated must lie in a region in which it is possible to construct the necessary geodesic arcs. Supposing this to be the case, one then needs to find explicit descriptions of the geodesics. And then there is the question of the additional characteristics which the curves might possess. The paper raises more questions than it answers. In the case of a sphere, good results are obtained, and it is also possible to add a variation that allows consideration of curves off the sphere which project radially to geodesic Lagrange, Bezier, or B-spline curves. It is also shown, in the spherical case, that a change of parametrization of the geodesics results in blossoms that retain the desirable characteristics associated with Euclidean blossoms. For surfaces of revolution and developable surfaces, we know that geodesics can be found between points, so the geodesic blossom constructions will always exist. It is however unlikely that these blossoms will be either symmetric or multi-affine; these characteristics depend on the affine structure of R^3. Thus, in the case of a general Riemannian manifold, although the constructions may be valid, it is not clear that we will be able to employ fundamental operations such as subdivision which depend on the symmetry of the blossom. We have outlined three different methods of blossom construction, one for each of the algorithms considered. In the Euclidean case, we know that there is a unique symmetric, multi-affine polynomial that restricts to a given polynomial on the diagonal. This may not be true in our more general setting.
Bibliography
1. Conlon, L., Differential Manifolds, a First Course, Birkhauser, Boston, 1993.
2. Fasshauer, G. E. and Schumaker, L. L., Data Fitting on the Sphere, in Mathematical
Methods for Curves and Surfaces II, Daehlen, M., Lyche, T., and Schumaker, L. L.
(eds), Vanderbilt University Press, Nashville, 1998, 117-166.
3. Gallier, J., Curves and Surfaces in Geometric Modeling: Theory and Applications,
Morgan Kaufmann, San Francisco, 2000.
4. Oprea, J., Differential Geometry and its Applications, Prentice Hall, Upper Saddle
River, NJ, 1997.
5. Levesley, J., and Ragozin, D. L., Local Approximation on Manifolds Using Radial
Basis Functions and Polynomials, in Curve and Surface Fitting, Cohen, A., Rabut,
C.R., Schumaker, L.L. (eds), Vanderbilt University Press, Nashville, 2000, 291-301.
6. Lin, A., Geodesics between points on surfaces of revolution, Tech. Report, Dept.
Mathematics, York University, Toronto, May 2001.
7. Ramshaw L., Blossoming: A Connect the Dots Approach to Splines, Digital Systems
Research Center, Report 19, Palo Alto, CA, 1987.
8. Shoemake, K., Animating Rotation with Quaternion Curves, ACM SIGGRAPH Proceedings, San
Francisco, July 22-26, 1985, 245-254.
9. Walker, M., Curves over a Sphere, preprint, 2000.
Parametric shape-preserving spatial interpolation and ν-splines
Carla Manni
Department of Mathematics, University of Torino, Italy
manni@dm.unito.it
Abstract
In this paper we present a class of C² spatial interpolating curves depending on a set of tension parameters, and we illustrate their ability to reproduce the shape of the data. The curves are constructed using cubic splines and basically reduce to classical ν-splines for particular values of the tension parameters.
1 Introduction
Shape-preserving interpolation via functional as well as parametric splines is a well studied topic for the planar case. On the other hand, shape-preserving interpolation for space curves is considerably more complex than for planar ones, and the related literature is comparatively limited. In this regard, a considerable part of the available schemes only ensures geometric continuity of the obtained curve (see [1, 8] and the references quoted therein). Recently, C¹ and C² shape-preserving interpolating space curves have been obtained using polynomial splines of variable degree, [2, 3, 6]. However, working with low (fixed) degree polynomial splines seems to be a standard choice in the CAD/CAM community. This motivates the careful investigation of the shape-preserving properties of cubic ν-splines recently carried out in [7] and in the present paper.
In this paper we present a method for constructing C² spatial interpolating curves reproducing the shape of the polygonal line which interpolates the given data. The curve is constructed via the so called "parametric approach", [10], using classical cubic splines. The shape of the curve is controlled by the amplitude of the tangent vectors at the data sites, which play the role of tension parameters. It turns out that, for particular values of the tension parameters, the proposed scheme provides a new, geometrically evident, description of classical C¹-G² cubic ν-splines, [11]. Moreover, the method produces a suitable reparameterization for the above mentioned curves ensuring C² continuity. The reparameterization is a cubic polynomial involving the tension parameters (see (3.3)). Thus, the evaluation of the curve for a fixed value of the new parameter requires the solution of a cubic equation.
The geometric meaning of the tension parameters, coupled with the powerful "shape-preserving" properties of the Bernstein-Bezier representation, can be efficiently used to construct an iterative algorithm for C² shape-preserving interpolation. The algorithm
converges in a finite number of iterations and requires at each iteration the solution of
a diagonally dominant linear system.
The paper is organized as follows. In Section 2 we state the problem. In Section 3 we
describe the construction of the required interpolant and we illustrate its dependence on
the tension parameters. The asymptotic behavior and the shape-preserving properties
of the obtained curve are briefly discussed in Section 4. We conclude in Section 5 with
a graphical example.
2 The problem
In this section we introduce the problem of shape-preserving interpolation by curves in R³. The adopted notion of shape preservation follows the definitions of [2] and [6]. Let

I_i ∈ R³, i = 0, …, N,

be the interpolation points, with I_i ≠ I_{i+1}. Define, for all admissible indices,

L_i := I_{i+1} − I_i,

N_i := (L_{i−1} × L_i) / ‖L_{i−1} × L_i‖ if ‖L_{i−1} × L_i‖ > 0, and N_i := 0 elsewhere,

A_i := |L_{i−1} L_i L_{i+1}| if ‖L_{i−1} × L_i‖ > 0, and A_i := 0 elsewhere,

where |a b c| denotes the determinant of the matrix with columns a, b, c. The vectors N_i and the scalars A_i are, respectively, the discrete binormals and the discrete torsions of the data.
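In code the discrete quantities take only a few lines (function names ours; the determinant |a b c| is computed as the scalar triple product):

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    """|a b c| with a, b, c as columns: the scalar triple product a . (b x c)."""
    bc = cross(b, c)
    return sum(a[k] * bc[k] for k in range(3))

def discrete_shape_data(I):
    """Chords L_i, discrete binormals N_i, discrete torsions A_i (Section 2)."""
    L = [tuple(q - p for p, q in zip(I[i], I[i + 1])) for i in range(len(I) - 1)]
    N, A = {}, {}
    for i in range(1, len(L)):
        c = cross(L[i - 1], L[i])
        n = math.sqrt(sum(x * x for x in c))
        N[i] = tuple(x / n for x in c) if n > 0 else (0.0, 0.0, 0.0)
        if i + 1 < len(L):
            A[i] = det3(L[i - 1], L[i], L[i + 1]) if n > 0 else 0.0
    return L, N, A
```

For planar data every A_i vanishes, as expected of a discrete torsion.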
Let the parameter values σ_i, i = 0, …, N, with σ_i < σ_{i+1}, be given, and let

h_i := σ_{i+1} − σ_i, i = 0, 1, …, N − 1,

be the corresponding spacings. We wish to construct a curve Q(s), s ∈ [σ_0, σ_N], which interpolates the data, Q(σ_i) = I_i, i = 0, …, N, such that Q ∈ C²[σ_0, σ_N]. In addition, we also require that Q(s) is shape-preserving, that is, it reproduces the convexity and torsion of the polygonal line connecting the interpolation points. More specifically, denoting with dashes derivatives with respect to the parameter s, we define

K(s) := (Q′(s) × Q″(s)) / ‖Q′(s)‖³,  T(s) := |Q′(s) Q″(s) Q‴(s)| / ‖Q′(s) × Q″(s)‖²,   (2.1)

as the curvature vector and the torsion of the curve respectively. Q(s) is shape-preserving if it satisfies the following criteria ([2, 6, 7]).

(i) Convexity criteria:
(i.1) if N_i · N_{i+1} > 0, then K(s) · N_j > 0, j = i, i + 1, s ∈ [σ_i, σ_{i+1}],
(i.2) if N_i · N_{i+1} < 0, then K(s) · N_j, j = i, i + 1, has one change in sign in [σ_i, σ_{i+1}],
(i.3) if N_i · N_j ≠ 0, then (K(σ_i) · N_j)(N_i · N_j) > 0, j = i − 1, i, i + 1.

(ii) Torsion criterion: if A_i ≠ 0 then T(s)A_i > 0, s ∈ [σ_i, σ_{i+1}].
For the sake of brevity we refer to [7] for the more technical collinearity and coplanarity criteria.
3 Constructing the interpolating curve
In order to construct the curve Q we consider, as a first step, a cubic curve C interpolating the data. We put

C(t)|_{[σ_i, σ_{i+1}]} := C_i(t; λ_i^(0), λ_i^(1)),   (3.1)

C_i(t; λ_i^(0), λ_i^(1)) := I_i H_0^(0)(u) + I_{i+1} H_1^(0)(u) + λ_i^(0) h_i T_i H_0^(1)(u) + λ_i^(1) h_i T_{i+1} H_1^(1)(u),
t ∈ [σ_i, σ_{i+1}], u := (t − σ_i)/h_i,   (3.2)

where 0 ≤ λ_i^(0), λ_i^(1) ≤ 1 are shape parameters, T_i, T_{i+1} are vectors to be determined, and H_l^(r)(u) denote the elements of the cardinal basis for cubic Hermite interpolation, that is, the H_l^(r)(u) are the polynomials of third degree such that

(d^j H_l^(r) / du^j)(m) = δ_{jr} δ_{lm},  j, r, l, m = 0, 1.

One can immediately verify that the curve (3.2) interpolates the points I_i, I_{i+1} at the extremes of the interval [σ_i, σ_{i+1}] and has tangent vectors λ_i^(0) T_i, λ_i^(1) T_{i+1} at the same extremes. The parameters λ_i^(0), λ_i^(1) determine the amplitude of the tangent vectors of the curve at the two end points of the interval, and they control the shape of the curve. To be more specific, since H_0^(0)(u) + H_1^(0)(u) = 1, we have that C_i(t; 0, 0) reduces to the line segment through I_i, I_{i+1}. Thus, the parameters λ_i^(0), λ_i^(1) act as tension parameters stretching the curve from the classical Hermite cubic interpolating I_i, I_{i+1} with tangents T_i, T_{i+1} (λ_i^(0) = λ_i^(1) = 1) to the line segment (λ_i^(0) = λ_i^(1) = 0). The curve (3.1) turns out to be of class G¹.
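A minimal sketch of one segment (function names ours), useful for checking the interpolation and tension behavior numerically:

```python
def hermite_basis(u):
    """Cubic Hermite cardinal basis: H_0^(0), H_1^(0) (value basis),
    H_0^(1), H_1^(1) (derivative basis)."""
    H00 = 1 - 3 * u * u + 2 * u ** 3
    H10 = 3 * u * u - 2 * u ** 3
    H01 = u - 2 * u * u + u ** 3
    H11 = -u * u + u ** 3
    return H00, H10, H01, H11

def tension_segment(Ii, Ii1, Ti, Ti1, lam0, lam1, hi, u):
    """One segment C_i of (3.2) evaluated at the local coordinate u in [0,1];
    lam0 = lam1 = 0 collapses the segment onto the chord."""
    H00, H10, H01, H11 = hermite_basis(u)
    return tuple(H00 * a + H10 * b + lam0 * hi * H01 * c + lam1 * hi * H11 * d
                 for a, b, c, d in zip(Ii, Ii1, Ti, Ti1))
```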
Let us now consider the new global parameter

s(t; λ_i^(0), λ_i^(1)) := s_i(t; λ_i^(0), λ_i^(1)) := σ_i H_0^(0)(u) + σ_{i+1} H_1^(0)(u) + λ_i^(0) h_i H_0^(1)(u) + λ_i^(1) h_i H_1^(1)(u).   (3.3)

It is not difficult to see that, if

0 < λ_i^(0), λ_i^(1) ≤ 1,   (3.4)

then

(d/dt) s_i(t; λ_i^(0), λ_i^(1)) > 0,  t ∈ [σ_i, σ_{i+1}].

Thus (3.3) implicitly defines a function t = t(s), which provides a reparameterization for (3.1). In the following we assume that conditions (3.4) hold, and we define

Q(s) := C(t(s)).   (3.5)
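Since s_i is strictly increasing under (3.4), the cubic equation s_i(t) = s can be solved reliably by bisection. A sketch (function names ours), written in the local variable u:

```python
def s_of_u(u, sig_i, sig_i1, lam0, lam1):
    """Reparameterization (3.3) expressed in the local variable u in [0, 1]."""
    h = sig_i1 - sig_i
    H00 = 1 - 3 * u * u + 2 * u ** 3
    H10 = 3 * u * u - 2 * u ** 3
    H01 = u - 2 * u * u + u ** 3
    H11 = -u * u + u ** 3
    return sig_i * H00 + sig_i1 * H10 + lam0 * h * H01 + lam1 * h * H11

def u_of_s(s, sig_i, sig_i1, lam0, lam1, tol=1e-12):
    """Invert the cubic s_i(u) = s by bisection; under (3.4) s_i is strictly
    increasing on [0, 1], so the root is unique."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if s_of_u(mid, sig_i, sig_i1, lam0, lam1) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Note that for λ_i^(0) = λ_i^(1) = 1 the cubic collapses to the identity s = σ_i + h_i u, which gives a quick sanity check.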
Since Q′(σ_i) = T_i, i = 0, …, N, Q is of class C¹. For each sequence of the tension parameters λ_i^(0), λ_i^(1) we will determine the tangent vectors T_i so that Q is also of class C². Let us denote by dots derivatives with respect to the local parameter u. Imposing continuity of Q″(s) at σ_i, i = 1, …, N − 1, from (3.3), (3.5) and from the chain rule for derivatives, we obtain

(d²C_{i−1}/du²(1⁻) − d²s_{i−1}/du²(1⁻) T_i) / (h_{i−1} λ_{i−1}^(1))² = (d²C_i/du²(0⁺) − d²s_i/du²(0⁺) T_i) / (h_i λ_i^(0))².   (3.6)

Thus, after some manipulations, from (3.2) we have

u_i T_{i−1} + T_i + v_i T_{i+1} = z_i,  i = 1, …, N − 1,   (3.7)

where

u_i = λ_{i−1}^(0) h_{i−1} (h_i λ_i^(0))² / w_i,   v_i = λ_i^(1) h_i (h_{i−1} λ_{i−1}^(1))² / w_i,

w_i = h_{i−1} (3 − λ_{i−1}^(0)) (h_i λ_i^(0))² + h_i (3 − λ_i^(1)) (h_{i−1} λ_{i−1}^(1))²,   (3.8)

z_i = (3/w_i) L_i (h_{i−1} λ_{i−1}^(1))² + (3/w_i) L_{i−1} (h_i λ_i^(0))².
In order to uniquely determine the vectors T_i we need two additional equations, which are obtained by imposing boundary conditions. Classical boundary conditions are periodic conditions,

u_0 T_{N−1} + T_0 + v_0 T_1 = z_0,  u_N T_{N−1} + T_N + v_N T_1 = z_N

(with u_0, v_0, u_N, v_N, z_0, z_N defined according to (3.8), setting h_{−1} = h_{N−1}, λ_{−1}^(0) = λ_{N−1}^(0), λ_{−1}^(1) = λ_{N−1}^(1), L_{−1} = L_{N−1}, h_N = h_0, λ_N^(0) = λ_0^(0), λ_N^(1) = λ_0^(1), L_N = L_0), and end tangent conditions,

T_0 = D_0,  T_N = D_N

(where D_0, D_N are given in input). In the following we will denote by I the set of indices {1, …, N − 1} ({0, …, N}) when end tangent (periodic) conditions are considered. It is not difficult to see that (3.7), with any choice of the above mentioned boundary conditions, provides a diagonally dominant system

A T = z.   (3.9)

Thus we can state the following
Thus we can state the following
Theorem 3.1 For any sequence λ_i^(0), λ_i^(1), i = 0, …, N − 1, satisfying (3.4), there exists a unique Q ∈ C²[σ_0, σ_N] defined via (3.1)-(3.3), (3.5) which interpolates the given data and satisfies periodic or end tangent conditions.
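A sketch of the assembly and solution of (3.7)-(3.9) with end tangent conditions (function names ours); diagonal dominance makes the Thomas algorithm safe without pivoting. For collinear, equally spaced data the computed tangents reduce to the common chord direction.

```python
def tangent_system(I, sigma, lam0, lam1, D0, DN):
    """Solve the tridiagonal system (3.7)-(3.9) for the tangents T_i with end
    tangent conditions T_0 = D0, T_N = DN.  I: interpolation points (tuples);
    lam0[i], lam1[i]: tension parameters of the interval [sigma_i, sigma_{i+1}]."""
    N = len(I) - 1
    h = [sigma[i + 1] - sigma[i] for i in range(N)]
    L = [tuple(q - p for p, q in zip(I[i], I[i + 1])) for i in range(N)]
    a = [0.0] * (N + 1)          # sub-diagonal u_i
    b = [1.0] * (N + 1)          # diagonal, always 1 in (3.7)
    c = [0.0] * (N + 1)          # super-diagonal v_i
    z = [tuple(D0)] + [None] * (N - 1) + [tuple(DN)]
    for i in range(1, N):
        w = (h[i - 1] * (3 - lam0[i - 1]) * (h[i] * lam0[i]) ** 2
             + h[i] * (3 - lam1[i]) * (h[i - 1] * lam1[i - 1]) ** 2)
        a[i] = lam0[i - 1] * h[i - 1] * (h[i] * lam0[i]) ** 2 / w
        c[i] = lam1[i] * h[i] * (h[i - 1] * lam1[i - 1]) ** 2 / w
        z[i] = tuple((3 * Li * (h[i - 1] * lam1[i - 1]) ** 2
                      + 3 * Lm * (h[i] * lam0[i]) ** 2) / w
                     for Lm, Li in zip(L[i - 1], L[i]))
    for i in range(1, N + 1):    # forward elimination (Thomas algorithm)
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        z[i] = tuple(zi - m * zp for zi, zp in zip(z[i], z[i - 1]))
    T = [None] * (N + 1)
    T[N] = tuple(zi / b[N] for zi in z[N])
    for i in range(N - 1, -1, -1):
        T[i] = tuple((zi - c[i] * tn) / b[i] for zi, tn in zip(z[i], T[i + 1]))
    return T
```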
We notice that for λ_i^(0) = λ_i^(1) = 1, system (3.9) reduces to the system for the computation of classical C² cubic splines. Moreover, if λ_{k−1}^(1) = λ_k^(0) = λ_k, k ∈ I, the curve C is of class C¹ and equation (3.6) reads

d²C_i/dt²(σ_i⁺) − d²C_{i−1}/dt²(σ_i⁻) = ν_i Q′(σ_i).

Then (3.6) is equivalent to imposing that the cubic curve (3.1) is a C¹-G² cubic ν-spline, [5, 7, 11], where, from (3.3), for i ∈ I,

ν_i := h_i^{−2} d²s_i/du²(0⁺) − h_{i−1}^{−2} d²s_{i−1}/du²(1⁻) = (6 − 4λ_i − 2λ_{i+1}) h_i^{−1} + (6 − 2λ_{i−1} − 4λ_i) h_{i−1}^{−1}.   (3.10)
4 Asymptotic behavior and shape-preservation
In this section we briefly discuss the asymptotic behavior and the resulting shape-preserving properties of the curve Q, defined by (3.1)-(3.3), (3.5) and (3.9), as the tension parameters λ_i^(0), λ_i^(1) approach zero. The following lemma (see also [7]) concerns the asymptotic behavior of the tangents T_i. We omit the details of the proof, which are completely analogous to those of Theorem 3 in [9].

Lemma 4.1 The vectors T_i, i = 0, …, N, obtained from (3.9) are bounded independently of λ_j^(0), λ_j^(1), j = 0, …, N − 1. Moreover,

lim_{λ^(0), λ^(1) → 0} [ T_i − ( h_i (λ_i^(0))² / (h_i (λ_i^(0))² + h_{i−1} (λ_{i−1}^(1))²) · L_{i−1}/h_{i−1} + h_{i−1} (λ_{i−1}^(1))² / (h_i (λ_i^(0))² + h_{i−1} (λ_{i−1}^(1))²) · L_i/h_i ) ] = 0,   (4.1)

where the combination in parentheses is written as (1 − α_i) L_{i−1}/h_{i−1} + α_i L_i/h_i, i ∈ I.
Since the tangents are bounded independently of the tension parameters, from the previous section we have that Q approaches the piecewise linear function interpolating the data as the tension parameters tend to zero. Moreover, each tangent T_i determined by (3.9) tends to a strictly convex combination of L_{i−1}/h_{i−1} and L_i/h_i as the tension parameters λ_j^(0), λ_j^(1) tend to zero while λ_{i−1}^(1)/λ_i^(0) remains bounded and strictly positive. Due to these two main facts, we are able to easily control the shape of the curve Q and to ensure that it reproduces the shape of the data as the tension parameters approach zero, as we will briefly discuss in the following.
Since C and Q differ only by a reparameterization, they have the same image. Thus, as far as the shape-preserving properties are concerned, we can consider the expression of C. As noticed in Section 3, if λ_{k−1}^(1) = λ_k^(0), k ∈ I, the curve C, with the T_i obtained by (3.9), is a C¹-G² cubic ν-spline. In such a case, using (3.10), the careful shape analysis carried out in [7] and the resulting algorithm can be considered. However, the simple geometric meaning of the tension parameters λ_i^(0), λ_i^(1), coupled with the "shape-preserving" properties of the Bernstein-Bezier representation, allows us to establish the shape-preserving results more easily, also for completely general configurations of λ_i^(0), λ_i^(1). Thus, we express the curve segment C_i(t; λ_i^(0), λ_i^(1)) in Bernstein-Bezier form:

C_i(t; λ_i^(0), λ_i^(1)) = Σ_{l=0}^{3} C_{i,l} B_l^3(u), where B_l^3(u) denote the cubic Bernstein polynomials, and

C_{i,0} := I_i,  C_{i,1} := I_i + (1/3) h_i λ_i^(0) T_i,  C_{i,2} := I_{i+1} − (1/3) h_i λ_i^(1) T_{i+1},  C_{i,3} := I_{i+1}.
Let us consider first the convexity criteria.

Lemma 4.2 If N_i · N_j ≠ 0 and λ_{i−1}^(1)/λ_i^(0) → c > 0, then

lim_{λ^(0), λ^(1) → 0} (K(σ_i) · N_j)(N_i · N_j) > 0.
Proof: From the properties of Bezier curves (see [5]) and from (2.1) and (3.5),

sgn(K(σ_i) · N_j) = sgn(((C_{i,1} − C_{i,0}) × (C_{i,2} − C_{i,1})) · N_j) = sgn((T_i × (L_i − (λ_i^(0) h_i / 3) T_i − (λ_i^(1) h_i / 3) T_{i+1})) · N_j),

where sgn(y) denotes the sign of y. Moreover, from (4.1),

lim (T_i × L_i) · N_j = ((α_i/h_i) L_i × L_i + ((1 − α_i)/h_{i−1}) L_{i−1} × L_i) · N_j = ((1 − α_i)/h_{i−1}) ‖L_{i−1} × L_i‖ N_i · N_j.

Hence we obtain the assertion if N_i · N_j ≠ 0. □
The previous lemma ensures that, if λ_{i-1}^{(1)}, λ_i^{(0)} are small enough, the third convexity
criterion, (i.3), stated in Section 2 is satisfied. In addition, the sign of K(σ_k)·N_i, k = i, i+1,
can be checked considering the Bézier coefficients C_{i,l}, l = 0, 1, 2, 3, of C_i. Furthermore,
thanks to the shape-preserving properties of totally positive bases (see [4]), for small values of the
tension parameters the number of sign changes of K(s)·N_i, s ∈ [σ_i, σ_{i+1}],
is bounded by the number of sign changes in the pair K(σ_k)·N_i, k = i, i+1. Thus,
the first and second convexity criteria, (i.1) and (i.2), are also satisfied if the tension
parameters are small enough.
As far as the torsion is concerned, we recall that the sign of the torsion of a cubic
curve coincides with the sign of the discrete torsion of its Bézier control polygon (see for
example [5]); thus it is not difficult to obtain the following

Lemma 4.3 If A_i ≠ 0 and λ_j^{(1)}/λ_j^{(0)} → c > 0, j = i, i + 1, then

\[
\lim_{\lambda_j^{(0)},\,\lambda_j^{(1)} \to 0} \tau(s)\, A_i > 0, \qquad s \in [\sigma_i, \sigma_{i+1}].
\]
With similar arguments it is not difficult to prove that the collinearity and the
coplanarity criteria stated in [7] are also fulfilled as the tension parameters approach zero.
We omit the details for the sake of brevity.
Summarizing, from the previous discussion it follows that, if the tension parameters are
small enough, then the Bézier control polygon of C reproduces the shape of the data and
Carla Manni
the curve C does the same, thanks to the properties of the Bézier-Bernstein representation.
Thus, to obtain an automatic algorithm to compute the C^1 interpolant Q defined by
(3.5), satisfying the convexity and torsion criteria, we basically have to perform the following
steps:
(a) for a given sequence of the tension parameters, solve the system (3.9) and compute
the Bézier coefficients of the resulting curve C;
(b) check whether the control polygon of each segment C_i satisfies the convexity and torsion
criteria;
(c) if this is not the case, reduce the values of the related tension parameters according
to a given rule and go to step (a).
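The three steps can be organized as a simple reduction loop. A schematic sketch in Python, where `solve_tangents` and `segment_ok` are hypothetical placeholders standing in for the system (3.9) and the control-polygon shape tests (they are not the paper's actual routines), and the "given rule" of step (c) is taken to be halving:

```python
def fit_with_tension(solve_tangents, segment_ok, lambdas, max_iter=50):
    """Reduce tension parameters until every segment passes the shape tests.

    solve_tangents(lambdas) -> list of curve segments   (step (a), stands in for (3.9))
    segment_ok(segment)     -> bool                     (step (b), control-polygon tests)
    """
    for _ in range(max_iter):
        segments = solve_tangents(lambdas)                      # step (a)
        bad = [i for i, s in enumerate(segments) if not segment_ok(s)]
        if not bad:                                             # step (b): all criteria met
            return segments, lambdas
        for i in bad:                                           # step (c): halve and retry
            lambdas[i] *= 0.5
    return segments, lambdas

# toy run: a "segment" is just its tension value; it passes once small enough
segs, lams = fit_with_tension(lambda ls: list(ls), lambda s: s < 0.2, [1.0, 0.05, 0.8])
```

Only the offending parameters are reduced, so segments that already reproduce the shape of the data keep their larger (smoother) tension values.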
5
A graphical example
To illustrate the performance of the presented scheme, we consider the data proposed
in [7], Example 2, consisting of 20 points with uniform parameterization in [0,1]. End
tangent boundary conditions have been used (see Table 2 in [7]). Figures 1-3 show the
behavior of the obtained C^1 curve Q compared with the classical C^2 cubic spline. The
shape-preserving curve Q is defined by the following sequence of tension parameters
λ_i^{(0)}, λ_i^{(1)} [table of values, with entries such as .6, .75, .9 and 1, not fully recoverable from the scan].

FIG. 1. C^2 cubic spline (left) and Q (right).
Bibliography
1. S. Asaturyan, P. Costantini and C. Manni, Local shape-preserving interpolation by
space curves, IMA J. Numer. Anal. 21 (2001), 301-325.
2. P. Costantini, T. N. T. Goodman and C. Manni, Constructing C^ shape-preserving
interpolating space curves, Adv. Comput. Math. 14 (2001), 103-127.
3. P. Costantini and C. Manni, Shape-preserving C^ interpolation: the curve case, Adv.
Comput. Math. (2002) to appear.
4. T. N. T. Goodman, Total positivity and the shape of curves, in Total Positivity and
its Applications, M. Gasca and C. A. Micchelli (eds), Kluwer, 1996, 157-186.
FIG. 2. Left: ‖K(s)‖ for the C^2 cubic spline (dotted line) and for Q. Right: convexity
ratio in [σ_0, σ_1] (the normalization N_0 is not recoverable from the scan) for the C^2 cubic
spline (dotted line) and for Q.

FIG. 3. Left: torsion of the C^2 cubic spline (dotted line) and of Q (the horizontal lines
depict the sign of the discrete torsion). Right: first component of d²C/dt² (dotted line)
and of d²Q/ds².
5. J. Hoschek and D. Lasser, Fundamentals of Computer Aided Geometric Design, A.
K. Peters Ltd., 1993.
6. P. D. Kaklis and M. I. Karavelas, Shape preserving interpolation in R³, IMA J.
Numer. Anal. 17 (1997), 373-419.
7. M. I. Karavelas and P. D. Kaklis, Spatial shape-preserving interpolation using
ν-splines, Numer. Algorithms 23 (2000), 217-250.
8. V. P. Kong and B. H. Ong, Shape Preserving Interpolation using Frenet Frame
Continuous Curve of Order 3, (2001) preprint.
9. P. Lamberti and C. Manni, Shape-preserving C^1 functional interpolation via parametric cubics, Numer. Algorithms 28 (2001), 229-254.
10. C. Manni, On Shape Preserving C^2 Hermite Interpolation, BIT 41 (2001), 127-148.
11. G. Nielson, Some piecewise polynomial alternatives to splines under tension, in Computer Aided Geometric Design, R. E. Barnhill and R. F. Riesenfeld (eds), Academic
Press, 1974, 209-235.
On the q-Bernstein polynomials
Halil Oruç and Necibe Tuncer
Department of Mathematics, Dokuz Eylül University, Tınaztepe Kampüsü
35160 Buca Izmir, Turkey
halil.oruc@deu.edu.tr, necibe.tuncer@deu.edu.tr
Abstract
We discuss here recent developments on the convergence of the q-Bernstein polynomials
B_n f, which replace the classical Bernstein polynomial with a one-parameter family of
polynomials. In addition, the convergence of iterates and iterated Boolean sums of
q-Bernstein polynomials will be considered. Moreover, a q-difference operator D_q f defined
by D_q f = f[x, qx] is applied to q-Bernstein polynomials. This gives us some results which
complement those concerning derivatives of Bernstein polynomials. It is shown that, with
the parameter 0 < q < 1, if Δ^k f_r ≥ 0 then D_q^k B_n f ≥ 0; if f is monotonic, so is B_n f;
and if f is convex, then D_q^2 B_n f ≥ 0.
1
Introduction
First we begin by introducing some notation to be used. For any fixed real number
q > 0, the q-integer [k] is defined as

\[
[k] := \begin{cases} (1 - q^k)/(1 - q), & q \neq 1, \\ k, & q = 1, \end{cases}
\]
for every positive integer k. The term Gaussian coefficient is also used for the q-binomial
coefficients introduced below, since they were first studied by Gauss (see Andrews [1]).
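For concreteness, the definition above can be transcribed directly (the function name `q_int` is ours):

```python
def q_int(k, q):
    """The q-integer [k]: (1 - q**k)/(1 - q) for q != 1, and k for q = 1."""
    if q == 1:
        return k
    return (1 - q**k) / (1 - q)

# [k] is the geometric sum 1 + q + q**2 + ... + q**(k-1)
print(q_int(4, 2.0))  # 1 + 2 + 4 + 8 = 15.0
print(q_int(3, 1))    # reduces to the ordinary integer 3
```

For q = 1 every q-integer collapses to the ordinary integer, which is why q-Bernstein results below reduce to their classical counterparts at q = 1.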
Let p(N, M, n) denote the number of partitions of a positive integer n into at most M
parts, each less than or equal to N. Then the Gaussian polynomial appears
as the generating function

\[
G(N, M; q) = \begin{bmatrix} N+M \\ M \end{bmatrix} = \sum_{n \ge 0} p(N, M, n)\, q^n.
\]
Note that \(\begin{bmatrix} n \\ k \end{bmatrix}\), defined by

\[
\begin{bmatrix} n \\ k \end{bmatrix} = \begin{cases} \dfrac{[n]!}{[k]!\,[n-k]!}, & n \ge k \ge 0, \\[4pt] 0, & \text{otherwise}, \end{cases}
\]

where [n]! = [n][n−1]⋯[1] with [0]! = 1, is called a Gaussian polynomial (or q-binomial
coefficient), since it is a polynomial in q of degree (n − k)k. The q-binomial coefficients
satisfy the recurrence relations
\[
\begin{bmatrix} n+1 \\ k \end{bmatrix} = q^{\,n-k+1}\begin{bmatrix} n \\ k-1 \end{bmatrix} + \begin{bmatrix} n \\ k \end{bmatrix} \tag{1.1}
\]

and

\[
\begin{bmatrix} n+1 \\ k \end{bmatrix} = \begin{bmatrix} n \\ k-1 \end{bmatrix} + q^{\,k}\begin{bmatrix} n \\ k \end{bmatrix}. \tag{1.2}
\]
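The generating-function description and recurrence (1.2) can be checked against each other numerically: brute-force partition counting in an N × M box must reproduce the coefficients built from the recurrence. A small sketch (function names ours):

```python
def qbinom_coeffs(n, k):
    """Coefficient list (in powers of q) of [n choose k], via recurrence (1.2):
    [n+1, k] = [n, k-1] + q^k [n, k]."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    a = qbinom_coeffs(n - 1, k - 1)          # [n-1, k-1]
    b = qbinom_coeffs(n - 1, k)              # q^k [n-1, k]: shift indices by k
    out = [0] * max(len(a), len(b) + k)
    for i, c in enumerate(a):
        out[i] += c
    for i, c in enumerate(b):
        out[i + k] += c
    return out

def box_partitions(N, M):
    """p(N, M, n) for all n: partitions into at most M parts, each at most N."""
    counts = [0] * (N * M + 1)
    def rec(parts_left, max_part, total):
        if parts_left == 0:
            counts[total] += 1
            return
        for p in range(max_part + 1):        # non-increasing parts, zeros allowed
            rec(parts_left - 1, p, total + p)
    rec(M, N, 0)
    return counts

# the Gaussian polynomial [N+M choose M] generates p(N, M, n)
assert qbinom_coeffs(6, 3) == box_partitions(3, 3)
```

The sum of the coefficients of [6 choose 3] is the ordinary binomial coefficient C(6,3) = 20, recovering the q = 1 case.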
The following Euler identity can be verified by induction, using the recurrence relation (1.1):

\[
(1 + x)(1 + qx) \cdots (1 + q^{\,n-1}x) = \sum_{r=0}^{n} q^{\,r(r-1)/2} \begin{bmatrix} n \\ r \end{bmatrix} x^r. \tag{1.3}
\]
Phillips [8] introduced a generalization of the Bernstein polynomials (the q-Bernstein
polynomials) in terms of q-integers:

\[
B_n(f; x) = \sum_{r=0}^{n} f_r \begin{bmatrix} n \\ r \end{bmatrix} x^r \prod_{s=0}^{n-r-1} (1 - q^s x), \tag{1.4}
\]

where f_r = f([r]/[n]) and an empty product denotes 1. When q = 1, (1.4) reduces to the
classical Bernstein polynomial. B_n(f; x) generalizes many properties of the classical
Bernstein polynomials. Firstly, the generalized Bernstein polynomials satisfy the endpoint
interpolation conditions

B_n(f; 0) = f(0),  B_n(f; 1) = f(1).
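Definition (1.4) is short enough to transcribe and exercise directly; a sketch (function names ours):

```python
def q_int(k, q):
    return k if q == 1 else (1 - q**k) / (1 - q)

def q_binom(n, k, q):
    """[n choose k] as a number, via [n]! / ([k]! [n-k]!)."""
    out = 1.0
    for i in range(1, k + 1):
        out *= q_int(n - k + i, q) / q_int(i, q)
    return out

def q_bernstein(f, n, q, x):
    """B_n(f; x) from (1.4); an empty product denotes 1."""
    total = 0.0
    for r in range(n + 1):
        f_r = f(q_int(r, q) / q_int(n, q))   # f_r = f([r]/[n])
        prod = 1.0
        for s in range(n - r):               # s = 0, ..., n-r-1
            prod *= 1 - q**s * x
        total += f_r * q_binom(n, r, q) * x**r * prod
    return total
```

The endpoint interpolation conditions are visible in the code: at x = 0 only the r = 0 term survives, and at x = 1 the factor (1 − q⁰x) kills every term except r = n.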
Phillips [8] also states the generalization of the well-known forward difference form (see Davis
[3]) of the classical Bernstein polynomials in the following theorem.

Theorem 1.1 The generalized Bernstein polynomial, defined by (1.4), may be expressed
in the q-difference form

\[
B_n(f; x) = \sum_{r=0}^{n} \begin{bmatrix} n \\ r \end{bmatrix} \Delta^r f_0 \, x^r, \tag{1.5}
\]

where Δ^r f_i = Δ^{r-1} f_{i+1} − q^{\,r-1} Δ^{r-1} f_i for r ≥ 1 and Δ^0 f_i = f_i.

It is easily verified by induction that the q-differences satisfy

\[
\Delta^r f_i = \sum_{k=0}^{r} (-1)^k q^{\,k(k-1)/2} \begin{bmatrix} r \\ k \end{bmatrix} f_{r+i-k}. \tag{1.6}
\]

Using the q-difference form (1.5) of the q-Bernstein polynomials, one may show that
the q-Bernstein polynomials reproduce linear functions, since B_n(1; x) = 1 and B_n(x; x) = x.
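The equality of the basis form (1.4) and the q-difference form (1.5) can be checked numerically. A self-contained sketch, where the q-differences are built level by level from the recursion Δ^r f_i = Δ^{r-1} f_{i+1} − q^{r-1} Δ^{r-1} f_i (function names ours):

```python
import math

def q_int(k, q):
    return k if q == 1 else (1 - q**k) / (1 - q)

def q_binom(n, k, q):
    out = 1.0
    for i in range(1, k + 1):
        out *= q_int(n - k + i, q) / q_int(i, q)
    return out

def bn_direct(fvals, q, x):
    """(1.4), with fvals[r] = f([r]/[n])."""
    n = len(fvals) - 1
    total = 0.0
    for r in range(n + 1):
        prod = 1.0
        for s in range(n - r):
            prod *= 1 - q**s * x
        total += fvals[r] * q_binom(n, r, q) * x**r * prod
    return total

def bn_qdiff(fvals, q, x):
    """(1.5); d[i] holds Delta^r f_i at the current level r."""
    n = len(fvals) - 1
    d = list(fvals)
    total = d[0]
    for r in range(1, n + 1):
        d = [d[i + 1] - q**(r - 1) * d[i] for i in range(len(d) - 1)]
        total += q_binom(n, r, q) * d[0] * x**r
    return total

q, n = 0.6, 6
nodes = [q_int(r, q) / q_int(n, q) for r in range(n + 1)]
fvals = [math.sin(t) for t in nodes]
for x in (0.0, 0.25, 0.7, 1.0):
    assert abs(bn_direct(fvals, q, x) - bn_qdiff(fvals, q, x)) < 1e-9
```

Since both forms are polynomial identities in x, agreement at a handful of points is only a sanity check, not a proof; Theorem 1.1 supplies the proof.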
2
Convergence
In the discussion of the uniform convergence of the q-Bernstein operator, the Bohman-Korovkin Theorem (see Cheney [2]) is used as in the classical case. The Bohman-Korovkin Theorem states that, for a linear monotone operator L_n, the convergence
L_n f → f for f(x) = 1, x, x² is sufficient for the sequence of operators L_n to have the
uniform convergence property L_n f → f for all f ∈ C[0,1]. Observe that the q-Bernstein
operator is a monotone linear operator for 0 < q < 1. For a fixed value of q with 0 < q < 1,

[n] → 1/(1 − q) as n → ∞.

Notice that, since B_n(x²; x) = x² + x(1 − x)/[n], B_n(x²; x) does not converge to x². Phillips
[8] studies the uniform convergence of q-Bernstein polynomials.
Theorem 2.1 Let q = q_n satisfy 0 < q_n < 1 and let q_n → 1 as n → ∞. Then

B_n(f; x) → f(x),  ∀ f(x) ∈ C[0,1].
The degree of q-Bernstein approximation to a bounded function on [0,1] may be described in terms of the modulus of continuity by the following theorem.

Theorem 2.2 If f is bounded on [0,1] and B_n f denotes the generalized Bernstein
operator associated with f defined by (1.4), then

\[
\|f - B_n f\|_\infty \le \tfrac{3}{2}\, \omega\bigl(1/[n]^{1/2}\bigr).
\]
An error estimate for the convergence of q-Bernstein polynomials is given in Phillips [8]
by a Voronovskaya-type theorem.

Theorem 2.3 Let f be bounded on [0,1] and let x₀ be a point of [0,1] at which f''(x₀)
exists. Further, let q = q_n satisfy 0 < q_n < 1 and let q_n → 1 as n → ∞. Then the rate
of convergence of the sequence of generalized Bernstein polynomials is governed by

\[
\lim_{n \to \infty} [n]\bigl(B_n(f; x_0) - f(x_0)\bigr) = \frac{x_0(1 - x_0)}{2}\, f''(x_0).
\]
It is well known that the classical Bernstein polynomials B_n f provide simultaneous
approximation of the function and its derivatives; that is, if f ∈ C^p[0,1], then

\[
\lim_{n \to \infty} B_n^{(p)}(f; x) = f^{(p)}(x)
\]

uniformly on [0,1]. It is worthwhile to examine whether this property holds for q-Bernstein polynomials. Phillips [7] proved that the p-th derivative of the q-Bernstein polynomials converges
uniformly on [0,1] to the p-th derivative of f under some restrictions on the parameter q.
This property results from the generalization of the following theorem.
Theorem 2.4 Let f ∈ C¹[0,1] and let the sequence (q_n) be chosen so that the sequence
(ε_n), where ε_n = 1 − q_n, converges to zero from above faster than (1/3^n).
Then the sequence of derivatives of the generalized Bernstein polynomials, B_n' f, converges uniformly on [0,1] to f'(x).
Up to now, the convergence of the q-Bernstein polynomials has been examined by taking a sequence q = q_n such that q_n → 1 as n → ∞. In recent developments, the convergence
of q-Bernstein polynomials is examined for fixed real q, 0 < q < 1, and for q > 1. It is
proved in Oruç and Tuncer [6] that, for a fixed q with 0 < q < 1, uniform convergence
holds if and only if f is linear on the interval [0,1]. Moreover, if q > 1, then B_n f → f as
n → ∞ if f is a polynomial.
Theorem 2.5 Let q > 1 be a fixed real number. Then, for any polynomial p,

\[
\lim_{n \to \infty} B_n(p; x) = p(x).
\]
For any fixed integer i, the q-Bernstein polynomials of the monomials (see Goodman
et al. [4]) can be written explicitly as

\[
B_n(x^i; x) = \sum_{j=0}^{i} \lambda_j\, [n]^{\,j-i} S_q(i, j)\, x^j, \tag{2.1}
\]

where

\[
\lambda_j := \prod_{r=0}^{j-1} \Bigl(1 - \frac{[r]}{[n]}\Bigr),
\]

an empty product denotes 1, and

\[
S_q(i, j) = \frac{1}{[j]!} \sum_{r=0}^{j} (-1)^r q^{\,r(r-1)/2} \begin{bmatrix} j \\ r \end{bmatrix} [j - r]^i, \qquad 0 \le j \le i, \tag{2.2}
\]

is the Stirling polynomial of the second kind. Thus, for any polynomial p of degree m, one
may write

\[
B_n(p; x) = a^T A\, x, \tag{2.3}
\]

where a is the vector whose elements are the coefficients of p, A is an (m+1) × (m+1)
lower triangular matrix with elements

\[
a_{ij} = \lambda_j\, [n]^{\,j-i} S_q(i, j), \qquad 0 \le j \le i, \tag{2.4}
\]

and x is the vector whose elements form the standard basis for the space P_m of polynomials
of degree m.
Lemma 2.1 Let 0 < q < 1 be a fixed real number. Then

\[
\lim_{n \to \infty} B_n(p; x) = p(x)
\]

if and only if p(x) is linear.

This lemma can be generalized to any function f ∈ C[0,1].

Theorem 2.6 Let 0 < q < 1 be a fixed real number and f ∈ C[0,1]. Then

\[
\lim_{n \to \infty} B_n(f; x) = f(x)
\]

if and only if f(x) is linear.
56
Halil Orug and Necibe Tuncer
3
The iterates
The iterates of the classical Bernstein polynomials were first studied by Kelisky and Rivlin
[5], who proved that iterates of Bernstein polynomials converge to the linear endpoint
interpolant on [0,1]. Several generalizations of the result of Kelisky and Rivlin
have been considered by many authors; see Sevy [9] and Wenz [10]. The recent result is
the convergence of iterates of generalized Bernstein polynomials. It is proved in Oruç and
Tuncer [6] that the q-Bernstein polynomials do preserve the convergence property of the iterates of the classical Bernstein polynomials. The iterates of the generalized Bernstein polynomial
are defined by

\[
B_n^{M+1}(f; x) = B_n\bigl(B_n^M(f; x); x\bigr), \qquad M = 1, 2, \ldots, \tag{3.1}
\]

where B_n^1(f; x) = B_n(f; x).

Theorem 3.1 Let q > 0 be a fixed real number. Then

\[
\lim_{M \to \infty} B_n^M(f; x) = f(0) + \bigl(f(1) - f(0)\bigr) x. \tag{3.2}
\]
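Theorem 3.1 can be observed numerically. Since B_n only uses values of its argument at the nodes [r]/[n], an iterate B_n^M f is fully represented by its node values, and iterating is just re-evaluating at the nodes. A sketch (function names ours):

```python
def q_int(k, q):
    return k if q == 1 else (1 - q**k) / (1 - q)

def q_binom(n, k, q):
    out = 1.0
    for i in range(1, k + 1):
        out *= q_int(n - k + i, q) / q_int(i, q)
    return out

def bn(fvals, q, x):
    """B_n applied to the function with node values fvals, evaluated at x."""
    n = len(fvals) - 1
    total = 0.0
    for r in range(n + 1):
        prod = 1.0
        for s in range(n - r):
            prod *= 1 - q**s * x
        total += fvals[r] * q_binom(n, r, q) * x**r * prod
    return total

q, n = 0.8, 4
nodes = [q_int(r, q) / q_int(n, q) for r in range(n + 1)]
vals = [t * t for t in nodes]            # f(t) = t^2, so f(0) = 0 and f(1) = 1
for _ in range(300):                     # vals now represent B_n^M f at the nodes
    vals = [bn(vals, q, t) for t in nodes]
# the iterates approach the linear endpoint interpolant f(0) + (f(1) - f(0)) x
assert abs(bn(vals, q, 0.3) - 0.3) < 1e-6
```

The endpoint values f(0) and f(1) are fixed by every iterate, which is why the limit interpolates them exactly.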
Let A and B be operators; then the Boolean sum of A and B is defined to be

A ⊕ B = A + B − A∘B.

We will be concerned with iterated Boolean sums of the generalized Bernstein polynomials, of the form B_n ⊕ B_n ⊕ ⋯ ⊕ B_n, and will denote such an M-fold Boolean sum of
the generalized Bernstein operator by ⊕^M B_n. Sevy [9] and Wenz [10] proved that the
limit of iterated Boolean sums of Bernstein polynomials is the interpolation polynomial
with respect to the nodes (i/n, f(i/n)), i = 0, ..., n, as M → ∞. The second theorem of this
section gives a result for the convergence of iterates of Boolean sums of generalized
Bernstein polynomials. It is proved in Oruç and Tuncer [6] that the iterates of Boolean
sums of q-Bernstein polynomials converge to the interpolating polynomial at the nodes
([i]/[n], f([i]/[n])).

Theorem 3.2 The iterated Boolean sum ⊕^M B_n(f; x) of the q-Bernstein operator associated with the function f(x) ∈ C[0,1] converges, as M → ∞, to the interpolating polynomial L_n f
of degree n of f(x) at the points x_i = [i]/[n], i = 0, 1, ..., n.
4
A difference operator D_q on generalized Bernstein polynomials
Given any function f(x) and q ∈ R, q ≠ 1, we define the operator D_q by

\[
D_q f(x) = \frac{f(qx) - f(x)}{qx - x}. \tag{4.1}
\]

Thus D_q f(x) is simply a divided difference, D_q f(x) = f[x, qx]. Note that, for a function
f and a non-negative integer k,

\[
f[x, qx, \ldots, q^k x] = \frac{1}{[k]!}\, D_q^k f(x).
\]
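The operator and its iterates are easy to exercise numerically (for x ≠ 0, where qx ≠ x); a sketch:

```python
def dq(f, q):
    """Return the function D_q f, with (D_q f)(x) = (f(qx) - f(x)) / (qx - x)."""
    return lambda x: (f(q * x) - f(x)) / (q * x - x)

q, x = 0.5, 0.4
# on monomials, D_q acts like a derivative with q-integer coefficients:
# D_q t^2 = (1 + q) t  and  D_q^2 t^3 = (1 + q + q^2)(1 + q) t
g1 = dq(lambda t: t * t, q)
g2 = dq(dq(lambda t: t ** 3, q), q)
assert abs(g1(x) - (1 + q) * x) < 1e-12
# the divided-difference identity: f[x, qx, q^2 x] = D_q^2 f(x) / [2]!
# for f(t) = t^3 the divided difference at x, qx, q^2 x equals (1 + q + q^2) x
assert abs(g2(x) / (1 + q) - (1 + q + q * q) * x) < 1e-12
```

Here [2]! = [2][1] = 1 + q, so dividing the second iterate by 1 + q recovers the divided difference, as the identity above states.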
Theorem 4.1 For any integer 0 ≤ k ≤ n,

\[
D_q^k B_n(f; x) = \sum_{r=0}^{n-k} \frac{[n]!}{[n-k-r]!\,[r]!}\, \Delta^k f_r\, x^r \prod_{s=k}^{n-r-1} (1 - q^s x).
\]
Proof: Recall the q-difference form (1.5) of the generalized Bernstein polynomials and
apply the operator D_q to B_n(f; x) repeatedly, k times, to get

\[
D_q^k B_n(f; x) = \sum_{r=0}^{n-k} \frac{[n]!}{[n-k-r]!\,[r]!}\, \Delta^{r+k} f_0\, x^r. \tag{4.2}
\]

It will be useful to express Δ^{m+k} f_0 in terms of Δ^k. One may prove by induction on m that,
for 0 ≤ m ≤ n − k, we may write

\[
\Delta^{m+k} f_0 = \sum_{t=0}^{m} (-1)^t q^{\,t(t+2k-1)/2} \begin{bmatrix} m \\ t \end{bmatrix} \Delta^k f_{m-t}.
\]

Now applying the latter identity to (4.2) gives

\[
D_q^k B_n(f; x) = \sum_{r=0}^{n-k} \sum_{t=0}^{r} \frac{[n]!}{[n-k-r]!\,[r]!}\, (-1)^t q^{\,t(t+2k-1)/2} \begin{bmatrix} r \\ t \end{bmatrix} \Delta^k f_{r-t}\, x^r. \tag{4.3}
\]

Writing m = r − t,

\[
\frac{[n]!}{[n-k-m-t]!\,[m+t]!} \begin{bmatrix} m+t \\ t \end{bmatrix} = \frac{[n]!}{[n-k-m]!\,[m]!} \begin{bmatrix} n-k-m \\ t \end{bmatrix}, \tag{4.4}
\]

and putting (4.4) in (4.3) we obtain

\[
D_q^k B_n(f; x) = \sum_{m=0}^{n-k} \frac{[n]!}{[n-k-m]!\,[m]!}\, \Delta^k f_m\, x^m \sum_{t=0}^{n-k-m} (-1)^t q^{\,t(t+2k-1)/2} \begin{bmatrix} n-k-m \\ t \end{bmatrix} x^t.
\]

Now it can be easily derived from the generalized binomial expansion (1.3), on replacing x
by q^k x, that

\[
\prod_{s=k}^{n-m-1} (1 - q^s x) = \sum_{t=0}^{n-k-m} (-1)^t q^{\,t(t+2k-1)/2} \begin{bmatrix} n-k-m \\ t \end{bmatrix} x^t.
\]

This completes the proof. □
From Theorem 4.1 we see that, with 0 < q < 1, if Δ^k f_r ≥ 0 for 0 ≤ r ≤ n − k, then
D_q^k B_n(f; x) ≥ 0. If f is convex on 0 ≤ x ≤ 1, then D_q² B_n(f; x) ≥ 0 for 0 < q < 1. If f is
increasing, then D_q B_n(f; x) ≥ 0 for 0 < q < 1.
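These sign-preservation properties are easy to observe numerically by combining (1.4) with the operator D_q; a sketch (function names ours, exp chosen because it is both increasing and convex on [0,1]):

```python
import math

def q_int(k, q):
    return k if q == 1 else (1 - q**k) / (1 - q)

def q_binom(n, k, q):
    out = 1.0
    for i in range(1, k + 1):
        out *= q_int(n - k + i, q) / q_int(i, q)
    return out

def bn(f, n, q):
    """Return the function x -> B_n(f; x) from (1.4)."""
    def B(x):
        total = 0.0
        for r in range(n + 1):
            prod = 1.0
            for s in range(n - r):
                prod *= 1 - q**s * x
            total += f(q_int(r, q) / q_int(n, q)) * q_binom(n, r, q) * x**r * prod
        return total
    return B

def dq(f, q):
    return lambda x: (f(q * x) - f(x)) / (q * x - x)

q, n = 0.7, 6
B = bn(math.exp, n, q)               # exp is increasing and convex on [0, 1]
d1, d2 = dq(B, q), dq(dq(B, q), q)
for x in (0.1, 0.3, 0.5, 0.7, 0.9):
    assert d1(x) > 0 and d2(x) > 0   # D_q B_n f > 0 and D_q^2 B_n f > 0
```

For a strictly increasing, strictly convex f the inequalities hold strictly at interior points, as observed here.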
Acknowledgment: The second author is supported by the Institute of Natural and
Applied Sciences of D.E.U., and this research is partially supported by grant AFS
0922.20.01.02.
Bibliography
1. G. E. Andrews, The Theory of Partitions, Cambridge University Press, Cambridge,
1998.
2. E. W. Cheney, Introduction to Approximation Theory, AMS Chelsea, Providence,
1981.
3. P. J. Davis, Interpolation and Approximation, Dover Publications, New York, 1975.
4. T. N. T. Goodman, H. Oruç, and G. M. Phillips, Convexity and generalized Bernstein polynomials, Proc. Edin. Math. Soc. 42 (1999), 179-190.
5. R. P. Kelisky and T. J. Rivlin, Iterates of Bernstein polynomials, Pacific J. Math.
21 (1967), 511-520.
6. H. Oruç and N. Tuncer, On the convergence and iterates of q-Bernstein polynomials,
J. Approx. Theory, to appear.
7. G. M. Phillips, On generalized Bernstein polynomials, in Numerical Analysis, D. Griffiths and G. Watson (eds) (1996), 263-269.
8. G. M. Phillips, Bernstein polynomials based on the q-integers, The heritage of P. L.
Chebyshev: a Festschrift in honor of the 70th birthday of T. J. Rivlin, Ann. Numer.
Math. 4 (1997), 511-518.
9. J. C. Sevy, Lagrange and least-square polynomials as limits of linear combinations
of iterates of Bernstein and Durrmeyer polynomials, J. Approx. Theory 80 (1995),
267-271.
10. H. J. Wenz, On the limits of (linear combinations of) iterates of linear operators,
J. Approx. Theory 89 (1997), 219-237.
Uniform Powell-Sabin splines for the polygonal
hole problem
Joris Windmolders and Paul Dierckx
Department of Computer Sciences, Kath. University Leuven, Belgium.
Joris.Windmolders@cs.kuleuven.ac.be, Paul.Dierckx@cs.kuleuven.ac.be
Abstract
An algorithm is described for smoothly filling a polygonal hole in a surface with a
parametric uniform Powell-Sabin spline surface patch. It uses interpolation and subdivision techniques to determine an approximating solution iteratively. No assumptions
are made about the surrounding surface. The user has to provide routines for calculating
the curve points and the unit surface normal along the edge, as well as the unit tangent
vector of the edge curves, parametrized on the unit interval.
1
Introduction
A classical problem in CAGD is to fill a hole bounded by a set of surfaces. This
problem has already been addressed in the literature (e.g. [1, 2, 4]). In most cases,
assumptions are made on the bounding surfaces. In this paper, we present an algorithm
for filling a 3-, 4-, 5- or 6-sided hole that makes no assumptions on the surrounding
surfaces, and therefore it is generally applicable. On the other hand, the filling patch
will meet the given boundary curves approximately. The input of our algorithm (see
Figure 1) consists of the boundary curves ρ, which join at their endpoints. Furthermore,
the user should provide the unit tangent vector γ to the boundary curves at any point,
and the unit normal vector n to the surrounding surface at any curve point except the
endpoints, where only the tangent vectors of the joining curves are needed (see Figure
1 again). For other (interior) curve points, our algorithm will calculate a unit vector
δ = n × γ, which will be called the (unit) cross-boundary tangent vector. It shall be
referred to as if it were provided by the user. We will calculate a filling surface patch
that interpolates the user-supplied boundary curves and has the same surface normal in
a number of points. This will leave us some degrees of freedom, which we will use to fit
the curve and the cross-boundary tangent vector in between each pair of interpolation
points. In Section 2 we briefly recall the basic properties of uniform Powell-Sabin splines.
Section 3 explains how we can benefit from these properties to use UPS-splines for the
polygonal hole problem. Section 4 explains our algorithm in detail. Finally, we remark
that in the pictures we denote 2D and 3D entities interchangeably; therefore most
pictures reflect the situation only schematically.
FIG. 1. User supplied data.

2
Uniform Powell-Sabin splines
This section recalls the main properties of uniform Powell-Sabin splines. For details,
we refer to the original papers [3, 5].

By S_2(Δ*) we denote the linear space of uniform Powell-Sabin splines (in the sequel
called UPS-splines), i.e., piecewise quadratic polynomials on a uniform triangulation Δ
(which means that all triangles are equilateral and have the same size) of a polygon Ω,
where Δ* is a PS-refinement of Δ. The boundary of Ω will be called δΩ, whereas the
boundary of the triangulation will be referred to as δΔ. The vertices of Δ are denoted
V_i, i = 1, ..., n, and its triangles are ρ_i, i = 1, ..., m. These splines have global C¹-continuity on Δ*. Any s(u, v) has a unique B-spline representation

\[
s(u, v) = \sum_{i=1}^{n} \sum_{j=1}^{3} c_{i,j} B_i^j(u, v), \qquad (u, v) \in \Omega, \tag{2.1}
\]

where the locally supported basis functions form a convex partition of unity and c_{i,j} ∈ R³
are the control points. It follows that s(u, v) belongs to the convex hull of {c_{i,j}}.
Furthermore, one can prove that the control triangles, defined as T_i(c_{i,1}, c_{i,2}, c_{i,3}),
i = 1, ..., n, are tangent to the surface at s(V_i). Due to the local support of B_i^j, a
change to c_{i,j} will only affect s(u, v)|_{M_i}, i.e., the restriction of s(u, v) to the molecule
of V_i, being the set of triangles ρ_j that have V_i as a vertex. This indicates that we
have a useful representation for C¹-continuous surfaces, without being restricted to a
rectangular domain, and still enjoying the interesting features of the classical B-spline
representation for tensor product splines.
2.1
Subdivision
FIG. 2. Subdivision and Bézier points.

In [5] we present a subdivision scheme for UPS-splines. Let Δ_r be a uniform refinement of Δ, obtained by mid-edge subdivision. For a given s(u, v) on Δ, the representation
(2.1) on Δ_r can be calculated using convex barycentric combinations of the control points
only. First, a new control triangle along each edge V_iV_j is calculated, as illustrated in
Figure 2, left, for the bottom edge of a triangle ρ_l(V_i, V_j, V_k) ∈ Δ:

\[
\bar c_1 = \tfrac12 (c_{i,2} + c_{i,3}), \qquad
\bar c_2 = \tfrac12 c_{j,1} + \tfrac14 (c_{i,2} + c_{j,2}), \qquad
\bar c_3 = \tfrac12 c_{j,1} + \tfrac14 (c_{i,3} + c_{j,3}). \tag{2.2}
\]
Next, the control triangles at the original vertices are rescaled; for example,

\[
c'_{i,1} = \tfrac23 c_{i,1} + \tfrac16 (c_{i,2} + c_{i,3}), \qquad
c'_{i,2} = \tfrac23 c_{i,2} + \tfrac16 (c_{i,3} + c_{i,1}), \qquad
c'_{i,3} = \tfrac23 c_{i,3} + \tfrac16 (c_{i,1} + c_{i,2}). \tag{2.3}
\]

They are still tangent to the surface at their barycenter, but their area is only a quarter
of that of the former control triangles. Therefore they connect more tightly to the surface.
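The rescaling (2.3) can be checked numerically: it keeps the barycenter (hence the tangency point) fixed and shrinks the triangle by a factor 1/2 about it, quartering the area. A small Python check, assuming the 2/3 and 1/6 coefficients read from (2.3) above:

```python
def rescale(tri):
    """Apply (2.3): c'_a = 2/3 c_a + 1/6 (c_b + c_c), cyclically (2D points)."""
    out = []
    for a in range(3):
        b, c = tri[(a + 1) % 3], tri[(a + 2) % 3]
        out.append(tuple(2.0 / 3.0 * tri[a][k] + (b[k] + c[k]) / 6.0 for k in range(2)))
    return out

def barycenter(tri):
    return tuple(sum(p[k] for p in tri) / 3.0 for k in range(2))

def area(tri):
    (x1, y1), (x2, y2), (x3, y3) = tri
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new = rescale(tri)
assert all(abs(u - v) < 1e-12 for u, v in zip(barycenter(new), barycenter(tri)))
assert abs(area(new) - area(tri) / 4.0) < 1e-12
```

Each new vertex is the midpoint between the old vertex and the barycenter (2/3 c_a + 1/6 c_b + 1/6 c_c = 1/2 c_a + 1/2 barycenter), which is exactly the halving about the barycenter described in the text.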
2.2
The piecewise Bézier representation
Another important property of the B-spline representation for UPS-splines is that
the piecewise Bézier representation can be calculated from (2.1) using simple convex
barycentric combinations of the control points. In particular, consider an edge V_iV_j of Δ
(see Figure 2, right). The Bézier points of the edge curve can be found from

\[
s(V_i) = p_i = \tfrac13 (c_{i,1} + c_{i,2} + c_{i,3}), \qquad u_i = \tfrac12 (c_{i,2} + c_{i,3}),
\]
\[
s(V_j) = p_j = \tfrac13 (c_{j,1} + c_{j,2} + c_{j,3}), \qquad u_j = \tfrac23 c_{j,1} + \tfrac16 (c_{j,2} + c_{j,3}), \tag{2.4}
\]
\[
r_{i,j} = \tfrac12 (u_i + u_j). \tag{2.5}
\]

This is a piecewise quadratic Bézier curve, which means that p_i, r_{i,j} and p_j are surface
points, and that u_i − p_i and p_j − u_j are tangent to the surface at p_i, resp. p_j. Assuming
a (counterclockwise) ordering of the boundary vertices V_i ∈ δΔ, the edge curve from
s(V_i) to the next adjacent point s(V_j) will be denoted e_i(u, v).
3
Application to the polygonal hole problem
Recall that our goal is to calculate a UPS-spline filling a hole in a surface, given by a
set of bounding curves (denoted ρ), their derivatives γ and the cross-boundary tangent
vectors δ. The UPS-patch will fit these curves approximately along its boundary. In the
first place, interpolation of the given data at the vertices V_i ∈ δΔ is achieved. This leaves

FIG. 3. Tangent and cross-boundary tangent vectors.

some degrees of freedom allowing us to fit the given curves. In the sequel we shall denote
the user-supplied data, evaluated at V_i, by (p_i, γ_i, δ_i).
3.1
Interpolating UPS-splines and degrees of freedom
In order to obtain interpolation, we determine a control triangle T_i in the tangent plane
spanned by p_i + εγ_i + νδ_i, ε, ν ∈ R, such that s(V_i) = p_i. Curve point interpolation is
simply expressed by (2.4). Furthermore, we let the tangent to e_i at V_i be parallel to γ_i:

\[
u_i - p_i = \tfrac16 (c_{i,2} + c_{i,3}) - \tfrac13 c_{i,1} = \alpha_i \gamma_i, \tag{3.1}
\]

where α_i is a scaling factor. Next, we need the cross-boundary tangent vector of s(u, v)
at V_i to be parallel to δ_i. Mapping the cross-boundary vector d in the domain plane (see
Figure 2, right) onto the control triangle yields a vector parallel with c_{i,2} − c_{i,3}:

\[
c_{i,2} - c_{i,3} = 2\beta_i \delta_i, \tag{3.2}
\]

where β_i is again a scaling factor.

Solving (2.4), (3.1) and (3.2) for c_{i,j} in terms of the unknown α_i and β_i (further called
the α- and β-factors) yields

\[
c_{i,1} = p_i - 2\alpha_i \gamma_i, \qquad
c_{i,2} = p_i + \alpha_i \gamma_i + \beta_i \delta_i, \qquad
c_{i,3} = p_i + \alpha_i \gamma_i - \beta_i \delta_i. \tag{3.3}
\]

These equations ensure that s(u, v) interpolates the given data at V_i ∈ δΔ, and leave
us two degrees of freedom per vertex (α_i and β_i). These scaling factors are related to
the size of the control triangle. For example, subdivision by (2.3) divides α_i and β_i by a
factor of 2.
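The solution (3.3) can be checked against the defining relations with a few lines of code, using the form of (3.3) as read from the text (c_{i,1} = p_i − 2α_iγ_i, etc.); the numbers below are arbitrary test data, not from the paper:

```python
def add(*vs):
    """Componentwise sum of 3-vectors."""
    return tuple(sum(c) for c in zip(*vs))

def scal(s, v):
    return tuple(s * c for c in v)

def close(u, v, tol=1e-12):
    return all(abs(x - y) <= tol for x, y in zip(u, v))

p = (1.0, 2.0, 0.5)          # p_i: curve point (arbitrary test data)
g = (1.0, 0.0, 0.0)          # gamma_i: unit tangent
d = (0.0, 1.0, 0.0)          # delta_i: unit cross-boundary tangent
a, b = 0.4, 0.25             # alpha_i, beta_i

c1 = add(p, scal(-2 * a, g))                       # (3.3)
c2 = add(p, scal(a, g), scal(b, d))
c3 = add(p, scal(a, g), scal(-b, d))

bary = scal(1.0 / 3.0, add(c1, c2, c3))            # s(V_i) from (2.4)
u = scal(0.5, add(c2, c3))                         # u_i from (2.4)
assert close(bary, p)                              # interpolation: s(V_i) = p_i
assert close(add(u, scal(-1.0, p)), scal(a, g))    # (3.1): u_i - p_i = alpha_i gamma_i
assert close(add(c2, scal(-1.0, c3)), scal(2 * b, d))  # (3.2)
```

All three defining relations hold identically in α_i and β_i, which is what makes these two factors free parameters of the interpolating control triangle.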
3.2
The fitting equations
We will now use these degrees of freedom to fit the user-supplied data in between each
pair of adjacent interpolating vertices V_i, V_j ∈ δΔ. First, the α-factors at V_i and V_j are
determined by trying to interpolate the curve ρ at the edge midpoint V_{i,j} = ½(V_i + V_j).
From Section 2.2, the interpolation condition reads r_{i,j} = 2p_{i,j} − ½(p_i + p_j), where p_{i,j}
is the given curve point. Taking (2.5) and (3.3) into account, we have

\[
\alpha_i \gamma_i - \alpha_j \gamma_j = 4 p_{i,j} - 2(p_i + p_j) =: q_{i,j}. \tag{3.4}
\]
FIG.
4. Consecutive iteration steps.
This is a system of 3 equations with (at most) 2 unknowns. It can be solved in the least
squares sense.
Next, the β-factors at V_i and V_j are obtained by fitting the cross-boundary tangent
vector at V_{i,j}. First, we derive a subdivision rule for the β-factors at the vertices of Δ
from (2.2) and (3.2):

\[
\beta^r_{i,j} \delta^r_{i,j} = \tfrac14 (\beta_i \delta_i + \beta_j \delta_j), \tag{3.5}
\]

where δ^r_{i,j} is the cross-boundary tangent vector to s(u, v) at V_{i,j}. This β^r_{i,j}-factor belongs
to a finer subdivision level than β_i and β_j, so we have to scale it up by a factor of 2. The
interpolation condition then is

\[
\beta_{i,j} \delta_{i,j} = \tfrac12 (\beta_i \delta_i + \beta_j \delta_j). \tag{3.6}
\]

Note that δ_{i,j} has been used instead of δ^r_{i,j}. This is again an overdetermined system
which can be solved in the least squares sense.
4
The algorithm
We will restrict the figures illustrating the algorithm to the case of a triangular hole,
although the algorithm is immediately applicable to cases with 4, 5 and 6 boundary
curves as well (see Section 4.4).
The idea is to calculate, during a pre-iteration step, an initial solution which is smooth,
but in general not close enough, and to refine this approximation iteratively to obtain
a better fit to the given curves until a certain stopping criterion is satisfied. Finally,
during a post-iteration step, the interior control triangles are calculated, actually filling
the hole. Figure 4 illustrates this: imagine a pre-iteration step, two refinement steps and
a post-iteration step. The control triangles added during a particular step have been
shaded.
4.1
An initial solution
The initial solution (Figure 4, leftmost) is easily obtained by solving (3.4) in the least
squares sense for each edge V_iV_j. If we assume that γ_i ≠ ±γ_j, then

\[
\alpha_i = \frac{1}{D}\bigl((\gamma_i \cdot q_{i,j}) - (\gamma_j \cdot q_{i,j})(\gamma_i \cdot \gamma_j)\bigr), \tag{4.1}
\]
\[
\alpha_j = \frac{1}{D}\bigl(-(\gamma_j \cdot q_{i,j}) + (\gamma_i \cdot q_{i,j})(\gamma_i \cdot \gamma_j)\bigr), \tag{4.2}
\]

where D = 1 − (γ_i · γ_j)². This yields two α-factors per vertex: one for each boundary edge
incident to that vertex. Therefore, T_i is completely determined. The β-factors can
be calculated by writing (3.3) for both edges incident with the vertex and eliminating
c_2, respectively c_1; e.g., for Figure 3, right,

\[
\beta_1 = \alpha_2 (\gamma_2 \cdot \delta_1), \qquad \beta_2 = -\alpha_1 (\gamma_1 \cdot \delta_2). \tag{4.3}
\]

There exist pathological cases where γ_2 ⊥ δ_1 or γ_1 ⊥ δ_2. Our algorithm then sets
β_1 = α_1, resp. β_2 = α_2. For the case γ_i = ±γ_j, (3.4) has no unique solution in the least-squares
sense. Assuming that e_i is a straight line from s(V_i) to s(V_j), the α-factors can then be
determined from the projection onto the domain plane, where the size of the so-called
PS-triangles (the projections of the control triangles) is fixed. The reader can verify that
this yields α_i = α_j = ¼‖V_iV_j‖.
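Formulas (4.1)-(4.2) are just the normal equations of the least-squares problem (3.4); this can be verified by checking that the residual is orthogonal to both tangents. A sketch with arbitrary test data (names ours):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    s = math.sqrt(dot(v, v))
    return tuple(c / s for c in v)

gi = unit((1.0, 0.2, 0.1))      # gamma_i (unit tangents, arbitrary test data)
gj = unit((0.1, 1.0, -0.3))     # gamma_j
qv = (0.3, -0.2, 0.5)           # q_{i,j}

c = dot(gi, gj)
D = 1 - c * c
ai = (dot(gi, qv) - dot(gj, qv) * c) / D        # (4.1)
aj = (-dot(gj, qv) + dot(gi, qv) * c) / D       # (4.2)

# residual of (3.4): alpha_i gamma_i - alpha_j gamma_j - q_{i,j};
# at the least-squares optimum it must be orthogonal to both gamma_i and gamma_j
r = tuple(ai * a - aj * b - q for a, b, q in zip(gi, gj, qv))
assert abs(dot(r, gi)) < 1e-12 and abs(dot(r, gj)) < 1e-12
```

The denominator D = 1 − (γ_i·γ_j)² vanishes exactly in the pathological case γ_i = ±γ_j discussed above, which is why that case needs the separate domain-plane construction.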
4.2
The iteration step
First, the control triangles from the previous steps are rescaled by subdivision. This is
simply done by scaling down the α- and β-factors: α_i ← α_i/2 and β_i ← β_i/2, for each
V_i ∈ δΔ. Next, a new control triangle is created in between any two adjacent vertices at
the coarser level. This situation is illustrated in Figure 5, left, where the darker triangles
are known. We are looking for the α- and β-factors of the middle control polygon, which
is tangent to the surface at s(V_k), V_k = ½(V_i + V_j). Consider the α-factor first. In order
to obtain a better fit, we try to interpolate ρ at V_{i,k} = ½(V_i + V_k) and V_{k,j} = ½(V_k + V_j).
This yields a set of fitting equations

\[
\alpha_i \gamma_i - \alpha_k \gamma_k = q_{i,k}, \qquad
\alpha_k \gamma_k - \alpha_j \gamma_j = q_{k,j}, \tag{4.4}
\]

where α_i and α_j are known. Thus, α_k can be obtained as the least-squares solution of
(4.4):

\[
\alpha_k = \tfrac12\, \gamma_k \cdot \bigl(\alpha_i \gamma_i - q_{i,k} + q_{k,j} + \alpha_j \gamma_j\bigr). \tag{4.5}
\]

The β_k-factor is found by fitting the cross-boundary vectors at V_{i,k} and V_{k,j}, i.e., by
solving the following system in the least-squares sense:

\[
\beta_{i,k} \delta_{i,k} = \tfrac12 (\beta_i \delta_i + \beta_k \delta_k), \qquad
\beta_{k,j} \delta_{k,j} = \tfrac12 (\beta_k \delta_k + \beta_j \delta_j), \tag{4.6}
\]

where β_i and β_j are known. If δ_{i,k} = δ_k = δ_{k,j}, as is always the case for a planar curve,
this system has no unique solution in the least-squares sense. The β_k-factor can then easily be
obtained by equation (3.6), i.e., by subdivision and upscaling.
4.3
The interior control points
Finally, as soon as the user-supplied edge curves have been approximated well enough,
the interior control points at the final refinement level have to be calculated. We will
FIG.
5. The refinement and post-iteration steps.
FIG.
6. The hole and the triangular patches.
discuss three possibilities with the help of an example; Figure 6 shows a hole (left) and
two filling patches (right).
Copy From Initial. The interior control points are obtained directly from the initial
solution by subdivision. This guarantees that the interior of the patch is smooth. A
disadvantage is that the interior of the first approximation in general has no connection with the shape of the edge curves. This can cause unwanted artefacts near the
boundary after a few iterations (see Figure 7, left). The next option therefore
takes edge features into account.
Averaging. We will fill the hole gradually by calculating a ring of control triangles
during each pass, going from the edge towards the interior of the patch. Figure 5, right,
shows an example where each ring has a different shade of grey. At each step, a
control triangle of the current ring is obtained by averaging six surrounding control
triangles. These come from the initial solution or, if possible, from a previously
calculated ring. Edge features are now smoothed out towards the interior of the patch.
However, there is a main disadvantage to this approach if averaging is applied after
the last iteration step: the unwanted artefacts mentioned before are now repeated
for every ring, smoothed out towards the interior of the surface, as shown in Figure
7, middle.
Instant Update. A good compromise is to take edge features into account
before we finish iterating. This can be accomplished by subdividing the initial solution at each refinement step, but always overwriting its edge with the most recent
boundary approximation. The results of this strategy are depicted in Figure 7, right.
In any case, the user can change the interior control triangles and still have a C¹-continuous filling patch, fitting the specified edge curves with the demanded precision.
FIG. 7. Copy from initial solution and averaging (4 iterations); instant update (3 iterations).
FIG. 8. Cases with 4, 5 and 6 boundary curves.

4.4
A note on the number of edges
The algorithm sketched in Section 4 is immediately applicable to problems with 4, 5
and 6 boundary curves as well. Figure 8 shows the configuration of the initial solution
for each of these cases. If we are working with 5 edges, there are 2 edges having a
control triangle at their midpoint (shaded darker). This requires a tiny modification to the
calculation of the initial solution for those edges: the α-factors are obtained by solving
(4.4) for the unknowns α_i, α_j and α_k. The β-factors of the outer control polygons are
obtained as usual; for the middle polygon one can apply (3.6). Also, for the cases of 5
and 6 boundary curves, an interior control triangle (unshaded) has to be calculated for
the initial solution. This can be done by averaging the six surrounding control polygons.
Bibliography
1. Charrot, P. and J. A. Gregory, A pentagonal surface patch for computer aided geometric design, Computer Aided Geometric Design 1, pp 87-94.
spline patches, Computer Aided Geometric Design 17, pp 297-307.
3. Dierckx, P. (1997), On calculating normalized Powell-Sabin B-spfines, Computer
Aided Geometric Design 15, pp 61-78.
4. Gregory, J.A., V. K. H. Lau, and J. M. Hahn (1993) , High order continuous polygonal patches, in Geometric Modelling, G. Farin, H. Hagen and H. Noltemeier (eds.),
Springer-Verlag Wien.
'
5. Windmolders, J., Dierckx, P. (1999), Subdivision of Uniform Powell-Sabin splines.
Computer Aided Geometric Design 16, 301-315.
6. Windmolders, J. and P. Dierckx, NURPS for Special Effects and Quadrics: Oslo
2000, Tom Lyche and L. L. Schumaker (eds.), Vanderbilt Press, Nashville 2001.
Chapter 2
Differential Equations
Iterative refinement schemes for an ill-conditioned
transfer equation in astrophysics
Mario Ahues, Filomena d'Almeida, Alain Largillier,
Olivier Titaud and Paulo Vasconcelos
Université de Saint-Étienne, France, and Universidade do Porto, Portugal
Abstract
Let X := L^1([0, τ_0]), where τ_0 represents the optical depth of a stellar atmosphere. The weakly singular integral operator T : X → X defined by

(Tφ)(τ) = (ϖ/2) ∫_0^{τ_0} E_1(|τ − τ′|) φ(τ′) dτ′,

where ϖ ∈ ]0, 1[ is the albedo of the atmosphere and E_1 denotes the first exponential-integral function, is such that ||T||_1 = ϖ(1 − E_2(τ_0/2)), where E_2 denotes the second exponential-integral function. If ϖ is close to 1 and τ_0 is large, then ||T||_1 is close to 1. In that case, the transfer problem

given f ∈ X, find φ ∈ X such that Tφ = φ + f

is ill-conditioned, and the convergence of the fixed-point iteration φ_{k+1} = Tφ_k − f, which is commonly used by numerical astronomers, becomes prohibitively slow. The purposes of this work are to approximate φ through different sequences whose terms solve well-conditioned approximate equations, and to compare their efficiency and computational costs.
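The slow convergence described above can already be seen in a one-dimensional caricature. The following sketch (a hypothetical scalar analogue, not the paper's operator) replaces T by multiplication with a factor t playing the role of ||T||_1 and counts the steps the fixed-point iteration needs:

```python
# Scalar caricature of the fixed-point iteration phi_{k+1} = T phi_k - f.
# Here T is multiplication by a factor t standing in for ||T||_1; the exact
# solution of t*phi = phi + f is phi = f/(t - 1), and the error contracts by
# a factor |t| per step, so roughly O(1/(1 - t)) steps are needed:
# prohibitively many once t is close to 1.

def fixed_point_steps(t, f, tol=1e-6, max_steps=10**6):
    exact = f / (t - 1.0)
    phi = 0.0
    for k in range(1, max_steps + 1):
        phi = t * phi - f
        if abs(phi - exact) <= tol * abs(exact):
            return k
    return max_steps

steps_moderate = fixed_point_steps(0.750, -1.0)  # a few dozen steps
steps_extreme = fixed_point_steps(0.999, -1.0)   # over ten thousand steps
```

The closer t is to 1, the slower the plain fixed-point iteration, which is the behaviour the refinement schemes of Section 2 are designed to avoid.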
1  Introduction

For a given τ_0 > 0, let g be a function defined on ]0, τ_0] such that

lim_{τ→0+} g(τ) = +∞,   (1.1)

g ∈ C^0(]0, τ_0]) ∩ L^1([0, τ_0]),   (1.2)

g(τ) > 0 for all τ ∈ ]0, τ_0],   (1.3)

g is a decreasing function on ]0, τ_0].   (1.4)

We consider the integral operator T defined by

(Tx)(τ) := ∫_0^{τ_0} g(|τ − τ′|) x(τ′) dτ′.   (1.5)
Theorem 1  T is a linear compact operator in L^1([0, τ_0]) and ||T||_1 = 2 ∫_0^{τ_0/2} g(τ) dτ.

Proof: See [2]. □
For z in the resolvent set of T, we consider the Fredholm equation of the second kind

Tφ = zφ + f.   (1.6)
Applications will concern the function g : ]0, τ_0] → ℝ given by

g(τ) := (ϖ/2) E_1(τ),   (1.7)

where ϖ ∈ ]0, 1[ and E_1 is the exponential-integral function

E_1(τ) := ∫_1^∞ (e^{−τμ}/μ) dμ,  τ > 0.

E_1 is the first function of the sequence (E_ν)_{ν≥1},

E_ν(τ) := ∫_1^∞ (e^{−τμ}/μ^ν) dμ,  τ > 0, ν ≥ 2,

and it is the only one presenting a logarithmic singularity at τ = 0. Following Theorem 1, when g is defined by (1.7), we have ||T||_1 = ϖ[1 − E_2(τ_0/2)] < 1.
We recall that a bounded linear finite rank operator T_n in a normed linear space X can be written as

T_n x := Σ_{j=1}^n ⟨x, ℓ_{n,j}⟩ e_{n,j},  x ∈ X,   (1.8)

where n ∈ ℕ*, and, for j ∈ [1, n], ℓ_{n,j} ∈ X*, the topological adjoint space of X, and e_{n,j} ∈ X.
The resolution of the approximate equation

T_n φ_n = z φ_n + f,   (1.9)

where z belongs to the resolvent set of T_n, leads to an n-dimensional linear system

(A_n − z I_n) x_n = b_n,   (1.10)

where I_n is the identity matrix of order n, and

A_n(i, j) := ⟨e_{n,j}, ℓ_{n,i}⟩,  b_n(i) := ⟨f, ℓ_{n,i}⟩,  x_n(j) := ⟨φ_n, ℓ_{n,j}⟩.   (1.11)

Once this system is solved, the solution of (1.9) is given by

φ_n = (1/z) ( Σ_{j=1}^n x_n(j) e_{n,j} − f ).   (1.12)
We are interested in refining approximations obtained with T_n := π_n T, where (π_n) is a sequence of projections with finite rank n. A bounded projection π_n of finite rank n is defined by

π_n x := Σ_{j=1}^n ⟨x, e*_{n,j}⟩ e_{n,j}  for all x ∈ X,

where (e_{n,j})_{j=1}^n is an ordered basis of the range of π_n, and (e*_{n,j})_{j=1}^n is an adjoint basis of the former in X*. Hence

T_n x := Σ_{j=1}^n ⟨Tx, e*_{n,j}⟩ e_{n,j},  x ∈ X.   (1.13)

We suppose that π_n is pointwise convergent to the identity operator in the Banach space X where the operator T is defined. Since T is compact, T_n converges to T in the operator
norm. Let R(z) := (T − zI)^{-1} be the resolvent of T at z. Then R_n(z) := (T_n − zI)^{-1} exists for n large enough and is uniformly bounded, that is, there exists n_0 such that

c_0(z) := sup_{n≥n_0} ||R_n(z)|| < +∞.   (1.14)
We develop an application in the space X := L^1([0, τ_0]). Let (τ_{n,j})_{j=0}^n be a grid on [0, τ_0] such that

0 =: τ_{n,0} < τ_{n,1} < ⋯ < τ_{n,n−1} < τ_{n,n} := τ_0,   (1.15)

and set

h_{n,j} := τ_{n,j} − τ_{n,j−1}  for j ∈ [1, …, n].   (1.16)

We define, for τ ∈ [0, τ_0],

e_{n,j}(τ) := 1 if τ ∈ (τ_{n,j−1}, τ_{n,j}),  0 otherwise,   (1.17)

and, for x ∈ L^1([0, τ_0]),

⟨x, e*_{n,j}⟩ := (1/h_{n,j}) ∫_{τ_{n,j−1}}^{τ_{n,j}} x(τ′) dτ′.   (1.18)
The product defined in (1.18) is a special case of the scalar product used in equation (1.8) when a grid such as (1.15) is set. In this case the operator in (1.13) is the operator in (1.8) if we choose ℓ_{n,j} = T* e*_{n,j}. Let

h_n := min{h_{n,j} : j ∈ [1, …, n]},  h̄_n := max{h_{n,j} : j ∈ [1, …, n]},  q_n := h_n / h̄_n.   (1.19)

For quasi-uniform grids, there exists a constant q independent of n such that, for all n, q ≤ q_n. For uniform grids, q_n = 1 for all n.
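The data entering this discretization are straightforward to compute. The sketch below (with an illustrative grid and integrand, and the midpoint rule standing in for the exact integrals) evaluates the cell averages of (1.18) and the uniformity ratio q_n of (1.19):

```python
# Cell averages <x, e*_{n,j}> from (1.18), computed with the midpoint rule,
# and the grid-uniformity ratio q_n from (1.19). The grid and the integrand
# are illustrative choices, not taken from the paper.

def cell_averages(x, grid, panels=64):
    """<x, e*_j> = (1/h_j) * integral of x over (tau_{j-1}, tau_j)."""
    avgs = []
    for a, b in zip(grid[:-1], grid[1:]):
        h = (b - a) / panels
        s = sum(x(a + (i + 0.5) * h) for i in range(panels)) * h
        avgs.append(s / (b - a))
    return avgs

def uniformity_ratio(grid):
    steps = [b - a for a, b in zip(grid[:-1], grid[1:])]
    return min(steps) / max(steps)

grid = [0.0, 0.25, 0.5, 0.75, 1.0]       # a uniform grid on [0, 1]
avgs = cell_averages(lambda t: t, grid)  # averages of x(tau) = tau
q_n = uniformity_ratio(grid)
```

For a uniform grid the ratio is 1, and for the linear integrand each cell average is simply the cell midpoint.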
Theorem 2  Let φ ≠ 0 be the solution of (1.6) with T defined by (1.5). Let φ_n be the solution of (1.9) with T_n defined by (1.8) and (1.15)-(1.17). Then, for n large enough,

||φ − φ_n||_1 / ||φ||_1 ≤ (8 c_0(z) / q_n) ∫_0^{h̄_n} g(τ) dτ,   (1.20)

where c_0(z) is given by (1.14) and computed with the 1-norm.

Proof: See [2]. □
In the case (1.7), the matrix A_n of the linear system (1.10) has entries

A_n(i, j) := (ϖ / (2 h_{n,i})) ∫_{τ_{n,i−1}}^{τ_{n,i}} ∫_0^{τ_0} E_1(|τ − τ′|) e_{n,j}(τ′) dτ′ dτ,   (1.21)

and the second member b_n has entries

b_n(i) := (ϖ / (2 h_{n,i})) ∫_{τ_{n,i−1}}^{τ_{n,i}} ∫_0^{τ_0} E_1(|τ − τ′|) f(τ′) dτ′ dτ.   (1.22)

For more details, see [3]. An application to the transfer problem in astrophysics gives (1.6) with z = 1, and as free term,

f(τ) := −1 if 0 ≤ τ ≤ τ_0/2,  0 otherwise,   (1.23)

which describes a sudden drop of the temperature on the τ = τ_0/2 layer of the atmosphere. For further details on the physical model, see [4].

2  Iterative refinement of approximate solutions
To attain a given precision on the approximate solution φ_n, it may be necessary that the largest grid step h̄_n be so small that the dimension of the corresponding linear system becomes prohibitively large from a computational point of view. Not only does the algorithm's stability become poor, but the condition number of the matrix may also increase with its size. Refinement schemes allow us to attain iteratively the exact solution of a large-scale linear system by means of the resolution of a sequence of linear systems of moderate fixed size. Let us consider the general framework of a complex Banach space X and a linear compact operator T : X → X. If z is in the resolvent set of T, then z ≠ 0. Let T_n be a sequence of linear bounded operators in X such that ||T − T_n|| → 0 in the operator norm. Then, for n large enough, z belongs to the resolvent set of T_n and R_n(z) is norm-convergent to R(z).

The most elementary way to refine the approximate solution φ_n := R_n(z) f is the following.
Scheme A:
x^(0) := φ_n,
x^(k+1) := x^(k) − R_n(z) (T x^(k) − z x^(k) − f),  k ≥ 0.   (2.1)
We can interpret R_n(z) as an approximation of the inverse of the Fréchet derivative of the affine operator x ↦ (T − zI)x − f, the exact one being R(z). Since R(z) satisfies the identities

R(z) = (1/z)(R(z) T − I) = (1/z)(T R(z) − I),   (2.2)

two new different approximations of R(z) are thus motivated:

R̃_n(z) := (1/z)(R_n(z) T − I),  R̂_n(z) := (1/z)(T R_n(z) − I).   (2.3)
These approximate resolvent operators lead to the following iterative refinement schemes:

Scheme B:
x̃^(0) := R̃_n(z) f,
x̃^(k+1) := x̃^(k) − R̃_n(z) (T x̃^(k) − z x̃^(k) − f),  k ≥ 0.   (2.4)

Scheme C:
x̂^(0) := R̂_n(z) f,
x̂^(k+1) := x̂^(k) − R̂_n(z) (T x̂^(k) − z x̂^(k) − f),  k ≥ 0.   (2.5)
Since the computation of residuals which tend to zero, as well as the resolution of almost homogeneous linear systems, may be unstable, the following theorems are interesting for algorithmic purposes.

Theorem 3  In (2.1), x^(k+1) = x^(0) + R_n(z)(T_n − T) x^(k) for k ≥ 0.

Theorem 4  In (2.4), x̃^(k+1) = x̃^(0) + (1/z) R_n(z)(T_n − T) T x̃^(k) for k ≥ 0.

Theorem 5  In (2.5), x̂^(k+1) = x̂^(0) + (1/z) T R_n(z)(T_n − T) x̂^(k) for k ≥ 0.
Proof: For each k ≥ 0, in (2.1),

x^(k+1) = x^(k) − R_n(z)(T x^(k) − z x^(k) − f)
        = x^(k) − R_n(z)(T_n − zI) x^(k) − R_n(z)(T − T_n) x^(k) + R_n(z) f
        = x^(0) + R_n(z)(T_n − T) x^(k),

since R_n(z)(T_n − zI) = I and x^(0) = φ_n = R_n(z) f. For (2.4) and (2.5), the proof follows the same idea but is technically more complicated.
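In a finite-dimensional setting the residual-free form of Theorem 3 is easy to exercise. The sketch below takes small illustrative 2x2 matrices for T and T_n (not the transfer-equation operators) with z = 1, and checks that the iteration x^(k+1) = x^(0) + R_n(z)(T_n − T)x^(k) converges to the solution of (T − zI)φ = f:

```python
# Scheme A in the residual-free form of Theorem 3, on a 2x2 toy problem.
# T is the "exact" operator, T_n a nearby approximation, z = 1; applying
# R_n(z) amounts to solving a small system with T_n - z*I. All matrices
# are illustrative only.

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def solve2(M, b):
    # Cramer's rule for a 2x2 system M x = b
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]

z = 1.0
T = [[0.50, 0.10], [0.20, 0.40]]
T_n = [[0.52, 0.08], [0.21, 0.38]]
f = [1.0, -1.0]

A = [[T[i][j] - (z if i == j else 0.0) for j in range(2)] for i in range(2)]
A_n = [[T_n[i][j] - (z if i == j else 0.0) for j in range(2)] for i in range(2)]
D = [[T_n[i][j] - T[i][j] for j in range(2)] for i in range(2)]  # T_n - T

phi = solve2(A, f)    # reference solution of (T - zI) phi = f
x0 = solve2(A_n, f)   # x^(0) = R_n(z) f = phi_n
x = x0[:]
for k in range(40):   # x^(k+1) = x^(0) + R_n(z)(T_n - T) x^(k)
    corr = solve2(A_n, mat_vec(D, x))
    x = [x0[i] + corr[i] for i in range(2)]
```

Because ||R_n(z)(T_n − T)|| is small here, the iterate reaches the reference solution to machine precision in a handful of steps.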
In our application to the transfer equation in astrophysics, T is defined by (1.5) with
g given by (1.7), and the equation (1.6) has z = 1.
3  Numerical computations

The iterative refinement schemes allow us to obtain the exact solution of a large-scale linear system by solving a sequence of systems of moderate fixed size. Each of the three iterative refinement schemes presented in this work is based on an approximation, say G_n(z), of the resolvent operator R(z). Their common structure is the following:

x^(0) := G_n(z) f,
x^(k+1) := x^(0) + (I − G_n(z)(T − zI)) x^(k),  k ≥ 0.   (3.1)
Theorem 6  Let c_1(z) := 8 c_0(z) max{1, ||T||_1/|z|}, and let (x^(k))_{k≥0} be any of the sequences (2.1), (2.4) or (2.5). Then

||x^(k) − φ||_1 / ||φ||_1 ≤ ( (c_1(z)/q_n) ∫_0^{h̄_n} g(τ) dτ )^{k+1},  k ≥ 0.

Proof: Let us prove the bound for the sequence defined by (2.1); for the other two, the arguments are similar. Using Theorem 3, we have

x^(k) − φ = (R_n(z)(T_n − T))^k (x^(0) − φ),  x^(0) − φ = R_n(z)(T − T_n) φ.

Hence

||x^(k) − φ||_1 ≤ ||(R_n(z)(T − T_n))^{k+1}||_1 ||φ||_1,

and, in [2], we have shown that ||R_n(z)(T_n − T)||_1 ≤ (8 c_0(z)/q_n) ∫_0^{h̄_n} g(τ) dτ. □
All the schemes need evaluations of T at some prescribed functions of X. In practice T is not used for this purpose, but an operator T_m of the sequence (T_ν)_{ν≥1} is used instead, where m > n. We consider the kernel g defined by (1.7) and the free term f defined by (1.23). Table 1 gives the number of iterations performed by each scheme, for several values of ϖ, in order to obtain a relative residual less than or equal to 10^{-12}, when a quasi-uniform grid (τ_{ν,j})_{j=0}^ν is built such that ν is a multiple of 10, with τ_0 = 1000, n = 200, m = 1000, and the steps h_{ν,i} taking, according to (3.2), three prescribed constant values on three consecutive ranges of the index i.
Albedo ϖ    Scheme A (2.1)    Scheme B (2.4)    Scheme C (2.5)
0.750             29                15                14
0.990             46                27                26
0.999            385               196               195

TAB 1. Number of iterations.
Figures 1, 2 and 3 show the last iterate of all schemes, as well as the corresponding convergence histories, for ϖ ∈ {0.750, 0.990, 0.999}. As we can see, schemes B and C are much faster than Atkinson's formula A, especially when the albedo is close to 1. In the latter situation a wider boundary layer arises at the left of the atmosphere, and the decay at the middle point takes place along a wider subinterval.

A survey on different discretization methods for integral operators can be found in [1], with special emphasis on spectral applications. Concerning the condition number of the associated linear systems, the reader is referred to [7], [5] and [6].
Bibliography
1. M. Ahues, A. Largillier and B. V. Limaye, Spectral Computations with Bounded Operators, Chapman and Hall, Boca Raton, 2001.
2. M. Ahues, A. Largillier and O. Titaud, The roles of a weak singularity and the grid uniformity in the relative error bounds, Numer. Funct. Anal. and Optimiz. 22, 789-814, 2001.
3. M. Ahues, F. D'Almeida, A. Largillier, O. Titaud and P. Vasconcelos, An L^1 refined projection approximate solution of the radiation transfer equation in stellar atmospheres, Journal of Computational and Applied Mathematics 140, 13-26, 2002.
4. I. W. Busbridge, The Mathematics of Radiative Transfer, Cambridge University Press, 1960.
5. L. N. Deshpande and B. V. Limaye, On the stability of singular finite-rank methods, SIAM J. Numer. Anal. 27, 792-803, 1990.
6. A. Largillier and B. V. Limaye, Finite-rank methods and their stability for coupled systems of operator equations, SIAM J. Numer. Anal. 33, 707-728, 1996.
7. R. Whitley, The stability of finite-rank methods with applications to integral equations, SIAM J. Numer. Anal. 23, 118-134, 1986.
FIG. 1. Solution and convergence history for ϖ = 0.750: Scheme A — dashed line, Scheme B — dotted line, Scheme C — solid line.

FIG. 2. Solution and convergence history for ϖ = 0.990: Scheme A — dashed line, Scheme B — dotted line, Scheme C — solid line.

FIG. 3. Solution and convergence history for ϖ = 0.999: Scheme A — dashed line, Scheme B — dotted line, Scheme C — solid line.
Geometrical symmetry in symmetric Galerkin BEM

Alessandra Aimi and Mauro Diligenti
Department of Mathematics, University of Parma, Italy.
alessandra.aimi@unipr.it, mauro.diligenti@unipr.it

Abstract
We consider a symmetric boundary integral formulation associated with a mixed boundary value problem defined on a domain Ω ⊂ ℝ² with piecewise smooth boundary Γ. We assume that Ω is mapped onto itself by a finite group G of congruences having at least two distinct elements. Hence, we can decompose the related symmetric Galerkin BEM problem into independent subproblems of reduced dimension with respect to the complete one. Shape functions for each subproblem can be obtained from the classical BEM basis, ordered as a vector, by applying suitable restriction matrices constructed starting from group representation theory.
1  Introduction

Let Ω ⊂ ℝ² be a bounded domain with a piecewise smooth boundary Γ. The boundary Γ is partitioned into two non-intersecting open subsets Γ_1 and Γ_2, with Γ̄ = Γ̄_1 ∪ Γ̄_2 = ∪_{k=1}^J Γ̄^k, the Γ^k being open straight line segments. In the following we always assume meas Γ_1 > 0. The solution of the mixed boundary value problem

L(x) u(x) = 0  in Ω,   (1.1)

u(x) = u*(x)  on Γ_1,   q(x) := (∂u/∂n)(x) = q*(x)  on Γ_2,   (1.2)

can be expressed by the representation formula

u(x) = ∫_Γ U(x, y) q(y) dΓ_y − ∫_Γ (∂/∂n_y) U(x, y) u(y) dΓ_y,  x ∈ Ω.   (1.3)

In (1.1), L(·) is an elliptic partial differential operator of second order and U(x, y) is its fundamental solution (see [4] for a general discussion). In (1.2), ∂/∂n denotes the derivative with respect to the outer normal n to Γ, and u* and q* are given functions. Applications of (1.1)-(1.2) are, for instance, boundary value problems in potential theory and in elastostatics. From (1.3) it is clear that if we want to recover u in Ω we first have to know the remaining Cauchy data, since in (1.2) these functions are given only partially. Taking the limit of u(x) for x ∈ Γ_1 and of the normal derivative q(x) for x ∈ Γ_2 in this formula and using the jump relations, one finds the system [2]

∫_Γ U(x, y) q(y) dΓ_y − ∫_Γ (∂/∂n_y) U(x, y) u(y) dΓ_y = f_1(x),  x ∈ Γ_1,   (1.4)
In order to perform the Galerkin method, we need a family of finite-dimensional subspaces {U_{h,p}(Γ)} defined on Γ. Let us define a mesh on each Γ^k: Γ̄^k = ∪_i Γ̄^k_{h,i}, such that each Γ^k_{h,i} is an open segment. We define, for p ≥ 0, h > 0, U_{h,p}(Γ_1) to be the set of functions on Γ_1 whose restrictions to Γ^k ⊂ Γ_1 belong to the set of all polynomials of degree ≤ p on each Γ^k_{h,i}. Moreover, for p ≥ 1, U_{h,p}(Γ_2) will denote those continuous functions on Γ_2 whose restrictions to Γ^k ⊂ Γ_2 belong to C^0(Γ_2) and which vanish at the end points of Γ_2. The approximating boundary element shape functions of degree p ≥ 0 are defined through the standard assembling of the local basis functions defined on each Γ^k_{h,i}. We then define

U_{h,p}(Γ) := span{(ψ_i, φ_ℓ) : ψ_i ∈ U_{h,p}(Γ_2), φ_ℓ ∈ U_{h,p}(Γ_1)}.   (1.5)

The corresponding symmetric Galerkin boundary element scheme for (1.4) leads to a linear system of the form

A ξ = b.   (1.6)

If the boundary Γ presents symmetry properties, we will exploit them to reduce the computational cost of the solution of (1.6), using a decomposition result for the Galerkin boundary element problem that we will introduce at the end of the next section.
2  Matrix representation of a finite group of congruences and projection operators

Let G be a finite group of t congruences (t ≥ 2) of the Euclidean space ℝ^m (m = 2, 3). The group G can be described by orthogonal matrices γ_i of order m. Let {γ_1, …, γ_t} be the elements of G, γ_1 being the identity matrix. From the theory of group representations [5] it follows that any finite group G admits a finite number q of unitary irreducible, pairwise inequivalent matrix representations

{ω^(1)(γ_i)}, {ω^(2)(γ_i)}, …, {ω^(q)(γ_i)}  (i = 1, …, t).   (2.1)

Let d_ℓ be the order of the representation {ω^(ℓ)(γ_i)}, i.e., the order of the matrices ω^(ℓ)(γ_i). The number q of the representations (2.1) and the orders d_1, …, d_q depend only on G. Any representation {ω^(ℓ)(γ_i)} of order d_ℓ ≥ 2 can be replaced, in the system (2.1), by an equivalent unitary representation. Representations of order 1 are univocally determined. We observe that, if γ_i and γ_j are two elements of G, then ω^(ℓ)(γ_i γ_j) = ω^(ℓ)(γ_i) ω^(ℓ)(γ_j) and ω^(ℓ)(γ_i^{-1}) = [ω^(ℓ)(γ_i)]*, where [ω^(ℓ)(γ_i)]* denotes the transpose of the matrix ω^(ℓ)(γ_i). Also from the theory of group representations it follows that q ≤ t and that the relation d_1^2 + d_2^2 + ⋯ + d_q^2 = t holds. Furthermore, q = t if and only if d_1 = d_2 = ⋯ = d_q = 1. Having set M = d_1 + d_2 + ⋯ + d_q, then q ≤ M ≤ t, and we have q = M = t if and only if G is an abelian group.

Let Ω be a bounded domain in ℝ^m with a piecewise smooth boundary Γ, invariant with respect to G, i.e., sent onto itself by the congruences of G. Then the boundary Γ is also invariant with respect to G, i.e., for any γ_ℓ ∈ G and x ∈ Γ, γ_ℓ x ∈ Γ.
Let W(Γ) be the real vector space of real functions defined on Γ. We can associate to any element γ_i of G a linear transformation T_i defined, for any v ∈ W(Γ), by

(T_i v)(x) := v(γ_i^{-1} x),  x ∈ Γ,   (2.2)

where T_i is a linear, invertible transformation from W(Γ) onto W(Γ), and T_1 is the identity.

Definition 2.1  A subset V(Γ) of W(Γ) is said to be invariant with respect to G (or G-invariant) if for any v ∈ V(Γ) and any γ_i ∈ G, T_i v ∈ V(Γ).

Obviously, if v is a function of W(Γ) not identically equal to zero, the set of functions {T_i v, i = 1, …, t} is invariant with respect to G.

Definition 2.2  Let C be a linear operator in V(Γ). We will say that C is invariant with respect to G if for any u ∈ V(Γ): C T_i u = T_i C u, i = 1, …, t.

Example 2.3  Let V(Γ) be a suitable Sobolev space and (C f)(x) := ∫_Γ K(x, y) f(y) dΓ_y an integral operator defined on V(Γ), with kernel K(x, y). We have T_i (C f)(x) = ∫_Γ K(γ_i^{-1} x, y) f(y) dΓ_y; since γ_i ∈ G is an isometry, the mapping y → γ_i y preserves the differential element dΓ_y. Thus

C (T_i f)(x) = ∫_Γ K(x, y) f(γ_i^{-1} y) dΓ_y = ∫_Γ K(x, γ_i y) f(y) dΓ_y.

Then the integral operator C is G-invariant if the kernel K(x, y) satisfies the condition K(x, y) = K(γ_i x, γ_i y) for all x, y ∈ Γ, i = 1, …, t.
Starting from the group G, the system of representations (2.1) and the linear transformations T_i defined by (2.2), we can introduce M linear transformations of W(Γ),

P_{ℓk} := (d_ℓ/t) Σ_{i=1}^t ω_{kk}^(ℓ)(γ_i) T_i,  (ℓ = 1, …, q; k = 1, …, d_ℓ).   (2.3)

Owing to the properties of the representations (2.1), there holds

P_{ℓk}^2 = P_{ℓk},  P_{ℓk} P_{ℓ′k′} = 0 if (ℓ, k) ≠ (ℓ′, k′),  Σ_{ℓ=1}^q Σ_{k=1}^{d_ℓ} P_{ℓk} = T_1.   (2.4)

The linear transformations P_{ℓk}, which will be called projection operators, determine a decomposition of any vector space V(Γ) ⊂ W(Γ) invariant with respect to G into a direct sum of M subspaces V_{ℓk}(Γ); V_{ℓk}(Γ) is the co-domain of P_{ℓk}, viewed as a linear transformation from V(Γ) onto itself.
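For the smallest nontrivial example, G = {identity, reflection} (t = 2, both representations of order 1), the operators (2.3) are the classical even/odd projectors. The sketch below (point set and sample values are illustrative) verifies the relations (2.4) numerically:

```python
# The projection operators (2.3) for G = {identity, reflection} acting on
# functions defined on a reflection-invariant point set. P_11 extracts the
# even part and P_21 the odd part; they are idempotent, mutually
# annihilating and sum to the identity, as in (2.4). Data are illustrative.

pts = [-2.0, -1.0, 1.0, 2.0]                 # reflection-invariant set

def reflect(v):
    """(T_2 v)(x) = v(gamma_2^{-1} x) = v(-x) on the ordered point set."""
    idx = {x: i for i, x in enumerate(pts)}
    return [v[idx[-x]] for x in pts]

def P(v, chars):
    """(d_l/t) * sum_i omega^(l)(gamma_i) T_i v, with t = 2, d_l = 1."""
    images = [v, reflect(v)]                 # T_1 v and T_2 v
    return [sum(c * images[i][j] for i, c in enumerate(chars)) / 2.0
            for j in range(len(v))]

v = [3.0, 1.0, 4.0, 1.0]                     # arbitrary sample values
even = P(v, (1.0, 1.0))                      # P_11 v (trivial character)
odd = P(v, (1.0, -1.0))                      # P_21 v (sign character)
```

Here even + odd reproduces v, applying P_11 twice changes nothing, and P_21 annihilates the even part, which is exactly (2.4) for this group.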
If G is a non-abelian group, it is useful to consider in the space W(Γ) further linear transformations linked to the system (2.1). Let {ω^(ℓ)(γ_i)} be a representation of G of order d_ℓ ≥ 2. Let us consider the d_ℓ^2 linear transformations, already introduced in [1], defined as follows:

A_{kr}^(ℓ) := (d_ℓ/t) Σ_{i=1}^t ω_{kr}^(ℓ)(γ_i) T_i,  k, r = 1, …, d_ℓ.   (2.5)
If k = r, then A_{kk}^(ℓ) = P_{ℓk}.

Definition 2.4  Let B(·, ·) be a bilinear form from V(Γ) × V(Γ) into ℝ. We will say that B(·, ·) is G-invariant if, for any u, v ∈ V(Γ),

B(T_i u, T_i v) = B(u, v),  i = 1, …, t.   (2.6)
Let V(Γ) be a Hilbert space and let us consider the following problem:

find u ∈ V(Γ) : B(u, v) = F(v)  for all v ∈ V(Γ),   (2.7)

where B(·, ·) is continuous and coercive, and F(·) : V(Γ) → ℝ is a linear continuous functional. If F and V(Γ) are invariant with respect to G, and V(Γ) = ⊕_{ℓ=1}^q ⊕_{k=1}^{d_ℓ} V_{ℓk}(Γ) is the decomposition of V(Γ) defined by the projection operators (2.3), the following fundamental result holds.

Theorem 2.5  If B(·, ·) verifies the condition (2.6) and the P_{ℓk} are the projection operators defined in (2.3), then the problem (2.7) can be decomposed into M independent problems: find u_{ℓk} ∈ V_{ℓk}(Γ) such that

B(u_{ℓk}, v_{ℓk}) = F(v_{ℓk})  for all v_{ℓk} ∈ V_{ℓk}(Γ),  ℓ = 1, …, q; k = 1, …, d_ℓ.   (2.8)
The solution of (2.7) can be recovered as u = Σ_{ℓ=1}^q Σ_{k=1}^{d_ℓ} u_{ℓk}.

The above result can be applied, under the invariance hypothesis, in discrete form to the symmetric Galerkin BEM scheme if we choose the finite-dimensional subspace U_{h,p}(Γ) defined in (1.5) to be G-invariant too, and therefore decomposable as U_{h,p}(Γ) = ⊕_{ℓ=1}^q ⊕_{k=1}^{d_ℓ} U_{h,p}^{ℓk}(Γ). Then the symmetric Galerkin boundary element problem can be decomposed into M independent problems which have reduced dimension with respect to the original one and which can be solved on parallel processors. Now one has to construct boundary element basis functions for each subspace U_{h,p}^{ℓk}(Γ). For some simple geometries (and groups of congruences) this can be done directly, but in many cases it is a difficult task. We solve it here by applying restriction matrices, which we introduce in the next sections, to the basis of U_{h,p}(Γ), ordered as a vector. Since there is a one-to-one correspondence between the standard boundary element shape functions and the nodes of the mesh fixed on Γ, in the following we will work directly on the nodes of the boundary.
3  Elementary restriction matrices

In this section we introduce suitable matrices depending only on the group G and on the system of representations (2.1), which will be called elementary restriction matrices. In the following sections we will see how, starting from these, we can construct restriction matrices relative to a mesh defined on Γ. We fix a finite group G = {γ_1, …, γ_t} of congruences of ℝ^m and a system (2.1) of orthogonal irreducible, pairwise inequivalent representations of G. G always admits the trivial representation {1, 1, …, 1}, which we indicate by {ω^(1)(γ_i)}; let us order the remaining representations (2.1) with increasing order d_ℓ, and let {ω^(1)(γ_i)}, …, {ω^(s)(γ_i)} be the representations of order 1. If G is an abelian group, one has s = q = t and d_1 = d_2 = ⋯ = d_t = 1. If G is a nonabelian group, it holds that s < q < t and therefore d_1 = d_2 = ⋯ = d_s = 1, 2 ≤ d_{s+1} ≤ ⋯ ≤ d_q.
Let G be an abelian group. We will call elementary restriction matrices the following t matrices, with 1 row and t columns:

R_{ℓ1} = (1/√t) (ω^(ℓ)(γ_1) ⋯ ω^(ℓ)(γ_t)),  ℓ = 1, …, t.   (3.1)

Since the representations {ω^(ℓ)(γ_i)} are real, it follows that ω^(ℓ)(γ_i) = ±1, for ℓ, i = 1, …, t.

Let G be a nonabelian group. Corresponding to the representations {ω^(ℓ)(γ_i)} of order 1 of the system (2.1), we introduce matrices R_{ℓ1} with 1 row and t columns:

R_{ℓ1} = (1/√t) (ω^(ℓ)(γ_1) ⋯ ω^(ℓ)(γ_t)),  ℓ = 1, …, s.   (3.2)

We obtain, in this case, s matrices. Let now {ω^(ℓ)(γ_i)} be a representation of the system (2.1) of order d_ℓ, with d_ℓ ≥ 2. With k = 1, …, d_ℓ fixed, let us consider the following matrix, with d_ℓ rows and t columns:

R_{ℓk} = √(d_ℓ/t) ( ω_{k1}^(ℓ)(γ_1)   ω_{k1}^(ℓ)(γ_2)   ⋯   ω_{k1}^(ℓ)(γ_t)
                          ⋮                 ⋮                    ⋮
                    ω_{kd_ℓ}^(ℓ)(γ_1)  ω_{kd_ℓ}^(ℓ)(γ_2)  ⋯  ω_{kd_ℓ}^(ℓ)(γ_t) ).   (3.3)

Due to the orthogonality properties of the representations {ω^(ℓ)(γ_i)}, the matrix R_{ℓk} has pairwise orthonormal rows; therefore the rank of R_{ℓk} is d_ℓ. For any representation {ω^(ℓ)(γ_i)} we obtain d_ℓ matrices R_{ℓk} (k = 1, …, d_ℓ). The matrices R_{ℓk} (ℓ = 1, …, q; k = 1, …, d_ℓ) defined in (3.2) and (3.3) will be called elementary restriction matrices. The total number of these matrices is M, with M = d_1 + d_2 + ⋯ + d_q. The matrices defined in (3.1) or (3.2)-(3.3) satisfy some properties, easily deducible from the orthogonality relations (2.4), which we summarise in the following.

Theorem 3.1 ([1])  The M elementary restriction matrices defined by (3.1) or (3.2)-(3.3) verify the relations

R_{ℓk} R_{ℓk}^t = I_{d_ℓ},  R_{ℓk} R_{ℓ′k′}^t = 0 if (ℓ, k) ≠ (ℓ′, k′),  Σ_{ℓ=1}^q Σ_{k=1}^{d_ℓ} R_{ℓk}^t R_{ℓk} = I,   (3.4)
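For an abelian example one can take the Klein four-group, whose four real one-dimensional representations are the rows of its character table; the matrices (3.1) then satisfy (3.4) by construction. A small numerical check (the group choice is illustrative):

```python
# Elementary restriction matrices (3.1) for the Klein four-group (t = 4).
# The relations (3.4) reduce here to the rows being orthonormal and
# resolving the identity.

t = 4
chars = [[1, 1, 1, 1],
         [1, 1, -1, -1],
         [1, -1, 1, -1],
         [1, -1, -1, 1]]
R = [[c / t ** 0.5 for c in row] for row in chars]  # R_l = (1/sqrt(t))(...)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# (3.4): R_l R_l^t = 1, R_l R_m^t = 0 for l != m, sum_l R_l^t R_l = I_t
gram = [[dot(R[l], R[m]) for m in range(t)] for l in range(t)]
resolution = [[sum(R[l][i] * R[l][j] for l in range(t)) for j in range(t)]
              for i in range(t)]
```

Both `gram` and `resolution` come out as 4x4 identity matrices, confirming the three relations of (3.4) for this group.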
where I_{d_ℓ} and I are identity matrices of order d_ℓ and t respectively.

4  H(Σ_a) spaces and elementary restriction matrices

Let Γ be the piecewise smooth boundary of Ω, invariant with respect to G, and let a ∈ Γ. Consider the ordered set

Σ_a = {a, γ_2^{-1} a, …, γ_t^{-1} a},   (4.1)

and the space H(Σ_a) of real functions defined on Σ_a. A natural basis B in H(Σ_a) is formed by the functions having value 1 in one point of Σ_a and 0 in the remaining points. Having indicated with χ the function of B with value 1 in the point a, we obtain the ordered basis B = {χ(x), χ(γ_2 x), …, χ(γ_t x)}, such that, of course,

H(Σ_a) = span{χ(x), χ(γ_2 x), …, χ(γ_t x)}.   (4.2)

H(Σ_a) is a vector space of finite dimension n ≤ t, invariant with respect to G (since Σ_a is invariant with respect to G) and therefore decomposable into a direct sum of M subspaces H_{ℓk}(Σ_a). Having set n_ℓ = dim H_{ℓk}(Σ_a), we have n = Σ_{ℓ=1}^q d_ℓ n_ℓ.
Definition 4.1  We say that a is a generic point of Γ (with respect to the group G) if dim H(Σ_a) = t or, equivalently, if all the elements of Σ_a are distinct.

The following results hold.

Theorem 4.2 ([1])  Having fixed any point a ∈ Γ, if {ω^(ℓ)(γ_i)} is a representation of order 1, then H_{ℓ1}(Σ_a) = span{P_{ℓ1} χ} and n_ℓ ≤ 1. If {ω^(ℓ)(γ_i)} is a representation of order d_ℓ ≥ 2, one has

H_{ℓk}(Σ_a) = span{A_{k1}^(ℓ) χ, …, A_{kd_ℓ}^(ℓ) χ},  k = 1, …, d_ℓ,   (4.3)

and therefore n_ℓ ≤ d_ℓ. If a is a generic point, then n_ℓ = d_ℓ for any ℓ.

Let now V^t be the column vector (χ(x), χ(γ_2 x), …, χ(γ_t x))^t, whose order is related to the one fixed for the elements of G. Corresponding to the representations of order 1 of G, for the elementary restriction matrices defined in (3.1), (3.2) we have R_{ℓ1} V^t = √t P_{ℓ1} χ. From Theorem 4.2, it follows that

H_{ℓ1}(Σ_a) = span{R_{ℓ1} V^t}.   (4.4)

Corresponding to the representations of order d_ℓ ≥ 2, for the elementary restriction matrices defined in (3.3) we have R_{ℓk} V^t = √(t/d_ℓ) (A_{k1}^(ℓ) χ, A_{k2}^(ℓ) χ, …, A_{kd_ℓ}^(ℓ) χ)^t. From (4.3), it follows that

H_{ℓk}(Σ_a) = span{R_{ℓk} V^t}.   (4.5)

In both cases, if a is a generic point, the components of the vector R_{ℓk} V^t constitute a basis of H_{ℓk}(Σ_a). Therefore, for any generic point a, the elementary restriction matrix R_{ℓk} represents the projection operator P_{ℓk} from H(Σ_a) onto H_{ℓk}(Σ_a), if we choose V^t as a basis of H(Σ_a).
Now we want to construct elementary restriction matrices R̄_{ℓk} which represent the projection operators P_{ℓk} from H(Σ_a) onto H_{ℓk}(Σ_a) for nongeneric points. Therefore let us suppose a to be a nongeneric point, i.e., such that the functions

χ(x), χ(γ_2 x), …, χ(γ_t x)   (4.6)

are linearly dependent. Let n be the maximum number of linearly independent functions among (4.6) and let the following functions be linearly independent:

χ(γ_{i_1} x), …, χ(γ_{i_n} x).   (4.7)

It is convenient to order the functions (4.7) with increasing index i_α, therefore let us suppose i_1 < i_2 < ⋯ < i_n. In this case the elementary restriction matrices R̄_{ℓk} will have n columns. The number n_ℓ of rows (n_ℓ ≤ d_ℓ) of each R̄_{ℓk} is not determined by i_1, i_2, …, i_n. In general, we can only say that the matrices R̄_{ℓ1}, …, R̄_{ℓd_ℓ} have the same number n_ℓ of rows, where n_ℓ = dim H_{ℓk}(Σ_a).

We now consider a significant class of nongeneric points. Having fixed ℓ̄ (ℓ̄ = 2, …, t), let I_ℓ̄(Γ) be the set of all points a ∈ Γ such that

a = γ_ℓ̄^{-1} a.   (4.8)

From (4.8) it follows, for any i, that χ(γ_i x) = χ(γ_ℓ̄ γ_i x). This implies that the functions (4.6) are naturally subdivided into subsets, each subset containing coincident functions. Then we can obtain elementary restriction matrices for the space H(Σ_a) with a ∈ I_ℓ̄(Γ), starting from the elementary restriction matrices built in Section 3, with the following procedure.

• Let us sum to each column of index i_α (α = 1, …, n) all the columns of index j, with j such that γ_j^{-1} a = γ_{i_α}^{-1} a. We indicate with R̃_{ℓk} the matrices obtained, all with d_ℓ rows and n columns, but not all of full rank; some of them may be zero matrices.

• Let us extract from the nonzero matrices R̃_{ℓk} submatrices R̂_{ℓk} made up of n_ℓ linearly independent rows.

• Finally, let us construct from the R̂_{ℓk} matrices R̄_{ℓk} by a row-orthonormalization procedure.

The (nonzero) matrices R̄_{ℓk} verify the properties expressed by Theorem 3.1. Furthermore, the matrices R̄_{ℓk}, applied to the vector V^n = (χ(γ_{i_1} x), …, χ(γ_{i_n} x))^t corresponding to a point a ∈ I_ℓ̄(Γ), give vectors whose components constitute a basis for H_{ℓk}(Σ_a). For this reason they represent the projection operators from H(Σ_a) onto H_{ℓk}(Σ_a), for any a ∈ I_ℓ̄(Γ). Then we will say that the matrices R̄_{ℓk}, with n_ℓ rows and n columns, are elementary restriction matrices for the space H(Σ_a) relative to points a ∈ I_ℓ̄(Γ). Furthermore, n = Σ_{ℓ=1}^q d_ℓ n_ℓ.
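The three-step procedure can be followed by hand for G = {identity, reflection} at a fixed point a = γ_2^{-1} a: the two orbit columns coincide and are summed, after which the matrix from the trivial representation orthonormalizes to (1) while the one from the sign representation collapses to zero. A sketch (illustrative, 1-row matrices only):

```python
# The Section 4 procedure at a nongeneric point, for the two-element
# reflection group: merge coincident columns, then row-orthonormalize,
# discarding matrices that become zero.

def merge_columns(row, groups):
    """Sum the columns of a 1-row matrix over each group of coincident indices."""
    return [sum(row[j] for j in g) for g in groups]

def normalize(row):
    """Row orthonormalization for a single row: scale to unit norm, or
    return None for a (numerically) zero row."""
    norm = sum(c * c for c in row) ** 0.5
    return [c / norm for c in row] if norm > 1e-12 else None

s = 2 ** -0.5
R11 = [s, s]       # trivial representation, from (3.1)
R21 = [s, -s]      # sign representation, from (3.1)

groups = [[0, 1]]  # at the fixed point both orbit columns carry the same node
Rbar11 = normalize(merge_columns(R11, groups))  # survives as [1.0]
Rbar21 = normalize(merge_columns(R21, groups))  # zero matrix: discarded
```

Only the symmetric component survives at a fixed point, reflecting that n_ℓ drops below d_ℓ there.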
5  H(Σ) spaces and restriction matrices

Let Γ be the piecewise smooth boundary of Ω, and let Σ be a set formed by N points of Γ constituting a not necessarily uniform mesh defined on Γ. Let us suppose Γ and Σ invariant with respect to G. Let H(Σ) be the vector space of real functions defined on Σ. H(Σ) is an N-dimensional vector space, invariant with respect to G; this is due to the fact that Σ is invariant with respect to G. A natural basis B in H(Σ), invariant with respect to G, is formed by the functions having value 1 in one point of Σ and 0 in the remaining points. In order to more easily construct restriction matrices for the space H(Σ), or equivalently for the mesh Σ, it is convenient to introduce in the set Σ the following equivalence relation.

Definition 5.1  We say that a point a′ is equivalent to a″ if there exists an element γ_i ∈ G such that a″ = γ_i^{-1} a′ (and therefore a′ = γ_i a″).

The points of the set Σ are then subdivided into r equivalence classes. If r = 1, one has H(Σ) = H(Σ_a), with a ∈ Σ. Then let us suppose r ≥ 2. We order the points of the set Σ as follows: having indicated with a_1, …, a_r pairwise inequivalent points of Σ, we consider the ordered points

a_1, γ_2^{-1} a_1, …, γ_t^{-1} a_1,  a_2, γ_2^{-1} a_2, …, γ_t^{-1} a_2,  …,  a_r, γ_2^{-1} a_r, …, γ_t^{-1} a_r.   (5.1)

If the points (5.1) are distinct, we have N = rt. If some points among (5.1) coincide, we erase from the sequence (5.1) a point whenever it is equal to a previous one. Then a sequence of N points, with N < rt, will remain, with n^(1) points equivalent to a_1, n^(2) equivalent to a_2, …, n^(r) equivalent to a_r. In both cases H(Σ) = H(Σ_{a_1}) ⊕ H(Σ_{a_2}) ⊕ ⋯ ⊕ H(Σ_{a_r}), with dim H(Σ_{a_j}) = n^(j) ≤ t, j = 1, …, r, and N = n^(1) + n^(2) + ⋯ + n^(r). We indicate by C_{ℓk}^(j) the elementary restriction matrices relative to the space H(Σ_{a_j}), constructed as indicated in Section 4. Let n_ℓ^(j) be the number of rows of the matrix C_{ℓk}^(j); having fixed j, the number of columns of the matrices C_{ℓk}^(j), for any ℓ and k, is n^(j). We consider therefore the following M block matrices
R_{ℓk} = ( C_{ℓk}^(1)     0      ⋯     0
              0      C_{ℓk}^(2)  ⋯     0
              ⋮           ⋮            ⋮
              0           0      ⋯  C_{ℓk}^(r) ),   (5.2)

with N_ℓ = n_ℓ^(1) + n_ℓ^(2) + ⋯ + n_ℓ^(r) rows and N columns, from which we have to eliminate the possible zero rows. The matrices R_{ℓk} determined by this procedure, which we call restriction matrices for the space H(Σ) of dimension N, have rank equal to the number N_ℓ of the remaining rows, and for these matrices the properties expressed in Theorem 3.1 still hold. In both cases, we have the following theorem.
Theorem 5.2  Considering the basis B in H(Σ) as a column vector V^N, with the order deduced from (5.1), the components of the vector R_{ℓk} V^N form a basis of H_{ℓk}(Σ). Therefore the M matrices R_{ℓk}, having fixed in H(Σ) the ordered basis V^N, determine a decomposition of H(Σ) into M subspaces, which coincides with the one obtained with the projection operators P_{ℓk}.
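The assembly (5.2) followed by the removal of zero rows can be sketched directly. The per-orbit matrices below are illustrative: one generic two-node orbit and one fixed point whose matrix for this representation is zero.

```python
# Block assembly (5.2) of a restriction matrix from per-orbit elementary
# restriction matrices, followed by removal of zero rows.

def block_diag(blocks):
    rows = sum(len(b) for b in blocks)
    cols = sum(len(b[0]) for b in blocks)
    M = [[0.0] * cols for _ in range(rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, val in enumerate(row):
                M[r0 + i][c0 + j] = val
        r0 += len(b)
        c0 += len(b[0])
    return M

def drop_zero_rows(M, tol=1e-12):
    return [row for row in M if max(abs(val) for val in row) > tol]

s = 2 ** -0.5
C1 = [[s, s]]   # elementary restriction matrix of a generic orbit
C2 = [[0.0]]    # zero matrix of a fixed-point orbit for this representation
R_lk = drop_zero_rows(block_diag([C1, C2]))  # one row, three columns
```

The resulting matrix has rank equal to its number of remaining rows, as stated after (5.2).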
Preliminary numerical results appear promising; algorithms for potential and linear
elasticity problems are being implemented on parallel processors to analyse the efficiency
of the proposed approach.
Bibliography
1. A. Aimi, L. Bassotti, and M. Diligenti, Groups of Congruences and Restriction Matrices, submitted to BIT.
2. A. Aimi and M. Diligenti, Hypersingular kernel integration in 3D Galerkin boundary element method, J. Comp. Appl. Math., 138, 1, (2002), 51-72.
3. L. Bassotti Rizza, Operatori lineari T-invarianti rispetto ad un gruppo di congruenze, Ann. Mat. Pura ed Appl., 148, (1987), 173-205.
4. J. L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications I, Springer-Verlag, Berlin, Heidelberg, New York, 1972.
5. V. I. Smirnov, Linear Algebra and Group Theory, McGraw-Hill, New York, 1961.
The numerical simulation of the qualitative behaviour
of Volterra integro-differential equations
John T. Edwards, Neville J. Ford and Jason A. Roberts
j.edwards@chester.ac.uk, njford@chester.ac.uk, j.roberts@chester.ac.uk
Chester College, Parkgate Road, Chester, CHI 4BJ, UK.
Abstract
We consider the qualitative behaviour of exact and approximate solutions of integral
and integro-differential equations with fading memory kernels. Over long time intervals
the errors in numerical schemes may become so large that they mask some important
properties of the solution. One frequently appeals to stability theory to address this
weakness, but it turns out that, in some of the model equations we have considered,
there remains a gap in the analysis.
We consider a linear problem of the form

y'(t) = -∫₀ᵗ e^{-λ(t-s)} y(s) ds,    y(0) = 1,
and we solve the equation using simple numerical schemes. We outline the known stability behaviour of the problem and derive the values of λ at which the true solution
bifurcates. We give the corresponding analysis for the discrete schemes and highlight
that, for particular stepsizes, the methods give unexpected behaviour and we show that,
as the step size of the numerical scheme decreases, the bifurcation points tend towards
those of the continuous scheme. We illustrate our results with some numerical examples.
1 Introduction
The qualitative behaviour of numerical approximations to solutions of functional differential equations is an important area for analysis. We aim to investigate whether the
behaviour of the numerical solution reflects accurately that of the true solution. We are
particularly concerned with the behaviour of the solution over long time periods when (in
particular) the convergence order of the method gives us limited insight, since the error
depends on a constant that grows with the time interval. Many authors are concerned
with stability of solutions and of their numerical approximations. We have considered
elsewhere (see [7]) the stability of numerical solutions of equations of this type (and of
non-linear extensions). This analysis raised a number of questions, which we consider
here, about just how well the full range of qualitative behaviour of even quite a simple
equation is understood.
Bifurcations (by which we shall mean any change in the qualitative behaviour of
solutions) frequently arise only for systems or for higher order problems and therefore
one is particularly interested in finding suitable simple equations as the basis for analysis.
In this paper, we consider the solution by numerical techniques of the integro-differential
equation
y'(t) = -∫₀ᵗ e^{-λ(t-s)} y(s) ds,    y(0) = 1.    (1.1)
The equation is a linear convolution equation with separable fading memory convolution kernel and therefore is a simple example from an important class of problems
familiar in applications. It is also possible to analyse the equation in the form of a second
order ordinary differential equation.
The equation has several key properties that make it an ideal basis for our analysis:
1. it depends on the value of the single parameter λ,
2. when λ varies through real values, four distinctive qualitative behaviours in the
solution can be detected, and
3. equations with exponential convolution kernels frequently arise in applications and
elsewhere in the literature.
For λ real and positive, the kernel is of fading memory type. For λ real and negative, the kernel has a growing memory effect. This linear equation displays surprisingly rich dynamical behaviour for real values of the parameter λ and it is this behaviour that we want to consider for the numerical scheme. We note that the classical test equation

y'(t) = g(t) + ξ y(t) + η ∫₀ᵗ y(s) ds,    η ≠ 0,    (1.2)

([1, 2]) displays the same range of qualitative behaviour possibilities as (1.1) for varying values of the two real parameters ξ, η.
This motivates us to consider equation (1.1) as a prototype problem that is interesting
in its own right and that will also provide insight into the behaviour of more complicated
equations. We propose to give a further analysis, where we consider the boundaries along
which bifurcations occur for equation (1.2) in a sequel [3].
We consider the following questions.
1. Does the numerical scheme display the same four qualitatively different types of
long term behaviour as are found in the true solution?
2. Are the interval ranges for the parameter λ that give rise to the changes in behaviour
of the solution the same as in the original problem?
2 Behaviour of the exact solution
We consider the equation (1.1), which can be shown to have a unique continuous solution (see, for example, [10]). One can easily establish (by considering, for example, an equivalent ordinary differential equation) the general solution

y(t) = A e^{(-λ+√(λ²-4))t/2} + B e^{(-λ-√(λ²-4))t/2},    (2.1)

where A, B are constants. For real values of λ the solution to (1.1) bifurcates (or changes qualitative behaviour) at λ = 0, ±2. We have the following qualitative behaviour.
A1. When λ > 2, y → 0 as t → ∞, with no oscillations.
A2. When 0 < λ < 2, y → 0 as t → ∞, with infinitely many oscillations.
A3. When λ = 0, y(t) = cos(t) (persistent oscillations).
A4. When -2 < λ < 0, the solutions contain infinitely many oscillations of increasing amplitude.
A5. When λ < -2, the solution grows (in magnitude) without any oscillations.
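The classification A1-A5 can be read off from the roots m = (-λ ± √(λ²-4))/2 of the characteristic equation m² + λm + 1 = 0 of the equivalent ordinary differential equation y'' + λy' + y = 0. A minimal sketch (the function name and textual labels are our own):

```python
import cmath


def continuous_behaviour(lam):
    """Classify the long-term behaviour of (1.1) from the roots of
    m^2 + lam*m + 1 = 0, the equivalent ODE y'' + lam*y' + y = 0."""
    disc = lam * lam - 4.0
    roots = [(-lam + cmath.sqrt(complex(disc))) / 2.0,
             (-lam - cmath.sqrt(complex(disc))) / 2.0]
    oscillatory = disc < 0                  # complex pair -> oscillations
    growing = any(r.real > 0 for r in roots)
    if lam == 0:
        return "persistent oscillations"                 # A3
    if growing:
        return ("growing oscillations" if oscillatory    # A4
                else "monotone growth")                  # A5
    return ("decaying oscillations" if oscillatory       # A2
            else "monotone decay")                       # A1


for lam in (3.0, 1.0, 0.0, -1.0, -3.0):
    print(lam, continuous_behaviour(lam))
```

Running this over λ = 3, 1, 0, -1, -3 reproduces the five regimes A1-A5 in turn.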
3 Numerical analysis
To apply a numerical method to an integro-differential equation of the type

y'(t) = f( t, y(t), ∫₀ᵗ k(t, s, y(s)) ds ),    y(0) = y₀,    (3.1)

we write the problem in the form

y'(t) = f(t, y(t), z(t)),    (3.2)
z(t) = ∫₀ᵗ k(t, s, y(s)) ds.    (3.3)
We solve (3.2), (3.3) numerically using a linear multistep method for solving equation
(3.2) combined with a suitable quadrature rule for deriving approximate values of z from
equation (3.3) (see [2]). Such a method is sometimes known as a DQ-method. For linear k-step methods, one also needs to provide a special starting procedure to generate the additional k - 1 initial approximations to the solution that are not given in the equation but are needed by the multistep method on its first application. It turns out that one
needs to choose the quadrature, multi-step method and starting schemes carefully to
ensure that the resulting method is of an appropriate order of accuracy for the work
involved. One should try to choose schemes of the same orders as one another since the
order of the overall method is equal to the lowest of the orders of the three separate
methods (the multistep formula, the starting value scheme and the quadrature) used
to construct it. In this paper we have chosen to focus on one-step methods. There are
two reasons for this: we have thereby avoided the need to construct special starting
procedures which would make our analysis more complicated; as Wolkenfelt showed in
[11], methods with a repetition factor of 1 (such as the ones we consider) are always
stable and we also draw attention (see [9] for example), to the fact that the trapezoidal
rule is an A-stable 1-step method.
For a well-behaved numerical scheme for (3.2), (3.3), we would anticipate four intervals (as with the continuous problem) of λ-values where the solutions to the discrete scheme behave qualitatively differently. However, we know from investigation of bifurcation points for numerical solution of delay differential equations (see [12]), and indeed from stability analysis of integro-differential equations, that the points at which the qualitative behaviour of the solution changes may arise at the wrong values of the parameter. Based on previous experience (see [6]) we would expect this difference to be dependent upon the stepsize h of the numerical method and on the choice of method itself. Furthermore (see, for example, [8], [12]), one might expect the bifurcation points of the discrete
scheme to approach the bifurcation points of the continuous problem as h → 0, and one could anticipate that, for a method of overall order p, the approximation of the true bifurcation point by the bifurcation point of the numerical scheme would also be to O(h^p). We will show in this paper that (for h → 0) the approximation of the bifurcation points in the methods we have chosen is at least to the order of the method.
To keep the analysis reasonably simple, we consider the following discrete form of (3.2). We use a linear θ-method in each case so that we solve the system

y_{n+1} = y_n + h( θ₁ F_n + (1 - θ₁) F_{n+1} ),    n = 0, 1, ...,    (3.4)
F_n = f(nh, y_n, z_n),    (3.5)
z_n = h( θ₂ k(nh, 0, y₀) + Σ_{j=1}^{n-1} k(nh, jh, y_j) + (1 - θ₂) k(nh, nh, y_n) ).    (3.6)
One could choose any combination of θᵢ, 0 ≤ θᵢ ≤ 1, and a natural choice could be θ₁ = θ₂. However, in order to start with a simple method where the algebraic problem is tractable, we have considered first the cases where θ₁ = 0 and we consider a range of values of θ₂. One solves equations of the form

y_{n+1} - y_n = -h²( θ₂ e^{-λ(n+1)h} y₀ + Σ_{j=1}^{n} e^{-λ(n+1-j)h} y_j + (1 - θ₂) y_{n+1} ),    y₀ = y₁ = 1.    (3.7)
Note that we have used a simple procedure to find the additional starting value y₁ = 1. We have observed from the integro-differential equation that y'(0) = 0 and have deduced that y(h) = y(0) will provide a reasonable order 1 starting approximation. This choice of formula implies that we are combining a backward Euler scheme to discretise the differential equation with, respectively, (for θ₂ = 1) the forward rectangular (Euler) rule, (for θ₂ = 1/2) the trapezoidal rule and (for θ₂ = 0) the backward rectangular rule for the quadrature. We will return to consider other combinations of θ₁, θ₂ later.
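A minimal simulation of the recurrence (3.7), with θ₁ = 0 fixed as above and the implicit relation solved directly for y_{n+1}, can be sketched as follows (function and parameter names are our own):

```python
import math


def theta_scheme(lam, h, theta2, nsteps):
    """Simulate (3.7): backward Euler for the ODE part combined with a
    theta2-weighted quadrature for the memory term; y0 = y1 = 1."""
    y = [1.0, 1.0]
    for n in range(1, nsteps):
        # quadrature over the history: weight theta2 at s = 0, 1 at interior points
        quad = theta2 * math.exp(-lam * (n + 1) * h) * y[0]
        quad += sum(math.exp(-lam * (n + 1 - j) * h) * y[j]
                    for j in range(1, n + 1))
        # solve y_{n+1} - y_n = -h^2 (quad + (1-theta2) y_{n+1}) for y_{n+1}
        y.append((y[n] - h * h * quad) / (1.0 + h * h * (1.0 - theta2)))
    return y


# For lam = 3 (> 2) the true solution decays without oscillation,
# and for modest h the scheme should reproduce the decay.
ys = theta_scheme(lam=3.0, h=0.1, theta2=0.5, nsteps=200)
print(abs(ys[-1]))
```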
The equation (3.7) is equivalent to

(1 + h²(1 - θ₂)) y_{n+2} + ( h²θ₂ e^{-λh} - 1 - e^{-λh} ) y_{n+1} + e^{-λh} y_n = 0.    (3.8)

The behaviour of the solution as n → ∞ depends on the roots of the characteristic equation

(1 + h²(1 - θ₂)) k² + ( h²θ₂ e^{-λh} - 1 - e^{-λh} ) k + e^{-λh} = 0.    (3.9)

Any solution of (3.8) will be asymptotically stable if both roots of (3.9) are of magnitude less than one and unstable if either root of (3.9) has magnitude greater than one. The solutions will contain (stable or unstable) oscillations when the roots of (3.9) are complex or, indeed, when at least one root is negative. It follows from this (see [4]) that the bifurcations occur as follows (for reasonably small h > 0).
B1. When λ > (1/h) ln( 1 + 2h² - h²θ₂ + 2√(-h²(h²θ₂ - 1 - h²)) ), y_n → 0 as n → ∞ with no oscillations; we thank the anonymous referee for pointing out this simplified form of the condition.
B2. When (1/h) ln( 1/(1 + h²(1 - θ₂)) ) < λ < (1/h) ln( 1 + 2h² - h²θ₂ + 2√(-h²(h²θ₂ - 1 - h²)) ), y_n → 0 as n → ∞, with infinitely many oscillations.
B3. When λ = (1/h) ln( 1/(1 + h²(1 - θ₂)) ) we obtain persistent oscillations.
B4. When (1/h) ln( 1 + 2h² - h²θ₂ - 2√(-h²(h²θ₂ - 1 - h²)) ) < λ < (1/h) ln( 1/(1 + h²(1 - θ₂)) ), the solutions contain infinitely many oscillations of increasing amplitude.
B5. When λ < (1/h) ln( 1 + 2h² - h²θ₂ - 2√(-h²(h²θ₂ - 1 - h²)) ), the solution grows (in magnitude) without any oscillations.
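The root criterion behind B1-B5 can be checked numerically by solving the quadratic (3.9) directly; a sketch (function names and the imaginary-part tolerance are illustrative choices of ours):

```python
import cmath
import math


def discrete_roots(lam, h, theta2):
    """Roots of the characteristic equation (3.9) of the recurrence (3.8)."""
    a = 1.0 + h * h * (1.0 - theta2)
    b = h * h * theta2 * math.exp(-lam * h) - 1.0 - math.exp(-lam * h)
    c = math.exp(-lam * h)
    disc = cmath.sqrt(complex(b * b - 4.0 * a * c))
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def discrete_behaviour(lam, h, theta2):
    r1, r2 = discrete_roots(lam, h, theta2)
    stable = max(abs(r1), abs(r2)) < 1.0
    # oscillations when the roots are complex or at least one is negative
    oscillatory = abs(r1.imag) > 1e-14 or min(r1.real, r2.real) < 0.0
    return stable, oscillatory


print(discrete_behaviour(3.0, 0.1, 0.5))   # decay, no oscillations (cf. B1)
print(discrete_behaviour(1.0, 0.1, 0.5))   # decay with oscillations (cf. B2)
```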
4 Bifurcation points of the numerical scheme as approximations
to true bifurcation points
We consider now the way in which the bifurcation points of the discrete scheme approximate those of the original problem. We are using a numerical scheme of order 1.
First we consider the value of Λ₁ = (1/h) ln( 1 + 2h² - h²θ₂ + 2√(-h²(h²θ₂ - 1 - h²)) ) as θ₂ varies and h → 0. It is easy to see that, as h → 0, the value Λ₁ satisfies Λ₁ → 2. In fact we can give greater precision to this: we can show that Λ₁ = 2 - θ₂h + O(h²) as h → 0. This means that, for θ-methods in general, our scheme approximates the true value (2) to order 1 (the order of the method), as h → 0. In the particular case θ₂ = 0 the approximation is to order 2.
For Λ₂ = (1/h) ln( 1/(1 + h²(1 - θ₂)) ) it is straightforward to show that stability is lost at a value of λ that approximates the true value (0) to order 1 in general. In fact, for θ₂ = 1, the forward Euler scheme, the approximation is exact for all values of h.
The analysis of Λ₃ = (1/h) ln( 1 + 2h² - h²θ₂ - 2√(-h²(h²θ₂ - 1 - h²)) ) follows in exactly the same way as for Λ₁ and leads to an identical conclusion: the approximation of the bifurcation point λ = -2 is in general to order 1 as h → 0 and to order 2 if θ₂ = 0.
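The limiting behaviour of Λ₁ and Λ₂ can be confirmed numerically; a sketch (function names are our own):

```python
import math


def lam1(h, theta2):
    # Lambda_1: upper bifurcation boundary, cf. B1
    s = 2.0 * math.sqrt(-h * h * (h * h * theta2 - 1.0 - h * h))
    return math.log(1.0 + 2.0 * h * h - h * h * theta2 + s) / h


def lam2(h, theta2):
    # Lambda_2: stability boundary, cf. B3
    return math.log(1.0 / (1.0 + h * h * (1.0 - theta2))) / h


for h in (0.1, 0.05, 0.025):
    print(h, lam1(h, 0.5), lam2(h, 0.5))
# lam1 approaches 2 with error ~ theta2*h, and lam2 approaches 0, as h -> 0
```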
We illustrate our results graphically. Each of the plots shown in Figure 1 illustrates, for varying h, the ranges for the parameter λ where
1. the solutions are unstable due to at least one real root greater than unity in magnitude
(the darkest region in the figures) (exponential growth if the root is positive, growing
oscillations if the root is negative),
2. the solutions are unstable due to growing oscillations (the next darkest region in the
figures),
3. the solutions are stable with asymptotically stable oscillations (the lightest region in
the figure), and
4. the solutions are stable with exponentially stable decay.
We can compare with the right hand plot in Figure 2, which shows the true regions for the original problem, and we can make the following observations.
1. As h → 0 the values of λ at which changes in the behaviour occur approach the true values. This coincides with our previous experience in delay equations (see [8]).
2. There is some extremely surprising behaviour for some values of h > 0.
(a) For the two values θ₂ = 0.5 and θ₂ = 1 we can see that the darkest region is in two parts: in the upper part there is a negative real root of magnitude greater than unity leading to exponentially growing oscillations in the solution; in the lower part there is a positive real root of modulus greater than unity leading to exponential growth in the solutions.
(b) There can be a critical value of h > 0 (h = 1/√θ₂ when θ₂ > 0) at which, for apparently arbitrarily large λ < 0, the numerical solution displays oscillatory behaviour.
(c) There can be an additional thin region (visible only in larger scale versions of the plots) between the darkest and lightest regions in which there is a real negative root of magnitude less than unity leading to decaying oscillations.
(d) For θ₂ = 0.5 and θ₂ = 1 the upper part of the darkest region indicates some really strange behaviour: spurious oscillations may arise for arbitrarily large negative values of λ and even (see Figure 1) for some positive values of λ. Thus we can have the situation (for example for λ small and positive) where the true solution tends to zero while the approximate solution exhibits oscillations of growing magnitude. Alternatively, (for λ large and negative) the true solution could exhibit high index exponential growth while the approximate solution exhibits oscillations. We draw attention also to the fact that, for θ₂ = 0.5 and θ₂ = 1, the stability boundary of the method is made up of parts of the boundaries of two regions, making the prediction of behaviour for varying h > 0 particularly difficult.
We believe that these observations justify our view that more attention needs to
be paid to changes in qualitative behaviour other than stability in reaching a good
understanding of the behaviour of numerical methods for problems of this type.
We can consider next whether these observations are equally true for other choices of numerical method. We present in Figure 2 plots revealing the qualitative behaviour of solutions to equations (3.2), (3.3) with other choices of θ-method. It is easy to see that, even for combinations such as using the trapezium rule for both parts of the discretisation (a method characterised by θ₁ = θ₂ = 0.5 and known to do very well at preserving the stability boundary), there are problems in the preservation of other types of qualitative behaviour when h is not very small. Similarly, we can see that the choice θ₁ = θ₂ = 1 leads to a shrinking range (as h increases) of values of λ that lead to stable oscillatory solutions.
FIG. 1. Bifurcation points as h varies for θ₁ = 0 and θ₂ = 0, 0.5, 1 respectively.
FIG. 2. Bifurcation points as h varies for, respectively, θ₁ = θ₂ = 0.5, 1 and for the analytical problem.
5 Alternative approaches
The particular equation we have considered can be formulated as an integro-differential
equation, as an integral equation or as a second order differential equation. We have
shown in [4] that the interesting and somewhat surprising observations about numerical
behaviour that we made in the previous section also apply in these other formulations.
6 Closing remarks
The results presented in this paper show that the well-established stability theory based
on the analysis of equation (1.2) gives only a very limited insight into the qualitative
behaviour of solutions of the class of convolution equations with exponential memory
kernel that we have considered here. We have observed elsewhere (see [5, 6, 7]) that the
qualitative behaviour of numerical solutions to equations of this type may have surprising
features and our consideration here of the prototype problem (1.1) illustrates how this
unexpected behaviour may arise. We have seen in this paper how oscillations may arise
in the numerical schemes when they should not, and how in other cases the numerical
schemes may suppress genuine oscillatory behaviour. When one seeks good methods based
on a stability analysis, the desire is to focus on those methods where the step-length h> 0
is not subject to some upper bound to ensure the stability of the method. However our
initial observations in this paper have shown that this may well prove an unreasonable
expectation when one is investigating these other changes in qualitative behaviour.
We believe that this paper introduces a range of worthwhile investigations in a field
that is still quite open. Space restrictions have prevented us from considering the behaviour of more general methods in this paper and also from extending our analysis to
consider other problems. The results we have presented here show that, for these simple
methods at least, the bifurcation parameters are approximated in the numerical scheme
to at least the order of the method, for sufficiently small h > 0. It is also very clear
that, even for what appears to be a simple problem, the choice of numerical scheme and
the form in which the problem is presented provide us with a rich source of example
behaviour.
Bibliography
1. H. BRUNNER AND J. LAMBERT, Stability of numerical methods for Volterra integro-differential equations, Computing, 12 (1974), pp. 75-89.
2. H. BRUNNER AND P. J. VAN DER HOUWEN, The numerical solution of Volterra
equations, North-Holland, 1986.
3. J. T. EDWARDS, N. J. FORD, AND J. A. ROBERTS, Numerical approaches to
bifurcations in solutions to integro-differential equations. Proceedings of HERCMA,
(2001).
4. ——, The numerical simulation of an integro-differential equation with exponential memory kernel close to bifurcation points, Tech. Rep. preprint, Manchester Centre for Computational Mathematics, (ISSN 1360 1725) 2001.
5. J. T. EDWARDS AND J. A. ROBERTS, On the existence of bounded solutions to a difference analogue for a nonlinear integro-differential equation. International Journal
of Applied Science and Computations, 6 (1999), pp. 55-60.
6. N. J. FORD, C. T. H. BAKER, AND J. A. ROBERTS, Nonlinear Volterra integro-differential equations: stability and numerical stability of θ-methods, Journal of Integral Equations and Applications, 10 (1998), pp. 397-416.
7. N. J. FORD, J. T. EDWARDS, J. A. ROBERTS, AND L. E. SHAIKHET, Stability
of a difference analogue for a nonlinear integro-differential equation of convolution
type, Tech. Rep. 312, Manchester Centre for Computational Mathematics, October
(ISSN 1360 1725) 1997.
8. N. J. FORD AND V. WULF, The use of boundary locus plots in the identification of bifurcation points in numerical approximation of delay differential equations.
Journal of Computational and Applied Mathematics, 111 (1999), pp. 153-162.
9. J. LAMBERT, Numerical methods for ordinary differential systems, Wiley, 1991.
10. P. LINZ, Analytical and Numerical Methods for Volterra Equations, SIAM, 1985.
11. P. H. M. WOLKENFELT, On the relation between the repetition factor and numerical
stability of direct quadrature methods for second kind Volterra integral equations,
SIAM Journal on Numerical Analysis, 20 (1983), pp. 1049-1061.
12. V. WULF, Numerical analysis of delay differential equations undergoing a Hopf bifurcation, PhD thesis, University of Liverpool, 1999.
Systems of delay equations with small solutions:
a numerical approach
Neville J. Ford and Patricia M. Lumb
Chester College, Parkgate Road, Chester, CHI 4BJ, UK.
njford@chester.ac.uk, P.Lumb@chester.ac.uk
Abstract
We consider systems of delay differential equations of the form
y'(t) = A(t) y(t - 1)

where y ∈ ℝⁿ and A : ℝ → ℝ^{n×n}. We investigate whether a numerical method can be
used to determine whether or not the equation has so-called small solutions. Our work
builds on recent analysis and experimental work completed in the scalar case and we are
able to conclude that, at least when A is a suitable periodic matrix, one can predict small
solutions by using a numerical approximation scheme of fixed step length.
1 Introduction and basic theory
The analysis of delay differential equations, both analytically and numerically, is well-established. One distinctive feature is that even a scalar delay differential equation is an infinite dimensional problem. For, if y satisfies
y'(t) = b(t) y(t - 1)    (1.1)
the initial conditions that need to be specified take the form
y(t) = φ(t),    -1 ≤ t ≤ 0.    (1.2)
This infinite dimensionality has two significant implications for us:
(1) the dimension of a system of delay equations is the same as the dimension of a
scalar delay equation, and
(2) the range of dynamical behaviour among solutions of delay equations is far wider
than would be the case for ordinary differential equations.
In the present paper we are investigating an infinite dimensional property (that of possessing small solutions) where the analysis and results for systems need to be presented
quite separately from those for scalar equations because there are some interesting and
distinctive features.
One way in which delay equations may be analysed is to view the solution operator
as a dynamical system. The dimension of the dynamical system then inherits the infinite
dimensionality of the delay equation itself. Small solutions (those that satisfy x(t)e^{αt} → 0 as t → ∞ for all values of the parameter α) can arise in these infinite dimensional problems but would not be observed in finite dimensional equations. They are important
because, when a delay equation has small solutions, the eigenfunctions and generalised
eigenfunctions of the solution map do not form a complete set. This means that some
standard analytical results do not hold and that particular care must be taken in solving
and analysing the equation.
The easy detection of problems that have small solutions is still, in general, open, but
we have seen [4, 5] that the use of a numerical approximation scheme can lead to good
insights. Here we approximate the delay differential equation using a simple numerical
scheme with fixed step length and then consider the spectrum of the resulting solution
map.
In recent work (see, for example [3, 5]) the scalar case has been considered with
some success. We have been able to see that, for the equation (1.1) with b periodic of
period 1, we can detect the existence of small solutions by exploring the (finitely many)
eigenvalues of the numerical scheme. We also found that it was not necessary to use a
sophisticated numerical scheme for the investigation and this has justified us in focussing
on the trapezium rule as the numerical method in this paper.
For the scalar case (1.1) it is known (see for example [4, 5]) that, when b satisfies the periodicity condition b(t) = b(t - 1), then non-trivial small solutions arise if and only if
the function b changes sign. For the vector-valued case we can give a theorem, recently
proved by Verduyn Lunel ([11]).
Theorem 1.1 Consider the equation

y'(t) = A(t) y(t - 1),    where A(t) = A(t - 1),    (1.3)

and where y ∈ ℝⁿ. The equation has small solutions if and only if at least one of the eigenvalues λᵢ satisfies, for some t,

ℜλᵢ(t⁻) × ℜλᵢ(t⁺) < 0,    λᵢ(t) = 0.    (1.4)
Remark 1.2 We shall describe the property (1.4) using the words an eigenvalue passes through the origin. We note that, even for real matrices A, the eigenvalues may be complex and it could be that a pair of complex conjugate eigenvalues will cross the y-axis away from the origin. In this case the equation has small solutions only if there is some other crossing of the y-axis by an eigenvalue where the crossing does take place at the origin.
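The criterion (1.4) can be checked approximately on a grid over one period. The following sketch pairs eigenvalues between neighbouring grid points by a crude sorted ordering and uses an illustrative closeness tolerance, so it is a heuristic screen rather than a proof; the matrix used for the demonstration is our own example with real eigenvalues, one of which passes through the origin.

```python
import numpy as np


def has_small_solutions(A, n=2000):
    """Grid-based check of criterion (1.4): does an eigenvalue path of
    A(t) change the sign of its real part while passing near zero?"""
    ts = np.linspace(0.0, 1.0, n, endpoint=False)
    prev = np.sort_complex(np.linalg.eigvals(A(ts[0])))
    for t in ts[1:]:
        cur = np.sort_complex(np.linalg.eigvals(A(t)))
        for p, c in zip(prev, cur):
            # real part changes sign AND the path is near the origin
            if p.real * c.real < 0 and min(abs(p), abs(c)) < 1e-2:
                return True
        prev = cur
    return False


# Illustrative matrix whose determinant changes sign, so one real
# eigenvalue passes through the origin: small solutions expected.
def A1(t):
    s = np.sin(2 * np.pi * t)
    return np.array([[s + 1.5, s + 0.7], [s + 0.5, s + 0.5]])


print(has_small_solutions(A1))
```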
2 Numerical methods and systems of order two
All the important relevant features of systems of delay equations turn out to be exhibited
in systems of two equations and so we shall focus on these for simplicity. We consider
the equation
y'{t) = A{t)y{t - 1)
for
A e M^^^
and
y e K^
subject to y{t) = (p{t) for -1 < t < 0 and we assume that A{t) = A{t - 1) for all t.
(2.1)
We introduce the notation

A(t) = ( α(t)  β(t)
         γ(t)  δ(t) ).    (2.2)

We apply the trapezium rule with step length h = 1/N and introduce the approximations x_{1,j} ≈ x₁(jh) and x_{2,j} ≈ x₂(jh), j > 0; x_{1,j} = φ₁(jh), x_{2,j} = φ₂(jh), -N ≤ j ≤ 0. Set

y_n = ( x_{1,n}, x_{1,n-1}, ..., x_{1,n-N}, x_{2,n}, x_{2,n-1}, ..., x_{2,n-N} )ᵀ.    (2.3)
We note that, as in the one-dimensional case (see [3, 4, 5]), we can write the numerical scheme as y_{n+1} = A(n) y_n, where the matrix A(n) now takes the block form

A(n) = ( A_α(n)  B_β(n)
         B_γ(n)  A_δ(n) ),    (2.4)

in which each block is of size (N+1) × (N+1): the diagonal blocks A_α(n) and A_δ(n) have first rows ( 1, 0, ..., 0, (h/2)α_{n+1}, (h/2)α_n ) and ( 1, 0, ..., 0, (h/2)δ_{n+1}, (h/2)δ_n ) respectively, with ones on the subdiagonal (shifting the stored history), while the off-diagonal blocks B_β(n) and B_γ(n) are zero apart from their first rows ( 0, ..., 0, (h/2)β_{n+1}, (h/2)β_n ) and ( 0, ..., 0, (h/2)γ_{n+1}, (h/2)γ_n ).
The sequence of matrices {A(n)} is periodic, of period N (since the function A is periodic of period 1), and y₂ = A(1)y₁, y₃ = A(2)A(1)y₁ and so on. Therefore y_{N+1} = C y₁ where C = A(N)A(N-1) ⋯ A(2)A(1).
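The construction of A(n) and C can be sketched as follows, using the trapezium-rule block structure described above; the function names are our own, and the choice of α and δ (with β = γ = 0) follows Example 4.1 below purely as test data.

```python
import numpy as np


def block(f, n, h, N, companion):
    """(N+1)x(N+1) block of A(n) built from the function f: the first
    row carries the trapezium weights (h/2)f((n+1)h), (h/2)f(nh) in its
    last two columns; companion blocks also shift the stored history."""
    B = np.zeros((N + 1, N + 1))
    B[0, N - 1] = 0.5 * h * f((n + 1) * h)
    B[0, N] = 0.5 * h * f(n * h)
    if companion:
        B[0, 0] = 1.0
        for i in range(1, N + 1):
            B[i, i - 1] = 1.0      # shift the history down one slot
    return B


def C_matrix(alpha, beta, gamma, delta, N):
    """C = A(N) A(N-1) ... A(1) for the trapezium-rule scheme."""
    h = 1.0 / N
    C = np.eye(2 * (N + 1))
    for n in range(1, N + 1):
        A_n = np.block([[block(alpha, n, h, N, True),
                         block(beta, n, h, N, False)],
                        [block(gamma, n, h, N, False),
                         block(delta, n, h, N, True)]])
        C = A_n @ C
    return C


alpha = lambda t: np.sin(2 * np.pi * t) + 1.4   # no sign change
delta = lambda t: np.sin(2 * np.pi * t) + 0.5   # changes sign
zero = lambda t: 0.0
eigs = np.linalg.eigvals(C_matrix(alpha, zero, zero, delta, 32))
print(eigs.shape)
```

The eigenvalues `eigs` are the finite eigenspectrum whose shape (closed loops crossing the axis) is used below to diagnose small solutions.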
Remark 2.1 The key to extending our discussion to larger systems, and indeed to gaining a full understanding of the approach, is to note that in both the matrix A(n) and the matrix C the original block structure is retained. Therefore, although the matrices A(n) and C are considerably larger than the original 2×2 matrix A(t) in the problem, they are made up of 4 blocks in a 2 × 2 formation. Indeed the contents of each block are completely determined by our numerical method (the trapezium rule) and the values of the corresponding function, respectively α, β, γ, δ. There is no pollution of the blocks from the neighbouring functions.
We consider three different cases:
(1) β(t) = γ(t) = 0, so that the matrix A is diagonal,
(2) either β(t) = 0 or γ(t) = 0, so that the matrix A is triangular, and
(3) the matrix A is neither diagonal nor triangular.
The first two cases can be dealt with quite quickly because of the fact that real diagonal and triangular matrices have only real eigenvalues and these eigenvalues lie on the diagonal. Therefore in these two cases we need consider only the question of whether the eigenvalues pass through zero; we do not need to concern ourselves with possible complex eigenvalues whose real parts change sign away from the origin.
We can go further: a diagonal matrix A leads to a block diagonal matrix A(n) (with non-zero blocks top left and bottom right). Now by simple matrix theory we know that the eigenvalues of such a matrix are simply the union of the eigenvalues of the two blocks. A similar argument applies when there is a triangular matrix A because the matrices A(n) are then block triangular. It follows that, for both of cases 1 and 2, the 2-dimensional eigenvalue problem simply reduces to two 1-dimensional problems.
Therefore, when we consider the eigenspectra of the numerical schemes in cases 1 and
2, we expect the result to be the superposition of the eigenspectra from the two block
matrices on the diagonal of C.
Case 3 is more complicated and we shall return to it after we give brief examples of
Cases 1 and 2.
3 How to recognise small solutions: our previous work
Space restrictions here prevent us from giving a great many details of our previous work,
but we provide a summary to show how the current investigation builds on the scalar
case. In [3] we considered the eigenspectra of the matrix C. We showed that there were
three characteristic patterns for the eigenspectra, represented by Figure 1. We take the
presence of the closed loops that cross the x-axis to be characteristic of the cases where
small solutions arise.
FIG. 1. Eigenspectra where b(t) has no change of sign on [0,1] (left), where b(t) has a change of sign on [0,1] and ∫₀¹ b(s) ds = 0 (centre), and where b(t) has a change of sign on [0,1] and ∫₀¹ b(s) ds ≠ 0 (right).
4 The cases when β(t) = 0 and/or γ(t) = 0
As we have remarked already, the eigenspectrum when A is diagonal or triangular is just the same as the eigenspectra of the block matrices from the diagonal of C. We expect to find the eigenspectra superimposed, which is indeed what we see in the examples given. Here we assume that at least one of γ(t) or β(t) is zero; the plots are then independent of the values taken by the other.
Example 4.1 We solve (2.1) with the choice α(t) = sin 2πt + 1.4 and δ(t) = sin 2πt + 0.5. Here α does not change sign but δ does change sign. We expect small solutions and Figure 2 provides confirmation.
Example 4.2 Now we solve (2.1) with α(t) = sin 2πt and with a piecewise-defined δ(t). This time both α and δ change sign and we expect small solutions (see Figure 2).
FIG. 2. Eigenspectra for Example 4.1 (left) and Example 4.2 (right).
4.1 The general two dimensional case
We now move on to consider the case when neither of β(t), γ(t) is identically zero. In this situation the eigenvalues of A(t) can be complex and so may cross the y-axis away from the origin.
First, we recall that det(A) is the product of the eigenvalues of A so that, by Theorem 1.1, it follows that det(A) = 0 is a necessary condition for small solutions. However this condition cannot be used to characterise equations where small solutions arise; if the eigenvalues of A are real and one passes through the origin, then det(A) will change sign. If the eigenvalues of A are a complex conjugate pair and cross the y-axis at the origin then det(A) will instantaneously take the value zero but will otherwise remain positive (the same behaviour as when a real eigenvalue becomes zero but does not change sign). Therefore one cannot expect a change of sign in det(A) whenever there are small solutions. The fact that the trace of A is the sum of the eigenvalues of A can be used to characterise this case.
We summarise. For a real matrix A:
(1) if det(A) changes sign then there are small solutions,
(2) if det(A) becomes zero instantaneously and trace(A) simultaneously changes sign then there are small solutions,
(3) if det(A) becomes zero instantaneously and trace(A) does not simultaneously change sign then there are no small solutions indicated.
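A grid-based sketch of this determinant/trace test follows; the tolerance standing in for "instantaneously zero" is an illustrative choice of ours, and the sign-change tests are taken between neighbouring grid points.

```python
import numpy as np


def small_solution_test(A, n=4000):
    """Apply the determinant/trace test on a grid over one period."""
    ts = np.linspace(0.0, 1.0, n, endpoint=False)
    dets = np.array([np.linalg.det(A(t)) for t in ts])
    traces = np.array([np.trace(A(t)) for t in ts])
    if np.any(dets[:-1] * dets[1:] < 0):
        return True                      # case (1): det changes sign
    near_zero = np.abs(dets) < 1e-3      # grid stand-in for "instantaneously zero"
    trace_flip = np.concatenate([traces[:-1] * traces[1:] < 0, [False]])
    return bool(np.any(near_zero & trace_flip))   # case (2)


# Example 4.3, Case 1 below: the determinant changes sign,
# so small solutions are expected.
A_case1 = lambda t: np.sin(2 * np.pi * t) + np.array([[1.5, 0.7],
                                                      [0.5, 0.5]])
print(small_solution_test(A_case1))
```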
Example 4.3 We first consider the case when the matrix A takes the form

A(t) = ( sin 2πt + a    sin 2πt + b
         sin 2πt + c    sin 2πt + d ).

By judicious choice of the constants a, b, c, d one can produce different types of behaviour. One can see that |A(t)| = (a + d - b - c) sin 2πt + (ad - bc). We will illustrate with the following choices of the constants
Case 1: a = 1.5, b = 0.7, c = 0.5, d = 0.5, where the determinant changes sign,
Case 2: a = -2, b = 0.8, c = 1.8, d = 0.7, where, again, the determinant changes sign,
Case 3: a = 1.6, b = 0.8, c = 1.8, d = 0.7, where the determinant never becomes zero.
From the plots for Cases 1 and 2, we can easily see the presence of small solutions in the eigenspectra shown in Figure 3. In Case 3, the eigenspectra in Figure 3 indicate that, as expected, no small solutions are present.
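The reduced form of the determinant quoted above can be checked directly (a hedged sketch; the helper names are ours):

```python
import math

def det_A(t, a, b, c, d):
    """Determinant of A(t) = [[sin 2pi t + a, sin 2pi t + b],
                              [sin 2pi t + c, sin 2pi t + d]], expanded directly."""
    s = math.sin(2 * math.pi * t)
    return (s + a) * (s + d) - (s + b) * (s + c)

def det_A_reduced(t, a, b, c, d):
    """The sin^2 terms cancel: |A(t)| = (a + d - b - c) sin 2pi t + (ad - bc)."""
    return (a + d - b - c) * math.sin(2 * math.pi * t) + (a * d - b * c)
```

With the Case 3 constants the reduced form is −0.3 sin 2πt − 0.32, which stays strictly negative, matching the observation that the determinant never becomes zero.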
FIG. 3. Case 1, Case 2 and Case 3.
Example 4.4 Next, we consider the case when the matrix A takes the form

A(t) = [ sin 2πt           sin 2πt + b ]
       [ −(sin 2πt + b)    sin 2πt     ]
We choose the constant b in the following ways:
Case 4: b = 0 so that det(A) becomes instantaneously zero at the same value of t at which trace(A) changes sign and the complex eigenvalues of A cross the y-axis at the origin.
Case 5: b = 0.05 so that the complex eigenvalues of A cross the y-axis away from the origin.
Here we can see that the characteristic shapes familiar from our earlier work are not reproduced and further investigation is called for. We remark that (in the zoomed versions) the eigenspectrum where small solutions arise passes through the origin. This property is reproduced in all other examples that we have tried.
Example 4.5 Now we consider the case when the matrix A takes the form

A(t) = [ t         t + b ]
       [ −t − b    t     ]
FIG. 4. Left: Case 4. Right: Case 5 and (below) zoomed versions.
for t ∈ [−0.5, 0.5), with A(t) = A(t − 1) for t ≥ 0.5. It then follows that A has complex eigenvalues that cross the y-axis at y = b when t = 0. We plot the eigenspectra for
Case 6: b = 0 so the eigenvalues of A cross the y-axis at the origin,
Case 7: b = 0.01 so the eigenvalues of A cross the y-axis away from the origin.
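That the eigenvalues of this family really are t ± i(t + b) can be confirmed with the quadratic formula for a 2×2 matrix (a sketch; the helper name is ours):

```python
import cmath

def eigs_2x2(a11, a12, a21, a22):
    """Eigenvalues of a real 2x2 matrix from its trace and determinant."""
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4 * det)   # purely imaginary when det > (tr/2)^2
    return (tr + disc) / 2, (tr - disc) / 2
```

For A(t) = [[t, t + b], [−t − b, t]] this gives t ± i(t + b); at t = 0 the pair sits on the y-axis at distance b from the origin, exactly as used in Cases 6 and 7.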
FIG. 5. Left: Case 6. Right: Case 7 and (below) zoomed versions.
5
Conclusions
We have seen that it is easy to extend the detection of small solutions by numerical methods from one-dimensional to two-dimensional problems where the eigenvalues are real. Initial experiments indicate that the method also works for problems possessing complex eigenvalues, but here the patterns that arise in the eigenspectra plots are unfamiliar and require further investigation. However, based on our experimental evidence, it seems that small solutions arise in the latter case if and only if the eigenspectra plots pass through the origin.
Bibliography
1. O. Diekmann, S. A. van Gils, S. M. Verduyn Lunel, H.-O. Walther, Delay Equations, Springer Verlag, New York, 1995.
2. Y. A. Fiagbedzi, Characterization of small solutions in functional differential equations, Appl. Math. Lett. 10 (1997), 97-102.
3. N. J. Ford, P. M. Lumb, Numerical approaches to delay equations with small solutions, Proceedings of HERCMA 2001, to appear.
4. N. J. Ford, S. M. Verduyn Lunel, Numerical approximation of delay differential equations with small solutions, Proceedings of 16th IMACS World Congress on Scientific Computation, Applied Mathematics and Simulation, Lausanne 2000, paper 173-3, New Brunswick, 2000 (ISBN 3-9522075-1-9).
5. N. J. Ford, S. M. Verduyn Lunel, Characterising small solutions in delay differential equations through numerical approximations, Applied Mathematics and Computation, to appear.
6. N. J. Ford, Numerical approximation of the characteristic values for a delay differential equation, MCCM Numerical Analysis Report No 350, Manchester University, 1999 (ISSN 1360 1725).
7. J. K. Hale and S. M. Verduyn Lunel, Introduction to Functional Differential Equations, Springer Verlag, New York, 1993.
8. D. Henry, Small solutions of linear autonomous functional differential equations, J. Differential Equations 8 (1970), 494-501.
9. S. M. Verduyn Lunel, A sharp version of Henry's theorem on small solutions, J. Differential Equations 62 (1986), 266-274.
10. S. M. Verduyn Lunel, Series expansions and small solutions for Volterra equations of convolution type, J. Differential Equations 85 (1990), 17-53.
11. S. M. Verduyn Lunel, private communication.
On an adaptive mesh algorithm with minimal
distance control
Kamal Shanazari and Ke Chen
Department of Mathematical Sciences, The University of Liverpool,
Liverpool L69 7ZL, UK.
{kamals, k.chen}@liv.ac.uk
Abstract
In this paper, we present a new technique for generating error-equidistributing meshes that satisfy both local quasi-uniformity and a preset minimal mesh spacing. This is done first in the one-dimensional case by extending the Kautsky and Nichols method [6] and then in the two-dimensional case by generalizing the tensor product methods to alternating curved-line equidistributions. With the new meshing approach, we have achieved better accuracy in approximation using interpolatory radial basis functions (RBFs). Furthermore, improved accuracy in numerical results has been obtained for a class of linear and non-homogeneous PDEs solved by the dual reciprocity method (DRM).
1
Introduction
Adaptive mesh algorithms have been widely used in the numerical solution of partial differential equations (PDEs) for boundary value problems [1, 13]. One undesirable feature of an error-equidistributing mesh is that there is no guarantee of it being sufficiently smooth. For our applications of interpolation (using RBFs), the distance between points becoming too small can imply that the underlying interpolation matrix becomes ill-conditioned.
In this paper, we propose a method to deal with this problem in Section 2. Essentially
our method consists of modifying the error monitor function in a suitable way and
then equidistributing the new function so that the minimal mesh size constraint can be
satisfied. We deal with the extension of adaptive mesh to two dimensions in Section 3.
Finally, some numerical results will be given in Section 4.
2
An adaptive mesh with minimal mesh size control
In the 1D case, a typical adaptive mesh problem can be stated as follows: given a mesh (uniform or non-uniform) t_0, t_1, ..., t_m and its corresponding error values (usually estimated from the numerical solution using a monitor function [5]) f_0, f_1, ..., f_m, we wish
Support of a studentship by the Ministry of Education (Iran) is gratefully acknowledged.
to find a new mesh

Π : x_0, x_1, ..., x_n,   (2.1)

that is locally bounded with respect to a positive constant k > 1, such that 1/k ≤ h_j/h_{j−1} ≤ k, j = 1, 2, ..., n − 1, with h_j = x_{j+1} − x_j, while the errors are equidistributed on mesh Π. One solution to this problem was given in [6] by replacing f_j by f̃_j followed by a standard equidistribution algorithm; f̃_j is referred to as the padded function, and the main idea of replacing f_j is to increase the values of the function f, where too small, to prevent considerably large mesh sizes. We now propose a method of further modifying f̃_j in such a way that the resulting equidistribution mesh satisfies the preset minimal mesh size h_min. Before proceeding, we consider replacing the piecewise linear function f(x) (with
endpoint values f_j = f(t_j)) by another piecewise linear function Z(x) (with endpoint values Z_j = f(x_j)). This is a technical approximation to simplify the presentation; actually the proposed method may work without this step. Note that if we were to equidistribute Z(x), the resulting mesh would not differ much from x_j; define the average value of the monitor function as
d' = d'(Z) = (1/n) Σ_{j=0}^{n−1} (Z_j + Z_{j+1}) h_j / 2.   (2.2)
Our aim now is to modify some Z_j values so that the modified average value is the same as d', while the modified values ensure that a preset minimal mesh size h_min is satisfied. To present our method, we note that insisting on h_j ≥ h_min implies Z_j ≤ Z, where

Z h_min = d',   (2.3)

and Z is the critical constant needed to realize h_min. This points to a way of modifying those large values of Z_j. However it is not obvious how to ensure that the original and modified average values are the same, i.e. that equidistribution is maintained for the same error constant. Suppose that among the current Z_j values there are M + 1 of them that are larger than Z (i.e. whose corresponding mesh size is less than h_min); denote these values by Z_{k_j} for j = 0, 1, ..., M. This means that Z_{k_j} ≤ Z for j = M + 1, M + 2, ..., n. Here the sequence k_0, k_1, ..., k_n represents a permutation of 0, 1, 2, ..., n.
It turns out that a suitable modification (from Z_j to Z̃_j) is the following:

(i) Z̃_{k_j} = Z   when Z_{k_j} > Z, i.e. for j = 0, 1, ..., M,

(ii) Z̃_{k_j} = Z_{k_j} + [ Σ_{l=0}^{M} (Z_{k_l} − Z) ĥ_{k_l} ] / [ Σ_{l=M+1}^{n} ĥ_{k_l} ]   for j = M + 1, M + 2, ..., n,   (2.4)

where

ĥ_{k_i} = (h_{k_i} + h_{k_i − 1})/2   when k_i ≠ 0, n,
ĥ_{k_i} = h_0/2                      when k_i = 0,
ĥ_{k_i} = h_{n−1}/2                  when k_i = n.   (2.5)
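A minimal sketch of the modification (2.4)-(2.5), under our own function and variable names (the weighted average uses the trapezoidal node weights ĥ_j of (2.5); at least one value is assumed to lie below the cap):

```python
def modify_monitor(Z, h, h_min):
    """Cap monitor values above the critical constant Zbar = d'/h_min (2.3)
    and redistribute the removed weighted mass over the remaining nodes,
    so that the weighted average d' of (2.2) is preserved."""
    n = len(Z) - 1
    hhat = [h[0] / 2] + [(h[j] + h[j - 1]) / 2 for j in range(1, n)] + [h[n - 1] / 2]
    d = sum((Z[j] + Z[j + 1]) * h[j] / 2 for j in range(n)) / n   # (2.2)
    Zbar = d / h_min                                              # (2.3)
    big = [j for j in range(n + 1) if Z[j] > Zbar]
    rest = [j for j in range(n + 1) if Z[j] <= Zbar]
    excess = sum((Z[j] - Zbar) * hhat[j] for j in big)
    shift = excess / sum(hhat[j] for j in rest)
    Znew = [Zbar if j in big else Z[j] + shift for j in range(n + 1)]
    return Znew, Zbar, hhat
```

Note that values just below the cap can be pushed above it by the redistribution; this is exactly the situation the refinement discussed further on (and re-checked in Step 6 of Algorithm 2.3) addresses.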
For a simple illustration, see the plot of Fig 1b. To prove that the above modification is suitable, we first present the following result for a simple case.
Theorem 2.1 Let x_0, x_1, ..., x_n be a non-uniform mesh with mesh sizes h_j = x_{j+1} − x_j and let Z_0, Z_1, ..., Z_n be the corresponding error values. If the critical constant Z is as in (2.3) and only one value Z_1 > Z (i.e. M = 1 and all other Z_j are less than or equal to Z), the modification (2.4) takes the following form:

(i) Z̃_0 = Z_0,   Z̃_1 = Z,

(ii) Z̃_j = Z_j + [ (Z_1 − Z)(h_0 + h_1)/2 ] / Σ_{l=2}^{n} (h_l + h_{l−1})/2   for j = 2, 3, ..., n.

Then the average value d̃ = d̃(Z̃) of the modified values Z̃_j is the same as d' = d'(Z) in (2.2).
Note M = 1 here; in fact the result holds for any one value Z_j > Z. Now we are ready to present the main result on equation (2.4) with regard to minimal mesh size control.
Theorem 2.2 With the error function modified as in (2.4), the new mesh h_j resulting from equidistribution satisfies (i) the average error value remains d'; (ii) h_j ≥ h_min. Here h_min cannot be specified to be larger than h = 1/n (the uniform mesh size); practically we found h_min ∈ [h^2, h/2] is adequate. Full proofs of these results will be given in the full version of this paper [10].
In the method in (2.4), the values of Z_{k_j} which are less than but close to Z may become unnecessarily large (e.g. larger than Z) and therefore we propose a further refinement. We can keep some of the Z_{k_j} values which are between Z/2 and Z unchanged. In other words, we only modify the very large and very small values of Z_{k_j} (see the plot of Fig 1b). Then our theorems are still valid but the proofs need minor changes. Finally we summarise our adaptive method with minimal mesh size control as follows (see the plot of Fig 1b for an illustration).
Algorithm 2.3 (Numerical algorithm). For a given non-uniform mesh a = t_0, t_1, ..., t_m = b, the error values f_0, f_1, ..., f_m, and values c and h_min:

(1) Apply the locally bounded mesh algorithm to obtain the new mesh a = x_0 < x_1 < ... < x_n = b which is sub-equidistributing with respect to c and f; that is, for a sufficiently large value of the integer n such that ∫_a^b f < nc, the inequalities

∫_{x_j}^{x_{j+1}} f < c,   j = 0, 1, ..., n − 1,

are satisfied.

(2) Check the minimal mesh size and compare it with h_min. If it is less than h_min, go to Step 3; otherwise stop.
(3) Approximate the padding values Z_j = f̃(x_j) corresponding to the new mesh by using piecewise linear interpolation of the f_i values and calculate the average value

d = (1/n) Σ_{j=0}^{n−1} (Z_j + Z_{j+1}) h_j / 2,   where h_j = x_{j+1} − x_j,
and Z according to Z h_min = d.

(4) Obtain the decreasing arrangement Z_{k_j} of the Z_j by ordering them.

(5) Modify the Z_{k_j} values as follows:

(i) Z̃_{k_j} = Z when Z_{k_j} > Z, assuming that Z_{k_j} > Z for j = 0, 1, ..., M;

(ii) Z̃_{k_j} = Z_{k_j} when Z/2 ≤ Z_{k_j} ≤ Z, assuming that Z/2 ≤ Z_{k_j} ≤ Z for j = M + 1, M + 2, ..., N;

(iii) Z̃_{k_j} = Z_{k_j} + [ Σ_{l=0}^{M} (Z_{k_l} − Z) ĥ_{k_l} ] / Σ_{l=N+1}^{n} ĥ_{k_l}   for j = N + 1, N + 2, ..., n,

where ĥ_{k_l} was introduced in (2.5).

(6) Check the modified values Z̃_{k_j} in stage (iii) of Step 5. If Z̃_{k_j} ≤ Z/2 for all j, go to Step 7; otherwise repeat Step 5.

(7) Perform the equidistribution procedure for the modified values Z̃_{k_j} and obtain the new adaptive mesh.
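The equidistribution step (7) itself can be sketched as follows (our own names; within each cell the cumulative integral of the piecewise-linear monitor is inverted by linear interpolation, which is adequate for a sketch):

```python
def equidistribute(x, Z, n_new):
    """Place n_new+1 points so that each new subinterval carries an equal
    share of the trapezoidal integral of the monitor Z over the mesh x."""
    cum = [0.0]
    for j in range(len(x) - 1):
        cum.append(cum[-1] + (Z[j] + Z[j + 1]) * (x[j + 1] - x[j]) / 2)
    new_x, j = [], 0
    for k in range(n_new + 1):
        target = cum[-1] * k / n_new
        while j < len(x) - 2 and cum[j + 1] < target:
            j += 1                          # find the cell containing the target
        seg = cum[j + 1] - cum[j]
        frac = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        new_x.append(x[j] + frac * (x[j + 1] - x[j]))
    return new_x
```

For a constant monitor this returns the uniform mesh; for a monitor concentrated near one end, the points cluster there.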
3
Extension to two dimensions
The concept of an adaptive mesh in one dimension is well known (see e.g. [5, 3]). Extension of this idea to two dimensions is not straightforward. For a given function f(x, y) and 2D domain Ω, an obvious extension is to divide the domain Ω into subdomains Ω_i in such a way that

∫∫_{Ω_i} f(x, y) dx dy = constant.   (3.1)
FIG. 1. In Fig (a) the monitor values corresponding to the new mesh are represented by '*' and the linear interpolation of these values is shown by '-.'; in Fig (b) the modified values of the padded function, represented by a dashed line, are compared with the original values.
FIG. 2. In Fig (a) equidistribution of slabs in the two coordinate directions and in Fig (b) three stages of the new method are shown.
But such a partition is not unique and, furthermore, satisfying condition (3.1) properly is not simple. Consequently, this condition has to be replaced. Among the methods given to satisfy condition (3.1) as far as possible, two well-known approaches are transformation and dimension reduction. Transformation methods are based on mapping the physical domain onto a simple domain with a uniform mesh and then applying the equidistribution condition to obtain an adaptive mesh in the physical domain [4, 12]. These methods are generally costly and complicated in theory. In this work we first consider the latter method, which is easier and cheaper than the former. We then present a new technique to generate a 2D mesh.
3.1
Dimension reduction
We assume that Ω is a rectangle of the form Ω = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d}. A simple idea is to produce the mesh

a = x_0 < x_1 < ... < x_{n−1} < x_n = b,
c = y_0 < y_1 < ... < y_{m−1} < y_m = d,

such that

∫_{x_i}^{x_{i+1}} ∫_{y_0}^{y_m} f_x(x, y) dy dx = constant,   (3.2)

and

∫_{y_j}^{y_{j+1}} ∫_{x_0}^{x_n} f_y(x, y) dx dy = constant,   (3.3)

where f_x(x, y) and f_y(x, y) are the monitors in the x and y directions respectively (see Fig 2a). Obviously the mesh generated by this method is much different from the equidistributing mesh that one expects from (3.1). Another method, which leads to a non-rectangular grid, is dimensional splitting [11]. We now describe a new method of dimension reduction type.
FIG. 3. In Fig (a) the mesh generated by the new method for the function in (3.6) and in Fig (b) the resulting mesh when restricting the minimal mesh size to h_min = h/2 for the same function are shown.
3.2
A new approach for a 2D mesh
The idea is based on the tensor product method but produces a non-rectangular grid. We start with a uniform mesh in a rectangular region Ω and perform the method in three stages. In the first stage, error equidistribution is performed along each line in the horizontal direction (see the first part of Fig 2b), that is,

∫_{x_j}^{x_{j+1}} f_x(x, y_i) dx = constant   for i = 0, 1, ..., m.   (3.4)

In the next stage, the mesh is redistributed in the vertical direction along the new grid lines (see the second part of Fig 2b), that is,

∫_{s_i}^{s_{i+1}} f_y(x_j, y) dy = constant   for j = 0, 1, ..., n,   (3.5)

where s_{i+1} − s_i is the distance between two consecutive points (x_j, y_i) and (x_j, y_{i+1}) along the new lines. In the final stage, equidistribution is repeated in the horizontal direction along the grid lines (the last part of Fig 2b). One can observe that repeating this procedure usually leads to a convergent mesh. According to our experiments, the number of iterations to achieve convergence is at most five. The resulting mesh of this procedure for the function
u(x, y) = e^{−x^2 − y^2}   (3.6)

when applying the arc-length monitor is shown in Figure 3a. The idea of controlling the mesh size can also be applied in this technique. The generated mesh for the same function when the mesh sizes are restricted to h_min = h/2, where h is the mesh size in the case of a uniform mesh, is given in Figure 3b.
4
Numerical examples
In this part the effect of adapting the mesh on the accuracy of interpolation and of the DRM is considered. In the following examples, the infinity norm has been used to measure the
FIG. 4. The resulting meshes when using the new method for the functions in Examples 1 and 2 are shown in Figures (a) and (b) respectively.
Method                          stage     Function (E1)   Derivative   Function (E2)   Derivative
uniform mesh                    —         5.1E-2          9.5E-1       1.3E-2          2.2E-1
Adaptive mesh with control      first     5.4E-3          1.6E-1       2.5E-3          1.3E-2
                                second    5.4E-3          3.0E-1       2.1E-3          1.0E-1
                                third     3.8E-3          3.0E-1       3.7E-3          1.0E-1
Adaptive mesh without control   first     1.4E-2          9.9E-2       2.5E-3          1.5E-2
                                second    2.2E-2          7.5E-1       2.1E-3          1.0E-1
                                third     1.8E-2          6.0E-1       4.5E-3          1.2E-1

TAB. 1. The interpolation error for Examples 1-2 using adaptive meshes with and without control of the mesh sizes.
accuracy, that is, if u and ũ are the exact and approximate values respectively, then the error is calculated as

e_∞ = ||u(x) − ũ(x)||_∞ = max_x |u(x) − ũ(x)|.

A polynomial RBF, 1 + r^3, has been employed in this work.
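To make the setting concrete, here is a small self-contained sketch of RBF interpolation with φ(r) = 1 + r^3 (the helper names are ours, and a tiny Gaussian elimination replaces a library solver so the snippet stays dependency-free; this is not the authors' implementation):

```python
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rbf_interpolant(pts, vals, phi=lambda r: 1.0 + r ** 3):
    """Interpolant s(x) = sum_j c_j phi(|x - x_j|) through the data (pts, vals)."""
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    A = [[phi(dist(p, q)) for q in pts] for p in pts]
    coef = solve(A, list(vals))
    return lambda x: sum(c * phi(dist(x, p)) for c, p in zip(coef, pts))
```

The interpolant reproduces the data at the nodes by construction; the accuracy between nodes is what the adaptive mesh of Sections 2-3 is designed to improve.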
Example 4.1 We check interpolation in terms of the RBFs for the function

u(x, y) = (1 − e^{x−3}) sin(1.5πy),   (4.1)

in a rectangular domain. The generated mesh for this function is shown in Figure 4a. Table 1 shows the effect of the adaptive mesh on the interpolation accuracy with and without control of the mesh sizes. As one can observe, using the adaptive mesh considerably improves the accuracy in comparison with the case of a uniform mesh. Moreover, the result in the case of controlling the minimal mesh size is better.
Example 4.2 In this example we first check the function f_2(x, y) = 0.5 − 0.5 tanh(−4 + 16x^2 + 16y^2) and then solve the linear PDE ∇^2 u + 2 ∂u/∂x + ∂u/∂y + u = d, with Dirichlet boundary conditions, over the elliptic domain x^2 + 4y^2 = 4, where d is a known function such that the exact solution is u(x, y) = f_2(x, y).
Again from Table 1 we see improved approximation. We apply the DRM [7] for the solution, where the domain integrals are approximated by using RBF interpolation. The adaptive mesh for this function is given in Fig. 4b and has been observed to give rise to an improved DRM solution.
5
Conclusions
We have considered a new algorithm for producing a locally bounded mesh with a preset minimal mesh size. Such a mesh is used to overcome the ill-conditioning problems associated with radial basis function interpolation. Extension of the idea to the 2D case is also considered. Some preliminary and improved numerical results are given.
Bibliography
1. Ainsworth, M. and Oden, T. J., A Posteriori Error Estimation in Finite Element Analysis, John Wiley, 2000.
2. Beckett, G., Mackenzie, J. A., Ramage, A. and Sloan, D. M., On the numerical solution of one-dimensional PDEs using adaptive methods based on equidistribution, Journal of Computational Physics, 167 (2), 372-392, 2001.
3. Carey, G. F. and Dinh, H. T., Grading functions and mesh redistribution, SIAM J. Numer. Anal., 22 (5), 1028-1040, 1985.
4. Chen, K., Two-dimensional adaptive quadrilateral mesh generation, Communications in Numerical Methods in Engineering, 10, 815-825, 1994.
5. Chen, K., Error equidistribution and mesh adaptation, SIAM J. Sci. Comput., 15, No 4, 798-818, 1994.
6. Kautsky, J. and Nichols, N. K., Equidistributing meshes with constraints, SIAM J. Sci. Statist. Comput., 1, No 4, 499-511, 1980.
7. Partridge, P. W., Brebbia, C. A. and Wrobel, L. C., The Dual Reciprocity Boundary Element Method, Computational Mechanics Publications, 1992.
8. Pereyra, V. and Sewell, E. G., Mesh selection for discrete solution of boundary problems in ordinary differential equations, Numer. Math., 23, 261-268, 1975.
9. Profit, A., Chen, K. and Amini, S., Application of the DRBEM with adaptive internal points to nonlinear dopant diffusion, Proc. 2nd UK BIE Conf., Brunel University Press, 1999.
10. Shanazari, K. and Chen, K., On an adaptive mesh algorithm with minimal distance control for the dual reciprocity method, in preparation.
11. Sweby, P. K., Data-dependent grids, Numerical Analysis Report 7/87, University of Reading, UK, 1987.
12. Thompson, J. F., Warsi, Z. U. A. and Mastin, C. W., Numerical Grid Generation: Foundations and Applications, North-Holland, 1985.
13. White, A. B., On selection of equidistribution meshes for two-point boundary value problems, SIAM J. Numer. Anal., 19, 472-502, 1979.
An alternative approach for solving Maxwell equations
Wolfgang Sproessig
Freiberg University of Mining and Technology, Germany
sproessig@math.tu-freiberg.de
Ezio Venturino
Politecnico di Torino, Italia
egvv@calvino.polito.it
Abstract
At present the use of hypercomplex methods is pursued by a growing number of mathematicians, physicists and engineers. Quaternionic and Clifford calculus can be applied to wide classes of problems in very different fields of science. We explain Maxwell equations within the geometric algebras of real and complex quaternions. The connection between Maxwell equations and the Dirac equation will be elaborated. Using the Teodorescu transform we will deduce an iteration procedure for solving weak time-dependent Maxwell equations in isotropic homogeneous media. Assuming the so-called Drude-Born-Feodorov constitutive laws, Maxwell equations in chiral media are deduced. Full time-dependent problems will be reduced to the consideration of Weyl operators.
1
Historically oriented introduction
Classical Maxwell equations were discovered in the second half of the nineteenth century as a result of the stormy development of electromagnetic research at that time. The study of these equations has attracted generations of physicists and mathematicians, but some of their secrets are still hidden.
At about the same time, new algebraic structures were also invented. W. R. Hamilton discovered in 1843 the algebra of real quaternions as a generalization of the field of complex numbers. Under the influence of H. Grassmann's extension theory and Hamilton's quaternions, W. K. Clifford created in 1878 a geometric algebra, which is nowadays called Clifford algebra. Its construction starts with a basis of the signed space R^n = R^{p,q} with units e_1, ..., e_n. Assume that e_i^2 = −1 for i = 1, ..., q, and e_j^2 = 1 for the remaining p units, as well as the anticommutator relation

e_i e_j + e_j e_i = 0

for i ≠ j. Together with e_0 = 1 one can construct a basis of the 2^n-dimensional standard Clifford algebra Cl_{p,q}. Incidentally, in 1954 C. Chevalley [5] showed that each Clifford number, i.e. each element of Cl_{p,q}, can be identified with an antisymmetric tensor.
Let us go back to the electromagnetic field equations. Already J. C. Maxwell [15] himself and W. R. Hamilton [10] used these new algebraic techniques to try to simplify Maxwell's equations. The aim was to obtain an equation of the type

Du + au = F

with suitable operators D and a. For this reason Hamilton introduced his "Nabla operator" as well as the notion of a "vector". The tendency towards the algebraisation of physics continued in the first half of the last century. A long list of important publications was devoted to this topic. We only stress here some of the milestones, beginning with the "Theory of Relativity" by L. Silberstein (1914) [18], and H. Weyl's book "Raum-Zeit-Materie" of 1921. Important results of Einstein/Mayer, Lanczos and Proca followed. In 1935 this development culminated in the thesis of M. Mercier (Geneva) [16]. After the reinvention of the concept of "spinors", which first appeared in 1911 in a paper by E. Cartan, D. Hestenes [11, 12, 13], F. Bolinder [3] and M. Riesz [17] wrote fundamental algebra papers with applications in electromagnetic theory, using the framework of Clifford numbers and spinor spaces.
Meanwhile, in the late thirties the famous Swiss mathematician R. Fueter and his co-workers and followers used a function-theoretic approach to the same problems. These ideas were refreshed and fruitfully extended by R. Delanghe and his group and by A. Sudbery in the seventies and early eighties (cf. [4, 20]). Influenced by the success of complex analysis and Vekua theory, a generalized operator theory with corresponding singular integral operators [19] and a corresponding hypercomplex theory for boundary value problems of elliptic partial differential equations were developed [8], [9].
Making use of a transformation of Maxwell's equations into a system of homogeneous
coordinates we will propose an alternative solution method.
2
Maxwell equations
Let G be a bounded domain with sufficiently smooth boundary Γ that is filled with an isotropic homogeneous material.
Using Gaussian units, Maxwell equations read as follows:

c rot H = 4πJ + ∂_t D   (Biot-Savart-Ampère's law)
c rot E = −∂_t B        (Faraday's law)
div D = 4πρ             (Coulomb's law)
div B = 0               (no free magnetic charge)
Furthermore, the continuity condition has to be fulfilled:

div J = −∂_t ρ,

where E = E(t, x) is the electric field, H = H(t, x) the magnetic field, J = J(t, x) the electric current density, D = D(t, x) the electric flux density, B = B(t, x) the magnetic flux density, ρ = ρ(t, x) the charge density, and c is the speed of light in a vacuum.
The relations between flux densities and the electric and magnetic fields depend on
the material. It is well-known that for instance all organic materials contain carbon and
realize in this way some kind of optical activity. Therefore, Lord Kelvin introduced the notion of the chirality measure of a medium. This coefficient expresses the optical activity of the underlying material. The corresponding constitutive laws are the following:

D = ε(E + β rot E),   B = μ(H + β rot H)   (Drude-Born-Feodorov laws),

where ε = ε(t, x) is the electric permittivity, μ = μ(t, x) is the magnetic permeability and the coefficient β describes the chirality measure of the material. In isotropic cases one has the possibility to use the so-called Tellegen representation

D = εE + αH,   B = μH + α*E.
The connection between the electric field E and the current density J is given by

J = σE + σg,

where σ is the electric conductivity and g a given electric source.
Starting with β = 0 and replacing D and B by D = εE and B = μH we get, in the case of ε = ε(x), μ = μ(x):

−ε ∂_t E + c rot H = 4πJ,   (2.1)
μ ∂_t H + c rot E = 0,   (2.2)
ε div E = 4πρ − (∇ε · E),   (2.3)
μ div H = −(∇μ · H).   (2.4)

After summing (2.1) and (2.4) as well as (2.2) and (2.3) we obtain

−ε ∂_t E + c rot H + μ div H = −(∇μ · H) + 4πJ,   (2.5)
μ ∂_t H + c rot E + ε div E = −(∇ε · E) + 4πρ.   (2.6)
In the case of ε, μ being constants we can introduce new functions Ẽ, H̃ which are defined on a homogeneous space with a first coordinate x_0 and the other coordinates x = (x_1, x_2, x_3). We set

E(t, x) =: Ẽ(−t/ε, x/c),   H(t, x) =: H̃(−t/μ, x/c).

The equations (2.5)-(2.6) transform into

∂_0 Ẽ + rot H̃ + μc div H̃ = 4πJ̃,
∂_0 H̃ + rot Ẽ + εc div Ẽ = 4πρ̃,

where below we again write E, H, J, ρ for the transformed quantities.
3
Quaternionic representations
Let e_1, e_2, e_3 be the generating units of the algebra of real quaternions H, which fulfil the conditions

e_i e_j + e_j e_i = −2δ_ij   (i, j = 1, 2, 3).

This leads to the following multiplication rule for two quaternions u = u_0 + u, v = v_0 + v (v_i ∈ R):

uv = u_0 v_0 − u · v + u_0 v + v_0 u + u × v,

where u = u_1 e_1 + u_2 e_2 + u_3 e_3 and v = v_1 e_1 + v_2 e_2 + v_3 e_3 denote the vector parts. Further, let u = u_0 + u be a quaternion. Then ū = u_0 − u is called its conjugate quaternion. The operator defined by

D = ∂_1 e_1 + ∂_2 e_2 + ∂_3 e_3

is called the Dirac operator. It acts on a quaternion-valued function as follows:

Du = −div u + rot u + grad u_0.
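The multiplication rule can be written out componentwise and checked against the defining relations. A sketch under our own naming, with quaternions modelled as 4-tuples (u0, u1, u2, u3):

```python
def qmul(u, v):
    """Quaternion product uv = u0 v0 - u.v + u0 v + v0 u + u x v."""
    u0, u1, u2, u3 = u
    v0, v1, v2, v3 = v
    return (u0 * v0 - u1 * v1 - u2 * v2 - u3 * v3,   # scalar part
            u0 * v1 + v0 * u1 + u2 * v3 - u3 * v2,   # e1 component
            u0 * v2 + v0 * u2 + u3 * v1 - u1 * v3,   # e2 component
            u0 * v3 + v0 * u3 + u1 * v2 - u2 * v1)   # e3 component

def qconj(u):
    """Conjugate u0 - u of the quaternion u0 + u."""
    return (u[0], -u[1], -u[2], -u[3])
```

One can verify directly that e_i e_j + e_j e_i = −2δ_ij, e.g. e_1 e_2 = e_3 while e_2 e_1 = −e_3, and that u ū = |u|^2.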
With the multiplication operator m_θ,

m_θ u = θ u_0 + u   with u = u_0 + u   (θ ∈ R^+),

u = u_1 e_1 + u_2 e_2 + u_3 e_3, we obtain

m_{μc}(∂_0 E + DH) = 4πJ,
m_{εc}(∂_0 H + DE) = 4πρ,

and so

∂_0 E + DH = m_{μc}^{−1} 4πJ,
∂_0 H + DE = m_{εc}^{−1} 4πρ.
Finally, we get

∂(E + H) = ∂_0(E + H) + D(E + H) = 4π(m_{μc}^{−1} J + m_{εc}^{−1} ρ) =: F_1,
∂̄(E − H) = ∂_0(E − H) − D(E − H) = 4π(m_{μc}^{−1} J − m_{εc}^{−1} ρ) =: F_2,

where ∂ = ∂_0 + D is also called the Weyl operator and ∂̄ is its conjugate. By the way, a function u is called quaternionic regular if ∂u = 0 and quaternionic anti-regular if ∂̄u = 0.

For simplicity we set E + H =: v and E − H =: w. Then it follows that

∂v = F_1(v, w),   (3.1)
∂̄w = F_2(v, w).   (3.2)
Let us have a closer look at the functions F_1, F_2. The electric current density J is given by

J = σE + σg,

where E and g are vector functions. This leads to the following simplification:

F_1 = 4π σ(E + g) + 4πρ/(εc) = 2π [σ(v + w) + 2σg + 2ρ/(εc)],
F_2 = 4π σ(E + g) − 4πρ/(εc) = 2π [σ(v + w) + 2σg − 2ρ/(εc)].

Hence

F_2 = −F̄_1.
Thus

∂v = F_1(v, w),   (3.3)
∂̄w = −F̄_1(v, w).   (3.4)

4
Integral representation
Let G be a bounded domain in R^3 and a a positive constant. We consider in R^4 the cylinder Z = G × [−a, a]. A right inverse of the Weyl operator is the following Teodorescu transform:

(T_Z u)(x) = −(1/σ_3) ∫_Z e(x − y) u(y) dy,   Z = G × [−a, a],

with e(x) = x̄/|x|^4 and σ_3 = 2π^2, the area of the unit sphere in R^4. We obtain in a straightforward manner

∂ T_Z u = u in Z,   ∂ T_Z u = 0 in R^4 \ Z,

and

T_Z ∂u + φ_Z = u in Z,   T_Z ∂u + φ_Z = 0 in R^4 \ Z,
with φ_Z ∈ ker ∂. In complete analogy a conjugate Teodorescu transform T*_Z is introduced; we just have to replace e(x) by its conjugate. Now it follows from (3.3), (3.4) that

v = T_Z F_1(v, w) + φ_Z   (∂φ_Z = 0),
w = T*_Z F_2(v, w) + φ*_Z   (∂̄φ*_Z = 0).

Furthermore, we have to introduce Cauchy-Bitsadze-type operators, which are defined by the boundary data. These operators read as follows:
(F_∂Z u)(x) = −(1/σ_3) ∫_{∂Z} e(x − y) n(y) u(y) d(∂Z)_y   (x ∉ ∂Z)

and

(F*_∂Z u)(x) := −(1/σ_3) ∫_{∂Z} ē(x − y) n̄(y) u(y) d(∂Z)_y   (x ∉ ∂Z),

where n(y) = (n_0 + n)(y) denotes the unit vector of the outer normal on ∂Z at the point y.
It can be proved that

φ_Z = F_∂Z v   and   φ*_Z = F*_∂Z w   in Z.
It should be noted that we do not need the whole trace of the functions v and w on the boundary. We just have to consider those parts of tr_∂Z v (tr_∂Z w) which lie in the corresponding Hardy space of functions permitting a quaternionic regular (quaternionic anti-regular) extension into Z, accordingly. We get the integral equations

v = 2πσ T_Z(v + w) + 4π T_Z(σg + ρ/(εc)) + h,   (4.1)
w = 2πσ T*_Z(v + w) + 4π T*_Z(σg − ρ/(εc)) + h*,   (4.2)

where

h = F_∂Z tr_∂Z v   and   h* = F*_∂Z tr_∂Z w.

If h, h* are known then, under smallness conditions, the iteration procedure

v_n = 2πσ T_Z(v_{n−1} + w_{n−1}) + 4π T_Z(σg + ρ/(εc)) + h,
w_n = 2πσ T*_Z(v_{n−1} + w_{n−1}) + 4π T*_Z(σg − ρ/(εc)) + h*,

with v_0 = w_0 = 0, will converge in suitable Banach spaces.
Remark 4.1 A norm estimate for a modified Teodorescu transform of this type is proved in [1].
5
Weak time-dependent Maxwell equations

Assume now ε = ε(x), μ = μ(x), κ = κ(x) (g = 0) and

E(t, x) = E_0(t) E_1(x)   and   H(t, x) = H_0(t) H_1(x),

where the scalar functions E_0 and H_0 are known. Maxwell equations then transform to

c E_0 rot E_1 = −∂_t(μH_0) H_1,   (5.1)
c H_0 rot H_1 = (∂_t(εE_0) + 4πκE_0) E_1,   (5.2)
E_0 [(∇ε · E_1) + ε div E_1] = 4πρ,   (5.3)
(∇μ · H_1) + μ div H_1 = 0.   (5.4)
It follows that

rot E_1 = −(∂_t(μH_0)/(cE_0)) H_1 =: α_0 H_1,

rot H_1 = ((∂_t(εE_0) + 4πκE_0)/(cH_0)) E_1 =: β_0 E_1,

−div E_1 = −4πρ/(εE_0) + (∇ε/ε) · E_1 =: ρ' − α · E_1,

−div H_1 = (∇μ/μ) · H_1 =: −β · H_1.

Here α = α_0 + α and β = β_0 + β, where the vector parts are α := −∇ε/ε and β := −∇μ/μ. Using the fact that in H

Du = −div u + rot u

for vector-valued functions u, we get

D E_1 = α_0 H_1 + ρ' − α · E_1,
D H_1 = β_0 E_1 − β · H_1.
The right inverse of D is the corresponding Teodorescu transform T_G over G ⊂ R^3. A short calculation leads to

E_1 = T_G α_0 H_1 − T_G α · E_1 + T_G ρ' + φ_1,
H_1 = T_G β_0 E_1 − T_G β · H_1 + φ_2,

where φ_i ∈ ker D (i = 1, 2). The iteration method

E_1^{(n)} = −T_G α · E_1^{(n−1)} + T_G α_0 H_1^{(n−1)} + T_G ρ' + φ_1,
H_1^{(n)} = T_G β_0 E_1^{(n−1)} − T_G β · H_1^{(n−1)} + φ_2,

with E_1^{(0)} = H_1^{(0)} = 0 converges in suitable Banach spaces (L_2, W_2^1, C) under smallness conditions.
In the time-harmonic case, i.e. H_0 = E_0 = 1 and ε, μ constants and κ = κ(x), we have

D E_1 = ρ'   and   D H_1 = β_0 E_1.

Setting β_0 = δ^{−1} we obtain

D δ D H_1 = ρ',   i.e.   ΔH_1 = −f,   f := β_0 ρ'.

If boundary values of H_1 are known, i.e. tr_Γ H_1 = g, the complete solution is given by

H_1 = F_Γ g + T_G P_δ D h + T_G Q_δ δ T_G f.   (5.5)
Here P_δ and Q_δ are orthoprojections onto subspaces of the quaternionic Hilbert space L_2(G), namely

L_2(G) = ker D ∩ L_2(G) ⊕_δ D W̊_2^1(G).

The scalar product is defined by

(u, v)_δ := ∫_G ū δ v dG ∈ H.
The operator Vs can be seen as a generalized Bergman projection.
In the representation formula above, F_Γ is the Cauchy-Bitsadze operator on Γ and h is a smooth continuation of g into G. Note that P_δ and Q_δ can be explicitly defined (cf. [9])! Then
E₁ = (μ/(4πκ)) P_δ Dh + Q_δ δ T_G f.
Let us prove that the boundary condition is fulfilled! Indeed,

Q_δ T_G f = Df   with   T_G Df = f − F_Γ f = f,

since f ∈ W̊₂¹, i.e. tr_Γ f = 0 (Borel-Pompeiu's formula).
On the other hand, Plemelj-Sokhotzkij's formulae yield:

tr_Γ H₁ = P_Γ g + tr_Γ T_G P_δ Dh = P_Γ g + tr_Γ T_G Dh − tr_Γ T_G Q_δ Dh
        = P_Γ g + g − P_Γ g + 0 = g.

P_Γ is the so-called Plemelj projection onto the Hardy space of ℍ-regular functions extendible into G.
Bibliography
1. H. Bahmann, K. Guerlebeck, M. Shapiro and W. Sproessig, On a modified Teodorescu transform, Integral Transforms and Special Functions 12 (2001), 213-226.
2. A. W. Bitsadze, On two-dimensional integrals of Cauchy-type, Akademii Nauk Gruz. SSR 16 (1955), 177-184 (Russian).
3. E. F. Bolinder, The classical electromagnetic equations expressed as complex four-dimensional quantities, J. Franklin Inst. 263 (1957), 213-223.
4. F. Brackx, R. Delanghe and F. Sommen, Clifford analysis, Pitman Research Notes in Math., Boston, London, Melbourne, 1982.
5. C. Chevalley, The algebraic theory of spinors, Columbia University Press, New York, 1954.
6. W. K. Clifford, Applications of Grassmann's extensive algebra, Americ. J. of Math. Pure and Appl. 1 (1878), 350-358.
7. R. Fueter, Analytische Theorie einer Quaternionenvariablen, Comment. Math. Helv. 4 (1932), 9-20.
8. K. Guerlebeck and W. Sproessig, Quaternionic Analysis and Boundary Value Problems, Birkhäuser Verlag, Basel, 1990.
9. K. Guerlebeck and W. Sproessig, Quaternionic and Clifford calculus for physicists and engineers, Mathematical Methods in Practice Vol. 1, John Wiley & Sons, 1997.
10. W. R. Hamilton, Elements of Quaternions (2 Vols), Chelsea (reprint 1969), 1866.
11. D. Hestenes, Space-Time Algebra, Gordon and Breach, New York, 1966.
12. D. Hestenes, New foundations for classical mechanics, Reidel, Dordrecht, Boston, 1985.
13. D. Hestenes and G. Sobczyk, Clifford algebras for mathematics and physics, Reidel, Dordrecht, 1985.
14. V. V. Kravchenko and M. Shapiro, Integral representations for spatial models of mathematical physics, Pitman Research Notes in Math. Series 351, 1996.
15. J. C. Maxwell, The Scientific Papers (2 Vols), Dover, 1969.
16. M. Mercier, Expression des équations de l'électromagnétisme au moyen des nombres de Clifford, Thesis Nr. 953, University of Geneva, 1935.
17. M. Riesz, Clifford Numbers and Spinors, Lecture Series 38, Maryland, 1958.
18. L. Silberstein, The theory of relativity, Macmillan, London, 1914.
19. W. Sproessig and E. Venturino, The treatment of window problems by transform methods, Zeitschrift für Analysis und Anwendungen 12 (1996), 643-654.
20. A. Sudbery, Quaternionic analysis, Math. Proc. Cambr. Phil. Soc. 85 (1979), 199-225.
Chapter 3
Metrology
Orthogonal distance fitting of parametric
curves and surfaces
Sung Joon Ahn, Engelbert Westkämper, and Wolfgang Rauh
Fraunhofer Institute for Manufacturing Engineering and Automation (IPA)
Nobelstr. 12, 70569 Stuttgart, Germany
{sja; wke; wor}@ipa.fhg.de
Abstract
Fitting of parametric curves and surfaces to a set of given data points is a relevant
subject in various fields of science and engineering. In this paper, we review the current
orthogonal distance fitting algorithms for parametric models in a well organized and easily
understandable manner, and present a new algorithm. Each of these algorithms estimates
the model parameters minimizing the square sum of the error distances between the model
feature and the given data points. The model parameters are grouped and simultaneously
estimated in terms of form, position, and rotation parameters. The form parameters
determine the shape of the model feature, and the position/rotation parameters describe
the rigid body motion of the model feature. The new algorithm is applicable to any kind
of parametric curve and surface. We give fitting examples for circle, cylinder, and helix
in space.
1 Introduction
The use of parametric curves and surfaces is very common and model fitting to a set of
given data points is a relevant subject in various fields of science and engineering. For
fitting of curves and surfaces, orthogonal distance fitting is of primary concern because
of the applied error definition, namely the shortest distance from the given point to the
model feature [5, 9]. While there are orthogonal distance fitting algorithms for explicit [3] and implicit models [2, 7] in the literature, in this paper we consider fitting algorithms for parametric models [4, 6, 8, 10, 11] (Fig. 1).
The goal of the orthogonal distance fitting is the estimation of the model parameters
minimizing the performance index
σ₀² = (X − X′)ᵀ PᵀP (X − X′)   (1.1)

or

σ₀² = dᵀ PᵀP d,   (1.2)

where Xᵀ = (X₁ᵀ, …, Xₘᵀ) and X′ᵀ = (X′₁ᵀ, …, X′ₘᵀ) are the coordinate vectors of the m given points and of the m corresponding points on the model feature, respectively. Moreover, dᵀ = (d₁, …, dₘ) is the distance vector with dᵢ = ‖Xᵢ − X′ᵢ‖, and PᵀP is the weighting matrix. We call the fitting algorithms based on the performance indexes (1.1) and (1.2) the coordinate-based algorithm and the distance-based algorithm, respectively.
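For an unweighted problem (P = I; a minimal illustration of our own) the two indexes agree, since stacking the coordinate residuals and summing the squared distances count the same quantities:

```python
import numpy as np

X  = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.9]])   # given points X_i
Xp = np.array([[0.1, 0.0], [1.0, 0.6], [2.1, 2.0]])   # corresponding model points X_i'

# Coordinate-based index (1.1) with P = I: stack all coordinates into one vector.
r = (X - Xp).ravel()
sigma2_coord = r @ r

# Distance-based index (1.2) with P = I: square sum of the distances d_i.
d = np.linalg.norm(X - Xp, axis=1)
sigma2_dist = d @ d
```

With a non-trivial weighting matrix P the two indexes generally differ, which is why the choice between (1.1) and (1.2) matters for the algorithms below.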
FIG. 1. Parametric features, and the orthogonal contacting point x′ᵢ in frame xyz from the given point Xᵢ in frame XYZ: (a) Curve; (b) Surface.
In this paper, the model parameters a are grouped and simultaneously estimated
in three categories. First, the form parameters a_g (e.g. three axis lengths a, b, c of an
ellipsoid) describe the shape of the standard model feature defined in model coordinate
system xyz (Fig. 1)
x = x(a_g, u)   with   a_g = (a₁, …, aₗ)ᵀ.   (1.3)
The form parameters are invariant to the rigid body motion of the model feature. The second and the third parameter groups, respectively the position parameters a_p and the rotation parameters a_r, describe the rigid body motion of the model feature in machine coordinate system XYZ:

X = R⁻¹x + X₀   or   x = R(X − X₀),   (1.4)

where

R = R_κ R_φ R_ω = (r₁ r₂ r₃)ᵀ,   R⁻¹ = Rᵀ,

and

a_r = (ω, φ, κ)ᵀ,   a_p = X₀ = (X₀, Y₀, Z₀)ᵀ.
A subproblem of the orthogonal distance fitting of a parametric model is finding the location parameters {uᵢ}ᵢ₌₁ᵐ, which represent the nearest points {X′ᵢ}ᵢ₌₁ᵐ on the model feature from each given point {Xᵢ}ᵢ₌₁ᵐ. The model parameters a and the location parameters {uᵢ}ᵢ₌₁ᵐ will generally be estimated through iteration. By the total method [6, 10], a and {uᵢ}ᵢ₌₁ᵐ will be simultaneously determined, while they are separately estimated by the variable-separation method [4, 8, 11] in a nested iteration scheme. There are four possible combinations of algorithmic approaches, as shown in Table 1. One of the algorithmic approaches in Table 1 results in an obviously underdetermined linear system for iteration; thus, it has no practical application. We describe and compare the three realistic algorithmic approaches in the following sections.
TAB. 1. Orthogonal distance fitting algorithms for parametric models.

Algorithmic approaches       Total method             Variable-separation method
Coordinate-based algor.      I (ETH [6, 10])          III (FhG, this paper)
Distance-based algor.        Underdetermined system   II (NPL [4, 11])

2 Orthogonal distance fitting algorithm I (ETH)
The ETH algorithm [6, 10] is based on the performance index (1.1), and simultaneously estimates the model parameters a and the location parameters {uᵢ}ᵢ₌₁ᵐ for the nearest points on the model feature. We introduce the new estimation parameters vector b containing a and {uᵢ}ᵢ₌₁ᵐ as follows:

bᵀ = (aᵀ, u₁ᵀ, …, uₘᵀ) = (a_gᵀ, a_pᵀ, a_rᵀ, u₁ᵀ, …, uₘᵀ).
The parameters vector b minimizing the performance index (1.1) can be determined by
the Gauss-Newton method
P (∂X′/∂b)|ₖ Δb = P(X − X′)|ₖ,   bₖ₊₁ = bₖ + α Δb,   (2.1)
with the Jacobian matrix of each point X′ᵢ on the model feature, from (1.3) and (1.4),

J_{X′ᵢ,b} = ∂X′ᵢ/∂b = ( R⁻¹ ∂x/∂a_g | ∂X₀/∂a_p | (∂R⁻¹/∂a_r) x | 0₁, …, 0ᵢ₋₁, R⁻¹ ∂x/∂uᵢ, 0ᵢ₊₁, …, 0ₘ ) |_{x = x′ᵢ}.
A disadvantage of the ETH algorithm is that the storage space and the computing time
cost increase very rapidly with the number of the data points, unless the sparse linear
system (2.1) is handled beforehand by a sparse matrix algorithm.
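The sparsity of (2.1) comes from the fact that each point X′ᵢ depends on the model parameters a but only on its own location parameters uᵢ. A small illustration of the block pattern of ∂X′/∂b (our own sketch, with ones marking the potentially nonzero blocks):

```python
import numpy as np

m, na, nu = 6, 7, 2          # data points, model parameters, location params per point
J = np.zeros((m, na + m * nu))
J[:, :na] = 1.0                                   # every row block depends on a
for i in range(m):
    J[i, na + i * nu : na + (i + 1) * nu] = 1.0   # row i depends only on u_i

density = np.count_nonzero(J) / J.size            # fraction of nonzero entries
```

As m grows the density falls like 1/m, which is why a sparse solver, or prior elimination of the uᵢ blocks, pays off for the ETH algorithm.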
3 Orthogonal distance fitting algorithm II (NPL)

The NPL algorithm [4, 11] is based on the performance index (1.2), and separately estimates the model parameters a and the location parameters {uᵢ}ᵢ₌₁ᵐ in a nested iteration scheme:

min_a min_{ {uᵢ}ᵢ₌₁ᵐ } σ₀²({X′ᵢ(a, uᵢ)}ᵢ₌₁ᵐ).
The inner iteration determines the location parameters {uᵢ}ᵢ₌₁ᵐ for the minimum distance points {X′ᵢ}ᵢ₌₁ᵐ on the current model feature from each given point {Xᵢ}ᵢ₌₁ᵐ, and the outer iteration updates the model parameters. In this paper, in order to implement the parameters grouping aᵀ = (a_gᵀ, a_pᵀ, a_rᵀ), we have modified the initial NPL algorithm.
3.1 Orthogonal contacting point
For each given point xᵢ = R(Xᵢ − X₀) in frame xyz, we determine the orthogonal contacting point x′ᵢ on the standard model feature (1.3). Then, the orthogonal contacting point X′ᵢ in frame XYZ to the given point Xᵢ will be obtained through a backward transformation of x′ᵢ into XYZ. We are searching for the location parameters u which minimize the error distance between the given point xᵢ and the corresponding point x on the model
feature (1.3):

D = (xᵢ − x(a_g, u))ᵀ (xᵢ − x(a_g, u)).   (3.1)

The first order necessary condition for a minimum of (3.1) as a function of u is

f(u) = (∂x/∂u)ᵀ (xᵢ − x) = 0.   (3.2)

The condition (3.2) means that the error vector (xᵢ − x) and the surface tangent vectors ∂x/∂u at x should be orthogonal. We solve (3.2) for u by using the Newton method (how to derive the Jacobian matrix ∂f/∂u is shown in Section 4):

(∂f/∂u)|ₖ Δu = −f(u)|ₖ,   uₖ₊₁ = uₖ + α Δu.   (3.3)
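The inner iteration (3.2)-(3.3) can be sketched for a concrete standard model feature; here an ellipse x(a_g, u) = (a cos u, b sin u)ᵀ, an example of our own:

```python
import numpy as np

a, b = 3.0, 1.0                                            # form parameters
x   = lambda u: np.array([a * np.cos(u), b * np.sin(u)])   # model feature (1.3)
xu  = lambda u: np.array([-a * np.sin(u), b * np.cos(u)])  # dx/du
xuu = lambda u: np.array([-a * np.cos(u), -b * np.sin(u)]) # d2x/du2

xi = np.array([2.0, 2.0])             # given point in frame xy

u = np.arctan2(xi[1] / b, xi[0] / a)  # rough starting value
for _ in range(50):
    f  = xu(u) @ (xi - x(u))                        # condition (3.2)
    df = xuu(u) @ (xi - x(u)) - xu(u) @ xu(u)       # df/du
    du = -f / df                                    # Newton step (3.3), alpha = 1
    u += du
    if abs(du) < 1e-14:
        break

# At the solution the error vector is orthogonal to the tangent.
err = xu(u) @ (xi - x(u))
```

The same loop, with u a vector and ∂f/∂u a small matrix, covers surfaces.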
3.2 Orthogonal distance fitting
We update the model parameters a minimizing the performance index (1.2) by using
the Gauss-Newton method (outer iteration)
P (∂d/∂a)|ₖ Δa = −P d|ₖ,   aₖ₊₁ = aₖ + α Δa.
From dᵢ = ‖Xᵢ − X′ᵢ‖ and equations (1.3) and (1.4), we derive the Jacobian matrix of each orthogonal distance dᵢ:

J_{dᵢ,a} = ∂dᵢ/∂a = −((Xᵢ − X′ᵢ)ᵀ / ‖Xᵢ − X′ᵢ‖) ∂X′ᵢ/∂a
         = −((Xᵢ − X′ᵢ)ᵀ / ‖Xᵢ − X′ᵢ‖) [ R⁻¹( ∂x/∂a + (∂x/∂u)(∂u/∂a) ) + (∂R⁻¹/∂a) x + ∂X₀/∂a ].

With (1.4) and (3.2) at u = u′ᵢ the term containing ∂u/∂a drops out, since

(Xᵢ − X′ᵢ)ᵀ R⁻¹ ∂x/∂u = 0,

and

J_{dᵢ,a} = −((Xᵢ − X′ᵢ)ᵀ / ‖Xᵢ − X′ᵢ‖) ( R⁻¹ ∂x/∂a_g | ∂X₀/∂a_p | (∂R⁻¹/∂a_r) x )
is the resultant Jacobian matrix for dᵢ. A drawback of the NPL algorithm is that the convergence and the accuracy of 3D-curve fitting (e.g. fitting of a circle in space) are relatively poor. 2D-curve fitting or surface fitting with the NPL algorithm does not suffer from such problems.
4 Orthogonal distance fitting algorithm III (FhG)
At the Fraunhofer Institute IPA (FhG-IPA), a new orthogonal distance fitting algorithm for parametric models has been developed, which minimizes the performance index (1.1) in a nested iteration scheme (variable-separation method). The new algorithm is a generalized extension of an orthogonal distance fitting algorithm for implicit plane curves [1]. Interested readers are referred to [2] for the orthogonal distance fitting of implicit surfaces and plane curves. The location parameter values {u′ᵢ}ᵢ₌₁ᵐ for the minimum distance
points {X′ᵢ}ᵢ₌₁ᵐ on the current model feature from each given point {Xᵢ}ᵢ₌₁ᵐ are to be found by the algorithm described in Section 3.1 (inner iteration). In this section, we describe the outer iteration, which updates the model parameters a minimizing the performance index (1.1) by using the Gauss-Newton method
P (∂X′/∂a)|ₖ Δa = P(X − X′)|ₖ,   aₖ₊₁ = aₖ + α Δa,   (4.1)
with the Jacobian matrix of each orthogonal distance point X′ᵢ, from (1.3) and (1.4),

J_{X′ᵢ,a} = ∂X′ᵢ/∂a = [ R⁻¹( ∂x/∂a + (∂x/∂u)(∂u/∂a) ) + (∂R⁻¹/∂a) x + ∂X₀/∂a ] |_{u = u′ᵢ},   (4.2)

where ∂x/∂a = ( ∂x/∂a_g | 0 | 0 ), ∂X₀/∂a = ( 0 | I | 0 ), and (∂R⁻¹/∂a) x = ( 0 | 0 | (∂R⁻¹/∂a_r) x ).
The derivative matrix ∂u/∂a at u = u′ᵢ in (4.2) describes the variational behavior of the location parameters u′ᵢ for the orthogonal contacting point x′ᵢ in frame xyz relative to the differential changes of the parameters vector a. Purposefully, we derive ∂u/∂a from the condition (3.2). Because (3.2) has an implicit form, its derivatives lead to

(∂f/∂u)(∂u/∂a) + (∂f/∂xᵢ)(∂xᵢ/∂a) + ∂f/∂a = 0

or

∂u/∂a = −(∂f/∂u)⁻¹ ( (∂f/∂xᵢ)(∂xᵢ/∂a) + ∂f/∂a ),   (4.3)

where ∂xᵢ/∂a is, from xᵢ = R(Xᵢ − X₀),

∂xᵢ/∂a = (∂R/∂a)(Xᵢ − X₀) − R (∂X₀/∂a).
The other three matrices ∂f/∂u, ∂f/∂xᵢ, and ∂f/∂a in (3.3) and (4.3) are to be directly derived from (3.2). The elements of these three matrices are composed of simple linear combinations of components of the error vector (xᵢ − x) with elements of the following three vectors/matrices ∂x/∂u, H, and G (the XHG matrix):

∂x/∂u,   H = ∂²x/∂u²,   G = ∂²x/∂a_g ∂u.   (4.4)
Now (4.3) can be solved for ∂u/∂a at u = u′ᵢ, and the Jacobian matrix (4.2) and the linear system (4.1) can be completed and solved for the parameter update Δa.
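The implicit derivative (4.3) can be verified numerically. The sketch below, a construction of our own, uses an ellipse with form parameters a_g = (a, b) only; without position/rotation parameters the term (∂f/∂xᵢ)(∂xᵢ/∂a) vanishes, and (4.3) reduces to ∂u/∂a_g = −(∂f/∂u)⁻¹ ∂f/∂a_g, which we compare with a finite-difference estimate:

```python
import numpy as np

def x(ag, u):
    a, b = ag
    return np.array([a * np.cos(u), b * np.sin(u)])

def xu(ag, u):
    a, b = ag
    return np.array([-a * np.sin(u), b * np.cos(u)])

def contact(ag, u0, xi):
    """Inner Newton iteration for the orthogonal contacting point (Section 3.1)."""
    a, b = ag
    u = u0
    for _ in range(100):
        xuu = np.array([-a * np.cos(u), -b * np.sin(u)])
        f  = xu(ag, u) @ (xi - x(ag, u))
        df = xuu @ (xi - x(ag, u)) - xu(ag, u) @ xu(ag, u)
        u -= f / df
    return u

ag = np.array([3.0, 1.0])
xi = np.array([2.0, 2.0])
u  = contact(ag, 0.8, xi)

# Implicit derivative from (4.3) with f = xu . (xi - x) and dxi/da = 0.
a, b = ag
e = xi - x(ag, u)                                # error vector
xuu = np.array([-a * np.cos(u), -b * np.sin(u)])
df_du = xuu @ e - xu(ag, u) @ xu(ag, u)
dxu_da = np.array([[-np.sin(u), 0.0], [0.0, np.cos(u)]])  # columns: d(xu)/da, d(xu)/db
dx_da  = np.array([[ np.cos(u), 0.0], [0.0, np.sin(u)]])  # columns: dx/da, dx/db
df_da = e @ dxu_da - xu(ag, u) @ dx_da
du_da = -df_da / df_du

# Finite-difference check of du/da_g.
h = 1e-6
fd = np.array([(contact(ag + h * np.eye(2)[k], u, xi) - u) / h for k in range(2)])
```

The matrices dxu_da and dx_da are exactly the G and ∂x/∂a_g ingredients of the XHG matrix (4.4) for this model.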
We would like to stress that only the standard model equation (1.3), without involvement of the position/rotation parameters, is required in (4.4). The overall structure of
the FhG algorithm remains unchanged for all dimensional fitting problems of parametric
models. All that is necessary for a new parametric model is to derive the XHG matrix
of (4.4) from (1.3) of the new model feature, and to supply a proper set of initial parameter values a₀ for iteration (4.1).

FIG. 2. Information flow with the FhG algorithm.

TAB. 2. Fourteen coordinate triples representing a helix.

X:   5   6   5   5   3   2   0  -1  -1   0   3   4   7   9
Y:   1   3   4   6   5   4   2   0  -2  -5  -7  -8 -10  -9
Z:  -3  -1   1   3   5   7   9  11  11  11  11  11  11  10

An overall schematic information flow with the FhG
algorithm is shown in Fig. 2. The FhG algorithm shows robust and fast convergence
with 2D/3D-curve and surface fitting. The storage space and computing time cost are
proportional to the number of data points. A disadvantage of the FhG algorithm is that
it additionally requires the second derivatives ∂²x/∂a_g∂u, as shown in (4.4).
As a fitting example, we show the orthogonal distance fitting of a helix. The standard model feature (1.3) of a helix in frame xyz can be described as

x(a_g, u) = x(r, h, u) = (r cos u, r sin u, hu/2π)ᵀ,

with a constraint on the position and rotation parameters

f_c(a_p, a_r) = (X₀ − X̄)ᵀ r₃(ω, φ) = 0,

where r and h are respectively the radius and elevation of the helix, X̄ is the gravitational center of the given points set, and r₃ (see (1.4)) is the vector of direction cosines of
the z-axis. We have obtained the initial parameter values from a 3D-circle fitting and a cylinder fitting, successively. The helix fitting to the points set in Table 2 with the initial values of h = 10 and κ = π terminated after 0.22 s and 8 iteration cycles for ‖Δa‖ = 3.2×10⁻⁷ on a Pentium 133 MHz PC (Table 3, Fig. 3). The figures were 0.33 s and 10 iteration cycles for ‖Δa‖ = 3.6×10⁻⁷ with the ETH algorithm, and 1.05 s and 61 iteration cycles for ‖Δa‖ = 8.8×10⁻⁷ with the NPL algorithm. The computing cost with the ETH algorithm increases rapidly with the number of the data points. The NPL algorithm showed slow convergence with the 3D-circle and the helix fitting (3D-curve fitting).
TAB. 3. Results of the orthogonal distance fitting to the points set in Table 2 (parameter estimates with standard deviations σ in parentheses).

Parameter   3D-Circle            Cylinder             Helix
σ₀          5.8913               1.6925               2.2301
r           8.3850  (0.7355)     8.2835  (0.2738)     6.1368  (0.4238)
X₀          5.6999  (0.9939)     4.7596  (0.7465)     3.8909  (0.5488)
Y₀         -2.7923  (0.8421)    -3.0042  (0.4525)    -1.5560  (0.3934)
Z₀          5.2333  (0.8821)     4.5081  (0.6513)     6.4871  (0.7500)
ω          -0.6833  (0.1177)    -0.4576  (0.3049)     0.3003  (0.0880)
φ           0.7882  (0.1375)     1.1327  (0.2116)     0.5114  (0.0663)
κ           —                    —                    2.4602  (0.2881)
h           —                    —                   19.5811  (1.3214)
FIG. 3. Orthogonal distance fitting to the points set in Table 2: (a) Helix fit; (b) Convergence of the fit. Iteration number 0-3: 3D-circle, 4-12: circular cylinder, and 13-: helix fit with the initial values of h = 10 and κ = π.
5 Summary
In this paper, we have reviewed the current orthogonal distance fitting algorithms for parametric curves and surfaces in an easily understandable manner, and presented a new algorithm. In each of the algorithms the model parameters are grouped and simultaneously estimated in terms of form/position/rotation parameters. The ETH algorithm demands a large amount of storage space and high computing cost, and the NPL algorithm shows relatively poor performance with 3D-curve fitting. The new algorithm, the FhG algorithm, has no such drawbacks of the ETH algorithm or of the NPL algorithm. A
disadvantage of the FhG algorithm is that it requires the second derivatives ∂²x/∂a_g∂u. The FhG algorithm does not require a particularly good set of initial parameter values, which could also be internally supplied, as demonstrated with the fitting examples. From the viewpoint of implementation and application to a new model feature, the FhG algorithm is universal and very efficient. Merely the standard model equation (1.3) of the new model feature is eventually required, which has only a few form parameters. The functional interpretation and treatment of the position/rotation parameters are basically identical for all parametric models. The storage space and the computing time cost are proportional to the number of given data points. Together with other orthogonal distance fitting algorithms for implicit models [2], the FhG algorithm is certified by the German federal authority PTB [5, 9], with a certification grade stating that the parameter estimation accuracy is higher than 0.1 μm for length units and 0.1 μrad for angle units for all parameters of all tested model features.
Bibliography
1. S. J. Ahn, W. Rauh, and H.-J. Warnecke, Least-squares orthogonal distances fitting of circle, sphere, ellipse, hyperbola, and parabola, Pattern Recognition 34 (2001), 2283-2303.
2. S. J. Ahn, W. Rauh, and H.-J. Warnecke, Best-Fit of Implicit Surfaces and Plane Curves, in Mathematical Methods for Curves and Surfaces: Oslo 2000, T. Lyche and L. L. Schumaker (Eds.), Vanderbilt University Press, TN, 2001, 1-14.
3. P. T. Boggs, R. H. Byrd, and R. B. Schnabel, A stable and efficient algorithm for nonlinear orthogonal distance regression, SIAM J. Sci. Stat. Comput. 8 (1987), 1052-1078.
4. B. P. Butler, A. B. Forbes, and P. M. Harris, Algorithms for Geometric Tolerance Assessment, Report no. DITC 228/94, NPL, 1994.
5. R. Drieschner, B. Bittner, R. Elligsen, and F. Waldele, Testing Coordinate Measuring Machine Algorithms: Phase II, BCR Report, EUR 13417 EN, Commission of the European Communities, Luxemburg, 1991.
6. W. Gander, G. H. Golub, and R. Strebel, Least-squares fitting of circles and ellipses, BIT 34 (1994), 558-578.
7. H.-P. Helfrich and D. Zwick, A trust region method for implicit orthogonal distance regression, Numerical Algorithms 5 (1993), 535-545.
8. H.-P. Helfrich and D. Zwick, A trust region algorithm for parametric curve and surface fitting, J. Comput. Appl. Math. 73 (1996), 119-134.
9. ISO/DIS 10360-6, Geometrical Product Specifications (GPS) - Acceptance test and reverification test for coordinate measuring machines (CMM) - Part 6: Estimation of errors in computing Gaussian associated features, ISO, Geneva, 1999.
10. D. Sourlier, Three Dimensional Feature Independent Bestfit in Coordinate Metrology, Ph.D. Thesis, ETH Zurich, 1995.
11. D. A. Turner, The approximation of Cartesian coordinate data by parametric orthogonal distance regression, Ph.D. Thesis, University of Huddersfield, 1999.
Template matching in the ℓ₁ norm
Iain J. Anderson and Colin Ross
School of Computing and Mathematics, University of Huddersfield, UK.
i.j.anderson@hud.ac.uk, c.ross@hud.ac.uk
Abstract
We present a method for matching a surface in three dimensions to a set of data sampled from the surface by means of minimising the distances from the data points to the closest point on the surface. This method of association is affine transformation invariant and as such is very useful in situations where the coordinate axes are essentially arbitrary. Traditionally, this problem has been solved by minimising the ℓ₂ norm of the distances from the data points to the corresponding points in the surface, while the use of other ℓ_p norms is less well known. We present a method for template matching in the ℓ₁ norm based upon a method of directional constraints developed by Watson for the related problem of orthogonal distance regression. An algorithm for this method is given and numerical results show its effectiveness.
1 Introduction
Template matching is used in a variety of applications such as the quality assurance of
manufactured artifacts [1] and dental metrology [2]. Given a fixed template, i.e., curve
or surface, and a set of data in a different frame of reference, template matching involves
finding the frame transformation which maps the data onto the template.
A typical strategy for finding the optimal transformation parameters in the template matching problem is to minimize, in some norm, the orthogonal distances between the transformed data and the template. In this case, the template matching problem can be viewed as a form of orthogonal distance regression (ODR) [3], which is a technique commonly used for fitting curves and surfaces to measured data. Therefore, most algorithms for solving the template matching problem are extensions of algorithms for ODR. Template matching in the ℓ₂ norm is addressed by Turner [3] and in the ℓ∞ norm by Butler et al. [1] as well as by Zwick [7] for the two dimensional case.
In this paper, we are specifically concerned with the following problem.

Given a fixed differentiable parametric surface f(u, v) and a set of m data {xᵢ}ᵢ₌₁ᵐ ⊂ ℝ³, find points {f(uᵢ, vᵢ)}ᵢ₌₁ᵐ, a rotation matrix R_θ, and a translation vector t₀ such that the ℓ₁ norm of the residual distances {‖R_θ(xᵢ − t₀) − f(uᵢ, vᵢ)‖₂}ᵢ₌₁ᵐ is minimal.

This is the template matching problem in the ℓ₁ norm, and although not as widely used as the ℓ₂ and ℓ∞ counterparts, it does nonetheless have an important role to play. The importance of the ℓ₁ norm is that, generally speaking, any outlying data are effectively ignored, with the result that an approximation is obtained which is largely independent
of any unreliable data. This has particular importance when our data arise as a result of some measurement process, perhaps involving many complicated and finely-tuned instruments. For such a measurement scenario, any change in the assumed measurement conditions can result in a datum which has gross error relative to other data. Thus, if we choose a measure which is susceptible to outlying data, we are in danger of obtaining an unrepresentative approximation. This situation is avoided by use of the ℓ₁ norm and we therefore advocate its use both here and in any situation involving measurement data where a representative approximation is required.
A feature of optimal ℓ₁ solutions is the likelihood of a small number of the data having a residual of zero, and it is therefore unclear whether the elements of the Jacobian matrix of partial derivatives are well-defined for these points. As a result, use of the usual Gauss-Newton method would appear to be handicapped due to its dependence upon the Jacobian matrix to calculate an updated transformation estimate. This difficulty also arises in the conventional ODR fitting problem and has recently been considered by Watson [6]. His solution is to adopt a method of fitting subject to directional constraints. By setting these directional constraints to be orthogonal to the approximant, Watson shows not only that the Jacobian is defined but also how to compute its elements without incurring a build-up of rounding error.

In this paper, we extend Watson's constrained direction fitting routine to the template matching problem. We show that Watson's results are equally valid for ℓ₁ template matching. Finally, we exploit these results to give a reliable algorithm for the ℓ₁ template matching problem.
The structure of this paper is as follows. Section 2 provides the results necessary
to justify the new technique. Section 3 describes the algorithm adopted to implement
the theory. Section 4 gives some numerical results for both a simple case and a more
challenging case. Finally, Section 5 concludes this paper and presents possibilities for
future work.
2 Theory
We are concerned with the minimisation of the quantity

E = ‖(d₁, …, dₘ)‖,   (2.1)

where

dᵢ = ‖x̃ᵢ − f(uᵢ, vᵢ)‖₂,   i = 1, 2, …, m,   (2.2)

and

x̃ = R_θ(x − t₀),   (2.3)

with respect to the rotation parameters θ,
the translation parameters t₀, and the location parameters U.
This is a constrained problem and can be solved using a separation-of-variables approach as described by Turner [3] among others. In this approach, the problem of obtaining the transformation parameters t is separated from the subproblem of obtaining the location parameters U. At each iteration, the subproblem is solved to obtain an optimal U for the current transformation parameters t, which is then used to obtain an update of the transformation parameters themselves.
2.1 Considerations specific to the ℓ₁ problem

Up to this point, we have not specified which norm we are using to measure the disparity between the transformed data and the template. Since we will be particularly interested in the ℓ₁ case, this section discusses problems inherent in the solution of such a problem.
The major problem with solving non-linear ℓ₁ problems is that in order to use a technique such as the Gauss-Newton method, derivative information is required. Unfortunately, derivatives of the distances d are not defined when a distance has a value of zero. Such is the nature of ℓ₁ approximation that zeros are to be expected at an optimal solution [5]. Thus, it is unclear whether the Jacobian matrix is defined at these data points. Recent work by Watson [6] has considered how the related problem of orthogonal distance regression might be solved by considering distances to be measured along fixed direction vectors wᵢ. Orthogonal distance regression involves the fitting of a curve or surface to a set of data where the residuals are taken to be the shortest distance from the data to the approximant [3]. Template matching can be seen as a variant of this since the residuals are measured in the same way, but we are only altering the position and orientation of the approximant, rather than the actual shape itself. Thus, techniques for orthogonal distance regression can be used successfully in template matching.
By means of these directional constraints, it is possible to show that if we choose the directions wᵢ to be the orthogonal directions,

wᵢ = (f(uᵢ, vᵢ) − x̃ᵢ) / ‖f(uᵢ, vᵢ) − x̃ᵢ‖₂,

then the derivatives are well defined in the limit as ‖f(uᵢ, vᵢ) − x̃ᵢ‖₂ → 0.
This result may be summarised in the following theorem (taken from Watson [6]).

Theorem 2.1 For parametric fitting, let the (usual) Gauss-Newton method produce a sequence {t} such that there is a unique unit normal vector to the template at f(uᵢ, vᵢ), and x̃ᵢ remains on one side of the template. Then ∇ₜdᵢ is well defined on this sequence.

If f(uᵢ, vᵢ) → x̃ᵢ, then this formulation will lead to problems similar to those we are attempting to resolve, as a result of the quotient becoming undefined. As a result, Watson [6] suggests leaving wᵢ unchanged once dᵢ becomes small. By this method, numerical problems arising as a result of a distance tending to zero may be avoided. However, the algorithm will still tend to the correct solution provided that the small residual corresponds to an interpolation point of the ℓ₁ solution. If this is not the case, then the solution will not be optimal, but will still be close to the optimal solution.
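Watson's device of freezing wᵢ once dᵢ is small can be sketched as follows (the function and variable names are our own):

```python
import numpy as np

def directions(f_points, x_points, w_prev=None, tol=1e-8):
    """Unit directions w_i from each transformed datum towards its closest
    template point; w_i is left unchanged once the distance is below tol."""
    w = np.empty_like(f_points)
    for i, (fi, xi) in enumerate(zip(f_points, x_points)):
        d = np.linalg.norm(fi - xi)
        if d < tol and w_prev is not None:
            w[i] = w_prev[i]          # freeze: avoid an undefined quotient
        else:
            w[i] = (fi - xi) / d
    return w

f_points = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # closest template points
x_points = np.array([[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]])  # second datum interpolated
w_prev   = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # directions from last iteration
w = directions(f_points, x_points, w_prev)
```

The second datum has zero residual, so its direction is simply carried over from the previous iteration, exactly as suggested above.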
2.2 Possible problems
The most immediate problem that arises is how to ensure that there exists a point on the template which is situated along the direction vector given from each datum. Clearly, in certain situations there will not exist such a point, corresponding to the case where the direction vector lies within the tangent plane of the template in the region of the datum. In such a situation there would seem to be two possible recourses available.

(1) Ignore these data.
(2) Choose the point on the template that is closest to the line through the datum defined by the direction vector.

It has been found through empirical results that, provided the problem only occurs on certain iterations rather than as a result of a poor choice of the direction vectors associated with the template, ignoring the problem data is the better option. Use of the second option has been found to prevent convergence of the algorithm.
3 Algorithm
The algorithm to implement this technique consists of two sub-algorithms, each related
to a specific section of the main algorithm. These sub-algorithms are
(1) the constrained closest point problem,
(2) the calculation of a new transformation estimate.
3.1 Constrained closest point problem
For each data point x̃ᵢ, this problem is that of finding uᵢ and vᵢ such that the constraint

x̃ − f(u, v) = dw   (3.1)

is satisfied (subscripts dropped for clarity). If we pre-multiply this equation by aᵀ, we obtain

aᵀx̃ − aᵀf(u, v) − d aᵀw = 0.   (3.2)
Thus, by choosing a to be orthogonal to w, we are able to eliminate d from equation (3.2). Similarly, if we multiply equation (3.1) by bᵀ, with b also orthogonal to w, we obtain the equation

bᵀx̃ − bᵀf(u, v) = 0.

We may thereby reduce the system (3.1) to two (nonlinear) equations in two unknowns (u and v). This system can then be solved by adopting a Newton-type method.
Our problem has been reduced to that of solving

F(u, v) = [a : b]ᵀ (x̃ − f(u, v)) = 0,

which has derivative

∇_{u,v}F = −[a : b]ᵀ (∇_u f : ∇_v f),

by means of Newton's method, which involves adopting an iterative approach and solving

∇_{u,v}F (δu, δv)ᵀ = −F(u, v)   (3.3)

at each stage to obtain better estimates u + δu and v + δv. The quantities F(u, v) and ∇_{u,v}F are straightforward to calculate as they arise directly from the explicit parameterisation of the template.
All that remains is the choice of a and b. We obtain these vectors by taking the cross product of w with two arbitrary vectors, resulting in two vectors which are orthogonal to w. More generally, the vectors a and b should be chosen to ensure that the system (3.3) is well-conditioned.
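These steps can be sketched for a concrete template, here the unit cylinder f(u, v) = (cos u, sin u, v)ᵀ (an example of our own), with a and b obtained by cross products with w:

```python
import numpy as np

f  = lambda u, v: np.array([np.cos(u), np.sin(u), v])
fu = lambda u, v: np.array([-np.sin(u), np.cos(u), 0.0])
fv = lambda u, v: np.array([0.0, 0.0, 1.0])

xt = np.array([2.0, 1.0, 0.5])        # transformed datum x~
w  = xt - np.array([0.6, 0.8, 0.5])   # a fixed unit direction (through a cylinder point)
w /= np.linalg.norm(w)

# Two vectors orthogonal to w, via cross products.
a = np.cross(w, [0.0, 0.0, 1.0]); a /= np.linalg.norm(a)
b = np.cross(w, a);               b /= np.linalg.norm(b)
AB = np.column_stack([a, b])

u, v = 0.0, 0.0
for _ in range(100):
    F  = AB.T @ (xt - f(u, v))                            # F(u,v) = [a:b]^T (x~ - f)
    dF = -AB.T @ np.column_stack([fu(u, v), fv(u, v)])    # Newton system (3.3)
    du, dv = np.linalg.solve(dF, -F)
    u, v = u + du, v + dv

# The residual x~ - f(u,v) is now parallel to w, i.e. orthogonal to a and b.
res = xt - f(u, v)
```

The same loop works for any differentiable template once f, f_u and f_v are supplied.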
3.2 Updating the transformation estimate
The method we adopt to obtain an update of the transformation parameters is the Gauss-Newton method. This involves solving, at each iteration, the problem

J δt = −d   (3.4)

in the ℓ₁ sense, where J is the Jacobian matrix of partial derivatives with entries Jᵢⱼ = ∂dᵢ/∂tⱼ. The estimate of the optimal transformation parameters is then updated according to

t = t + δt.
Thus, since the distances d are obtained from the constrained closest point subproblem, we are left with the task of calculating the Jacobian matrix. For each datum, from equation (3.1), we have that

x̃(t) − f(u(t), v(t)) = w d(u(t), v(t)),

where we have explicitly included the dependency of the distance d on the location parameters U. Differentiating and rearranging, we obtain

∇ₜx̃ = w ∇ₜd + ∇_U f ∇ₜU.

This is equivalent to the form

∇ₜx̃ = [w : ∇_U f] (∇ₜd ; ∇ₜU).

Therefore,

J = ∇ₜd = e₁ᵀ [w : ∇_U f]⁻¹ ∇ₜx̃,
where e₁ is the first component vector. Having obtained the Jacobian matrix J and the
distance vector d, we are now in a position to solve the system (3.4) in order to update
our estimate of the optimal transformation parameters t.
We note that using the traditional orthogonal distances can lead to problems, since calculation of the Jacobian matrix involves division of each row by the corresponding orthogonal distance, leading to exacerbation of rounding errors and possible division by zero, especially in the ℓ1 case.
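The ℓ1 subproblem (3.4) can be posed as a linear programme. The sketch below is an illustration, not the paper's implementation: it uses the standard slack-variable reformulation, scipy's HiGHS solver, and synthetic data with one wild point.

```python
import numpy as np
from scipy.optimize import linprog

# Solve J dt = -d in the l1 sense: minimise sum(e) subject to
# -e <= J dt + d <= e, with variables (dt, e).

def l1_step(J, d):
    m, p = J.shape
    c = np.concatenate([np.zeros(p), np.ones(m)])
    A_ub = np.block([[ J, -np.eye(m)],
                     [-J, -np.eye(m)]])
    b_ub = np.concatenate([-d, d])
    bounds = [(None, None)] * p + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]

# Small demo: overdetermined system with one gross outlier in d.
rng = np.random.default_rng(0)
J = rng.standard_normal((20, 3))
dt_true = np.array([0.5, -1.0, 2.0])
d = -(J @ dt_true)
d[0] += 10.0                      # wild point: the l1 solution ignores it
dt = l1_step(J, d)
```

Because the remaining 19 equations are consistent, the ℓ1 minimiser interpolates them exactly and leaves the whole residual on the wild point, which is the robustness property exploited in this paper.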
4
Numerical results
In this section, we present two examples to illustrate the techniques presented in this
paper. In the first, we have a small number of data which we wish to match to a given
plane. In the second, we have a larger number of data and we wish to match them to
a cylinder. In both cases, although analytical expressions are available to obtain the
constrained closest points on the templates, we nonetheless utilise the method presented
above in order to test its effectiveness.
4.1
Simple problem
Here we describe the problem of matching a representative set of 8 data onto the plane
defined as
f(u, v) = u (1, 0, 0)^T + v (0, 1, 0)^T.
Since this problem is rank deficient if we use all six possible transformation parameters, we restrict ourselves to using a translation in the z-direction and rotations about the x and y axes.
Having three degrees of freedom, we might expect to obtain an optimal ℓ1 solution which interpolates 3 of the data. However, as we shall see, this is unattainable in general and we can, in fact, only expect interpolation at two points. As Watson states [6], in such a situation, the rate of convergence can be unacceptably slow. This is found to be the case: as Table 1 shows, not only is the convergence slow, but an optimal solution is never obtained, with the objective function ||d||_1 increasing occasionally.

Iteration   norm(residuals)   norm(update)
    1           0.6662        4.9901e-02
    5           0.3008        3.5716e-04
   10           0.3007        8.8545e-06
   50           0.3006        9.1533e-06
  100           0.3008        3.8514e-04

TAB. 1. Progress of the Gauss-Newton method for planar data.
To ensure convergence, a simple line-search algorithm was adopted which searches along the direction obtained from the Gauss-Newton step for the maximum reduction in the objective function. This modification effects convergence in 3 iterations.
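A minimal sketch of such a line search follows; the step-halving rule and the toy ℓ1 objective are assumptions, since the paper does not specify the exact search.

```python
import numpy as np

# Backtracking line search along the Gauss-Newton direction: halve the
# step until the l1 objective decreases.

def line_search(objective, t, dt, max_halvings=30):
    f0 = objective(t)
    alpha = 1.0
    for _ in range(max_halvings):
        if objective(t + alpha * dt) < f0:
            return t + alpha * dt
        alpha *= 0.5
    return t    # no improving step found

# Toy objective: l1 norm of residuals of a linear model.
A = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]])
b = np.array([1.0, 2.0, 0.3])
obj = lambda t: np.abs(A @ t - b).sum()
t = np.array([0.0, 0.0])
dt = np.array([0.7, 0.1])      # some descent direction
t_new = line_search(obj, t, dt)
```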
4.2
A more challenging problem
As a more challenging problem, we consider the matching of a set of 128 data which supposedly represent a cylinder but which contain 8 wild points. The cylinder is parametrised by u and v as
f(u, v) = (cos u, sin u, v)^T,
resulting in a cylinder with unit radius oriented along the z-axis. Again, the problem of matching the data onto this model is rank deficient. The rank deficiencies occur due to rotations about the z-axis and translations along the z-axis. As such, we omit these possible transformations.
Although we might initially expect to interpolate 4 data points at an optimal ℓ1 solution, we find that in fact only two are guaranteed, although if a third point lies within two radii of one of these two points, then three points can be guaranteed. Typically, this will occur when the data is representative. For the data set we are considering, we expect three interpolation points due to the data representing the cylinder, and at the optimal solution three interpolation points are indeed obtained. The "missing" fourth interpolation point has the effect of slowing convergence of the Gauss-Newton method considerably, so that in 100 iterations the algorithm had not been deemed to converge. However, with the introduction of a simple line-search method, the algorithm converged in five iterations, as displayed in Table 2.
Iteration   norm(residuals)   norm(update)
    1           0.9654        5.6796e-03
    2           0.9559        6.2932e-04
    3           0.9557        1.0141e-04
    4           0.9557        2.5812e-07
    5           0.9557        4.4006e-14

TAB. 2. Progress of the Gauss-Newton method for cylindrical data using a line-search.

5
Conclusions
This paper has shown how perceived problems in ℓ1 template matching can be avoided
by use of the so-called "method of directional constraints". In this method, the closest
point on the template along a given direction vector is calculated in order to obtain the
residuals between data and template. By then altering this direction vector to be the
normal to the surface at that projected point, the algorithm progresses to the expected
ℓ1 solution. Problems regarding undefined quotients are avoided by no longer updating the direction vectors corresponding to a datum when the residual associated with that point is below a certain tolerance.
This work forms part of a larger project to consider novel approaches to ill-conditioned
problems in metrology. It is hoped that the work presented in this paper will aid in the
resolution of rank-deficient systems and ill-conditioned systems by altering the usual
orthogonal distances to be these directional constraints, which should remove some of
the rank deficiency.
As an example, consider the template matching problem where the template to be matched is an infinite cylinder with axis along the z-axis. Using typical template matching algorithms, this problem is rank deficient by two at the solution, due to the possible translation along the z-axis and the possible rotation about the z-axis. By introducing these directional constraints, the rotational rank deficiency is almost completely resolved (there are now two possible rotations to obtain the optimal matching rather than the infinite number previously).
The ℓ1 norm is also being used to attempt to resolve any rank deficiencies and ill-conditioning present in the problem. This is achieved by ensuring that any local deviations from the template (caused, for example, by wear) are "ignored" so that regions of local deviations might be compared. This will then result in a resolution of the uncertainty in the transformation parameters.
Bibliography
1. B. P. Butler, A. B. Forbes, and P. M. Harris. Algorithms for geometric tolerance
assessment. Technical Report DITC 228/94, National Physical Laboratory, Teddington, UK, 1994.
2. V. Jovanovski. Three-dimensional Imaging and Analysis of the Morphology of
Oral Structures from Co-ordinate Data. Ph.D. Thesis, Department of Conservative Dentistry, St Bartholomew's and the Royal London, School of Computing and
Dentistry, Queen Mary and Westfield College, London, UK, 1999.
3. D. A. Turner. The approximation of Cartesian coordinate data by parametric orthogonal distance regression. Ph.D. Thesis, School of Computing and Mathematics,
University of Huddersfield, UK, 1999.
4. D. A. Turner. Least squares profile matching using directional constraints. Preprint,
2001.
5. G. A. Watson. Approximation Theory and Numerical Methods. Wiley, New York,
US, 1980.
6. G. A. Watson. On curve and surface fitting by minimizing the ℓ1 norm of orthogonal
distances. Preprint.
7. D. S. Zwick. A planar minimax algorithm for analysis of coordinate measurements. Advances in Computational Mathematics, 2:4, 1994, 375-391.
A bootstrap algorithm for mixture models and interval
data in inter-comparisons
P. Ciarlini and G. Regoliosi
Istituto per le Applicazioni del Calcolo "M.Picone", CNR, Roma, Italy
F. Pavese
Istituto di Metrologia "G. Colonnetti", Torino, Italy
Abstract
To combine the information from several laboratories to output a representative value
Xr and its probability distribution function is the main aim of an inter-comparison in
Metrology. Here, the proposed procedure identifies a simple model for this probability
function, by taking into account only the probability interval estimates as a measure of
the uncertainty in each laboratory. A mixture density model is chosen to characterize
the stochastic variability of the inter-comparison population considered as a whole. The
bootstrap method is applied to approximate the distribution function of the comparison
output in an automatic way.
1
Introduction
The "mise en pratique" of the Mutual Recognition Arrangement (MRA), issued by national metrological Institutions in 1999, prompted new studies and projects in Metrology
mainly concerning the inter-laboratory comparisons area.
Recently, considerable effort has been devoted to finalise the problem of the choice
of a suitable statistical procedure to summarise inter-comparison data. The problem
solution is influenced by both metrological and statistical considerations, but it can also
depend on the physical quantity under comparison.
Some of the critical issues now emerging are related to several different reasons. For
instance, the statistical information supplied by each laboratory is synthetic, since it
comes from a data reduction process performed on several experimental datasets. In
each laboratory, assumptions and statistical reduction procedures may be different and
sometimes not fully documented or the a priori information on the original data may
be insufficient to define a "credible" probability distribution function (pdf) for output
quantities of the inter-comparison.
The use of the whole sets of original data from each laboratory might be an unfeasible
approach in the inter-comparison case, due to the unavailability of all needed data or
to practical reasons. At present, the practice is to supply synthetic information Xi by
each participant to the inter-comparison and to use a location estimator to output the
representative value.
138
A bootstrap algorithm for mixture models
139
Efforts should be devoted to improving the reliability of inter-comparison results by asking for the use of any a priori information, and of its "credibility", in order to go ahead towards the direct estimation of the output of the comparison, X_r.
This paper proposes the identification of a solution without resorting to the synthetic values and their point estimates of the standard uncertainty, but only to the probability interval estimates as the measure of the uncertainty. This approach consists of two
parts: a modelling procedure to identify a simple mixture model able to approximate the
stochastic variability of the inter-comparison population as a whole; a parametric Monte
Carlo algorithm to automatically estimate the probability distribution of the output X_r
and any accuracy measures at a prescribed precision.
The concept of a mixture of distribution functions occurs when a population made
up of distinct subgroups is sampled, for example, in biostatistics, when it is required
to measure certain characteristics in natural populations of a particular species. In an
inter-comparison each participant constitutes a subgroup.
The Monte Carlo method, based on the principle of mimicking sampling behaviour,
can always compute a numerical solution in an automatic way, even when the required
analytic calculations may not be simple. If the Monte Carlo method is applied with the
principle of substitution (of the unknown probability function with a probability model
estimated from the given sample), the approach is known as the bootstrap approach [4]
and is already used in Metrology [2]. In [1] the case of a multivariate normal mixture
model is considered and the standard errors are estimated by means of the parametric
bootstrap. The present algorithm will be applied to a thermometric inter-comparison,
where data cannot be assumed to be normally distributed.
2
Data structure of an inter-comparison with interval data
The number, N, of laboratories involved in an inter-comparison is typically small. In
the i-th laboratory, the {^i ,... , Ci ) measurements are supposed to pertain to a single
probability distribution function, say Fi{A), where A is the parameter vector, that may be
partially unknown. The measurements are statistically analysed and reduced to provide
to the comparison the synthetic value Xi and its uncertainty Uj at 95% confidence level,
or a 95% uncertainty interval (95%CJ): ((a;i,«i)..., (xiv,MAT))In this work the uncertainty is considered as "a 95%CI rather than as a multiple of the
standard deviation" (see 4.3.4 in [6]). Then an aim of an inter-comparison is to combine
the input data in the labs to characterise a representative value of the inter-comparison,
i.e., the random variable θ and its pdf F. Hence a good estimate of the 95%CI for θ can be obtained if the output pdf F is a simple known function, describing the stochastic variability of the inter-comparison data. In other cases a suitable approximation of the expected value E_F[X] = ∫ x dF(x) could be accepted to output the reference value X_r. The inter-comparison data structure is summarised here in terms of interval estimates:
INPUT Sample: Each one of the N participants originates a 95%CI that is one
element of the inter-comparison sample:
{[u_{il}, u_{iu}], i = 1, ... , N}.
(2.1)
140
P. Ciarlini, G. Regoliosi, and F. Pavese
Here no value x_i in the interval [u_{il}, u_{iu}] is chosen as representative; possible information on F_i (such as limited or unlimited support, symmetric or not) should be added. If a laboratory does not supply any information on the pdf, the uniform distribution is assumed.
Comparison OUTPUT: It includes the representative value and its 95%CI
(θ, [θ_l, θ_u]).
(2.2)
In many inter-comparisons, the differences to θ are also defined: (y_i, [w_{il}, w_{iu}]), where y_i = x_i - θ, i = 1, ... , N.
3
A classical approach to inter-comparisons
Let us recall the solution to the inter-comparison problem through the traditional estimator, the weighted mean. It is a location statistic that combines several measures and their standard uncertainties (x_i, u_i), i = 1, ... , N. It provides the following estimate for θ,
θ_w = (Σ_{i=1}^N x_i/u_i²) / (Σ_{i=1}^N 1/u_i²),  with  u_w² = (Σ_{i=1}^N 1/u_i²)^{-1},
(3.1)
and the following symmetric 95%CI,
θ_w ± k u_w,
(3.2)
where the coverage factor k is taken as the value t_{N-1,0.95} of the Student distribution, N being small. In this approach, each x_i is viewed as an unbiased estimate of the laboratory mean value and the random variable θ_w is defined to be a linear combination of N independent random variables X_1, ... , X_N, where {x_1, ... , x_N} is an observed sample. θ_w is supposed to be asymptotically normally distributed [6]. This estimator can be correctly
adopted to solve an inter-comparison problem if the assumption of the homogeneity of
the data is valid. This is equivalent to saying that, after considering the extent of the
real effect and bias in each laboratory, the laboratories yield on the average the same
value, so that the differences between the estimates are entirely due to random error.
In this case, the selected estimator θ_w appropriately estimates θ and (3.2) accurately estimates its 95%CI.
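As a hedged numerical sketch of the weighted mean and its symmetric 95%CI: the (x_i; u_i) values below are the point estimates later listed in Table 1, and treating the u_i as standard uncertainties here is an illustrative assumption, not the paper's computation.

```python
import numpy as np
from scipy import stats

# Weighted mean with weights 1/u_i^2 and a Student-t based symmetric 95%CI,
# cf. (3.2); the coverage factor k comes from the Student distribution with
# N - 1 degrees of freedom, as described above.

x = np.array([-0.05, 0.03, 0.18, 0.04, 0.71, -0.01, -0.03])
u = np.array([ 0.15, 0.30, 0.15, 0.15, 0.15, 0.15,  0.15])
w = 1.0 / u**2
theta_w = np.sum(w * x) / np.sum(w)           # weighted mean
u_w = 1.0 / np.sqrt(np.sum(w))                # its standard uncertainty
k = stats.t.ppf(0.975, len(x) - 1)            # two-sided 95% coverage factor
ci = (theta_w - k * u_w, theta_w + k * u_w)   # symmetric 95% CI
```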
Obstacles to applying this approach to a key-comparison have been discussed in [3].
The "credibility" of the representative values Xi, and of their uncertainty can critically
affect the accuracy of the estimate of the representative value X_r. Moreover, the peculiar
characteristics of a typical inter-comparison sample ((1) its very limited size, from a
statistical point of view, (2) different experimental methods, used in each laboratory)
often imply that the statistical assumptions are not satisfied, as for example in several
thermometric cases. Indeed, the first characteristic implies that the Central Limit Theorem and the asymptotic theory do not hold. Then the normal distribution cannot be
properly used to infer the estimates in (3.2).
Another example of the inadequacy of the weighted mean approach is when some
laboratories provide data affected by bias, resulting from skewed distributions underlying
their measurements. The symmetric confidence interval of (3.2) cannot be considered an
A bootstrap algorithm for mixture models
141
accurate approximation¹ of the true one, since it does not adjust for the skewness. Finally, it is necessary to point out that the homogeneity condition among the laboratories must be assured in some sense; otherwise it would be impossible to attempt the computation of any summary estimate and its associated uncertainty.
4
The approach based on interval data
4.1
The mixture density function
This paper proposes to construct a simple model for the output pdf, and to estimate
its expected value θ without requiring strong assumptions such as N large or each F_i normal. This approach enables us to compute the probability interval of the output
value in terms of the identified density in each laboratory. The stochastic variability of
the population of inter-comparison data is directly considered in the modelling approach
as a whole, by means of a so-called mixture distribution model [5]. This model, being
a linear superposition of several (say N) component densities, appears to be suitable
from a computational point of view and can be embedded in a bootstrap algorithm to
simulate several data needed to predict the output quantities.
In an inter-comparison, let us suppose that a density function f_i(x; λ^{(i)}) is assumed for the i-th laboratory; then the following density mixture is identified to model the output pdf, where the parameter vector is Λ = (λ^{(1)}, ... , λ^{(N)}) and the given weights π_i > 0, i = 1, ... , N, have summation normalised to one:
g(x; Λ) = Σ_{i=1}^N π_i f_i(x; λ^{(i)}).
(4.1)
To compute the output as estimate of the expected value of the mixture, θ = E_{G(Λ)}[X], the probability function G(Λ), corresponding to the density in (4.1), must be known.
When some laboratory provides only partial information on a pdf, we propose to identify
its experimental variability by one of the following simple probabilistic models: uniform,
normal or triangular pdf (right or left or symmetric triangular). Indeed, in thermometric
experiments these three probabilistic models can represent several common stochastic
variabilities for measurements, such as a limited or unlimited support, symmetric or not.
We want the mixture parameters to be estimated by means of the INPUT Sample, (2.1), as required in a bootstrap approach. Let us call I_i the probability interval to which 100% of the measurements of the laboratory are supposed to pertain. For the uniform and the triangular types, the λ^{(i)} parameters are defined to be the extremes of I_i = [λ_{il}, λ_{iu}]. For the normal model the parameters are the mean x_i and the variance w_i, while I_i becomes (-∞, +∞).
A right triangular pdf (RT), a left triangular pdf (LT) or symmetric triangular pdf
(ST) is chosen according to the position where the maximum of the probability density
occurs, i.e., one extreme or the middle point of I_i.
¹A 95% CI [θ_l, θ_u] for θ is defined to be accurate if the following holds for every possible value of θ: Prob_G{θ > θ_u} = 0.025 and Prob_G{θ < θ_l} = 0.025.
To compute the two components of the vector λ^{(i)} = (λ_{il}, λ_{iu})^T given the i-th input interval, a 0.025 portion of probability mass is added outside each extreme, according to the supplied density shape. For example, if the ST density is chosen, the parameters are computed by:
λ_{il} = (0.89 u_{il} - 0.11 u_{iu})/0.78,
λ_{iu} = (0.89 u_{iu} - 0.11 u_{il})/0.78.
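A quick numerical check of these formulas (a sketch; the interval is Lab1's from Table 1) confirms that the resulting symmetric-triangular support places roughly 95% of the probability mass inside the supplied interval:

```python
import numpy as np

# Support of the ST density from a 95% interval [u_l, u_u], then the mass
# of that density falling inside the original interval (should be ~0.95).

def st_support(u_l, u_u):
    lam_l = (0.89 * u_l - 0.11 * u_u) / 0.78
    lam_u = (0.89 * u_u - 0.11 * u_l) / 0.78
    return lam_l, lam_u

def st_cdf(x, a, b):
    # CDF of the symmetric triangular density on [a, b].
    m, L = 0.5 * (a + b), b - a
    x = np.clip(x, a, b)
    return np.where(x <= m, 2 * ((x - a) / L) ** 2, 1 - 2 * ((b - x) / L) ** 2)

u_l, u_u = -0.347, 0.247                 # Lab1 interval from Table 1
a, b = st_support(u_l, u_u)
mass_inside = st_cdf(u_u, a, b) - st_cdf(u_l, a, b)
```

The rounded coefficients 0.89, 0.11, 0.78 reproduce the exact values (derived from the ST quantiles at 0.025 and 0.975) to within a few parts per thousand.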
The mixture weights could be used to associate a degree of "credibility" to each
laboratory. Then the choice π_i = 1/N, i = 1, ... , N, implies that every laboratory equally
contributes to the inter-comparison.
When the mixture G(Λ) is completely identified, it can be used to simulate data and
to approximate the output value in the Monte Carlo algorithm.
4.2
The bootstrap algorithm
To avoid integral computations to estimate θ and its variance, the Monte Carlo method is commonly used to approximate them within a given precision. Since the parametric bootstrap approach resamples from a parametric distribution model, in this case the mixture model G(Λ), it is adopted to approximate the following distribution,
H(x) = Prob_G{θ* ≤ x}.
(4.2)
The Monte Carlo method simulates a sufficiently high number B of data θ* from G = G(Λ), to compute
Ĥ(x) = (1/B) Σ_{b=1}^B 1{θ*_b ≤ x},
(4.3)
where the function 1{A} is the indicator function of the set A. With probability one, it is
known that the Monte Carlo approximation converges to the true value as B → ∞. The Monte Carlo algorithm has been developed for a mixture density to estimate the comparison output. A hierarchical resampling strategy is used to reproduce the hierarchical variability in the inter-comparison population, through the following steps:
(1) (a) Choose at random an index, say k, of the k-th laboratory by randomly resampling with replacement from the set {1, ... , N}:
Prob{K = k} = π_k.
(b) Given k, generate, at random from the selected distribution F_k, a bootstrap value θ* in [λ_{kl}, λ_{ku}].
Repeat Step 1 B times to simulate the full bootstrap sample θ*_1, ... , θ*_B.
(2) Approximate the bootstrap mixture distribution as in (4.3) to compute:
— the bootstrap estimate of the expected mean
θ*_B = (1/B) Σ_{b=1}^B θ*_b,
(4.4)
Lab1 (-0.05; 0.15) [-0.347, 0.247]    Lab2 ( 0.03; 0.30) [-0.564, 0.624]
Lab3 ( 0.18; 0.15) [-0.117, 0.477]    Lab4 ( 0.04; 0.15) [-0.257, 0.337]
Lab5 ( 0.71; 0.15) [ 0.413, 1.007]    Lab6 (-0.01; 0.15) [-0.307, 0.287]
Lab7 (-0.03; 0.15) [-0.327, 0.267]

TAB. 1. Inter-comparison of 7 laboratories [7]: point estimates and simulated interval data.
— the bootstrap standard deviation Sd*,
— the 95%CI [θ*_l, θ*_u], where the two extremes are computed as the α-th quantiles (α = 0.025 and α = 0.975) of the bootstrap distribution, q*_α = Ĥ^{-1}(α), hence θ*_l = q*_{0.025} and θ*_u = q*_{0.975}.
In Step 1(b) the inverse transformation method has been used for simulating a random variable X having a continuous distribution F_k. For example, X = F_k^{-1}(U), for a U(λ_{kl}, λ_{ku}) random variable. In Step 2 the bootstrap CI has been computed by means of the percentile method (see footnote). However, when the normal distribution is involved
in the mixture, the t-bootstrap method gives more appropriate results [4]. To determine
B in approximating the bootstrap confidence interval, the coefficient of variation [4] can be used. The value of B is increased until the coefficient of variation cv of the sample quantile approaches the given precision δ_0. Indeed, from a metrological point of view, it appears easier to choose δ_0 instead of B as the stopping rule in Step 1.
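A compact numerical sketch of the hierarchical resampling above (Steps 1 and 2), using the interval data of Table 1, equal weights and the uniform model for every laboratory; the rng seed and the fixed B are illustrative assumptions:

```python
import numpy as np

# Parametric bootstrap for a mixture of N uniform densities, pi_i = 1/N.

rng = np.random.default_rng(1)
intervals = np.array([[-0.347, 0.247], [-0.564, 0.624], [-0.117, 0.477],
                      [-0.257, 0.337], [ 0.413, 1.007], [-0.307, 0.287],
                      [-0.327, 0.267]])                  # Table 1, Labs 1-7
N, B = len(intervals), 2209

# Step 1: resample a lab index with probability 1/N, then draw a bootstrap
# value from the selected uniform density (inverse transform method).
k = rng.integers(0, N, size=B)
lo, hi = intervals[k, 0], intervals[k, 1]
theta_star = lo + (hi - lo) * rng.random(B)

# Step 2: bootstrap mean (4.4), standard deviation, and the percentile
# 95%CI from the 0.025 and 0.975 sample quantiles.
theta_B = theta_star.mean()
sd_B = theta_star.std(ddof=1)
ci = np.quantile(theta_star, [0.025, 0.975])
```

The estimates land near the values reported in Section 5 for this data set (mean around 0.13, standard deviation around 0.33).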
We would like to have also an automatic tool to investigate how well every laboratory
contributes to the comparison, or to detect the possible presence of heterogeneous data.
Here the concept of jackknife-after-bootstrap has been adopted to compute the mean and the bootstrap 95%CI. It is simply obtained by the following algorithm:
— for i = 1, ... , N, leave out the i-th lab and compute θ*_B(-i) and q*_α(-i),
— compare the N jackknife estimates to detect outlier values.
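The leave-one-out loop can be sketched as follows (an illustration with the Table 1 intervals and the uniform mixture; only the bootstrap standard deviation is tracked here):

```python
import numpy as np

# Jackknife-after-bootstrap: leave out one lab at a time, rerun the
# parametric bootstrap, and compare the N estimates.

rng = np.random.default_rng(2)
intervals = np.array([[-0.347, 0.247], [-0.564, 0.624], [-0.117, 0.477],
                      [-0.257, 0.337], [ 0.413, 1.007], [-0.307, 0.287],
                      [-0.327, 0.267]])                  # Table 1, Labs 1-7

def bootstrap_sd(ivals, B=1000):
    k = rng.integers(0, len(ivals), size=B)
    lo, hi = ivals[k, 0], ivals[k, 1]
    theta = lo + (hi - lo) * rng.random(B)
    return theta.std(ddof=1)

# Leaving out Lab5 (index 4) shrinks the spread markedly, flagging it as unusual.
sd_all = [bootstrap_sd(np.delete(intervals, i, axis=0)) for i in range(7)]
```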
5
An application in thermometry
The proposed method is shown applied to an inter-comparison of Temperature Fixed
Points, involving N = 7 laboratories [7]. Each lab provided data x_i with the 95% standard
uncertainty (Table 1: first item).
The second item (square brackets in the same table) represents the interval data generated with (3.2), which was used to perform this simulated example. Since no specific
pdf was supplied, the mixture distribution density has been constructed assuming the
uniform type for each participant and equal weights. The parameters of every uniform density were computed using the interval data, and the obtained mixture density was used
in the resampling step of the algorithm to compute the representative value and its
²The percentile method for a statistic θ, based on B bootstrap samples, simply takes as the α-percentile the (αB)-th largest of the θ*_b.
FIG. 1. Bootstrap histograms, B = 2209: left, mixture of 7 uniform densities; right, mixture of 6 uniform plus one RT density for Lab5.
probability interval with δ_0 = 0.05. In Figure 1 (left) the bootstrap histogram, which approximates the mixture density, shows a bimodal behaviour. The computations are obtained for δ_0 = 0.05, or B = 2209: θ* = 0.14, bootstrap standard deviation Sd* = 0.33, 95%CI [-0.35, 0.92].
The proposed algorithm was also applied with a mixture of seven normal densities, and the results are θ* = 0.13, Sd* = 0.43, bootstrap 95%CI [-0.61, 1.1] for B = 4752. The effect of assuming unlimited symmetric distributions to model the output pdf results in a wider 95%CI for a mixture of normal densities.
By comparing the jackknife results in Table 2, Lab5 appears to supply unusual values. To directly consider this behaviour in the inter-comparison, a mixture of six uniform densities plus a RT density, identifying Lab5, has been constructed. The approximated bootstrap distribution is displayed in Fig. 1 (right), with bootstrap estimates θ* = 0.15, standard deviation Sd* = 0.35 and [-0.35, 0.96] for the bootstrap 95%CI, obtained for B = 2209.
6
Conclusions
The problem of the inter-comparison data has been described, and a new approach has
been proposed. It is based on the uncertainty estimates, which should be provided by each laboratory as an interval estimate at 95% confidence level, together with information, possibly partial, on the probability function. The constructive procedure directly characterises
the stochastic variability of the reference value of the inter-comparison, by means of a
mixture density model. The result of an inter-comparison is then viewed as a random
variable, not directly measured, being the output of a complex process that involves measures, statistical information and metrological considerations. These considerations suggest constructing a mixture, with weights π_i, to take into account each participating laboratory according to its credibility.
Lab1 0.34 [-0.45, 0.92]    Lab2 0.32 [-0.31, 0.94]
Lab3 0.34 [-0.40, 0.91]    Lab4 0.34 [-0.35, 0.92]
Lab5 0.23 [-0.42, 0.48]    Lab6 0.34 [-0.36, 0.95]
Lab7 0.34 [-0.42, 0.92]

TAB. 2. Jackknife-after-bootstrap estimates: standard deviation and 95%CI for the mixture of 6 uniform densities (B = 1000); in the i-th item, Lab i is left out.
The parametric bootstrap approach has been adopted to estimate, in a simple and automatic way, the inter-comparison output, where information, even partial, on the probability distributions of the participating laboratories has been taken into account.
The method can be applied even with a limited number of laboratories, as is shown in the thermal example, where N = 7 and the experimental conditions implied the adoption of skewed distributions. The automatic jackknife method of detecting heterogeneous
data succeeded in revealing an unusual value. To take this condition into account, a mixture of six uniform densities plus an RT density identifying Lab5 could better be used.
The choice of equal weights emphasises that all the standards have equally contributed
to the inter-comparison.
The bootstrap procedure, completely developed for a class of five simple distribution functions often used in thermal metrology, could be adapted to consider other distributions, when the synthetic data information provided by the laboratories, as summarised in Section 2, allows the computation of the mixture parameters.
Bibliography
1. K. E. Basford, D. R. Greenway, G. J. McLachlan and D. Peel, Standard errors
of fitted component means of normal mixtures. Computational Statistics 12, 1-17,
1997.
2. P. Ciarlini et al., Non-parametric bootstrap with application to metrological data. In: Advanced Mathematical Tools in Metrology, Series on Advances in Mathematics for Applied Sciences, 16, Ciarlini, Cox, Monaco, Pavese, eds., World Scientific, Singapore, 219-230, 1994.
3. M. Cox, A discussion of approaches for determining a reference value in the analysis of key-comparison data. In: Advanced Mathematical and Computational Tools in Metrology IV, Series on Advances in Mathematics for Applied Sciences, 53, Ciarlini, Cox, Pavese, Richter, eds., World Scientific, Singapore, 45-65, 2000.
4. B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, 1993.
5. B. S. Everitt, Finite Mixture Distributions, Chapman and Hall, London, 1981.
6. ISO, Guide to the Expression of Uncertainty in Measurement, Geneva, Switzerland,
1995.
7. F. Pavese, Monograph 84/4 of Bureau International des Poids et Mesures, BIPM
Sevres, 1984.
145
Efficient algorithms for structured self-calibration
problems
Alistair B. Forbes
National Physical Laboratory, Teddington, Middlesex TW11 0LW, UK.
alistair.forbes@npl.co.uk
Abstract
Self-calibration techniques have been used extensively in co-ordinate metrology. At their
most developed, they are able to extract all systematic error behaviour associated with
the measuring instrument as well as determining the geometry of the artefact being
measured. However, this is generally at the expense of introducing extra parameters
leading to moderately large observation matrices. Fortunately, these matrices tend to
have sparse, block structure in which the nonzero elements are confined to much smaller
submatrices. This structure can be exploited either in direct approaches in which QR
factorisations are performed or in iterative algorithms which depend on matrix-vector
multiplications. In this paper, we describe self-calibration approaches associated with high
accuracy, dimensional assessment by co-ordinate measuring systems, highlighting how the
associated optimisation problems can be presented compactly and solved efficiently. The
self-calibration techniques lead to uncertainties significantly smaller than can be expected
from standard methods.
1
Introduction
An important activity in metrology is the calibration of instruments and artefacts. Calibration defines a rule which converts the values output by the instrument's sensor(s)
to values that can be related to the appropriate standard (SI or derived) units. Importantly, to these calibrated values it is required to assign uncertainties that reliably take
into account the uncertainties of all quantities that have an influence. As a consequence,
the size and complexity of the computational tasks associated with the data analysis can
be significant, even for instruments that appear to be of simple design and operation.
It is thus beneficial to design and implement algorithms that are efficient with respect
to computation and memory. Fortunately, many of the calibration problems give rise to
systems of equations with a well defined sparsity structure.
The rest of this paper is organised as follows. In Section 2 we review least squares
approaches to calibration problems and go on to describe self-calibration problems in
co-ordinate metrology in Section 3. Sections 4 and 5 describe solution methods for two
types of sparsity structure. Our concluding remarks are given in Section 6.
2
Least squares solution to calibration problems
In many calibration problems, the observation equations involving measurements y_i,
can be expressed as y_i = φ_i(a) + ε_i, where φ_i is a function depending on parameters a = (a_1, ... , a_n)^T specifying the behaviour of the instrument, and ε_i represents random measurement error. For a set of measurement data {y_i}, i = 1, ... , m, best estimates a* of the calibration parameters a are determined by solving
min_a Σ_{i=1}^m f_i²(a) = f^T f,
(2.1)
where f_i(a) = y_i - φ_i(a). The most common approach to solving this problem is derived
from the Gauss-Newton algorithm; see, for example, [5]. If a is an estimate of the solution and J is the Jacobian matrix defined at a by J_{ij} = ∂f_i/∂a_j, then an updated estimate of the solution is a + p, where p solves the Jacobian system
J p = -f
in the least squares sense. Starting with an appropriate initial estimate of a, these steps are repeated until convergence criteria are met.
A numerically stable method of solving the Jacobian system is to find a factorisation J = QR, where Q is an m × n orthogonal matrix and R is an upper-triangular matrix of order n (see, e.g., [1, 6]). The solution p is determined efficiently by solving the upper-triangular system
R p = -Q^T f
using back substitution. The matrix Q can be constructed using either Householder
reflections, which process the Jacobian matrix a column at a time, or Givens plane
rotations, which process the matrix row-wise. For either approach the orthogonal factorisation requires O(mn²) operations.
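The direct approach above can be sketched in a few lines (synthetic dense data; `solve_triangular` performs the back substitution):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Solve the Jacobian system J p = -f in the least squares sense via the
# thin QR factorisation J = QR.

rng = np.random.default_rng(3)
m, n = 100, 5
J = rng.standard_normal((m, n))
f = rng.standard_normal(m)

Q, R = np.linalg.qr(J)                     # thin QR: Q is m x n, R is n x n
p = solve_triangular(R, -Q.T @ f)          # back substitution on R p = -Q^T f

# Agrees with a reference least squares solution.
p_ref, *_ = np.linalg.lstsq(J, -f, rcond=None)
```

For large sparse Jacobians, `scipy.sparse.linalg.lsqr(J, -f)` provides an iterative alternative based only on matrix-vector products.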
An alternative to the direct approaches to solve matrix equations is to use iterative
procedures based on conjugate gradients. The advantage of these approaches is that they
involve only matrix-vector multiplications and for sparse matrices these multiplications
can be made efficient. In particular, the LSQR algorithm of Paige and Saunders [7]
implements an iterative approach to solving linear least squares problems.
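A minimal sketch of this iterative alternative uses SciPy's implementation of the Paige-Saunders LSQR algorithm; the randomly generated sparse matrix and the tolerances are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsqr

# Sparse least squares min ||Ap - b||: LSQR touches A only through the
# products A @ v and A.T @ u, which are cheap when A is sparse.
rng = np.random.default_rng(0)
A = sparse_random(500, 60, density=0.1, random_state=rng, format="csr")
b = rng.standard_normal(500)

p = lsqr(A, b, atol=1e-12, btol=1e-12)[0]
grad = A.T @ (b - A @ p)   # normal-equation residual, near zero at the solution
```

The same call works unchanged if `A` is replaced by a `LinearOperator` that only supplies the two matrix-vector products.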
Often, linear equality constraints on the parameters of the form C^T a = c, where C
is an n × p matrix, p < n, are required to eliminate degrees of freedom in the problem.
However, we can use orthogonal projections to eliminate these constraints. Suppose C
is of full column rank and has QR factorisation

    C = [ V_1  V_2 ] [ S ]
                     [ 0 ],

where V_1 and V_2, respectively, are the first p and last n − p columns of the orthogonal
factor V. If a_0 is a solution of C^T a = c, then for any (n − p)-vector α, a = a_0 + V_2 α
automatically satisfies the constraints and the optimisation problem can be reformulated
as the unconstrained non-linear least squares problem
    min_α Σ_{i=1}^m f_i²(a_0 + V_2 α),
involving the reduced set of parameters α. We note that the associated Jacobian matrix
is simply J~ = J V_2, with J_ij = ∂f_i/∂a_j as before.
Unfortunately, even if J has structure, J~ = J V_2 could be full. For indirect approaches,
this is of little consequence, since the matrix-vector multiplications can be formed in two
stages (e.g., y = V_2 x, z = J y), each of which can be implemented efficiently. For a direct
approach, it may be possible to implement the constraints in such a way as to minimise
the amount of fill-in during the orthogonal factorisation stage.
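The projection-based elimination of C^T a = c described above can be sketched as follows; this is an assumed NumPy implementation (the helper name `eliminate_constraints` is invented), not the author's code.

```python
import numpy as np

def eliminate_constraints(C, c):
    """QR-based elimination of C^T a = c (C: n x p, full column rank).
    Returns a0 satisfying the constraints and V2 whose columns span null(C^T),
    so that a = a0 + V2 @ alpha satisfies C^T a = c for every alpha."""
    n, p = C.shape
    V, R = np.linalg.qr(C, mode="complete")   # C = V [S; 0]
    S = R[:p, :]                              # p x p upper triangular
    V1, V2 = V[:, :p], V[:, p:]
    a0 = V1 @ np.linalg.solve(S.T, c)         # C^T a0 = S^T (V1^T V1) solve(S^T, c) = c
    return a0, V2

# Any a = a0 + V2 @ alpha then satisfies the constraints automatically.
rng = np.random.default_rng(1)
C = rng.standard_normal((6, 2))
c = rng.standard_normal(2)
a0, V2 = eliminate_constraints(C, c)
alpha = rng.standard_normal(4)
a = a0 + V2 @ alpha
```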
3 Self-calibration problems in co-ordinate metrology
Co-ordinate metrology is concerned with defining the geometry of two- and three-dimensional artefacts from measurements of the co-ordinates of points related to the surface
of the artefacts. It is a key discipline in quality and process control in manufacturing
industry. In a (conventional) co-ordinate measuring machine (CMM) with three mutually orthogonal linear axes, the position of the probe tip centre is inferred from scale
readings on each of the three machine axes. In practice, CMMs have imperfect geometry
with respect to the straightness of the axes, the squareness of pairs of axes and rotations
describing roll, pitch and yaw, and these systematic errors have to be taken into account
if the accuracy potential of the CMM is to be more fully realised. Two approaches can
be adopted to nullify the effect of these systematic errors. The first - error mapping - involves performing a set of experiments to characterise as completely as possible the
error behaviour of the instrument and then using error correction software to produce more
accurate co-ordinate estimates. The disadvantages of this approach are, firstly, the set
of experiments is expensive to perform and, secondly and more importantly, the error
behaviour of the CMM is likely to drift so that, for example, an error correction valid on
Monday will only be partially valid on Friday and may be of limited value a month later.
The second approach - self-calibration - attempts to use any approximate symmetries,
rotational or translational, of the artefact so that systematic errors associated with the
measuring system are identified as part of the measurement process [4]. The advantage
of this method is that the effect of systematic error behaviour of the instrument is cancelled out and the accuracy of the measurements is limited only by the smaller, random
component.
3.1 Calibration of reference artefacts in two dimensions
As an example, we consider the accurate calibration of 2-dimensional artefacts by a two
dimensional CMM. The artefacts define the location of targets nominally aligned in a
grid pattern. Let y_j, j = 1, ..., n_Y, be the locations of the targets in a fixed frame of
reference, and let

    y_{j,k} = T(y_j, t_k)

be the location of the jth target in the kth measuring position. Here, the roto-translation
T is specified by three parameters t_k defining the translation vector and angle of rotation.
We suppose the systematic error of the two-dimensional CMM can be expressed as

    x* = x*(x, b) = x + e(x, b),
where x* are the true point co-ordinates, x are the indicated point co-ordinates output by
the machine and e(x, b) is the error correction term depending on x and error parameters
b. For instance, suppose the model describes scale and orthogonality errors so that
    x* = x(1 + b_1) + y(1 + b_2) sin b_3,
    y* = y(1 + b_2) cos b_3.
If x_i is the measurement of the jth target with the artefact in the kth position, then the
associated observation equation is

    x_i + e(x_i, b) = y_{j,k} + ε_i.    (3.1)
Given a set of such measurements {x_i}_{i=1}^{m_X} and associated index functions (j(i), k(i))
specifying the targets and artefact positions, estimates of the model parameters can be
determined by solving the non-linear least squares problem

    min_{{y_j}, {t_k}, b} Σ_{i=1}^{m_X} f_i^T f_i,

where f_i(y_{j(i)}, t_{k(i)}, b) = x_i + e(x_i, b) − y_{j(i),k(i)}.
The model involves three sets of parameters: the target locations {y_j}, the transformation parameters {t_k} and the error parameters b. Each observation equation depends
on only one target and one transformation, so that the Jacobian matrix J of partial
derivatives can be ordered to have a block-angular structure [2]

    J = [ K_1                      J_1     ]
        [      K_2                 J_2     ]
        [            ...           ...     ]
        [                 K_{m_X}  J_{m_X} ],
where K_j corresponds to the parameters y_j and the border blocks {J_i} correspond to
the border parameters a = ({t_k}, b). The frame of reference for the targets {y_j} can be
specified by applying three appropriate linear equality constraints on the transformation
parameters {t_k}.
While scale and orthogonality errors are often major contributors to the systematic
error behaviour of a CMM, there is no guarantee, nor does experience show, that they
explain the full extent of the behaviour. For this reason, more comprehensive models have
been developed [3, 9]. However, they all depend on the approximation of actual behaviour
by empirical functions such as polynomials, and the adequacy of the approximation is
often difficult and expensive to evaluate. If, however, we always rotate and translate the
artefact according to the symmetries of the reference artefact so that the targets are
always located (nominally) at a subset of a fixed grid of points in the CMM's working
volume, then measurements are made at a finite number of machine locations. To the
lth location we associate a machine error e_l. If the ith measurement is made at the lth
location, then the observation equation corresponding to (3.1) is

    x_i + e_l = y_{j,k} + ε_i.
The advantage of this error model is that it entails no significant approximation: the
FIG. 1. Sparsity structure of the transpose of the Jacobian matrix associated with the measurement of a 5 × 5 hole plate in eight positions (NZ = 1784).
systematic errors are modelled exactly. An apparent disadvantage is that there are likely
to be as many error parameters as target parameters giving rise to a sparsity structure
in the Jacobian matrix for which direct, structure-exploiting methods provide relatively
minor efficiency gains. Figure 1 shows on the left the sparsity structure of the Jacobian
matrix J associated with the measurement of a 5 x 5 hole plate in eight positions, the
first four corresponding to rotations by 0, 90, 180 and 270 degrees, the second four
incorporating a translation as well as a rotation. In each position the location of the
targets y_j are measured in order. The nonzero elements of the matrix are represented
by a dot. The first (second) 50 columns correspond to the derivatives with respect to
the machine error parameters e_l (target parameters y_j) and the last 24 correspond to
the eight sets of transformation parameters t_k. On the right the sparsity structure of the
triangular factor of J is illustrated and shows the substantial fill-in that occurs.
In the next two sections, we describe approaches for dealing efficiently with block-angular and more general sparse-block structure.
4 Algorithms for block-angular systems
We consider non-linear least squares problems where the optimisation parameters can
be partitioned into two sets η = {y_j}_{j=1}^{n_Y} and a, and such that each observation equation
involves a and at most one set of parameters y_j. Corresponding to (2.1), we have instead
an objective function of the form

    F(η, a) = f_0^T(a) f_0(a) + Σ_j f_j^T(y_j, a) f_j(y_j, a).
The associated Jacobian matrix J and its triangular factor R can be arranged to have
the forms

    J = [ K_1                J_1 ]        R = [ R_1                B_1 ]
        [      K_2           J_2 ]            [      R_2           B_2 ]
        [           ...      ... ]            [           ...      ... ]
        [                K_n J_n ]            [                R_n B_n ]
        [                    J_0 ]            [                    R_0 ].
The nonzero blocks of the matrix R can be stored compactly in a vector r, row by row.
Efficient updating strategies for such triangular factors have been incorporated into
a non-linear least squares solver to deal with block-angular problems. It is assumed that
the Jacobian matrix is composed of n_B blocks of rows, with the ith block depending on
at most one set of parameters y_j, j = j(i). The user is required to supply a function and
gradient evaluation module that, given η, a and 1 ≤ i ≤ n_B, returns j = j(i) and

    f_i(a), J_i,              j = 0,
    f_i(y_j, a), J_i, K_i,    j > 0.
For each i, the triangular factor and right-hand side vector are updated by the ith block
of rows:

    [ R_{j(i)}  B_{j(i)} ]       [ R_{j(i)}  B_{j(i)} ]             [ R_0  ]       [ R_0 ]
    [ K_i       J_i      ]  -->  [ 0         J~_i     ],  and then  [ J~_i ]  -->  [ 0   ].
Linear equality constraints on the border parameters a, implemented using the orthogonal
projection approach, can be incorporated by setting J_i := J_i V_2 at the appropriate stage.
5 Algorithms for sparse-block matrices
Let the m × n matrix S be composed of n_B submatrices S_k of dimension m_k × n_k. We
assume that S_k is stored (column-wise or row-wise) as a column vector s_k. The information in S can be encoded in a column vector s_I and an indexing set I_S such that
I_S(1:5, k) = (i_k, j_k, m_k, n_k, l_k), where (i_k, j_k) specifies the location of S_k(1,1) in S and
l_k indicates that s_k = s_I(l_k : l_k + m_k n_k − 1). Blocks of such matrices can be easily
represented by concatenating the s-vectors and index matrices I_S and performing some
trivial index modifications. Matrix-vector multiplications of the form y := αSx + βy are
easily implemented through a sequence of full matrix multiplications: y := βy, followed
by

    y(i_k : i_k + m_k − 1) := y(i_k : i_k + m_k − 1) + α S_k x(j_k : j_k + n_k − 1),

k = 1, ..., n_B. A similar scheme calculates x := α S^T y + β x. The storage and multiplication scheme can be modified to take into account the type or structure of the submatrices S_k.
To implement linear equality constraints, it is required to perform matrix multiplication by a submatrix V_2 of the orthogonal factor of the constraint matrix C. A simple
scheme can be implemented using the LAPACK routines DGEQRF (orthogonal factorisation) and DORMQR (matrix multiplication by an orthogonal matrix stored as a
product of Householder matrices) [8].
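The storage-and-multiply scheme above can be sketched as follows. This is a simplified sketch: each block is kept as a dense array with its top-left indices, rather than in the concatenated vector s_I of the text.

```python
import numpy as np

def block_matvec(blocks, x, y, alpha=1.0, beta=0.0):
    """y := alpha*S*x + beta*y for S stored as a list of dense blocks,
    each a tuple (i, j, Sk) placing Sk with its (0, 0) entry at S[i, j]."""
    y = beta * y
    for i, j, Sk in blocks:
        mk, nk = Sk.shape
        y[i:i + mk] += alpha * (Sk @ x[j:j + nk])   # touch only the block's rows
    return y

# Two 2x2 blocks on the diagonal of a 4x4 matrix S (illustrative data).
blocks = [(0, 0, np.array([[1.0, 2.0], [0.0, 1.0]])),
          (2, 2, np.array([[3.0, 0.0], [1.0, 1.0]]))]
x = np.array([1.0, 1.0, 1.0, 1.0])
y = block_matvec(blocks, x, np.zeros(4))
```

The transposed product x := αS^T y + βx follows the same pattern with the roles of the index ranges exchanged and `Sk.T` in place of `Sk`.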
FIG. 2. Residual errors associated with the first 1000 observations for models a) with
no error separation (dots) and b) with error separation.
We have implemented a non-linear least squares solver for sparse-block systems. The
user is required to supply a module that takes as input the current estimate a of the
optimisation parameters and outputs the function values f (a) and the Jacobian matrix
stored in sparse-block form (s_I, I_S). The solver implements a Gauss-Newton approach
using the LSQR solver to find the Gauss-Newton step and caters in a straightforward
way for linear equality constraints. The solver has been successfully tested in a number
of self-calibration problems. For example, it was used recently in the calibration of a
13 X 13 grid of targets on a glass plate by a CMM with an optical probing system. The
problem involved approximately 15,000 observation equations in over 800 optimisation
parameters and was solved in a few tens of seconds using a standard laboratory PC (450
MHz). The advantage of the error separation model is illustrated in Figure 2 which shows
the residual errors associated with the first 1000 observations for models a) with no error
separation (dots) and b) with error separation. The fit for the error separation model is
much superior. The practical metrological consequence of adopting the enhanced model
is that uncertainties associated with the target locations can be reduced by a factor
of five. Importantly, because the model is a realistic approximation of the measuring
system, we can have confidence in the uncertainty estimates derived from the model.
6 Concluding remarks
The move to more accurate measurement systems has led to more comprehensive models
of the measuring instrument and its interaction with the physical quantity being measured. These models include parameters that describe properties of the instrument and
those of the measurand. The aim of self-calibration experiments is to determine as much
as possible about both sets of parameters from a set of measurement experiments. For
models with a small to modest set of parameters, a full matrix approach may be acceptable. For larger systems, exploitation of sparsity structure in the defining equations is
highly desirable and often a stark necessity if the computations are to be made in an acceptable time using the computing resources to hand. The exploitation of block-angular
structure has been well-known and well-used in some areas of metrology. The supporting
numerical technology based on structured orthogonal factorisations is mature, compact
and easily implemented using standard numerical linear algebra. However, these techniques could be applied more widely in metrology, making feasible approaches that would
have to be rejected if only full matrix methods were used.
The use of sparse matrix techniques is relatively rare within metrology. We have
attempted to show here that in self-calibration problems in dimensional metrology, they
allow us to develop improved models that provide vastly superior fits to the data, with
corresponding improvements in the evaluated uncertainties in the fitted parameters. The
supporting numerical technology is maturing and accessible.
Acknowledgements: This work has been supported by the Department of Trade and
Industry's National Measurement System Software Support for Metrology Programme
and undertaken by a project team at the Centre for Mathematics and Scientific Software,
National Physical Laboratory. The author is particularly thankful to Maurice Cox, Peter
Harris and Ian Smith for their contributions.
Bibliography
1. A. Bjorck. Numerical Methods for Least Squares Problems. SIAM, Philadelphia,
1996.
2. M. G. Cox. The least-squares solution of linear equations with block-angular observation matrix. In M. G. Cox and S. Hammarling, editors, Advances in Reliable
Numerical Computation, pages 227-240. Oxford University Press, 1989.
3. M. G. Cox, A. B. Forbes, P. M. Harris, and G. N. Peggs. Experimental design in
determining the parametric errors of CMMs. In V. Chiles and D. Jenkinson, editors,
Laser Metrology and Machine Performance IV, pages 13-22, Southampton, 1999.
WIT Press.
4. A. B. Forbes and I. M. Smith. Self-calibration and error separation techniques in
metrology. In P. Ciarlini, M. G. Cox, E. Filipe, F. Pavese, and D. Richter, editors,
Advanced Mathematical and Computational Tools in Metrology V, pages 149-163,
Singapore, 2001. World Scientific.
5. P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Academic Press,
London, 1981.
6. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University
Press, Baltimore, third edition, 1996.
7. C. C. Paige and M. A. Saunders. LSQR: an algorithm for sparse linear equations
and sparse least squares. ACM Transactions on Mathematical Software, 8(1), 1982.
8. The LAPACK Users' Guide, third edition. SIAM, Philadelphia, 1999.
9. G. Zhang, R. Ouyang, B. Lu, R. Hocken, R. Veale, and A. Donmez. A displacement
method for machine geometry calibration. Annals of the CIRP, 37:515-518, 1998.
On measurement uncertainties derived from
"Metrological Statistics"
Michael Grabe
Am Hasselteich 5, 38104 Braunschweig, Germany.
michael.grabe@ptb.de
Abstract
As measurement uncertainties are closely tied up with error models, it might be of interest to review a model which the author assigns to "Metrological Statistics". Given
that the random errors are normally distributed, the experimentalist could refer either to
B. L. Welch's concept of "effective degrees of freedom" or to the multidimensional Fisher-Wishart distribution density. In the first case, different numbers of repeated measurements are admissible; in the latter it is strictly required to have equal numbers of repeated
measurements. In error propagation, however, only the latter mode of action opens up
the possibility of designing confidence intervals according to Student and confidence ellipsoids according to Hotelling. Another point of view, closely linked to the choice of the
numbers of repeated measurements, refers to the customary practice of attributing equal
rights to statistical expectations and empirical estimators. However, the Fisher-Wishart
distribution density suggests using only the information which is realistically accessible to
experimentalists, namely empirical estimators. For the handling of unknown systematic
errors, either the existence of a (rectangular) distribution density may be assumed or,
and this is proposed here, they may be classified as time-constant quantities, biasing expectations and suspending a lot of tools and procedures of error calculus well-established
otherwise.
1 Introduction
The joint propagation of random errors and unknown systematic errors currently places
the experimentalist in the following dilemma.
In regard to the propagation of random errors, there are, at least in principle, two
different choices. If one is willing to accept unequal numbers of repeated measurements
of the physical quantities to be combined within a given function, one has, in order to
express the influence of random errors, to resort to B. L. Welch's sophisticated concept
of so-called numbers of effective degrees of freedom [8]. However, this procedure is tied
up with difficulties: it is restricted to independent variables.
Though B. L. Welch's concept completely exhausts the information implied in measured data, unfortunately, from a metrological point of view, it is cumbersome to handle
and obstructs the view of existing simpler procedures. On the other hand, if the experimentalist preferred equal numbers of repeated measurements, he would - if need
be - have to give away part of his information, namely that which is carried by the
excessive numbers of repeated measurements of the variables involved. Up to now, the
disregarding of excessive numbers is regarded as unfavourable. In spite of this view,
just this precaution opens up a toolbox of applied statistics hitherto closed to metrologists, as only with equal numbers of repeated measurements is the experimentalist in
a position to call upon the standard model of statistics for jointly normally distributed
random variables, i.e. the Fisher-Wishart density [3]. The advantages gained in that
way outweigh by far the "lost information", as relatively few repeated measurements of
experimental set-ups, operating in a stationary mode, are able to locate accurately the
respective physical quantities. After all, in error propagation the experimentalist may
define confidence intervals according to Student (Gosset) including any number of variables. In least squares, he may even establish multidimensional confidence intervals, and
last but not least, certain problems of classical error calculus, such as the Fisher-Behrens
problems no longer arise.
In regard to the interpretation and propagation of unknown systematic errors, the
situation is not simpler. Let us assume that an unknown systematic error f, constant in
time, is confined to an interval of the kind¹

    −f_s ≤ f ≤ f_s,    f_s > 0.    (1.1)
Now, the experimentalist may either assign a postulated probability density to f, usually
a rectangular density [7],

    p(f) = 1/(2 f_s),    (1.2)

or he may set, without exception,

    f = constant,    (1.3)

where f lies anywhere within (1.1). The latter interpretation introduces biased estimators, leading to a break-down of many procedures of error calculus otherwise well-established.
Seen mathematically, both interpretations can be justified. In the case of (1.2),
the combination of random and systematic errors should be carried out geometrically;
in the case of (1.3), arithmetically. Regarding (1.3), the author suggests adding linearly Student's confidence intervals to appropriately designed worst-case estimates of
the propagated systematic errors, and no probability statements should be associated
with overall uncertainties so defined.
2 Error propagation
The fundamental error equations of Metrological Statistics are given as follows [4]. Let
x_0 designate the true value of the physical quantity x to be measured. Furthermore, let
ε_l be the random error and f_x = constant the unknown systematic error corresponding
¹Should the interval be unsymmetrical about zero, it could be symmetrized by subtracting the halved sum of the
upper and lower boundaries; the same quantity would have to be subtracted from the data.
to (1.1). We then have

    x_l = x_0 + ε_l + f_x,    l = 1, ..., n.    (2.1)

Let μ_x = x_0 + f_x be the expectation of the random variable X = {x_1, x_2, ..., x_n}, so
that the x_l are some of its realizations. We then find

    x_l = μ_x + ε_l,    l = 1, ..., n.    (2.2)
Furthermore, let x̄ = (1/n) Σ_{l=1}^n x_l denote the arithmetic mean. We then have the useful
identities

    x_l = x_0 + (x_l − μ_x) + f_x,    x̄ = x_0 + (x̄ − μ_x) + f_x.    (2.3)
While the arithmetic mean is biased, the empirical variance

    s_x² = 1/(n − 1) Σ_{l=1}^n (x_l − x̄)²    (2.4)

is not. For the time being, let us consider just two quantities to be measured, x and y.
As robust and simple uncertainty assessments are a matter of linearization, the overall
uncertainty u_φ of a given function φ(x, y) is proposed to be [5]

    u_φ = [t_P(n−1)/√n] · sqrt[ (∂φ/∂x)² s_x² + 2 (∂φ/∂x)(∂φ/∂y) s_xy + (∂φ/∂y)² s_y² ]
          + |∂φ/∂x| f_{s,x} + |∂φ/∂y| f_{s,y},    (2.5)
where t_P(n − 1) is the Student factor corresponding to a confidence level P. We distinctly see how the empirical covariance

    s_xy = 1/(n − 1) Σ_{l=1}^n (x_l − x̄)(y_l − ȳ)

enters the empirical variance of the φ(x_l, y_l), l = 1, ..., n, given by

    s_φ² = (∂φ/∂x)² s_x² + 2 (∂φ/∂x)(∂φ/∂y) s_xy + (∂φ/∂y)² s_y².
The final result

    φ(x̄, ȳ) ± u_φ    (2.6)

is expected to localize the true value φ(x_0, y_0) with "reasonable certainty" - but no
proper confidence statement should be added, as u_φ is a mixture of a statistical and
a non-statistical component. The last term in (2.5) may overestimate the uncertainty;
on the other hand, linearization errors have been neglected. After all, this uncertainty
statement should fulfil the prerequisite to be safe, robust and simple.
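A numerical sketch of (2.5) for a hypothetical product function φ(x, y) = x·y; the repeated measurements, the bounds f_{s,x}, f_{s,y} and the confidence level are invented for illustration.

```python
import numpy as np
from scipy.stats import t

# Hypothetical repeated measurements of x and y (n = 10 each).
rng = np.random.default_rng(3)
n = 10
x = 5.0 + 0.01 * rng.standard_normal(n)
y = 3.0 + 0.01 * rng.standard_normal(n)
fsx, fsy = 0.02, 0.01          # assumed bounds on the unknown systematic errors

phi = lambda a, b: a * b       # example function phi(x, y)
xb, yb = x.mean(), y.mean()
dpx, dpy = yb, xb              # partial derivatives of phi at (xb, yb)

sx2, sy2 = x.var(ddof=1), y.var(ddof=1)        # empirical variances (2.4)
sxy = np.sum((x - xb) * (y - yb)) / (n - 1)    # empirical covariance
tP = t.ppf(0.975, n - 1)       # two-sided Student factor for P = 95%

s_phi = np.sqrt(dpx**2 * sx2 + 2 * dpx * dpy * sxy + dpy**2 * sy2)
u_phi = tP / np.sqrt(n) * s_phi + abs(dpx) * fsx + abs(dpy) * fsy
result = phi(xb, yb)           # quoted as result ± u_phi, as in (2.6)
```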
If there are m quantities to be measured, we replace the notation x, y by x_1, x_2, ..., x_m.
Then the overall uncertainty u_φ of the final result

    φ(x̄_1, x̄_2, ..., x̄_m) ± u_φ

is given by

    u_φ = [t_P(n−1)/√n] · sqrt[ Σ_{i=1}^m Σ_{j=1}^m (∂φ/∂x_i)(∂φ/∂x_j) s_ij ]
          + Σ_{i=1}^m |∂φ/∂x_i| f_{s,i}.    (2.7)
When (2.5) and (2.7) are compared, it becomes obvious that the proposed formalism
of error propagation works like a building kit, perspicuous and easy to handle. There are
arguments against (2.7), in particular that an experimentalist who wishes to design his
uncertainties in this way, would have to know the complete set of repeated measurements,
in other words, the complete empirical variance-covariance matrix
s = (sij),
2,i = 1,2,... ,m,
(2.8)
of the input data. Arguably, this is true, but in the days of computers and the internet
such a challenge should no longer be apt to provoke difficulties worth mentioning. Another argument, that (2.7) might overestimate overall uncertainties, should be judged in
view of the unique role of metrology in science. Standing "between" theory and experiment, metrology pursues the idea of localizing reliably the value of the physical quantity
in question.
3 Least squares
Let

    A β ≈ x    (3.1)

be a linear system of equations to be adjusted. Here, A designates the m × r design
matrix of rank r, β the r × 1 vector of unknowns and, finally, x the m × 1 vector of
the observations or input data. We assume m > r. The idea of least squares is of purely
geometrical origin.
In what follows, A^T denotes the transpose of A. The idea is to project the vector x
by means of a projection operator

    P = A (A^T A)^{−1} A^T    (3.2)

orthogonally onto the column space of the matrix A, and the result is

    β̄ = (A^T A)^{−1} A^T x.    (3.3)

As the solution vector β̄ is linear in the input data, the transfer of (2.7) to its components
β̄_k, k = 1, ..., r, is straightforward.
Clearly, the orthogonal projection is in no way dependent on the error model implied.
In contrast to this, the latter turns out to be crucial in regard to uncertainty assessments.
Let us consider a set of single observations

    x_i = x_{0,i} + ε_i + f_i = x_{0,i} + (x_i − μ_i) + f_i,    i = 1, ..., m,    (3.4)

being the input data, where E{X_i} = μ_i. Writing (3.4) in vector form, we have

    x = x_0 + (x − μ) + f,    (3.5)
where

    x = (x_1, x_2, ..., x_m)^T,    x_0 = (x_{0,1}, x_{0,2}, ..., x_{0,m})^T,
    μ = (μ_1, μ_2, ..., μ_m)^T,    f = (f_1, f_2, ..., f_m)^T,    −f_{s,i} ≤ f_i ≤ f_{s,i}.
Given equal variances σ² = E{(X_i − μ_i)²}, the minimized sum Q_min of squared
residuals of the adjusted system (3.1) should yield, according to quite familiar procedures,
an estimator s² ≈ σ². However, from

    Q_min = (x − P x)^T (x − P x),

we obtain something different, namely

    E{Q_min} = σ²(m − r) + f^T f − f^T P f.    (3.6)
As we see, even the simplest of all associated least squares procedures breaks down,
should the model of time-constant unknown systematic errors be accepted. At the same
time the related basic tool linked to Q_min and frequently used, namely the test of consistency of the input data, based on the criterion

    Q_min/s² ≈ m − r,

breaks down as well. Indeed, during many decades, time and again, the observation

    Q_min/s² > m − r

has stunned experimentalists [2], so that, in the adjustments of the fundamental physical
constants, even the abolition of least squares has been considered [1]. However, in view
of (3.6), these observations are understandable.
After all, a least squares adjustment of biased input data requires arithmetic means

    x̄_i = x_{0,i} + (x̄_i − μ_i) + f_i,    i = 1, ..., m,    (3.7)

so that the empirical variances and covariances

    s_ij = 1/(n − 1) Σ_{l=1}^n (x_il − x̄_i)(x_jl − x̄_j),    s_ii = s_i²,    (3.8)

are known a priori. Replacing (3.5) by

    x̄ = x_0 + (x̄ − μ) + f,    (3.9)

we find, instead of (3.3),

    β̄ = (A^T A)^{−1} A^T x̄.    (3.10)
A matter of similar concern refers to the break-down of the Gauss-Markoff theorem. In
view of (3.9), the solution vector β̄ is biased, so that the experimentalist is no longer in
a position to obtain a weight matrix from the variance-covariance matrix of the input
vector x̄. Consequently, simple, optimized adjustments, to which we are customarily
used, must be ruled out. Nevertheless, we may multiply (3.1) from the left with any
non-singular weighting matrix, e.g. with a diagonal one,

    G = diag(g_1, g_2, ..., g_m),    (3.11)

and adjust the weights g_i by trial and error in order to find the shortest possible uncertainty intervals. As has been shown, this method is also able to detect inconsistencies
among the input data [6]. Indeed, as a non-singular weight matrix cannot shift the true
solution vector β_0, we are allowed to proceed this way.
To assign uncertainties to the components β̄_k, k = 1, ..., r, of the solution vector β̄,
we refer to (2.7). To abbreviate the notation, we set in (3.10)

    B = A (A^T A)^{−1},    (3.12)

where the elements of the matrix B will be designated by b_ik. Upon insertion of (3.9)
into (3.10), we arrive at

    β̄ = B^T x_0 + B^T (x̄ − μ) + B^T f.    (3.13)
Evidently, β_0 = B^T x_0 is the true value of the estimator β̄. Setting μ_β = E{β̄} =
β_0 + B^T f, we may define the theoretical variance-covariance matrix

    E{(β̄ − μ_β)(β̄ − μ_β)^T},

which, however, remains numerically inaccessible. Consequently, the only thing we can
do is to resort to the empirical variance-covariance matrix

    s_β = (s_{β_i β_k}) = B^T s B,    i, k = 1, 2, ..., r,    (3.14)

whose elements are given by

    s_{β_i β_k} = Σ_{i'=1}^m Σ_{j'=1}^m b_{i'i} b_{j'k} s_{i'j'}.    (3.15)

Clearly, the s_ij are the elements of the empirical variance-covariance matrix s of the
input data, as has been stated in (2.8) and (3.8).
These procedures presuppose, as has been pointed out, equal numbers of repeated
measurements within each of the m means (3.7). The components β̄_k of the solution
vector may be written as

    β̄_k = (1/n) Σ_{l=1}^n β̄_{kl}    with    β̄_{kl} = Σ_{i=1}^m b_{ik} x_{il},    k = 1, ..., r.    (3.16)
Evidently, the β̄_{kl} are independent and normally distributed. Let μ_{β_k} denote the
expectations

    μ_{β_k} = E{β̄_k},    k = 1, ..., r,    (3.17)

of the β̄_k. Looking for just any one of the β̄_k,

    β̄_k − [t_P(n−1)/√n] s_{β_k}  ≤  μ_{β_k}  ≤  β̄_k + [t_P(n−1)/√n] s_{β_k}    (3.18)
is a confidence interval according to Student, where t_P(n − 1) is the Student factor.
This interval localizes μ_{β_k} with confidence P.
The components of the third term on the right-hand side of (3.13) are given by

    f_{β_k} = Σ_{i=1}^m b_{ik} f_i,    k = 1, ..., r.    (3.19)

Worst-case estimates are

    f_{s,β_k} = Σ_{i=1}^m |b_{ik}| f_{s,i},    k = 1, ..., r.    (3.20)
After all, the overall uncertainties u_{β_k} of the components of the solution vector β̄, considered and employed individually, are proposed to be

    u_{β_k} = [t_P(n−1)/√n] s_{β_k} + f_{s,β_k},    k = 1, ..., r.    (3.21)
4 Uncertainty spaces
The component representation of (3.13),

    β̄_k = β_{0,k} + Σ_{i=1}^m b_{ik} (x̄_i − μ_i) + Σ_{i=1}^m b_{ik} f_i,    (4.1)
reveals the couplings between the least squares estimators. Those due to random errors
may be expressed by means of Hotelling's density [3]. The last term on the right-hand
side of (4.1),
m
fpk = Yl^''•-•^''
k=l,... ,r,
(4.2)
expresses the couplings due to systematic errors. The r components /^^ map the mdimensional hypercuboid
-fs,i < fi< /«,i,
i = 1,... ,m,
(4.3)
onto the r-dimensional space, yielding a convex polytope. Both solids may be combined
into an overall uncertainty space, resembling a "convex potato". Figures 1-3 show the
confidence ellipsoid, the "security polytope" and the combination of both into an overall
uncertainty space for the example of a least squares adjustment of a circle.
5 Conclusion
As computer simulations reveal, the approach presented here leads to measurement uncertainties safeguarding physical objectivity in the sense that uncertainty intervals reliably locate the values of the physical quantities in question. With such a distinct
statement, the traceability of units and standards will certainly be maintained.
FIG. 1. Confidence ellipsoid, security polytope, and overall uncertainty space resembling a "convex potato".
References
1. Bender, P.L., B. N. Taylor, E.R. Cohen, J.S. Thomas, P. Franken, and C. Eisenhart, Should least squares adjustment of the fundamental constants be abolished?,
NBS Special Publication 343, United States Department of Commerce, Washington
D.C., 1971.
2. Cohen, E.R. and B.R. Taylor, The 1986 adjustment of the fundamental physical
constants, CODATA BULLETIN Nr. 63 (1986).
3. Cramér, H., Mathematical Methods of Statistics, Princeton University Press, Princeton, 1961.
4. Grabe, M., Principles of "metrological statistics", metrologia 23 (1986/87) 213-219.
5. Grabe, M. Estimation of measurement uncertainties, an alternative to the ISO-Guide,
metrologia 38 (2001) 97-106.
6. Grabe, M., An alternative algorithm for adjusting the fundamental physical constants, Physics Letters A 213 (1996) 125-137.
7. ISO, Guide to the expression of uncertainty in measurement, 1993. 1, Rue de Varembé,
Boîte postale 56, CH-1211 Geneva 20, Switzerland.
8. Welch, B.L., The generalization of Student's problem when several different population variances are involved, Biometrika 34 (1947) 28-35.
l_1 and l_∞ ODR fitting of geometric elements
Hans-Peter Helfrich
Mathematisches Seminar der Landwirtschaftlichen Fakultät der Universität Bonn
helfrich@uni-bonn.de
Daniel S. Zwick
Wilcox Associates, Inc.
dzwick@wilcoxassoc.com
Abstract
We consider the fitting of geometric elements, such as lines, planes, circles, cones, and cylinders, in such a way that the sum of distances or the maximal distance from the element to the data points is minimized. We refer to this kind of distance based fitting as orthogonal distance regression or ODR. We present a separation of variables algorithm for $\ell_1$ and $\ell_\infty$ ODR fitting of geometric elements. The algorithm is iterative and allows the element to be given in either implicit form $f(x,\beta) = 0$ or in parametric form $x = g(t,\beta)$, where $\beta$ is the vector of shape parameters, $x$ is a 2- or 3-vector, and $t$ is a vector of location parameters. The algorithm may even be applied in cases, such as with ellipses, in which a closed form expression for the distance is either not available or is difficult to compute. For $\ell_1$ and $\ell_\infty$ fitting, the norm of the gradient is not available as a stopping criterion, as it is not continuous. We present a stopping criterion that handles both the $\ell_1$ and the $\ell_\infty$ case, and is based on a suitable characterization of the stationary points.
1 Introduction

Let us be given $N$ points $\{z_i\}_{i=1}^N \subset \mathbb{R}^d$ and a geometric object $S$ in

• implicit form $\{x : f(x,\beta) = 0\}$ with a scalar function $f$, or
• parametric form $x = g(t,\beta)$ with a vector function $g$,

where the shape parameter vector $\beta \in C$ lies within a closed, convex subset $C$ of $\mathbb{R}^m$. Denote by

$$\phi_i(\beta) = \inf\{\,\|z_i - x_i\|_2 : x_i \text{ on } S\,\}$$

the distance of the point $z_i$ to the geometric object $S$. Let

$$\phi(\beta) = (\phi_1(\beta), \ldots, \phi_N(\beta))^T$$

be the distance vector with norm $\Phi(\beta) = \|\phi(\beta)\|$, where $\|\phi(\beta)\|$ denotes either the $\ell_\infty$-norm

$$\Phi(\beta) = \max(\phi_1(\beta), \ldots, \phi_N(\beta))$$

or the $\ell_1$-norm

$$\Phi(\beta) = \sum_{i=1}^N \phi_i(\beta).$$
We consider the problem:

Find $\beta \in C$ and points $\{x_i\}_{i=1}^N$ on $S$ such that $\Phi(\beta) = \|\phi(\beta)\|$ is minimal.

If the minimum is attained, each function $\phi_i(\beta) = \|z_i - x_i\|_2$ is minimal for the point $x_i \in S$. Then $z_i - x_i$ is orthogonal to $S$ for interior points of $S$, hence the term "orthogonal distance regression" or "ODR".
Nonlinear $\ell_1$ ODR problems are treated in WATSON [10, 12]. A survey for linear problems is given in ZWICK [13].

As stated, the problem has dimension $Nd + m$. In typical metrology applications, the data set is very large, so that a direct approach to the problem becomes computationally expensive. We use a separation of variables algorithm that was used in [2, 4] and TURNER [9] for the $\ell_2$ ODR problem. Each iteration of our algorithm consists of two steps. In the first step, the foot points $\{x_i\}_{i=1}^N$ on $S$, i.e., the location parameters, are calculated for a fixed parameter vector $\beta$. These $d$-dimensional subproblems can be efficiently handled by trust region methods [3].
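As an illustration of this first step, the sketch below computes the foot-point distances $\phi_i(\beta)$ for fixed $\beta$ for an ellipse in parametric form. The parameterization and all names are ours, introduced only for illustration; a bounded scalar search over the location parameter $t$ stands in for the trust region method of [3].

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ellipse(t, beta):
    """Parametric ellipse g(t, beta) with center (x0, y0), axes (a, b), rotation theta."""
    x0, y0, a, b, theta = beta
    c, s = np.cos(theta), np.sin(theta)
    u, v = a * np.cos(t), b * np.sin(t)
    return np.array([x0 + c * u - s * v, y0 + s * u + c * v])

def foot_point_distances(points, beta):
    """Solve the N low-dimensional subproblems for fixed beta:
    phi_i(beta) = min_t ||z_i - g(t, beta)||_2."""
    phi = np.empty(len(points))
    for i, z in enumerate(points):
        res = minimize_scalar(lambda t: np.linalg.norm(z - ellipse(t, beta)),
                              bounds=(0.0, 2.0 * np.pi), method="bounded",
                              options={"xatol": 1e-10})
        phi[i] = res.fun
    return phi
```

For a circle ($a = b$, $\theta = 0$) the distance has the closed form $|\,\|z_i - (x_0,y_0)\|_2 - a\,|$, which gives a convenient check of the numerical subproblem solver.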
In the second step, a first order approximation of $\phi_i(\beta)$ is employed that can be given without explicit knowledge of the dependence of the optimal points $x_i(\beta)$ on $\beta$. At this stage, the norm of the correction to the parameter vector $\beta$ is limited by a trust region strategy. The correction can be computed by solving a linear programming problem. For general nonlinear minimax problems such methods were proposed in MADSEN AND SCHJÆR-JACOBSEN [6], HALD AND MADSEN [1] and JONASSON AND MADSEN [5].

Our convergence analysis follows the general approach given in POWELL [8] and MORÉ [7]. But in order to handle the $\ell_1$ and $\ell_\infty$ case we cannot use the norm of the gradient as a stopping or convergence criterion, since the gradient is not continuous. Moreover, a necessary condition for a minimum is that the subgradient contains the zero functional, see, e.g., WATSON [11]. In order to overcome this difficulty, we introduce a replacement for the norm of the gradient that serves both as a stopping criterion and as an essential tool in the convergence proof.
2 The trust region algorithm

At each iteration of our algorithm we solve the low-dimensional subproblems $(P_i)$ for $\beta = \beta_k$ for each fixed $i$, $i = 1, \ldots, N$:

Minimize $\|z_i - x_i\|_2$ subject to $f(x_i,\beta) = 0$ or $x_i = g(t_i,\beta)$.
In order to apply the trust region method to $\ell_1$ and $\ell_\infty$ ODR we need a first order approximation $\psi_i(\beta,\alpha)$ to $\phi_i(\beta)$. With appropriate regularity assumptions, this can be computed without knowledge of the dependence of the optimal points $x_i(\beta)$ on $\beta$ ([2], [4]). This means that the iterative improvement in $\beta$ is uncoupled from the calculations of $x_i(\beta)$, whereby a true first order approximation of the objective function is attained.
In the case of the implicit form $f(x,\beta) = 0$, the first order approximation $\phi_i(\beta + \alpha) = \psi_i(\beta,\alpha) + o(\alpha)$ is given by

$$\psi_i(\beta,\alpha) = \pm\phi_i(\beta) - \frac{\nabla_\beta f(x_i,\beta)^T \alpha}{\|\nabla_x f(x_i,\beta)\|_2} \qquad (2.1)$$

as a first order approximation to the signed distance $\pm\phi_i(\beta+\alpha)$. For the parametric form $x = g(t,\beta)$, we have

$$\psi_i(\beta,\alpha) = \|z_i - x_i\|_2 - \frac{(z_i - x_i)^T}{\|z_i - x_i\|_2}\,\nabla_\beta g(t_i,\beta)\,\alpha. \qquad (2.2)$$

Note that (2.1) makes sense even for points on the surface. For an orientable hypersurface in parametric form, the expression $\frac{(z_i - x_i)^T}{\|z_i - x_i\|_2}$ in (2.2) should be replaced by the unit normal for points on the surface.
Denote by

$$\psi(\beta,\alpha) = (\psi_1(\beta,\alpha), \ldots, \psi_N(\beta,\alpha))^T$$

the vector of the linearized distances and let

$$\Psi(\beta,\alpha) = \|\psi(\beta,\alpha)\| - \|\phi(\beta)\|.$$
The main algorithm:

• Step 0: An initial $\beta_0 \in \mathbb{R}^m$, a trust region radius $\Delta_0 > 0$, and constants $0 < \mu < 1$ and $0 < \underline{\gamma} < 1 < \overline{\gamma}$, $\overline{\Delta}$ are given. Set $k = 0$.

• Step 1: Minimize $\Psi(\beta_k,\alpha)$ subject to $\|\alpha\|_2 \le \Delta_k$ and $\beta_k + \alpha \in C$. Let $\alpha_k$ denote the solution with minimal norm.

• Step 2: If $\alpha_k = 0$, stop.

• Step 3: Compute the ratio of actual to predicted reduction,

$$\rho_k = \frac{\Phi(\beta_k + \alpha_k) - \Phi(\beta_k)}{\Psi(\beta_k,\alpha_k)}.$$

• Step 4:
(1) Successful step. If $\rho_k \ge \mu$, set $\beta_{k+1} = \beta_k + \alpha_k$ and choose $\Delta_{k+1}$ such that

$$\Delta_k \le \Delta_{k+1} \le \min(\overline{\gamma}\Delta_k, \overline{\Delta}). \qquad (2.3)$$

(2) Unsuccessful step. Otherwise, set $\beta_{k+1} = \beta_k$ and $0 < \Delta_{k+1} \le \underline{\gamma}\Delta_k$.

• Step 5: Increment $k$ by one and go to Step 1.
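To make the algorithm concrete, here is a minimal numerical sketch for the $\ell_\infty$ case with a circle $\beta = (c_x, c_y, r)$, for which the signed distance $d_i = \|z_i - c\|_2 - r$ and its gradient are available in closed form. It is our own illustration, not the authors' code: the linearized subproblem of Step 1 is solved as a linear program (with an $\ell_\infty$ rather than $\ell_2$ trust region, so that it stays linear), and the trust region constants are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

def linf_circle_fit(z, beta0, delta0=1.0, mu=0.1, n_iter=50):
    """Trust region sketch of l_inf ODR for a circle beta = (cx, cy, r):
    minimize max_i | ||z_i - c||_2 - r | over beta."""
    beta, delta = np.asarray(beta0, float), delta0
    n = len(z)
    for _ in range(n_iter):
        diff = z - beta[:2]                       # z_i - c
        norms = np.linalg.norm(diff, axis=1)
        d = norms - beta[2]                       # signed distances, phi_i = |d_i|
        G = np.column_stack([-diff / norms[:, None], -np.ones(n)])
        # Step 1: min_{alpha,s} s  s.t.  +-(d_i + g_i^T alpha) <= s, ||alpha||_inf <= delta
        A_ub = np.block([[G, -np.ones((n, 1))], [-G, -np.ones((n, 1))]])
        res = linprog(c=[0.0, 0.0, 0.0, 1.0], A_ub=A_ub,
                      b_ub=np.concatenate([-d, d]),
                      bounds=[(-delta, delta)] * 3 + [(0.0, None)])
        alpha = res.x[:3]
        pred = res.x[3] - np.max(np.abs(d))       # Psi(beta_k, alpha_k) <= 0
        if pred > -1e-12:                         # Step 2: alpha_k ~ 0, stop
            break
        new_d = np.linalg.norm(z - beta[:2] - alpha[:2], axis=1) - (beta[2] + alpha[2])
        rho = (np.max(np.abs(new_d)) - np.max(np.abs(d))) / pred   # Step 3
        if rho >= mu:                             # Step 4 (1): successful step
            beta, delta = beta + alpha, min(2.0 * delta, 10.0)
        else:                                     # Step 4 (2): shrink the radius
            delta *= 0.5
    return beta
```

On data lying exactly on a circle the iteration recovers the parameters to high accuracy in a handful of successful steps, since near the solution the linearized minimax subproblem is an accurate model of the objective.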
3 Global convergence

In an abstract setting our problem may be formulated as

Minimize $\Phi(\beta) = \|\phi(\beta)\|$ on a closed, convex set $C$.
To solve this problem, at each stage of the iteration we solve the following constrained, linearized problem:

Minimize $\Psi(\beta,\alpha)$ subject to $\beta + \alpha \in C$ and $\|\alpha\| \le \Delta$.

In order to get the linearization in our case, we solve the least distance subproblems $(P_i)$, $i = 1, \ldots, N$, with a shape parameter $\beta$, and use (2.2) or (2.1).
For the purpose of characterizing stationary points, we introduce the quantity

$$V_1(\beta) = -\inf\{\,\Psi(\beta,\alpha) : \|\alpha\| \le 1,\ \beta + \alpha \in C\,\}.$$

Note that $V_1(\beta) \ge 0$, since $\Psi(\beta,0) = 0$.

By convexity, $V_1(\beta) = 0$ implies that $\alpha = 0$ is a solution of the linearized minimization problem. MADSEN AND SCHJÆR-JACOBSEN [6] have shown that the latter condition is equivalent to a condition given therein for the functional to have a stationary point. In order to prove Theorem 3.3 we prove a lemma that was given in a similar form for the $\ell_\infty$ case in MADSEN AND SCHJÆR-JACOBSEN [6] and JONASSON AND MADSEN [5]. We give a different proof that is applicable to both the $\ell_1$ and $\ell_\infty$ cases.
Lemma 3.1 Let $V_1(\beta) \ge \varepsilon$ and $\Delta \le \overline{\Delta}$. For the solution of the linearized problem the estimate

$$\Psi(\beta,\alpha) \le -C\varepsilon\Delta \qquad (3.1)$$

holds, with a constant that depends only on $\varepsilon$ and $\overline{\Delta}$.

Proof: According to the definition of $V_1(\beta)$ and the continuity of $\Psi$ there exists a feasible $\alpha_1$ with $\|\alpha_1\| \le 1$ such that

$$\Psi(\beta,\alpha_1) = -\varepsilon.$$

Let $\alpha = t\alpha_1$, where $t = \min(1,\Delta)$. Since $\Psi(\beta,\alpha)$ is a convex function of $\alpha$, we get

$$\Psi(\beta,\alpha) \le (1-t)\Psi(\beta,0) + t\Psi(\beta,\alpha_1) = -t\varepsilon.$$

Since

$$t \ge \Delta\,\min(1, 1/\overline{\Delta})$$

we get the conclusion with $C = \min(1, 1/\overline{\Delta})$. □

Proposition 3.2 For a minimum point,

$$V_1(\beta) = 0$$

holds.
Proof: Assume the contrary; then $V_1(\beta) = \varepsilon > 0$ holds. According to the definition of $\Psi(\beta,\alpha)$ we have

$$\Phi(\beta + \alpha) = \Phi(\beta) + \Psi(\beta,\alpha) + o(\alpha).$$

By Lemma 3.1, we can find an $\alpha$ with $\|\alpha\| \le \Delta$ such that (3.1) holds. As in the proof of the Lemma, we may conclude that

$$\Phi(\beta + t\alpha) \le \Phi(\beta) - C\varepsilon t\Delta + o(t\alpha)$$

for $0 < t \le 1$. If we let $t \to 0$ we get a contradiction to the minimum property. □
Theorem 3.3 Either the algorithm ends in a finite number of steps, or a sequence $\beta_k$ is generated for which $\liminf_{k\to\infty} V_1(\beta_k) = 0$.

Proof: Assume the contrary. Then there exists $\varepsilon > 0$ such that $V_1(\beta_k) \ge \varepsilon$ holds for all $k$. By the definition of $\rho_k$ and the lemma, it follows that for a successful step

$$\Delta_k \le c\,(\Phi(\beta_k) - \Phi(\beta_{k+1}))$$

with $c = 1/(\mu C\varepsilon)$, and by the updating rule for $\Delta_{k+1}$ we get

$$\Delta_{k+1} \le \overline{\gamma}c\,(\Phi(\beta_k) - \Phi(\beta_{k+1})).$$

Combining this inequality with the updating rule for an unsuccessful step yields

$$\Delta_{k+1} \le \underline{\gamma}\Delta_k + \overline{\gamma}c\,(\Phi(\beta_k) - \Phi(\beta_{k+1})).$$

By summation and the monotonicity of $\Phi(\beta_k)$ it follows that for all $N$

$$\sum_{k=0}^{N} \Delta_{k+1} \le \underline{\gamma}\sum_{k=0}^{N} \Delta_k + \overline{\gamma}c\,(\Phi(\beta_0) - \Phi(\beta_{N+1})).$$

Since this implies the convergence of $\sum \Delta_k$, we get $\lim \Delta_k = 0$. From $\|\beta_{k+1} - \beta_k\| \le \Delta_k$ we obtain the convergence of $\beta_k$. From the definition of $\rho_k$ it then follows that $\lim \rho_k = 1$. But then the updating rule (2.3) implies that eventually $\Delta_{k+1} \ge \Delta_k$, which gives a contradiction. □
Theorem 3.4 (Global Convergence, cf. MORÉ [7], POWELL [8]) Assume that $V_1(\beta)$ is uniformly continuous. Then either the algorithm ends in a finite number of steps, or a sequence $\beta_k$ is generated for which

$$\lim_{k\to\infty} V_1(\beta_k) = 0.$$

Proof: Assume the contrary. Then there exists an $\varepsilon_1$ such that for each $k_0$ there exists a $k \ge k_0$ with

$$V_1(\beta_k) \ge \varepsilon_1.$$

By Theorem 3.3 we can find an index $l > k$ such that

$$V_1(\beta_l) < \varepsilon_1/2$$

($k_0$ will be determined later). We choose the smallest such $l$. As in the proof of Theorem 3.3, it follows that for a successful step with $k \le i < l$,

$$\|\beta_{i+1} - \beta_i\| \le \Delta_i \le 2c_1(\Phi(\beta_i) - \Phi(\beta_{i+1})).$$

Clearly, this also holds for an unsuccessful step. This yields

$$\|\beta_l - \beta_k\| \le 2c_1(\Phi(\beta_k) - \Phi(\beta_l)).$$

Since $\Phi(\beta_j)$ converges by monotonicity, we can make $\|\beta_l - \beta_k\|$ arbitrarily small for large enough $k_0$. By the uniform continuity of $V_1(\beta)$ we infer

$$|V_1(\beta_k) - V_1(\beta_l)| < \varepsilon_1/2,$$

which is a contradiction. □
4 A numerical example
As an illustrative example, we fit an ellipse to data, given as coordinate pairs in $\mathbb{R}^2$. There are 24 data points and five components to the shape parameter vector (i.e., $n = 2$, $d = 2$, $m = 5$, $N = 24$). We used a standard parameterization involving a center $(x_0, y_0)$, the axes $(a, b)$, and a rotation angle $\theta$.

The output is shown below. The initial values for the parameters and the obtained parameters in three different norms are given in Table 1. In the $\ell_2$ case, we give as the error the root mean square error, in the $\ell_1$ case the mean absolute deviation, and in the $\ell_\infty$ case the maximum deviation.
FIG. 1. $\ell_2$, $\ell_1$, and $\ell_\infty$ approximation.

TAB. 1. Parameters for different norms.

                  x_0        y_0         a          b          θ (degrees)  Error
Initial values    0.4989881  -1.4262126  4.6719913  0.4364267  20.75913     --
ℓ2                0.6637511  -1.3987826  5.5124671  0.3376480  20.90124     0.11520
ℓ1                0.5368646  -1.4465520  5.2778061  0.3358224  20.88869     0.09047
ℓ∞                0.7694412  -1.3829474  4.9731226  0.4491259  20.66893     0.23489
The number of iterations in each case was five or six. We note that the deviations for the best fit $\ell_1$ and $\ell_\infty$ ellipses exhibit behavior typical of these norms: five of the data points lie on the best fit $\ell_1$ ellipse and there are six deviations of largest magnitude in the $\ell_\infty$ case.
Bibliography

1. J. Hald and K. Madsen. Combined LP and quasi-Newton methods for minimax optimization. Mathematical Programming, 20:49-62, 1981.
2. H.-P. Helfrich and D. Zwick. A trust region method for implicit orthogonal distance regression. Numerical Algorithms, 5:535-545, 1993.
3. H.-P. Helfrich and D. Zwick. Trust region algorithms for the nonlinear least distance problem. Numerical Algorithms, 9:171-179, 1995.
4. H.-P. Helfrich and D. Zwick. A trust region algorithm for parametric curve and surface fitting. J. Comput. Appl. Math., 73:119-134, 1996.
5. K. Jonasson and K. Madsen. Corrected sequential linear programming for sparse minimax optimization. BIT, 34:372-387, 1994.
6. K. Madsen and H. Schjær-Jacobsen. Linearly constrained minimax optimization. Mathematical Programming, 14:208-225, 1978.
7. J. J. Moré. Recent developments in algorithms and software for trust region methods. In A. Bachem, M. Grötschel, and B. Korte, editors, Mathematical Programming Bonn 1982 - The State of the Art, pages 259-287. Springer, 1983.
8. M. J. D. Powell. Convergence properties of a class of minimization algorithms. In O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, editors, Nonlinear Programming 2, pages 1-27. Academic Press, 1975.
9. D. A. Turner, I. J. Anderson, J. C. Mason, M. G. Cox, and A. B. Forbes. An efficient separation-of-variables approach to parametric orthogonal distance regression. In P. Ciarlini, M. G. Cox, F. Pavese, and D. Richter, editors, Advanced Mathematical and Computational Tools in Metrology IV, pages 246-255, Singapore, 2000. World Scientific.
10. G. A. Watson. The use of the $\ell_1$ norm in nonlinear errors-in-variables problems. In S. Van Huffel, editor, Recent Advances in Total Least Squares Techniques and Errors-in-Variables Modeling, pages 183-192, Philadelphia, 1997. SIAM.
11. G. A. Watson. Choice of norms for data fitting and function approximation. Acta Numerica, pages 337-377, 1998.
12. G. A. Watson. Some robust methods for fitting parametrically defined curves and surfaces to measured data. In P. Ciarlini, A. B. Forbes, F. Pavese, and D. Richter, editors, Advanced Mathematical and Computational Tools in Metrology IV, volume 53 of Series on Advances in Mathematics for Applied Sciences, pages 256-272. World Scientific, 2000.
13. D. Zwick. Algorithms for orthogonal fitting of lines and planes: a survey. In P. Ciarlini, M. G. Cox, F. Pavese, and D. Richter, editors, Advanced Mathematical Tools in Metrology II, pages 272-283. World Scientific, 1996.
Evaluation of measurements by the method of
least squares
Lars Nielsen
Danish Institute of Fundamental Metrology (DFM), Lyngby, DK¹
LN@dfm.dtu.dk
Abstract
In this paper, a general technique for evaluation of measurements by the method of Least Squares is presented. The input to the method consists of estimates and associated uncertainties of the values of measured quantities together with specified constraints between the measured quantities and any additional quantities for which no information about their values is known a priori. The output of the method consists of estimates of both groups of quantities that satisfy the imposed constraints and the uncertainties of these estimates. Techniques for testing the consistency between the estimates obtained by measurement and the imposed constraints are presented. It is shown that linear regression is just a special case of the method. It is also demonstrated that the procedure for evaluation of measurement uncertainty that is currently agreed within the metrology community can be considered as another special case in which no redundant information is available. The practical applicability of the method is demonstrated by two examples.
1 Introduction
In 1787, the French mathematician and physicist Laplace (1749-1827) used the method of Least Squares to estimate 8 unknown orbital parameters from 75 discrepant observations of the positions of Jupiter and Saturn taken over the period 1582-1745. Since then, the method of Least Squares has been used extensively in data analysis. Like Laplace, most people use a special case of the method, known as unweighted linear regression. The calculation of the average and the standard deviation of a repeated set of observations is the simplest example of that. The unweighted regression analysis is based on the assumptions that the observations are independent and have the same (unknown) variance. In addition, the linear regression is based on the assumption that the observations can be modelled by a function that is linear in the unknown quantities to be determined by the regression analysis. For most measurements carried out in practice, none of these assumptions can be justified. In order to evaluate the result of a general measurement, in which some redundant information has been obtained, one therefore has to apply the method of Least Squares in its general form.
This paper describes how measurements can be evaluated by the method of Least
Squares in general. The paper is based on an earlier work of the author [2] but includes
1 Address: Building 307, Matematiktorvet, DK-2800 Lyngby, Denmark
several new features not published before as well as practical examples from the daily
work at DFM. An alternative approach is described in [6].
2 Measurement model
In a general measurement, a number $m > 0$ of quantities is either measured directly using measuring instruments or known a priori, for example from tables of physical constants etc. The (exact) values of these $m$ quantities are denoted

$$\zeta = (\zeta_1, \ldots, \zeta_m)^T.$$

Due to measurement uncertainty, the values $z$ obtained by the measurement (or from tables etc.)

$$z = (z_1, \ldots, z_m)^T$$

are only estimates of the values $\zeta$. The standard uncertainties of the estimates $z_i$,

$$u(z_i), \qquad i = 1, \ldots, m,$$

are determined in accordance with the GUM [1] and depend on the accuracy of the instruments and the reliability of any tabulated value used. In general, some of the estimates $z_i$ may be correlated. If $r(z_i, z_j)$ is the correlation coefficient between the estimates $z_i$ and $z_j$ then the covariance $u(z_i, z_j)$ between these two estimates is given by

$$u(z_i, z_j) = u(z_i)\, r(z_i, z_j)\, u(z_j).$$
Because of the uncertainty, the estimates $z$ can be considered as an outcome of an $m$-dimensional random variable $Z$ with expectation $\zeta$ (the exact values of the quantities) and covariance matrix $\Sigma$

$$\Sigma = u(z, z^T) = \begin{pmatrix} u^2(z_1) & u(z_1,z_2) & \cdots & u(z_1,z_m) \\ u(z_2,z_1) & u^2(z_2) & \cdots & u(z_2,z_m) \\ \vdots & \vdots & & \vdots \\ u(z_m,z_1) & u(z_m,z_2) & \cdots & u^2(z_m) \end{pmatrix}.$$
In addition to the $m$ quantities for which prior information is available either from direct measurement or from other sources, a general measurement may involve a number $k > 0$ of quantities for which no prior information is available. The values of these quantities are denoted by

$$\beta = (\beta_1, \ldots, \beta_k)^T.$$

In general, the values $\beta$ and $\zeta$ are constrained by a number $n$ of physical or empirical laws. These constraints may be written in terms of an $n$-dimensional function

$$f(\beta,\zeta) = \begin{pmatrix} f_1(\beta,\zeta) \\ \vdots \\ f_n(\beta,\zeta) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad k \le n < m + k. \qquad (2.1)$$
It is assumed that $f_i : \Omega \to \mathbb{R}$, $i = 1, \ldots, n$, are differentiable functions (with continuous derivatives) defined in a region $\Omega \subset \mathbb{R}^{k+m}$ around $(\beta, \zeta)$. As indicated in (2.1), the number $n$ has to be larger than or equal to the number $k$ of quantities for which no prior information is available; otherwise some of the values $\beta$ cannot be determined. In addition, the number $n$ of constraints has to be smaller than the total number $k + m$ of quantities involved; otherwise the values of $\beta$ and $\zeta$ would be uniquely determined by the constraints and no measurements would be needed.
The estimates $z$, the covariance matrix $\Sigma$ and the $n$-dimensional function $f(\beta, \zeta)$ are the input to the general Least Squares method. It should be stressed that no probability distribution has to be assigned to the input estimates $z$. On the contrary, if a probability distribution has been assigned to an estimate, it should be used to calculate the mean value and the variance of the estimate, which should then serve as input to the Least Squares method.

Like any other covariance matrix, the covariance matrix $u(z, z^T) = \Sigma$ is positive semi-definite. Otherwise, at least one linear combination $x^T z$ of the estimates $z$ would have negative variance $u(x^T z, z^T x) = x^T \Sigma x$. In the following it is assumed that $\Sigma$ is positive definite and therefore non-singular.
3 Normal equations

Least Squares estimates $\hat\beta$ and $\hat\zeta$, of the values $\beta$ and $\zeta$, are found by minimizing the chi-square function

$$\chi^2(\zeta; z) = (z - \zeta)^T \Sigma^{-1} (z - \zeta)$$

subject to the constraints

$$f(\beta, \zeta) = 0.$$

It is convenient to solve this minimization problem by using Lagrange multipliers [5]: If a solution $(\hat\beta, \hat\zeta)$ to the minimization problem exists, the solution satisfies the equation

$$\nabla_{(\beta,\zeta,\lambda)} \Phi(\beta, \zeta, \lambda; z) = 0$$

where

$$\Phi(\beta, \zeta, \lambda; z) = (z - \zeta)^T \Sigma^{-1} (z - \zeta) + 2\lambda^T f(\beta, \zeta)$$

for a particular set of Lagrange multipliers $\lambda = (\lambda_1, \ldots, \lambda_n)^T$. By taking the gradient of the function $\Phi$, the following $n + m + k$ equations in $(\beta, \zeta, \lambda)$ evolve:

$$\nabla_\beta f(\beta,\zeta)^T \lambda = 0,$$
$$-\Sigma^{-1}(z - \zeta) + \nabla_\zeta f(\beta,\zeta)^T \lambda = 0,$$
$$f(\beta,\zeta) = 0, \qquad (3.1)$$

where

$$\nabla_\beta f = \begin{pmatrix} \frac{\partial f_1}{\partial \beta_1} & \cdots & \frac{\partial f_1}{\partial \beta_k} \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial \beta_1} & \cdots & \frac{\partial f_n}{\partial \beta_k} \end{pmatrix} \qquad \text{and} \qquad \nabla_\zeta f = \begin{pmatrix} \frac{\partial f_1}{\partial \zeta_1} & \cdots & \frac{\partial f_1}{\partial \zeta_m} \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial \zeta_1} & \cdots & \frac{\partial f_n}{\partial \zeta_m} \end{pmatrix}.$$

The equations (3.1) are called the normal equations of the Least Squares problem.
4 Solving the normal equations
If $(\beta_l, \zeta_l, \lambda_l)$ is an approximate solution to the normal equations, a refined solution $(\beta_{l+1}, \zeta_{l+1}, \lambda_{l+1})$ can be found by the iteration

$$\begin{pmatrix} \beta_{l+1} \\ \zeta_{l+1} \end{pmatrix} = \begin{pmatrix} \beta_l \\ \zeta_l \end{pmatrix} + \begin{pmatrix} \Delta\beta_l \\ \Delta\zeta_l \end{pmatrix}, \qquad l = 1, \ldots, \infty.$$

The step $(\Delta\beta_l, \Delta\zeta_l, \lambda_{l+1})$ is given by

$$D(\beta_l, \zeta_l) \begin{pmatrix} \Delta\beta_l \\ \Delta\zeta_l \\ \lambda_{l+1} \end{pmatrix} = \begin{pmatrix} 0 \\ \Sigma^{-1}(z - \zeta_l) \\ -f(\beta_l, \zeta_l) \end{pmatrix} \qquad (4.1)$$

where

$$D(\beta_l, \zeta_l) = \begin{pmatrix} 0^{(k,k)} & 0^{(k,m)} & \nabla_\beta f(\beta_l,\zeta_l)^T \\ 0^{(m,k)} & \Sigma^{-1} & \nabla_\zeta f(\beta_l,\zeta_l)^T \\ \nabla_\beta f(\beta_l,\zeta_l) & \nabla_\zeta f(\beta_l,\zeta_l) & 0^{(n,n)} \end{pmatrix} \qquad (4.2)$$

is a symmetric matrix. This iteration procedure is similar to Newton iteration except that the second order partial derivatives of the functions $f_i$ have been neglected, as is the practice in non-linear Least Squares estimation [4].

In order to reduce the effects of numerical rounding errors, it is recommended to calculate the step $(\Delta\beta_l, \Delta\zeta_l, \lambda_{l+1})$ by solving the linear equations (4.1) by Gauss-Jordan elimination with full pivoting [4]. This algorithm also provides the inverse matrix $D(\beta_l, \zeta_l)^{-1}$, which is needed at the final stage for estimating the covariance matrix of the solution as shown in Section 5.
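The iteration (4.1)-(4.2) can be sketched in a few lines of numpy. The sketch below is ours, not the paper's implementation: it uses numpy.linalg.solve in place of Gauss-Jordan elimination with full pivoting, and the example adjusts two measurements of the same quantity, for which the Least Squares solution is the familiar weighted mean.

```python
import numpy as np

def constrained_lsq(z, Sigma, f, grad_beta, grad_zeta, beta0, n_iter=20):
    """Least Squares under constraints f(beta, zeta) = 0 via the bordered system
    (4.1)-(4.2).  Returns beta, zeta and the inverse bordered matrix, whose
    leading blocks contain the covariances of the solution (Section 5)."""
    z = np.asarray(z, float)
    beta, zeta = np.asarray(beta0, float), z.copy()   # zeta_1 = z is the natural start
    Si = np.linalg.inv(Sigma)
    k, m, n = len(beta), len(z), len(f(beta, zeta))
    for _ in range(n_iter):
        Fb, Fz = grad_beta(beta, zeta), grad_zeta(beta, zeta)   # n x k and n x m
        D = np.block([[np.zeros((k, k)), np.zeros((k, m)), Fb.T],
                      [np.zeros((m, k)), Si,               Fz.T],
                      [Fb,               Fz,               np.zeros((n, n))]])
        rhs = np.concatenate([np.zeros(k), Si @ (z - zeta), -f(beta, zeta)])
        step = np.linalg.solve(D, rhs)                 # (delta beta, delta zeta, lambda)
        beta, zeta = beta + step[:k], zeta + step[k:k + m]
    return beta, zeta, np.linalg.inv(D)

# Example: two measurements of the same quantity, constraints beta - zeta_i = 0,
# whose Least Squares solution is the weighted mean.
z, Sigma = [1.0, 1.2], np.diag([0.1**2, 0.2**2])
f = lambda b, c: np.array([b[0] - c[0], b[0] - c[1]])
gb = lambda b, c: np.array([[1.0], [1.0]])
gz = lambda b, c: np.array([[-1.0, 0.0], [0.0, -1.0]])
beta, zeta, Dinv = constrained_lsq(z, Sigma, f, gb, gz, beta0=[0.0])
```

For this example the weighted mean is $\hat\beta = 1.04$ with $u^2(\hat\beta) = 1/(1/u^2(z_1) + 1/u^2(z_2)) = 0.008$, which appears as the leading diagonal element of the inverse bordered matrix, in agreement with (5.2) below. With constraints linear in $\beta$ and $\zeta$ the iteration converges after the first step.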
If proper starting values $(\beta_1, \zeta_1)$ are selected, the iteration is expected to converge towards the solution $(\hat\beta, \hat\zeta)$. Since the solutions $\hat\zeta$ are expected to be close to the estimates $z$ of $\zeta$ available a priori, the estimates $z$ are obviously the proper starting values $\zeta_1$ to be selected for the iteration. The selection of proper starting values $\beta_1$ is more difficult in general. If, however, $f(\beta, \zeta)$ are linear functions in the variables $\beta$, the iteration process will converge after a few iterations, independent of the choice of $\beta_1$.
Most differentiable functions $f(\beta, \zeta)$ can be handled by the described method. In order to get reliable standard uncertainties, it is required, however, that the function can be approximated by a first order Taylor expansion, i.e.

$$f(\beta, \zeta) \approx f(\hat\beta, \hat\zeta) + \nabla_\beta f(\hat\beta, \hat\zeta)(\beta - \hat\beta) + \nabla_\zeta f(\hat\beta, \hat\zeta)(\zeta - \hat\zeta)$$

when the values $\beta$ and $\zeta$ are varied around the solution $\hat\beta$ and $\hat\zeta$ on a scale comparable to the standard uncertainties of the solution. If this vaguely formulated criterion is met, the function $f(\beta, \zeta)$ is said to be linearizable. Note that almost any differentiable function will be linearizable if the standard uncertainties are sufficiently small. On the other hand, if the uncertainties are sufficiently large, any non-linear function will no longer be linearizable. The requirement that $f(\beta, \zeta)$ is linearizable is considered to be the only major limitation of the method of Least Squares!

It should be mentioned that the minimization using Lagrange multipliers will fail in case the gradients $\nabla_\beta f_i$ and $\nabla_\zeta f_i$ of one of the constraint functions $f_i$ are both equal to zero at the point of the solution $(\hat\beta, \hat\zeta)$. This gives some restrictions on how a constraint may be formulated. A function $f_i$ defining a constraint may, for example, not be replaced by the square of that function, $f_i^2$: since $f_i(\hat\beta, \hat\zeta) = 0$, the gradient of $f_i^2$ will be zero at the point of the solution $(\hat\beta, \hat\zeta)$ although the gradient of $f_i$ is not.
5 Properties of the solution
Since the solution $(\hat\beta, \hat\zeta, \hat\lambda)$ depends on the estimates $z$, which are considered as the realization of the multivariate random variable $Z$, the solution $(\hat\beta, \hat\zeta, \hat\lambda)$ can also be regarded as a multivariate random variable. If the functions $f_i(\beta, \zeta)$ are linearizable, the estimates $(\hat\beta, \hat\zeta)$ are linear in $Z$

$$\begin{pmatrix} \hat\beta \\ \hat\zeta \\ \hat\lambda \end{pmatrix} = \begin{pmatrix} \beta \\ \zeta \\ 0 \end{pmatrix} + D(\beta, \zeta)^{-1} \begin{pmatrix} 0 \\ \Sigma^{-1}(z - \zeta) \\ 0 \end{pmatrix}. \qquad (5.1)$$

In that case, the expectation of the solution is

$$E\begin{pmatrix} \hat\beta \\ \hat\zeta \end{pmatrix} = \begin{pmatrix} \beta \\ \zeta \end{pmatrix},$$

which means that $(\hat\beta, \hat\zeta)$ are central estimators of the values $(\beta, \zeta)$. Under the same assumption, the covariances of the solution are given by the symmetric matrix $D(\hat\beta, \hat\zeta)^{-1}$ provided by the Gauss-Jordan elimination algorithm²

$$\begin{pmatrix} u(\hat\beta, \hat\beta^T) & u(\hat\beta, \hat\zeta^T) & (\;) \\ u(\hat\zeta, \hat\beta^T) & u(\hat\zeta, \hat\zeta^T) & (\;) \\ (\;) & (\;) & -u(\hat\lambda, \hat\lambda^T) \end{pmatrix} = D(\beta, \zeta)^{-1} = D(\hat\beta, \hat\zeta)^{-1}. \qquad (5.2)$$

This relation can be derived as follows. Partition the symmetric matrix $D^{-1}$ into nine sub-matrices similar to the left hand side of (5.2) or similar to the partitioning of $D$ according to the definition (4.2). Express the covariance matrix of the solution (5.1) in terms of the covariance matrix $\Sigma$ of the random variable $Z$ and the matrix $D^{-1}$. Insert the partitioned $D^{-1}$ into the resulting matrix double product and express the covariances of the solution in terms of $\Sigma^{-1}$ and the sub-matrices of $D^{-1}$. Reduce these expressions to the final result by multiple use of the nine relations between the sub-matrices of $D^{-1}$ and $D$ derived from the identity $D D^{-1} = I$.

²The empty brackets in the left hand matrix indicate the parts of $D^{-1}$ that do not contain information about covariances.
From equations (5.1) and (5.2) the covariance matrices between $(\hat\beta, \hat\zeta)$ and the estimates $z$ are found to be

$$u(z, \hat\beta^T) = u(\hat\zeta, \hat\beta^T), \qquad u(z, \hat\zeta^T) = u(\hat\zeta, \hat\zeta^T).$$

From the last of these two relations, a relation of particular interest is derived,

$$u(z - \hat\zeta, z^T - \hat\zeta^T) = u(z, z^T) - u(\hat\zeta, \hat\zeta^T).$$

For the diagonal elements, this relation reads

$$u^2(z_i - \hat\zeta_i) = u^2(z_i) - u^2(\hat\zeta_i), \qquad i = 1, \ldots, m.$$

That is, the variance of the difference between the initial estimate $z_i$ of $\zeta_i$ and the refined estimate $\hat\zeta_i$ is equal to the variance of $z_i$ minus the variance of $\hat\zeta_i$. This relation is useful when testing if the difference $z_i - \hat\zeta_i$ is significantly different from its zero expectation.
6 $\chi^2$ test for consistency
When the estimates $(\hat\beta, \hat\zeta)$ have been found, the minimum $\chi^2$ value

$$\chi^2(\hat\zeta; z) = (z - \hat\zeta)^T \Sigma^{-1} (z - \hat\zeta)$$

can be used to test if the measured values $z$ are consistent with the measurement model (2.1) within the uncertainties defined by the covariance matrix $\Sigma$. If the model is linearizable, the expectation of the random variable $\chi^2(\hat\zeta; Z)$ is equal to the number $m$ of measured quantities, minus the number $m + k$ of adjusted quantities, plus the number $n$ of constraints, that is

$$E\,\chi^2(\hat\zeta; Z) = m - (m + k) + n = n - k = \nu.$$

If, in addition, the random variables $Z$ are assumed to follow a multivariate normal distribution with mean values $\zeta$ and covariance matrix $\Sigma$, the random variable $\chi^2(\hat\zeta; Z)$ will follow a $\chi^2(\nu)$ distribution with $\nu = n - k$ degrees of freedom. In that case, the probability $p$ of finding a $\chi^2$ value larger than the value $\chi^2(\hat\zeta; z)$ actually observed can be calculated from the $\chi^2(\nu)$ distribution

$$p = P\{\chi^2(\nu) > \chi^2(\hat\zeta; z)\} = 1 - P\{\chi^2(\nu) \le \chi^2(\hat\zeta; z)\}.$$

If this probability $p$ is smaller than a certain value $\alpha$, the hypothesis that the measured values are consistent with the measurement model has to be rejected at a level of significance equal to $\alpha$. As the results of measurements are normally quoted at a 95% level of confidence, an $\alpha = 5\%$ level of significance is a reasonable choice for the consistency test.

Although the assumption of a normal distribution of $Z$ may not be fulfilled, it is suggested to carry out the test of consistency as described above anyway. This is justified by the fact that a value of $\chi^2(\hat\zeta; z)$ significantly higher than the expectation $\nu$ indicates inconsistency no matter what the distribution of $Z$ might be. The calculated probability $p$ simply describes how unlikely the observed $\chi^2$ value is if a normal distribution is assigned to $Z$.
7 Normalized deviations
If the test described in the previous section leads to a rejection of the measurements, a tool for identifying the outlying measurements is desirable. A measured value $z_i$ is defined as an outlier if the difference $z_i - \hat\zeta_i$ is significantly different from zero taking into account the standard uncertainty $u(z_i - \hat\zeta_i)$ of that difference. This leads to the introduction of the normalized deviation $d_i$ defined by³

$$d_i = \frac{z_i - \hat\zeta_i}{u(z_i - \hat\zeta_i)} = \frac{z_i - \hat\zeta_i}{\sqrt{u^2(z_i) - u^2(\hat\zeta_i)}}, \qquad i = 1, \ldots, m.$$

The normalized deviation $d_i$ has zero expectation and variance 1. A normalized deviation with $|d_i|$ larger than 2 or 3 is therefore rather unlikely no matter what the distribution of the random variable $d_i$ might be.

If a multivariate normal distribution is assigned to $Z$ and the model function $f(\beta, \zeta)$ is linearizable, the normalized deviation $d_i$ is normally distributed,

$$d_i \in N(0, 1), \qquad i = 1, \ldots, m.$$

In that case

$$P\{|d_i| > 2\} \approx 5\%,$$

and a measurement with $|d_i| > 2$ is therefore identified as an outlier at a 5% level of significance. It is suggested to use the criterion $|d_i| > 2$ to identify potential outliers even if the distribution assigned to $Z$ is not normal.
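A vectorized sketch (ours), including the convention of footnote 3 that $d_i = 0$ when $u(z_i - \hat\zeta_i) = 0$; it continues the two-measurement example, where $u^2(\hat\zeta_i) = 0.008$ for both adjusted estimates:

```python
import numpy as np

def normalized_deviations(z, u_z, zeta_hat, u_zeta_hat):
    """d_i = (z_i - zeta_hat_i) / sqrt(u^2(z_i) - u^2(zeta_hat_i)); values with
    |d_i| > 2 flag z_i as a potential outlier at roughly a 5% level.  Where
    u(z_i - zeta_hat_i) = 0 the deviation is set to zero (footnote 3)."""
    z, u_z = np.asarray(z, float), np.asarray(u_z, float)
    zh, u_h = np.asarray(zeta_hat, float), np.asarray(u_zeta_hat, float)
    var = u_z**2 - u_h**2
    d = np.zeros_like(z)
    np.divide(z - zh, np.sqrt(np.maximum(var, 0.0)), out=d, where=var > 0)
    return d

# Weighted-mean example: u^2(zeta_hat_i) = 0.008 for both adjusted estimates.
d = normalized_deviations([1.0, 1.2], [0.1, 0.2], [1.04, 1.04],
                          np.sqrt([0.008, 0.008]))
```

Both deviations come out as $\mp 0.894$, well inside the $|d_i| > 2$ outlier threshold.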
8 Adjustment of a variance σ²
If some values $z_i$ have a common but unknown variance $u^2(z_i) = \sigma^2$, this variance can be estimated by adjusting $\sigma^2$ by an iterative procedure until the "observed" $\chi^2$ value becomes equal to its expectation value $\nu$,

$$\chi^2(\hat\zeta; z) = (z - \hat\zeta)^T \Sigma^{-1} (z - \hat\zeta) = \nu,$$

where the covariance matrix $\Sigma$ is a function of the unknown variance $\sigma^2$. As the estimates $\hat\zeta$ depend on the value assigned to $\sigma^2$, these estimates have to be updated together with the estimates $\hat\beta$ each time the value of $\sigma^2$ is changed during the iteration.

This way of estimating the unknown variance $\sigma^2$ leads to the well-known expression for the standard deviation in the case of a repeated measurement of a single quantity as shown in Section 13.

³If $u(z_i - \hat\zeta_i) = 0$, the difference $z_i - \hat\zeta_i$ will be zero as well and $d_i$ may be set equal to zero. This situation occurs whenever there is no redundant information available regarding the value of the quantity $\zeta_i$.
9 Example: Calibration of an analytical balance
An analytical balance with capacity Max = 220 g, resolution d = 0.1 mg, and built-in adjustment weight was calibrated by DFM in October 1999 during an inter-laboratory measurement comparison piloted by DFM. Two mass standards were used as reference standards. One of them was a traditional 200 g weight (named R200g) of known conventional mass value⁴ $m_R$ and density $\rho_R$. The other reference standard was a specially designed 200 g stack of weights consisting of four discs (named 100g, 50g, 25g and 25g*) machined from the same metal bar of known density $\rho$. The conventional mass values $m_1$, $m_2$, $m_3$, $m_4$ respectively of these four discs were not known a priori; only the conventional mass value $m_S = m_1 + m_2 + m_3 + m_4$ of the stack was known.

The calibration was performed by placing a weight combination at the weighing pan of the balance and by recording the corresponding average indication $I$ in the display. A total of 18 weight combinations were used. Each weight combination was weighed 3 times, from which the average indication was calculated. The calibration was repeated 4 times during a period of 10 days in which the inter-laboratory comparison took place. From these four calibrations, a grand average indication $I_i$, $i = 1, \ldots, 18$, was calculated for each of the 18 weight combinations specified in Table 1. The standard uncertainty of the grand average was estimated from the observed variation in indication over the four calibrations.
TAB. 1. The weight combinations corresponding to the 18 balance indications $I_i$.

I_1   100g 50g 25g 25g*      I_10  50g 25g 25g*
I_2   100g 50g 25g 25g*      I_11  50g 25g
I_3   100g 50g 25g           I_12  50g 25g*
I_4   100g 50g 25g*          I_13  50g
I_5   100g 50g               I_14  25g 25g*
I_6   100g 25g 25g*          I_15  25g
I_7   100g 25g*              I_16  25g*
I_8   100g 25g               I_17  R200g
I_9   100g                   I_18  R200g
Due to the effect of air buoyancy, the balance indication depends not only on the mass of the weighed body, but also on the density of the body as well as the density of the air. When calibrated in air of known density $a$, the reference indication $I_R$ of the balance corresponding to a load generated by a weight with conventional mass value $m$ and density $\rho$ is given by

$$I_R = m\left(1 - (a - a_0)\left(\frac{1}{\rho} - \frac{1}{\rho_0}\right)\right),$$

where $a_0 = 1.2$ kg/m³ and $\rho_0 = 8000$ kg/m³ are the reference densities of the air and the weight, respectively, to which the conventional value of mass refers. As a model for the

⁴The conventional mass value of a body is defined as the mass of a hypothetical weight of density 8000 kg/m³ that balances the body when weighed in air of density 1.2 kg/m³ and temperature 20 °C.
TAB. 2. Measured and estimated values and associated standard uncertainties.

Quantity  Unit      z            u(z)       ζ̂            u(ζ̂)        d
m_S       [g]       199.988816   0.000010   199.988814   0.000010    1.66
m_R       [g]       199.999043   0.000008   199.999043   0.000008   -1.66
ρ_R       [kg/m³]   7833.01      0.29       7833.01      0.29        1.66
ρ         [kg/m³]   7965.76      0.71       7965.76      0.71       -1.66
a         [kg/m³]   1.1950       0.0035     1.1946       0.0035      1.66
I_1       [div]     199.988617   0.000023   199.988620   0.000011   -0.16
I_2       [div]     199.988608   0.000023   199.988620   0.000011   -0.56
I_3       [div]     174.992133   0.000023   174.992149   0.000012   -0.77
I_4       [div]     175.009992   0.000023   175.010024   0.000012   -1.61
I_5       [div]     150.013558   0.000023   150.013558   0.000013    0.03
I_6       [div]     149.980675   0.000023   149.980672   0.000012    0.14
I_7       [div]     125.002083   0.000023   125.002087   0.000014   -0.20
I_8       [div]     124.984217   0.000023   124.984212   0.000014    0.25
I_9       [div]     100.005650   0.000023   100.005632   0.000013    0.93
I_10      [div]     99.982925    0.000023   99.982899    0.000013    1.38
I_11      [div]     74.986433    0.000023   74.986450    0.000014   -0.87
I_12      [div]     75.004325    0.000023   75.004325    0.000014    0.03
I_13      [div]     50.007892    0.000023   50.007881    0.000012    0.54
I_14      [div]     49.974992    0.000023   49.974995    0.000013   -0.19
I_15      [div]     24.978533    0.000023   24.978557    0.000011   -1.17
I_16      [div]     24.996417    0.000023   24.996432    0.000011   -0.77
I_17      [div]     199.998867   0.000023   199.998851   0.000011    0.78
I_18      [div]     199.998875   0.000023   199.998851   0.000011    1.19

Quantity  Unit      β̂            u(β̂)
f         [g/div]   1.00000186   0.00000019
A         [1/div]   -4.4E-09     1.0E-09
m_1       [g]       100.005774   0.000011
m_2       [g]       50.007963    0.000010
m_3       [g]       24.978601    0.000010
m_4       [g]       24.996476    0.000010
calibration curve of the balance, a second order polynomial through zero is assumed,

    I_R = f (I + A·I²),

where f and A are unknown quantities to be determined from the calibration data.
In this example, there are m = 23 quantities for which prior information is available
from the measurements performed:

    ζ = (m_S, m_R, p_R, p, a, I_1, ..., I_18)ᵀ,

whereas there are k = 6 quantities for which no prior information is available:

    β = (f, A, m_1, m_2, m_3, m_4)ᵀ.
          f        A        m_1      m_2      m_3      m_4
   f      1       -0.945    0.021    0.071    0.096    0.096
   A     -0.945    1        0.124   -0.016   -0.094   -0.094
   m_1    0.021    0.124    1       -0.194   -0.269   -0.268
   m_2    0.071   -0.016   -0.194    1       -0.287   -0.287
   m_3    0.096   -0.094   -0.269   -0.287    1       -0.287
   m_4    0.096   -0.094   -0.268   -0.287   -0.287    1

TAB. 3. Correlation coefficients of the estimated β values.
Between these quantities, there are n = 19 constraints:

    f(β, ζ) = ( (m_1+m_2+m_3+m_4)(1 - (a - a_0)(1/p - 1/p_0)) - f(I_1 + A·I_1²) )
              (                            ⋮                                    )
              ( m_R (1 - (a - a_0)(1/p_R - 1/p_0)) - f(I_18 + A·I_18²)          )
              ( m_S - (m_1 + m_2 + m_3 + m_4)                                   )  =  0.
The measured values z and associated standard uncertainties are given in Table 2
under the headings z and u(z). All measured values are assumed to be uncorrelated.
By solving the normal equations, the estimates ζ and β and associated standard
uncertainties given in Table 2 under the headings ζ, u(ζ), β and u(β) are obtained.
Selected correlation coefficients derived from D(β, ζ)⁻¹ are given in Table 3. The
observed minimum χ² value is χ²(ζ, z) = 8.6, which should be compared to the expectation
value ν = n − k = 19 − 6 = 13. Since P{χ²(13) > 8.6} = 80.3%, it is concluded that
the measured values are consistent with the specified constraints taking into account the
measurement uncertainties. This conclusion is confirmed by the calculated normalized
deviations given in Table 2 under the heading d; all normalized deviations satisfy
the criterion |d| < 2.
From the estimates of the quantities f and A and the associated covariance matrix,
the error of indication E, defined as

    E = I − I_R = I − f (I + A·I²),

and the associated standard uncertainty u(E) can be calculated as a function of the
indication I. The result is shown in Figure 1 as the full lines representing E − u(E), E,
and E + u(E). The measured points E_i, i = 1, ..., 18, shown in the figure are the observed
average balance indications I_i minus the corresponding reference values I_R. The error
bars of the measured points indicate the standard uncertainties u(E_i) that have been
calculated taking into account the covariance between I_i and I_R.
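The propagation of the (f, A) covariance through E can be checked directly. The following sketch is illustrative and not part of the paper: it uses the estimates of f and A from Table 2 and their correlation coefficient from Table 3, and treats the indication I as exact.

```python
import math

# Estimates from Table 2 and their correlation coefficient from Table 3
f, u_f = 1.00000186, 0.00000019    # slope [g/div]
A, u_A = -4.4e-9, 1.0e-9           # curvature [1/div]
r_fA = -0.945                      # correlation r(f, A)

def error_of_indication(I):
    """E = I - I_R = I - f*(I + A*I^2)."""
    return I - f * (I + A * I**2)

def u_error(I):
    """u(E) by linear propagation through f and A (I treated as exact)."""
    c_f = -(I + A * I**2)   # dE/df
    c_A = -f * I**2         # dE/dA
    var = (c_f * u_f)**2 + (c_A * u_A)**2 \
        + 2.0 * c_f * c_A * r_fA * u_f * u_A
    return math.sqrt(var)
```

Note how the strong anticorrelation r(f, A) = −0.945 cancels a large part of the two individual contributions near full load, so u(E) at 200 div is considerably smaller than either term alone.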
10
Example: Evaluation of calibration history
A weight (named R1mg) of nominal mass 1 mg has been calibrated 39 times in the
period 1992-2001. For calibration number i, the mass m_i of the weight at the time
t_i and the associated standard uncertainties u(m_i) and u(t_i) are given. The calibration
history of the weight is shown in Figure 2 as dots with error bars indicating the standard
[Figure: balance BP221S; curves Fit, Fit + u(Fit) and Fit − u(Fit) together with the
measured points, plotted against the balance indication I/g.]

FIG. 1. Error of indication of the calibrated balance.
uncertainties; the scale mark 1992-01 on the time axis indicates the position of the date
1 January 1992, etc.
Due to wear and changes in the amount of dirt adsorbed to the surface, the mass of
the weight is expected to change in time. A reasonable model of the change in mass as a
function of time is a superposition of a deterministic linear drift and a random variation,

    m_i = a_1 + a_2·t_i + δm_i ,    i = 1, ..., 39,
where δm_i is a random variable with zero expectation and variance σ². The drift parameters a_1, a_2 and the associated covariance matrix as well as the variance σ² are unknown
a priori and are to be estimated from the calibration history available. Once the estimates a_1 and a_2 have been found, it is possible to predict a value m of the mass of the
weight as a function of time t,

    m = a_1 + a_2·t + δm,

where δm = 0 with standard uncertainty u(δm) = σ. The standard uncertainty of the
predicted mass value is given by

    u²(m) = u²(a_1) + t²·u²(a_2) + 2t·u(a_1, a_2) + σ².

The measurement model used for evaluating the calibration history is

    ζ = (m_1, ..., m_39, t_1, ..., t_39, δm_1, ..., δm_39)ᵀ ,    β = (a_1, a_2)ᵀ,
[Figure: calibration history of R1 mg before adjustment of σ: mass values (mg) with
error bars and the fitted drift against the time of calibration, annotated with
P{χ²(37) > 66.6} = 0.2%; the normalized deviations are plotted below.]

FIG. 2. Evaluation of the calibration history of a 1 mg weight assuming that σ = 0.

[Figure: the corresponding plots after adjustment of σ.]

FIG. 3. Evaluation of the calibration history of the 1 mg weight with σ adjusted
to 0.092 μg.
    f(β, ζ) = ( m_1 − (a_1 + a_2·t_1 + δm_1)     )
              (              ⋮                   )  =  0.
              ( m_39 − (a_1 + a_2·t_39 + δm_39)  )

The measured values z are given by the calibration history, except for the values of
δm_i, i = 1, ..., 39, which are set equal to the expectation value zero. The associated covariance matrix u(z, zᵀ) = Σ is built up from the uncertainties u(m_i) and u(t_i) available
from the calibration history and a negligible but finite† initial value of the unknown
variance σ². Since the standard uncertainties u(m_i) are of the order 0.1 μg, the value
σ = 1E-07 μg is considered negligible and is selected as a starting point.
By solving the normal equations, estimates a_1 and a_2 of the drift parameters and the
associated covariance matrix are found after a few iterations. The predicted value m of
the mass of the weight and the associated standard uncertainty u(m) as a function of
time are shown in Figure 2 as solid lines. The normalized deviations d associated with
the mass values m_i are shown in Figure 2 as well‡. The observed minimum chi-square
value is χ² = 66.6, which is large compared to the expectation value ν = 39 − 2 = 37.
Since P{χ²(37) > 66.6} = 0.2%, the hypothesis σ = 0, i.e. no random variation in the
mass, is rejected at a 0.2% level of significance.
The value of σ is therefore increased as described in Section 8 until the calculated
minimum χ² value becomes equal to its expectation value ν = 37. In this way the
standard uncertainty reflecting the random variation of the mass of the weight is found
to be σ = 0.092 μg. The result of the evaluation of the calibration history after adjustment
of σ is shown in Figure 3. Note the significant increase in the standard uncertainty of
the predicted value of the mass of the weight and the decrease in the absolute value of
the normalized deviations d.
The calibration history can also be evaluated by an iterative technique based on linear
regression [3]. The results obtained are identical to the results presented in this section.
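The σ-adjustment can be sketched for the simplest case of a constant-value model. This is an illustrative toy, not the paper's implementation: the data are synthetic, and bisection stands in for the iteration the paper uses (χ² is monotonically decreasing in σ, so bisection suffices). The prediction-uncertainty formula from above is included as well.

```python
import math

def chi_square(x, u, sigma):
    """Minimum chi-square of a constant-value model when every variance
    u_i^2 is inflated by the common random-variation term sigma^2."""
    w = [1.0 / (ui**2 + sigma**2) for ui in u]
    mu = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    return sum(wi * (xi - mu)**2 for wi, xi in zip(w, x))

def adjust_sigma(x, u, nu, hi=10.0, tol=1e-12):
    """Increase sigma until chi_square equals its expectation nu."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi_square(x, u, mid) > nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def u_predicted(t, u_a1, u_a2, cov_a1a2, sigma):
    """u^2(m) = u^2(a1) + t^2 u^2(a2) + 2 t u(a1,a2) + sigma^2."""
    return math.sqrt(u_a1**2 + (t * u_a2)**2 + 2.0 * t * cov_a1a2 + sigma**2)

# Synthetic history: scatter much larger than the stated uncertainties,
# so sigma = 0 would be rejected, exactly as in the example above.
x = [1.00, 1.20, 0.80, 1.30]
u = [0.05, 0.05, 0.05, 0.05]
sigma = adjust_sigma(x, u, nu=len(x) - 1)
```

As in the example, the adjusted σ inflates every variance, which both lowers χ² to its expectation and widens the uncertainty of any predicted value through the σ² term in u_predicted.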
11
Case I: Univariate output quantity, Y = h(X_1, ..., X_N)
In this section it is shown that the evaluation of measurements by the method of least
squares is consistent with the generally accepted principles for evaluating measurement
uncertainty as described in the GUM [1].
Using the nomenclature of the GUM, a univariate output quantity Y is assumed to
be related to N input quantities X_1, ..., X_N through a specified function h.
The values assigned to the input and output quantities are denoted x_1, ..., x_N and
y respectively. In the nomenclature of this paper, the measurement model is

    ζ = (X_1, ..., X_N)ᵀ ,    β = (Y),

    f(β, ζ) = (Y − h(X_1, ..., X_N)) = 0.

The measured values are

    z = (x_1, ..., x_N)ᵀ

† If the variance σ² is assumed to be exactly zero, the quantities δm_i have to be removed from the model. Otherwise
the covariance matrix Σ will be singular.
‡ The absolute value of the normalized deviations of t_i and δm_i is equal to the absolute value of the normalized
deviation of m_i.
with the known covariance matrix

    Σ = u(z, zᵀ) = ( u²(x_1)       ⋯   u(x_1, x_N) )
                   (    ⋮          ⋱       ⋮       )
                   ( u(x_N, x_1)   ⋯   u²(x_N)     ).

The coefficient matrix D of the normal equations is

    D(β, ζ) = ( 0^(1,1)   0^(1,N)      1          )
              ( 0^(N,1)   Σ⁻¹          −∇x h(x)ᵀ  )
              ( 1         −∇x h(x)     0          ),

where

    ∇x h = ( ∂h/∂x_1 , ... , ∂h/∂x_N ).

In the present case, the solution to the normal equations is found after one iteration,

    y = β = h(x_1, ..., x_N) ,    ζ = (x_1, ..., x_N)ᵀ ,    λ = 0.
The associated covariances are given by

    ( u²(y)      u(y, ζᵀ)    0^(1,1) )
    ( u(ζ, y)    u(ζ, ζᵀ)    0^(N,1) )  =  D(β, ζ)⁻¹
    ( 0^(1,1)    0^(1,N)     −u²(λ)  )

                ( ∇x h(x) Σ ∇x h(x)ᵀ   ∇x h(x) Σ   1 )
             =  ( Σ ∇x h(x)ᵀ           Σ           0 )
                ( 1                    0           0 ).

In other words,

    u²(y) = ∇x h(x) Σ ∇x h(x)ᵀ = Σ_{i=1}^{N} Σ_{j=1}^{N} c_i u(x_i, x_j) c_j ,    c_i = (∂h/∂x_i)(x),

which is identical to the linear variance propagation formula given in the GUM.
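The double-sum form of the propagation formula is easy to machine-check. A minimal sketch (function and variable names are mine, not the paper's):

```python
def u_output(c, S):
    """u^2(y) = sum_i sum_j c_i u(x_i, x_j) c_j, returned as u(y).
    c: sensitivity coefficients dh/dx_i evaluated at x;
    S: covariance matrix of the input estimates x_i."""
    N = len(c)
    var = sum(c[i] * S[i][j] * c[j] for i in range(N) for j in range(N))
    return var ** 0.5

# y = x1 + x2 with u(x1) = 3, u(x2) = 4, uncorrelated: u(y) = 5
uy = u_output([1.0, 1.0], [[9.0, 0.0], [0.0, 16.0]])
```

With full positive correlation the off-diagonal terms contribute and the uncertainties add linearly instead of in quadrature.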
12
Case II: Linear regression, Y = Xa
Linear regression is applied when there is a linear relationship Y = Xa between some
observed quantities Y and some unknown quantities a. The design matrix X is made up
of known elements that may be given as specified functions of one or several independent
variables. In the notation of this paper, the measurement model for the linear regression
problem is

    ζ = Y = (Y_1, ..., Y_n)ᵀ ,    β = a = (a_1, ..., a_k)ᵀ ,

    f(ζ, β) = Y − Xa = 0,

where X^(n,k) is the known design matrix. The measured values are

    z = y = (y_1, ..., y_n)ᵀ
with known covariance matrix

    Σ = u(z, zᵀ) = ( u²(y_1)       ⋯   u(y_1, y_n) )
                   (    ⋮          ⋱       ⋮       )
                   ( u(y_n, y_1)   ⋯   u²(y_n)     ).

The coefficient matrix D of the normal equations is

    D(β, ζ) = ( 0^(k,k)   0^(k,n)   −Xᵀ      )
              ( 0^(n,k)   Σ⁻¹       I^(n,n)  )
              ( −X        I^(n,n)   0^(n,n)  ).

Again, the solution to the normal equations is found after one iteration,

    a = β = C Xᵀ Σ⁻¹ y ,    Y = ζ = Xa ,    λ = Σ⁻¹(Y − y),

where C = (Xᵀ Σ⁻¹ X)⁻¹. The associated covariances are given by

    ( u(a, aᵀ)   u(a, Yᵀ)   0^(k,n)    )
    ( u(Y, aᵀ)   u(Y, Yᵀ)   0^(n,n)    )  =  D(β, ζ)⁻¹
    ( 0^(n,k)    0^(n,n)    −u(λ, λᵀ)  )

                ( C           C Xᵀ             −C Xᵀ Σ⁻¹            )
             =  ( X C         X C Xᵀ           I − X C Xᵀ Σ⁻¹       )
                ( −Σ⁻¹ X C    I − Σ⁻¹ X C Xᵀ   Σ⁻¹ X C Xᵀ Σ⁻¹ − Σ⁻¹ ),
that is,

    a = C Xᵀ Σ⁻¹ y ,    u(a, aᵀ) = C = (Xᵀ Σ⁻¹ X)⁻¹,

as is known from the theory of linear regression.
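For a straight line y = a_1 + a_2·t and a diagonal Σ, the formulas a = (XᵀΣ⁻¹X)⁻¹XᵀΣ⁻¹y and u(a, aᵀ) = C reduce to a 2×2 solve. The sketch below is illustrative (my own names; diagonal Σ is an assumption, not the general case treated above):

```python
def gls_line(t, y, u):
    """Weighted straight-line fit y = a1 + a2*t.
    Solves a = (X^T S^-1 X)^-1 X^T S^-1 y for S = diag(u_i^2) and
    returns the estimates together with C = (X^T S^-1 X)^-1."""
    w = [1.0 / ui**2 for ui in u]
    s0 = sum(w)
    s1 = sum(wi * ti for wi, ti in zip(w, t))
    s2 = sum(wi * ti * ti for wi, ti in zip(w, t))
    b0 = sum(wi * yi for wi, yi in zip(w, y))
    b1 = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    det = s0 * s2 - s1 * s1
    a1 = (s2 * b0 - s1 * b1) / det   # intercept
    a2 = (s0 * b1 - s1 * b0) / det   # slope
    C = [[s2 / det, -s1 / det], [-s1 / det, s0 / det]]
    return (a1, a2), C

# Exact line y = 2 + 3t with unit weights recovers (2, 3) exactly
(a1, a2), C = gls_line([0.0, 1.0, 2.0], [2.0, 5.0, 8.0], [1.0, 1.0, 1.0])
```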
13
Case III: Repeated observations of a single quantity
Assume that a quantity X is measured n times with the same uncertainty σ. Such a
measurement can be modelled by n quantities X_1, ..., X_n having a common value μ:

    ζ = X = (X_1, ..., X_n)ᵀ ,    β = (μ),

    f(β, ζ) = ( X_1 − μ )
              (    ⋮    )  =  0.
              ( X_n − μ )

The measured values are

    z = x = (x_1, ..., x_n)ᵀ,

and under the assumption that the measurement results are mutually independent, the
associated covariance matrix is given by

    Σ = u(z, zᵀ) = ( σ²   ⋯   0  )
                   ( ⋮    ⋱   ⋮  )
                   ( 0    ⋯   σ² ).
The coefficient matrix D of the normal equations is

    D(β, ζ) = ( 0^(1,1)    0^(1,n)       −1^(1,n)  )
              ( 0^(n,1)    σ⁻² I^(n,n)   I^(n,n)   )
              ( −1^(n,1)   I^(n,n)       0^(n,n)   ),

where 1 denotes a matrix with all elements equal to 1. The solution of the normal
equations is found after one iteration,

    μ = (1/n) Σ_{i=1}^{n} x_i.

The associated covariances are given by

    ( u²(μ)      u(μ, ζᵀ)   0^(1,n)    )
    ( u(ζ, μ)    u(ζ, ζᵀ)   0^(n,n)    )  =  D(β, ζ)⁻¹
    ( 0^(n,1)    0^(n,n)    −u(λ, λᵀ)  )

                ( σ² n⁻¹           σ² n⁻¹ 1^(1,n)          n⁻¹ 1^(1,n)                  )
             =  ( σ² n⁻¹ 1^(n,1)   σ² n⁻¹ 1^(n,n)          I^(n,n) − n⁻¹ 1^(n,n)        )
                ( n⁻¹ 1^(n,1)      I^(n,n) − n⁻¹ 1^(n,n)   σ⁻² (n⁻¹ 1^(n,n) − I^(n,n))  ).

As expected,

    μ = (1/n) Σ_{i=1}^{n} x_i ,    u²(μ) = σ²/n.
If σ² is not known a priori, it can be estimated by solving the equation χ²(ζ, z) = ν = n − 1,
i.e.,

    σ² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − μ)²,

which is the well known expression for the experimental standard deviation s.
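Case III in code: the least-squares estimate is the mean, and the χ²-based estimate of σ is the experimental standard deviation. A stdlib-only sketch (names are mine):

```python
import math

def repeated_observations(x):
    """Least-squares result for n repeated observations: the mean mu,
    the experimental standard deviation s (estimate of sigma), and
    u(mu) = s / sqrt(n)."""
    n = len(x)
    mu = sum(x) / n
    s = math.sqrt(sum((xi - mu)**2 for xi in x) / (n - 1))
    return mu, s, s / math.sqrt(n)

mu, s, u_mu = repeated_observations([1.0, 2.0, 3.0])   # mu = 2.0, s = 1.0
```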
14
Conclusion
A general technique for evaluation of measurements by the method of Least Squares
has been presented. The applicability of the method has been demonstrated by two
examples. It has been shown that the method is fully compatible with the generally
accepted principles for evaluation of measurement uncertainty laid down in the GUM
and that ordinary linear regression is just a special case of the method.
The input to the method consists of
• An estimate of the value of each measured quantity, including any relevant influence
quantity.
• The covariance matrix of these estimates formed by the standard uncertainties of
the estimates and the correlation coefficients between the estimates.
186
Lars Nielsen
• A measurement model describing all the known relations between the measured
quantities and some additional quantities (if needed) for which no prior information
is available.
The output of the method consists of
• An adjusted estimate of the value of each measured quantity and an estimate of
each additional quantity introduced in the measurement model.
• The covariance matrix of all these estimates from which the standard uncertainties
and correlation coefficients can be calculated.
• A chi-square value which is a measure of the degree of consistency between the
measurement model, the input estimates, and the covariances of the input quantities.
The adjusted estimate of the value of a measured quantity differs from the input
estimate only if the measurement model imposes additional information regarding the
value of that particular quantity. In that case the standard uncertainty of the adjusted
estimate will be smaller than the standard uncertainty of the input estimate. For a
good measurement, the difference between the adjusted estimate and the input estimate
of a measured quantity should not be large compared to the standard uncertainty of
that difference. It has therefore been suggested that the ratio d of the difference to its
standard uncertainty is calculated and assessed against a selected criterion, e.g. |d| < 2.
By plotting the d values of the adjusted estimates it is possible to assess whether a too
high chi-square value is caused by a few poor input estimates or is due to a poor model.
Bibliography
1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML, Guide to the Expression of Uncertainty in Measurement, ISO, 1995.
2. L. Nielsen, Least-squares estimation using Lagrange multipliers, Metrologia 35 (1998), 115-118. Erratum, Metrologia 37 (2000), 183.
3. L. Nielsen, Evaluation of the calibration history of a measurement standard, DFM report DFM-01-R25, 2001, 1-6.
4. W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical Recipes in C, 2nd ed., Cambridge University Press, Cambridge, 1992, 36-40 and 681-688.
5. T. L. Saaty and J. Bram, Nonlinear Mathematics, Dover Publications, New York, 1981, 93-95.
6. K. Weise and W. Wöger, Uncertainty and Measurement Data Evaluation, Wiley-VCH, 1999, 183-224 [in German].
An overview of the relationship between approximation
theory and filtration
Paul J. Scott
Taylor Hobson Limited, Leicester, UK.
PScott@taylor-hobson.com
Xiang Q. Jiang
University of Huddersfield, Huddersfield, UK.
x.jiang@hud.ac.uk
Liam A. Blunt
University of Huddersfield, Huddersfield, UK.
l.a.blunt@hud.ac.uk
Abstract
This paper gives an overview of the similarities and differences between the requirements
and techniques used in mathematical approximation theory and filtration in surface metrology. Although the two fields tend to use the same or similar mathematical objects to
produce functions that simplify a function in a controlled manner, it is the way that this
simplification is achieved which is the main difference between the two. Approximation
theory uses norms to judge the closeness of the approximation while filtration uses the
concept of wavelength to control the "smoothness" of the result of filtration. The new ISO
definition of a filter is stated, together with a generalisation of the concept of wavelength
through "brickwall" filters. This new ISO definition of a filter illustrates the closeness
of approximation theory and filtration. The paper then proceeds to survey some recent
developments in filtration in the hope that there can be some cross-fertilisation between
approximation theory and filtration. These include wavelets, robust filters and non-linear
filters such as the family of morphological filters, which includes envelope filters and alternating sequence filters (non-linear multiresolution). Examples from surface texture are
used throughout the paper.
1
Introduction
This paper gives an overview of the similarities and differences between the requirements
and techniques used in mathematical approximation theory and filtration in surface
metrology. It is not the intention of this paper to give full mathematical detail but
to survey recent developments in filtration in the hope that there can be some cross-fertilisation between approximation theory and filtration.
Although the two fields tend to use the same or similar mathematical objects to
produce functions that simplify the original function in a controlled manner, it is the
way that this simplification is achieved which is the main difference between the two.
Mathematical approximation theory is concerned with best and good approximation
of a large family of functions from a smaller set (usually finitely generated, linear or
non-linear) in certain normed spaces (such as Lp), the construction of good approximants (if possible) and the determination of approximation order. Classical tools to
achieve this include polynomial tools and splines. More recent tools include wavelets
and multiresolution that decompose the normed spaces.
Filtration uses the concept of "wavelength" to control the "smoothness" of the result
of filtration. In surface metrology, filtration is concerned with the extraction of features
within a prescribed "wavelength" band defined by "wavelength cut-offs". Classical tools
to achieve this include Gaussian filters [1], polynomials and splines [4]. Recently there
has been a resurgence of activity, both fundamentally and practically, in filtration for
surface metrology.
The International Standards Organisation Technical Committee 213 (ISO TC/213),
whose remit includes surface metrology, has recently set up an Advisory Group (AG9) to
explore filtration for surface metrology. They are producing a series of technical specifications (ISO/TS 16610 series [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) to standardise filter terminology
and to introduce to industry other filtration tools, which include spline wavelets [5],
morphological filters [9] and scale-space techniques [10].
Other groups are also producing filtration for surface metrology. The University of
Huddersfield has used second generation wavelets to produce an improved spline wavelet [12]. The University of Hanover is exploring robust Gaussian filtration [6]. PTB has
developed a Robust Spline filter [7]. The rest of the paper surveys some of the results of
this recent activity.
2
Basic concepts of filtration
This section is a summary of the basic concepts of filtration as given in ISO/TS 16610
part 1 [2].
Let P be the space of real surfaces.
Let V_λ be a set of nested subspaces indexed by λ ∈ R+ (here R+ is the set of positive
reals which includes zero) such that

    ∀λ > μ > 0: V_λ ⊂ V_μ ⊂ P, and V_0 is dense in P.

The nesting index λ is a number indicating the relative level of nesting for a particular
subspace, in such a way that, given a particular nesting index, subspaces with lower indices
contain more surface information and subspaces with higher nesting indices contain less
surface information. By convention, as the nesting index approaches zero there exists a
surface in that indexed subspace that approximates the real surface to within any given
measure of closeness as defined by a suitable norm. Thus approximation theory is used
to define filtration. The usual norm used in filtration is L2, but others are used, such as
the one-sided Chebychev for morphological filters.
Let Φ_λ : P → V_λ be a projection from the space of real surfaces onto the subspace
indexed by λ > 0 which satisfies the following two properties.
• The sieve criterion: ∀λ, μ > 0 and ∀a ∈ P: Φ_λ(Φ_μ(a)) = Φ_sup(λ,μ)(a).
• The projection criterion: ∀λ > 0 and ∀a ∈ V_λ: Φ_λ(a) = a.
Φ_λ is called the brickwall filter (or primary mapping) and is a method of choosing a
particular surface, belonging to the subspace with a specified nesting index, to represent
the real surface, which satisfies the projection and sieve criteria [16].
The sieve criterion allows brickwall filters to have the property that once the surface
has been brickwall filtered at a particular nesting index, subsequent brickwall filtering
with a higher nesting index will produce the same surface as brickwall filtering the
original surface with the brickwall filter with the higher nesting index.
The projection criterion is required in order that the nesting index is a scale or size.
To see this, define the set operator Ψ_λ : P → P as

    ∀λ > 0 and for every subset P of the space of surfaces: Ψ_λ(P) := {p : p ∈ P and Φ_λ(p) = p}.

That is to say, p ∈ Ψ_λ(P) if and only if p ∈ P and Φ_λ(p) = p. Then it is easily
demonstrated that the set operator Ψ_λ is a granulometry [16] on P and λ is the scale/size
of the granulometry.
Since the nesting index of brickwall filters is a scale/size and it satisfies the sieve
criterion, it can be used to define the generalised concept of wavelength. An example
of a brickwall filter is a morphological closing filter using a sphere as the structuring
element. Here the nesting index is the radius of the sphere.
Other filters can be constructed using brickwall filters (e.g. weighted mean of brickwall
filters, supremum of brickwall filters, etc.).
3
Wavelet filters
An important example of the concepts discussed in the previous section is wavelet filtration. The multiresolution form of the wavelet transform consists of constructing a
ladder of smooth approximations to the profile. The first rung is the original profile.
Each rung in the ladder consists of a filter bank where the profile A_i is split into two
components, giving a smoother version A_{i+1} of the profile, which becomes the next rung,
and a component D_{i+1} that is the "difference" between the two rungs.
The multiresolution ladder structure lends itself naturally to a set of nested mathematical models of the profile, with the ith model m_i reconstructed from (D_1, D_2, D_3, ...,
D_i, A_i). The nesting index is the order of the model: the higher the model, the smoother
the representation, with less detail. Thus m_{i+1} is a smoother version of the profile than
m_i.
As part of a research programme at the University of Huddersfield, the use of biorthogonal wavelets for surface analysis has been investigated because of their significant merits [12]. A very fast, second-generation, in-place algorithm, which uses the lifting scheme,
has been developed at Bell Laboratories for biorthogonal wavelets [13]. One important
property of biorthogonal wavelets is that they allow the construction of symmetric wavelets and thus linear phase filters that preserve the location of surface features with far
less distortion than phase shift filters.
Overview of approximation theory and filtration
Surface texture analysis usually breaks down a surface into defined wavelength components of the surface called roughness, waviness and form. There are many well-known
problems with the current standardised filter [14], i.e. the Gaussian filter [1], including lost
data at the edges, distortion due to form, retention of unwanted wavelengths, etc. Huddersfield has investigated the possibility of using a 'lifting wavelet' model to overcome
some of these problems and enhance the extraction accuracy for roughness, waviness and
form. This is achieved by using the wavelet transform to break down the surface into
subsets at different scales and recombining only those subsets of the scales of interest
(i.e. setting all the other subsets to zero and applying the inverse wavelet transform).
Figure 1 shows the application of the wavelet filtering technique to a femoral head from an
artificial hip joint. Full details of the particular biorthogonal wavelet and its associated
lifting scheme together with some engineering applications are given in reference [12].
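The ladder and the "zero the other subsets, then invert" step can be illustrated with the simplest lifting scheme, the unnormalised Haar wavelet. This sketch is mine and is not the spline wavelet of [12]; it assumes an input length divisible by 2 at every level.

```python
def haar_lift(a):
    """One lifting step: predict each odd sample from its even neighbour,
    then update the even sample so the running mean is preserved.
    Returns the smooth part s and the detail part d (len(a) must be even)."""
    s, d = [], []
    for even, odd in zip(a[0::2], a[1::2]):
        detail = odd - even            # predict step
        smooth = even + detail / 2.0   # update step: smooth = (even+odd)/2
        s.append(smooth)
        d.append(detail)
    return s, d

def haar_unlift(s, d):
    """Invert the lifting step exactly (in-place schemes lose nothing)."""
    a = []
    for smooth, detail in zip(s, d):
        even = smooth - detail / 2.0
        odd = even + detail
        a += [even, odd]
    return a

def smooth_reconstruct(a, levels):
    """Multiresolution ladder: transform, zero all detail subsets,
    then apply the inverse transform."""
    stack, s = [], list(a)
    for _ in range(levels):
        s, d = haar_lift(s)
        stack.append([0.0] * len(d))   # the discarded "difference" rungs
    for d in reversed(stack):
        s = haar_unlift(s, d)
    return s
```

Because the detail parts are kept on a stack, keeping some rungs and zeroing others gives exactly the nested models m_i described above.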
FIG. 1. Metallic femoral head showing original, reference and roughness surfaces.

4
Envelope filters
Traditional linear filters, such as the Gaussian filter [1], produce a smoothed mean surface
through a measured surface. Many engineering applications of functional surfaces involve
mechanical contact, where the envelope of the surface is of interest rather than the mean
surface. But what exactly is the envelope of a surface?
The following are defining properties of the envelope of a surface used by ISO TC/213
AG9 [8]:
• the envelope filter must be Extensive, i.e., ∀A: F(A) ≥ A,
• the envelope filter must be Increasing, i.e., A ≤ B implies F(A) ≤ F(B),
• the envelope filter must be Idempotent, i.e., F(F(A)) = F(A),
where A, B are surfaces and F(A) is the filtered surface of surface A.
But these are also the defining properties of a morphological closing filter [15]; hence
all envelope filters are morphological closing filters. A morphological closing filter using
a disk as the structuring element is illustrated in Figure 2.
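For a sampled profile and a flat (horizontal line segment) structuring element, a closing is simply a sliding maximum followed by a sliding minimum; the disk case differs only in the shape of the structuring element. A sketch (the window-clipping boundary handling is my choice, not prescribed by the standard):

```python
def dilate(z, k):
    """Flat dilation: sliding maximum over a segment of half-length k samples."""
    n = len(z)
    return [max(z[max(0, i - k):i + k + 1]) for i in range(n)]

def erode(z, k):
    """Flat erosion: sliding minimum over the same segment."""
    n = len(z)
    return [min(z[max(0, i - k):i + k + 1]) for i in range(n)]

def closing(z, k):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(z, k), k)
```

On the profile z = [0, 3, 0, -2, 0, 3, 0] with k = 1 the closing is [3, 3, 0, 0, 0, 3, 3]: narrow valleys are filled, the result never falls below z (extensive), and applying the filter again changes nothing (idempotent), exactly the defining properties listed above.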
Unfortunately, envelope filters, by definition, are not very robust to outliers, consisting
of large spikes, in the surface. Scale-space is an attempt to overcome this problem with
the morphological closing filter.
FIG. 2. An envelope filter using a closing filter with a disk as a structural element.

5
Scale-space
Scale-space is a way of breaking down a signal or image into objects of different scales.
To define scale-space we need to define the size of objects in a signal or image. This is
achieved using Alternating Sequence Filters [10].
Alternating Sequence Filters (ASFs) are defined in terms of matched pairs of closing
and opening filters. A closing followed by an opening both at a given scale (radius of the
circle, length of the horizontal segment, etc.) will eliminate features of the surface whose
"scales" are smaller than the given scale.
ASFs begin by eliminating very small features, then eliminating slightly larger features, and then eliminating slightly larger features still etc., in a systematic way up to
a given scale. Usually there is a constant ratio between successive scales. This process
produces a ladder structure similar to wavelet analysis. At each rung in the ladder the
profile is filtered by a matched pair of closing and opening filters at a given scale to
obtain the next rung profile and a component that is the "difference" between the two
rungs. The ladder structure leads to a multiresolution analysis, similar to wavelet analysis, with all of the associated analysis techniques. An example of scale space of a profile
from a ceramic surface is given in Figure 3. The top part of this figure shows the original
non-smoothed profile with the final smoothed profile.
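The ASF ladder itself can be sketched in a few lines on top of flat closings and openings. As before this uses a line-segment structuring element and my own boundary handling, for illustration only:

```python
def _window(z, i, k):
    return z[max(0, i - k):i + k + 1]

def dilate(z, k):
    return [max(_window(z, i, k)) for i in range(len(z))]

def erode(z, k):
    return [min(_window(z, i, k)) for i in range(len(z))]

def asf(z, scales):
    """Alternating Sequence Filter: at each scale apply the matched pair
    closing (erode after dilate) then opening (dilate after erode),
    eliminating ever larger features; returns the ladder of rungs."""
    rungs = [list(z)]
    for k in scales:                 # e.g. scales = (1, 2, 4): constant ratio
        closed = erode(dilate(rungs[-1], k), k)
        opened = dilate(erode(closed, k), k)
        rungs.append(opened)
    return rungs
```

A single scale-1 pass already removes both a one-sample spike (via the opening) and a one-sample pit (via the closing), which is exactly the feature-by-size elimination described above.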
6
Robustness
Robustness of filtration is an increasingly important area of interest in surface metrology.
Robustness is not in general an absolute property of a filter but a relative one. One can
only say that a particular filter is more robust than an alternative filter against a particular phenomenon if there is less distortion in that filter's response to that phenomenon
than in the alternative filter's response.
FIG. 3. Successively smoothed profiles of a ceramic profile using an ASF with a disk
(profile height in mm against spacing in mm; the top panel shows the original profile
with the final smoothed profile).

To make robustness an absolute property of filters we need to define a reference class
of profile filters with which to compare. The reference class of filters defined in ISO
TC/213 AG9 is the class of linear filters [3]. Hence by this definition all robust filters
must be non-linear. There are several well-known techniques (all non-linear) which can
produce robust filters for a particular phenomenon. These are indicated in the next
sections.
6.1
Metric based
Here the metric used to fit the filter to the surface is altered to a more "robust" metric.
For example, the metric based on the L1 norm is more robust against spike discontinuities than the metric based on the least squares norm (L2 norm), which in turn is
more robust than the metric based on the Chebychev norm (L∞ norm).
The Robust Spline Filter given in ISO/TS 16610 part 32 uses an L1 metric rather
than the usual L2 norm to make it more robust [7].
6.2
Robust statistics
Here each point on the surface is weighted according to its relative height position to the
filter's smooth response, with points further away being given less influence on the filter
response than points nearer in height. This is an attempt to make the filter more robust
against spike discontinuities. There are several standard functions used to allocate the
weights to points (Huber, Beaton functions, etc.) which can be found in any standard
book on robust statistics [17].
The Robust Gaussian regression filter given in ISO/TS 16610 part 31 uses a Beaton
function to alter the influence of outliers [6].
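A typical weight function of this kind is the Beaton (Tukey biweight) function. In the sketch below the tuning constant c is illustrative; it is not the value fixed in ISO/TS 16610-31:

```python
def beaton_weight(r, c):
    """Beaton (Tukey biweight) weight for a residual r: close to 1 near
    zero residual, falling smoothly to 0 at |r| = c, and exactly 0
    beyond, so gross outliers get no influence at all."""
    u = r / c
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
```

In a robust regression filter these weights are recomputed from the residuals and the fit is repeated until the weights stabilise.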
6.3
Pre-filtering
Pre-filtering is a technique where a phenomenon (such as spikes, form, etc.) in the surface is removed, or greatly reduced, by other means before filtration, thus removing
or greatly reducing any effect the phenomenon can have on the filter's response. This
approach has the advantage that once a method has been found to remove an unwanted
phenomenon, this method will work with any filter.
Form pre-filtering, involving removing the form of the surface before filtration, is a
very common technique used in surface metrology. Less common is using scale space
pre-filtering which involves removing singularities and other features of a certain size
before filtration.
7
Conclusions
The paper has given an overview of the similarities and differences between the requirements and techniques used in mathematical approximation theory and filtration in
surface metrology. Some recent work on filtration has been reported. It is hoped that
this paper can generate some cross-fertilisation between the two areas of approximation
theory and filtration.
Bibliography
1. ISO 11562:1996. Geometrical product specifications (GPS) — Surface texture: Profile
method — Metrological characteristics of phase correct filters.
2. ISO/TS 16610-1. Geometrical product specifications (GPS) — Filtration Part 1:
Overview and basic terminology.
3. ISO/TS 16610-20. Geometrical product specifications (GPS) — Filtration Part 20:
Linear profile filters; Basic concepts.
4. ISO/TS 16610-22. Geometrical product specifications (GPS) — Filtration Part 22:
Linear profile filters; Spline filters.
5. ISO/TS 16610-29. Geometrical product specifications (GPS) — Filtration Part 29:
Linear profile filters; Spline wavelets.
6. ISO/TS 16610-31. Geometrical product specifications (GPS) — Filtration Part 31:
Robust profile filters; Gaussian regression filters.
7. ISO/TS 16610-32. Geometrical product specifications (GPS) — Filtration Part 32:
Robust profile filters; Spline filters.
8. ISO/TS 16610-40. Geometrical product specifications (GPS) — Filtration Part 40:
Morphological profile filters; Basic concepts.
9. ISO/TS 16610-41. Geometrical product specifications (GPS) — Filtration Part 41:
Morphological profile filters; Disk and horizontal line segment filters.
10. ISO/TS 16610-49. Geometrical product specifications (GPS) — Filtration Part 49:
Morphological profile filters; Scale Space Techniques.
Overview of approximation theory and filtration
11. ISO/TS 16610-60. Geometrical product specifications (GPS) — Filtration Part 60:
Linear areal filters; Basic concepts.
12. X. Q. Jiang, L. A. Blunt and K. J. Stout. Development of a lifting wavelet representation for surface characterization, Proc. R. Soc. Lond. A 456 (2000), 2283-2313.
13. W. Sweldens. The lifting scheme: A construction of second generation wavelets, SIAM J. Math. Anal. 29 (1997), No. 2, 511-546.
14. X. Q. Jiang, L. A. Blunt and K. J. Stout. Application of the lifting wavelet to rough surfaces, Precision Engineering 25 (2001), 83-89.
15. J. Serra. Image Analysis and Mathematical Morphology Vol. 1, Academic Press, New York, 1982.
16. G. Matheron. Random Sets and Integral Geometry, John Wiley & Sons, New York, 1976.
17. P. J. Huber. Robust Statistics, John Wiley & Sons, New York, 1981.
Chapter 4
Radial Basis Functions
Applications of radial basis functions:
Sobolev-orthogonal functions, radial basis functions
and spectral methods
M.D. Buhmann
Mathematisches Institut, Justus-Liebig University, 35392 Giessen, Germany
buhmann@uni-giessen.de
A. Iserles
DAMTP, University of Cambridge, Silver Street, Cambridge, CB3 9EW, UK
ai@damtp.cam.ac.uk
S.P. Nørsett
Department of Mathematics, Norwegian University of Science and Technology, Trondheim, Norway
norsett@math.ntnu.no
Abstract
In this paper we consider an application of Sobolev-orthogonal functions and radial basis functions to the numerical solution of partial differential equations. We develop the fundamentals of a spectral method, present examples via reaction-diffusion partial differential equations, and discuss briefly some links with the theory of wavelets.
1 Introduction
Radial basis functions are a well-known and useful tool for functional approximation in one or more dimensions. The general form of the approximation is always a linear combination of a (finite or infinite) number of shifts of a single function, the radial basis function. In more than one dimension, this function is made rotationally invariant by composing a univariate function, usually called $\phi$, with the Euclidean norm. In one dimension such approximation usually simplifies to univariate polynomial splines. For a recent review of radial basis function approximations, see [5].
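As a concrete illustration of this general form (our own sketch, not an example from the paper; the nodes, data and shape parameter below are arbitrary choices), a univariate interpolant built from shifts of the multiquadric $\phi(r) = \sqrt{r^2 + c^2}$ can be computed by solving the collocation system:

```python
import math

def multiquadric(r, c=1.0):
    # phi(r) = sqrt(r^2 + c^2), the multiquadric radial basis function
    return math.sqrt(r * r + c * c)

def solve(A, b):
    # Gaussian elimination with partial pivoting (sufficient for this sketch)
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# arbitrary nodes and data for the illustration
nodes = [0.0, 0.3, 0.6, 1.0]
data = [1.0, 0.5, -0.2, 0.7]

# collocation matrix [phi(|x_i - x_j|)] and interpolation coefficients
A = [[multiquadric(abs(x - y)) for y in nodes] for x in nodes]
coeff = solve(A, data)

def interpolant(x):
    # linear combination of shifts of the radial basis function
    return sum(c * multiquadric(abs(x - y)) for c, y in zip(coeff, nodes))
```

The multiquadric collocation matrix is known to be nonsingular for distinct nodes, so the elimination succeeds and the interpolant reproduces the data.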
This note is about applications of radial basis functions and other approximation schemes, such as Sobolev-orthogonal polynomials and more general Sobolev-orthogonal functions, to the numerical solution of partial differential equations. The basic ideas stem from the theory of Sobolev-orthogonal polynomials ([13]), and in this paper a remarkable connection is developed between applications of Sobolev-orthogonality and radial basis functions (e.g. [5]); wavelets are mentioned as well (e.g. [8, 9]). Sobolev-orthogonal polynomials are a device to extend the standard theory of orthogonal polynomials (see, for instance, [12]) by requiring orthogonality with respect to non-selfadjoint inner products of the form
$$(f,g)_\lambda = \int_a^b f(x)\,g(x)\,dx + \lambda \int_a^b f'(x)\,g'(x)\,dx$$
for a positive parameter $\lambda$ and a suitable interval $(a,b)$, $a,b \in \mathbb{R}\cup\{\pm\infty\}$. The $dx$ in the two integrals is often replaced by more general Borel measures, $d\psi$, say. The scheme
which we want to discuss in this short article is one of spectral type: in lieu of, e.g., finite element spaces as underlying piecewise polynomial approximation spaces for the solution, we take purpose-built approximations which make the linear systems which we need to solve particularly simple, sometimes even diagonal.
Therefore, in the first instance, we develop a theory of applying Sobolev-orthogonal polynomial basis functions to the numerical solution of partial differential equations via a spectral method. Then we extend this idea to general classes of radial basis function-type methods, where shift-invariant approximation spaces are generated with Sobolev-orthogonal basis functions. Due to the introductory character of this paper, our discussion is restricted to relatively simple cases. Our presentation is illustrated with the one-dimensional reaction-diffusion partial differential equation.
This is the place to note that radial basis functions have found a number of other applications in the discretisation of PDEs. Thus, for example, Driscoll and Fornberg [10] have used fast-converging 'flat' multiquadrics in pseudospectral methods, while Frank and Reich [11] applied radial basis functions with particle methods in order to conserve enstrophy in the solution of certain shallow-water equations. Our application is of an altogether different nature.
1.1 Examples of PDEs and Sobolev-orthogonality
Consider the partial differential equation
$$\frac{\partial u}{\partial t} = \nabla\cdot\big(a\,\nabla u\big) + bu + c, \qquad (1.1)$$
where $u = u(\mathbf{x}, t)$ is of sufficient smoothness with respect to $\mathbf{x}$ and $t$, $\mathbf{x}$ is given in a cube $\mathcal{V} \subset \mathbb{R}^d$ (more generally, in a finite domain), $t \ge 0$, $a = a(\mathbf{x}) > 0$, $b = b(\mathbf{x})$ and $c = c(\mathbf{x})$. We impose zero Dirichlet boundary conditions. The stipulation of a cube as the domain and of zero Dirichlet conditions is unduly restrictive, but it will suffice for the short presentation in this paper and adequately illustrate the main novel concepts in our presentation. In the next section, we shall also introduce a nonlinearity into the underlying PDE.
We wish to approximate the solution $u(\mathbf x, t)$ as a finite linear combination of the generic form
$$u(\mathbf x, t) \approx \sum_{l=1}^m a_l(\mathbf x)\, w_l(t),$$
where $t$ is nonnegative and $\mathbf x$ resides in the domain. In the sequel we shall also use expansions into infinite series with $l \in \mathbb{Z}$. Thus, a Galerkin ansatz (in the usual $L^2$ inner product on $\mathbb{R}^d$, which we denote by $(\cdot,\cdot)$ in contrast to the specialised Sobolev inner
product $(\cdot,\cdot)_\lambda$ above) gives
$$\sum_{l=1}^m (a_l, a_k)\, w_l' = \sum_{l=1}^m \big(\nabla\cdot(a\nabla a_l), a_k\big)\, w_l + \sum_{l=1}^m (b\, a_l, a_k)\, w_l + (c, a_k), \qquad k = 1,2,\dots,m.$$
Integration by parts in the second term above and substitution of the requisite zero boundary conditions yield the alternative formulation
$$\sum_{l=1}^m (a_l, a_k)\, w_l' = -\sum_{l=1}^m (a\, \nabla a_l, \nabla a_k)\, w_l + \sum_{l=1}^m (b\, a_l, a_k)\, w_l + (c, a_k), \qquad k = 1,2,\dots,m. \qquad (1.2)$$
We solve the ODE system (1.2) with respect to $t$, for example with the backward Euler scheme (we use backward Euler for the sake of simplicity, but it should be noted that the same analysis applies to any implicit multistep method, because our use of Sobolev-orthogonality is only linked to the implicitness of the solution method)
$$w_l^{n+1} = w_l^n + \Delta t\, F_l(\mathbf w^{n+1}), \qquad n \in \mathbb{Z}^+, \quad l = 1,2,\dots,m, \qquad (1.3)$$
where the function $F_l$ is given implicitly by the equations (1.2) and where $\mathbf w^{n+1}$ in the expression above is the vector with components $w_l^{n+1}$, $l = 1,2,\dots,m$. Let us now multiply expression (1.3) by $(a_l, a_k)$ and sum up for $l = 1,2,\dots,m$. Then, exploiting (1.2), a little algebra yields
$$\sum_{l=1}^m \left\{ \int_{\mathcal V} \big[1 - \Delta t\, b(\mathbf x)\big]\, a_l(\mathbf x)\, a_k(\mathbf x)\,d\mathbf x + \Delta t \int_{\mathcal V} a(\mathbf x)\, \nabla^{\top} a_l(\mathbf x)\, \nabla a_k(\mathbf x)\,d\mathbf x \right\} w_l^{n+1}$$
$$= \sum_{l=1}^m \int_{\mathcal V} a_l(\mathbf x)\, a_k(\mathbf x)\,d\mathbf x\; w_l^n + \Delta t \int_{\mathcal V} c(\mathbf x)\, a_k(\mathbf x)\,d\mathbf x. \qquad (1.4)$$
The connection with Sobolev inner products is clear. Indeed, let us now choose the set $W_m := \{a_1, a_2, \dots, a_m\}$ as a set of functions that are orthogonal with respect to the homogeneous Sobolev $H^{1,2}_0$ inner product (see, e.g., [13])
$$(f,g)_{\Delta t} := \int_{\mathcal V} \big[1 - \Delta t\, b(\mathbf x)\big]\, f(\mathbf x)\, g(\mathbf x)\,d\mathbf x + \Delta t \int_{\mathcal V} a(\mathbf x)\, \nabla^\top f(\mathbf x)\, \nabla g(\mathbf x)\,d\mathbf x \qquad (1.5)$$
(this of course requires that $\Delta t\, b(\mathbf x) < 1$, hence may restrict in a minor way the choice of the time step $\Delta t$). Further below we shall also use infinite sets $W$ instead of the finite set $W_m$. It is important to note that in general the Sobolev inner product depends upon the step size. Subject to this formulation, the linear system (1.4) diagonalises and its numerical solution becomes trivial. We turn now to a more elaborate example in the next subsection, namely the reaction-diffusion equation.
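To see why such orthogonality trivialises each time step, consider a minimal special case (our illustration, not the paper's): for $a \equiv 1$, $b = c = 0$ on $(0,1)$ with zero Dirichlet conditions, the sine basis $a_l(x) = \sin(l\pi x)$ is orthogonal with respect to both integrals in the Sobolev inner product, so each backward Euler step reduces to a componentwise division:

```python
import math

# Heat equation u_t = u_xx on (0,1), u(0)=u(1)=0, spectral sine basis.
# With a_l(x) = sin(l*pi*x):  (a_l, a_k) = delta_{lk}/2  and
# (a_l', a_k') = delta_{lk} * (l*pi)^2 / 2, so the Sobolev Gram matrix
# is diagonal and the backward Euler system decouples into divisions.
m, dt, steps = 8, 1e-3, 100
w = [1.0] + [0.0] * (m - 1)          # initial data u(x,0) = sin(pi*x)
for _ in range(steps):
    w = [w_l / (1.0 + dt * (l * math.pi) ** 2)   # diagonal solve
         for l, w_l in enumerate(w, start=1)]

t = dt * steps
u_mid = sum(w_l * math.sin(l * math.pi * 0.5) for l, w_l in enumerate(w, 1))
exact = math.exp(-math.pi ** 2 * t) * math.sin(math.pi * 0.5)
```

After 100 steps the spectral backward Euler solution agrees with the exact decay rate of the leading mode to within the first-order time-stepping error.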
1.2 Reaction-diffusion as a paradigm for nonlinear PDEs
Let us consider the nonlinear partial differential equation
$$\frac{\partial u}{\partial t} = \nabla\cdot(a\nabla u) + f(u), \qquad (1.6)$$
where otherwise all the quantities are as in (1.1), including the boundary conditions. Suppose that an approximation $u^n$ to $u(\mathbf x, n\Delta t)$ is available at all the spatial grid points. We commence by interpolating $u^n$ to requisite precision by some function $v$. Thus, $v$ is defined throughout the cube $\mathcal V$ and coincides with $u^n$ at the grid points. This allows us to linearise the source function $f$ about $u^n$, the outcome being
$$\frac{\partial u}{\partial t} = \nabla\cdot(a\nabla u) + c + bu + g(u), \qquad (1.7)$$
where
$$b(\mathbf x) = f'(v(\mathbf x)), \qquad c(\mathbf x) = f(v(\mathbf x)) - f'(v(\mathbf x))\,v(\mathbf x),$$
$$g(\mathbf x, u) = f(u) - f(v(\mathbf x)) - f'(v(\mathbf x))\,[u - v(\mathbf x)].$$
Note that
$$g(\mathbf x, u) = O\big(|u - v|^2\big).$$
We can now solve the nonlinear system (1.7) by functional iteration, i.e. by letting as a start
$$w_l^{n+1,0} = w_l^n, \qquad l = 1,2,\dots,m,$$
and recurring, employing the inner product (1.5),
$$\sum_{l=1}^m (a_l, a_k)_{\Delta t}\, w_l^{n+1,j+1} = \sum_{l=1}^m (a_l, a_k)\, w_l^n + \Delta t\,(c, a_k) + \Delta t\left( g\Big(\cdot, \sum_{l=1}^m a_l\, w_l^{n+1,j}\Big),\, a_k \right), \qquad (1.8)$$
$k = 1,2,\dots,m$, for $j \in \mathbb{Z}^+$.
If, as in the previous subsection, we choose $W_m$ so as to diagonalise the linear system, each step of (1.8) becomes relatively cheap. Hence this approach might offer a realistic means to derive spectral approximations to nonlinear PDEs. Indeed, a special one-dimensional case can be treated straightforwardly, and it is presented in the sequel.
1.3 The one-dimensional case using polynomial splines
Let (1.1) be given in one space dimension and without source terms, whence it becomes the familiar diffusion equation with variable diffusion coefficient,
$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left( a\, \frac{\partial u}{\partial x} \right).$$
Thus, provided that $0 \le x \le 1$ and $t$ is nonnegative, we require the 'usual' Sobolev orthogonality [13] with respect to the inner product
$$(f,g)_{\Delta t} = (f,g)_\lambda = \int_0^1 f(x)\,g(x)\,d\varphi(x) + \int_0^1 f'(x)\,g'(x)\,d\psi(x),$$
where
$$\frac{d\varphi(x)}{dx} = 1 - \Delta t\, b, \qquad \frac{d\psi(x)}{dx} = \Delta t\, a.$$
We emphasise again the dependence of the Sobolev inner product on the step size. Taking the approach of the previous subsection as our point of departure, an obvious option is to use Sobolev-orthogonal polynomials. An alternative approach, which can be worked out explicitly and which we wish to demonstrate in this subsection, is to use univariate polynomial spline approximations. It has the advantage of being more amenable to a generalisation to several space dimensions.
We suppose that the unit interval $[0,1]$ is divided into $N$ intervals of length $h := 1/N$ and consider a piecewise-quadratic basis of continuous functions $s_1, s_2, \dots, s_N$ such that
$$s_l(x) = \begin{cases} \frac{1}{h}\,[x - (l-1)h] + \alpha_l\,(x - lh)[x - (l-1)h], & (l-1)h \le x \le lh,\\ \frac{1}{h}\,[(l+1)h - x] + \beta_l\,(x - lh)[x - (l+1)h], & lh \le x \le (l+1)h,\\ 0, & |x - lh| \ge h. \end{cases}$$
Clearly, $s_l$ is a continuous, $C[0,1]$ cardinal function of Lagrange interpolation at the knots (hence, a quadratic spline with double knots, cf. Powell [16], the added degree of freedom taken up by the requirement of Sobolev-orthogonality). Next, we need just to impose Sobolev orthogonality, and solve for the coefficients $\alpha_l$ and $\beta_l$. This is equivalent to the requirement that
$$(s_l, s_{l+1})_{\Delta t} = 0, \qquad l = 1,2,\dots,N-1.$$
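The cardinality of $s_l$ at the knots holds for every choice of $\alpha_l$ and $\beta_l$, which one can confirm directly (a small check of our own; the spacing and coefficients below are arbitrary):

```python
def s(x, l, h, alpha, beta):
    # the piecewise-quadratic basis function s_l defined above
    if (l - 1) * h <= x <= l * h:
        return (x - (l - 1) * h) / h + alpha * (x - l * h) * (x - (l - 1) * h)
    if l * h <= x <= (l + 1) * h:
        return ((l + 1) * h - x) / h + beta * (x - l * h) * (x - (l + 1) * h)
    return 0.0

# arbitrary spacing and free coefficients for the check
h, l, alpha, beta = 0.5, 2, 7.0, -2.0
```

Whatever values the free coefficients take, the quadratic correction terms vanish at the three knots, so $s_l(lh) = 1$ and $s_l((l\pm1)h) = 0$.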
In the special case $a(x) = 1$, $b(x) = c(x) = 0$, we have $\varphi(x) = x$, $\psi(x) = \Delta t\, x$, and $(s_l, s_{l+1})_{\Delta t}$ reduces to an integral over the overlap interval $[lh, (l+1)h]$; substituting $x = (l + \xi)h$, $0 \le \xi \le 1$, it becomes an explicit polynomial expression in $\alpha_{l+1}$, $\beta_l$, $h$ and $\Delta t$.
Let $\mu = \Delta t / h^2$ be the Courant number. Since we have two degrees of freedom for each $l$, and because each equation is otherwise independent of $l$, we may fix $\alpha = \alpha_l = \beta_l$. Then, letting $\hat\alpha := h^2\alpha$, requiring $(s_l, s_{l+1})_{\Delta t} = 0$ is equivalent to
$$5 - 5\hat\alpha + h^2\hat\alpha^2 + 10\mu\hat\alpha^2 - 30\mu = 0$$
or
$$(10\mu + h^2)\,\hat\alpha^2 - 5\hat\alpha + 5 - 30\mu = 0. \qquad (1.9)$$
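The quadratic in $\hat\alpha$ can be examined numerically (our own check, using the coefficients of (1.9) as read here): in particular, for $\mu = 1/6$ the constant term vanishes and $\hat\alpha = 0$ is a root, which is the chapeau-function case discussed below.

```python
import math

def alpha_roots(mu, h):
    # roots of (10*mu + h^2) * a^2 - 5*a + (5 - 30*mu) = 0, cf. (1.9)
    A, B, C = 10.0 * mu + h * h, -5.0, 5.0 - 30.0 * mu
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-B - s) / (2.0 * A), (-B + s) / (2.0 * A)]

# mu = 1/6 is the Courant number of the chapeau case
roots = alpha_roots(1.0 / 6.0, 0.01)
```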
We wish to solve this quadratic equation for $\hat\alpha$ for a suitable range of Courant numbers. Indeed, the equation (1.9) has two real solutions $\hat\alpha$ for every $\mu > 0$ if $h$ is small enough, since its discriminant is
$$25 - 4(10\mu + h^2)(5 - 30\mu) = (120\mu - 20)h^2 + 1200\mu^2 - 200\mu + 25.$$
In the case $\mu = 1/6$, $s_l$ reduces, upon the choice of $\hat\alpha = 0$, to a chapeau function. Otherwise we obtain $\hat\alpha = O(1)$. We may give up the small support characteristic of spline functions (which, anyway, is of marginal importance, since we do not solve linear systems!). This is a case discussed in the next section. Another obvious alternative is to construct an orthogonal basis from chapeau functions. This, however, is easily seen to be identical to the LU factorization of the standard FEM matrix
$$\begin{pmatrix} \frac{2}{3} & \frac{1}{6} & 0 & 0 & \cdots \\ \frac{1}{6} & \frac{2}{3} & \frac{1}{6} & 0 & \cdots \\ 0 & \frac{1}{6} & \frac{2}{3} & \frac{1}{6} & \\ \vdots & & \ddots & \ddots & \ddots \end{pmatrix}.$$
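A sketch (ours) of that LU factorisation, exploiting the tridiagonal structure so that only the pivots and the subdiagonal multipliers need be stored:

```python
# LU factorisation (Doolittle, no pivoting) of the tridiagonal chapeau
# mass matrix tridiag(1/6, 2/3, 1/6); u holds the pivots and l the
# subdiagonal elimination multipliers.
def lu_tridiag(n, off=1.0 / 6.0, diag=2.0 / 3.0):
    u, l = [diag], []
    for _ in range(n - 1):
        l.append(off / u[-1])         # elimination multiplier
        u.append(diag - l[-1] * off)  # updated pivot
    return l, u

l, u = lu_tridiag(50)
```

The pivots converge rapidly to $\frac13 + \frac{\sqrt 3}{6} \approx 0.622$, so away from the boundary the factorisation is effectively stationary.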
2 Applications of radial basis functions and wavelets
2.1 Sobolev-orthogonal translates of a radial basis function
In this section, we wish to develop a more general approach employing the concepts of wavelets and radial basis functions, and employ shift-invariant spaces of approximants for our spectral methods. We begin by giving up the compactness of the domain $\mathcal V$ and work on the entire real line instead. For this, we shall demonstrate the use of Sobolev inner products and shift-invariant spaces, and concentrate solely on this part of the analysis in the present article. So, in particular, the set $W$ above is of the form $\{\phi(\cdot - nh) \mid n \in \mathbb{Z}\}$. In the sequel we shall add several remarks about how to find compactly supported $\phi$ that allow the treatment of partial differential equations on compact domains. We remark that $n$ is no longer used for the time steps in the differential equation solver but for the shifts of the radial functions.
To start with, we wish to find a function $\phi \in H^1(\mathbb{R})$, where $H^1(\mathbb{R})$ is a non-homogeneous Sobolev space, such that for a positive constant $\lambda$ and positive spacing $h$ it is true that
$$\int_{-\infty}^{\infty} \phi(x)\,\phi(x - hn)\,dx + \lambda \int_{-\infty}^{\infty} \phi'(x)\,\phi'(x - hn)\,dx = \delta_{0n}, \qquad n \in \mathbb{Z}. \qquad (2.1)$$
We multiply both the left- and right-hand side of the general pattern (2.1) by $\exp(i\theta n)$ and sum over $n \in \mathbb{Z}$,
$$\sum_{n=-\infty}^{\infty} e^{i\theta n} \left\{ \int_{-\infty}^{\infty} \phi(x)\,\phi(x - hn)\,dx + \lambda \int_{-\infty}^{\infty} \phi'(x)\,\phi'(x - hn)\,dx \right\} = 1, \qquad \theta \in [-\pi, \pi]. \qquad (2.2)$$
In order to be able to exchange summation and integration and apply the Poisson summation formula (Stein and Weiss [17], p. 252), we make a number of assumptions. The version of the Poisson summation formula that we wish to use states that for a univariate function $f$ with
$$|f(x)| = O\big((1 + |x|)^{-1-\varepsilon}\big), \qquad |\hat f(x)| = O\big((1 + |x|)^{-1-\varepsilon}\big)$$
and positive $\varepsilon$, the following equality holds (note that the first bound in the above implies existence and continuity of the one-dimensional Fourier transform):
$$\sum_{j=-\infty}^{\infty} f(j) = \sum_{j=-\infty}^{\infty} \hat f(2\pi j).$$
Specifically, we assume that the following three decay estimates hold:
$$|\phi(x)| \le c\,(1 + |x|)^{-1-\varepsilon}, \qquad |\phi'(x)| \le c\,(1 + |x|)^{-1-\varepsilon}, \qquad |\hat\phi(x)| \le c\,(1 + |x|)^{-3-\varepsilon},$$
where $c$ is a generic positive constant, $\varepsilon > 0$, $\hat\phi$ denotes the Fourier transform, and we demand the faster rate of decay in the last display because we shall later require summability of translates of the Fourier transform multiplied by the square of its argument. Note in particular that the first decay condition renders the Fourier transform $\hat\phi$ continuous and well defined.
An example of a function $\phi$ that satisfies the three decay conditions above is the second divided difference of the multiquadric radial basis function [4] $\sqrt{x^2 + c^2}$, that is
$$\phi(x) = \tfrac12\sqrt{(x-1)^2 + c^2} - \sqrt{x^2 + c^2} + \tfrac12\sqrt{(x+1)^2 + c^2}.$$
Here, $c$ is a positive constant parameter. The above function decays cubically [4] and its Fourier transform even decays exponentially, due to the exponential decay of the modified Bessel function $K_1$ [1] that features in the generalised Fourier transform of the multiquadric, which in the one-dimensional case is a constant multiple of $|\xi|^{-1} K_1(c|\xi|)$ (cf. Jones [14]).
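The cubic decay can be confirmed numerically (our own sanity check): expanding $\sqrt{x^2 + c^2}$ for large $|x|$ suggests $\phi(x) \approx c^2/(2x^3)$.

```python
import math

def phi(x, c=3.0):
    # second divided difference (unit spacing) of the multiquadric
    f = lambda t: math.sqrt(t * t + c * c)
    return 0.5 * f(x - 1.0) - f(x) + 0.5 * f(x + 1.0)

# large-x behaviour: phi(x) ~ c^2 / (2 x^3), i.e. cubic decay
x, c = 100.0, 3.0
ratio = phi(x, c) * x ** 3 / (c * c / 2.0)
```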
Once summation and integration are interchanged, (2.2) becomes
$$\int_{-\infty}^{\infty} \phi(x) \sum_{n=-\infty}^{\infty} e^{i\theta n}\,\phi(x - hn)\,dx + \lambda \int_{-\infty}^{\infty} \phi'(x) \sum_{n=-\infty}^{\infty} e^{i\theta n}\,\phi'(x - hn)\,dx = 1, \qquad \theta \in [-\pi,\pi], \qquad (2.3)$$
or, applying the Poisson summation formula (Stein and Weiss [17], p. 252),
$$\int_{-\infty}^{\infty} \phi(x) \sum_{n=-\infty}^{\infty} e^{ih^{-1}x(\theta + 2\pi n)}\,\hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\,dx$$
$$\qquad + i\lambda h^{-1} \int_{-\infty}^{\infty} \phi'(x) \sum_{n=-\infty}^{\infty} e^{ih^{-1}x(\theta + 2\pi n)}\,(\theta + 2\pi n)\,\hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\,dx = h, \qquad (2.4)$$
where $\theta \in [-\pi, \pi]$. Because $\phi$ vanishes at infinity, integration by parts of the second term of (2.4) gives
$$\int_{-\infty}^{\infty} \phi(x) \sum_{n=-\infty}^{\infty} e^{ih^{-1}x(\theta + 2\pi n)}\,\hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\,dx$$
$$\qquad + \frac{\lambda}{h^2} \int_{-\infty}^{\infty} \phi(x) \sum_{n=-\infty}^{\infty} e^{ih^{-1}x(\theta + 2\pi n)}\,(\theta + 2\pi n)^2\,\hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\,dx$$
$$= \sum_{n=-\infty}^{\infty} \hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\,\hat\phi\big({-h^{-1}}(\theta + 2\pi n)\big)\,\Big(1 + \lambda h^{-2}(\theta + 2\pi n)^2\Big) = h.$$
Since $\phi$ is real, $\hat\phi(-\xi) = \overline{\hat\phi(\xi)}$, and this implies
$$\sum_{n=-\infty}^{\infty} \big|\hat\phi\big(h^{-1}(\theta + 2\pi n)\big)\big|^2\,\Big(1 + \lambda h^{-2}(\theta + 2\pi n)^2\Big) = h, \qquad \theta \in [-\pi,\pi]. \qquad (2.5)$$
This is our condition that leads to the required Sobolev-orthogonality. In summary, we have established the following theorem.

Theorem 2.1 If the decay conditions on $\phi$, as stated above, hold in tandem with the expression (2.5), then the required orthogonality condition (2.1) is satisfied.
We note that, if we are given a $\psi$ such that
$$\sum_{n=-\infty}^{\infty} \big|\hat\psi\big(h^{-1}(\theta + 2\pi n)\big)\big|^2 = h, \qquad \theta \in [-\pi,\pi], \qquad (2.6)$$
then
$$\hat\phi(\xi) := \frac{\hat\psi(\xi)}{\sqrt{1 + \lambda\xi^2}} \qquad (2.7)$$
satisfies (2.5). This expression can be used to derive an explicit transformation which takes a $\psi$ that satisfies (2.6) into a $\phi$ satisfying (2.5), although its practical computation may be nontrivial. Indeed, by the Parseval-Plancherel theorem [17], we get the useful identity
$$\phi(x) = \frac{1}{\pi\sqrt\lambda} \int_{-\infty}^{\infty} \psi(x - y)\, K_0\!\left(\frac{|y|}{\sqrt\lambda}\right) dy, \qquad (2.8)$$
which is a convolution and whose Fourier transform is therefore (2.7) (cf., for instance, Jones [14]). In (2.8), $K_0$ is the 0th modified Bessel function (Abramowitz and Stegun [1]), which is positive on the positive reals and satisfies $K_0(t) \sim -\log t$ near zero and $K_0(t) \sim \sqrt{\pi/(2t)}\,e^{-t}$ for large $t$, similar to the asymptotics we have used before for the $K_1$ modified Bessel function. Hence, by a lemma in [7], see also Light and Cheney [15], $\phi$ decays algebraically of a certain order if $\psi$ does. Moreover, because $1/\sqrt{1 + \lambda x^2}$ is positive, integer translates of $\phi$ are dense in $L^2$, say, provided that this is the case with integer translates of $\psi$ [18].
In some trivial cases we may evaluate the integral (2.8) explicitly, for instance for $\psi(x) = \cos x$, where the integral is again a constant multiple of the cosine function (Abramowitz and Stegun [1]). Otherwise, the smoothness and fast exponential decay of the modified Bessel function can be used together with a quadrature formula.
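A numerical sanity check (ours) of the Fourier-transform fact underlying (2.7) and (2.8): with $K_0$ evaluated through its integral representation $K_0(t) = \int_0^\infty e^{-t\cosh u}\,du$, the cosine transform $(2/\pi)\int_0^\infty K_0(t)\cos(\omega t)\,dt$ should come out as $1/\sqrt{1+\omega^2}$.

```python
import math

def k0(t, h=0.02, umax=10.0):
    # modified Bessel K_0 via its integral representation
    # K_0(t) = \int_0^infty exp(-t cosh u) du  (midpoint rule)
    n = int(umax / h)
    return h * sum(math.exp(-t * math.cosh((j + 0.5) * h)) for j in range(n))

def cosine_transform(omega, h=0.02, tmax=25.0):
    # (2/pi) * \int_0^infty K_0(t) cos(omega t) dt, midpoint rule;
    # the integrable log singularity of K_0 at t = 0 limits the accuracy
    n = int(tmax / h)
    s = sum(k0((j + 0.5) * h) * math.cos(omega * (j + 0.5) * h) for j in range(n))
    return 2.0 / math.pi * h * s
```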
We may now use the translates of such Sobolev-orthogonal functions in the spectral approximation of a PDE as above, letting $W := \{\phi(\cdot - nh) \mid n \in \mathbb{Z}\}$.
An example of a function $\psi$ that satisfies (2.6) is obtained by letting $\hat\psi$ be $\sqrt h$ times the characteristic function of the interval $[-\pi/h, \pi/h]$. In that case, $|\psi(x)|$ decays like $1/|x|$. In fact, any $\psi$ that satisfies $|\hat\psi(\xi)| \le c\,(1 + |\xi|)^{-1/2-\varepsilon}$ for positive $\varepsilon$ can be made to satisfy (2.6) by subjecting it to the transformation
$$\hat\psi(\xi) \longmapsto \tilde\psi(\xi) := \frac{\sqrt h\;\hat\psi(\xi)}{\Big(\displaystyle\sum_{n=-\infty}^{\infty} \big|\hat\psi(\xi + 2\pi h^{-1} n)\big|^2\Big)^{1/2}}, \qquad (2.9)$$
see, for instance, Battle [2]. If $\psi$ is compactly supported, then the transformed $\tilde\psi$ will not necessarily be compactly supported, but it decays exponentially [6].
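As an illustration of the transformation (2.9) (our own check, with $h = 1$), take $\psi$ to be the chapeau spline, whose Fourier transform is $(\sin(\xi/2)/(\xi/2))^2$; the periodised squared transform then has the closed form $(2 + \cos\theta)/3$, and the transformed function has translates satisfying (2.6):

```python
import math

def psi_hat(xi):
    # Fourier transform of the chapeau (linear B-)spline
    if abs(xi) < 1e-12:
        return 1.0
    s = math.sin(xi / 2.0) / (xi / 2.0)
    return s * s

def periodised(theta, terms=200):
    # sum_n |psi_hat(theta + 2 pi n)|^2, the denominator of (2.9) for h = 1
    return sum(psi_hat(theta + 2.0 * math.pi * n) ** 2
               for n in range(-terms, terms + 1))

def psi_tilde_hat(xi):
    # orthonormalised transform: translates of psi~ satisfy (2.6) with h = 1
    return psi_hat(xi) / math.sqrt(periodised(xi))

theta = 1.3  # arbitrary test angle
closed_form = (2.0 + math.cos(theta)) / 3.0
orthonormal_check = sum(psi_tilde_hat(theta + 2.0 * math.pi * n) ** 2
                        for n in range(-50, 51))
```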
In order to find a class of examples of compactly supported $\psi$ that satisfy (2.6), see Daubechies [8] for her compactly supported scaling functions $\psi$, which are fundamental for the construction of Daubechies wavelets. For example, the following conditions are sufficient for $\psi$, defined by its Fourier transform
$$\hat\psi(\xi) = \prod_{j=1}^{\infty} h\big(2^{-j}\xi\big),$$
to satisfy (2.6) for $h = 1$ (other $h$ can be used by scaling): for some suitable coefficients $h_k$, the trigonometric polynomial
$$h(\theta) = \sum_{k=0}^{2N-1} h_k\, e^{ik\theta}$$
has to satisfy $h(0) = 1$, $h(\pi) = 0$, and
$$|h(\theta)|^2 + |h(\theta + \pi)|^2 = 1.$$
For the construction of such $h$, see [8]. Compactly supported basis functions are important to approximate the numerical solution of a PDE, as in the above example, defined on a compact $\mathcal V$. Moreover, any $\psi$ with the aforementioned decay property can be made to satisfy (2.5) by the transformation
$$\hat\psi(\xi) \longmapsto \frac{\sqrt h\;\hat\psi(\xi)}{\left(\displaystyle\sum_{n=-\infty}^{\infty} \big|\hat\psi(\xi + 2\pi h^{-1} n)\big|^2\,\Big(1 + \lambda\big(\xi + 2\pi h^{-1} n\big)^2\Big)\right)^{1/2}}. \qquad (2.10)$$
They can also be found by first applying the transformation (2.9) and then the transformation (2.7).
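The filter conditions stated above for Daubechies' construction can be verified numerically; the sketch below uses the standard $N = 2$ (D4) coefficients, quoted from the wavelet literature rather than from this paper, normalised so that $h(0) = 1$:

```python
import cmath, math

s3 = math.sqrt(3.0)
# Daubechies N = 2 (D4) filter coefficients, normalised to sum to 1
hk = [(1 + s3) / 8, (3 + s3) / 8, (3 - s3) / 8, (1 - s3) / 8]

def h(theta):
    # trigonometric polynomial h(theta) = sum_k h_k e^{ik theta}
    return sum(c * cmath.exp(1j * k * theta) for k, c in enumerate(hk))
```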
We note, finally, that when $\psi$ is a B-spline, for instance, its translates are dense in $L^2$ if we allow $h$ to become arbitrarily small (see, for instance, Powell [16] and the last section of this paper).
2.2 Sobolev-orthogonal translates of a function in higher dimensions

Applying the approach of the previous subsection to the Sobolev inner product
$$\int_{\mathbb{R}^d} f(\mathbf x)\, g(\mathbf x)\,d\mathbf x + \lambda \int_{\mathbb{R}^d} \nabla^\top f(\mathbf x)\, \nabla g(\mathbf x)\,d\mathbf x,$$
the outcome is the orthogonality condition
$$\sum_{\mathbf n \in \mathbb{Z}^d} \big|\hat\phi\big(h^{-1}(\theta + 2\pi \mathbf n)\big)\big|^2\, \Big(1 + \lambda h^{-2}\,\|\theta + 2\pi \mathbf n\|^2\Big) = h^d, \qquad \theta \in [-\pi,\pi]^d, \qquad (2.11)$$
which replaces (2.5). We are now also interested in the more general case of Sobolev-type inner products
$$\int f(\mathbf x)\, g(\mathbf x)\, \mu(\mathbf x)\,d\mathbf x + \lambda \int \nabla^\top f(\mathbf x)\, \nabla g(\mathbf x)\, \nu(\mathbf x)\,d\mathbf x,$$
where the weights $\mu$ and $\nu$ are positive. Here the orthogonality condition becomes more complicated. Specifically, it is
$$\sum_{\mathbf n \in \mathbb{Z}^d} \hat\phi_\mu\big(h^{-1}(\theta + 2\pi \mathbf n)\big)\, \overline{\hat\phi_\mu\big(h^{-1}(\theta + 2\pi \mathbf n)\big)} + \lambda h^{-2}\, \hat\phi_\nu\big(h^{-1}(\theta + 2\pi \mathbf n)\big)\, \overline{\hat\phi_\nu\big(h^{-1}(\theta + 2\pi \mathbf n)\big)} = h^d, \qquad \theta \in [-\pi,\pi]^d,$$
where
$$\hat\phi_\mu := \hat\phi * \widehat{\sqrt\mu}, \qquad \hat\phi_\nu := \big(\|\cdot\| \times \hat\phi\big) * \widehat{\sqrt\nu},$$
and $*$ denotes continuous convolution, used as in (2.8), where $\psi$ is convolved with a modified Bessel function.
2.3 Error estimates

We can offer error estimates for the Sobolev-orthogonal bases, firstly, in the case when $\phi$ is a univariate spline of fixed degree $m$, say, with knots on $h\mathbb{Z}$, and, secondly, in the
case when $\phi$ is a linear combination of translates of the radial Gauss kernel
$$e^{-x^2/\gamma}, \qquad x \in \mathbb{R},$$
along $h\mathbb{Z}$. In the former case it is known that the uniform approximation error to a sufficiently smooth function from the linear space spanned by $\phi(\cdot - nh)$, $n \in \mathbb{Z}$, is at most a constant multiple of $h^{m+1}$ ([16]). We have already mentioned that we require $\lambda = O(h^2)$; therefore it can be deduced by twofold integration by parts that the Sobolev error is indeed $O(h^{m+1})$. This can be generalized in a straightforward way to higher dimensions by tensor-product B-splines.
Our $L^2(\mathbb{R})$ error estimates can be carried out as follows. Let $f$ be a band-limited function, that is, one with a compactly supported Fourier transform, which satisfies such assumptions that imply that the best least-squares approximation using a Sobolev inner product
$$s_h(x) = \sum_{n=-\infty}^{\infty} \big(f, \phi(\cdot - hn)\big)_{\lambda,h}\; \phi(x - nh), \qquad x \in \mathbb{R}, \qquad (2.12)$$
is well defined. For instance, we may require that $(f,f)_{\lambda,h} < \infty$, as well as sufficient decay of the radial basis function $\phi$, i.e.
$$|\phi(r)| \le c\,(1 + |r|)^{-1-\varepsilon}, \qquad |\phi'(r)| \le c\,(1 + |r|)^{-1-\varepsilon}, \qquad |\hat\phi(r)| \le c\,(1 + |r|)^{-3-\varepsilon}$$
for a positive $\varepsilon$. Here $(\cdot,\cdot)_{\lambda,h}$ is the Sobolev inner product which we study in this note, and it is helpful to emphasise its dependence on $h$ in the subscript. We begin with the piecewise polynomial, i.e. spline, case. Hence, let $\phi$ be from the space of splines of degree $m$ with knots on $h\mathbb{Z}$ such that its translates are Sobolev-orthogonal.
Theorem 2.2 Subject to the assumptions of the last paragraph, we have the error estimate
$$\|s_h - f\|_2 = O(h^{m+1}), \qquad h \to 0. \qquad (2.13)$$
Proof: We shall establish in the course of this proof an error estimate for the first derivative of the error function in (2.13), so that an order of convergence can also be concluded for the norm associated with our Sobolev inner product. Indeed, because the Fourier transform is an $L^2(\mathbb{R})$ isometry (up to a constant factor), we may prove (2.13) by considering
$$\hat s_h - \hat f \qquad (2.14)$$
instead of the left-hand side of (2.13). The Fourier transform of (2.12) is
$$\hat s_h(\theta) = \sum_{n=-\infty}^{\infty} \big(f, \phi(\cdot - hn)\big)_{\lambda,h}\; e^{-i\theta hn}\,\hat\phi(\theta).$$
The absolute convergence of the above is guaranteed by the decay conditions on $\phi$.
Hence the square of (2.14) is, by the Parseval-Plancherel formula and periodisation of the integrand with respect to $\theta$,
$$\int_{-\infty}^{\infty} \Big| \hat f(\theta) - \sum_{n=-\infty}^{\infty} \big(f, \phi(\cdot - hn)\big)_{\lambda,h}\, e^{-i\theta hn}\,\hat\phi(\theta) \Big|^2\, d\theta$$
$$= \int_{-\infty}^{\infty} \Big| \hat f(\theta) - \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} \hat f(\xi)\,\overline{\hat\phi(\xi)}\, e^{i\xi hn}\,(1 + \lambda\xi^2)\,d\xi\; e^{-i\theta hn}\,\hat\phi(\theta) \Big|^2\, d\theta$$
$$= \int_{-\pi/h}^{\pi/h} \sum_{k=-\infty}^{\infty} \Big| \hat f(\theta + 2\pi k/h) - \hat\phi(\theta + 2\pi k/h) \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} \hat f(\xi)\,\overline{\hat\phi(\xi)}\, e^{i\xi hn}\,(1 + \lambda\xi^2)\,d\xi\; e^{-i\theta hn} \Big|^2\, d\theta. \qquad (2.15)$$
The $(1 + \lambda\xi^2)$ term in the above comes from the derivative in the Sobolev inner product and the Fourier transform. Because $f$ is band-limited, for small enough $h$ (2.15) assumes the form
$$\int_{-\pi/h}^{\pi/h} \sum_{k=-\infty}^{\infty} \Big| \hat f(\theta)\,\delta_{0k} - \hat\phi(\theta + 2\pi k/h) \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} \hat f(\xi)\,\overline{\hat\phi(\xi)}\, e^{i\xi hn}\,(1 + \lambda\xi^2)\,d\xi\; e^{-i\theta hn} \Big|^2\, d\theta. \qquad (2.16)$$
Using again the band-limitedness of $f$, together with the Poisson summation formula, (2.16) can be brought into the form
$$\int_{-\pi/h}^{\pi/h} \sum_{k=-\infty}^{\infty} \Big| \hat f(\theta)\,\delta_{0k} - \hat\phi(\theta + 2\pi k/h)\; h \sum_{n=-\infty}^{\infty} \hat f(\theta + 2\pi n/h)\,\overline{\hat\phi(\theta + 2\pi n/h)}\,\big(1 + \lambda(\theta + 2\pi n/h)^2\big) \Big|^2\, d\theta. \qquad (2.17)$$
In the case when $\phi$ is in the aforementioned spline space, it can be expressed as the inverse Fourier transform of
$$\hat\phi(\xi) = \frac{\sqrt h\;\hat r(\xi)}{\left(\displaystyle\sum_{n=-\infty}^{\infty} \big|\hat r(\xi + h^{-1}2\pi n)\big|^2\,\big(1 + \lambda(\xi + h^{-1}2\pi n)^2\big)\right)^{1/2}}, \qquad \xi \in \mathbb{R}, \qquad (2.18)$$
where $\hat r(\xi) = \xi^{-m-1}$. This follows from (2.5) and from the fact that all splines from our space are linear combinations of integer translates of $r(x) := [x]_+^m$, whose generalised Fourier transform is a multiple of $\xi^{-m-1}$ [14]. Since any constant factors in front of the function $\xi^{-m-1}$ in $\hat r$ cancel in the expression for $\hat\phi$ above, we have ignored them
straightaway. Substituting (2.18) into (2.17), we get the integral over $[-\pi/h, \pi/h]$ of
$$\sum_{k=-\infty}^{\infty} \left| \hat f(\theta)\,\delta_{0k} - \frac{\hat r(\theta + h^{-1}2\pi k)\,\overline{\hat r(\theta)}\;\hat f(\theta)\,(1 + \lambda\theta^2)}{\displaystyle\sum_{n=-\infty}^{\infty} \big|\hat r(\theta + h^{-1}2\pi n)\big|^2\,\big(1 + \lambda(\theta + h^{-1}2\pi n)^2\big)} \right|^2. \qquad (2.19)$$
Considering (2.19) for each $k$ separately, it follows from (2.19) and from $\hat r(\xi) = \xi^{-m-1}$ that our claim is true. Indeed, for the sum over all terms with $k \neq 0$, it is evident that we obtain a factor of $h^{2m+2}$ from the numerator, because the denominator is periodic, containing one term independent of $h$, and the nonvanishing expression $h^{-1}2\pi k$ in the argument of $\hat r(\theta + h^{-1}2\pi k)$ guarantees $\hat r(\theta + h^{-1}2\pi k) \sim h^{m+1}$, due to $\hat r(\xi) = \xi^{-m-1}$. Of course, the squares then taken provide the $h^{2m+2}$ instead of $h^{m+1}$.
On the other hand, for $k = 0$, we have for small enough $h$
$$\hat f(\theta) - \frac{|\hat r(\theta)|^2\,\hat f(\theta)\,(1 + \lambda\theta^2)}{\displaystyle\sum_{n=-\infty}^{\infty} \big|\hat r(\theta + h^{-1}2\pi n)\big|^2\,\big(1 + \lambda(\theta + h^{-1}2\pi n)^2\big)} = \hat f(\theta)\, \frac{\displaystyle\sum_{n \neq 0} \big|\hat r(\theta + h^{-1}2\pi n)\big|^2\,\big(1 + \lambda(\theta + h^{-1}2\pi n)^2\big)}{\displaystyle |\hat r(\theta)|^2 (1 + \lambda\theta^2) + \sum_{n \neq 0} \big|\hat r(\theta + h^{-1}2\pi n)\big|^2\,\big(1 + \lambda(\theta + h^{-1}2\pi n)^2\big)},$$
which is $O(h^{2m+2})$, as required, because the numerator provides an $O(h^{2m+2})$, according to the rate of decay of $\hat r$ and the power of $h$ in its argument. This is then squared to provide $O(h^{4m+4}) = o(h^{2m+2})$.
As for the derivatives, one only has to multiply the Fourier transform of the error function in (2.14) with $\theta$, and we get the same error estimate by multiplying the integrands in all the preceding integrals with $|\theta|^2$. □
The same analysis remains valid when considering integer translates of the Gauss kernel $e^{-x^2/\gamma}$ in order to form $\phi$. In this case we make use of the fact that the Gauss kernel has a Fourier transform which is a multiple of $e^{-\gamma\xi^2/4}$. We put this instead of $\hat r$ into (2.19), and we then get arbitrarily high orders of convergence from (2.14) as long as we take $\gamma = O(h)$; see also [3]. For this choice $\phi$ is exponentially decaying, whereas for splines of degree $m$ we merely get algebraic decay at infinity of order $-m-1$.
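The Fourier-transform fact for the Gauss kernel is easy to confirm by quadrature (our own check; with the convention $\hat g(\xi) = \int g(x)\,e^{-i\xi x}\,dx$, the transform of $e^{-x^2/\gamma}$ is $\sqrt{\pi\gamma}\,e^{-\gamma\xi^2/4}$):

```python
import math

def gauss_ft(xi, gamma=0.5, h=0.001, L=10.0):
    # Riemann-sum cosine transform of exp(-x^2/gamma) over [-L, L];
    # the integrand is even, so the sine part vanishes
    n = int(2 * L / h)
    return h * sum(math.exp(-((-L + j * h) ** 2) / gamma) *
                   math.cos(xi * (-L + j * h)) for j in range(n + 1))

xi, gamma = 2.0, 0.5
numeric = gauss_ft(xi, gamma)
exact = math.sqrt(math.pi * gamma) * math.exp(-gamma * xi * xi / 4.0)
```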
Bibliography
1. Abramowitz, M. and I.A. Stegun (1970) Handbook of Mathematical Functions, Dover Publications.
2. Battle, G. (1987) "A block-spin construction of ondelettes. Part I: Lemarié functions", Comm. Math. Phys.
3. Beatson, R.K. and W.A. Light (1992) "Quasi-interpolation in the absence of polynomial reproduction", in Numerical Methods of Approximation Theory, D. Braess and L.L. Schumaker (eds.), Birkhäuser-Verlag, Basel, 21-39.
4. Buhmann, M.D. (1988) "Convergence of univariate quasi-interpolation using multiquadrics", IMA Journal of Numerical Analysis 8, 365-384.
5. Buhmann, M.D. (2000) "Radial basis functions", Acta Numerica 9, 1-38.
6. Chui, C.K. (1992) An Introduction to Wavelets, Academic Press, New York.
7. Buhmann, M.D. and N. Dyn (1993) "Spectral convergence of multiquadric interpolation", Proc. Edinburgh Math. Soc. 36, 319-333.
8. Daubechies, I. (1988) "Orthonormal bases of compactly supported wavelets", Comm. Pure Appl. Math. 41, 909-996.
9. DeVore, R.A. and B. Lucier (1992) "Wavelets", Acta Numerica 1, 1-55.
10. Driscoll, T.A. and Fornberg, B. (2001) "Interpolation in the limit of increasingly flat radial basis functions", to appear in Computers & Mathematics with Applications.
11. Frank, J. and Reich, S. (2001) "A particle-mesh method for the shallow water equations near geostrophic balance", Tech. Rep., Imperial College, London.
12. Gautschi, W. (1996) "Orthogonal polynomials: applications and computation", Acta Numerica 5, 45-119.
13. Iserles, A., P.E. Koch, S.P. Nørsett and J.M. Sanz-Serna (1991) "On polynomials orthogonal with respect to certain Sobolev inner products", J. Approx. Th. 65, 151-175.
14. Jones, D.S. (1982) The Theory of Generalised Functions, Cambridge University Press, Cambridge.
15. Light, W.A. and E.W. Cheney (1992) "Quasi-interpolation with translates of a function having non-compact support", Constr. Approx. 8, 35-48.
16. Powell, M.J.D. (1981) Approximation Theory and Methods, Cambridge University Press, Cambridge.
17. Stein, E.M. and G. Weiss (1971) Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, Princeton.
18. Wiener, N. (1933) The Fourier Integral and Certain of its Applications, Cambridge University Press, Cambridge.
Approximation with the radial basis functions of Lewitt
J. J. Green
Dept. Applied Mathematics, University of Sheffield, UK.
j.j.green@sheffield.ac.uk
Abstract
R. M. Lewitt has introduced a family of compactly supported radial basis functions which are particularly useful in discretising, for inversion, ill-posed problems involving line integrals. We consider some practical aspects of their use and implementation, compare square and triangular grids of the functions in two dimensions, and describe some particularly favourable choices of the defining parameters.
1 Introduction
In the article [5], R. M. Lewitt introduced a family of window functions
$$\psi(r) = \begin{cases} \big(1 - (r/a)^2\big)^{m/2}\; I_m\big(\alpha\,(1 - (r/a)^2)^{1/2}\big)\,\big/\, I_m(\alpha), & 0 \le r \le a,\\ 0, & r > a, \end{cases} \qquad (1.1)$$
where $I_m$ is the modified Bessel function of order $m$ (see Ch. III, 3.7 [13]). The implicit dependence of $\psi$ on the parameters $a > 0$, $\alpha > 0$ and $m \in \mathbb{N}$ is discussed below. Lewitt's motivation for studying these functions is the use of translates of the radially symmetric function
$$\Psi(\mathbf x) = \psi(\|\mathbf x\|) \qquad (\mathbf x \in \mathbb{R}^d)$$
(see Figure 1) as a basis for the discretisation of tomographic problems [8, 9]. Such a basis overcomes a number of difficulties associated with the usual, pixel-based, representation in problems involving the recovery of a function from a set of line, curve or strip integrals across its domain, while retaining the advantage of a sparse discretisation. The author's interest in these functions arises from their application to a Radon-like problem in the remote sensing of ocean waves [15], a detailed exposition of which may be found in [3].
2 Discretising x-ray problems
The discretisation of an x-ray transform inversion problem with Lewitt's basis is straightforward. Given a set of centres $\mathbf x_i \in \mathbb{R}^d$, one represents the (unknown) function $f$ as a linear combination of the translates of $\Psi$,
$$f(\mathbf x) = \sum_i \lambda_i\, \Psi(\mathbf x - \mathbf x_i) \qquad (\mathbf x \in \mathbb{R}^d). \qquad (2.1)$$
FIG. 1. Lewitt's radial basis function in dimension 2 with m = 2, α = 3.
The given data in such problems are the values $I_j$ of integrals of $f$ over lines (or, more generally, submanifolds) $L_j$:
$$I_j = \int_{L_j} f = \sum_i \lambda_i \int_{L_j} \Psi(\cdot - \mathbf x_i). \qquad (2.2)$$
The latter integral in (2.2) is the projection or Abel transform of $\Psi$, which can be calculated explicitly in the linear case. For a line $L_j$ whose closest point to $\mathbf x_i$ is at a distance $s$ from it, and with the dependence of $\psi$ on $m$ here made explicit,
$$2\int_0^{\sqrt{a^2 - s^2}} \psi_m\big(\sqrt{s^2 + t^2}\big)\,dt = \frac{a}{I_m(\alpha)}\,\sqrt{\frac{2\pi}{\alpha}}\;\Big(\sqrt{1 - (s/a)^2}\Big)^{m + 1/2}\, I_{m+1/2}\big(\alpha\sqrt{1 - (s/a)^2}\big)$$
(see A7, [5]). Thus (2.2) reduces to a linear system which may be solved for the coefficients $\lambda_i$. If the support of the basis functions is small (i.e., if $a$ is small) then this linear system has an unstructured sparsity which can be exploited by, for example, an iterative row-action solution method [2].
The computational cost of such a discretisation lies mainly in the evaluation of the Abel transform, which requires the calculation of a Bessel function. Fortunately, Bessel functions of half-integer order can be calculated efficiently from their recurrence relations (see the Atlas [12] for details).
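For instance (a sketch of ours, not code from [12] or [5]), $I_{m+1/2}$ can be generated from the elementary cases $I_{1/2}$ and $I_{3/2}$ by the upward recurrence $I_{\nu+1}(z) = I_{\nu-1}(z) - (2\nu/z)\,I_\nu(z)$, here checked against the power series:

```python
import math

def bessel_i_half(m, z):
    # I_{m+1/2}(z) by upward recurrence I_{nu+1} = I_{nu-1} - (2 nu / z) I_nu,
    # seeded with the elementary half-integer cases; adequate for small m,
    # though upward recurrence loses accuracy as the order grows.
    i_prev = math.sqrt(2.0 / (math.pi * z)) * math.sinh(z)                        # I_{1/2}
    i_curr = math.sqrt(2.0 / (math.pi * z)) * (math.cosh(z) - math.sinh(z) / z)   # I_{3/2}
    if m == 0:
        return i_prev
    for nu in range(1, m):
        nu_half = nu + 0.5
        i_prev, i_curr = i_curr, i_prev - (2.0 * nu_half / z) * i_curr
    return i_curr

def bessel_i_series(nu, z, terms=40):
    # reference value: I_nu(z) = sum_k (z/2)^{2k+nu} / (k! Gamma(k+nu+1))
    return sum((z / 2.0) ** (2 * k + nu) /
               (math.factorial(k) * math.gamma(k + nu + 1))
               for k in range(terms))
```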
The discretisation techniques described here can also be applied to problems in which the integrals are over curves of sufficient smoothness to allow a local linear approximation.
3 Fourier transform and invertibility
The Fourier transform of the $d$-dimensional basis function $\Psi_m$ is radially symmetric and is given in (A3) of [5]; up to a positive factor depending on $m$, $\alpha$, $a$ and $d$, it is
$$\hat\Psi_m(\boldsymbol\xi) \;\propto\; \frac{J_{m+d/2}\big(\sqrt{(a\|\boldsymbol\xi\|)^2 - \alpha^2}\big)}{\big((a\|\boldsymbol\xi\|)^2 - \alpha^2\big)^{(2m+d)/4}}. \qquad (3.1)$$
The presence of the Bessel function Jm+d/2{z) in this expression clearly implies that it is
not non-negative, and so by Bochner's characterisation of positive definite translationinvariant functions, ^ is not positive definite for any choices of the parameters.
This fact denies us the attractive approximation theory of the compactly supported radial functions of Wu, Wendland and Buhmann (Section 3, [1]). In particular, there is no guarantee, per se, of the invertibility of the interpolation matrix [Φ(x_i − x_j)], needed to ensure that (2.1) can represent an arbitrary function at its centres. However, this interpolation matrix is invertible if it is strictly diagonally dominant (Corollary 5.6.17, [4]) which, for a set of centres on a uniform grid Γ, holds if
$$ \Phi(0) > \sum_{\xi \in \Gamma \setminus \{0\}} \Phi(\xi). \tag{3.2} $$
Values of the parameters for which (3.2) is satisfied for the square planar grid ΔZ² are shown in Figure 2.
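Condition (3.2) is easy to check numerically. The sketch below (function names are mine) builds the window from the power series for the integer-order I_m, using Lewitt's normalisation ψ_m(0) = 1, and sums its translates over the grid ΔZ²; the sum is finite because the window is supported on [0, a]:

```python
import math

def bessel_i(m, z, terms=40):
    """I_m(z) for integer m >= 0 by its power series (adequate for
    the moderate arguments used here)."""
    return sum((z / 2.0) ** (2 * k + m)
               / (math.factorial(k) * math.factorial(k + m))
               for k in range(terms))

def lewitt_blob(r, m, alpha, a):
    """Lewitt's window psi_m(r), supported on [0, a], normalised so
    that psi_m(0) = 1."""
    if r >= a:
        return 0.0
    t = math.sqrt(1.0 - (r / a) ** 2)
    return t ** m * bessel_i(m, alpha * t) / bessel_i(m, alpha)

def diagonally_dominant(m, alpha, a, delta):
    """Check condition (3.2) on the planar grid delta * Z^2."""
    kmax = int(a / delta) + 1
    s = sum(lewitt_blob(math.hypot(i, j) * delta, m, alpha, a)
            for i in range(-kmax, kmax + 1)
            for j in range(-kmax, kmax + 1)
            if (i, j) != (0, 0))
    return lewitt_blob(0.0, m, alpha, a) > s
```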
As is noted in [5], there are several reasons why a rapid decay of the Fourier transform of the basis function is advantageous in functional representation for the inversion of x-ray and related transforms.
• Such inversions may be complicated by functions in the nullspace of the transform,
so-called ghosts. For some transforms [7] it can be shown that such functions have
a Fourier transform which is small close to zero in the frequency domain, and so
representation by a basis with Fourier transform localised around zero will suppress
these ghosts.
• These inversions are often ill-posed and the given data noisy. Representation of the
sought function by a basis with localised Fourier transform imposes smoothness,
and so acts to regularise the problem in the sense of Tikhonov.
• It is often convenient to sample the inverted function on a grid which differs from the set of centres x_i of the basis. With a localised Fourier transform, such a sampling can be performed without significant aliasing.
The asymptotic estimate $\hat\Phi_m(x) = O(1/\|x\|^{m+(d+1)/2})$ may be derived from (3.1) and estimates of J_ν for large argument (see Eq. A4, [5]), a fact which should inform our choice of m.
4  Choice of parameters
One agreeable feature of Lewitt's radial functions is that the choice of the parameters of the functions corresponds in a natural way to the balance between representation quality and efficiency of computation. For example, the asymptotic rate of decay of the Fourier transform increases with m (see above), but so does the cost of the calculation of I_m.
A similar choice arises when the centres lie on a uniform (square or triangular) grid Γ. Let Δ denote the grid spacing of such a grid, i.e., the minimum distance between distinct centres in Γ. It is desirable that the grid ratio a/Δ be small, as this results in sparsity of the discretisation. As a guide to fixing the values of α and the grid ratio, Lewitt suggests the error in quasi-interpolation to a constant, the error with which the function
$$ g(x) := \sum_{\xi \in \Gamma} \Phi(x - \xi) $$
approximates the function whose constant value is that of g at the centres (edge effects are ignored here). In Figure 2 the root mean square of this representation error (estimated numerically) is shown for the square planar grid, m = 2 and a range of values of α and a/Δ. The distinctive "trenches" in the error can be explained with the Poisson summation formula (see [11]),
$$ \sum_{n \in \mathbb{Z}^2} \Phi(x + \Delta n) = \frac{1}{\Delta^2} \sum_{n \in \mathbb{Z}^2} \exp(2\pi i\, n \cdot x / \Delta)\, \hat\Phi(n/\Delta). \tag{4.1} $$
The summand for n = 0 in the second sum is $\hat\Phi(0)$, so the representation error depends only on the values of $\hat\Phi$ on the dual grid, Z²/Δ. Provided that $\hat\Phi$ decays rapidly, we would expect a small error when $\hat\Phi$ is zero, or close to zero, for the dual grid-nodes close to the origin.
By (3.1), $\hat\Phi(x)$ is zero exactly when
$$ J_{m+d/2}\!\left(\sqrt{(2\pi a\|x\|)^2 - \alpha^2}\right) = 0, $$
i.e., for radial values ‖x‖ = R_k = √(η_k² + α²)/(2πa), where η_k is the k-th zero of J_{m+d/2}. Thus the requirement that the k-th zero of $\hat\Phi(x)$ occurs at the radius of the closest non-zero dual grid node (i.e., R_k = 1/Δ) is a constraint on the values of α and a/Δ:
$$ \alpha = \sqrt{(2\pi a/\Delta)^2 - \eta_k^2}. \tag{4.2} $$
The contours (4.2) agree well with the trenches evident in Figure 2. With the same intent we can require that the l-th zero of $\hat\Phi(x)$ occur at the radius of the second closest dual grid node (R_l = √2/Δ). Points satisfying both of these constraints can be expected to have a particularly small representation error. In Figure 2 these favourable choices are labelled k:l.
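The trench constraint is cheap to evaluate. As a sketch (function names are mine), one can locate η₁, the first positive zero of J_{m+d/2}, from the power series and read α off (4.2) for k = 1:

```python
import math

def j_bessel(nu, x, terms=60):
    """J_nu(x) for integer nu by its power series."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k + nu)
               / (math.factorial(k) * math.factorial(k + nu))
               for k in range(terms))

def first_zero(nu, lo=0.1, hi=10.0, steps=200):
    """First positive zero of J_nu, by scanning for a sign change
    and then bisecting."""
    x0, f0 = lo, j_bessel(nu, lo)
    for i in range(1, steps + 1):
        x1 = lo + (hi - lo) * i / steps
        f1 = j_bessel(nu, x1)
        if f0 * f1 < 0.0:
            for _ in range(60):
                mid = 0.5 * (x0 + x1)
                if j_bessel(nu, x0) * j_bessel(nu, mid) <= 0.0:
                    x1 = mid
                else:
                    x0 = mid
            return 0.5 * (x0 + x1)
        x0, f0 = x1, f1
    raise ValueError("no sign change found")

def alpha_on_trench(grid_ratio, m=2, d=2):
    """alpha from (4.2) with k = 1; requires 2 pi a/Delta > eta_1,
    and m + d/2 to be an integer for the series above."""
    eta1 = first_zero(m + d // 2)
    z = 2.0 * math.pi * grid_ratio
    return math.sqrt(z * z - eta1 * eta1)
```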
The above argument can also be applied to the triangular grid. Establishing the Poisson summation formula for such a grid is straightforward (either generalised from VII Section 2 of [11] or specialised from the formula for topological groups in [6]), and one finds that the dual grid is the triangular grid with node spacing 2/(Δ√3). The representation error is,
qualitatively, similar to that shown in Figure 2. To make a quantitative comparison we plot, in Figure 3, the representation error on the principal trench (i.e., along the contour (4.2) for k = 1 in the case of the square grid) for each grid type and a number of values of m.

FIG. 2. The representation error of the square planar grid for m = 2. The lower contour map shows the root mean square error in representation for different values of the grid ratio a/Δ and localisation α. The upper figure shows the error along the trenches evident in the lower. Favourable choices of the parameters are marked 1:2, 1:3, ..., and are also shown in the lower figure. Values of the parameters to the left of the dashed line give rise to a diagonally dominant interpolation matrix.

FIG. 3. Error in the principal trench for square (dashed) and triangular (solid) grids.
To ensure a fair comparison, the horizontal scale in Figure 3 is adjusted for each
grid-type to give equal node densities. As is seen, the two grid-types have similar error
performance, suggesting that the square grid (with attendant ease of implementation)
is to be preferred in practice.
5  The functions of Wendland
It is interesting to compare Lewitt's functions with the radial basis functions of Wendland [1, 14], positive definite functions whose window functions are piecewise polynomial. The positive definiteness of Wendland's functions indicates their usefulness in approximation, for which extensive results exist, and a number of recent papers have explored their use in the discretisation of partial differential equations.
The use of Wendland's functions in x-ray problems does not appear to have been investigated, although their Abel transforms can be obtained analytically. We do not address this question here, but indicate why Lewitt's functions may offer some advantages for such problems. The Fourier transform of Wendland's function Φ_{2,0}, whose window is $\varphi_{2,0}(r) = (1 - r)_+^2$, is proportional to
$$ r^{-4} \int_0^{2\pi r} (2\pi r - t)^2\, t\, J_0(t)\,dt \qquad (r = \|x\|) $$
(see Section 3, [14]). In Figure 4, $\hat\Phi_{2,0}$ is plotted along with the Fourier transform $\hat\Phi_2$ of Lewitt's function with a = 1 and the parameter choice 1:2 of Figure 2. Although both have the same asymptotic decay of the Fourier transform, Lewitt's is more localised about zero and thus may offer better suppression of ghosts in x-ray problems.
FIG. 4. Fourier transforms of basis functions.

Finally we mention that Buhmann has shown, in [1], that Wendland's window function admits a convolution representation of the form
$$ p(r) := \int_0^\infty (1 - r^2/t)_+^{\kappa}\, g(t)\,dt \tag{5.1} $$
for the weight g(t) = (1 − t)₊ with suitable κ and n. We note that (5.1) may be solved
for g, since substituting x = r² in (5.1) allows it to be reduced to a standard integral equation whose solution,
$$ g(x) = \frac{(-1)^n}{n!}\, x^n f^{(n+1)}(x), \qquad f(x) = p(\sqrt{x}), $$
can be found in Article 1.1-4.32 of [10]. In the case that p is Lewitt's window ψ_m, one may use the differentiation formula, A11 of [5], to find the corresponding weight g. For n = 1 we find that
$$ g(x) = -\frac{1}{2a^{m-2}}\,\frac{I_{m-1}(\alpha)}{I_m(\alpha)}\,\psi_{m-1}(r), $$
a weight qualitatively different from that of Wendland's function.
Acknowledgements: The author wishes to thank L. R. Wyatt and the referees for a
number of helpful comments, and acknowledges the financial support provided by the
EC with the grant MAS3-CT98-0168.
Bibliography
1. M. D. Buhmann. Radial basis functions. In Acta Numerica, volume 9, pages 1-38.
Cambridge University Press, 2000.
2. Y. Censor. Row-action methods for huge and sparse systems and their applications.
SIAM Review, 23(4):444-466, October 1981.
3. J. J. Green. Discretizing Barrick's equations. Submitted.
4. R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1990.
5. R. M. Lewitt. Multidimensional digital image representations using generalized
Kaiser-Bessel window functions. J. Opt. Soc. Am. A, 7(10):1834-1846, October
1990.
6. L. H. Loomis. An Introduction to Abstract Harmonic Analysis. D. Van Nostrand,
1953.
7. A. K. Louis. Orthogonal function series expansion and the null space of the Radon transform. SIAM J. Math. Anal., 15(3):621-633, May 1984.
8. S. Matej, G. T. Herman, T. K. Narayan, S. S. Furuie, R. M. Lewitt, and P. E. Kinahan. Evaluation of task-oriented performance of several fully 3D PET reconstruction algorithms. Phys. Med. Biol., 39:355-367, 1994.
9. S. Matej and R. M. Lewitt. Practical considerations for 3-D image reconstruction using spherically symmetric volume elements. IEEE Transactions on Medical Imaging, 15(1):68-78, 1996.
10. A. D. Polianin and V. Manzhirov. Handbook of Integral Equations. CRC Press,
1998.
11. E. M. Stein and G. Weiss. Fourier Analysis on Euclidean Spaces. Princeton University Press, 1971.
12. William J. Thompson. Atlas for computing mathematical functions. John Wiley &
Sons Inc., New York, 1997.
13. G. N. Watson. A Treatise on the Theory of Bessel Functions. Cambridge, second
edition, 1944.
14. H. Wendland. Error estimates for interpolation by compactly supported radial basis
functions of minimal degree. Journal of Approximation Theory, 93:258-272, 1998.
15. L. R. Wyatt. A relaxation method for integral inversion applied to HF radar measurement of the ocean wave directional spectrum. International J. Remote Sensing, 11:1481-1494, 1990.
Computing with radial basic functions the Beatson-Light way!
Will Light
Department of Mathematics and Computer Science, University of Leicester, UK.
pwl@mcs.le.ac.uk
Abstract
In this paper we discuss a number of recent developments in the practice of how to
compute with radial basic functions. The two main problems addressed are how to develop
fast evaluation schemes for radial basic functions, and how to efficiently carry out the
solution of the interpolation problem. The approach is to mainly describe work which
has involved the author and Professor Rick Beatson as contributors, and to include an
idiosyncratic selection of works by other researchers which have attracted the attention
of the author.
1  Introduction
Research into radial basic functions has been active now for about 30 years. The basic setup is as follows. A function ψ : R^n → R, which we refer to as the basic function, is specified. A subspace is then constructed by reference to points x_1, ..., x_m in R^n. The members of this subspace all have the form
$$ s(x) = \sum_{i=1}^m a_i\, \psi(x - x_i), \qquad x \in \mathbb{R}^n, $$
where the a_1, ..., a_m are real numbers. It is important to appreciate at the outset that throughout this paper, and indeed in most of the papers appearing in this area, the underlying assumption is that the points x_1, ..., x_m are distinct. One of the most common tasks for which these functions are used is interpolation. A small amount of research has been carried out where the points at which an interpolant is developed are arbitrary distinct points in R^n, but by far the majority of the work relates to interpolation which is carried out at the same points as those used to effect the translation. Accordingly, data d_1, ..., d_m are given at x_1, ..., x_m, and we require that
$$ d_j = s(x_j) = \sum_{i=1}^m a_i\, \psi(x_j - x_i), \qquad j = 1, \dots, m. \tag{1.1} $$
Two immediate observations present themselves. Firstly, at the present level of generality
there is absolutely no guarantee that the Equations (1.1) will have a unique solution.
Secondly, one knows from the work of Mairhuber [14] that there are no Haar subspaces
of significant dimension in any space R^n for n ≥ 2. What this means is that if we are to construct interpolation problems which have a unique solution for each location of the data points x_1, ..., x_m and for each choice of the data d_1, ..., d_m, then the subspace used must vary as the interpolation points vary. If we pause for a moment and consider how we might in some sensible and orderly way vary the subspace as the points x_1, ..., x_m vary, then using simple shifts of a single basic function ψ is one of the most natural choices. It is very common to work with a function ψ which is a radial function. Thus we take a function φ : R⁺ → R and determine ψ by the rule ψ(x) = φ(|x|) for all x ∈ R^n. Note that throughout this account, the symbol | · | will stand for the Euclidean norm in R^n. At this point a common inaccuracy arises. The function ψ can be correctly referred to as a radial basic function. However, many authors give this appellation to the function φ, whose radiality is of no consequence whatsoever, since it would imply that φ was simply an even function on R. Since φ only acts on R⁺, the idea that φ can be radial is vacuous.
Let us continue in this spirit of criticism a little while longer. As far as the author is aware, only two people in the world would refer to ψ as a basic function, or a radial basic function. All other authors would use the word basis in place of basic. There are very obvious problems with this terminology. We are seeking to generate subspaces which are suitable for interpolation. Such subspaces will naturally have the same dimension as the number of data, and the functions {ψ(· − x_i) : i = 1, ..., m} should form a basis for the subspace. The use of the word basis in two completely different senses seems to the author to be misleading and unhelpful, whereas use of the word basic — a difference of one character — eliminates any possibility of confusion, and avoids the use of the word basis, which has a very specific mathematical meaning, in a context where its meaning is not the usual mathematical one.
The problem about whether interpolation is possible has a highly satisfactory answer in the work of Micchelli [15]. We direct the reader to the book of Cheney and Light [10] for a full account of these matters. A couple of examples will be helpful. If one chooses
$$ s(x) = \sum_{i=1}^m a_i\, \phi(|x - x_i|) = \sum_{i=1}^m a_i \exp(-|x - x_i|^2), \qquad x \in \mathbb{R}^n, $$
or
$$ s(x) = \sum_{i=1}^m a_i\, \phi(|x - x_i|) = \sum_{i=1}^m a_i\, |x - x_i|, \qquad x \in \mathbb{R}^n, $$
then the resulting interpolation problem is uniquely solvable for any choice of x_1, ..., x_m and for any data d_1, ..., d_m. This result contrasts very strongly with the case for polynomial interpolation, where the data points x_1, ..., x_m have to be constrained not to lie on an algebraic surface of appropriate degree. Indeed, the alternative formulation of the above result for the second example is quite often surprising to mathematicians who are uninitiated in the theory of radial basic functions.
Theorem 1.1 Let x_1, ..., x_m be distinct points in R^n. Then the matrix (|x_j − x_i|) is invertible.
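Theorem 1.1 invites a quick numerical check; a sketch with scattered points in R² (the point count and data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((12, 2))          # 12 distinct points in R^2
d = rng.random(12)               # arbitrary data d_1, ..., d_m

# The matrix (|x_j - x_i|) of Theorem 1.1, and the coefficients a
# solving the interpolation equations (1.1) with psi(x) = |x|.
A = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
a = np.linalg.solve(A, d)        # succeeds: A is invertible
```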
Having drawn a clear distinction between polynomial approximation and approximation by (radial) basic functions, it is at this point that we must consider having some polynomial ingredients in our interpolant. This is done in a very standard way by a process we call augmentation by polynomials. We consider interpolants of the form
$$ s(x) = \sum_{i=1}^m a_i\, \phi(|x - x_i|) + p(x), \qquad x \in \mathbb{R}^n. $$
Here p is a polynomial of total degree at most k − 1. We still wish to interpolate to m pieces of information, but now have more than m parameters to determine with this information. The remaining parameters are determined via the 'natural' boundary conditions. The full set of equations is
$$ d_j = s(x_j) = \sum_{i=1}^m a_i\, \phi(|x_j - x_i|) + p(x_j), \qquad j = 1, \dots, m, $$
$$ 0 = \sum_{i=1}^m a_i\, q(x_i) \quad \text{for all } q \in \pi_{k-1}(\mathbb{R}^n). $$
Here π_{k−1}(R^n) represents the space of polynomials of total degree k − 1 in R^n. Two questions present themselves pretty quickly from this additional hypothesis. Why should polynomials be added to the interpolant, and why are the boundary conditions chosen in this particular way? In some sense it is essential that we allow ourselves the possibility of adding polynomial terms to some of the interpolants, as we shall soon see. The most important example of a radial basic function interpolant which has a polynomial part will be the thin-plate spline. We will make considerable reference to this interpolant in R², where it has the form
$$ s(x) = \sum_{i=1}^m a_i\, |x - x_i|^2 \ln|x - x_i| + a \cdot x + b, \qquad x \in \mathbb{R}^2. $$
Note here that the parameter a is a vector with two entries, as is x. Thus a·x stands for the dot product between a and x. The parameter b is a real number. The natural boundary conditions take the form
$$ \sum_{i=1}^m a_i = \sum_{i=1}^m a_i s_i = \sum_{i=1}^m a_i t_i = 0, $$
where x_i = (s_i, t_i), i = 1, ..., m. This particular interpolant exhibits a feature common to all the cases where augmentation by polynomials is either necessary or desirable: the degree of the polynomial added is very low. The usual choices are k = 0 (when no polynomial term is added), k = 1 (when the term is a constant polynomial) and k = 2 (when the added polynomial is linear). It is now no longer possible to carry out interpolation for all choices of the points x_1, ..., x_m. One must avoid distributions of these points which lie on a zero surface of the corresponding polynomial subspace. In the explicit case we considered above (thin-plate splines), the very mild restriction needed is that x_1, ..., x_m should not all lie on a single straight line. The theory developed by Micchelli [15] includes the case of augmentation by polynomials.
We now propose to take a look at a very simple example which we hope will give the reader a feel for some of the ideas and concepts we have introduced so far. We consider
$$ s(x) = \sum_{i=1}^m a_i\, |x - x_i| + b, \qquad x \in \mathbb{R}. $$
Here the parameter b is a real number, and the natural boundary condition gives us
$\sum_{i=1}^m a_i = 0$. A unique feature of the univariate case is that we can order the interpolation points x_1 < x_2 < ⋯ < x_m. Now consider the function s in one of the intervals [x_i, x_{i+1}], i = 1, ..., m − 1. It is clear that in such an interval s is simply a linear function. The demand that s interpolates the data at x_1, ..., x_m means that s must be the piecewise linear interpolant to the data in the interval [x_1, x_m]. What is the effect of the 'natural' boundary conditions? In the interval [x_m, ∞) we can write
$$ s(x) = \sum_{i=1}^m a_i (x - x_i) + b = -\sum_{i=1}^m a_i x_i + b. $$
Thus s is constant in [x_m, ∞). A similar calculation reveals that s is constant in (−∞, x_1]. Combining all these observations shows that s is the natural linear spline interpolant to the data at x_1, ..., x_m. This goes some way to explaining why the word 'natural' is appended to the boundary or extra conditions. But we can go a little further. It is well known that the natural splines satisfy a variational principle. For the linear spline, if we examine
$$ X = \{ f \in \mathcal{S}' : f' \in L^2(\mathbb{R}) \}, $$
then
$$ \int_{-\infty}^{\infty} (s')^2 \le \int_{-\infty}^{\infty} (f')^2 $$
for all f ∈ X which also interpolate the data. This variational principle is very useful in developing error estimates, and we shall return to this general thread of ideas later in this account. However, we ought to observe that 𝒮′ is the space of tempered distributions, and that the first derivative is to be taken in the distributional sense. There are ways of getting round this distributional approach (see Cheney and Light [10] for an example which corresponds closely to the discussion here), but it does give the most succinct description, and creates the technical background which will underpin all the theory which has been developed in this area. Notice also that the quantity being minimised can be used to specify a seminorm on X simply by taking the square root of the integral. This seminorm has as kernel π₀(R), which is precisely the polynomial subspace we use to augment the original radial basic function. Something very fundamental is happening here. Most mathematicians would regard this seminorm as being a measure of smoothness of the corresponding function. The natural linear spline therefore interpolates the data, and is the smoothest interpolant to the data from X in the sense that it possesses the smallest derivative in the L²-norm. If we are to pursue this very natural idea of making higher derivatives of s small, then we will naturally develop seminorms with polynomial kernels. This goes a long way towards explaining the need for augmentation.
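The equivalence with the natural linear spline can be seen numerically; a sketch with four hypothetical nodes:

```python
import numpy as np

# Interpolate with s(x) = sum_i a_i |x - x_i| + b, sum_i a_i = 0, and
# observe that the result is the natural linear spline: piecewise
# linear between the nodes and constant outside [x_1, x_m].
x = np.array([0.0, 1.0, 2.5, 4.0])
d = np.array([1.0, 3.0, 2.0, 2.0])
m = len(x)
A = np.abs(x[:, None] - x[None, :])
M = np.block([[A, np.ones((m, 1))],
              [np.ones((1, m)), np.zeros((1, 1))]])
sol = np.linalg.solve(M, np.concatenate([d, [0.0]]))
a, b = sol[:m], sol[m]

def s(t):
    return np.abs(t - x) @ a + b
```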
Finally in this introduction, we want to discuss briefly the uses to which radial basic
function interpolation is put. There are two significant feelings about interpolation by
these functions. Firstly, it is thought that radial basic function interpolation is very
good for treating scattered data. Loosely speaking, data is scattered when there is no
possibility of determining either a natural choice of coordinate axes, or an origin. It
is at the opposite end of the spectrum to gridded data. In the presence of a cartesian
product for the data sites, it is much more efficient to use univariate methods together
with tensor product constructions to do the interpolation. Secondly, radial basic function
interpolation is thought to be very good for dealing with high dimensional data. There is
some evidence from the realm of neural networks that this is indeed the case, but we will
not venture into the area of high dimensional data interpolation in this paper. Finally,
many of the data sets we want to treat have very large numbers of data sites and so our
aim is to develop methods which will handle 10,000 to 1,000,000 data sites or more.
2  Computational difficulties and fast evaluation
In this section, we want to discuss the difficulties that arise when a large radial basic function interpolation problem is posed. We shall also deal with one of the essential tools for overcoming some of the difficulties. The system we want to solve has the form
$$ d_j = s(x_j) = \sum_{i=1}^m a_i\, \phi(|x_j - x_i|) + p(x_j), \qquad j = 1, \dots, m, \tag{2.1} $$
$$ 0 = \sum_{i=1}^m a_i\, q(x_i) \quad \text{for all } q \in \pi_{k-1}(\mathbb{R}^n). \tag{2.2} $$
If we declare a basis for π_{k−1}(R^n) then we can write these equations in matrix form as
$$ \begin{pmatrix} A & Q \\ Q^T & 0 \end{pmatrix} \begin{pmatrix} a \\ c \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix}. $$
Here the matrix A has entries φ(|x_j − x_i|) and is m × m. The matrix Q has entries p_ℓ(x_j), where p_1, ..., p_ν is a basis for π_{k−1}(R^n), and is of size m × ν. Recall from our assumptions that only low degree polynomials are used, and so Q is a long thin matrix. In the case of thin-plate splines in R² it would have size m × 3. However, A is a very large matrix, with absolutely no sparsity. In fact, for thin-plate splines, the matrix A
is zero on the diagonal, and has large off-diagonal entries. In solving a large system of
linear equations, the only effective strategy is to use an iterative solver. Such a solver
will involve many multiplications of the matrix A with a vector a, and the full nature
of A makes this a very costly process. One of the key discoveries in this area was the
Beatson and Newsam [8] result which showed how fast multipole algorithms could be
applied to this area. If we consider the expression
$$ s(x_j) = \sum_{i=1}^m a_i\, |x_j - x_i|^2 \ln|x_j - x_i| + p(x_j) $$
for some x_j ∈ R^n, then this can be considered as an evaluation of the function s at the point x_j, or the formation of an element in the matrix-vector product Aa. Because of
this, most authors tend only to consider how to evaluate the function s in an efficient way
— generating what are known as fast evaluation algorithms. It is impossible to estimate
properly the importance of this discovery. Anyone involved in programming iterative
solutions to the thin-plate spline equations with tens of thousands of points would find
that any such algorithm would just grind itself into the dust without this technology. The
technology really has two aspects: a mathematical tool, and a programming structure.
Here we intend to give only the flavour of the argument. The reader who really wants to
know the details is advised to look either at the original paper [8], or the later paper of
Beatson and Light [5] which deals with polyharmonic splines. She can also look at two
papers which give clear explanations of simple cases. The first is found in a survey paper
by Beatson and Greengard [3]. The second is a technical report by Beatson, Levesley
and Light [7]. This last paper discusses fast evaluation methods on the circle and higher
dimensional spheres, and the reader will find a very careful and full account of the
one-dimensional circle case. The first trick with problems in R² is to consider complex variables, rather than points in R². Let z be a point at which we wish to evaluate s, and u a data point, or centre. Then
$$ |z - u|^2 \ln|z - u| = \operatorname{Re}\!\left(|z - u|^2 \ln(z - u)\right) = \operatorname{Re}\!\left(|z - u|^2 \ln z\right) + \operatorname{Re}\!\left(|z - u|^2 \ln\!\left(1 - \frac{u}{z}\right)\right). $$
Look at the last two expressions here. The first of them has the centre u in the square of
the modulus term, and this expression is quite cheap to evaluate, even if there are many
centres u. The effect of many centres on the second term is however quite profound, and
it is with this term that we must work. The idea is to set a tolerance, and only aim to
evaluate s to within this tolerance, rather than exactly. The appropriate series expansion
can then be used:
$$ \ln\!\left(1 - \frac{u}{z}\right) = -\sum_{p=1}^{\infty} \frac{1}{p}\left(\frac{u}{z}\right)^p \approx -\sum_{p=1}^{N} \frac{1}{p}\left(\frac{u}{z}\right)^p. $$
The value of N depends on the tolerance demanded of the evaluation and the relative sizes of u and z. For this reason, we think of z as far away from the origin in R², and u close to the origin. If there are now many centres u_1, ..., u_m near the origin, and z is far away from the origin, then we can summarise the effects of linear combinations of all these centres as follows:
$$ \sum_{i=1}^{m} a_i |z - u_i|^2 \ln\!\left(1 - \frac{u_i}{z}\right) \approx \sum_{i=1}^{m} a_i \sum_{p=1}^{N} \ell_p(u_i)\, z^{-p} = \sum_{p=1}^{N} \sum_{i=1}^{m} a_i\, \ell_p(u_i)\, z^{-p} = \sum_{p=1}^{N} g_p(u_1, \dots, u_m)\, z^{-p}. $$
The principle now is to use the last expression above to make an approximate evaluation of s. Of course, the assumption that z was far from the origin and u_1, ..., u_m were close to the origin is not important. It is simply important that z be far away from the cluster of centres u_1, ..., u_m. The summarising expression is referred to as a Laurent-type expansion, because it summarises the contribution of the centres u_1, ..., u_m in terms of series involving negative powers of z.

FIG. 1. Fast evaluation panelling.

There is now a lot of preprocessing to go on before
the fast evaluation algorithm is ready to roll. Figure 1 shows how the algorithm proceeds.
The shaded square at the bottom left of the domain is the panel which contains z, the
evaluation point. All the squares around this one which are the same size are deemed
to be 'close' to the evaluation square. All other squares are 'far away'. Of course, as the
squares get further away from z it becomes possible to use our summarising technique
to total up the contributions of larger and larger numbers of points. This is done in a
very explicit manner, which is represented by the shading in Figure 1. As we get further
away, we double the size of the squares over which we summarise, and there is a band of
same-size squares (or a ring, if the evaluation square was in the middle of the domain)
two squares wide surrounding the evaluation square. Once all the preprocessing is done,
and we shall discuss this a little more in a moment, all the needed coefRcients gp are
available, and evaluation can be carried out in about O(logm) FLOPS instead of 0{m).
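The accuracy of the truncated logarithm in the far field can be checked directly. The sketch below compares direct evaluation of Σ a_i |z − u_i|² ln|z − u_i| with the ln z plus truncated-series form; the full algorithm goes further and rearranges the sums into coefficients g_p reusable for every distant z, bookkeeping which is omitted here:

```python
import cmath
import math
import random

def ln1m_trunc(w, N):
    """Truncated Taylor series of log(1 - w), valid for |w| < 1."""
    return -sum(w ** p / p for p in range(1, N + 1))

random.seed(0)
centres = [complex(random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1))
           for _ in range(50)]                 # a cluster near the origin
coeffs = [random.uniform(-1.0, 1.0) for _ in range(50)]
z = complex(3.0, 2.0)                          # far from the cluster

direct = sum(a * abs(z - u) ** 2 * math.log(abs(z - u))
             for a, u in zip(coeffs, centres))

N = 12                                         # truncation order
laurent = sum(a * abs(z - u) ** 2 *
              (cmath.log(z) + ln1m_trunc(u / z, N)).real
              for a, u in zip(coeffs, centres))
```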
The above account does not quite reveal the whole story. The coefficients g_p are calculated in an orderly manner which greatly improves the efficiency of the algorithm. Suppose our problem is located in [0,1]². An initial decision is made to divide the original domain into squares of size 2^{−n}. There is then a parent-child relationship derived through a quad-tree data structure. The parent [0,1]² has four children: [0,0.5]², [0.5,1]², [0,0.5] × [0.5,1] and [0.5,1] × [0,0.5]. This parent-child relationship helps in setting up the coefficients g_p(u_1, ..., u_m) in an efficient way. There is also a further idea involving Taylor series, which gives more efficiency. We omit any description of this technique.
3  Inverting the interpolation matrix
Recall from the beginning of the previous section that the equations specifying the interpolation problem are as follows:
$$ d_j = s(x_j) = \sum_{i=1}^m a_i\, \phi(|x_j - x_i|) + p(x_j), \qquad j = 1, \dots, m, \tag{3.1} $$
$$ 0 = \sum_{i=1}^m a_i\, q(x_i) \quad \text{for all } q \in \pi_{k-1}(\mathbb{R}^n). \tag{3.2} $$
In matrix terms we have
$$ \begin{pmatrix} A & Q \\ Q^T & 0 \end{pmatrix} \begin{pmatrix} a \\ c \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix}, $$
where A is a full matrix which tends to exhibit poor conditioning. The poor conditioning
of A is similar to problems experienced by researchers in the theory of finite elements —
as the interpolation points become very dense in a given region, the conditioning gets
worse. In fact, there are formal statements relating some impression of the condition
number of A (usually the smallest eigenvalue of A) to the minimum interpoint distance.
The following table shows the condition number of A when the interpolation points are given on a uniform 5 × 5 grid in [0, a]².

TAB. 1. Two-norm condition numbers of A.

Scale parameter a    Condition number
1.0                  3.6458 × 10^
0.1                  2.5179 × 10^
0.01                 2.4364 × 10^
0.001                2.4349 × 10^8

Of course, on a philosophical level, it does not make any sense whatsoever to describe an interpolation problem as being ill-conditioned.
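The computation behind Table 1 can be sketched in a few lines (the function name is mine; exact digits depend on floating-point detail):

```python
import numpy as np

def tps_cond(scale, n=5):
    """2-norm condition number of A = (phi(|x_j - x_i|)), with
    phi(r) = r^2 ln r, for a uniform n x n grid in [0, scale]^2."""
    t = np.linspace(0.0, scale, n)
    pts = np.array([(u, v) for u in t for v in t])
    r = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    A = np.where(r > 0.0, r * r * np.log(np.where(r > 0.0, r, 1.0)), 0.0)
    return np.linalg.cond(A)
```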
Let's discuss this point in a little more depth. Suppose x_1, ..., x_m are points in R^n, and G_1, ..., G_m are a set of functions from R^n to R which are linearly independent over {x_1, ..., x_m}. That is, interpolation to arbitrary data at x_1, ..., x_m by linear combinations of G_1, ..., G_m is always uniquely possible. Then there is a basis F_1, ..., F_m for the linear span of G_1, ..., G_m such that F_i(x_j) is 1 if i = j and is zero for all other values of i, j between 1 and m. If the given data is d_1, ..., d_m, then the interpolant can be written down immediately as
$$ \sum_{i=1}^m d_i F_i(x), \qquad x \in \mathbb{R}^n. $$
If one has in one's hands the basis {F_1, ..., F_m} and wants to know the coefficients which must be used then one need only invert the identity matrix to obtain the solution, and there are not many matrices which are better conditioned than the identity matrix! Of course, getting one's hands on the basis F_1, ..., F_m is usually rather difficult — as hard as solving the original problem in fact. It has become traditional to refer to the basis F_1, ..., F_m as the Lagrange basis (in sympathy with the fact that Lagrange was a person who wrote down this basis for polynomial interpolation in one dimension) or the cardinal basis. This last term seems to the author to be quite appropriate, indicating that the basis is special. However, it does not find favour with spline theorists, since they think of the word cardinal in a very technical sense (the interpolation points are ℤ). Terminology aside, the point is still made that the conditioning of any interpolation
problem is a function of the available basis. A more practical case of this phenomenon is the problem of natural cubic splines in R. They fit into the radial basic function interpolation scenario, because a natural cubic spline with knots at x_1, ..., x_m can be written as
$$ s(x) = \sum_{i=1}^m a_i\, |x - x_i|^3 + ax + b, \qquad x \in \mathbb{R}. $$
If we require this spline to interpolate data d_1, ..., d_m at x_1, ..., x_m then we have to require that s(x_j) = d_j for j = 1, ..., m. The natural property comes, as expected, from the natural boundary conditions:
$$ \sum_{i=1}^m a_i = \sum_{i=1}^m a_i x_i = 0. $$
The ill-conditioning illustrated in Table 1 would be equally present in this example, and
the remark that the conditioning increases as the interpoint spacing decreases would also
hold good. Of course, to suggest the use of this basis to a spline practitioner would not
be a good idea! We are well used to the idea that B-splines are the correct basis to use
in this situation.
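The point that conditioning belongs to the basis, not the problem, can be made concrete with φ(r) = r, where Theorem 1.1 guarantees invertibility: in the cardinal basis the collocation matrix becomes the identity. A sketch:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 30)           # 30 equally spaced knots
A = np.abs(x[:, None] - x[None, :])     # collocation in the shifted basis
C = np.linalg.inv(A)                    # column i: coefficients of F_i
F = A @ C                               # collocation in the cardinal basis

# Same interpolation problem, two bases: cond(A) is large while
# cond(F) is (numerically) 1.
```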
I suppose the two principles to emerge from the above discussion are that the basis we have used thus far to describe the interpolation problem is not satisfactory from a computational point of view, and that in at least some of the cases under discussion (all of them one-dimensional) there are other bases which are superior. There are other ways to conceptualise the difficulties we experience with the radial basic functions. Most of them tend to grow at infinity, and have small value at zero. As a general principle, we would like a basis to mimic the B-spline basis. That is, we would like the basis to be local if possible — each basis function having a fairly small support around one of the interpolation points. The first people to make progress in this area were Dyn and Levin [11] in 1983. There is a later paper with Rippa [12] in 1986 which is also worth looking at. Their technique was based on the observation that if F(x) = |x|² ln|x|, and x ∈ R², then ∇⁴F = 8πδ. Here, ∇⁴ represents the bilaplacian, and δ is the Dirac delta distribution whose action on each rapidly decreasing function in 𝒮 is to evaluate it at zero. This description alone should alert us to the fact that ∇⁴F = 8πδ is a distributional equation, and as such must be handled with care. However, numerical analysts dash in
Computing with radial basic functions
where others fear to tread, and we can approximate the Laplacian as follows:
(∇²F)(x) ≈ h⁻² { F(x − he₁) + F(x + he₁) + F(x − he₂) + F(x + he₂) − 4F(x) }   (x ∈ ℝ²).

Here h is a real parameter, and e₁ and e₂ are the usual unit vectors in ℝ². Pictorially, we
can represent this approximation by the stencil shown in Figure 2.

FIG. 2. The stencil for the Laplacian.

The bilaplacian stencil
is shown in Figure 3. This observation is used in a straightforward way if the interpolation
points lie on a grid. Instead of using the thin-plate spline radial basic function to generate
a basis, one uses the appropriate linear combinations which represent the bilaplacian of
this function. Because one has a distributional equation relating this quantity to the δ
function, one does not expect to get the δ function exactly, but one certainly does expect
to get a function which decays rapidly at ∞, and this is exactly what happens. Dyn and
Levin provide some encouraging numerical results. Of course, there remains the problem
of what to do when the data is not gridded. Here one must develop first the appropriate
stencil for the Laplacian on a point by point basis. This may seem laborious, but in fact
the next few methods we will describe all compute better basis elements on a point by
point basis.
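The decay can be seen numerically by applying the five-point Laplacian twice to F(x) = |x|² ln|x| on a grid (the grid size and spacing below are illustrative assumptions):

```python
import numpy as np

h, n = 0.1, 81
xs = (np.arange(n) - n // 2) * h
X, Y = np.meshgrid(xs, xs, indexing="ij")
R2 = X**2 + Y**2
with np.errstate(divide="ignore", invalid="ignore"):
    F = np.where(R2 > 0, 0.5 * R2 * np.log(R2), 0.0)   # |x|^2 ln|x|, with F(0) = 0

def lap(U):
    # five-point discrete Laplacian, left as zero on the boundary ring
    L = np.zeros_like(U)
    L[1:-1, 1:-1] = (U[:-2, 1:-1] + U[2:, 1:-1]
                     + U[1:-1, :-2] + U[1:-1, 2:] - 4 * U[1:-1, 1:-1]) / h**2
    return L

B = lap(lap(F))       # discrete bilaplacian: a spike near 0, tiny elsewhere
c = n // 2
print(B[c, c], B[c + 5, c])
```

The value at the centre is large (the discrete counterpart of 8πδ), while a few grid spacings away the values are already negligible; it is these linear combinations of translates of F that form the localised basis of Dyn and Levin.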
Perhaps the most successful class of schemes of this nature — computing a new
basis on a point by point approach — comes from Beatson, Goodsell and Powell [2] and
Beatson, Cherrie and Mouat [1]. Their approach is perhaps simpler to appreciate and
implement than that of Dyn and Levin. They begin with the observations I made earlier
— what we are really after is the cardinal basis F_1, …, F_m with the property that F_i(x_j)
is 1 if i = j and is zero for all other values of i, j between 1 and m. However, because
this problem is as difficult to solve as the original one, we proceed as follows. Consider
Will Light
FIG. 3. The stencil for the bilaplacian.
the job of trying to construct F_i. This function is supposed to be 1 at x_i and zero at all
other points. Choose about 50 near neighbours of x_i, say y_1, …, y_50 ∈ {x_1, …, x_m}. This
choice must include x_i. Then take

F_i(x) = Σ_{j=1}^{50} a_j |x − y_j|² ln|x − y_j| + b·x + c   (x ∈ ℝ²).
We demand that

F_i(y_j) = 1 if y_j = x_i, and 0 otherwise,
and that the natural boundary conditions are also satisfied. Thus we are producing
approximate cardinal functions which have the value 1 at the required point, but are only
zero on about 50 neighbouring points. This suggestion is based on the fact, observed by
many workers, that such functions are often small elsewhere in the domain. We produce
some pictures to illustrate this. In the first (Figure 4), 289 points are spaced on a regular
grid in [0,1]^. The approximate cardinal function is based on the 13 points shown in bold
in Figure 4. Figure 5 illustrates the same situation, but now as shown the points used to
develop the cardinal function are all clustered in one corner of the domain. The effect is
to produce significant values at the opposite corner of the domain. One can infer from
this that whenever the data is pretty much uniformly distributed, the cardinal functions
using points well inside the domain will have good properties, while those at the edge
will be poor. Similarly, in a non-uniform distribution, those interior to a cloud of points
will behave well, while those at cloud boundaries might not.
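A sketch of this construction for one approximate cardinal function (the 289 scattered points and the chosen index are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((289, 2))              # scattered centres, as in the 289-point example
i, q = 0, 50                          # build F_i from its q = 50 nearest neighbours
nb = np.argsort(np.sum((X - X[i]) ** 2, axis=1))[:q]
pos = int(np.where(nb == i)[0][0])    # position of x_i among the neighbours
Y = X[nb]

def tps(D):
    # thin-plate spline kernel |x - y|^2 ln|x - y|, with value 0 at coincident points
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(D > 0, D**2 * np.log(D), 0.0)

# small augmented thin-plate spline system on the q neighbours
D = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)
Q = np.hstack([Y, np.ones((q, 1))])
M = np.block([[tps(D), Q], [Q.T, np.zeros((3, 3))]])
rhs = np.zeros(q + 3)
rhs[pos] = 1.0                        # cardinality: 1 at x_i, 0 at the other neighbours
coef = np.linalg.solve(M, rhs)

def Fi(Z):
    DZ = np.linalg.norm(Z[:, None] - Y[None, :], axis=2)
    return tps(DZ) @ coef[:q] + np.hstack([Z, np.ones((len(Z), 1))]) @ coef[q:]
```

By construction F_i is 1 at x_i and 0 at the other 49 neighbours; as the figures illustrate, how small it is elsewhere depends on where the neighbours sit in the domain.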
There are two methods for dealing with the difficulties which have shown up above.
FIG. 4. Approximate cardinal function with points central to the domain.

FIG. 5. Approximate cardinal function with points at one corner of the domain.
Firstly, one can pin all the cardinal functions at a fixed set of judiciously chosen points
— so that every cardinal function must have the value zero at these points. This is very
effective in the case of regularly spaced data, as Figures 6 and 7 show. One can imagine,
however, that a data set with a number of clouds might benefit from a judicious choice
of points at which to carry out the pinning. What one would really like is a method
which does not rely on any user intelligence in the choice of points. As mentioned before,
a desirable feature of a good basis function is one which decays at infinity. This decay
should be at some rate if possible. The Beatson, Cherrie and Mouat prescription for
thin-plate splines in ℝ² is that the elements should decay like |x|⁻¹ as |x| → ∞. There is a
problem here, in that if we opt for decay elements everywhere, then we will not obtain
a basis for our space. To get around this problem, we accept an element F_i as a decay
element if it satisfies the additional moment condition

Σ_{j=1}^{50} a_j |y_j|² = 0
FIG. 6. Approximate pinned cardinal function with points central to the domain.

FIG. 7. Approximate pinned cardinal function with points at one corner of the domain.
and
|F_i(x)| = O(|x|⁻¹)   as   |x| → ∞.
Otherwise, we use the F_i which is defined by the previous conditions of cardinality. Again
there are a few bells and whistles needed to make this method operate efficiently, but
we hope that sufficient detail is present for the reader to be able to see the general idea.
All the above methods are providing ways of constructing a better conditioned basis
with which to solve the problem. A method still has to be selected to invert the matrix
associated with the new basis, which is now much better conditioned than the original
matrix corresponding to the conventional basis. The method of choice for most authors
is some version of GMRES.
Beatson called the points at which decay could be obtained 'good' points, and points
at which decay could not be obtained 'bad' points. This idea has been built on in a
recent technical report by Beatson and Levesley [4]. The general spirit is to define good
and bad points in the same way as Beatson, and then to develop an iterative solver,
solving first on the good, then the bad, then returning to the good and so on.
Finally, a very successful method has recently emerged from the researches of Beatson,
Light and Billings [6]. This method has the advantage that it is a fast iterative solver
which may be regarded as a preconditioner in its own right (thus it may be combined
with a solver such as GMRES). We will describe it here as a solver. It is essentially
the domain decomposition method, although as with previous solvers, our description
will be very much at a 'bare bones' level, and the interested reader is referred to [6] for
the fine details, which include some error estimates, some interesting comments on an
alternative basis, and a good deal of theory. We shall describe the method as applied to
data on the unit square [0,1]² in ℝ², and we will not make any attempt to make the
method adaptive in character. The reader will be able to see these improvements for
herself. We will test our method on randomly chosen data in [0,1]².
We begin with a set of nodes X = {x_1, …, x_m} at which interpolation is to be
carried out. We will describe the algorithm as it is implemented for solving the thin-plate
spline interpolation problem on the node set X. We divide up the square [0,1]²
into a fairly large number of sub-domains X_1, …, X_ℓ. There are two constraints on these
subdomains. It is important that they are constructed so that about equal numbers of
points lie in each subdomain — about 50 points per subdomain is ideal. Secondly, it is
essential that each subdomain overlap all surrounding subdomains. In our terminology,
two subdomains overlap if they have a (small) number of points in common. In each
subdomain there are some points in X which lie only in that subdomain and not in any
other. We call these points the inner points of the subdomain. A coarse set Y of inner
points in the node set X is also chosen. We will say more about this coarse set in a
moment, but at this stage it simply consists of a small number of inner points from each
subdomain. The algorithm then constructs the interpolant s, and proceeds as follows.
We initialise the interpolant s as s = 0. We want to solve the equations

d_j = s(x_j) = Σ_{i=1}^{m} a_i |x_j − x_i|² ln|x_j − x_i| + α·x_j + β,   (j = 1, …, m),   (3.3)

subject to the boundary conditions

Σ_{i=1}^{m} a_i = Σ_{i=1}^{m} a_i s_i = Σ_{i=1}^{m} a_i t_i = 0,   (3.4)

where x_i = (s_i, t_i). In matrix form these equations are

( A    Q ) ( a )   =   ( d )
( Qᵀ   0 ) ( b )       ( 0 ),

as we have already seen. Our method will operate by residual correction, so we begin by
setting r = d.
It is important to recall that α is a vector of length 2, which we write as α = (α₁, α₂).
Suppose now we have begun our iterative procedure and generated an approximation s
with a residual r. The next few steps describe how to update the approximation and the
residual.
Step 1. We construct s_1, …, s_ℓ, such that each s_i is an interpolant based only on the
points of the subdomain X_i, using as data the residual vector r restricted to X_i.

Step 2. For each inner point x we now have a single real number a_x, which is the
coefficient of |· − x|² ln|· − x|. If we look at the collection of coefficients belonging to all
the inner points of all domains, then this collection is not in general orthogonal to π₁.
That is, they fail to satisfy boundary conditions of the type given in Equation (3.4). We
now correct so that the collection of coefficients corresponding to all inner points of all
domains is orthogonal to π₁.

Step 3. We set

S₁ = Σ { a_x |· − x|² ln|· − x| : x is an inner point }.   (3.5)

Step 4. We evaluate the residual r − S₁ at the coarse grid points, and then construct
the interpolant S₂ to this residual on the coarse grid points Y.

Step 5. We update s by s ← s + S₁ + S₂. The new residual is then given by z = (z_1, …, z_m),
where

z_i = d_i − s(x_i),   i = 1, …, m.
This iterative process can either be continued to convergence, or used as a preconditioner,
followed by GMRES. Table 2 shows some run times taken to reduce the error below a
fixed tolerance for the Franke 1 function (see [13] for the definition of this function).
Random nodes were generated in [0,1]² and an Intel Celeron PC was used. Recently,
    Number of nodes   Number of iterations   Time (seconds)
     10,000                    8                    7.0
     20,000                    8                   17.5
     40,000                    6                   35.5
     80,000                    6                  105.7
    160,000                    7                  407.8

TAB. 2. Run times for domain decomposition.
the group at Leicester, using a twin processor Compaq PC, has obtained solutions to
a problem with 1,000,000 random points in less than 9 minutes, and we can safely say
that the combination of domain decomposition methods and multipole fast evaluation
has produced a robust and effective method. Most practitioners will be aware of other
ways to run a domain decomposition algorithm. In particular, one can use a nesting
approach where one starts with only four subdomains each containing large numbers
of points. To solve each subdomain problem, one subdivides again and does domain
decomposition in the subdomain.
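A much simplified sketch of the residual-correction sweep: each local interpolant is added in full, and the Step 2 coefficient correction and the Step 4 coarse-grid solve are omitted; the point count, tiling and test function are illustrative assumptions:

```python
import numpy as np

def tps_kernel(X, Y):
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(D > 0, D**2 * np.log(D), 0.0)

def tps_solve(X, d):
    # solve the augmented system (A Q; Q^T 0)(a; b) = (d; 0)
    m = len(X)
    Q = np.hstack([X, np.ones((m, 1))])
    M = np.block([[tps_kernel(X, X), Q], [Q.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(M, np.concatenate([d, np.zeros(3)]))
    return sol[:m], sol[m:]

def tps_eval(X, a, b, Z):
    return tps_kernel(Z, X) @ a + np.hstack([Z, np.ones((len(Z), 1))]) @ b

rng = np.random.default_rng(0)
X = rng.random((200, 2))
d = np.sin(4 * X[:, 0]) * np.cos(4 * X[:, 1])

# overlapping subdomains: a 2x2 tiling of [0,1]^2, each tile enlarged generously
doms = []
for ix in range(2):
    for iy in range(2):
        lo = np.array([ix, iy]) * 0.5 - 0.1
        doms.append(np.where(np.all((X >= lo) & (X <= lo + 0.7), axis=1))[0])

s, r = np.zeros(200), d.copy()
for sweep in range(30):
    for idx in doms:                       # Step 1: local interpolant to the residual
        a, b = tps_solve(X[idx], r[idx])
        s += tps_eval(X[idx], a, b, X)     # add its values at all nodes
        r = d - s                          # Step 5: new residual
print(np.max(np.abs(r)))
```

Even this bare version shrinks the residual sweep by sweep; the coarse-grid correction of Step 4 is what makes the full method fast when the number of subdomains is large.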
Bibliography
1. Beatson, R.K., J.B. Cherrie and C.T. Mouat, Fast fitting of radial basis functions:
methods based on preconditioned GMRES iteration, Advances in Computational
Mathematics 11 (1999), 253-270.
2. Beatson, R.K., G. Goodsell and M.J.D. Powell, On multigrid techniques for thin
plate spline interpolation in two dimensions, Lectures in Applied Mathematics 32
(1996), 77-97.
3. Beatson, R.K. and L. Greengard, A short course on fast multipole methods, in
Wavelets, Multilevel Methods and Elliptic PDEs, M. Ainsworth, J. Levesley, W.A. Light
and M. Marletta (eds), Oxford University Press, Oxford (1997), 1-38.
4. Beatson, R.K. and J. Levesley, Good point/bad point iterations for solving the
thin-plate spline interpolation equations, University of Leicester Technical Report
2001/34 (2001).
5. Beatson, R.K. and W.A. Light, Fast evaluation of radial basis functions: Methods
for two-dimensional polyharmonic splines, IMA Journal of Numerical Analysis 17
(1997), 343-372.
6. Beatson, R.K., W.A. Light and S. Billings, Domain decomposition methods for solution
of the radial basis function interpolation problem, SIAM Journal on Scientific
Computing 22(5) (2001), 1717-1740.
7. Beatson, R.K., J. Levesley and W.A. Light, Fast evaluation of radial basic functions
on spheres, preprint.
8. Beatson, R.K. and G.N. Newsam, Fast evaluation of radial basis functions I, Computers
and Mathematics with Applications 24(12) (1992), 7-19.
9. Beatson, R.K. and M.J.D. Powell, An iterative method for thin plate spline interpolation
that employs approximations to the Lagrange functions, in Numerical Analysis
1993, D.F. Griffiths and G.A. Watson (eds), Longmans, Harlow, 1994.
10. Cheney, E.W. and W.A. Light, A Course in Approximation Theory, Brooks/Cole,
Pacific Grove, CA, 1999.
11. Dyn, N. and D. Levin, Iterative solution of systems originating from integral equations
and surface interpolation, SIAM Journal on Numerical Analysis 20 (1983), 377-390.
12. Dyn, N., D. Levin and S. Rippa, Numerical procedures for surface fitting of scattered
data by radial functions, SIAM Journal on Scientific and Statistical Computing 7
(1986), 639-659.
13. Franke, R., Scattered data interpolation: Tests of some methods, Mathematics of
Computation 38 (1982), 181-200.
14. Mairhuber, J.C., On Haar's theorem concerning Chebychev approximation problems
having unique solutions, Proceedings of the American Mathematical Society 7 (1956), 609-615.
15. Micchelli, C.A., Interpolation of scattered data: distance matrices and conditionally
positive definite functions, Constructive Approximation 2 (1986), 11-22.
16. Sibson, R. and G. Stone, Computation of thin-plate splines, SIAM Journal on
Scientific and Statistical Computing 12 (1991), 1304-1313.
Application of orthogonalisation procedures for
Gaussian radial basis functions and
Chebyshev polynomials
John C Mason and Andrew Crampton
School of Computing and Mathematics, University of Huddersfield, Huddersfield, UK.
j.c.mason@hud.ac.uk, a.crampton@hud.ac.uk
Abstract
Procedures for orthogonalisation of Gaussians and B-splines are recalled, and it is shown
that, provided the Gaussians are negligible in appropriate regions, the same recurrence
formulae may be adopted in both cases, rendering the computation relatively efficient.
Chebyshev polynomial collocation is well known to be rapidly defined by discrete
orthogonalisation, and similar ideas are commonly applicable to partial differential
equations (PDEs) and integral equations (IEs). However, it is shown that the most
elementary mixed methods (both boundary conditions and PDEs being satisfied) for
the Dirichlet problem in rectangular types of domain can lead to a singular linear
system, which may be rendered non-singular, for example, by a small modification of
the interpolation nodes.
1 Introduction
Gaussian radial basis functions (RBFs) are negligible outside a certain range, which
depends on the accuracy required and the exponent used. For example, if four decimal
place accuracy is sufficient, then outside [−2, 2] the function e^{−λx²} is negligible for
λ ≥ 2.5. Indeed the translated RBFs

φ_i(x) = e^{−λ(x−i)²},   i = −1, 0, …, n+1,   (1.1)

resemble, at least superficially, a set of translated cubic B-splines, each having a support
of four sub-intervals of length one, contained in [i − 2, i + 2].
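The negligibility threshold can be checked directly:

```python
import math

lam = 2.5
# at |x| = 2 the Gaussian has already fallen below the four-decimal threshold,
# and it only decreases further outside [-2, 2]
print(math.exp(-lam * 2.0**2))   # about 4.5e-05
```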
Following work of Mason et al [4] and Goodman et al [1], we show that these RBFs,
rounded to the required accuracy, may be conveniently and efficiently orthogonalised so
that
(i) a 4-term recurrence may be adopted, identical to the one in [4] for cubic B-splines,

(ii) inner products may be determined very simply in terms of 4 parts of a normal distribution,

(iii) a well conditioned calculation results, and best ℓ₂ approximations may be obtained immediately with an orthogonalised basis,

(iv) a continuous or discrete inner product (and best approximation) may be adopted.
In a second application of orthogonalisation, this time to polynomials, it is shown
that a two-dimensional (n+1) × (n+1) polynomial collocation problem, which includes
amongst its nodes n Chebyshev polynomial zeros on each of 4 sides of a square, leads to
a singular (rank one deficient) system. For all n, one superfluous equation is readily
identified and a suitable replacement equation is readily found. Discrete orthogonalisation is
used to combine and greatly simplify the equations and prove singularity.
2 Orthogonalised Gaussians
An orthogonal system {P_k} is developed from the Gaussians φ_i in (1.1) using

P_k = φ_k − a_{k1} P_{k−1} − a_{k2} P_{k−2} − a_{k3} P_{k−3},   k = −1, …, n+1,   (2.1)

where a_{1,3} = a_{0,3} = a_{0,2} = a_{−1,3} = a_{−1,2} = a_{−1,1} = 0.
Now we define coefficients b_{kr}, for r = 0, …, k+1 and k = −1, …, n+1, as the inner
products

b_{kr} = (φ_k, φ_{k−r}) = ∫_{I_{k,r}} φ_k(x) φ_{k−r}(x) dx,   (2.2)

where I_{k,r} is the common support of φ_k and φ_{k−r}, and the normalising constants n_k are the
squared norms

n_k = ‖P_k‖² = (P_k, P_k),   (2.3)

where (·, ·) is the inner product in (2.2) and ‖·‖ is the corresponding norm.
Then, setting (P_k, P_{k−r}) = 0 for r = 1, 2, 3 gives

(φ_k, P_{k−r}) = a_{kr} n_{k−r}.   (2.4)

Taking the inner product of (2.1) with itself gives
n_k = b_{k0} + Σ_{r=1}^{3} [ −2 a_{kr} (φ_k, P_{k−r}) + a_{kr}² n_{k−r} ],   (2.5)

which, by using (2.4), gives

n_k = b_{k0} − Σ_{r=1}^{3} a_{kr}² n_{k−r}.   (2.6)

This is the first basic equation for writing {n_k} in terms of {a_{kr}} and {b_{kr}}.
Now, using (2.1) with k replaced by k−1, k−2, k−3, we obtain

(φ_k, P_{k−3}) = b_{k3} = a_{k3} n_{k−3},   (2.7)

(φ_k, P_{k−2}) = b_{k2} − a_{k−2,1} (φ_k, P_{k−3}).

Hence

a_{k2} n_{k−2} = b_{k2} − a_{k−2,1} b_{k3}.   (2.8)

Finally

(φ_k, P_{k−1}) = b_{k1} − a_{k−1,1} (φ_k, P_{k−2}) − a_{k−1,2} (φ_k, P_{k−3}),

so that

a_{k1} n_{k−1} = b_{k1} − a_{k−1,1} (a_{k2} n_{k−2}) − a_{k−1,2} b_{k3}.   (2.9)
Equations (2.6), (2.7), (2.8) and (2.9) may be solved to determine all the required
coefficients {a_{kr}} and {n_k} explicitly by substitution, starting from n_{−1} = ‖φ_{−1}‖². This
involves O(n) operations for n+3 basis functions. The best approximation to a function
f (either continuous f = f(x) or discrete f = (f_1, …, f_m)ᵀ) by orthogonalised
Gaussians may be determined explicitly as

s = Σ_{j=−1}^{n+1} c_j P_j,   where c_j = (P_j, P_j)⁻¹ (f, P_j) = n_j⁻¹ (f, P_j).

2.1 Numerical example
Here we use the procedure for constructing orthogonalised Gaussians to produce an
interpolant to data obtained from a fast response oscilloscope¹. To the left of Figure 1
we see the first three orthogonalised Gaussian functions, with centres specified at the
integers −1, 0 and 1, with support growing from left to right. The figure on the right
shows the oscilloscope data and the fitted o-Gaussian interpolant.

¹ Oscilloscope data supplied by Centre for Electromagnetic and Time Metrology, National Physical
Laboratory, London, UK.
In this example we use 512 centres and choose λ = 2.5 in (1.1). Since our choice
for λ requires only four decimal place accuracy, the normal equations produce the usual
identity matrix, and the coefficient vector (c_{−1}, …, c_{n+1}) can then be determined by the
equations c = Aᵀf, where f = (f_1, …, f_m) and A_{ij} = P_j(x_i). The fit is extremely good
and vindicates the neglecting of the Gaussians outside the interval considered.
FIG. 1. First three orthogonalised basis functions and o-Gaussian fit to oscilloscope data.

2.2 Extensions to orthogonalised Gaussians
The following extensions are clearly possible.

(i) Use of generally placed centres (knots) and/or a discrete inner product.

(ii) Use of higher dimensions, as in Anderson et al [2].

(iii) Replacement of the interval (−∞, ∞) in a continuous norm by [0, n], and of [0, n] by [0, 1] using scaling.

(iv) Consideration of a function with wider (approximate) support, such as [−3, 3] or more generally [−r, r] for r > 2.

3 Chebyshev polynomials in two-dimensional collocation
The (first kind) Chebyshev polynomial T_i(x) of degree i is defined by

T_i(x) = cos iθ,   i = 0, …, m,   −1 ≤ x ≤ 1,   (3.1)

where x = cos θ and 0 ≤ θ ≤ π.
Among its many properties is the discrete orthogonality property

Σ_{k=1}^{m} T_i(x_k) T_j(x_k) =  0   for i ≠ j, i, j ≤ m−1;
                                 m   for i = j = 0;
                                 m/2 for i = j ≠ 0,   (3.2)

where x_k are the m zeros of T_m(x), namely
x_k = cos( (2k − 1)π / (2m) ),   k = 1, …, m.   (3.3)
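Property (3.2) at the nodes (3.3) is easy to verify numerically; a small sketch with m = 8 (an illustrative choice):

```python
import numpy as np

m = 8
k = np.arange(1, m + 1)
xk = np.cos((2 * k - 1) * np.pi / (2 * m))       # the m zeros of T_m, eq. (3.3)
T = lambda i, x: np.cos(i * np.arccos(x))        # T_i on [-1, 1]

dot = lambda i, j: np.sum(T(i, xk) * T(j, xk))
print(dot(2, 5), dot(0, 0), dot(3, 3))           # 0, m and m/2, as in (3.2)
```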
The orthogonality property of (3.2) is not a unique one amongst the Chebyshev polynomials of four kinds. Indeed, Mason and Venturino [5] showed that there are at least
fourteen such formulae, depending on alternative weights, choices of Chebyshev-related
abscissae and kinds of Chebyshev polynomial.
3.1 The elliptic problem — mixed methods
Let us now exploit this property (3.2) in a pseudo-spectral method for a linear elliptic
PDE problem on a square. The PDE

L u = f(x, y),   |x|, |y| < 1,   (3.4)

subject to

u = g(x, y),   (3.5)

where g(x, y) is a function known explicitly only on x = ±1 and y = ±1, can be solved
approximately in the form

u_{mn} = Σ'_{i=0}^{m} Σ'_{j=0}^{n} a_{ij} T_i(x) T_j(y),   (3.6)

where a dashed summation denotes that the first term in a sum is halved.
To obtain equations for a_{ij}, we solve

L u_{mn} = f   at the (m−1) × (n−1) zeros of T_{m−1}(x) T_{n−1}(y),   (3.7)

u_{mn} = g   on x = ±1 at the zeros of T_n(y)   (2n equations),   (3.8)

u_{mn} = g   on y = ±1 at the zeros of T_m(x)   (2m equations).   (3.9)
Together (3.7)-(3.9) form (m+1) × (n+1) equations for {a_{ij}}. However, we claim that
the included equations (3.8), (3.9) are singular of joint rank 2m + 2n − 1. If this is so,
then the system is singular without consideration of the PDE collocation equations (3.7).
The equations (3.8), (3.9) become

g_{k,±1} = Σ'_{i=0}^{m} Σ'_{j=0}^{n} a_{ij} T_i(x_k) T_j(±1),

g_{±1,ℓ} = Σ'_{i=0}^{m} Σ'_{j=0}^{n} a_{ij} T_i(±1) T_j(y_ℓ),   (3.10)

where x_k, y_ℓ are zeros of T_m(x), T_n(y) respectively, and where

g_{k,1} = g(x_k, 1),   g_{k,−1} = g(x_k, −1),   g_{1,ℓ} = g(1, y_ℓ),   g_{−1,ℓ} = g(−1, y_ℓ).
If we add/subtract the first pair and also the second pair of equations in (3.10), noting
that

T_j(1) = 1,   T_j(−1) = (−1)^j,

we deduce that

e_k^{(0)} = Σ'_{i=0}^{m} Σ'_{j=0, j even}^{n} a_{ij} T_i(x_k),   e_k^{(1)} = Σ'_{i=0}^{m} Σ'_{j=0, j odd}^{n} a_{ij} T_i(x_k),   k = 1, …, m,   (3.11)

e_ℓ^{(0)} = Σ'_{i=0, i even}^{m} Σ'_{j=0}^{n} a_{ij} T_j(y_ℓ),   e_ℓ^{(1)} = Σ'_{i=0, i odd}^{m} Σ'_{j=0}^{n} a_{ij} T_j(y_ℓ),   ℓ = 1, …, n,   (3.12)

where

e_k^{(0)} = ½(g_{k,1} + g_{k,−1}),   e_k^{(1)} = ½(g_{k,1} − g_{k,−1}),
e_ℓ^{(0)} = ½(g_{1,ℓ} + g_{−1,ℓ}),   e_ℓ^{(1)} = ½(g_{1,ℓ} − g_{−1,ℓ}).
Multiplying (3.11) by 2T_r(x_k)/m and summing over k, and multiplying (3.12) by
2T_s(y_ℓ)/n and summing over ℓ, discrete orthogonality (3.2) gives

R_r^{(0)} := Σ'_{j=0, j even}^{n} a_{rj} = b_r^{(0)},   R_r^{(1)} := Σ'_{j=0, j odd}^{n} a_{rj} = b_r^{(1)},   r = 0, …, m−1,   (3.13)

C_s^{(0)} := Σ'_{i=0, i even}^{m} a_{is} = c_s^{(0)},   C_s^{(1)} := Σ'_{i=0, i odd}^{m} a_{is} = c_s^{(1)},   s = 0, …, n−1,   (3.14)

where

b_r^{(γ)} = (2/m) Σ_{k=1}^{m} e_k^{(γ)} T_r(x_k),   c_s^{(γ)} = (2/n) Σ_{ℓ=1}^{n} e_ℓ^{(γ)} T_s(y_ℓ),   γ = 0, 1.
This constitutes a greatly simplified system to replace (3.10). Indeed we may verify
that, for m = n,

Σ'_{i=0, (m−i) odd}^{m−1} R_i^{(ε)} = Σ'_{i=0, (m−i) odd}^{m−1} C_i^{(ε)},   (3.15)

where ε = 0, 1 for m odd, even, respectively, and hence that the equations (3.13) and
(3.14) are singular. For example, for m (= n) = 2, we seek equations in a_{00}, …, a_{22}, and
(3.13) gives
(3.13) gives
j?(o) =
i„__ +
^ ao2.
„_.
nf^
= |aoo
R(0)
-"■2
_ 1
,
= 2^*10 + ^^12,
p(i) = „„.
M"=«oi,
ti2
= an,
(3.16)
meanwhile (3.14) gives
C_0^{(0)} = ½a_{00} + a_{20},   C_1^{(0)} = ½a_{01} + a_{21},   C_0^{(1)} = a_{10},   C_1^{(1)} = a_{11}.   (3.17)
Clearly R_1^{(1)} = C_1^{(1)}, consistent with (3.15) for m = 2. Which equation do we eliminate?
For simplicity, in the case of m even, we delete the equation for C_{m−1}^{(1)} and replace it by
the equation for R_m^{(0)}. It is easy to verify that, within the system (3.13) and (3.14), this
leads to full rank, and R_m^{(0)} is equivalent to boundary specifications of either of

u(0, 1) + u(0, −1)   or   u(1, 1) + u(−1, 1) + u(1, −1) + u(−1, −1).   (3.18)

For m = n = 2, this is equivalent to

R_2^{(0)} = ½a_{20} + a_{22}.   (3.19)
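The rank deficiency for m = n = 2 can be checked directly by writing the eight equations (3.16), (3.17) as rows acting on the coefficient vector (a_00, a_01, a_02, a_10, a_11, a_12, a_20, a_21, a_22); the row ordering is an illustrative choice:

```python
import numpy as np

# columns: a00  a01  a02  a10  a11  a12  a20  a21  a22
rows = [
    [0.5, 0,   1,   0,   0,   0,   0,   0,   0],   # R_0^(0)
    [0,   0,   0,   0.5, 0,   1,   0,   0,   0],   # R_1^(0)
    [0,   1,   0,   0,   0,   0,   0,   0,   0],   # R_0^(1)
    [0,   0,   0,   0,   1,   0,   0,   0,   0],   # R_1^(1)
    [0.5, 0,   0,   0,   0,   0,   1,   0,   0],   # C_0^(0)
    [0,   0.5, 0,   0,   0,   0,   0,   1,   0],   # C_1^(0)
    [0,   0,   0,   1,   0,   0,   0,   0,   0],   # C_0^(1)
    [0,   0,   0,   0,   1,   0,   0,   0,   0],   # C_1^(1): duplicates R_1^(1)
]
M = np.array(rows)
rank_before = np.linalg.matrix_rank(M)              # 7 = 2m + 2n - 1
M[-1] = [0, 0, 0, 0, 0, 0, 0.5, 0, 1]               # swap in R_2^(0) = a20/2 + a22
rank_after = np.linalg.matrix_rank(M)               # 8: full rank restored
print(rank_before, rank_after)
```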
In the case when m is odd, we delete the equation for C_{m−1}^{(0)} and replace it by the
equation for C_m^{(1)}, the latter being equivalent to adding four boundary point conditions
anti-symmetrically, i.e.,

u(1, 1) − u(−1, 1) + u(−1, −1) − u(1, −1).   (3.20)
If g(x, y) is known everywhere in the square, then we could of course consider replacing a
mixed collocation problem by an interior collocation problem by including the boundary
conditions automatically in the form of approximations. For example, we could replace
the form (3.6) by

u_{mn} = (x² − 1)(y² − 1) Σ'_{i=0}^{m−2} Σ'_{j=0}^{n−2} a_{ij} T_i(x) T_j(y) + g(x, y),   (3.21)

or by an alternative form such as

u_{mn} = Σ'_{i=0}^{m} Σ'_{j=0}^{n} a_{ij} (T_i(x) − T̄_i(x)) (T_j(y) − T̄_j(y)) + g(x, y),
where T̄_i = T_0(x) or T_1(x) according as i is even or odd. These forms have the
disadvantage of being difficult to generalise to other kinds of (non-rectangular) boundaries,
although (3.21) is adaptable to the case where an equation of the boundary is known
(see Mason [3]).
The best Chebyshev method available for the Poisson problem on a rectangle is
probably a "differentiation matrix" method, such as is described in Trefethen [6], which
represents the solution by nodal values rather than Chebyshev coefficients.
Acknowledgement: We thank the referees for their perceptive remarks.
Bibliography
1. T. N. T. Goodman, C. A. Micchelli, G. Rodriguez and S. Seatzu, On the Cholesky
factorization of the Gram matrix of locally supported functions, BIT 35(2), 1995,
233-257.
2. I. J. Anderson, J. C. Mason, G. Rodriguez and S. Seatzu, Training radial basis
function networks using separable and orthogonalised Gaussians, in Mathematics
of Neural Networks, S. W. Ellacott, J. C. Mason and I. J. Anderson (eds), Kluwer,
1997, 265-269.
3. J. C. Mason, Chebyshev polynomial approximations for the L-membrane eigenvalue
problem, SIAM J. Appl. Math. 15 (1967), 172-186.
4. J. C. Mason, G. Rodriguez and S. Seatzu, Orthogonal splines based on B-splines
with applications to least squares, smoothing and regularisation problems, Numerical
Algorithms 5 (1993), 25-40.
5. J. C. Mason and E. Venturino, Integration methods of Clenshaw-Curtis type based
on four kinds of Chebyshev polynomials, in Multivariate Approximation and Splines,
G. Nuernberger, J. W. Schmidt and G. Walz (eds), Birkhauser, Basel, 1997, 158-165.
6. L. N. Trefethen, Spectral Methods in MATLAB, SIAM, 2000.
Geometric knot selection for radial scattered data
approximation
Rossana Morandi and Alessandra Sestini
Dipartimento di Energetica, Università di Firenze, Italy.
morandi@de.unifi.it, sestini@de.unifi.it
Abstract
Scattered exact and non-exact data are approximated by means of radial basis functions
with compact support and the related knot selection is based on the information given
by the discrete Gaussian curvature defined on a data triangulation. In case of non-exact
data, a strategy to obtain a sign-reliable estimate of its distribution is given, extending
an approach already studied by the authors for non-exact 2D data.
1 Introduction
It is well known that, for any interpolation/approximation scheme, data shape preservation is often a desirable quality and, as a consequence, the determination of some criteria
to establish the data shape is a very important topic. For this purpose, the use of the
discrete curvature in case of exact 2D data is a standard approach. On the other hand, in
case of non-exact data, the proposal in [6] allows the determination of a reasonable and
sign-reliable discrete curvature estimate if the maximum data error is a priori given. In
recent literature, interesting formulas have been introduced [3, 4] for defining the discrete
Gaussian curvature when scattered 3D exact data are given and a related triangulation
is assigned. Starting from these formulas, the approach considered in [6] is extended to
the case of 3D scattered non-exact data in order to define a reasonable and sign-reliable
estimate of the Gaussian curvature at the data points thereby obtaining important shape
information. Thus we get some suggestions for determining the supports of the local radial basis functions [8] used in the approximation scheme together with the number, the
position and the multiplicity of the related knots. The result is a good approximating
surface (in particular with respect to its shape) with a high data reduction [2, 7].
The outline of the paper is as follows. In Section 2 the discrete Gaussian curvature is
defined and an inequality is given to check its sign-reliability in case of non-exact data.
In Section 3 the approximation scheme is presented and the knot selection strategy is
given. Finally, in Section 4 some numerical results are presented to illustrate the features
of the proposed approach.
2 Information about the shape
In this section, following the approach presented in [3, 4], we define the discrete Gaussian
curvature (dGc) to obtain information about the shape suggested by the data. For this
purpose, we need the following notation:

• V_xy := {X_j = (x_j, y_j), j = 1, …, N} ⊂ ℝ² is the set of the assigned distinct vertices on the xy-plane;

• V := {P_j = (X_j, z_j), j = 1, …, N} ⊂ ℝ³ is the data set, with z_j = f(X_j);

• T := {I_j ∈ ℕ³, 1 ≤ I_{kj} ≤ N, k = 1, 2, 3, j = 1, …, T} is a given triangulation of V_xy.
Thus, for any X_j ∈ V_xy not belonging to the boundary of the convex hull of V_xy, we can
define the integral Gaussian curvature with respect to a related area S_j [3],

K̂_j := 2π − Σ_{k=1}^{n_j} α_k^{(j)},

where the angles α_k^{(j)}, k = 1, …, n_j, are as follows:

α_k^{(j)} := angle(e_k^{(j)}, e_{k+1}^{(j)}),   e_k^{(j)} := V_k^{(j)} − P_j,   k = 1, …, n_j,   e_{n_j+1}^{(j)} := e_1^{(j)},

and {V_1^{(j)}, …, V_{n_j}^{(j)}} ⊂ V is the set of ordered neighboring points of P_j given by the
assigned triangulation. To derive the curvature at the vertex P_j from the above integral
value, we normalize by the Voronoi area S_j [4]:

K_j := K̂_j / S_j.   (2.1)

If X_j is on the boundary of the convex hull of V_xy, some suitable auxiliary "phantom"
points should be defined in order to obtain a reliable estimate of the Gaussian curvature
from (2.1).
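A minimal sketch of the angle-deficit computation at one vertex (the flat ring and lifted apex below are illustrative test configurations, not taken from the paper):

```python
import numpy as np

def angle_deficit(P, nbrs):
    # integral Gaussian curvature estimate at P: 2*pi minus the sum of the
    # angles between consecutive edge vectors e_k = V_k - P
    E = nbrs - P
    total = 0.0
    for k in range(len(E)):
        a, b = E[k], E[(k + 1) % len(E)]
        c = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        total += np.arccos(np.clip(c, -1.0, 1.0))
    return 2.0 * np.pi - total

t = np.linspace(0.0, 2.0 * np.pi, 7)[:-1]
ring = np.column_stack([np.cos(t), np.sin(t), np.zeros(6)])
k_flat = angle_deficit(np.zeros(3), ring)                 # flat neighbourhood: ~0
k_apex = angle_deficit(np.array([0.0, 0.0, 0.5]), ring)   # lifted apex: positive
print(k_flat, k_apex)
```

A flat neighbourhood gives zero deficit, while lifting the central point above the same ring produces a positive value, as expected for convex data.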
FIG. 1. The triangulation (left) and the discrete Gaussian curvature (right).
Shown on the left of Figure 1 is the Delaunay triangulation related to a set V_xy of
441 scattered vertices in the unit square, and shown on the right is the discrete Gaussian
curvature distribution related to the Franke function sampled on V_xy.
In case of non-exact data, we need to check the sign-reliability of K_j for deriving
some useful information about the shape suggested by the data. For this purpose, we
use the theorem below, where

c_k^{(j)} := (e_k^{(j)} · e_{k+1}^{(j)}) / (|e_k^{(j)}| |e_{k+1}^{(j)}|),   k = 1, …, n_j,   K̃_j := 2π − 2 Σ_{k=1}^{n_j} (1 − c_k^{(j)}).   (2.2)

Remark 2.1 K̃_j is an approximation of K̂_j obtained by replacing the angle α_k^{(j)} with
2(1 − c_k^{(j)}), k = 1, …, n_j.
Theorem 2.2 Let P_j ∈ ℝ³, j = 1, …, N, be assigned distinct non-exact data points
and let ε be a positive quantity such that |P_j − P_j^e| ≤ ε, j = 1, …, N, where P_j^e is the
(unknown) exact data point corresponding to P_j. If ε is sufficiently small and

8ε Σ_{k=1}^{n_j} ( |e_k^{(j)}|⁻¹ + |e_{k+1}^{(j)}|⁻¹ ) < |K̃_j|,   (2.3)

then

K̃_j K̃_j^e > 0,

where K̃_j^e is defined as K̃_j using the exact data points.
Proof: Let us consider a point Pj and its neighboring points {Vj ,..., V„j'} C V and
let us write the corresponding (unknown) exact points as follows
P'j
:=Pj-eoWo,
with 0 < eo,ei,...,e„^ < e and |wo| = |wi| = ••• = |w„J = 1.
So, if e is sufficiently small, we can define the non-zero vectors
r,Uh ._
and we have
Y(J)O
_ pe
file
(i)
^k
=e^ -e/tWfc-t-eoWQ.
Thus, if
{j)e ._ ^k
^k+1
.
-0)e I \c^J)f-1 '
Cr.
using a first order Taylor approximation, we obtain
\^k I I^A:+1I
Geometric knot selection
where A_k and B_k collect the first order terms in ε_0, ..., ε_{n_j}.    (2.4)

Thus, we can write

K̃_j = K̃_j^e + (2/S_j) Σ_{k=1}^{n_j} (A_k + B_k) / (|e_k^{(j)e}| |e_{k+1}^{(j)e}|).

So, if ε is sufficiently small, K̃_j K̃_j^e > 0 if

K̃_j^e ( K̃_j^e + (2/S_j) Σ_{k=1}^{n_j} (A_k + B_k) / (|e_k^{(j)e}| |e_{k+1}^{(j)e}|) ) > 0,

and this is true if

(2/S_j) Σ_{k=1}^{n_j} (|A_k| + |B_k|) / (|e_k^{(j)e}| |e_{k+1}^{(j)e}|) < |K̃_j^e|.    (2.5)

Now, from (2.4) it is easy to verify that |A_k| ≤ 2ε(|e_k^{(j)e}| + |e_{k+1}^{(j)e}|) and |B_k| ≤ 2ε(|e_k^{(j)e}|⁻¹ + |e_{k+1}^{(j)e}|⁻¹) |e_k^{(j)e}| |e_{k+1}^{(j)e}|. Using these inequalities, after a little algebra, we obtain that, if ε is sufficiently small, (2.3) implies (2.5).
□
If ε is an assigned small positive quantity such that |P_j − P_j^e| < ε, j = 1, ..., N, and (2.3) holds, we use (2.1) to define K_j because we consider it sign-reliable. Otherwise, we try to get information about the sign of the Gaussian curvature at the point P_j, repeating the check after substituting the neighboring points of P_j with other suitable new n_j points. In particular, these are chosen among the neighbors of all the V_k^{(j)}, k = 1, ..., n_j, and they are spaced as uniformly as possible with respect to the azimuth (defined relative to P_j). If after this substitution (2.3) holds, the new neighboring points are used to define K_j through (2.1); otherwise this strategy is repeated until we consider the new neighbors too far from P_j. In this last case, we set the curvature value equal to 0.
3 Knot selection in radial approximation
Let φ : ℝ_{≥0} → ℝ be a compactly supported radial basis function. We approximate the given data by the surface

z(X) := a_0 + Σ_{l=1}^{M} a_l φ(|X − X_l^k| / δ_l),

where the set of knots {X_l^k, l = 1, ..., M} ⊂ V_xy and the set of positive δ-parameters {δ_l, l = 1, ..., M} are previously chosen. The coefficients a_0, ..., a_M are determined by minimizing Σ_{j=1}^{N} (z_j − z(X_j))². The knot number and the knot positions are selected considering
the information given by the discrete Gaussian curvature distribution as defined in the
previous section.
Inspired by the algorithm proposed in [6], the strategy for the choice of X_l^k and δ_l, l = 1, ..., M, can be summarized as follows:
• an input tolerance tol_a is given;
• a first set of distinct knots {X_l^k, l = 1, ..., M_0} ⊂ V_xy with M_0 ≤ M is chosen. This is done by selecting the areas where the absolute value of the discrete Gaussian curvature is greater than tol_a. A knot is located in the middle of an area if the sign of the related curvature is positive. In case of negative curvature, four knots are located near the boundary of the area, also taking into consideration the suggestions given by the data distribution;
• initial values for the δ-parameters δ_l, l = 1, ..., M_0, are determined considering the knot separation distance;

• the final set of knots is defined by possibly increasing the multiplicity of the previously selected knots. In this case, the δ-parameters associated to the same knot must be different.
Remark 3.1 We observe that, to be sure that the least squares problem has a unique solution, it should be proved that the related collocation matrix is of full rank, and this is clearly equivalent to the uniqueness of the corresponding interpolation problem (the only result we know about uniqueness of the radial interpolant defined with different scales is given in a submitted paper [1] where interesting sufficient conditions are given). However, we believe that the least squares problem is much more robust than the corresponding interpolation problem, and in all the numerical experiments we have never had problems related to the rank of the collocation matrix (see also [5, 7]).
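A sketch of the resulting least squares problem, using Wendland's compactly supported function (1 − r)⁴(1 + 4r) as a stand-in basis; the knots and δ-parameters below are placeholders, whereas in the paper they are derived from the discrete Gaussian curvature distribution:

```python
import numpy as np

def phi(r):
    # Wendland-type compactly supported basis, support [0, 1] (stand-in choice).
    return np.where(r < 1.0, (1.0 - r)**4 * (1.0 + 4.0*r), 0.0)

rng = np.random.default_rng(0)
X = rng.random((441, 2))                                       # data sites in the unit square
z = 0.35*(np.sin(2*np.pi*X[:, 0]) + np.sin(2*np.pi*X[:, 1]))   # second test function

# Hypothetical knots and delta-parameters (placeholders, not the paper's values).
knots = np.array([[0.25, 0.25], [0.75, 0.75], [0.25, 0.75], [0.75, 0.25]])
deltas = np.array([0.6, 0.6, 0.6, 0.6])

# Collocation matrix: constant term a_0 plus one scaled basis function per knot.
r = np.linalg.norm(X[:, None, :] - knots[None, :, :], axis=2) / deltas
C = np.hstack([np.ones((len(X), 1)), phi(r)])

# Coefficients minimizing sum_j (z_j - z(X_j))^2.
a, *_ = np.linalg.lstsq(C, z, rcond=None)
resid = np.sqrt(np.mean((C @ a - z)**2))
print(resid)
```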
4 Numerical results
In this section we use the compactly supported radial basis function [8]

φ(r) := (1 − r)_+^4 (1 + 3r)
for checking the features of the proposed approach on two test functions. The first is the well known Franke function and the second is the function z(X) = 0.35(sin(2πx) + sin(2πy)), X ∈ [0,1]². For both tests, N = 441 data points are considered. The exact data are obtained by evaluating the functions at the vertices represented on the left of Figure 1. The corresponding non-exact data are defined by adding random noise to the exact values. In particular, in the first test we have used ε = 0.07 and in the second we have used ε = 0.08 in [0,0.5]² ∪ [0.5,1]² and ε = 0.008 otherwise. The related discrete Gaussian curvature (dGc) distributions computed with the strategy sketched at the end of Section 2 are reported in Figure 2.
Figures 3 and 4 relate to the first test with exact and non-exact data, respectively. The distinct knots are X_1^k = (0.207, 0.205), X_2^k = (0.449, 0.797), X_3^k = (0.756, 0.349), and each of them is repeated three times with three different δ-parameter values, 0.6, 0.4, 0.3.
The mean error √(Σ_{j=1}^{N} (z_j − z(X_j))² / N) is about 0.016 in Figure 3 and 0.025 in Figure 4 (it was about 1/3 using only 3 distinct knots with all the δ-parameters equal to 0.6).

FIG. 2. dGc for the first (left) and second (right) set of non-exact data.

FIG. 3. The parent Franke surface (left) and its approximation (right).
Figures 5 and 6 relate to the second test. The distinct knots are (0.258, 0.238), (0.749, 0.737), (0.950, 0.264), (0.700, 0.264), (0.756, 0.050), (0.756, 0.300), (0.050, 0.751), (0.300, 0.751), (0.264, 0.700), (0.264, 0.950). The related δ-parameters are 0.8, 0.8, 0.6, 0.4, 0.6, 0.4, 0.6, 0.4, 0.4, 0.6. The mean error is about 0.020 in Figure 5 and 0.026 in Figure 6.
Acknowledgments: The authors would like to thank the referees for their useful comments.
Bibliography
1. M. Bozzini, L. Lenarduzzi, M. Rossini and R. Schaback, Interpolation by basis
functions of different scales and shapes, submitted to Adv. Comp. Math., available
at http://www.num.math.uni-goettingen.de/schaback/.
FIG. 4. The non-exact set of data (left) and its approximation (right).

FIG. 5. The parent surface (left) and its approximation (right).
2. C. Conti, R. Morandi, C. Rabut and A. Sestini, Cubic spline data reduction: choosing the knots from a third derivative criterium, to appear in Numerical Algorithms.
3. R. van Damme and L. Alboul, Tight triangulations, Mathematical Methods for Curves and Surfaces, M. Daehlen, T. Lyche and L. L. Schumaker (eds), Vanderbilt University Press, 1995, 517-526.
4. N. Dyn, K. Hormann, S. J. Kim and D. Levin, Optimizing 3D triangulations using discrete curvature analysis, Mathematical Methods for Curves and Surfaces: Oslo 2000, T. Lyche and L. L. Schumaker (eds), Vanderbilt University Press, 2001, 135-146.
5. R. Franke, H. Hagen and G. Nielson, Least squares surface approximation to scattered data using multiquadric functions, Adv. Comp. Math. 2 (1994), 81-99.
6. R. Morandi, D. Scaramelli and A. Sestini, A geometric approach for knot selection in convexity-preserving spline approximation, Curve and Surface Design, P. J. Laurent, P. Sablonniere and L. L. Schumaker (eds), Vanderbilt University Press, 2000, 287-296.
FIG. 6. The non-exact set of data (left) and its approximation (right).
7. R. Morandi and A. Sestini, Data reduction in surface approximation, Mathematical
Methods for Curves and Surfaces: Oslo 2000, T. Lyche and L. L. Schumaker (eds),
Vanderbilt University Press, 2001, 315-324.
8. H. Wendland, Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree, Adv. Comp. Math. 4 (1995), 389-396.
On the boundary over distance preconditioner
for radial basis function interpolation
C. T. Mouat and R. K. Beatson
Dept. of Mathematics and Statistics, Univ. of Canterbury, Christchurch, New Zealand.
cam@mouat.net, R.Beatson@math.canterbury.ac.nz
Abstract
In this paper we consider the boundary over distance preconditioner for radial basis
function interpolation problems. We give both theoretical and numerical results indicating
that it performs extremely well.
1 Introduction
Let Φ : ℝ² → ℝ, let X = {x_1, ..., x_N} be a set of N distinct points in ℝ², and let f be a real valued function which we can evaluate at least at the x_i's. Define

S_{Φ,X} := { Σ_{j=1}^{N} λ_j Φ(· − x_j) : Σ_{j=1}^{N} λ_j q(x_j) = 0 for all q ∈ π_1² }.    (1.1)

We consider the problem of finding an element s of S_{Φ,X} + π_1² satisfying the interpolation conditions

s(x_i) = f(x_i), for all x_i ∈ X.    (1.2)
Assume Φ is strictly conditionally positive definite of order 2 (SCPD2) and X is unisolvent for π_1². Then there is a unique element of S_{Φ,X} + π_1² satisfying the interpolation conditions (1.2). This setting includes popular choices of the basic function such as the thin-plate spline, Φ(·) = |·|² log|·|, and minus the ordinary multiquadric, Φ(·) = −√(|·|² + c²). In this paper we consider various ways of formulating the interpolation problem, showing in particular that a certain inexpensive change of basis can dramatically improve its conditioning.

The usual way to formulate this problem is in terms of the functions {Φ(· − x_j)} and some basis {p_0, p_1, ..., p_d} for π_1². Then the interpolation conditions, together with the side conditions taking away the extra degrees of freedom introduced by the polynomial part, can be written as

A λ + P c = f  and  P^T λ = 0,    (1.3)

where A_{ij} = Φ(x_i − x_j), P_{ij} = p_j(x_i), and f = [f(x_1), ..., f(x_N)]^T. It is well known [3, 4, 5] that the matrix

[ A    P ]
[ P^T  O ]    (1.4)
of this usual formulation is frequently badly conditioned, even when the number of nodes is small. Indeed many authors have commented on the numerical difficulties that solving this system presents [3, 4, 5]. Results of Narcowich and Ward show that conditioning of the system (1.4) depends very heavily on the geometry of the nodes. However, frequently in numerical analysis a change of basis, or other reformulation, can make a highly intractable problem tractable. Hence, our goal is to find an inexpensive but highly effective preconditioner for RBF interpolation systems.

In this paper we establish properties of a preconditioning method for the RBF interpolation equations which was first presented in Sibson and Stone [5]. In the following section we give a detailed account of the preconditioning method. In Section 3 we prove that the construction produces an N × (N−3) matrix Q whose columns are orthogonal to P, and which is of full rank whenever the nodes X are unisolvent for π_1². Finally, Section 4 contains numerical results for different SCPD2 basic functions over a range of data sets and scales. These numerical results show that using this inexpensive O(N log N) flop preconditioner, and variants of it, dramatically improves the conditioning of RBF interpolation problems. See Figure 1 below.
FIG. 1. Sorted 2-norm condition numbers of the unpreconditioned matrices, A_Φ, (top) and of the preconditioned matrices, S, (bottom) for fifty thousand random data sets of size one hundred: (a) multiquadric basic function; (b) thin-plate spline basic function.
2 A preconditioning method
A general approach to preconditioning interpolation problems with SCPD2 basic functions in ℝ² [1, 5] is to choose Q as any N × (N−3) matrix whose columns are orthogonal to P and which has rank N − 3. Letting λ = Qμ and premultiplying (1.3) by Q^T gives the new system to be solved for μ, or equivalently λ,

B μ = Q^T f,  where  B = Q^T A Q.    (2.1)
The three polynomial coefficients can then be found by a small subsidiary calculation.
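The change of basis in (2.1) can be illustrated with a generic (dense) Q taken from an orthonormal basis of the null space of P^T; this is not the sparse boundary over distance Q constructed below, but it has the same two defining properties (full rank, Q^T P = 0). A minimal numerical sketch for the thin-plate spline:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
x = rng.random((N, 2))                     # random nodes in the unit square

# Thin-plate spline: A_ij = |x_i - x_j|^2 log |x_i - x_j|  (note r^2 log r = 0.5 r^2 log r^2).
d2 = np.sum((x[:, None, :] - x[None, :, :])**2, axis=2)
with np.errstate(divide='ignore', invalid='ignore'):
    A = np.where(d2 > 0, 0.5 * d2 * np.log(d2), 0.0)

P = np.hstack([np.ones((N, 1)), x])        # basis 1, x, y of pi_1^2

# Columns of U beyond the rank of P span the orthogonal complement of range(P),
# so Q^T P = 0 and Q has full rank N - 3.
U = np.linalg.svd(P, full_matrices=True)[0]
Q = U[:, 3:]

Afull = np.block([[A, P], [P.T, np.zeros((3, 3))]])   # the matrix (1.4)
B = Q.T @ A @ Q                                       # the matrix (2.1)
eig = np.linalg.eigvalsh(B)
print(eig.min() > 0, np.linalg.cond(Afull), np.linalg.cond(B))
```

Because the TPS is SCPD2 and the columns of Q satisfy the side conditions, B comes out symmetric positive definite, with a far smaller condition number than the bordered matrix (1.4).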
In this section we present the boundary over distance method of Sibson and Stone [5] for constructing the matrix Q. We will prove in the subsequent section that Q has full rank and is orthogonal to P for any set of distinct nodes X = {x_1, ..., x_N} ⊂ ℝ² which is unisolvent for π_1². These properties of Q are well known (see e.g. [1, 5]) to imply that the matrix of the preconditioned system B = Q^T A Q is positive definite. The construction of Q is appealing in that for "interior" points x_j of X it is local. That is, for such points the entries in the j-th column of Q depend only on the geometry of the nodes near x_j and not on any properties of nodes far away.
Choose a closed bounded convex polygonal region W of ℝ² such that X ⊂ W. Suppose without loss of generality that {x_{N−2}, x_{N−1}, x_N} is unisolvent for π_1². We will refer to these points as special points. They are generally chosen so that they are well spread throughout W. In our experience, and that of Sibson and Stone, for typical data sets the choice of special points is not at all critical, as long as the triangle they define has largish area. However, for contrived data sets, such as all but a very few points on a straight line, the choice of special points becomes important. In these cases we have observed that bad choices of special points can lead to large condition numbers. However, the strategy of choosing the three special points to maximise the area of the corresponding triangle has always led to small condition numbers.
The region W is divided into panels by intersecting a Voronoi diagram of the points of X with the region W. We denote this panelling of W by

V_W(X) = ∪_{i=1}^{N} V_i,

where V_i is the Voronoi panel about the i-th centre and is defined by

V_i = { x ∈ W : |x − x_i| ≤ |x − x_j| for all 1 ≤ j ≤ N with j ≠ i }.

Recall that the locus of points equidistant from two fixed points is the perpendicular bisector of the segment connecting the points. It follows that each Voronoi region is polygonal. Associated with a panel V_i are its edges. These are a finite number of distinct closed line segments of non-zero length. They are the boundaries between different Voronoi panels, or between a Voronoi panel and W^c. The collection of all edges of all the Voronoi panels will be denoted by ℰ.
Definition 2.1 Two polygonal regions of ℝ² will be said to be strongly contiguous if they have a common boundary of non-zero length.

Definition 2.2 Two Voronoi regions V_i and V_j will be said to be C-related if there is a sequence

{V_i, V_{e_1}, V_{e_2}, ..., V_{e_m}, V_j},    1 ≤ i, j, e_1, ..., e_m ≤ N − 3,

in which all adjacent pairs are strongly contiguous.

Loosely speaking, V_i and V_j are C-related if they are connected by a chain of strongly contiguous pairs. C-related is an equivalence relation on the set {V_i}_{i=1}^{N−3} of Voronoi regions of non-special points. Therefore it breaks this set into a finite number of nonempty equivalence classes {G_l : 1 ≤ l ≤ k}.
Lemma 2.3 Let G_l be any of the equivalence classes above. Then there is at least one Voronoi region V_i in G_l which is strongly contiguous to either W^c or to one of the Voronoi regions V_{N−2}, V_{N−1}, V_N of the special points.

Proof: Consider

T = ∪_{i : V_i ∈ G_l} V_i.

This union is a closed bounded connected polygonal set whose boundary can be written as the union of some of the line segments from ℰ. Recall in particular that all these line segments have non-zero length. Pick one line segment ⟨a, b⟩ from the boundary of T. Since it forms part of the boundary of T, on one side of it lies a Voronoi region V_i from G_l. On the other side lies either W^c or another Voronoi region V_j. In the first case the Lemma is proven. Consider the second case. If 1 ≤ j ≤ N − 3 then V_j is strongly contiguous to V_i. Consequently, V_j ∈ G_l. This contradicts ⟨a, b⟩ being on the boundary of T. Hence, N − 2 ≤ j ≤ N and the Lemma follows.
□
We now detail the construction of the N × (N−3) matrix Q using boundary over distance weights. Note that because most elements of Q are zero, sparse storage of Q requires only O(N) memory. A non-special point from {x_i : 1 ≤ i ≤ N−3} whose Voronoi tile is strongly contiguous to W^c will be called a Voronoi external point. Define V_E(X) as the set of indices of all Voronoi external points. All other points are referred to as Voronoi internal points. The corresponding indices are V_I(X) = {1, ..., N−3} − V_E(X).

We first consider forming a column of Q for an index j such that j ∈ V_I(X). In this case the panel V_j shares non-trivial edges only with other Voronoi panels and not with W^c. The column is formed using boundary over distance weights, found from the Voronoi diagram. For j ∈ V_I(X) the boundary over distance weight r_{ij} is

r_{ij} = b(x_i, x_j) / |x_i − x_j|,  for all V_i strongly contiguous to V_j,    (2.2)

where b(x_i, x_j) is the length of the boundary between V_i and V_j. For other values of i ≠ j, r_{ij} is set to zero. In order that column j of Q is orthogonal to constants, the diagonal element r_{jj} is specified as

r_{jj} = − Σ_{i ≠ j} r_{ij}.    (2.3)

Finally, the j-th column of R is scaled by dividing by the area of V_j to obtain the j-th column of Q. Note that the column is by construction diagonally dominant, but not strictly so.
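A minimal sketch of the internal-point weights, using SciPy's Voronoi diagram. Here there is no clipping region W: points whose Voronoi cell is bounded stand in for internal points, and the final division by the area of V_j is omitted since it does not affect orthogonality to constants. The point set and seed are illustrative only:

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(2)
x = rng.random((60, 2))
vor = Voronoi(x)

def bod_column(j):
    """Unscaled column j of R: boundary over distance weights r_ij for a
    point j whose Voronoi cell is bounded, with r_jj = -sum_i r_ij."""
    col = np.zeros(len(x))
    for (p, q), (v1, v2) in zip(vor.ridge_points, vor.ridge_vertices):
        if j not in (p, q) or v1 < 0 or v2 < 0:
            continue                       # ridge not at j, or unbounded ridge
        i = q if p == j else p
        b = np.linalg.norm(vor.vertices[v1] - vor.vertices[v2])  # boundary length b(x_i, x_j)
        col[i] = b / np.linalg.norm(x[i] - x[j])                 # boundary over distance
        col[j] -= col[i]                   # diagonal entry keeps the column sum zero
    return col

# Pick a point whose Voronoi region is bounded (a stand-in for an internal point).
j = next(k for k in range(len(x))
         if vor.regions[vor.point_region[k]]
         and -1 not in vor.regions[vor.point_region[k]])
c = bod_column(j)
print(c.sum())   # orthogonal to constants by construction
```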
If j ∈ V_E(X) then V_j is strongly contiguous to the complement of W, W^c. The boundary segment corresponds to a Voronoi edge between x_j and an artificial point, the reflection of x_j in the boundary (see Figure 3 in [7]). The reflected point, x̄_j, can be written as a linear combination of the special points, i.e.,

x̄_j = λ_N x_N + λ_{N−1} x_{N−1} + λ_{N−2} x_{N−2},    (2.4)

where λ_N + λ_{N−1} + λ_{N−2} = 1. If V_j has k edges in common with W^c then k reflected points {x̄_j^1, ..., x̄_j^k} are required. Associated with each reflected point, x̄_j^ℓ, are the coefficients {λ_N^ℓ, λ_{N−1}^ℓ, λ_{N−2}^ℓ}. The boundary over distance weights for x_j are partitioned amongst the special points to obtain, for all j ∈ V_E(X) and i ≠ j,

r_{ij} = b(x_i, x_j) / |x_i − x_j|,  V_i strongly contiguous to V_j, 1 ≤ i ≤ N − 3,

r_{ij} = Σ_{ℓ=1}^{k} λ_i^ℓ b(x̄_j^ℓ, x_j) / |x̄_j^ℓ − x_j|,  i ∈ {N, N−1, N−2}.    (2.5)

Of course, V_j could be strongly contiguous with a Voronoi panel associated with a special point. If this is the case, r_{ij} = b(x_i, x_j)/|x_i − x_j| + Σ_{ℓ=1}^{k} λ_i^ℓ b(x̄_j^ℓ, x_j)/|x̄_j^ℓ − x_j|. Again, for other values of i ≠ j, r_{ij} is set to zero. Finally, r_{jj} is specified as in (2.3), and column j of Q is defined as column j of R scaled by dividing by the area of V_j.
Partition Q as

Q = [ E ]
    [ F ],    (2.6)

where E is (N − 3) × (N − 3). Thus E results from interactions between non-special points, and F from those between special and non-special points. Note in the construction above that for 1 ≤ i, j ≤ N − 3, e_{ij} is non-zero if and only if V_i is strongly contiguous to V_j. Furthermore, note that E is necessarily column diagonally dominant, with strict dominance in column j whenever V_j is strongly contiguous to the Voronoi region of a special point, or to W^c.

Relabelling if necessary, we can assume the indices of the Voronoi regions in each of the equivalence classes G_l form a contiguous subset of {1, ..., N − 3}. Similarly, we can also assume that the indices corresponding to any G_l precede those corresponding to G_{l+1}. Furthermore, by construction, if i ≠ j none of the regions in G_i is strongly contiguous with a region in G_j. Thus, the corresponding entries in the matrix E constructed using boundary over distance weights and artificial points are zero. That is, E is block diagonal with the square matrix E_{ll} on the main diagonal corresponding to the equivalence class of Voronoi regions G_l. More precisely, Q will have the form
Q = [ E_{11}   0     ···   0    ]
    [  0     E_{22}  ···   0    ]
    [  :       :     ···   :    ]
    [  0       0     ···  E_{kk} ]
    [ F_1     F_2    ···  F_k   ].    (2.7)

3 Properties of the matrix Q
In this section we establish the fundamental properties of the matrix Q of (2.7), namely that it is of full rank and that its columns are orthogonal to those of P.
Definition 3.1 For m ≥ 2, an m × m matrix K is irreducible if there does not exist an m × m permutation matrix P such that

P K P^T = [ M_{11}  M_{12} ]
          [  0      M_{22} ],

where M_{11} is r × r, M_{22} is (m − r) × (m − r), and 1 ≤ r < m.
The following result is well known, see for example Varga [6].
Theorem 3.2 Suppose the square matrix K is irreducible and row (column) diagonally
dominant with strict row (column) diagonal dominance in at least one row (column).
Then K is invertible.
Lemma 3.3 Let X be a finite set of distinct points unisolvent for π_1². Let E_{ll} be one of the square blocks from the diagonal of Q constructed in the previous section. Then E_{ll} is invertible.

Proof: From the construction, E_{ll} is column diagonally dominant. Furthermore, by Lemma 2.3 the diagonal dominance is strict for at least one column of E_{ll}. From the definition of the equivalence relation C-related there is a chain of strongly contiguous pairs of Voronoi regions connecting any two Voronoi regions in G_l. This implies the corresponding entries in E_{ll} are non-zero, and hence from [6] Theorem 1.6 E_{ll} is irreducible. It follows from Theorem 3.2 that E_{ll} is invertible.
□
Theorem 3.4 The matrix Q described in Section 2 is orthogonal to P, i.e. Q^T P = O.

Proof: Omitted, see [2] and [7].
□
Theorem 3.5 Let X be a set of distinct points unisolvent for π_1². Let Q be formed by the construction in Section 2 and A_{ij} = Φ(x_i − x_j), where Φ is strictly conditionally positive definite of order 2. Then B = Q^T A Q is positive definite.

Proof: From Lemma 3.3 each of the matrices E_{ll} occurring in the block partitioning of Q given in Equation (2.7) is invertible. Hence Q has full rank. Also, from Theorem 3.4 the columns of Q are orthogonal to the columns of P. Let μ be any non-zero vector in ℝ^{N−3}, and define λ = Qμ. Then λ ≠ 0, P^T λ = P^T Q μ = 0, and μ^T B μ = μ^T Q^T A Q μ = λ^T A λ. Hence, by the definition of strictly conditionally positive definite, μ^T B μ > 0 whenever μ ≠ 0, and B is symmetric positive definite.
□
Theorem 3.6 Let Φ be strictly conditionally positive definite of order 2 and such that Φ(hx, hy) = h^κ Φ(x, y) + p_h(x, y), h > 0, with p_h ∈ π_2². The preconditioned matrix B_h, which corresponds to preconditioning on the point set hX, is a homogeneous function of scale. Thus its condition number and the relative clustering of its eigenvalues are the same over all scales.

Proof: Omitted, see [7].
□

Theorem 3.6 applies in particular to the usual thin-plate spline, Φ(·) = |·|² log|·|, in ℝ². The extended version of this paper [7] contains a proof that the elements B_{ij} decay like |x_i − x_j|^{−κ} when |x_i − x_j| is large. For the multiquadric κ is three and for the thin-plate spline κ is two.
Definition 3.7 The preconditioned matrix S is obtained from B by pre-multiplying and post-multiplying B by the diagonal matrix D with i-th diagonal entry 1/√(b_{ii}).
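The symmetric diagonal scaling of Definition 3.7 is a one-liner; the 2 × 2 matrix below is an arbitrary illustration, not taken from the paper's experiments:

```python
import numpy as np

def scale(B):
    """Definition 3.7: S = D B D with D = diag(1/sqrt(b_ii))."""
    d = 1.0 / np.sqrt(np.diag(B))
    return d[:, None] * B * d[None, :]

B = np.array([[4.0, 1.0],
              [1.0, 1.0]])
S = scale(B)
print(np.diag(S))                         # unit diagonal by construction
print(np.linalg.cond(B), np.linalg.cond(S))
```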
4 Numerical results
In this section we present numerical results for the thin-plate spline and multiquadric basic functions. In the following tables the matrix A_Φ is defined in (1.4), B in (2.1), S in Definition 3.7, and the homogeneous matrix, C, is presented in [1]. In Table 1 we show 2-norm condition numbers of matrices for the various preconditioning techniques over seven different scales. It is clear that the algorithm in Section 2 gives a matrix which dramatically improves the conditioning of the interpolation problem. In one case the improvement is by a factor of nearly 10^{14}! Tables 2 and 3 contain condition numbers of the matrices resulting from applying the preconditioning techniques of this paper for the thin-plate spline and multiquadric basic functions. For N < 3200, the entries in the tables are the maximum over one hundred random point sets of size N. For N = 3200, the tables contain the maximum over twenty random point sets of size 3200. In all cases the preconditioning results in a smaller condition number. For these basic functions the maximum observed condition number of the scaled preconditioned matrix, S, grows very slowly with N. Certainly there is no numerical evidence of power growth with N.
Scale         Conventional   Homogeneous   Preconditioned   Scaled
parameter a   matrix A_Φ     matrix C      matrix B         matrix S
0.001         1.531(11)      1.534(5)      4.905(1)         2.405(1)
0.01          1.544(9)       1.534(5)      4.905(1)         2.405(1)
0.1           1.597(7)       1.534(5)      4.905(1)         2.405(1)
1             3.107(5)       1.534(5)      4.905(1)         2.405(1)
10            1.915(6)       1.534(5)      4.905(1)         2.405(1)
100           1.271(11)      1.534(5)      4.905(1)         2.405(1)
1000          4.006(15)      1.534(5)      4.905(1)         2.405(1)

TAB. 1. Condition numbers for one hundred points in [0, a]² and the thin-plate spline. The point set for scale a is X_a = a X_1.
In an attempt to rule out the possibility that our numerical results were flukes due to the small number of 100 experiments, we also conducted 50,000 trials with random data sets of size 100. The results of these trials are shown in Figure 1. The maximum condition number over all trials with the thin-plate spline was 1.2465(9) for the matrix A_Φ, 1.5750(9) for matrix C, and 1.8066(2) for matrix S. In our experiments the matrix S is always well conditioned. This held even for geometries of centres for which the matrix A_Φ is very badly conditioned.
To test further the behaviour of S for "bad" configurations of points, a similar experiment was run with one thousand trials of one hundred points almost on a circle. The maximum condition numbers of the A, C and S matrices were 1.2885(9), 7.2692(8) and 6.6005(2) respectively over the 1000 trials. Even though the Voronoi regions are long and thin, the matrix S is still well conditioned!

Number of     Conventional   Homogeneous   Preconditioned   Scaled
data points   matrix A_Φ     matrix C      matrix B         matrix S
200           6.555(7)       3.068(7)      1.617(3)         6.028(1)
400           5.675(8)       3.397(8)      1.945(3)         8.946(1)
800           1.960(10)      1.348(10)     2.034(3)         9.775(1)
1600          1.092(10)      8.413(9)      8.099(3)         1.258(2)
3200          4.997(10)      3.783(10)     1.261(4)         1.569(2)

TAB. 2. Maximum condition numbers encountered over a sample of 100 random point sets of size N in [0, 1]² with the thin-plate spline.

Number of     Conventional   Preconditioned   Scaled
data points   matrix A_Φ     matrix B         matrix S
200           2.014(8)       1.532(2)         4.224(1)
400           2.045(10)      5.932(2)         7.669(1)
800           6.641(10)      4.559(2)         5.826(1)
1600          1.554(10)      7.025(2)         5.601(1)
3200          2.477(11)      9.362(2)         6.280(1)

TAB. 3. Maximum condition numbers encountered over a sample of 100 random point sets of size N in [0, 1]² with the multiquadric function, parameter c = 1/√N.
Bibliography
1. R. K. Beatson, W. A. Light and S. Billings, Fast solution of the radial basis function interpolation equations: Domain decomposition methods, SIAM Journal on Scientific Computing, 22 (2000), 1717-1740.
2. N. H. Christ, R. Friedberg and T. D. Lee, Weights of links and plaquettes in a random lattice, Nuclear Physics B 210 (1982), 337-346.
3. N. Dyn, D. Levin and S. Rippa, Numerical procedures for surface fitting of scattered data by radial functions, SIAM Journal on Scientific and Statistical Computing, 7 (1986), 639-659.
4. F. J. Narcowich and J. D. Ward, Norm estimates for the inverse of a general class of scattered-data radial-function interpolation matrices, Journal of Approximation Theory, 69 (1992), 84-109.
5. R. Sibson and G. Stone, Computation of thin-plate splines, SIAM Journal on Scientific and Statistical Computing, 12 (1991), 1304-1313.
6. R. S. Varga, Matrix Iterative Analysis, Prentice-Hall, New Jersey (1962).
7. C. T. Mouat and R. K. Beatson, Some properties of the boundary over distance preconditioner for radial basis function interpolation, Research report UCDMS 2001/6, Department of Mathematics and Statistics, University of Canterbury, (2001).
What are 'good' points for local interpolation by
radial basis functions?
Robert P. Tong
The Numerical Algorithms Group Ltd, Jordan Hill, Oxford, OX2 8DR, UK.
robert.tong@nag.co.uk
Andrew Crampton
School of Computing and Mathematics, University of Huddersfield, Huddersfield, UK.
a.crampton@hud.ac.uk
Anne E. Trefethen
The Numerical Algorithms Group Ltd, Jordan Hill, Oxford, OX2 8DR, UK.
anne.trefethen@nag.co.uk
Abstract
Radial basis function interpolation has an advantage over other methods in that the
interpolation matrix is nonsingular under very weak conditions on the location of the
interpolation points. However, we show that point location can have a significant effect
on the performance of an approximation in certain cases. Specifically, we consider multiquadric and thin plate spline interpolation to small data sets where derivative estimates
are required. Approximations of this type are important in the motion of unsteady interfaces in fluid dynamics. For data points in the plane, it is shown that interpolation to
data on a circle can be related to the polynomial case. For scattered data on the sphere,
a comparison is made with the results of Sloan and Womersley.
1 Introduction
Radial basis functions (RBFs) such as multiquadrics or thin plate splines have been successfully used for scattered data approximation in many applications. They have been shown to perform well for data fitting, although problems of ill-conditioning and the computational cost of processing large data sets must be handled carefully. In general, when considering the accuracy of a RBF interpolant, a balance must be achieved between the reduction in fill distance necessary for convergence of the approximation to an assumed underlying function and the need to maximise the separation distance between data points to avoid problems of ill-conditioning [4].

In the present study, we focus on the use of RBF approximation as one stage of a larger algorithm to compute the evolution of an unsteady interface in fluid dynamics. The accuracy of the approximations made in the algorithm and the interaction between its different stages determine whether the output is close to the true solution of the
governing equations or whether spurious effects are produced. In the three-dimensional setting, a typical example is described by Zinchenko et al. [8], where the deformation of liquid drops in a viscous medium is studied. A critical feature of the algorithm is the approximation of the normal directions and curvatures of the droplet surface defined at a number of discrete points.

The focus here is algorithmic rather than theoretical, and we investigate the performance of multiquadric and thin plate spline local interpolants applied to the determination of normal directions and curvatures of a smooth, closed surface. Certain configurations of data points, such as points located on a circle, impose constraints on the interpolant. A framework for understanding the behaviour of the RBF interpolants is provided by a comparison with the multivariate polynomial interpolant of de Boor and Ron [1] and by considering the free parameter in the multiquadric as a tensioning parameter [2].
2 Approximation method
A common approach to solving fluid dynamics problems that include moving interfaces combines a computational grid with meshless approximation methods. The governing partial differential equations, or the corresponding integral equation formulation, are solved on the grid, while quantities characterising the interface are computed as meshless scattered data approximations.

Here we examine the behaviour of local RBF approximations in the general context described by Zinchenko et al. [8]. For a given data set, a particular point is selected together with its nearest neighbours, giving a set of typically 6 or 7 points. The initial locations of these points may be determined by a regular mesh, but the surface is allowed to deform so that the approximation is essentially to a small set of scattered data.
constructed RBF interpolant, S, can be expressed as
'
N
K
S{x) = Y^aj(l){\\x-Xj\\) + ^biPi{x),
with the constraint
JV
.
2_]ajPi{xj) = 0,
ioT 1 <i < K,
3=1
where a; G 3?^ and {pi{x)}i=:-\_:K is a basis for the space of bivariate polynomials of degree
< m — 1 with K = m{m + l)/,2. The chosen forms for <j) are the thin plate spline
(j){\\x-Xj\\) = \\x-Xj\\'^\og\\x-Xj\\,
(TPS)
and the multiquadric
4>{\\x-x^\\) = {\\x-x^\\^ + ^)K
(MQ)
with 11 • 11 taken to be the Euclidean norm.
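A sketch of this local interpolant for the TPS choice with m = 2 (so K = 3 and the polynomial part is spanned by 1, x, y), solving the usual bordered linear system for the coefficients a_j and b_i; the random 6-point configuration is illustrative:

```python
import numpy as np

def tps_interpolant(X, f):
    """Local TPS interpolant S(x) = sum_j a_j phi(|x - x_j|) + b . (1, x, y),
    with the moment constraints sum_j a_j p_i(x_j) = 0."""
    N = len(X)
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
    with np.errstate(divide='ignore', invalid='ignore'):
        A = np.where(d2 > 0, 0.5 * d2 * np.log(d2), 0.0)   # r^2 log r
    P = np.hstack([np.ones((N, 1)), X])
    M = np.block([[A, P], [P.T, np.zeros((3, 3))]])
    coef = np.linalg.solve(M, np.concatenate([f, np.zeros(3)]))
    a, b = coef[:N], coef[N:]
    def S(x):
        r2 = np.sum((x - X)**2, axis=1)
        phi = np.where(r2 > 0, 0.5 * r2 * np.log(r2), 0.0)
        return phi @ a + b @ np.concatenate([[1.0], x])
    return S

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, (6, 2))       # 6 scattered points, as in Section 3
f = rng.uniform(-1, 1, 6)
S = tps_interpolant(X, f)
err = max(abs(S(X[i]) - f[i]) for i in range(6))
print(err)   # interpolation conditions hold to rounding error
```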
A framework for interpreting the computed results in the context being considered
can be derived from [2] where the arbitrary parameter, c, of the MQ function is viewed
as a tensioning parameter. As c ^^ oo the MQ interpolant approaches the correspond-
Tong et al.
262
ing polynomial interpolant to the given data, while as c -> 0, the MQ surface is tensioned. Multivariate polynomial interpolation can fail on particular point sets and this
has provided a motivation for using RBF methods. However, the algorithm of de Boor
and Ron [1] provides a reliable means of computing the 'least' polynomial interpolant.
This algorithm is used to compute a polynomial fit as one reference point for the interpretation of the MQ interpolants. A second reference point is provided by the TPS
interpolant which gives a minimum energy surface in a certain norm. This is shown to
correspond closely to the MQ fit for a 'small', but nonzero value of c. The MQ interpolant can thus be shown to connect the minimum energy, tensioned, TPS surface with
the polynomial fit to given data as c increases. In a fluid dynamics context a fluid-fluid
interface is often assumed to be represented by a C^∞ function (although cusps may
occur requiring a change in the representation). This would suggest that a high degree
polynomial would be preferred to a TPS surface.
FIG. 1. Interpolants to random data at 6 points (+) in the plane: (left) polynomial (upper) and multiquadric (c = 10) (lower), contours [0:0.1:2]; (right) thin plate spline (upper) and multiquadric (c = 0.4) (lower), contours [0:0.1:1.1].
3 Scattered data in the plane
To illustrate the behaviour of local interpolation by MQ and TPS methods, random
points in the xy-plane (with −1 < x_i, y_i < 1, for i = 1:6) are associated with random
data values f_i (−1 < f_i < 1). Figure 1 shows, in the upper frames, the two reference
FIG. 2. Effect of varying the parameter c on multiquadric interpolants to random data in the plane: (left) norms of the difference between multiquadric and polynomial interpolants (upper curve d_∞ = ‖d‖_∞, lower curve d_2 = ‖d‖_2/√N); (right) curvature (κ = 2H) computed at the centroid: — multiquadric; -·- thin plate spline; -- polynomial.
interpolating surfaces: (left) the polynomial surface computed by the algorithm of [1]
and (right) the TPS surface. The lower frames give the contours of the MQ interpolants
for c = 10.0 (left) and c = 0.4 (right). There is a close correspondence between the upper
and lower frames on each side, but a large difference between the polynomial and TPS
surfaces.
Figure 2 (left) shows the difference between the MQ surface and the polynomial
reference interpolant computed on a regular grid on the interior of the circle with centre
at the centroid of the data points (0.44, −0.09) and radius the maximum distance from
the centroid to a data point. There is convergence of the MQ surface to the polynomial
as 1/c → 0, but the condition number of the interpolation matrix increases until the
calculation cannot be continued. For c = 10.0 the condition number is 3 × 10⁸.
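The growth of the condition number with c is easy to reproduce numerically; the small check below uses hypothetical random points rather than the data of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, (6, 2))
# Pairwise distance matrix of the six points.
r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)

# Condition number of the pure MQ interpolation matrix as c grows.
for c in (0.4, 1.0, 10.0, 100.0):
    print(c, np.linalg.cond(np.sqrt(r**2 + c**2)))
```

As c increases the matrix approaches a rank-one matrix of nearly equal entries, so its conditioning degrades rapidly.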
As an indication of the behaviour of first and second partial derivatives of the interpolating surfaces, we calculate the curvature at the centroid of the data points for the
polynomial and TPS, together with MQ as c varies, using κ = 2H where H is the mean
curvature. Figure 2 (right) shows that κ_MQ for the MQ interpolant coincides with the
value κ_TPS = −0.46 for the TPS when c ≈ 0.4. When c < 0.4, κ_MQ < κ_TPS, while
κ_MQ → κ_P = −9.78, the polynomial curvature, as c increases.
An interesting example is presented in [1] of polynomial interpolation for points
located at the vertices of a regular hexagon

(x_i, y_i) = (cos(2πi/6), sin(2πi/6)),  i = 1, …, 6,   (3.1)

with data values f_i = (−1)^i. This gives the interpolant

p(x, y) = x³ − 3xy².
Since the points lie on the unit circle, the quadratic polynomial

p₂(x, y) = 1 − x² − y²   (3.2)
vanishes at the data points and this causes difficulties for general polynomial methods.
MQ interpolants do not suffer from these difficulties. When c = 10.0, the MQ surface
is very close to (3.2). As c becomes smaller, the MQ surface approaches that of the
TPS with the data values becoming local maxima or minima as the surface is tensioned.
In addition, the restriction of the data points to a circle implies that the interpolating
polynomial is harmonic, but the convergence of the approximation is only first order [1].
The MQ surface for large c inherits the properties of the polynomial fit. Thus, points on
a circle are 'good' if the data being interpolated correspond to a harmonic function, but
'bad' if the data describe a function which has a maximum or minimum within the circle
or a singularity. These constraints on the interpolant are discussed further in Section 5.
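The hexagon example can be checked directly: on the unit circle x³ − 3xy² = cos 3θ, so it reproduces f_i = (−1)^i at the vertices (3.1), while the quadratic (3.2) vanishes identically there.

```python
import numpy as np

i = np.arange(1, 7)
x, y = np.cos(2 * np.pi * i / 6), np.sin(2 * np.pi * i / 6)  # vertices (3.1)
f = (-1.0) ** i

p = x**3 - 3 * x * y**2      # the 'least' interpolant of de Boor and Ron [1]
p2 = 1 - x**2 - y**2         # quadratic (3.2), zero on the unit circle

assert np.allclose(p, f)     # p interpolates the data exactly
assert np.allclose(p2, 0.0)  # so p + t*p2 interpolates too, for any t
```

The second assertion is exactly the non-uniqueness that causes difficulties for general polynomial methods.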
4 Scattered data on the sphere
In this section we examine the accuracy obtained from three separate methods for interpolating scattered data on the unit sphere S² ⊂ ℝ³. In particular we compare the
results obtained using the MQ basis function in ℝ³ with those obtained using the spherical harmonics of Sloan and Womersley¹ [6] and the C¹ Hermite interpolant of Renka [5].
For the multiquadric function, we list the uniform norm interpolation errors calculated
using a range of values for the shape parameter c.
FIG. 3. Minimum energy points and spherical cap.
The point distribution used is the 256 'minimum energy' points of Fliege and Maier
[3] and the uniform norm interpolation errors are calculated at points distributed on a
spherical cap (see [5]).
The following functions are used for the comparisons in Table 1, where the results
presented in [7] are labelled 'W&S'.
F1 = e^{x+y+z},    F2 = −5 sin(1 + 10z),
F3 = ‖x‖₁/10,    F4 = sin²(1 + ‖x‖₁)/10.
We note from Table 1 that the multiquadric function provides consistently better
interpolants to the four test functions compared with the spherical harmonics. Here, the

¹Uniform norm errors used for comparison are approximate only and were taken from graphical representations presented in Womersley and Sloan [7].
Method        F1          F2      F3      F4
W&S           2.0000e-10  0.5000  0.1100  0.0500
Renka         0.0013      0.1951  0.0054  0.0055
MQ c = 0.01   6.0128e-04  0.3276  0.0051  0.0051
MQ c = 1      4.5807e-10  0.0175  0.0076  0.0062
MQ c = 2      2.2615e-13  0.0227  0.0079  0.0065

TAB. 1. Comparison of uniform norm errors.
points have been chosen to minimise the interpolation errors for the harmonic functions,
yet we see from results given in [7] that increasing the number of points in the distribution (which also increases the degree of the interpolating function) does not necessarily
produce better accuracy. However, these point distributions when used for the multiquadric function provide consistently better accuracy. Further evidence suggests that
points considered optimal for the spherical harmonics are also 'good' for the multiquadric function when compared to an equal number of generally scattered points. However,
this is due to the uniformity of the point distributions and similar results can be obtained
on a refined icosahedral mesh.
Method        12 pts   92 pts      362 pts
Renka         0.1730   0.0103      0.8230e-03
MQ c = 0.01   0.2596   0.0170      0.0020
MQ c = 1      0.0715   7.7662e-05  1.9678e-10
MQ c = 2      0.0442   3.8206e-05  3.4113e-11

TAB. 2. Multiquadric vs Renka for f(x, y, z) = sin(x + y) + sin(xz).
The Renka algorithm produced similar results to those obtained using the multiquadric (for small c) for the F3 and F4 functions, although the results for the functions F1 and
F2 were poor. Further comparisons with the Renka algorithm have been made using 12,
92 and 362 icosahedral points to interpolate the function f(x, y, z) = sin(x + y) + sin(xz).
The uniform norm interpolation errors have been calculated on the previously mentioned
spherical cap. Again we see that the multiquadric function produces better accuracy than
the Renka method when the number of interpolation points is increased.
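A bare-bones version of the sphere experiment might read as follows; random unit vectors stand in for the icosahedral points, and no polynomial tail is added (the multiquadric interpolation matrix is nonsingular for distinct points, by Micchelli's theorem). The function names are illustrative.

```python
import numpy as np

def mq_sphere_interp(nodes, fvals, c):
    """Multiquadric interpolation of data on S^2, working directly in R^3."""
    r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
    a = np.linalg.solve(np.sqrt(r**2 + c**2), fvals)
    return lambda x: a @ np.sqrt(np.linalg.norm(nodes - x, axis=1)**2 + c**2)

rng = np.random.default_rng(2)
nodes = rng.normal(size=(92, 3))
nodes /= np.linalg.norm(nodes, axis=1, keepdims=True)   # project onto S^2

# The test function of Table 2.
f = lambda p: np.sin(p[..., 0] + p[..., 1]) + np.sin(p[..., 0] * p[..., 2])
S = mq_sphere_interp(nodes, f(nodes), c=1.0)
```

Uniform-norm errors would then be estimated by evaluating S − f on a dense set of points on the spherical cap.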
5 Evolution of a smooth closed surface
In this section we return to the local interpolation scheme of §3 and apply it to scattered
data on a smooth closed surface. This is the setting described in [8], where initially the
interface is spherical with the point locations determined by subdivision of an icosahedral
mesh. Each set of points consists of a central point together its nearest neighbours,
giving sets of 6 points associated with the 12 vertices of the icosahedron and sets of
7 points otherwise. The local method of Renka [5] is followed and, for a chosen point,
a local coordinate system is defined with this point on the z-axis. The local point set is projected onto the xy-plane and the surface heights provide the data values. This
typically gives a configuration very close to the hexagon points (3.1) with an additional
point at the centre, except for those points associated with the icosahedron vertices where
the arrangement is a pentagon. As the icosahedral mesh is refined these configurations
become less regular.
The addition of the central point to the hexagon points increases the order of the
approximation. When the surface is spherical, the symmetry of the data ensures that the
computed unit normal at the centre point for polynomial, MQ or TPS is exact except for
rounding error (e.g., for MQ the error is ‖n − n_MQ‖₂ = 3 × 10⁻¹⁵). However, taking MQ
with c = 10 and a sphere of radius 9, if the central point is displaced from the origin to
(0.01, 0.01) the error in the normal is 3 × 10⁻³. To illustrate convergence for an irregular
point set, the hexagon points are perturbed by the addition of a factor (i − 1)εh[1, 1]^T
for points i = 1, 2, …, 6 with h the radius of the circumcircle and taking ε = 0.05. For
MQ with c = 10, the error in the surface normal is O(h²) whereas, for c = 0.4, the error
is larger and the rate of convergence varies (see Table 3).
h      ‖n − n_MQ‖₂, c = 10   ‖n − n_MQ‖₂, c = 0.4
1.0    3.15 × 10⁻²           6.14 × 10⁻²
0.5    3.85 × 10⁻³           1.26 × 10⁻³
0.1    3.06 × 10⁻⁴           6.35 × 10⁻⁴
0.05   4.99 × 10⁻⁵           8.47 × 10⁻⁴

TAB. 3. Error in MQ approximation to surface normal of sphere, irregular point set.
Accurate curvature values are essential for an interface which is driven by surface
tension. The exact value of κ = −2/9 for a sphere of radius 9, together with the computed
values, is shown in Table 4. The polynomial and MQ with c = 10 are close to the exact
value.
Method        κ
exact         −0.222…
polynomial    −0.222912
MQ c = 0.1    −1.638002
MQ c = 10.0   −0.225387

TAB. 4. Curvature, κ = 2H, evaluated at the central point of a regular hexagon.
It is found that, for the icosahedral mesh with N = 362, the local point sets are
sufficiently regular to give good accuracy for surface normals and curvature using MQ
interpolants when c is chosen to be 'large' in relation to the point spacing. This mesh also
gives a corresponding accuracy for the discretised integral equation. These points can
thus be considered 'good' for the MQ approximation. However, if the mesh is further
refined or the surface deforms during its evolution, then the approximation becomes
'less good' as the regularity of the point locations is lost. Numerical experiments suggest
second order convergence with point separation for irregular local point sets.
6 Conclusions
The behaviour of MQ and TPS interpolants can be interpreted by reference to the
corresponding 'least' polynomial interpolant, with the MQ connecting the polynomial
C^∞ surface to the tensioned surface of the TPS as the parameter c decreases. The MQ
interpolant with 'large' c (relative to the point separation) exhibits the properties of the
polynomial case and is similarly affected by the location of data points. Thus, points
on a circle in the plane can be 'good' if the function to be represented is harmonic,
but in general give only first order convergence on the interior. For data on the sphere,
'good' points for polynomial interpolation are also good for the MQ with 'large' c, but
other near equispaced point distributions appear to give similar accuracy with MQ.
The tensioning effect of smaller values of c can improve the results if the underlying
function is not C^∞. When applied to an evolving interface, starting from an initially
spherical shape and a refined icosahedral point distribution, it is found that local MQ
approximations to the surface derivatives are affected by the point locations. This can
be understood by reference to the polynomial interpolant to data located on a circle and
causes an irregularity in the convergence as N increases.
Bibliography
1. C. de Boor and A. Ron, Computational aspects of polynomial interpolation in several variables, Math. Comp. 58 (1992), 705-727.
2. M. Eck, MQ-curves are curves in tension, in Mathematical Methods in Computer
Aided Geometric Design II, T. Lyche and L. L. Schumaker (eds). Academic Press,
1992,217-228.
3. J. Fliege and U. Maier, The distribution of points on the sphere and corresponding
cubature formulae, IMA J. Numer. Anal. 19 (1999), 317-334.
4. A. Iske, Perfect centre placement for radial basis function methods, preprint (1999).
5. R. J. Renka, Interpolation of data on the surface of a sphere, ACM Trans. Math.
Softw. 10 (1984), 417-436.
6. I. H. Sloan and R. S. Womersley, The search for good polynomial interpolation
points on the sphere, in Numerical Analysis 1999, D. F. Griffiths and G. A. Watson
(eds), Chapman and Hall, 2000, 211-229.
7. R. S. Womersley and I. H. Sloan, How good can polynomial interpolation on the
sphere be? preprint (1999).
8. A. Z. Zinchenko, M. A. Rother and R. H. Davis, A novel boundary-integral algorithm
for viscous interaction of deformable drops, Phys. Fluids 9 (1997), 1493-1511.
Chapter 5
Regression
Generalised Gauss-Markov regression
Alistair B Forbes, Peter M Harris and Ian M Smith
National Physical Laboratory, Teddington, Middlesex, TW11 0LW, UK.
alistair.forbes@npl.co.uk, peter.harris@npl.co.uk, ian.smith@npl.co.uk
Abstract
Experimental data analysis is a key activity in metrology, the science of measurement.
It involves developing a mathematical model of the physical system in terms of mathematical equations involving parameters that describe all the relevant aspects of the system.
The model specifies how the system is expected to respond to input data and the nature
of the uncertainties in the inputs. Given measurement data, estimates of the model parameters are determined by solving the mathematical equations constructed as part of
the model, and this requires developing an algorithm (or estimator) to determine values
for the parameters that best explain the data. In many cases, the parameter estimates
are given by the solution of a least-squares problem. This paper discusses how various
uncertainty structures associated with the measurement data can be taken into consideration and describes the algorithms used to solve the resulting regression problems. Two
applications from NPL are described which require the solution of generalised distance
regression problems: the use of measurements of primary standard natural gas mixtures
to estimate the composition of a new natural gas mixture, and the analysis of calibration
data to estimate the effective area of a pressure balance.
1 Introduction
Many metrology experiments involve determining the behaviour of a response variable y
as a function of a set of independent variables x = (x₁, x₂, …, x_n). Model building involves establishing the functional relationship between these quantities, usually involving
a set of model parameters a, i.e.,

y* = φ(x*, a),

where y* and x* represent exact values of the variables. The terms a parametrize the
range of possible response behaviour and the actual behaviour is specified by determining values for these parameters from measurement data. In practice, measurements
are subject to error, and the error structure must be taken into account firstly in order
to determine effective methods for obtaining parameter estimates and secondly in determining the uncertainty in the fitted model parameters. For a set of measurement data
{x_i, y_i}_{i=1}^{m}, the data analysis problem involves the accurate estimation of the parameters
a, taking into account knowledge of the uncertainties in {x_i} and/or {y_i}, and typically
leads to a least-squares problem [4].
This paper describes the various uncertainty structures that arise and corresponding
regression problems for determining estimates of the model parameters. If the covariance information associated with the measurements is structured so that only the ith set
of measurement errors are correlated with each other, a generalised distance regression
approach is appropriate. However, some applications have quite general correlation structure and a full Gauss-Markov estimation approach is required to make efhcient use of
the statistical model [7]. This leads to a generalised Gauss-Markov regression problem to
take into account the errors in the variables and the general correlation structure. While
the covariance structure may dictate which solution algorithms are to be employed, the
information required of the model function φ is limited to the evaluation of the function
and its derivatives with respect to a and x. This means that solution algorithms can
be based on a compact set of model-dependent modules and a generic set of harnessing
routines that link the models to general purpose least-squares optimisation software.
The layout of the paper is as follows. In Section 2 we consider the various error structures and corresponding regression problems. Section 3 introduces two measurement
problems encountered at NPL: the use of measurements of primary standard natural
gas mixtures to estimate the composition of a new natural gas mixture; and the analysis of calibration data to estimate the effective area of a pressure balance. Although
the functional models for these measurement systems are simple, taking the form of
low-order polynomials, the statistical models need to account for (a) uncertainties in
both the dependent and independent variables, and (b) possible correlations between
measurements. These requirements lead us to solve generalised regression problems. An
overview of solution algorithms for the various problems is given in Section 4. Concluding
remarks are made in Section 5.
2 Error structures and regression problems
Within metrology, various error structures arise, all of which can be taken into account.
We now consider the main types.
2.1 Error in one variable only

2.1.1 Ordinary (weighted) least squares
The simplest type of error structure occurs when only one of the system variables is
subject to error and there is no correlation between errors. The model is summarised by
y_i* = φ(x_i*, a),    y_i = y_i* + e_i,    x_i = x_i*,

where it is assumed that

E(e_i) = 0,    var(e_i) = σ_i²,    cov(e_i, e_j) = 0, i ≠ j.   (2.1)

Good estimates of a can be found by solving

min_a Σ_{i=1}^{m} w_i² [y_i − φ(x_i, a)]²,

where w_i = 1/σ_i, i = 1, …, m.
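For a straight-line model φ(x, a) = a₁ + a₂x this weighted least-squares problem is a small linear algebra computation; the data and the σ_i below are made up for illustration.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
sigma = np.array([0.1, 0.1, 0.2, 0.2])     # standard uncertainties of y_i

w = 1.0 / sigma                             # weights w_i = 1/sigma_i
J = np.column_stack([np.ones_like(x), x])   # Jacobian of phi(x, a) = a1 + a2*x

# Row-scaling by w_i turns the weighted problem into an ordinary one,
# which lstsq solves via orthogonal factorisation.
a_hat, *_ = np.linalg.lstsq(w[:, None] * J, w * y, rcond=None)
```

At the solution the weighted normal equations J^T W² (y − J a) = 0 are satisfied.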
Forbes et al.
272
2.1.2 Gauss-Markov regression
If instead of (2.1), the measurement errors are correlated so that

E(e) = 0,    var(e) = V,

with V full rank, then an estimate of a can be found by solving

min_a [y − φ(a)]^T V⁻¹ [y − φ(a)],   (2.2)

where the ith element of φ(a) is φ(x_i, a).
2.2 Errors in more than one variable
In many metrological applications more than one of the measured variables is subject to
error, and this must be taken into account in order to determine estimates of the model
parameters which are statistically efficient and free from major bias.
2.2.1 Orthogonal distance regression
The simplest case arises when the covariance matrix associated with the ith set of measurements is a multiple of the identity matrix and there is no correlation between any of
the errors, summarised by the model
y_i* = φ(x_i*, a),    y_i = y_i* + e_i,    x_i = x_i* + δ_i,

with

E(η_i) = 0,    var(η_i) = ρ_i² I,   (2.3)

where η_i = (e_i, δ_i^T)^T. In this case, appropriate estimates of the parameters are determined by the solution of

min_{x_i*, a} Σ_{i=1}^{m} v_i² {(x_i − x_i*)^T (x_i − x_i*) + (y_i − φ(x_i*, a))²},

where v_i = 1/ρ_i, i = 1, …, m.
Note that this optimisation problem involves m sets of parameters x_i* as well as the
parameters a specifying the model y = φ(x, a).
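A direct (unstructured) way to solve such a problem is to hand the stacked parameter vector (a, x₁*, …, x_m*) to a general least-squares solver. The sketch below does this for a scalar-x straight-line model with v_i = 1 and invented data; it ignores the block structure that Section 4 exploits.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([0.0, 0.9, 2.1, 3.0])
y = np.array([0.1, 1.1, 1.9, 3.1])

def residuals(q):
    # q = (a1, a2, x1*, ..., xm*); residuals are x_i - x_i* and
    # y_i - phi(x_i*, a) for the line phi(x, a) = a1 + a2*x.
    a, xs = q[:2], q[2:]
    return np.concatenate([x - xs, y - (a[0] + a[1] * xs)])

# Start from a = (0, 1) and footpoints at the measured x values.
sol = least_squares(residuals, np.concatenate([[0.0, 1.0], x]))
a_hat = sol.x[:2]
```

The minimised sum of squares is then the sum of squared orthogonal distances from the data points to the fitted line.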
2.2.2 Generalised distance regression
If we assume that the errors η_i are correlated with var(η_i) = V_i with V_i full rank, but
that cov(η_i, η_j) = 0, i ≠ j, then the appropriate regression problem is

min_{x_i*, a} Σ_{i=1}^{m} [ y_i − φ(x_i*, a) ; x_i − x_i* ]^T V_i⁻¹ [ y_i − φ(x_i*, a) ; x_i − x_i* ].   (2.4)
2.2.3 Generalised Gauss-Markov regression
The most complicated error structure arises when all variables are subject to measurement error and there is general correlation between the errors. If ξ (ξ*) is the vector of
measurements {x_i} (variables {x_i*}), then the corresponding regression problem is

min_{ξ*, a} [y − φ(ξ*, a)]^T V⁻¹ [y − φ(ξ*, a)],   (2.5)

where the ith element of φ(ξ*, a) is φ(x_i*, a).
3 Examples from metrology

3.1 Preparation of primary standard natural gas mixtures
Within the Centre for Optical and Analytical Measurement at NPL, one part of the
work of the Environmental Standards Group is to prepare primary standard natural gas
mixtures. These are cylinders containing natural gas prepared gravimetrically to contain known compositions of each of the 11 constituent components (methane, ethane,
propane, i-butane, n-butane, i-pentane, n-pentane, neo-pentane, hexane, nitrogen and
carbon dioxide). Mixtures are prepared to cover various concentration ranges, e.g., methane: 64% to 98%. These primary standard mixtures are used as the basis for determining
the composition of a new mixture and hence its calorific value.
Given a number of primary standard natural gas mixtures containing known concentrations of one of the constituent components (e.g., CO2), the detector response for
each mixture and the detector response for the new mixture, we wish to determine the
concentration of CO2 in the new mixture.
An approach to solving this problem is firstly to use the calibration data (relating to
the primary gas mixtures) to calibrate the detector and, secondly, to use the calibration
curve so constructed with the new measurement to predict the concentration in the new
mixture.
Errors to be accounted for are:
• the calibration data is known inexactly. The process of preparing the primary standards means that they are known inexactly, and indeed the errors in the standards
may be correlated (this is a consequence of the gravimetric process used to prepare
the standard mixtures which involves comparing on a balance each standard mixture at each stage of preparation against calibrated masses selected from a common
set of masses),
• the data returned by the detector (which is based on the analytical technique of
chromatography) is subject to measurement error.
Consequently, we wish our data analysis to account for the inexactness of the measurement data and to quantify the resulting uncertainty associated with the final measurement result.
Figure 1 shows a sample set of measurement data, with the ellipses around the calibration points illustrating the errors in the concentrations and detector responses. (The
error ellipses have been magnified greatly for illustrative purposes.) The figure also shows
a calibration curve which is used to estimate the concentration of the component for
which the detector response (and its uncertainty) is known.
3.2 Calibration of pressure balances
The principal role of the Pressure and Vacuum Section in the Centre for Mechanical and
Acoustical Metrology at NPL is the development and maintenance of primary measurement standards for pressure and vacuum and their dissemination to industry. Pressure
balances are pressure generators and consist essentially of finely-machined pistons mounted vertically in close-fitting cylinders.

FIG. 1. Sample data (+), fitted calibration curve and predicted measurement (o).

The pressure required to support a piston and
associated ring-weights depends on the mass of the piston and ring-weights and the
cross-sectional area of the piston [5]. Due to various fluid dymamic effects, the effective
area Aip, a) of the piston-cylinder assembly is a function of pressure, usually taken to
be a hnear function A{p,a) = ai-V a2P- Many other factors such as temperature and air
buoyancy have to be taken into account but for our purposes here, the pressure generated
satisfies
aip + a2p'^ = j/(m),
where a are the instrument parameters and y{m) is a simple function of the applied load
m. This equation determines p implicitly as a function of m and a. Suppose a reference
pressure balance has been calibrated so that estimates of the instrument parameters
a and their uncertainties are known. The reference balance can be used to calibrate a
test balance in a cross-floating experiment in the following way. A load m_i is applied
to the reference balance to generate pressure p_i = p(m_i, a). A load n_i is applied to the
test balance so that the pressures generated are matched. The test calibration curve is
determined from a best fit to the data (n_i, p_i):

b₁p_i* + b₂(p_i*)² = y(n_i*),    p_i = p_i* + δ_i,    n_i = n_i* + ε_i,
where δ_i and ε_i represent measurement error associated with the pressures and masses,
respectively. However, the following must be taken into account. Firstly, the pressures
p_i all depend on the common estimates a of the instrument parameters of the reference
balance, leading to correlation of the measurement errors δ_i. Secondly, the masses n_i
and m_i are made up from the same ensemble of masses μ = (μ₁, …, μ_M)^T so that
n_i = n_i^T μ,    m_i = m_i^T μ,
where n_i and m_i are binary coefficient vectors. This means that measurement errors
associated with the masses μ_k give rise to (further) correlation between δ_i and ε_i. Taking
this general correlation into account, estimates of the instrument parameters b are
found from solving
min_{b, p*} [ p − p* ; y − φ ]^T V⁻¹ [ p − p* ; y − φ ],   (3.1)
where the ith elements of φ and y are b₁p_i* + b₂(p_i*)² and y(n_i), respectively, and V is
the appropriate covariance matrix determined from the dependence of y and φ on a and
μ. This is a generalised Gauss-Markov regression problem.
4 Algorithms for generalised regression
Algorithms for ordinary least squares problems of the form min_a Σ_i f_i²(a) are well known
and include QR factorisation methods for linear models or the Gauss-Newton algorithm
for non-linear models; see, e.g., [2, 6]. The latter algorithm requires the user to supply a
software module to evaluate the vector of function values f(a) and the Jacobian matrix
J of partial derivatives

J_ij = ∂f_i/∂a_j.

If f_i(a) = y_i − φ(x_i, a) as considered above, the user has to supply a module to calculate
φ(x, a) and ∂φ/∂a_j.
If V is symmetric and strictly positive definite, the Gauss-Markov regression problem
(2.2) can be formulated as an ordinary least squares problem. If V = LL^T where L is
lower-triangular, then the problem becomes

min_a Σ_i f̃_i²(a),

where f̃ = L⁻¹f. The associated Jacobian matrix is J̃ = L⁻¹J. If the matrix V is
well-conditioned, matrix operations with V or L⁻¹ should not lead to unnecessary loss
of precision. However, explicit calculations involving V can be avoided by using the
generalised QR factorisation [2, 8, 9], leading to solution algorithms with good numerical
properties.
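For a linear model the whitening step is a few lines; the covariance V and the data below are synthetic, and a generalised QR route would avoid forming L⁻¹J explicitly.

```python
import numpy as np

rng = np.random.default_rng(3)
J = np.column_stack([np.ones(5), np.arange(5.0)])  # linear model y = a1 + a2*x
a_true = np.array([1.0, 2.0])
B = rng.normal(size=(5, 5))
V = B @ B.T + 5 * np.eye(5)          # symmetric positive definite covariance
y = J @ a_true + 0.01 * rng.normal(size=5)

L = np.linalg.cholesky(V)            # V = L L^T
Jt = np.linalg.solve(L, J)           # L^{-1} J
yt = np.linalg.solve(L, y)           # L^{-1} y
a_hat, *_ = np.linalg.lstsq(Jt, yt, rcond=None)   # ordinary LSQ on whitened data
```

The whitened solution satisfies the Gauss-Markov normal equations J^T V⁻¹ (y − J a) = 0.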
The generalised distance regression problem (2.4) can be solved efficiently by making
use of the fact that the parameters x_i* appear only in the ith summand. The associated
Jacobian matrix has a block-angular structure that can be exploited effectively in the
QR factorisation stage [2, 3]. Alternatively, a separation-of-variables approach can be
adopted in which the parameters x_i*(a) are first determined as functions of a specified
by the solution of the corresponding footpoint problem

min_{x_i*} [ y_i − φ(x_i*, a) ; x_i − x_i* ]^T V_i⁻¹ [ y_i − φ(x_i*, a) ; x_i − x_i* ],

and the problem formulated as a non-linear least squares problem in a [1, 4]. Either approach yields an algorithm requiring O(mn²) flops while a full matrix approach requires
O(m³) flops.
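For a scalar model with V_i = I the footpoint problem is just finding the nearest point on the curve y = φ(x, a), a one-dimensional minimisation. The example below (line y = x, data point (1, 3)) is illustrative; the outer problem would then optimise over a alone.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def footpoint(xi, yi, phi, a):
    """For fixed a, find x* minimising the (unweighted) squared distance
    from the data point (xi, yi) to the curve y = phi(x, a)."""
    g = lambda xs: (xi - xs)**2 + (yi - phi(xs, a))**2
    return minimize_scalar(g, bracket=(xi - 1.0, xi + 1.0)).x

phi = lambda x, a: a[0] + a[1] * x
xs = footpoint(1.0, 3.0, phi, np.array([0.0, 1.0]))  # footpoint on the line y = x
```

For this line the footpoint is the orthogonal projection of (1, 3), namely x* = 2.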
The generalised Gauss-Markov problem (2.5) can be solved as a Gauss-Markov problem in the variables {x_i*} and a, but ideally, we would like to develop algorithms
that exploit problem structure as in generalised distance regression algorithms. In particular, while the covariance matrix V may well be full, in many situations it is constructed
from smaller matrices, for which more efficient algorithms could be developed.
From the user's point of view, all the regression algorithms discussed here require only
the calculation of the model function φ and its derivatives ∂φ/∂a and ∂φ/∂x. Thus, a wide
range of regression problems can be solved using standard optimisation modules along
with generic harness modules that perform the conversion without input from the user
over and above the calculation of φ and its derivatives. For example, we have implemented
a generalised Gauss-Markov solver to solve problems such as (3.1) for any explicit model
y = φ(x, a). However, issues of efficiency and numerical stability need to be taken into
account. As part of the UK Department of Trade and Industry's Software Support for
Metrology programme, NPL is developing and making available to metrologists a suite of
routines for the generalised regression problems discussed above. By combining structure
exploiting linear algebra and numerically stable components such as the orthogonal
factorisation, it is hoped that metrologists will be able to use these routines with the
same confidence and effectiveness that they currently experience with standard, well-engineered regression modules available in numerical libraries.
5 Concluding remarks
In metrology, we are interested in the determination of accurate estimates of the parameters that describe a physical process. It is imperative that knowledge of the measurement
system should be used to describe the error structure as accurately as possible. We have
described the five types of regression problems that can occur in metrology depending
on the error structures that are assumed. In all cases it is important that we employ
efficient, numerically stable algorithms and exploit any structure in both the Jacobian
and covariance matrices.
Acknowledgements. This work has been supported by the Department of Trade and
Industry's National Measurement System Software Support for Metrology Programme
and undertaken by a project team at the Centre for Mathematics and Scientific Software,
National Physical Laboratory. The authors are particularly grateful to Paul Holland
(Centre for Optical and Analytical Measurement) and the Pressure and Vacuum Section
for their contributions.
Bibliography
1. M. Bartholomew-Biggs, B. P. Butler, and A. B. Forbes, Optimisation algorithms for
generalised distance regression in metrology, in Advanced Mathematical and Computational Tools in Metrology IV, P. Ciarlini, A. B. Forbes, F. Pavese and D. Richter
(eds), 21-31, World Scientific, Singapore, 2000.
2. A. Bjorck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia,
1996.
i
Generalised Gauss-Markov regression
3. M. G. Cox, The least-squares solution of linear equations with block-angular observation matrix, in Advances in Reliable Numerical Computation, M. G. Cox and
S. Hammarling (eds), 227-240, Oxford University Press, 1989.
4. M. G. Cox, A. B. Forbes, and P. M. Harris, Software Support for Metrology Best
Practice Guide 4: Modelling Discrete Data, National Physical Laboratory, Teddington, 2000.
5. A. B. Forbes, and P. M. Harris, Estimation algorithms in the calculation of the
effective area of pressure balances, Metrologia, 36(6): 689-692, 1999.
6. G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University
Press, Baltimore, third edition, 1996.
7. K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis, Academic Press,
London, 1979.
8. C. C. Paige, Fast numerically stable computations for generalized least squares
problems, SIAM J. Numer. Anal., 16:165-171, 1979.
9. LAPACK Users' Guide, third edition, SIAM, Philadelphia, 1999.
Nonparametric regression subject to a given number of
local extreme values
Ali Majidi and Laurie Davies
Department of Mathematics and Computer Science, University of Essen,
Germany.
{ali.majidi,laurie.davies}@stat-math.uni-essen.de
Abstract
We consider the problem of nonparametric regression. The aim is to get a smooth function
which represents the dataset and has a reasonable number of extreme values. An iterative
method, the QSOR method, is introduced. Problems with the slow convergence of the
method are reduced using multigrid techniques.
1 Introduction
Given a dataset {y(t_i), i = 1, …, n}, which we denote by y, we look for a decomposition

y(t_i) = f(t_i) + r(t_i),    (t_i = i/n, i = 1, …, n),

where f is a simple function and the {r(t_i), (i = 1, …, n)} are the resulting residuals
which approximate white noise. We use two different concepts of simplicity. The first
is the number of local extreme values. The second is the smoothness of the function as
measured by the standard smoothness functional

S(f) := ∫₀¹ [f⁽²⁾(t)]² dt,
where /^^^ is the second derivative of /. The number of local extremes is taken to have priority over smoothness. The number of local extremes and their locations are determined
by the taut string method developed in [3]. This is described briefly in the next section.
The residuals are required to look like white noise in the sense that the means over certain
dyadic intervals are required to lie within given bounds [3]. The multiresolution coefficients for (n = 2") are defined by: Wij := 2'---'/^^J2k=j2^iir{tk),ii = 0,...,i^),(i =
0,..., 2^''"') - 1). The multiresolution condition now requires that -c„ < Wij < c„,
where c„ represents some form of thresholding. The defatilt value of c„ which we use is
c„ = cT„i/2.51og(n) where CT„ = 1.482 • median(|2/2 - J/il, • ■ ■ , IVn - 2/n-i|)/\/2Supported by SFB 475, University of Dortmund
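As a small illustration, the multiresolution condition above can be checked directly; the following is a minimal sketch assuming the 2^(-i/2) normalisation over dyadic intervals, with illustrative function names:

```python
import math

def multiresolution_ok(r, y):
    """Check the multiresolution condition on residuals r (len n = 2**lam).

    Coefficients w_ij = 2**(-i/2) * (sum of r over each dyadic interval of
    length 2**i); all must lie within [-c_n, c_n]."""
    n = len(r)
    lam = n.bit_length() - 1
    assert n == 2 ** lam, "n must be a power of two"
    # default threshold c_n = sigma_n * sqrt(2.5 * log n), with a MAD-type
    # scale estimate built from the first differences of the data y
    diffs = sorted(abs(y[k] - y[k - 1]) for k in range(1, n))
    med = 0.5 * (diffs[(n - 2) // 2] + diffs[(n - 1) // 2])
    sigma = 1.482 * med / math.sqrt(2.0)
    c_n = sigma * math.sqrt(2.5 * math.log(n))
    for i in range(lam + 1):
        width = 2 ** i
        for j in range(n // width):
            w = sum(r[j * width:(j + 1) * width]) / math.sqrt(width)
            if abs(w) > c_n:
                return False
    return True
```

Zero residuals trivially pass the condition, while residuals with a large common offset fail on the finest scale already.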
FIG. 1. The top-left panel shows the original doppler function and the top-right
panel shows the noisy version. The bottom-left panel shows the result of the taut
string algorithm, with the resulting residuals shown in the bottom-right panel.
2 Taut string
A short description of the taut string method is as follows. We write f = (f_1,...,f_n)^T :=
(f(t_1),...,f(t_n))^T ∈ R^n and denote the cumulative sums of y and f by Y and F
respectively, Y_i = ∑_{j=1}^i y_j, F_i = ∑_{j=1}^i f_j, (i = 1,...,n), with Y_0 = F_0 = 0. We specify
bounds defined by Λ = (Λ_1,...,Λ_n)^T ∈ R^n and consider the tube

{G : |Y − G| ≤ Λ}.  (2.1)

The taut string V(Λ) is now the function defined by a taut string attached to the
points (0, Y_0) and (n, Y_n) and constrained to lie within the tube (2.1). It can be shown
that the taut string minimizes the number of extreme values of the functions g whose
cumulative sums G lie within the tube. The taut string is continuous and piecewise
linear. Its derivative v(Λ) is taken as a candidate regression function. The vector Λ is
determined in a data-dependent manner by the requirement that the residuals associated
with v(Λ), {r(Λ)_i = y_i − v(Λ)_i, i = 1,...,n}, satisfy the multiresolution condition. If
this condition fails on an interval then the Λ-values associated with that interval
are reduced in size. An application of the taut string method to the doppler data of
Donoho and Johnstone (see e.g. [4]) is shown in Figure 1. The function is defined by
f(t) = √(t(1 − t)) sin(2π · 1.05/(t + 0.05)). The derivative v(Λ) is piecewise constant, as may
be seen from Figure 1. The function v(Λ) determines the number of local extremes. We
take the midpoints of the intervals associated with a local extreme as the locations of
the local extremes for the smoothing algorithm.
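The quantities entering the tube (2.1) are simple to set up. The following sketch generates noisy doppler data and builds the cumulative sums and tube bounds for a constant Λ; the taut-string computation itself is omitted, and the noise level is illustrative:

```python
import math
import random

def doppler(t):
    # Donoho-Johnstone doppler test function
    return math.sqrt(t * (1 - t)) * math.sin(2 * math.pi * 1.05 / (t + 0.05))

def tube(y, lam):
    """Cumulative sums Y_0,...,Y_n and the tube bounds Y_i -/+ lambda
    (here a constant lambda for simplicity; the paper adapts it locally)."""
    Y = [0.0]
    for yi in y:
        Y.append(Y[-1] + yi)
    lower = [Yi - lam for Yi in Y]
    upper = [Yi + lam for Yi in Y]
    return Y, lower, upper

n = 1024
random.seed(0)
y = [doppler((i + 1) / n) + 0.1 * random.gauss(0, 1) for i in range(n)]
Y, L, U = tube(y, lam=0.5)
```

Any function G running through the tube, i.e. with L ≤ G ≤ U, is a candidate cumulative sum; the taut string picks the one of minimal length.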
280
A. Majidi and L. Davies
3 The smoothing problem
We make the smoothing problem precise as follows. The number, locations and type of
extreme values are taken from the taut string as explained in the last section. We further
require the fitted function to lie in the tube determined by the taut string. This is to prevent
the smoothing procedure from moving too far from the data. These restrictions may be
described in the form

A F ≥ b  (3.1)

for an appropriate matrix A and vector b. This leads to the following problem:

minimize ∑_{i=2}^{n−1} (f_{i+1} − 2f_i + f_{i−1})^2 subject to (3.1),

or equivalently

minimize F^T Q_3 F subject to (3.1),

for some quadratic form Q_3. We denote this latter quadratic programming problem by
QP3. Clearly the matrix associated with the quadratic form ∑_{i=2}^{n−1} (f_{i+1} − 2f_i + f_{i−1})^2
is singular. Nevertheless the solution of QP3 may be unique. We have the following
theorem.
Theorem 3.1 Let V(Λ) be the result of the taut string method. Assume that V(Λ) has
one extreme value. We define the bounds L, U by

L := Y − Λ, U := Y + Λ.

Let F_1, F_2 be two solutions of the corresponding quadratic program. Additionally let F_1
touch three bounds alternately
(i.e. U_{i_1}, L_{i_2}, U_{i_3} or L_{i_1}, U_{i_2}, L_{i_3}, (i_1 < i_2 < i_3), are active).
Then F_1 = F_2.

We call a problem with a unique solution a nondegenerate problem. From now on we
assume that our problem is nondegenerate.
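The singularity of the quadratic form is easy to see numerically: it vanishes on constant and linear (affine) vectors, which span its kernel. A small sketch, with illustrative names:

```python
def second_difference_form(f):
    """S_3(f) = sum of squared second differences, the discrete smoothness
    functional; its matrix is singular, since affine vectors lie in the kernel."""
    return sum((f[i + 1] - 2 * f[i] + f[i - 1]) ** 2
               for i in range(1, len(f) - 1))

n = 10
constant = [3.0] * n                      # in the kernel
linear = [0.5 * i + 1.0 for i in range(n)]  # in the kernel
quadratic = [i * i / 10 for i in range(n)]  # curvature, so form is positive
```

Despite this singularity, the constraints (3.1) can pin down a unique minimizer, which is the content of Theorem 3.1.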
3.1 Quadratic programming
There are many algorithms which solve quadratic programming problems directly. Unfortunately
most of them are expensive in terms of memory requirements and are not
feasible for data sets of the order of, say, n = 8196. To overcome this we look for iterative
methods which converge to the solution. Gradient projection methods (e.g. as defined
in [8], [2] or [9]) are not appropriate for this purpose, as the monotonicity constraints
make the projection onto the feasible set too expensive. Instead we use a modified version
of the QSOR (quasi successive over-relaxation) method developed by Metzner in
[7]. QSOR is a very cheap iteration and converges to the solution of QP3. Unfortunately
the convergence is very slow on sections where the solution is smooth. To overcome this
we use multigrid methods, which have to be adapted to our requirements.
4 QSOR
The QSOR algorithm is an iterative method which produces a feasible sequence {F^k}_{k=0}^∞
converging towards the solution of QP3. For simplicity, we describe the iteration only for
a convexity interval. Let F^0 ∈ R^n be an arbitrary feasible vector. The obvious candidate
is the derivative of the taut string. Let Q = Q_3 and ω ∈ (0, 2). The following defines a
QSOR iteration.
• While convergence not achieved:
  • i = 1:
    F_1 = F_1 − (ω/Q_11)(QF)_1, L_1 = max{2F_2 − F_3, L_1}, U_1 = U_1, F_1 = med{L_1, U_1, F_1}
  • i = 2:
    F_2 = F_2 − (ω/Q_22)(QF)_2, L_2 = max{2F_3 − F_4, L_2},
    U_2 = min{(F_3 + F_1)/2, U_2}, F_2 = med{L_2, U_2, F_2}
  • for (i in 3:(n−2)) {
      F_i = F_i − (ω/Q_ii)(QF)_i, L_i = max{2F_{i+1} − F_{i+2}, 2F_{i−1} − F_{i−2}, L_i},
      U_i = min{(F_{i+1} + F_{i−1})/2, U_i}, F_i = med{L_i, U_i, F_i},
      if (i active) mark i
    }
  • i = n − 1:
    F_{n−1} = F_{n−1} − (ω/Q_{n−1,n−1})(QF)_{n−1}, L_{n−1} = max{2F_{n−2} − F_{n−3}, L_{n−1}},
    U_{n−1} = min{(F_n + F_{n−2})/2, U_{n−1}}, F_{n−1} = med{L_{n−1}, U_{n−1}, F_{n−1}}
  • i = n:
    F_n = F_n − (ω/Q_nn)(QF)_n, L_n = max{2F_{n−1} − F_{n−2}, L_n}, U_n = U_n,
    F_n = med{L_n, U_n, F_n}
  • correct the active intervals:
    * Let [F_{i_ν}, F_{i_ν+k}] be an active interval: F_i = F_{i_ν} + ((i − i_ν)/k)(F_{i_ν+k} − F_{i_ν}).
      Denoting the i-th unit vector in R^n by e_i, and with a, b the minimizers of
      (F − (a ∑_{i=i_ν}^{i_ν+k} t_i e_i + b ∑_{i=i_ν}^{i_ν+k} e_i))^T Q (F − (a ∑_{i=i_ν}^{i_ν+k} t_i e_i + b ∑_{i=i_ν}^{i_ν+k} e_i)),
      set F_i := F_i − ω(a t_i + b).
    * F_i = F_i for all i in other intervals.
Theorem 4.1 (convergence) Let (F^k)_{k=0}^∞ be the sequence in R^n produced by the QSOR
algorithm and let the problem QP3 be nondegenerate. Then
• (F^k)_{k=0}^∞ converges in R^n;
• F* := lim_{k→∞} F^k is the solution of QP3.
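The coordinate-wise core of the iteration can be sketched as a projected SOR sweep for a bound-constrained quadratic programme. This is a deliberate simplification of QSOR: the bounds here are fixed rather than updated each sweep, and the interval-correction step is omitted.

```python
def projected_sor(Q, L, U, F, omega=1.0, sweeps=200):
    """Minimise 0.5 * F^T Q F subject to L <= F <= U by SOR coordinate
    steps F_i -> F_i - omega/Q_ii * (QF)_i, clamped into the bounds with
    a median operation, as in the QSOR update."""
    n = len(F)
    F = list(F)
    med = lambda lo, hi, x: max(lo, min(hi, x))
    for _ in range(sweeps):
        for i in range(n):
            g = sum(Q[i][j] * F[j] for j in range(n))  # (QF)_i
            F[i] = med(L[i], U[i], F[i] - omega * g / Q[i][i])
    return F

# tridiagonal positive definite Q, box [1, 2]^3: the minimiser is (1, 1, 1)
Q = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
F_star = projected_sor(Q, [1.0] * 3, [2.0] * 3, [2.0, 2.0, 2.0])
```

Each coordinate step is O(bandwidth) for a banded Q, which is what makes one sweep so cheap; the price, as the next figure shows, is very slow convergence on smooth sections.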
FIG. 2. The top-left, top-right, bottom-left and bottom-right panels show the result of
the QSOR iteration for the doppler data (n = 2048) after 1000, 5000, 10000 and 20000 steps
respectively.
The slowness of the convergence can be seen from the fact that the doppler data of Figure 1
required two million iterations before a satisfactory solution was obtained. This is shown
in Figure 2. After a small number of iterations the solution no longer changes
on the "left side", where the function oscillates rapidly. After 1000 iterations of QSOR
(which is fast, because one QSOR step is very cheap) the solution looks very smooth
except for a few "buckles" on the "right side". The method needs many
iterations (up to two million) to reach an adequate smoothness. This slow
convergence is known from the original SOR method for solving linear equations, where
multigrid methods can be used to speed up the rate of convergence. We now apply this
idea to the problem QP3.
5 Multigrid QSOR
Multigrid techniques are general techniques for speeding up iterative methods which
otherwise have good properties. The ideas are given for example in [1] or [5]. We give here
a short description of the multigrid idea in our case. First some notation. Given a grid
G = G_f = (t_1,...,t_n) we define the coarse grid G_c = (t_{i_1}, t_{i_2},...,t_{i_m}), i_1 = 1, i_m = n,
with i_j ∈ {1,...,n}. We define the projection down I_c F = (F_{i_1}, F_{i_2},...,F_{i_m})
and the projection up P F = y, where y_l = F_l for l ∈ {i_j | j = 1,...,m} and y_l is given by
linear interpolation elsewhere, i.e.,

y_l = F_{i_{j−1}} + ((l − i_{j−1})/(i_j − i_{j−1}))(F_{i_j} − F_{i_{j−1}}),  (i_{j−1} < l < i_j).

We now define the multigrid QSOR with only two grids, i.e., of level two. The general
case of level ν ∈ N is defined similarly. Let QSOR(G, A, b, μ, x) denote the result of the
QSOR method applied to the problem on the grid G after μ iterations, with starting
vector x and constraints defined by A, b. Additionally let F^k be given.

FIG. 3. The left figure shows the result of QSOR after 5000 iterations. The right figure
shows the result of (1000) multigrid QSOR with one coarsening step (i.e. the right figure
is "cheaper" than 2000 QSOR steps).
• Multigrid QSOR
  * while precision not achieved
    o F = QSOR(G, A, b, μ, F^k)
    o F = P QSOR(G_c, A_c, b_c, μ, I_c F)
    o F^{k+1} = QSOR(G, A, b, μ, F)
    o k = k + 1

where A_c, b_c are the corresponding constraints for the coarser grid. The question is now
how to define the projection of the constraints. One can think of examples where
the canonical projection of the bounds onto G_c can fail. This happens for example if strong
constraints (e.g. tight bounds) are not on the coarse grid. To overcome this problem one
has to define the problem QP3 on the coarser grid in a reasonable
way. One way to handle this is to define L_{i_j} := max{L_k | i_{j−1} < k < i_{j+1}} and
to use "min" correspondingly for the upper bounds. Special cases have to be treated, but we
do not go into details here. A coarser grid means that a QSOR step on this grid converges
much faster. On the other hand the approximation of the solution gets worse by coarsening
the grid. In our case (see Figure 4) we have n = 2048. The coarsest grid was made by
taking every eighth gridpoint. We iterated until there was no recognizable improvement.
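The grid-transfer operators used above can be sketched directly: restriction I_c, the linear-interpolation prolongation P, and the max/min projection of the bounds. The index handling below is illustrative:

```python
def restrict(F, idx):
    """Projection down I_c: keep only the coarse-grid entries.
    idx is sorted, starting at 0 and ending at len(F) - 1."""
    return [F[i] for i in idx]

def prolong(Fc, idx, n):
    """Projection up P: coarse values at idx, linear interpolation between."""
    y = [0.0] * n
    for j in range(len(idx) - 1):
        a, b = idx[j], idx[j + 1]
        for l in range(a, b + 1):
            y[l] = Fc[j] + (l - a) / (b - a) * (Fc[j + 1] - Fc[j])
    return y

def coarse_lower_bounds(L, idx):
    """Coarse lower bound at i_j: max of L_k over i_{j-1} < k < i_{j+1}
    (clipped at the ends), so tight fine-grid bounds are not lost."""
    m = len(idx)
    Lc = []
    for j in range(m):
        lo = idx[j - 1] + 1 if j > 0 else idx[0]
        hi = idx[j + 1] - 1 if j < m - 1 else idx[-1]
        Lc.append(max(L[k] for k in range(lo, hi + 1)))
    return Lc
```

Taking the max over a neighbourhood rather than simply sampling L at the coarse points is what prevents the failure mode described above, where a tight bound between coarse points would otherwise be ignored.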
6 Proofs
Proof of Theorem 3.1: We set D = F_2 − F_1. One simply verifies that D has to be a
line, i.e., there are a, b ∈ R such that D_i = a t_i + b. Touching three bounds alternately
means that D changes its sign at least two times, which leads to D = 0.
□
FIG. 4. Multigrid QSOR applied to the doppler data with n = 2048. The computation took
less than 6 seconds, compared to three hours without multigrid on the same computer.
Proof of Theorem 4.1: We set Q = Q_3. We have to show:
1) (S_3(F^k))_{k∈N_0} decreases;
2) (F^k)_{k∈N_0} is feasible;
3) if F* is a stationary point of QSOR, then F* minimizes S_3 in the feasible set.
These suffice because
• our feasible set is compact, so the sequence has a convergent subsequence,
• a limit of a subsequence of (F^k)_{k=1}^∞ is a stationary point of QSOR,
• the problem has only one solution.
For the first point, we only remark that a, b as defined in the algorithm are the minimizers
over x, y of the term

(F − (x ∑_{i=i_ν}^{i_ν+k} t_i e_i + y ∑_{i=i_ν}^{i_ν+k} e_i))^T Q (F − (x ∑_{i=i_ν}^{i_ν+k} t_i e_i + y ∑_{i=i_ν}^{i_ν+k} e_i));

the other steps are treated as in [6]. The second point is clear, because by definition we
start with a feasible vector and we retain feasibility in every single step. It remains
to show the third point. Let F* be a stationary point of the algorithm. It is sufficient to
show that (QF*, Z − F*) ≥ 0 for all feasible vectors Z (see [6]), where (·, ·) denotes the
standard inner product in R^n. To show this we first note that Q = D^T Q_2 D, where

D = \begin{pmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{pmatrix}

and Q_2 is the matrix corresponding to the direct problem, i.e., to QP3 expressed in terms
of f = DF. So we can deduce that (QF*, Z − F*) = (Z − F*)^T QF* = (Z − F*)^T D^T Q_2 DF* =
(Q_2 f*, z − f*), where f* := DF*, z := DZ. Now we only have to look at the "active
points", because (Q_2 f*)_i is
zero everywhere else. Let z be an arbitrary feasible vector and let j be an index with
f*_j = L_j and (Q_2 f*)_j ≠ 0; stationarity forces (Q_2 f*)_j > 0, since otherwise the step
−ω(Q_2 f*)_j/(Q_2)_jj would move f*_j away from the bound. With the feasibility of z it
follows that (Q_2 f*)_j (z_j − f*_j) = (Q_2 f*)_j (z_j − L_j) ≥ 0. With the same argument we can
derive (Q_2 f*)_j (z_j − f*_j) ≥ 0 if f* touches the upper bound. It remains to show the
inequality for the linearity intervals. Let [t_ν, t_{ν+k}] be a linearity interval of F*. Then
obviously [t_{ν+1}, t_{ν+k}] is a constancy interval for f*. Furthermore it follows from the
stationarity of F* that a, b as defined in the algorithm are zero. This is equivalent to

∑_{i=ν}^{ν+k} (QF*)_i = 0,  ∑_{i=ν}^{ν+k} t_i (QF*)_i = 0  (t_i = i/n),

which implies that

∑_{i=ν}^{ν+k} (QF*)_i X_i = ∑_{i=ν+1}^{ν+k} (Q_2 f*)_i x_i

for arbitrary X ∈ R^n and x = DX. So our conditions are
∑_{i=ν}^{ν+k} (Q_2 f*)_i = 0,  ∑_{i=ν+1}^{ν+k} i (Q_2 f*)_i = 0.

This case was proved by Lowendick [6].
□
Bibliography
1. William L. Briggs. A Multigrid Tutorial. SIAM, New York, 1994.
2. Paul H. Calamai and Jorge J. Moré. Projected gradient methods for linearly constrained
problems. Mathematical Programming, 39:93-116, 1987.
3. P. L. Davies and A. Kovac. Modality, Runs, Strings and Multiresolution. To appear
in Annals of Statistics, 2001.
4. D. L. Donoho and I. M. Johnstone. Ideal Spatial Adaptation by Wavelet Shrinkage.
Biometrika, 81:425-455, 1994.
5. W. Hackbusch. Multi-Grid Methods and Applications. Springer, Berlin, 1985.
6. M. Lowendick. On Smoothing under Bounds and Geometric Constraints. Dissertation,
Universität Essen, 2000.
7. L. Metzner. Facettierte Nichtparametrische Regression. Dissertation, Universität
Essen, 1997.
8. Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, Berlin,
1999.
9. Gerardo Toraldo and Jorge J. Moré. On the solution of large quadratic programming
problems with bound constraints. SIAM J. Optimization, 1:93-113, 1991.
Model fitting using the least volume criterion
Chris Tofallis
University of Hertfordshire Business School
Dept. of Statistics, Economics, Accounting and Management Systems
Mangrove Rd, Hertford, SG13 8QF, UK
c.tofallis@herts.ac.uk
Abstract
Given data on multiple variables we present a method for fitting a function to the data
which, unlike conventional regression, treats all the variables on the same basis i.e. there
is no distinction between dependent and independent variables. Moreover, all variables
are permitted to have error and we do not assume any information is available regarding
the errors. The aim is to generate law-like relationships between variables where the data
represent quantities arising in the natural and social sciences. Such relationships are
referred to as structural or functional models. The method requires that a (monotonic)
relationship exists; thus, in the two variable case we do not allow cases where there is
zero correlation. Our fitting criterion is simply the sum of the products of the deviations
in each dimension and so corresponds to a volume, or more generally a hyper-volume.
One important advantage of this criterion is that the fitted models will always be units
(i.e. scale) invariant. We formulate the estimation problem as a fractional programming
problem. We demonstrate the method with a numerical example in which we try to
uncover the coefficients of a known data-generating model. The data used suffer from
multicollinearity, and there is preliminary evidence that the least volume method is much
more stable against this problem than least squares.
1 On the undeserved ubiquity of least squares regression
In fitting a function to data, conventional regression requires one variable to be 'special'
— this is the dependent variable. In the sciences however, one often wishes to re-arrange
the model equation by changing the subject of the formula. Statisticians tell us that
in that case we should carry out a second regression. Yet scientists are uncomfortable
with having separate models for each variable, which are not equivalent to each other
and yet are meant to represent the same relationship. Calibration is another case where
one would like mutual equivalence: e.g. in psychology one can have two tests that are
intended to measure the same ability: a formula or conversion table is required to relate
the score on one test to that on the other.
Another case where regression is inappropriate is where one wants to deduce a parameter such as the rate of change (slope). If both variables are subject to error then ordinary least squares will under-estimate the slope, and regressing x on y will over-estimate
it. A simple example involves plotting galaxy speed (or redshift) against distance from
the observer. The slope of the fitted line gives what is called the Hubble constant, whose
286
Model fitting using the least volume criterion
value crucially determines the future of the universe: will it continue expanding or will
it eventually begin to collapse in on itself? Conventional regression gives different values
for the Hubble constant depending on which variable is treated as being dependent, but
there is no apparent reason for choosing one variable as against the other.
An oft-cited reason for using least squares fitting is that under certain assumptions
on the errors, it will provide the best linear unbiased estimate ('BLUE') of the slope.
This is the Gauss-Markov theorem, where 'best' is taken to mean minimum variance.
What is not widely appreciated is that 'linear' here refers not to the form of the fitted
model, but to the requirement that the expression for the estimated coefficient be linear in y. One
can find estimators with even lower variance by removing this non-essential condition,
e.g. other Lp-norm estimators are not linear in y.
In multiple regression it is widely, and mistakenly, believed that the fitted coefficients
tell us the contribution that a particular variable makes to the dependent variable.
In fact, not even the sign of a coefficient can be relied upon to tell us the direction
of the relationship, i.e. a particular x-variable may be positively correlated with the
y-variable and yet have a negative coefficient in the regression model. This is the problem
of multicollinearity: if there are near-linear relations among the explanatory variables
then the coefficients produced by regression will not only be highly uncertain (large
standard error) but also not be open to sensible interpretation.
We shall present a technique for model-fitting which treats all variables on the same
basis. The method has the important property of being units-invariant; this is an advantage
not shared by the total least squares approach (also known as orthogonal regression),
and arises from the fact that we use the product of the deviations in each direction rather
than the sum (or sum of squares) when calculating the fitting criterion.
2 The least areas criterion
Consider a set of data points in two dimensions as in Figure 1. By drawing the vertical
and horizontal deviations from the line we create a right-angled triangle for each data
point. Our fitting criterion is simply to minimise the sum of these areas. A key advantage
of this approach is that changing the units of measurement will not affect the resulting
line. In other words it is a scale invariant method. Furthermore we can add a constant
to either variable and the geometry is such that the line merely gets shifted vertically or
horizontally. Combining the scale and translation invariance implies that the least areas
line is invariant to linear transformations of the data. It is also apparent that switching
the axes has no effect: the variables are treated symmetrically. (A textbook discussion
of this method appears in Draper and Smith [5].)
We note that it is essential that there be a non-zero correlation in the data otherwise
the method fails. For those seeking to quantify relationships between data variables in the
experimental sciences this would hardly seem to be a restrictive requirement. However
for those working in the area of design and who are concerned with geometrical shapes,
it does rule out the fitting to data scattered around a vertical or horizontal line, or circle,
or rectangle with sides parallel to the co-ordinate axes etc.. We shall not discuss fitting
curves in this paper but we note that this method is not suitable for fitting a relationship
FIG. 1. Sum of areas to be minimised in the least areas calculation.
that is not monotone over the range of the data i.e. there cannot be maxima or minima
over the data range otherwise the area deviation associated with a given point may not
be uniquely specified. Such problems may be avoided by breaking up the data set into
subsets at the optima and fitting a monotone function to each subset, thus producing a
piecewise monotone function.
The least areas method has an interesting history, it has surfaced under different
guises in diverse research literatures throughout the twentieth century. In astronomy it
is known as Stromberg's impartial line. In biology it is the line of organic correlation.
In economics it is the method of minimised areas or diagonal regression. In statistics
it is sometimes referred to as the 'standard or reduced major axis'. This derives from
the fact that if the data are standardised by dividing by their standard deviation, then
the fitted line corresponds to the major (i.e. principal) axis of the ellipse of constant
probability for the bivariate normal distribution. Yet another name for this technique is
the geometric mean functional relationship. This is because the slope has a magnitude
equal to the geometric mean of the two slopes arising from ordinary least squares (OLS)
(proved in Barker, Soh and Evans [2], and Teissier [20]), i.e. if we regress y on x and get
a slope b_1, and then regress x on y (so as to minimise the sum of squared deviations in
the x-direction) and obtain a regression line y = a + b_2 x, then the geometric mean slope
is b = (b_1 b_2)^{1/2}. It is interesting to note that the two OLS slopes are connected via the
correlation between the variables:

b_1 = r^2 b_2.
This implies that as the correlation falls the disagreement between the two OLS slopes
increases; for example, even with a correlation as high as 0.71 one of these slopes will
be twice as large as the other! It also follows that the magnitude of the slope of the
least areas line lies between those of the two OLS lines. This is intuitively satisfying in a
technique that aims to treat x and y deviations symmetrically. Specifically, for the case
of positive but imperfect correlation, we have b_2 > b > b_1 because b/r > b > rb.
From the geometric mean property and the expressions for the OLS slopes one can deduce
that the magnitude of the slope of the least areas line takes on a particularly simple closed
form: it is the standard deviation of y divided by the standard deviation of x. The sign
of the slope is provided by the sign of the correlation between y and x.
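These identities are easy to check numerically. The sketch below computes the two OLS slopes and the geometric-mean (least areas) slope, verifying that its magnitude equals sd(y)/sd(x); variable names are illustrative:

```python
import math

def mean(v):
    return sum(v) / len(v)

def least_areas_slope(x, y):
    """|slope| = sd(y)/sd(x); sign taken from the covariance (correlation)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                  # OLS slope of y on x
    b2 = syy / sxy                  # x-on-y fit, rewritten as y = a + b2*x
    b = math.copysign(math.sqrt(abs(b1 * b2)), sxy)  # geometric mean slope
    # geometric mean property: |b| = sd(y)/sd(x)
    assert abs(abs(b) - math.sqrt(syy / sxx)) < 1e-12
    return b, b1, b2
```

Note that b1 * b2 = syy/sxx regardless of the data, so the sd ratio falls out of the geometric mean identity directly; for positively correlated data, b lies strictly between b1 and b2.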
Numerical experiments have been carried out to compare this fitting technique against
five others (Babu and Feigelson [1]). A specified underlying model was used to generate
data (mostly bivariate normal samples) and the aim was to see which method could
best recover the slope of the model. The simulations involved varying the sample size,
correlation and variances. Orthogonal regression gave the poorest accuracy. There were
two methods that came out with highest accuracy: the least areas method and the least
squares bisector. The latter bisects the smaller angle formed between the two OLS lines.
Unfortunately the OLS bisector is not units invariant and so does not suit our purposes
(Ricker [17]).
Turning now to applications, the method seems to have appeared most often in the
field of biometrics (the application of statistics to biological data). For example, in relating the size of one body part to another (or to the total weight or height) in humans
and other animals, one may collect data from an individual at successive stages in their
growth, or from many individuals at different points in their development. It is not generally possible to distinguish between dependent and independent variables in such a
context. Isometric growth is the special case where the two body parts grow such that
their size ratio remains constant. Miller and Kahn [13] argue in favour of our method
thus: 'there is usually no clear justification for saying, e.g. that increase in skull length is
dependent upon increase of body length; it is more realistic to consider changes in skull
length and body length as due to a set of common factors'. Ricker [16] discusses the value
of the method in fishery research. Applications include modelling relationships between
weight and length, between weight and fecundity (the number of eggs), and estimating
the 'catchability' of fish (the fraction of the stock taken by one unit of fishing effort).
Rayner [15] gives an application to the flight speed of birds as related to the wind speed.
We have already noted the scope for application in astronomy. Babu and Feigelson
[1] point out that 'differences in regression methods on similar data may be responsible
for a portion of the long-standing controversy over the value of Hubble's constant, which
quantifies the recession rate of the galaxies'. The earliest appearance of our method in
the astronomical literature seems to be that of Stromberg [19].
The method has also been proposed in the context of educational and psychological
testing; a very early reference is that of Otis [14], who named it the 'relation line'.
If two tests are meant to measure the same aptitude or attainment, one may need to
match pairs of equivalent scores on the pair of tests for creating a conversion table. The
direction of the conversion should obviously not affect which values are paired off, hence
the need for a symmetric approach. Greenall [7] proposes the 'equivalence line' for this
purpose:
(y − μ_y)/σ_y = (x − μ_x)/σ_x.
This turns out to be yet another name for our least areas line. For standardised scores
the line equation reduces to y = x. He also proves a very interesting uniqueness result:
'When we seek a relation that will deem a pair of scores mutually equivalent if and only
if the proportion of x-scores less than X equals the proportion of y-scores less than Y, we
aim at pairing off scores that give rise to equal percentile ranks. In the case of continuous
bivariate distributions which satisfy a simple condition [F(x, y) = F(y/c, cx)], only the
equivalence relation will provide this relation'. The normal distribution is one case which
satisfies this condition. A relevant theoretical result due to Kruskal [12] is that if the two
variables are normally distributed and a line is needed to predict x from y as often as y
from x, then the least areas line maximises the probability of correct prediction (i.e. the
probability of being within z standard deviations, for any given z-value). This provides
another justification for the use of this line.
Hirsch and Gilroy [9] show how it can be useful in hydrology and geomorphology,
where one may be interested in relationships between e.g. stream slope versus elevation,
or stream length versus basin area, etc. 'In such cases there is no clear direction of
causality but there is clearly an inter-relation of variables'. 'A major motivation for the
use of the line lies in the equivalence of the cumulative function of y and y_est'.
In general terms, when should the least areas method be used? Rayner [15] cites the
result of Kendall and Stuart [10] that if no error information is available then this method
gives the least-bias or maximum likelihood estimate of the functional relation. Rayner
goes on to demonstrate that this line also has the property of being independent of the
correlation between the errors of the two variables.
Ricker [17] deals with the question of usage by first distinguishing between random
measurement error and mutual natural variability (as arises for example in biology).
In the former case for each observation there is an associated true point which would
arise if the errors in both variables were zero. If one can estimate the variances of the
errors by replicating the measurements then measurement error models can be used to
estimate the line. One monograph on such models is Cheng and Van Ness [4]. If one
cannot estimate the error variances (or their ratio, λ) then Ricker recommends the use
of the least areas line as being the best approximation: it gives y and x equal weight and
will be exact if λ = var(y)/var(x), i.e. when the ratio of error variances equals the ratio of
data variances. For the case of mutual natural variability 'there is no basis for assigning
separate vertical and horizontal components to the deviation', i.e. 'it is impossible to
say whether it is y or x that is responsible for the deviations from the line'. In this case
Ricker concludes that if the data are binormally distributed then the least areas line be
used to describe the central trend, and least squares to estimate one variable from the
other. For the mixed case i.e. having both measurement error and natural variability,
'the best that can be done is to treat them in terms of whichever source of variation
makes the larger contribution to the total. In biological work this will usually be natural
variability'.
Despite appearing in so many other fields, it is remarkable that this technique does not
seem to have appeared in the numerical analysis/approximation literature. For example it
is not listed in Grosse's Algorithms for Approximation catalogue. The present paper looks
at an obvious way of extending the approach to any number of variables by minimising
volumes.
3 Least volume fitting
We now intend to fit a linear function of the form ∑_{j=1}^p a_j x_j = c to data {x_ij} in p
dimensions; in other words we have data on p variables and we seek a linear relationship
between them. Of course this is not uniquely specified, because we can divide through by
any non-zero constant. Thus we are free to impose a constraint on the coefficients, such
as c = 1. Note that we shall not permit any of the coefficients a_j to be zero, because that
would imply the associated variable x_j is unrelated to the other variables.
One obvious way of generalising the least areas procedure to higher dimensions is to
minimise the volumes (or hypervolumes). Each data point will have associated with it a
'volume deviation', which is simply the product of its deviations from the fitted plane in
each dimension. We must take care to make all these non-negative by taking absolute
values. For the ith data point this volume deviation V_i is proportional to

|∑_j a_j x_ij − 1|^p / ∏_j a_j.

We now introduce non-negative variables u_i, v_i to deal with the absolute value of the
numerator. The positive u_i represent points on one side of the fitted plane, and positive
v_i refer to points on the opposite side. Setting c = 1 allows us to model the bracketed
term thus:

u_i − v_i = ∑_j a_j x_ij − 1.
At least one of each pair {u_i, v_i} will be forced to be zero by their presence in
the objective function which is being minimised. Consequently the numerator can be
represented as ∑_i (u_i^p + v_i^p). We shall suppose the denominator is positive; if it is not, we
can always make it so by multiplying one of the x-variables by −1 so that its coefficient,
and hence the product of coefficients, also changes sign. We can now formulate
our problem as the following fractional programme:

Minimise  ∑_i (u_i^p + v_i^p) / ∏_j a_j

such that

u_i − v_i = ∑_j a_j x_ij − 1

and

u_i, v_i ≥ 0.
The field of fractional programming is comprehensively covered by Stancu-Minasian
[18]. We note that Draper and Yang [6] used a different route to generalising the technique
to more than two dimensions: they minimised the pth root of the squared volumes and
showed that the estimated coefficients were a convex combination of those from the p
OLS estimates.
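The least volume criterion itself is straightforward to evaluate. The following sketch (illustrative only, using absolute values rather than the sign convention above) checks the units-invariance property: rescaling a variable, with its coefficient rescaled inversely, multiplies every candidate's objective by the same factor, so the ranking of candidate models is unchanged.

```python
def least_volume_objective(a, X):
    """Sum over points of |sum_j a_j x_ij - 1|^p / |prod_j a_j|."""
    p = len(a)
    prod = 1.0
    for aj in a:
        prod *= abs(aj)
    total = 0.0
    for row in X:
        dev = abs(sum(aj * xj for aj, xj in zip(a, row)) - 1.0)
        total += dev ** p
    return total / prod

# toy data in p = 2 dimensions, and two candidate coefficient vectors
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
a1, a2 = [0.2, 0.3], [0.25, 0.15]

# rescale variable 1 by s (and its coefficient by 1/s): the deviations are
# unchanged, while the denominator shrinks by s, so every objective value
# is multiplied by the same factor s and comparisons are unchanged
s = 10.0
Xs = [[s * r[0], r[1]] for r in X]
scaled = lambda a: [a[0] / s, a[1]]
```

This is precisely the advantage over total least squares, whose perpendicular distances mix the units of the different axes.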
4 Numerical test
We shall now apply the least volume criterion to try to uncover the coefficients from
data that have been generated from a known underlying model with some randomness
thrown in. In order to make this a difficult test we shall choose data which suffer from
multicollinearity. This means that there is a near-linear dependence within the data, i.e.,
one of the variables almost lies in the space spanned by the remaining variables, and so
we are close to being rank-deficient. The data are taken from Belsley's [3] comprehensive
monograph on collinearity. The generating model is

y = 1.2 − 0.4x_1 + 0.6x_2 + 0.9x_3 + ε
Chris Tofallis
with \epsilon normally distributed with zero mean and variance 0.01. The absolute correlations
between the variables ranged from 0.35 to 0.61 and so these in themselves would not
have alerted the researcher to any difficulty associated with multicollinearity. Two very
similar data sets (A,B) are tabulated in Belsley based on this model. For set A ordinary
least squares (OLS) gives:
    y = 1.26 + 0.97 x_1 + 9.0 x_2 - 38.4 x_3.
The fit as measured by R^2 is very good at 0.992 but the underlying model is far from
being uncovered. In particular, the coefficient of x_2 is 15 times too high and two of the
coefficients have the wrong sign! Getting the signs wrong is very serious if one is trying to
understand how variables are related to each other. Turning to the least volume approach
we find:
    y = 1.20 - 0.43 x_1 + 0.37 x_2 + 1.97 x_3.

We now have all the correct signs and the magnitudes are much closer to the true ones.
Repeating this for data set B:
    OLS:           y = 1.275 + 0.25 x_1 + 4.5 x_2 - 17.6 x_3

    Least volume:  y = 1.20 - 0.43 x_1 + 0.37 x_2 + 1.98 x_3.
Once again the least volume approach produces a superior model. Moreover it is also
worth noting that the two OLS models are very different from each other whereas the
least volume models seem to be more stable to small variations in the data. This is
noteworthy because of how similar the two data sets were: the y-values were identical
for sets A and B, and the x-values never varied by more than one in the third digit.
Thus our method seems to be much more stable than OLS. Of course a comprehensive
set of Monte Carlo simulations is required to fully explore this aspect.
5  Conclusion
We have presented a fitting method whose criterion combines the deviations in each
dimension by multiplying them together. This simple device means that re-scaling of
any of the variables e.g. by changing the units of measurement, will give rise to an
equivalent model. This property of units-invariance is not shared by the total least
squares approach (or orthogonal regression, where the sum of the perpendicular distances
to the fitted plane is minimised). By taking the product of the deviations we ensure
that all variables are treated on the same basis and this is useful if the purpose is to find
an underlying relationship rather than to predict one of the variables.
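The units-invariance property can be checked numerically: rescaling a variable while rescaling its coefficient inversely leaves every residual unchanged and multiplies the criterion by a constant, so the minimising hyperplane is the same. A small self-contained sketch (names are illustrative; the exponent p follows the volume interpretation used here):

```python
import numpy as np

def volume_objective(a, X):
    """Least volume criterion sum_i |sum_j a_j x_ij - 1|^p / prod_j a_j,
    with p the number of variables (a sketch; all a_j > 0 assumed)."""
    p = X.shape[1]
    r = X @ a - 1.0
    return np.sum(np.abs(r) ** p) / np.prod(a)

# Change the units of variable 0: scale its column by c and divide a_0 by c.
# The products a_0 * x_i0 are unchanged, so every residual r_i is unchanged,
# while prod_j a_j is divided by c; the objective is multiplied by c exactly.
X = np.array([[0.5, 0.1], [0.2, 0.2], [0.35, 0.15]])
a = np.array([2.1, 2.9])
c = 1000.0                       # e.g. metres -> millimetres in column 0
Xs = X.copy()
Xs[:, 0] *= c
a_s = a.copy()
a_s[0] /= c
ratio = volume_objective(a_s, Xs) / volume_objective(a, X)
```

Since the objective is scaled by a constant, its minimiser describes the same fitted hyperplane in the rescaled units: the criterion is units-invariant.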
When we applied the technique to data we were able to recover the underlying relationship much more closely than when least squares was used. Not only were the signs
of the coefficients correctly reproduced (which is crucial for understanding directions of
change) but also the magnitudes were much closer to the true values than least squares
estimates. It appears that the least volume method may be superior when there is multicollinearity in the data. Much more simulation needs to be done to investigate this
potentially very valuable feature.
Model fitting using the least volume criterion
Bibliography
1. G. J. Babu and E. D. Feigelson, Analytical and Monte Carlo comparisons of six
different linear least squares fits, Communications in Statistics: Simulation and
Computation, 21 (2) (1992), 533-549.
2. F. Barker, Y. C. Soh, and R. J. Evans, Properties of the geometric mean functional
relationship, Biometrics 44 (1988), 279-281.
3. D. A. Belsley, Conditioning Diagnostics, Wiley, New York, 1991.
4. C-L Cheng and J. W. Van Ness, Statistical Regression with Measurement Error,
Arnold, London, 1999.
5. N. R. Draper and H. Smith, Applied Regression Analysis (3rd edition), Wiley, New
York, 1998.
6. N. R. Draper and Y. Yang, Generalization of the geometric mean functional relationship, Computational Statistics and Data Analysis 23 (1997), 355-372.
7. P. D. Greenall, The concept of equivalent scores in similar tests, British
J. of Psychology: Statistical Section 2 (1949), 30-40.
8. E. Grosse, A catalogue of algorithms for approximation, in Algorithms for
Approximation II, eds. J. C. Mason and M. G. Cox, 1989.
9. R. M. Hirsch and E. J. Gilroy, Methods of fitting a straight line to data: examples
in water resources, Water Resources Bulletin 20 (5) (1984), 705-711.
10. M. G. Kendall and A. Stuart, The Advanced Theory of Statistics, 4th edition, vol.2,
391-409, Griffin, London, 1979.
11. D. K. Kimura, Symmetry and scale dependence in functional relationship regression,
Systematic Biology 41 (2) (1992), 233-241.
12. W. H. Kruskal, On the uniqueness of the line of organic correlation, Biometrics 9
(1953), 47-58.
13. R. L. Miller and J. S. Kahn, Statistical Analysis in the Geological Sciences, Wiley,
NY, 1962.
14. A. S. Otis, The method for finding the correspondence between scores in two tests,
J. of Educational Psychology XIII (1922), 524-545.
15. J. M. V. Rayner, Linear relations in biomechanics: the statistics of scaling functions,
J. Zool., Lond. 206 (1985), 415-439.
16. W. E. Ricker, Linear regressions in fishery research, J. Fisheries Research Board of
Canada 30 (1973), 409-434.
17. W. E. Ricker, Computation and uses of central trend lines, Canadian J. of Zoology
62 (1984), 1897-1905.
18. I. M. Stancu-Minasian, Fractional Programming: Theory, Methods and Applications,
Kluwer Academic, Dordrecht, 1997.
19. G. Stromberg, Accidental and systematic errors in spectroscopic absolute magnitudes for dwarf G0-K2 stars, Astrophysical J. 92 (1940), 156-169.
20. G. Teissier, La relation d'allometrie, Biometrics 4 (1) (1948), 14-48.
Some problems in orthogonal distance and nonorthogonal distance regression
G. A. Watson
Department of Mathematics, University of Dundee, Dundee DDl 4HN, Scotland.
gawatson@maths.dundee.ac.uk
Abstract
Of interest here is the problem of fitting a curve or surface to given data by minimizing
some norm of the distances from the points to the surface. These distances may be
measured orthogonally to the surface, giving orthogonal distance regression, and for this
problem, the least squares norm has attracted most attention. Here we will look at two
other important criteria, the l_1 norm and the Chebyshev norm. The former is of value
when the data contain wild points, the latter in the context of accept/reject criteria.
There are however circumstances when it is not appropriate to force the distances to
be orthogonal, and two such possibilities are also considered. The first arises when
the distances are aligned with certain fixed directions, and the second when angular
information is available about the measured data points. For the least squares norm, we
will consider some algorithmic developments for these problems.
1  Introduction
Of interest here is the problem of fitting to given data a curve or surface which depends on
a vector a \in R^n of parameters. The underlying approach is such that (1) a point on the
surface is associated with each data point, (2) the fit of the surface is measured by a norm
of the vector whose components are the distances between each pair of corresponding
points, (3) the (correct) Gauss-Newton steps in a are used as a basis for minimizing
this norm. The distances may be orthogonal to the surface, giving orthogonal distance
regression (ODR), or may be forced to satisfy some other criterion which makes them
non-orthogonal in general. We consider both situations.
For the ODR problem, most attention has been given to the least squares norm (eg
[5], [8], [9], [16], [17], [22]). Here we will look at two other important criteria, the l_1 norm
and the Chebyshev norm. The former is of value when the data contain wild points, the
latter in the context of accept/reject criteria. For the non-orthogonal distance problem
we will restrict attention to the least squares case.
In terms of a vector a \in R^n of parameters, the curve or surface may be defined in
two ways, (a) parametrically, when a point x on the surface is given by
x = x(a,t),
with t the parameters whose values define the particular point, or (b) implicitly, when
the surface is defined by the set of points x satisfying the scalar equation
    f(a, x) = 0.
It is also assumed here that the expressions required in these representations are differentiable functions of their parameters.
2  l_1 and l_\infty ODR
Consider first the l_1 case. Then the problem is

    minimize \sum_{i=1}^m ||x_i - z_i(a)||,

where the points z_i(a) are the nearest points to x_i on the surface defined by a, and
where we will assume throughout that unadorned norms are Euclidean norms. Let

    \delta_i = ||x_i - z_i(a)||,  i = 1, ..., m.
Then the problem is effectively now defined in terms of the vector a alone. It is easy
to calculate the correct Gauss-Newton step in a, which minimizes

    ||\delta + \nabla_a \delta \, d||_1

with respect to d. Now

    \nabla_a \delta_i = -\frac{(x_i - z_i(a))^T}{\delta_i} \nabla_a z_i(a),  \delta_i \neq 0,
so that there are potential problems if any \delta_i \to 0. Given the nature of the l_1 problem, we cannot exclude that possibility. In fact although \delta is not a smooth function,
because derivative discontinuities only occur at zero values it is a strong semi-smooth
function, as defined in [12]. Ideas from smooth analysis and from strong semi-smooth
analysis as developed in [11] can then be combined to give a local convergence analysis
for the present problem. Fast local convergence for the usual smooth problem relies on
strong uniqueness [4]; for the l_1 norm, this can be interpreted in terms of a requirement
that the sequence of solutions d^k is "well-behaved" in a certain sense [1]. An analogous
requirement can be stated here.
Let the current approximation be a^k and let J^k denote the Jacobian matrix \nabla_a \delta(a^k),
assuming this exists. Then the Gauss-Newton step d^k minimizes

    ||\delta(a^k) + J^k d||_1.

It is well known (see for example [18]) that if J^k has full rank then there always exists
a solution d^k and an index set Z^k containing n indices such that

    \delta_i(a^k) + e_i^T J^k d^k = 0,  i \in Z^k,

where e_i is the ith coordinate vector. Let a* be a limit point of the iteration. Then for
a^k close enough to a*, assume that J^k exists and
(i) \delta(a^k) + J^k d^k has exactly n zeros, corresponding to an index set Z^k,
(ii) Z^k = Z*, independent of k,
(iii) the n x n matrices whose rows are e_i^T J^k, i \in Z*, are bounded away from singularity.
In practice these conditions ensure that d^k is unique, and there is no redundancy in the
zero components. An analysis is given in [21] for both parametric and implicit fitting.
The main result is the following.
Theorem 2.1 [21] Let the Gauss-Newton method produce a sequence a^k \to a*, where
\delta(a^k) has no zero components, and let (i)-(iii) above hold. In the parametric case, assume
that for all i \in Z*, there exists a unique unit normal vector n_i (up to change of sign) at
the point z_i on the surface defined by a*. Then the (undamped) Gauss-Newton method
converges to a* at a second order rate.
The significance of this result is that, for both parametric and implicit fitting, any
\delta_i tending to zero is not by itself necessarily an obstacle to good performance of the
Gauss-Newton method in the l_1 case. What is more significant is the possibility of very
slow convergence, and this has more to do with the number of those zero components
of \delta at a limit point, rather than just their presence. A fundamental requirement for
the condition (ii) is that the number of zero components of \delta(a*) is n. Of course, this
condition is a rather special one and, for many problems, will not be satisfied. There is
slow (possibly very slow) convergence associated with this case.
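The l_1 Gauss-Newton subproblem min_d ||\delta(a^k) + J^k d||_1 is itself a linear programme, so each step can be computed with an off-the-shelf LP solver. A minimal sketch (function name illustrative; SciPy assumed available):

```python
import numpy as np
from scipy.optimize import linprog

def l1_gauss_newton_step(delta, J):
    """Solve min_d ||delta + J d||_1 as a linear programme in (d, t):
    minimise sum_i t_i subject to -t <= delta + J d <= t, t >= 0."""
    m, n = J.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])
    A_ub = np.block([[J, -np.eye(m)],     #  delta + J d - t <= 0
                     [-J, -np.eye(m)]])   # -delta - J d - t <= 0
    b_ub = np.concatenate([-delta, delta])
    bounds = [(None, None)] * n + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]
```

For a single-column Jacobian of ones the step reduces to the median of the negated residuals, the familiar l_1 location estimate, which gives a quick sanity check of the formulation.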
Turning now to the l_\infty problem, this can be stated

    minimize  max_i ||x_i - z_i(a)||,

with z_i(a) defined as before. Again \delta_i = ||x_i - z_i(a)|| is not a smooth function, but a
solution normally occurs in a region where \delta is smooth. Therefore the problem does not
differ significantly from the usual nonlinear minimax problem: the main requirement for
fast local convergence is that at a limit point the norm is attained at n + 1 indices [4].
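The linearized subproblem in a Gauss-Newton approach to this minimax problem, min_d max_i |\delta_i + (Jd)_i|, is also a linear programme. A minimal sketch under the same assumptions as before (function name illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def linf_gauss_newton_step(delta, J):
    """Solve min_d max_i |delta_i + (J d)_i| as a linear programme in (d, t):
    minimise the scalar t subject to -t <= delta + J d <= t."""
    m, n = J.shape
    c = np.concatenate([np.zeros(n), [1.0]])
    col = -np.ones((m, 1))
    A_ub = np.block([[J, col], [-J, col]])
    b_ub = np.concatenate([-delta, delta])
    bounds = [(None, None)] * n + [(0, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[n]
```

With a single-column Jacobian of ones the optimal step is minus the midrange of the residuals, the Chebyshev analogue of the median in the l_1 case.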
Two simple examples in 2 dimensions are given by way of illustration. A standard
line search is incorporated to force global convergence, although trust region methods
are a popular alternative. Indeed, local convergence is the main concern here, and we
have not begun to address important issues to do with the development of robust general
purpose algorithms.
Example 2.2 Consider the Spath data set [13] (m = 7), and consider fitting an ellipse
defined implicitly, using the l_\infty and l_1 norms. The solutions are illustrated in Figure 1,
where the dashed ellipse and dashed lines are the l_\infty solution and corresponding orthogonal directions, and the solid ellipse and solid lines are the l_1 solution and corresponding
directions. Both ellipses were obtained using the Gauss-Newton method starting from
the circle centre (5,5), radius 2, in 4 and 5 iterations respectively for 5 figure accuracy.
Example 2.3 Consider next the GGS data set [6], which has m = 8. Similar fits to
those for Example 1 are shown in Figure 2. Again the Gauss-Newton method was used
starting from the circle centre (5,5), radius 2, to give convergence in 6 iterations (l_\infty)
and 7 iterations (l_1).
For both these examples n = 5, and favourable conditions hold so that there is
quadratic convergence both in the l_1 and l_\infty cases. Otherwise, the key to recovering fast
FIG. 1. l_1 and l_\infty fits to Spath data set.
local convergence in the l_1 case is to identify Z* and to reformulate the problem locally
as

    minimize \sum_{i \notin Z*} ||x_i - z_i(a)||  subject to  x_i - z_i(a) = 0,  i \in Z*.    (2.1)
A similar remedy in the l_\infty case is as follows. For a limit point a* of the iteration, let

    I* = { i : \delta_i(a*) = max_l \delta_l(a*) }.

Then if we can identify I*, a* solves, for any j \in I*:

    minimize \delta_j(a)  subject to  \delta_i(a) - \delta_j(a) = 0,  i \in I* \ {j}.

Example 2.4 Fitting an l_\infty ODR line in R^3 to 100 random data points (equivalent
to finding the circumscribing cylinder of smallest radius) gives slow convergence of the
to finding the circumscribing cylinder of smallest radius) gives slow convergence of the
basic method, because |I*| = 3 and n = 4. But once we identify I* = {4, 42, 58}, only 5
iterations of the NAG Fortran subroutine E04UCF are required for 6 figure accuracy.
3  Non-orthogonal l_2 distance regression

3.1  Using fixed directions
Suppose that the data come from sampling the surface of a manufactured part, using
a coordinate measuring machine with a touch probe. It has been argued by Hulting
[10] that choosing the directions to be the known probe directions v_i (relative to a
fixed frame of reference) not only makes explicit use of the measurement design, but
FIG. 2. l_1 and l_\infty fits to GGS data set.
also complies with traditional fixed-regressor assumptions (enabling standard inference
theory to apply).
Let x_i, i = 1, ..., m as usual be the data points, and let z_i be the corresponding
points on the surface reached by travelling along the lines from x_i in the direction v_i.
Then we require to minimize ||\delta|| where

    \delta_i = ||x_i - z_i(a)||,  i = 1, ..., m,

with z_i(a) defined by

    z_i(a) - x_i = \delta_i v_i,  i = 1, ..., m,

where v_i satisfying v_i^T v_i = 1 is given for each i. In case of ambiguity, the smallest value
of \delta_i is chosen. The basic idea in efficient algorithmic development is again to treat the
problem as one in a alone, which can be solved as before by the Gauss-Newton method
(or variants). Let a be given. Then for each point x_i, the point where the line through
x_i in the direction v_i first cuts the surface can be obtained (this calculation replaces the
"footpoint problem" of calculating z_i(a) as the nearest point on the surface in the orthogonal
distance problem), giving \delta_i as a function of a. Methods based on Gauss-Newton steps
are developed for the parametric case in [19], [20], and for the implicit case in [7].
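For an implicitly defined surface, finding where the line x_i + \delta v_i first cuts the surface is a one-dimensional root-finding problem; for a circle it is simply a quadratic in \delta. A small illustrative sketch (function name and circle specialization are assumptions, not the paper's general algorithm):

```python
import numpy as np

def directional_distance_circle(x, v, centre, r):
    """Signed distance delta of smallest magnitude with x + delta*v on the
    circle ||z - centre|| = r, for a unit direction v; None if the line
    misses the circle.  This replaces the orthogonal footpoint computation
    when distances are measured along fixed probe directions."""
    w = x - centre
    b = w @ v                 # delta^2 + 2*b*delta + c = 0
    c = w @ w - r * r
    disc = b * b - c
    if disc < 0.0:
        return None           # line does not intersect the circle
    root = np.sqrt(disc)
    return min(-b + root, -b - root, key=abs)
```

Taking the root of smallest magnitude implements the "smallest value of \delta_i" rule used in case of ambiguity.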
By way of illustration, the 2 data sets previously considered in Examples 1 and 2
are used to fit ellipses defined implicitly with a particular choice of directions v_i. The
initial (circles) and final ellipses (together with the data points and the directions v_i)
are shown in Figures 3 and 4. The calculations needed respectively 19 and 17 iterations,
reflecting the fact that, unlike the l_1 and l_\infty cases, the convergence rate is linear.
FIG. 3. l_2 fit to Spath data set: fixed v_i.

FIG. 4. l_2 fit to GGS data set: fixed v_i.

3.2  Using angular information
Berman and Griffiths [2, 3] consider fitting a circle when angular differences between
successively measured data points are known, with applications in physics and archaeology. This fitting problem has been extended to the case of ellipses and ellipsoids by
Spath in [14, 15] and it is this kind of problem which is of interest here. The methods
of [14] and [15] are based on the alternating algorithm, and while this can be perhaps
surprisingly effective (particularly with a reparameterization of the problem), we consider here a correct separated Gauss-Newton method similar to that used before. In
addition to (usually) better local convergence properties, standard step-length control
can be incorporated.
To illustrate, consider fitting an ellipse in general position. It is convenient to do this
by allowing the data to rotate, and fitting to those an ellipse in normal position, aligned
with the axes. Let (x, y) denote the components of x. Then we work with the data

    x_i(\phi) = x_i \cos\phi + y_i \sin\phi,    y_i(\phi) = -x_i \sin\phi + y_i \cos\phi,
for i = 1, ..., m, where \phi is an unknown parameter. Therefore we require to minimize,
with respect to the 6 parameters a, b, p, q, \alpha, \phi, the function

    \sum_{i=1}^m [ (x_i(\phi) - a - p\cos(\alpha + t_i))^2 + (y_i(\phi) - b - q\sin(\alpha + t_i))^2 ],
where the numbers t_i are given. Because (\alpha + t_{i+1}) - (\alpha + t_i) = t_{i+1} - t_i, for each i, we
can interpret this as saying that the angular differences are known, with a degree of
freedom given by the parameter \alpha. Note that at a solution to this problem, the directions
between pairs of points (x_i(\phi), y_i(\phi)) and the corresponding points on the ellipse will
not generally be orthogonal to the ellipse.
Differentiating the above expression with respect to a, p and with respect to b, q gives
the two linear systems

    A_1 (a, p)^T = c_1,    (3.1)

where

    A_1 = [ m                               \sum_{i=1}^m \cos(\alpha + t_i)
            \sum_{i=1}^m \cos(\alpha + t_i)   \sum_{i=1}^m \cos^2(\alpha + t_i) ],

    c_1 = ( \sum_{i=1}^m x_i(\phi),  \sum_{i=1}^m x_i(\phi) \cos(\alpha + t_i) )^T,

and

    A_2 (b, q)^T = c_2,    (3.2)

where

    A_2 = [ m                               \sum_{i=1}^m \sin(\alpha + t_i)
            \sum_{i=1}^m \sin(\alpha + t_i)   \sum_{i=1}^m \sin^2(\alpha + t_i) ],

    c_2 = ( \sum_{i=1}^m y_i(\phi),  \sum_{i=1}^m y_i(\phi) \sin(\alpha + t_i) )^T.
Then (3.1) and (3.2) give (a, b, p, q) as functions of \alpha and \phi, provided that A_1 and A_2
are nonsingular: this will be assumed. For given \alpha and \phi, we can therefore define the
function to be minimized as

    F(\alpha, \phi) = ||\delta(\alpha, \phi)||,

where

    \delta_i = ||w_i||,  i = 1, ..., m,    (3.3)

with

    w_i = ( x_i(\phi) - a - p\cos(\alpha + t_i),  y_i(\phi) - b - q\sin(\alpha + t_i) )^T,
and with a, b, p, q defined by (3.1) and (3.2). Then we can apply the Gauss-Newton
method to the minimization of F(\alpha, \phi). The basic step d = (\delta\alpha, \delta\phi)^T is given by finding

    min_{d \in R^2} ||\delta + J d||,    (3.4)
where J \in R^{m x 2} has ith row given by

    e_i^T J = \nabla_{\alpha,\phi} \delta_i(\alpha, \phi),  i = 1, ..., m.

Now

    \nabla_{\alpha,\phi} \delta_i(\alpha, \phi) = \frac{w_i^T}{\delta_i} ( \nabla_{\alpha,\phi} w_i + (\nabla_{a,b,p,q} w_i) M ),  \delta_i \neq 0,    (3.5)

where M = \nabla_{\alpha,\phi}(a, b, p, q) \in R^{4 x 2}.
It is easy to compute M from (3.1) and (3.2), which can be interpreted as identities in
\alpha and \phi. The details are omitted, but all the linear systems use just the matrices A_1
and A_2, and apart from the solution of (3.4) (a least squares problem in two variables),
there remains only evaluation of expressions.
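Given \alpha and \phi, solving (3.1) and (3.2) for (a, p) and (b, q) amounts to two 2x2 linear solves with the sums written out explicitly. A minimal sketch of this separable step (function name illustrative):

```python
import numpy as np

def solve_ellipse_linear_params(x, y, t, alpha, phi):
    """Solve the 2x2 systems (3.1)-(3.2) for a, p and b, q given alpha, phi
    and the known angular offsets t_i (the normal equations of the
    least squares problem in the four linear parameters)."""
    xr = x * np.cos(phi) + y * np.sin(phi)     # rotated data x_i(phi)
    yr = -x * np.sin(phi) + y * np.cos(phi)    # rotated data y_i(phi)
    cs, sn = np.cos(alpha + t), np.sin(alpha + t)
    m = float(len(t))
    A1 = np.array([[m, cs.sum()], [cs.sum(), (cs ** 2).sum()]])
    c1 = np.array([xr.sum(), (xr * cs).sum()])
    A2 = np.array([[m, sn.sum()], [sn.sum(), (sn ** 2).sum()]])
    c2 = np.array([yr.sum(), (yr * sn).sum()])
    a, p = np.linalg.solve(A1, c1)
    b, q = np.linalg.solve(A2, c2)
    return a, b, p, q
```

For data lying exactly on an axis-aligned ellipse with \alpha = \phi = 0 the residuals vanish, so the solve recovers the centre (a, b) and semi-axes (p, q) exactly.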
Example 3.1 Consider Example 1 from [14], which has m = 11. Starting from \alpha =
0, \phi = 0, 15 iterations are required to satisfy the stopping criterion ||d||_\infty < 0.001.
The resulting value of ||\delta||^2 is 7.7211, with a = 2.1253, b = -0.1700, p = 4.1281, q =
3.0931, \alpha = 13.2348°, \phi = 34.7309°.
Example 3.2 Next consider Example 2 from [14], which has m = 8. Again starting
from \alpha = 0, \phi = 0, 9 iterations are required to satisfy the stopping criterion ||d||_\infty <
0.001. The resulting value of ||\delta||^2 is 4.4946, with a = 4.3608, b = 1.9537, p = 5.3717, q =
3.3704, \alpha = -0.6215°, \phi = 26.3889°.
4  Conclusions
We have examined some aspects of fitting curves and surfaces to given data. The underlying criterion involves associating with each data point a point on the surface and
minimizing some norm of the vector whose components are the distances between pairs
of points. The distances can be orthogonal to the surface, or fixed in some other way. But
the problems have in common that methods based on separated Gauss-Newton steps can
readily be developed.
Bibliography
1. Anderson, D. H. and M. R. Osborne, Discrete, non-linear approximation problems
in polyhedral norms, Num. Math. 28, 143-156 (1977).
2. Berman, M., Estimating the parameters of a circle when angular differences are
known, Appl. Statist. 32, 1-6 (1983).
3. Berman, M. and D. Griffiths, Incorporating angular information into models for
stone circle data, Appl. Statist. 34, 237-245 (1985).
4. Cromme, L., Strong uniqueness: a far reaching criterion for the convergence of
iterative processes, Numer. Math. 29, 179-193 (1978).
5. Forbes, A. B., Least squares best fit geometric elements, in Algorithms for Approximation II, eds. J. C. Mason and M. G. Cox, Chapman and Hall, London, 311-319
(1990).
6. Gander, W., G. H. Golub and R. Strebel, Fitting of circles and ellipses: least squares
solution, BIT 34, 556-577 (1994).
7. Gulliksson, M., I. Soderkvist and G. A. Watson, Implicit surface fitting using
directional constraints, BIT 41, 331-344 (2001).
8. Helfrich, H.-P. and D. Zwick, A trust region method for implicit orthogonal distance regression, Numer. Alg. 5, 535-545 (1993).
9. Helfrich, H.-P. and D. Zwick, A trust region algorithm for parametric curve and
surface fitting, J. Comp. Appl. Math. 73, 119-134 (1996).
10. Hulting, F. L., Discussion contribution to the paper by M. M. Dowling, P. M.
Griffin, K.-L. Tsui and C. Zhou, Statistical issues in geometric feature inspection
using coordinate measuring machines, Technometrics 39, 18-20 (1997).
11. Qi, L., Convergence analysis of some algorithms for solving nonsmooth equations,
Math. of Operations Research 18, 227-244 (1993).
12. Qi, L. and G. Jiang, Semismooth Karush-Kuhn-Tucker equations and convergence
analysis of Newton methods and quasi-Newton methods for solving these equations, Math. of Operations Research 22, 301-325 (1997).
13. Spath, H., Least squares fitting by circles, Computing 57, 179-185 (1996).
14. Spath, H., Estimating the parameters of an ellipse when angular differences are
known, Comput. Stat. 14, 491-500 (1999).
15. Spath, H., Least squares fitting of spheres and ellipsoids using not orthogonal distances, Math. Comm. 6, 89-96 (2001).
16. Turner, D. A., The Approximation of Cartesian Co-ordinate Data by Parametric
Orthogonal Distance Regression, PhD Thesis, University of Huddersfield (1999).
17. Turner, D. A., I. J. Anderson, J. C. Mason, M. G. Cox and A. B. Forbes, An efficient separation-of-variables approach to parametric orthogonal distance regression, in Advanced Mathematical and Computational Tools in Metrology IV, eds P.
Ciarlini, A. B. Forbes, F. Pavese and D. Richter, Series on Advances in Mathematics for Applied Sciences, Volume 53, World Scientific, Singapore, 246-255 (2000).
18. Watson, G. A., Approximation Theory and Numerical Methods, John Wiley,
Chichester (1980).
19. Watson, G. A., Least squares fitting of circles and ellipses to measured data, BIT
39, 176-191 (1999).
20. Watson, G. A., Least squares fitting of parametric surfaces to measured data,
ANZIAM J 42 (E), C68-C95 (2000).
21. Watson, G. A., On the Gauss-Newton method for l_1 orthogonal distance regression,
IMA J. Num. Anal. (to appear).
22. Zwick, D. S., Applications of orthogonal distance regression in metrology, in Recent
Advances in Total Least Squares and Errors-in-Variables Techniques, ed. S. Van
Huffel, SIAM, Philadelphia, pp. 265-272 (1997).
Chapter 6
Splines and Wavelets
Nonlinear multiscale transformations: From
synchronization to error control
F. Arandiga and R. Donat
Dept. Matematica Aplicada, University of Valencia, Spain.
arandiga@uv.es donat@uv.es
Abstract
Data-dependent interpolatory techniques can be used in the reconstruction step of a
multiresolution "à la Harten". These interpolatory techniques lead to nonlinear multiresolution schemes. When dealing with nonlinear algorithms, the issue of stability
needs to be carefully considered. In this paper we analyze and compare several strategies
for image compression and their ability to effectively control the global error due to
compression.
1  Introduction
Multiscale transformations have in recent times been used in the first step of transform
coding algorithms for image compression. Ideally, a multiscale transformation allows for
an efficient representation of the image data, which is then processed using a (nonreversible) quantizer and passed on to the encoder which produces the final compressed
set of data which is ready to be transmitted or stored. Compression is indeed achieved
during the second and third steps: the quantization and the encoding of the transformed
set of discrete data.
It is quite clear that the properties of the multiscale transformation are central to the overall performance of the transform coding algorithm. Until recently, the
multiscale transformations used for image compression were always based on linear filter
banks; however, the nonlinear alternative has lately been explored by various authors
from different points of view, and preliminary results show it to be very
promising [12, 8, 6, 2, 3]. The key question when using, or even designing, a nonlinear
multiscale transformation is that of stability. In order for such transformations to be
useful tools in image coding, it is absolutely necessary to keep a tight control on the
effect of quantization errors in the decoding process.
In this paper we examine the question of stability for nonlinear multiscale transformations within Harten's framework for multiresolution [14, 15]. Harten's framework
is broad enough to include all classical wavelet transformations as particular cases (just
as it happens in the Lifting framework of W. Sweldens [17], developed slightly later in
time but independently), however the design of the multiscale transformation is done
directly on the spatial domain.
The building blocks of Harten's multiresolution framework are two operators that
connect adjacent resolution levels. The Decimation (or also, Restriction) operator is a
linear operator which acts as a low-pass filter, extracting low-resolution information from
a discrete data set. The Prediction operator (also Projection) uses low-resolution data to
predict discrete data at a higher resolution level. It is precisely the design of this operator
that distinguishes Harten's framework from all other multiresolution frameworks. The
prediction operator is based on a consistent Reconstruction technique, and this opens up
a tremendous number of possibilities in the design of multiresolution schemes. The use
of the reconstruction process as a design tool makes it, conceptually, a simple matter
to introduce adaptivity into the multiscale transformation; we only need to make the
reconstruction process data-dependent [5, 4, 14].
This paper is organized as follows. In Section 2 we recall the so-called cell-average
framework, an appropriate setting for image compression, and describe a class of nonlinear prediction operators obtained by mean-average interpolation [10,14,15]. In Section 3
we examine the question of stability for nonlinear multiscale transformations and relate
it to the synchronization of the data-dependent choices made in the encoder and the
decoder. We also include a set of numerical experiments that illustrate the performance
of several nonlinear multiscale transformations.
2  Multiscale transformations in the cell-average setting
Harten's general framework for multiresolution [15] relies on two operators, Decimation
and Prediction, that define the basic interscale relations. These operators act on finite
dimensional linear vector spaces, V^j, that represent the different resolution levels (j
increasing implies more resolution),

    (a) D_j : V^j \to V^{j-1},    (b) P_j : V^{j-1} \to V^j,    (2.1)

and must satisfy two requirements of an algebraic nature; D_j needs to be a linear operator
and D_j P_j = I_{V^{j-1}}, i.e., the identity operator on the lower resolution level represented
by V^{j-1}. For all practical purposes, V^j can be considered as spaces of finite dimensional
sequences.
Using these two operators, a vector (i.e., a discrete sequence) v^j \in V^j can be decomposed and reassembled as follows:

    (a)  v^{j-1} = D_j v^j,   e^j = v^j - P_j v^{j-1},
    (b)  v^j = P_j v^{j-1} + e^j,    (2.2)

where e^j represents the error in trying to predict the jth level data, v^j, from the low
resolution data v^{j-1} = D_j v^j, using the prediction operator P_j.
In the cell-average setting, the discrete data are interpreted as the cell-averages of
a function on an underlying grid, which determines the level of resolution of the given
data. The one dimensional case, in which one considers a set of nested dyadic grids on
the interval [0,1], {X^j}, j \ge 0, of size h_j = 2^{-j} h_0,

    X^j = {x_i^j},   x_i^j = i h_j,   i = 0, ..., N_j,   N_j h_j = 1,    (2.3)
is the easiest one to describe, and it is also directly applicable to two-dimensional (2D)
data via tensor product [2, 3] (the cell-average framework in several dimensions and
non-tensor product (unstructured) grids is considered in e.g. [1]).
In this simple one-dimensional setting, the cell-average framework is characterized by
the following decimation operator:

    (D_j v^j)_i = \frac{1}{2}(v^j_{2i-1} + v^j_{2i}),   1 \le i \le N_{j-1},    (2.4)

where N_j is the number of equally spaced intervals on X^j, the grid on [0,1] that represents the jth resolution level. The consistency requirement for the prediction operator,
i.e., D_j P_j = I_{V^{j-1}}, which is the only necessary requirement for the prediction in Harten's
framework, becomes then

    \frac{1}{2}( (P_j v^{j-1})_{2i-1} + (P_j v^{j-1})_{2i} ) = v^{j-1}_i,   1 \le i \le N_{j-1}.    (2.5)
Observe that (2.4) and (2.5) imply that

    e^j_{2i-1} = v^j_{2i-1} - (P_j v^{j-1})_{2i-1} = (P_j v^{j-1})_{2i} - v^j_{2i} = -e^j_{2i}.

Therefore the prediction errors at even and odd grid points on the jth level in (2.2)
are not independent. By considering only the prediction errors at (for example) the
odd points of the grid X^j, one immediately gets a one-to-one correspondence between
the sets {v^j_i}_{i=1}^{N_j} and {{v^{j-1}_i}, {d^j_i}}, with d^j_i = e^j_{2i-1}, and v^{j-1} = D_j v^j. The
one-dimensional multiscale transformation and its inverse can be written as follows.
one-dimensional multiscale transformation and its inverse can be written as follows.
Mv'
(^°,d\...,d^)
I
FoTJ = L,...,l
For i = 1,... ,Nj-i
+ <-i)/2
iPjV'-')2i
"ii-i
(R
Vd = iv'',d\:..,d^)^M-'vd
Forj = l,...,L
For i = l,...,Nj-i
{PjV^-')2i-l + di
''2i-l
''2i
= 2vr
(2.6)
(2.7)
^2i-l
Observe that since d^j_i = e^j_{2i-1} = -e^j_{2i}, the consistency relation (2.5) implies that the
computation of v^j_{2i} in (2.7) is equivalent to

    v^j_{2i} = 2 v^{j-1}_i - v^j_{2i-1} = (P_j v^{j-1})_{2i} - d^j_i = (P_j v^{j-1})_{2i} + e^j_{2i}.    (2.8)

Therefore (2.6) and (2.7) are just the repeated application of the decomposition and reassembling specified in (2.2)(a) and (2.2)(b). Thus (2.6) defines a multiscale transformation
and (2.7) is the inverse transformation, whether or not the prediction operator is linear.
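The transformations (2.6) and (2.7) are straightforward to implement for any consistent prediction operator, and perfect reconstruction holds by construction whether or not the predictor is linear. A minimal sketch (function names illustrative), using the simplest consistent prediction, which just copies each coarse cell average:

```python
import numpy as np

def decimate(v):
    """Decimation (2.4): pairwise cell averages."""
    return 0.5 * (v[0::2] + v[1::2])

def predict_copy(vc):
    """Simplest consistent prediction: repeat each coarse average, so the
    pair (2i-1, 2i) trivially averages back to v_i (relation (2.5))."""
    return np.repeat(vc, 2)

def forward(v, predict, L):
    """Multiscale transform (2.6): v^L -> (v^0, d^1, ..., d^L)."""
    details = []
    for _ in range(L):
        vc = decimate(v)
        details.append(v[0::2] - predict(vc)[0::2])  # errors at odd points
        v = vc
    return v, details[::-1]

def inverse(v0, details, predict):
    """Inverse transform (2.7)."""
    v = v0
    for d in details:
        fine = np.empty(2 * v.size)
        fine[0::2] = predict(v)[0::2] + d            # odd points
        fine[1::2] = 2.0 * v - fine[0::2]            # consistency (2.8)
        v = fine
    return v
```

Replacing `predict_copy` by any consistent, possibly data-dependent, prediction leaves the round trip exact, which is the point of (2.8).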
Next, we follow [4, 14, 15] to describe a class of linear prediction operators that leads
to the (1, M) branch of the Cohen-Daubechies-Feauveau family [7], which is biorthogonal
to the box function [11, 15]. This class is also considered in [6] within the lifting framework, where it is described as a particular case of Donoho's average interpolation [9].
Given an integer s \ge 1, for each 1 \le i \le N_{j-1} we construct a polynomial, p_i(x), of
degree 2s such that

    \frac{1}{h_{j-1}} \int_{x^{j-1}_{i+l-1}}^{x^{j-1}_{i+l}} p_i(x) dx = v^{j-1}_{i+l},   for l = -s, ..., s.    (2.9)

There are various ways to prove that p_i(x) in (2.9) always exists and is uniquely
defined by the 2s + 1 conditions in (2.9) [1, 9, 14]. Then we define

    (P_j v^{j-1})_{2i-1} = \frac{1}{h_j} \int_{x^j_{2i-2}}^{x^j_{2i-1}} p_i(x) dx,   (P_j v^{j-1})_{2i} = \frac{1}{h_j} \int_{x^j_{2i-1}}^{x^j_{2i}} p_i(x) dx.    (2.10)
The prediction operator defined by (2.10) is data-independent, hence linear, and
it clearly satisfies the consistency relation (2.5). It can be shown that the multiscale
transformations (2.6) and (2.7) for this class of prediction operators turn out to be the
(1, M = 2s + 1) branch of the Cohen-Daubechies-Feauveau family.
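For s = 1 the half-cell averages of the quadratic p_i(x) work out to a simple correction of the coarse cell average, a standard result of average interpolation. A sketch of the resulting interior-cell prediction (endpoint treatment is omitted here for brevity and is an assumption of this sketch):

```python
import numpy as np

def predict_s1(vc):
    """Average-interpolation prediction for s = 1 on interior cells:
        (P v)_{2i-1} = v_i + (v_{i-1} - v_{i+1})/8,
        (P v)_{2i}   = v_i - (v_{i-1} - v_{i+1})/8.
    Each pair averages back to v_i, so the consistency relation (2.5)
    holds.  The two endpoint cells are handled by plain copying."""
    fine = np.repeat(vc.astype(float), 2)
    corr = (vc[:-2] - vc[2:]) / 8.0          # (v_{i-1} - v_{i+1})/8
    fine[2:-2:2] = vc[1:-1] + corr           # odd (1-based) fine cells
    fine[3:-2:2] = vc[1:-1] - corr           # even fine cells
    return fine
```

On interior cells this prediction reproduces exactly the fine cell averages of any quadratic, consistent with the degree-2s accuracy of (2.9)-(2.10) for s = 1.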
A nonlinear prediction operator is obtained if we construct p_i(x) in a data-dependent
way. An example of a nonlinear multiresolution transformation constructed in this fashion
is considered in [14, 4, 2], where a nonlinear ENO-type technique (Essentially Non-Oscillatory, see [16]) is used to construct p_i(x). The key idea, which is in essence common
to the approach used in designing nonlinear filter banks, is to avoid using data across
an edge for the prediction step.
The nonlinear ENO technique is best described if we associate to each polynomial piece p_i(x) a stencil, S_i, which is the set of indices of the values used to define p_i(x). In the linear case S_i = {i - s, ..., i + s}; the stencil is independent of the data set {v^{j-1}} and, as a consequence, P_j is a linear operator. In the ENO technique described in [16], the selection of the stencil is made in a data-dependent way, using the divided differences of the data as a measure of its smoothness. Large divided differences occur when considering data across an edge, while divided (or undivided) differences of data on smoother regions tend to be smaller in size.
The information contained in the divided differences is then used to decide what S_i is for each i, with the only restriction that i ∈ S_i (to satisfy the consistency requirement (2.5)). We follow [4] and consider all polynomial pieces of the same degree. In our case #S_i = 2s + 1, but in principle one could decide to lower the degree of p_i(x), or that of some of its neighbours, whenever an edge-detection mechanism finds an edge at the ith interval. By lowering the degree of some polynomial pieces close to an edge, one can avoid crossing the edge in the prediction step, as much as possible. This option is closely related to the nonlinear multiscale transformation considered in [6] (within the lifting framework), where the nonlinearity comes in from adaptively choosing from the (1, M) family of linear filters.
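Hierarchical ENO stencil selection can be sketched as follows. This is a simplified illustration in the spirit of [16], not the exact procedure of the cited papers: the stencil grows one cell at a time toward the side with the smaller undivided difference, so i ∈ S_i holds by construction.

```python
import numpy as np

def eno_stencil(v, i, s):
    """Select a stencil of 2s+1 indices containing i by hierarchically
    extending toward the smoother side (smaller undivided differences)."""
    v = np.asarray(v, dtype=float)
    lo = hi = i
    for _ in range(2 * s):
        can_left = lo - 1 >= 0
        can_right = hi + 1 < v.size
        if can_left and can_right:
            # highest-order undivided difference of each candidate stencil
            dl = abs(np.diff(v[lo - 1:hi + 1], n=hi - lo + 1)[0])
            dr = abs(np.diff(v[lo:hi + 2], n=hi - lo + 1)[0])
            if dl < dr:
                lo -= 1                            # smoother on the left
            else:
                hi += 1                            # smoother on the right (or tie)
        elif can_left:
            lo -= 1
        else:
            hi += 1
    return list(range(lo, hi + 1))

# stencils on either side of a jump stay on their own side of the edge
v = np.array([0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0])
assert eno_stencil(v, 3, 1) == [1, 2, 3]
assert eno_stencil(v, 4, 1) == [4, 5, 6]
```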
Once S_i is determined (i ∈ S_i), p_i(x) can be uniquely determined when degree p_i(x) = #S_i - 1 [1], so that

    (1/h_{j-1}) \int_{x^{j-1}_{m-1}}^{x^{j-1}_{m}} p_i(x) dx = v^{j-1}_m,   for m ∈ S_i,      (2.11)

and the prediction operator is then defined by (2.10).
One can be slightly more 'sophisticated' in the design of the polynomial pieces. The Subcell Resolution technique [4, 13] allows one to account for discontinuities within a cell as follows. If an edge is detected in the ith cell, the polynomial piece p_i(x) is discarded and substituted by its left and right neighbours, p_{i-1}(x) and p_{i+1}(x), assuming that their respective stencils do not intersect, i.e. S_{i-1} ∩ S_{i+1} = ∅. If there is a true one-dimensional edge (a jump) in the ith cell, the function

    F(y) = (1/h_{j-1}) \int_{x^{j-1}_{i-1}}^{y} p_{i-1}(x) dx + (1/h_{j-1}) \int_{y}^{x^{j-1}_{i}} p_{i+1}(x) dx - v^{j-1}_i

will have a zero in the ith cell [13], say η, and the location of η is used to substitute the polynomial piece p_i(x) by the discontinuous piecewise polynomial function

    q_i(x) = p_{i-1}(x) for x ≤ η,   q_i(x) = p_{i+1}(x) for x > η.      (2.12)
The prediction operator is again defined by (2.10) at nonsingular cells (cells in which no edge has been detected), while at the singular cell

    (P_j v^{j-1})_{2i-1} = (1/h_j) \int_{x^j_{2i-2}}^{x^j_{2i-1}} q_i(x) dx,   (P_j v^{j-1})_{2i} = (1/h_j) \int_{x^j_{2i-1}}^{x^j_{2i}} q_i(x) dx.

In practice it is unnecessary to compute explicitly the value of η; only its location with respect to x^j_{2i-1} is needed, which can be found by a sign check. We refer the reader to [4] (and references therein) for specific details on this technique, in particular on the detection mechanism, and on its performance.
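The sign check can be sketched as follows; the helper below and its arguments are hypothetical (the neighbouring polynomials are assumed to be supplied through their primitives):

```python
def eta_side(P_left, P_right, x_lo, x_mid, x_hi, h, v_i):
    """Return which half of the singular cell [x_lo, x_hi] contains the zero
    eta of F(y) = (1/h)[P_left(y) - P_left(x_lo)] + (1/h)[P_right(x_hi) - P_right(y)] - v_i,
    where P_left, P_right are primitives of p_{i-1}, p_{i+1}.
    F changes sign across eta, so one evaluation at the midpoint suffices."""
    def F(y):
        return ((P_left(y) - P_left(x_lo)) + (P_right(x_hi) - P_right(y))) / h - v_i
    return 'left' if F(x_lo) * F(x_mid) <= 0 else 'right'

# piecewise-constant example: p_{i-1} = 0, p_{i+1} = 10 on the cell [0, 1]
# a jump at eta gives the cell average v_i = 10 * (1 - eta)
assert eta_side(lambda y: 0.0, lambda y: 10.0 * y, 0.0, 0.5, 1.0, 1.0, 7.0) == 'left'
assert eta_side(lambda y: 0.0, lambda y: 10.0 * y, 0.0, 0.5, 1.0, 1.0, 2.0) == 'right'
```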
3 The question of stability: Error control versus synchronization, with numerical examples
Lossy coding schemes introduce errors into the transform coefficients, and it becomes
crucial that the nonlinearities do not unduly amplify these errors. In lossy compression
the decoder only has the quantized detail coefficients. If we use a nonlinear prediction
operator (whether it is constructed as described in the previous section or based on
locally adapted filters, as in [6] within the Lifting framework), the quantization errors in
coarse scales could cascade across the scale ladder and cause a series of incorrect choices
(either on the filters or on the stencils) leading to serious reconstruction errors.
To avoid incorrect choices in the prediction step, whether within Harten's or the
Lifting framework, one would need to send side information on which filter was used
(Lifting) or what was the interpolatory stencil (Harten's). This is clearly inappropriate
when trying to design a compression scheme. One way to avoid storing (and sending) side
information is to somehow synchronize the nonlinear prediction operators in the encoder
and the decoder, so as to ensure that at a given spatial location on a given scale, the
prediction operator will select the same stencil (filter bank), both in the encoding and
the decoding steps.
Within the Lifting framework, synchronization is achieved in [6] by changing the
typical Split-Predict-Update steps to Split-Update-Predict. In doing so, it is possible to
base the choice of predictor directly on already 'quantized data', thus synchronizing the
nonlinear decisions made by the encoder and the decoder.
Within Harten's framework, synchronization is just a consequence of a strategy that is
designed to fully control the compression error. Because the main design tool in Harten's
framework for multiresolution is a reconstruction technique, and because A. Harten had
already worked with nonlinear reconstruction techniques in the context of the numerical
simulation of hyperbolic conservation laws, so-called Error-Control (EC) strategies can
be found already in the early papers of Harten on multiresolution [14].
Harten's mechanism to control the global accumulated error is based on a modification
of the direct multiscale transformation, M, that ensures a prescribed tolerance on the
global prediction errors (explicit error bounds can be found in [4, 13]). The modified
transformation incorporates the quantizer into the direct multiscale transformation in
such a way that the prediction operator in the encoder also acts on already 'quantized'
data; hence synchronization is achieved, because the nonlinear prediction operators both in M and M^{-1} work on the same set of discrete data at each resolution level.
To illustrate the effect of the different techniques, we take a particular nonlinear prediction operator: a third order ENO reconstruction technique with Subcell Resolution, as described in the last section. We denote by M_SR the multiscale transformation (2.6), while M_SR^EC denotes the EC modified transform as described in [2, 4], and M_SR^SYNC a multiscale transformation in which only synchronization is enforced, as proposed in [6]. The quantization step is carried out as follows:

    qu(d^j_i) = 2 ε_j round[d^j_i / (2 ε_j)],

and it is incorporated into the direct transformation in M_SR^SYNC and M_SR^EC (see [2, 6] for specific details), while in M_SR it is applied to the scale coefficients obtained after the transformation. In the numerical tests we report, we take ε_L = 8 with L = 4 and ε_j = ε_{j+1}/2.
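The quantization step can be sketched directly; note that with ε_j = ε_{j+1}/2 the coarser levels (smaller j) use the finer quantization steps:

```python
import numpy as np

def quantize(d, eps):
    """qu(d) = 2*eps*round(d/(2*eps)): uniform quantizer with step 2*eps,
    which guarantees |d - qu(d)| <= eps."""
    return 2.0 * eps * np.round(d / (2.0 * eps))

eps_L, L = 8.0, 4
eps = [eps_L / 2 ** (L - j) for j in range(1, L + 1)]   # eps_j = eps_{j+1}/2 -> [1, 2, 4, 8]
d = np.array([-13.2, -3.0, 0.4, 7.9, 20.0])
assert np.all(np.abs(d - quantize(d, eps_L)) <= eps_L)
```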
We consider two different images: the familiar image of Lena as an example of a 'real'
image, and a purely geometrical image, to which texture has been added, as in [6].
After the direct transformation (plus the quantization step) has taken place, a lossless
Lempel-Ziv compression algorithm is applied to reduce the size of the transformed image,
then a compression ratio is computed as the number of bits of the compressed representation over the number of bits of the original image. To recover the original image, we
undo the lossless compression and transform back using (2.7) in all three cases. The full
compression algorithm is identified in each case by an acronym: 'ST' for M_SR, 'EC' for M_SR^EC, and 'SYNC' for M_SR^SYNC.
In Tables 1 and 2 we compile a number of quantities that measure the 'quality' of the reconstructed image, and therefore the robustness and reliability of each multiresolution-based compression algorithm: the magnitude of the global compression error, measured
    Method   ‖·‖_∞   ‖·‖_1   ‖·‖_2   r_c      entropy
    ST       258     5.71    9.08    11.3:1   .6449
    SYNC     195     6.45    9.82    7.9:1    .8875
    EC       25.4    4.47    5.73    9.7:1    .6850

TAB. 1. Geometrical image.
FIG. 1. Geometrical image: (a) original, (b) ST, (c) EC, (d) SYNC.
in various norms, the compression rate r_c, and the entropy of the transformed image. The reconstructed images in both cases can be observed in Figures 1 and 2.

It can be clearly observed that the absence of any type of synchronization procedure can lead to a very poor reconstructed image. Synchronization by itself improves the quality, but it is not as robust as the full EC mechanism, designed in this case to enforce a certain error bound in the 2-norm (as observed in Tables 1 and 2, the 2-norm of the global error is kept below ε_L = 8). It is worth mentioning that the compression rates and the entropies of the compressed data are all very close; however, the visual quality of the reconstructed image is significantly better for the EC compression algorithm.
Bibliography
1. R. Abgrall and A. Harten. Multiresolution representation in unstructured meshes. SIAM J. Numer. Anal. 35, 2128-2146 (electronic), 1998.
2. S. Amat, F. Arandiga, A. Cohen, and R. Donat. Tensor product multiresolution
analysis with error control for compact image representation. Submitted to Signal
Processing, 2000.
3. S. Amat, F. Arandiga, A. Cohen, R. Donat, G. Garcia, and M. Von Oehsen. Data
compression with ENO schemes. Applied and Computational Harmonic Analysis
11, 273-288, 2001.
4. F. Arandiga and R. Donat. Nonlinear multi-scale decompositions: The approach of A. Harten. Numer. Algorithms 23, 175-216, 2000.
5. F. Arandiga, R. Donat, and A. Harten. Multiresolution based on weighted averages
of the hat function II: Nonlinear reconstruction operators. SIAM J. Sci. Comput.
20, 1053-1093, 1999.
    Method   ‖·‖_∞   ‖·‖_1   ‖·‖_2   r_c     entropy
    ST       318     5.66    10.59           .8261
    SYNC     277     5.97    10.56   7.5:1   .9430
    EC       26.4    3.59    4.84    8.2:1   .8704

TAB. 2. Lena.

FIG. 2. Lena: (a) original, (b) ST, (c) EC, (d) SYNC.

6. R. L. Claypoole, G. Davis, W. Sweldens, and R. Baraniuk. Nonlinear wavelet transforms for image coding via lifting scheme. Submitted to IEEE Trans. on Image Processing, 1999.
7. A. Cohen, I. Daubechies, and J. C. Feauveau. Biorthogonal bases of compactly supported wavelets. Comm. Pure Appl. Math. 45, 485-560, 1992.
8. R. L. de Queiroz, D. A. Florencio, and R. W. Schafer. Non-expansive pyramid for image coding using a non-linear filter bank. IEEE Trans. Image Processing 7, 246-252, 1998.
9. D. L. Donoho. Interpolating wavelet transforms. Technical report, Department of Statistics, Stanford University, 1992.
10. D. L. Donoho and Thomas P. Y. Yu. Nonlinear pyramid transforms based on median interpolation. SIAM Journal on Mathematical Analysis 31, 1030-1061, 2000.
11. M. Guichaoua. Analyses Multirésolution Biorthogonales associées à la Résolution d'Équations aux Dérivées Partielles. PhD thesis, École Supérieure de Mécanique de Marseille, Université de la Méditerranée Aix-Marseille II, 1999.
12. F. J. Hampson and J. C. Pesquet. A nonlinear subband decomposition with perfect reconstruction. In Proc. IEEE Int. Conf. Acoust., Speech, and Signal Proc., 1996.
13. A. Harten. ENO schemes with subcell resolution. J. Comput. Phys. 83, 148-184,
1989.
14. A. Harten. Discrete multiresolution analysis and generalized wavelets. J. of Applied
Num. Math. 12, 153-193, 1993.
15. A. Harten. Multiresolution representation of data: A general framework. SIAM J. Numer. Anal. 33, 1205-1256, 1996.
16. A. Harten, B. Engquist, S. Osher, and S. R. Chakravarthy. Uniformly high-order accurate essentially non-oscillatory schemes, III. J. Comput. Phys. 71, 231-303, 1987.
17. W. Sweldens. The lifting scheme: a custom-design construction of biorthogonal
wavelets. Appl. Comput. Harmon. Anal. 3, 186-200, 1996.
Splines: a new contribution to wavelet analysis
Amir Z. Averbuch, and Valery A. Zheludev
School of Computer Science, Tel Aviv University, Israel.
amir@math.tau.ac.il, zhel@post.tau.ac.il
Abstract
We present a new approach to the construction of biorthogonal wavelet transforms using polynomial splines. The construction is performed in a "lifting" manner and we use
interpolatory, as well as local quasi-interpolatory and smoothing splines as predicting
aggregates in this scheme. The transforms contain some scalar control parameters which
enable their flexible tuning in either time or frequency domains. The transforms are
implemented in a fast way. They demonstrated efficiency in application to image compression.
1 Introduction
Until recently, two methods have been used for the construction of wavelet schemes
using splines. One is to construct orthogonal and semi-orthogonal wavelets in the spline
spaces (Battle-Lemarie [2, 7], Chui-Wang [6], Unser-Aldroubi-Eden [12]). Another way
was introduced by Cohen, Daubechies and Feauveau [3] who constructed symmetric
compactly supported spline wavelets whose duals, remaining compactly supported and
symmetric, do not belong to a spline space. However, since the introduction of the lifting
scheme for the design of wavelet transforms [11], a new way was opened to use splines as
a tool for devising a full discrete scheme of wavelet transforms. Namely, various splines
can be employed as predicting aggregates in lifting constructions.
2 Lifting scheme of biorthogonal wavelet transform
The sequences {a(k)}_{k=-∞}^{∞}, which belong to the space ℓ_1, we call the discrete-time signals. The z-transform of a signal {a(k)} is defined as follows: a(z) = Σ_{k=-∞}^{∞} z^{-k} a(k). Throughout the paper we assume that z = e^{jω}. We introduce a family of biorthogonal wavelet-type transforms that operate on the signal x = {x(k)}_{k=-∞}^{∞}, which we construct through lifting steps.
The lifting scheme for the wavelet transform of a signal can be implemented in primal
or dual modes. For brevity we consider only the primal mode.
Decomposition Generally, the primal lifting scheme for decomposition of signals consists of three steps: 1. Split. 2. Predict. 3. Update or lifting.
SPLIT - We split the array x into even and odd sub-arrays:

    e_1 = {e_1(k) = x(2k)},   d_1 = {d_1(k) = x(2k + 1)},   k ∈ Z.
PREDICT - We use the even array e_1 to predict the odd array d_1 and redefine the array d_1 as the difference between the existing array and the predicted one. To be specific, we apply some filter with transfer function zU(z) to the sequence e_1 and predict the function d_1(z^2), which is the z^2-transform of d_1. The z^2-transform of the new d-array is defined as follows:

    d^u_1(z^2) = d_1(z^2) - zU(z) e_1(z^2).      (2.1)

From now on the superscript u means an update operation on the array. Obviously, the prediction zU(z)e_1(z^2) should approximate d_1(z^2) well.
LIFTING - We update the even array using the new odd array:

    e^u_1(z^2) = e_1(z^2) + β(z) z^{-1} d^u_1(z^2).      (2.2)

Generally, the goal of this step is to eliminate the aliasing which appears when downsampling the original signal x into e_1. Further on we will discuss how to achieve this effect by a proper choice of the filter β.
Reconstruction The reconstruction of the signal x from the arrays e^u_1 and d^u_1 is implemented in reverse order: 1. Undo Lifting. 2. Undo Predict. 3. Unsplit.

UNDO LIFTING - We restore the even array: e_1(z^2) = e^u_1(z^2) - β(z) z^{-1} d^u_1(z^2).

UNDO PREDICT - We restore the odd array: d_1(z^2) = d^u_1(z^2) + zU(z) e_1(z^2).

UNSPLIT - The last step is the standard restoration of the signal from its even and odd components. In the z-domain this is x(z) = e_1(z^2) + z^{-1} d_1(z^2).
The lifting scheme presented above yields an efficient algorithm for the implementation of the forward and backward transforms x ↔ e^u_1 ∪ d^u_1. These operations can be interpreted as a transformation of the signal by a filter bank that possesses the perfect reconstruction property and is associated with biorthogonal pairs of bases in the space of discrete-time signals. These basis signals are the synthesis and analysis wavelets. Further steps of the transform are implemented in an iterative way by the same lifting operations.
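The primal lifting steps can be sketched in the time domain, where prediction and update are plain convolutions. The predictor below is an assumed simple example (each odd sample predicted by the average of its two even neighbours, i.e. zU(z) = (1 + z^2)/2, with β = U/2 and periodic boundaries), chosen only to show the mechanics and the built-in invertibility:

```python
import numpy as np

def lift_forward(x):
    """One primal lifting step: split / predict / update.
    Predictor: average of the two even neighbours; update filter beta = U/2."""
    e, d = x[0::2].copy(), x[1::2].copy()
    d_u = d - 0.5 * (e + np.roll(e, -1))          # d^u = d - zU(z) e   (periodic)
    e_u = e + 0.25 * (d_u + np.roll(d_u, 1))      # e^u = e + beta(z) z^{-1} d^u
    return e_u, d_u

def lift_inverse(e_u, d_u):
    """Undo lifting, undo predict, unsplit: exact inversion by construction."""
    e = e_u - 0.25 * (d_u + np.roll(d_u, 1))
    d = d_u + 0.5 * (e + np.roll(e, -1))
    x = np.empty(2 * e.size)
    x[0::2], x[1::2] = e, d
    return x

x = np.random.default_rng(0).normal(size=32)
assert np.allclose(lift_inverse(*lift_forward(x)), x)
```

Note that perfect reconstruction does not depend on the particular predictor: each lifting step is undone by subtracting exactly what was added.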
3 Polynomial splines
We will construct polynomial splines of various kinds using the even subarray of a signal,
calculate their values in the midpoints between nodes and use these values for prediction
of the odd array. In this section we discuss some properties of such splines and derive
the corresponding filters U.
3.1 B-splines

The central B-spline of first order on the grid {kh} is defined as follows:

    M^1_h(x) = 1/h  if x ∈ [-h/2, h/2],   M^1_h(x) = 0  elsewhere.

The central B-spline of order p is the convolution M^p_h(x) = M^{p-1}_h(x) * M^1_h(x), p ≥ 2. Note that the B-spline of order p is supported on the interval (-ph/2, ph/2). It is positive within its support and symmetric around zero. The nodes of B-splines of even orders are located at the points {kh} and of odd orders at the points {h(k + 1/2)}, k ∈ Z. It is readily verified that h M^p_h(hx) = M^p(x), where M^p(x) := M^p_1(x). Let

    u^p := {h M^p_h(hk) = M^p(k)}   and   w^p := {h M^p_h(h(k + 1/2)) = M^p(k + 1/2)},  k ∈ Z.      (3.1)

Due to the compact support of B-splines, these sequences are finite. We will use for our constructions only splines of odd orders p = 2r - 1. In Table 1 we present the sequences for the initial values of r which are of practical importance.
    k           -3   -2   -1    0     1    2    3
    u^3 × 8      0    0    1    6     1    0    0
    u^5 × 384    0    1   76  230    76    1    0
    w^3 × 2      0    0    1    1     0    0    0
    w^5 × 24     0    1   11   11     1    0    0

TAB. 1. Values of the sequences u^p and w^p.
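The entries of Table 1 can be checked against the closed form of the cardinal B-spline, M^p(x) = (1/(p-1)!) Σ_{j=0}^{p} (-1)^j C(p, j) (x + p/2 - j)_+^{p-1}; this is a standard formula, used here only as an independent check of the table:

```python
from math import comb, factorial

def M(p, x):
    """Central cardinal B-spline of order p (degree p - 1) at x."""
    return sum((-1) ** j * comb(p, j) * max(x + p / 2 - j, 0.0) ** (p - 1)
               for j in range(p + 1)) / factorial(p - 1)

# the four rows of Table 1
assert [round(8 * M(3, k)) for k in (-2, -1, 0, 1, 2)] == [0, 1, 6, 1, 0]
assert [round(384 * M(5, k)) for k in (-2, -1, 0, 1, 2)] == [1, 76, 230, 76, 1]
assert [round(2 * M(3, k + 0.5)) for k in (-2, -1, 0, 1)] == [0, 1, 1, 0]
assert [round(24 * M(5, k + 0.5)) for k in (-2, -1, 0, 1)] == [1, 11, 11, 1]
```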
We need the z^2-transforms of the sequences u^p and w^p:

    u^p(z^2) := Σ_{k=-∞}^{∞} z^{-2k} u^p(k),   w^p(z^2) := Σ_{k=-∞}^{∞} z^{-2k} w^p(k).

These functions are Laurent polynomials, and are called the Euler-Frobenius polynomials [10].
Proposition 3.1 ([9]) On the circle z = e^{jω} the Laurent polynomials u^p(z^2) are strictly positive. Their roots are all simple and negative. Each root ζ can be paired with a dual root ζ' such that ζζ' = 1. Thus, if p = 2r + 1 is odd, then u^p(z^2) can be represented as follows:

    u^p(z^2) = c_p ∏_{n=1}^{r} (1 + γ_n z^2)(1 + γ_n z^{-2}),   0 < γ_n < 1,      (3.2)

where c_p is a positive constant. We denote

    U^p_1(z) := z^{-1} w^p(z^2) / u^p(z^2).      (3.3)
Proposition 3.2 The rational functions U^p_1(z) are real-valued and U^p_1(-z) = -U^p_1(z). If p = 2r + 1 is odd then

    1 - U^p_1(z) = (a - 2)^{r+1} c_r(a) / u^p(z^2),   1 + U^p_1(z) = (-a - 2)^{r+1} c_r(-a) / u^p(z^2),      (3.4)

where a := z + z^{-1} and c_r(a) is a polynomial of degree r - 1.
3.2 Interpolatory splines

The shifts of B-splines form a basis in the space S^p_h of splines of order p on the grid {kh}. Namely, any spline S^p_h ∈ S^p_h has the following representation:

    S^p_h(x) = h Σ_l q(l) M^p_h(x - lh).      (3.5)

Let q := {q(l)}, and let q(z^2) be the z^2-transform of q. We introduce also the sequences s^p := {S^p_h(hk) = S^p(k)} and m^p := {S^p_h(h(k + 1/2)) = S^p(k + 1/2)} of values of the spline on the grid points and on the midpoints. Let s^p(z^2) and m^p(z^2) be the corresponding z^2-transforms. We have

    S^p(k) = Σ_l q(l) M^p(k - l),   and   S^p(k + 1/2) = Σ_l q(l) M^p(k - l + 1/2).      (3.6)

Respectively, s^p(z^2) = q(z^2) u^p(z^2), and m^p(z^2) = q(z^2) w^p(z^2).

From these formulae we can derive an expression for the coefficients of a spline which interpolates a given sequence e := {e(k)} at the grid points:

    S^p_h(hk) = e(k), k ∈ Z  ⟺  q(z^2) u^p(z^2) = e(z^2)  ⟺  q(z^2) = e(z^2)/u^p(z^2).      (3.7)

The z^2-transform of the sequence m^p is:

    m^p(z^2) = q(z^2) w^p(z^2) = z U^p_1(z) e(z^2).      (3.8)
Our further construction exploits the super-convergence property of the interpolatory splines of odd orders (even degrees).

Theorem 3.3 ([13]) Let a function f ∈ L^2(-∞, ∞) have p + 1 continuous derivatives and let S^p_h ∈ S^p_h interpolate f on the grid {kh}. Denote f_k = f((k + 1/2)h). Then in the case of odd p = 2r + 1 the following asymptotic relation holds:

    S^p_h(h(k + 1/2)) = f_k - h^{2r+2} f^{(2r+2)}(h(k + 1/2)) (2r + 1)! b_{2r+2}(1/2)/(2r + 2)! + o(h^{2r+2}),      (3.9)

where b_s(x) is the Bernoulli polynomial of degree s.

Recall that, in general, the interpolatory spline of order 2r + 1 approximates the function f with accuracy O(h^{2r+1}). Therefore, we may claim that {(k + 1/2)h} are points of super-convergence of the spline S^p_h. Note that the spline of order 2r + 1 which interpolates the values of a polynomial of degree 2r coincides with this polynomial. However, the spline of order 2r + 1 which interpolates the values of a polynomial of degree 2r + 1 on the grid {kh} restores the values of this polynomial at the mid-points {(k + 1/2)h}. This property will result in the vanishing moments property of the wavelets to be constructed later.
3.3 Quasi-interpolatory splines

We can see from (3.7) and (3.8) that in order to find the values at the midpoints of the spline interpolating the signal e, the signal has to be filtered with the filter whose transfer function is z U^p_1(z). This filter has an infinite impulse response (IIR). However, the property of super-convergence at the midpoints is not an exclusive attribute of the interpolatory splines. It is also inherent to the so-called local quasi-interpolatory splines of odd orders, which can be constructed using finite impulse response (FIR) filtering.
Definition 3.4 Let the function f have p continuous derivatives and f := {f_k = f(hk)}, k ∈ Z. The spline S^p_h ∈ S^p_h of order p given by (3.5) is said to be the local quasi-interpolatory spline if the array q of its coefficients is derived by FIR filtering of the array of samples f:

    q(z^2) = τ(z^2) f(z^2),      (3.10)

where τ(z^2) is a Laurent polynomial, and the difference |f(x) - S^p_h(x)| = O(f^{(p)} h^p). If f is a polynomial of degree p - 1, then the spline S^p_h(x) = f(x).
If w^p is the sequence defined in (3.1) then the midpoint values m^p are produced by the following FIR filtering of the array of samples f: m^p(z^2) = z U^p(z) f(z^2), U^p(z) := z^{-1} τ(z^2) w^p(z^2). Explicit formulas for the construction of quasi-interpolatory splines as well as estimations of the differences were established in [13]. In the present work we are interested in splines of odd orders p = 2r + 1. There are many FIR filters which generate quasi-interpolatory splines but only one filter of minimal length 2r + 1 for each order p = 2r + 1. Let Δ(z) := z^{-1} - 2 + z.
Theorem 3.5 A quasi-interpolatory spline of order p = 2r + 1 can be produced by filtering (3.10) with filters τ of length no less than 2r + 1. There exists a unique filter τ^r_m of length 2r + 1 which produces the minimal quasi-interpolatory spline S^{2r+1}_h(x). Its transfer function is:

    τ^r_m(z^2) = 1 + Σ_{l=1}^{r} β^r_l Δ^l(z),      (3.11)

with coefficients β^r_l whose explicit values are given in [13]. If the function f has 2r + 3 derivatives then the following asymptotic relation holds for the midpoint values of the minimal quasi-interpolatory spline of odd order:

    S^{2r+1}_h(h(k + 1/2)) = f(h(k + 1/2)) + λ_r h^{2r+2} f^{(2r+2)}(h(k + 1/2)) + O(f^{(2r+3)} h^{2r+3}),
    λ_r = (2r + 1)! b_{2r+2}(0)/(2r + 2)!,      (3.12)

where b_s(x) is the Bernoulli polynomial of degree s.
This implies that the super-convergence property is similar to that of the interpolatory splines. The asymptotic representation (3.12) provides tools for custom design of
predicting splines retaining or even enhancing the approximation accuracy of the minimal spline at the midpoints.
Proposition 3.6 If the coefficients of the spline S^{2r+1}_{h,ρ} ∈ S^{2r+1}_h of order 2r + 1 are derived as in (3.10) using the filter τ_ρ of length 2r + 3, with the transfer function τ_ρ(z^2) = τ^r_m(z^2) + ρ Δ^{r+1}(z), then the spline restores polynomials of degree 2r + 1 at the midpoints between nodes, for any real value ρ. However, if ρ = -λ_r then the spline restores polynomials of degree 2r + 3.

If the parameter ρ is chosen such that ρ = (-1)^r |ρ| then the spline S^{2r+1}_{h,ρ} possesses the smoothing property [14].
3.4 Examples

3.4.1 Quadratic splines

Interpolatory spline
Let a = z^{-1} + z. Then

    U^3_1(z) = 4a / (z^2 + 6 + z^{-2}),   and   1 - U^3_1(z) = (a - 2)^2 / (z^2 + 6 + z^{-2}).
Minimal spline

The filters are

    U^3_m(z) = (-z^{-3} + 9z^{-1} + 9z - z^3) / 16,   and   1 - U^3_m(z) = (a - 2)^2 (z^{-1} + 4 + z) / 16.
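The minimal filter above is the familiar 4-tap midpoint predictor (-1, 9, 9, -1)/16; since 1 - U^3_m contains the factor (a - 2)^2, it restores cubic polynomials at the midpoints, which is easy to check numerically:

```python
import numpy as np

taps = np.array([-1.0, 9.0, 9.0, -1.0]) / 16.0     # the minimal filter U_m^3

def predict_midpoints(samples):
    """Predict f(k + 1/2) from f(k - 1), f(k), f(k + 1), f(k + 2);
    the symmetric taps make the convolution orientation irrelevant."""
    return np.convolve(samples, taps, mode='valid')

k = np.arange(12.0)
f = k ** 3 - 2.0 * k                                # an arbitrary cubic
mid = k[1:-2] + 0.5                                 # midpoints with full filter support
assert np.allclose(predict_midpoints(f), mid ** 3 - 2.0 * mid)
```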
Extended spline

    1 - U^3_ρ(z) = (2 - a)^3 (3z^{-2} + 18z^{-1} + 38 + 18z + 3z^2) / 256.
Remark 3.7 In [5] Donoho presented a scheme where an odd sample is predicted by the value at the central point of the polynomial of odd degree which interpolates adjacent even samples. One can observe that our filter U^3_m coincides with the filter derived by Donoho's scheme using the cubic interpolatory polynomial. The filter U^3_ρ coincides with the filter derived using the interpolatory polynomial of fifth degree. On the other hand, the filter U^3_1 is closely related to the commonly used Butterworth filter [8]. Namely, in this case the filter transfer functions Φ^l(z) := (1 + U^3_1(z))/2 and Φ^h(z) := (1 - U^3_1(z))/2 coincide with the magnitude squared of the transfer functions of the discrete-time low-pass and high-pass half-band Butterworth filters of order 4, respectively.
3.4.2 Splines of fifth order (fourth degree)

Interpolatory spline

    U^5_1(z) = 16(z^3 + 11z + 11z^{-1} + z^{-3}) / (z^4 + 76z^2 + 230 + 76z^{-2} + z^{-4}),

    1 - U^5_1(z) = (a - 2)^3 (a - 10) / (z^4 + 76z^2 + 230 + 76z^{-2} + z^{-4}).
Minimal spline

The filter is

    U^5_m(z) = [47(z^{-7} + z^7) + 89(z^{-5} + z^5) - 2277(z^{-3} + z^3) + 15965 a] / 27648.
4 Wavelet transforms using spline filters

4.1 Choosing the filters for the lifting step
In the previous section we presented a family of filters U for the predicting step which originated from splines of various types. But, as is seen from (2.2), to accomplish the transform we have to define the filter β. There is a remarkable freedom in the choice of these filters. The only requirement needed to guarantee the perfect reconstruction property of the transform is that β(-z) = -β(z). In order to make the synthesis and analysis filters similar in their properties, we choose β(z) = Ũ(z)/2, where Ũ means one of the filters U presented above. In particular, Ũ may coincide with the filter U which was used for the prediction.
We say that a wavelet ψ has m vanishing moments if the following relations hold: Σ_{k∈Z} k^s ψ(k) = 0, s = 0, 1, ..., m - 1.
Proposition 4.1 Suppose the filters U(z) and β(z) = Ũ(z)/2 are used for the predicting and lifting steps, respectively. If 1 - U(z) contains the factor (z - 2 + 1/z)^r then the high-frequency analysis wavelets have 2r vanishing moments. If, in addition, 1 - Ũ(z) contains the factor (z - 2 + 1/z)^p then the synthesis wavelets have 2q vanishing moments, where q = min{p, r}.
4.2 Implementation of the transforms
Suppose we have chosen the filter β = Ũ/2. The functions zU(z) and zŨ(z) depend on z^2 and we write F(z^2) := zU(z) and F̃(z^2) := zŨ(z). Then the decomposition procedure is (see (2.1), (2.2)):

    d^u_1(z) = d_1(z) - F(z) e_1(z),   e^u_1(z) = e_1(z) + (1/2) z^{-1} F̃(z) d^u_1(z).      (4.1)
Equation (4.1) means that in order to obtain the detail array d^u_1, we must process the even array e_1 with the filter F, whose transfer function is F(z), and subtract the filtered array from the odd array d_1. In order to obtain the smoothed array e^u_1, we must process the detail array d^u_1 with the filter Φ that has the transfer function Φ(z) = z^{-1} F̃(z)/2 and add the filtered array to the even array e_1. But the filter Φ differs from F̃/2 only by a one-sample delay and it operates similarly. Thus, both operations of the decomposition are, in principle, identical. For the reconstruction the same operations are conducted in reverse order.
Therefore, it is sufficient to outline the implementation of the filtering with the function F(z).
Implementation of the FIR filters originating from local splines is straightforward and, therefore, we only make a few remarks on the IIR filters originating from interpolatory splines. A detailed description can be found in [1]. Equations (3.2) and (3.3) imply that, when the interpolatory spline of order 2r + 1 is used, the transfer function is F(z) = P(z) / ∏_{n=1}^{r} (1 + γ_n z)(1 + γ_n z^{-1}), where P(z) is a Laurent polynomial. It means that the IIR filter F can be split into a cascade consisting of a FIR filter with the transfer function P(z), r elementary causal recursive filters denoted by R(n), and r elementary anti-causal recursive filters, denoted by R̃(n). The causal and anti-causal filters operate as follows:

    y = R(n)x  ⟺  y(l) = x(l) - γ_n y(l - 1),
    y = R̃(n)x  ⟺  y(l) = x(l) - γ_n y(l + 1).
Example 4.2 (Example of a recursive filter) We present the IIR filter derived from the interpolatory splines of third order. Let γ = 3 - 2√2 ≈ 0.172. Then

    F(z^2) = z U^3_1(z) = 4γ(1 + z^2) / ((1 + γz^2)(1 + γz^{-2})).

The filter can be implemented with the following cascade:

    x_0(k) = 4γ (x(k) + x(k + 1)),
    x_1(k) = x_0(k) - γ x_1(k - 1),
    y(k) = x_1(k) - γ y(k + 1).
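A finite-length sketch of the cascade (zero initial conditions are our simplification; the anti-causal filter runs backwards over the array, so only interior samples are accurate). Since F(1) = 8γ/(1 + γ)^2 = 1, the interior output for a constant input reproduces the input, and a linear ramp is reproduced at the midpoints:

```python
import numpy as np

g = 3.0 - 2.0 * np.sqrt(2.0)                       # gamma ~ 0.172

def cascade(x):
    """FIR stage, causal recursion, then anti-causal recursion."""
    n = x.size
    x0 = 4.0 * g * (x + np.append(x[1:], 0.0))     # x0(k) = 4g (x(k) + x(k+1))
    x1 = np.zeros(n)
    for k in range(n):                              # x1(k) = x0(k) - g x1(k-1)
        x1[k] = x0[k] - g * (x1[k - 1] if k > 0 else 0.0)
    y = np.zeros(n)
    for k in range(n - 1, -1, -1):                  # y(k) = x1(k) - g y(k+1)
        y[k] = x1[k] - g * (y[k + 1] if k < n - 1 else 0.0)
    return y

y = cascade(np.ones(100))
assert abs(y[50] - 1.0) < 1e-12                    # interior gain is exactly 1
```

The boundary transients decay like γ^k, so a few dozen samples from each end the recursion has converged to machine precision.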
Bibliography
1. A. Z. Averbuch, A. B. Pevnyi and V. A. Zheludev, Butterworth wavelets derived
from discrete interpolatory splines: Recursive implementation, to appear in Signal
Processing, www.math.tau.ac.il/~amir (~zhel).
2. G. Battle, A block spin construction of ondelettes. Part I: Lemarié functions, Comm. Math. Phys. 110 (1987), 601-615.
3. A. Cohen, I. Daubechies and J.-C. Feauveau, Biorthogonal bases of compactly
supported wavelets, Commun. on Pure and Appl. Math. 45 (1992), 485-560.
4. I. Daubechies, Ten lectures on wavelets, SIAM, Philadelphia, PA, 1992.
5. D. L. Donoho, Interpolating wavelet transform, Preprint 408, Department of Statistics, Stanford University, 1992.
6. C. K. Chui and J. Z. Wang, On compactly supported spline wavelets and a duality principle, Trans. Amer. Math. Soc. 330 (1992), 903-915.
7. P. G. Lemarié, Ondelettes à localisation exponentielle, J. de Math. Pures et Appl. 67 (1988), 227-236.
8. A. V. Oppenheim and R. W. Schafer, Discrete-time signal processing, Prentice Hall, Englewood Cliffs, NJ, 1989.
9. I. J. Schoenberg, Contribution to the problem of approximation of equidistant data
by analytic functions, Quart. Appl. Math. 4 (1946), 112-141.
10. I. J. Schoenberg, Cardinal spline interpolation, CBMS 12, SIAM, Philadelphia, 1973.
11. W. Sweldens, The lifting scheme: A custom design construction of biorthogonal wavelets, Appl. Comput. Harm. Anal. 3 (1996), 186-200.
12. M. Unser, A. Aldroubi and M. Eden, A family of polynomial spline wavelet transforms, Signal Processing 30 (1993), 141-162.
13. V. A. Zheludev, Local spline approximation on a uniform grid, U.S.S.R. Comput.
Math. & Math. Phys. 27 (1987), 8-19.
14. V. A. Zheludev, Local smoothing splines with a regularizing parameter, Comput.
Math. & Math Phys. 31 (1991), 193-211.
Knot removal for tensor product splines
T. Brenna
Dept. of Informatics, Univ. of Oslo, Oslo.
trondbre@ifi.uio.no
Abstract
Given a spline function as a B-spline expansion, the object of knot removal is to remove as many knots as possible without perturbing the spline by more than a specified tolerance. In 1987 Lyche and Mørken proposed an efficient knot removal algorithm which determines both the number of remaining knots and their positions automatically. In this paper we show how their method can be extended to knot removal techniques for multivariate tensor product splines. We propose a number of new strategies for removing as many knots as possible, and discuss some of the advantages and challenges posed by the special structure of tensor product splines.
1 Introduction
Given a spline function we are often interested in an approximate representation requiring less data. The object of knot removal is to remove as many knots as possible
from a given spline without perturbing the spline by more than a given tolerance. An
efficient knot removal strategy presented in [6] determines both the number of remaining
knots and their location automatically. This strategy was later extended to parametric
curves and surfaces in [5], and incorporated with various constraints such as monotonicity and convexity in [1]. An efficient implementation of knot removal for the special
case of trilinear splines is given in [3]. In this paper we address some of the questions
and problems arising when extending the knot removal technique to multivariate tensor
product splines.
The outline of this paper is as follows. We start by fixing notation and presenting
techniques for representing tensor product splines. We then proceed with generalizations
of coefficient norms, approximation methods, methods for ranking the knots etc., as we
review the central parts of the knot removal strategy. Two different ways of performing
knot removal are given together with accompanying strategies for finding the desired
approximations. We end the paper with two examples demonstrating various aspects of
the knot removal techniques presented.
2 Notation
Let d = (d_k), m = (m_k) ∈ Z^s with 0 < d < m (component-wise) for some positive integer s. Also let t^k = {t^k_i}_{i=1}^{m_k+d_k+1} be a knot vector with d_k + 1 equal knots at both ends and with no knot value occurring more than d_k + 1 times, for k = 1, ..., s. In this paper we will treat the collection t = {t^k}_{k=1}^{s} as a "single" knot vector with "length" m + d + 1, defined to be the sum of the lengths of the knot vectors t^k, k = 1, ..., s.
such a knot vector we may form products of the basis functions associated with each individual knot vector $t^k$. By letting
$$B_{\mathbf{i}}(\mathbf{x}) := B_{\mathbf{i},\mathbf{d},t}(\mathbf{x}) = \prod_{k=1}^{s} B_{i_k, d_k, t^k}(x_k) \quad \text{for} \quad \mathbf{1} \le \mathbf{i} \le \mathbf{m},$$
where $\mathbf{i} = (i_k) \in \mathbb{Z}^s$ and $\mathbf{x} = (x_k) \in \mathbb{R}^s$, we get a total of $\prod_{k=1}^{s} m_k$ such basis functions for the tensor product space $\mathbb{S}_{\mathbf{d},t} = \bigotimes_{k=1}^{s} \mathbb{S}_{d_k, t^k}$. In this paper we let $B_{i_k, d_k, t^k}$ be the $i_k$th B-spline of degree $d_k$ associated with $t^k$, for $k = 1, \ldots, s$.
To represent an element of $\mathbb{S}_{\mathbf{d},t}$ we use a variant of the classical Kronecker product of matrices. Recall that if $A = (a_{i,j})_{i,j=1}^{m_1, n_1} \in \mathbb{R}^{m_1, n_1}$ and $B = (b_{i,j})_{i,j=1}^{m_2, n_2} \in \mathbb{R}^{m_2, n_2}$, then this product is given by $A \otimes B = (a_{i,j} B)_{i,j=1}^{m_1, n_1}$. In this paper we will use the "equivalent" product defined by $A \otimes B = (A\, b_{i,j})_{i,j=1}^{m_2, n_2}$, which gives a more convenient ordering of the matrix elements for our use. Also recall that for real matrices $A, B, C, D$ we have the following useful relations (assuming that the matrix products and inverses are defined): $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$, $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ and $A \otimes B = P_1 (B \otimes A) P_2$ for some permutation matrices $P_1$ and $P_2$. In addition, the product $A \otimes B$ will have linearly independent columns provided the same holds for $A$ and $B$. For further properties of the Kronecker product we refer to [4].
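A small NumPy sketch (an illustration added here, not from the paper): since block $(i, j)$ of the paper's product is $A\, b_{i,j}$, it coincides with the classical Kronecker product taken in the opposite order, so `np.kron(B, A)` realises it, and the stated algebraic rules can be checked numerically.

```python
import numpy as np

def kron_variant(A, B):
    # The paper's "equivalent" product: block (i, j) is A * b_ij, i.e.
    # the classical Kronecker product with the factors swapped.
    return np.kron(B, A)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [5.0, 6.0]])
C = np.array([[2.0, 0.0], [1.0, 1.0]])
D = np.array([[1.0, 1.0], [0.0, 2.0]])

# Mixed-product rule: (A (x) B)(C (x) D) = (AC) (x) (BD)
assert np.allclose(kron_variant(A, B) @ kron_variant(C, D),
                   kron_variant(A @ C, B @ D))

# Inverse rule: (A (x) B)^{-1} = A^{-1} (x) B^{-1}
assert np.allclose(np.linalg.inv(kron_variant(A, D)),
                   kron_variant(np.linalg.inv(A), np.linalg.inv(D)))
```

The swap-of-factors identity is exactly the permutation relation $A \otimes B = P_1 (B \otimes A) P_2$ quoted above, with the permutations absorbed into the ordering convention.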
An element
$$f(\mathbf{x}) = \sum_{i_1=1}^{m_1} \cdots \sum_{i_s=1}^{m_s} f_{i_1, \ldots, i_s} \prod_{k=1}^{s} B_{i_k, d_k, t^k}(x_k) = \sum_{\mathbf{i} \le \mathbf{m}} f_{\mathbf{i}} B_{\mathbf{i},\mathbf{d},t}(\mathbf{x}) \in \mathbb{S}_{\mathbf{d},t}$$
can now be written
$$f(\mathbf{x}) = B_t^T \mathbf{f},$$
where $B_t = \bigotimes_{k=1}^{s} B_{t^k}$ with $B_{t^k} = (B_{1, d_k, t^k}, \ldots, B_{m_k, d_k, t^k})^T$ for $k = 1, \ldots, s$. Here $\mathbf{f}$ is a vector containing the B-spline coefficients $F = (f_{i_1, \ldots, i_s})$ of $f$, given by $\mathbf{f} = \operatorname{vec}(F) := \sum_{\mathbf{i} \le \mathbf{m}} f_{\mathbf{i}} \mathbf{e}_{\mathbf{i}}$, where $\mathbf{e}_{\mathbf{i}} = \bigotimes_{k=1}^{s} e_{i_k}$ with $e_{i_k} \in \mathbb{R}^{m_k}$. Finally, for a tensor of real coefficients $F = (f_{\mathbf{i}})_{\mathbf{1} \le \mathbf{i} \le \mathbf{m}} \in \mathbb{R}^{\mathbf{m}}$ we let $F^{\sigma_k}$ denote the tensor $F$ with its elements rearranged according to the cyclic permutation of the $s$-tuple $\{1, 2, \ldots, s\}$ given by $\sigma_k = \{k, k+1, \ldots, s, 1, \ldots, k-1\}$, for $k = 1, \ldots, s$.
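The cyclic rearrangement $F^{\sigma_k}$ amounts to an axis transpose, which can be sketched with NumPy (our illustration, not the paper's code; we assume the paper's lexicographic coefficient ordering matches NumPy's default C-order, which is a convention choice):

```python
import numpy as np

def cyclic_perm(F, k):
    # F^{sigma_k}: axes rearranged by the cyclic permutation
    # sigma_k = {k, k+1, ..., s, 1, ..., k-1} (1-based, as in the paper).
    s = F.ndim
    return np.transpose(F, list(range(k - 1, s)) + list(range(k - 1)))

F = np.arange(2 * 3 * 4).reshape(2, 3, 4)    # s = 3 parameter directions

assert cyclic_perm(F, 1).shape == (2, 3, 4)  # sigma_1 is the identity
assert cyclic_perm(F, 2).shape == (3, 4, 2)  # sigma_2 = {2, 3, 1}
f = F.reshape(-1)                            # vec(F): coefficients as one vector
assert f.size == 24
```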
Finally, for a spline $f = \sum_{\mathbf{i} \le \mathbf{m}} f_{\mathbf{i}} B_{\mathbf{i},\mathbf{d},t}(\mathbf{x})$ we define a class of weighted $\ell^p$-norms of its B-spline coefficients, given by
$$\| f \|_{\ell^p, t} = \begin{cases} \big( \sum_{\mathbf{i} \le \mathbf{m}} w_{\mathbf{i}} |f_{\mathbf{i}}|^p \big)^{1/p}, & 1 \le p < \infty, \\ \max_{\mathbf{1} \le \mathbf{i} \le \mathbf{m}} |f_{\mathbf{i}}|, & p = \infty, \end{cases}$$
where the weights are given by $w_{\mathbf{i}} = \prod_{k=1}^{s} \frac{t^k_{i_k + d_k + 1} - t^k_{i_k}}{d_k + 1}$, for $\mathbf{1} \le \mathbf{i} \le \mathbf{m}$. Using the notation introduced above we have that $\| f \|_{\ell^p, t} = \| W_t^{1/p} \mathbf{f} \|_{\ell^p}$ $(p \ge 1)$, where $W_t$ is a diagonal scaling matrix given by
$$W_t = \bigotimes_{k=1}^{s} W_{t^k}, \qquad W_{t^k} = \operatorname{diag}\left( \frac{t^k_{i + d_k + 1} - t^k_i}{d_k + 1} \right)_{i=1}^{m_k}.$$
These coefficient norms are easy to compute and are known to approximate the ordinary $L^p$-norms well for splines of moderate degree [2,6]. In the algorithms we use $p = 2$ when computing approximations and $p = \infty$ to measure the error.
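For the univariate case ($s = 1$) the weighted coefficient norm is a one-liner; the following sketch (ours, not the paper's code) evaluates it for a degree-$d$ spline from its knots and coefficients:

```python
import numpy as np

def coeff_norm(coeffs, knots, d, p):
    # Weighted l^p-norm of B-spline coefficients with weights
    # w_i = (t_{i+d+1} - t_i)/(d + 1); for p = inf the weights drop out
    # and the norm is the largest coefficient magnitude.
    coeffs = np.asarray(coeffs, dtype=float)
    if np.isinf(p):
        return float(np.max(np.abs(coeffs)))
    m = coeffs.size
    t = np.asarray(knots, dtype=float)          # length m + d + 1
    w = (t[d + 1 : m + d + 1] - t[:m]) / (d + 1)
    return float(np.sum(w * np.abs(coeffs) ** p) ** (1.0 / p))

# Degree-1 spline on [0, 3] with knots 0,0,1,2,3,3 (m = 4 coefficients)
knots = [0.0, 0.0, 1.0, 2.0, 3.0, 3.0]
c = [1.0, -2.0, 0.5, 1.0]
norm2 = coeff_norm(c, knots, 1, 2)        # used when computing approximations
norm_inf = coeff_norm(c, knots, 1, np.inf)  # used to measure the error
```

With these data the weights are $(0.5, 1, 1, 0.5)$, so the squared 2-norm is $5.25$ and the $\infty$-norm is $2$.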
3 The knot removal algorithm
Given an element $f \in \mathbb{S}_{\mathbf{d},t}$, a tolerance $\varepsilon > 0$ and some norm $\| \cdot \|$, the goal of the knot removal algorithm presented in [6] is to find a subspace $\mathbb{S}_{\mathbf{d},\tau}$ of $\mathbb{S}_{\mathbf{d},t}$ ($\tau \subset t$) and an element $g \in \mathbb{S}_{\mathbf{d},\tau}$ with $\| f - g \| \le \varepsilon$, where we want $\tau$ to be of minimal length. In this section we review the basic parts of this algorithm as we extend the theory to tensor product splines. Further details of the material in this section can be found in [2].
3.1 Finding approximations

To approximate $f \in \mathbb{S}_{\mathbf{d},t}$ in a subspace $\mathbb{S}_{\mathbf{d},\tau}$, where $\tau$ is of "length" $\mathbf{n} + \mathbf{d} + \mathbf{1}$ with $\mathbf{n} \le \mathbf{m}$, we use the spline $g$ which is the best approximation to $f$ in the $\ell^{2,t}$-norm. In other words, the spline we seek will be the solution to the minimization problem $\min_{h \in \mathbb{S}_{\mathbf{d},\tau}} \| f - h \|_{\ell^2, t}$. Solving this problem is equivalent to solving the linear least squares problem given by
$$\min_{\mathbf{c}} \big\| W_t^{1/2} (A \mathbf{c} - \mathbf{f}) \big\|_{\ell^2}^2, \tag{3.1}$$
where $A = \bigotimes_{k=1}^{s} A_k$ is the knot insertion matrix from $\tau$ to $t$ (i.e. $A_k$ is the knot insertion matrix from $\tau^k$ to $t^k$, for $k = 1, \ldots, s$), $\mathbf{f} = \operatorname{vec}(F)$ are the given B-spline coefficients of $f$ in $\mathbb{S}_{\mathbf{d},t}$ and $\mathbf{c} = \operatorname{vec}(C)$ are the unknown B-spline coefficients of $g$ in $\mathbb{S}_{\mathbf{d},\tau}$. Since the knot insertion matrix $A$ has full rank and $W_t$ is non-singular, the normal equations $A^T W_t A \mathbf{c} = A^T W_t \mathbf{f}$ associated with the system (3.1) will have a unique solution, which can be found ([2,3]) by solving a series of $s$ tensor equation systems given by
$$(A_k^T W_{t^k} A_k) D_k^{\sigma_k} = (A_k^T W_{t^k}) D_{k-1}^{\sigma_k}, \tag{3.2}$$
for $k = 1, \ldots, s$. Here $D_k \in \mathbb{R}^{\mathbf{n}_k}$ with $\mathbf{n}_k = (n_1, \ldots, n_k, m_{k+1}, \ldots, m_s)$, we let $D_0 = F$, and we set the coefficients of the approximation $g$ equal to the solution of the last tensor equation system, $C = D_s$. The tensor equations (3.2) can be efficiently solved by calculating the Cholesky factorization of the banded coefficient matrix $(A_k^T W_{t^k} A_k)$ and solving for each right hand side in the tensor $(A_k^T W_{t^k}) D_{k-1}^{\sigma_k}$.
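For a concrete picture, here is a two-direction ($s = 2$) sketch of the directional solve in NumPy (our illustration, not the paper's implementation: dense random stand-ins replace the banded knot insertion matrices, and a generic solver replaces the Cholesky factorization):

```python
import numpy as np
from numpy.linalg import solve

def tensor_lsq_2d(A1, A2, W1, W2, F):
    # Direction-by-direction solve of the tensor-product normal
    # equations; W1, W2 are 1-D arrays of positive coefficient weights.
    M1 = (A1.T * W1) @ A1                 # A1^T W1 A1 (banded in practice)
    D1 = solve(M1, (A1.T * W1) @ F)       # solve along direction 1
    M2 = (A2.T * W2) @ A2
    D2 = solve(M2, (A2.T * W2) @ D1.T)    # solve along direction 2
    return D2.T                           # coefficient tensor C

rng = np.random.default_rng(0)
A1 = rng.standard_normal((6, 4))          # stand-ins for knot insertion matrices
A2 = rng.standard_normal((5, 3))
W1 = rng.uniform(0.5, 2.0, 6)
W2 = rng.uniform(0.5, 2.0, 5)
F = rng.standard_normal((6, 5))

C = tensor_lsq_2d(A1, A2, W1, W2, F)

# C satisfies the full normal equations
# (A1^T W1 A1) C (A2^T W2 A2) = A1^T W1 F W2 A2.
lhs = ((A1.T * W1) @ A1) @ C @ ((A2.T * W2) @ A2)
rhs = (A1.T * W1) @ F @ (W2[:, None] * A2)
assert np.allclose(lhs, rhs)
```

Each directional solve reuses one small factorization for all right hand sides, which is the source of the efficiency the text describes.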
3.2 Ranking the knots
The final approximation to the initial spline is found by searching through a sequence of
approximations, constructed by using the approximation method of the previous section,
on subsets of the knots of the initial spline. These subsets are calculated by associating a
weight with each interior knot, representing a rough measure of its importance. See [6] for the details. For higher dimensional tensor product splines we set the weight for a given knot to the maximum of the weights corresponding to this knot when the calculation is iterated over the "remaining" parameter directions. We refer to [2] for further details.
4 Knot removal methods
When removing knots from a tensor product spline we are faced with more options than in the case of a spline curve. In this section we present two different ways of performing knot removal. The first, studied in [2] and based on a symmetric approach, treats all the parameter directions of a tensor product spline simultaneously, while the second treats one parameter direction at a time.
4.1 Knot removal based on a symmetric approach
If we let $G_f(\tau)$ denote the approximation to $f \in \mathbb{S}_{\mathbf{d},t}$ defined on the knot vector $\tau$, we see that the approximations in the sequence mentioned above can be written $\{ G_f(\tau_j) \}_{j=0}^{N}$, where $\tau_j$ is constructed from $t$ by removing $j$ of its interior knots, and $N = \sum_{k=1}^{s} [m_k - (d_k + 1)]$ is the total number of interior knots of $t$. Given such a sequence of approximations we can perform a search on the index $j$ to determine an approximation $g^* = G_f(\tau^*)$ to the initial spline $f$ with a preferably short knot vector $\tau^*$, and with the property that $\| f - g^* \|_{\ell^\infty, t} \le \varepsilon$, where $\varepsilon$ is the specified tolerance. If the knot vector $\tau^*$ is not equal to either of the two knot vectors $\tau_0$ or $\tau_N$ we may repeat the process to find a new approximation based on $g^*$, as proposed in [6]. Taking into account how the sequence $\{ G_f(\tau_j) \}_{j=0}^{N}$ was constructed, we expect the error $\| f - G_f(\tau_j) \|_{\ell^\infty, t}$ to decrease, but not necessarily strictly, for decreasing values of the search parameter $j$. How the search among the possible approximations is done will generally depend on a number of factors, including some which will be discussed later through examples. Also note that we only have to compute approximations for the indexes actually used in the search. By treating all the directions simultaneously we take into consideration the inherent symmetry of the problem. As we will see later this will in some cases enable us to remove more knots than by treating one parameter direction at a time, but it will also lead to more complicated and slower code in an implementation.
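The binary-search variant of this search can be sketched generically (our illustration; it assumes the tolerance test is monotone in $j$, which, as noted above, holds only roughly in practice):

```python
def max_removable(num_interior, fits):
    # Binary search for the largest number j of removable interior knots
    # such that the approximation G_f(tau_j) still satisfies the
    # tolerance; fits(j) returns True in that case.
    lo, hi = 0, num_interior
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits(mid):
            lo = mid            # at least mid knots can be removed
        else:
            hi = mid - 1
    return lo

# Toy stand-in for the tolerance test: up to 7 of 12 knots can go.
assert max_removable(12, lambda j: j <= 7) == 7
assert max_removable(12, lambda j: True) == 12
assert max_removable(12, lambda j: j <= 0) == 0
```

Only $O(\log N)$ of the $N + 1$ candidate approximations are ever computed, which matches the remark that approximations are needed only for indexes actually visited by the search.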
4.2 Knot removal for one parameter direction at a time
In the second knot removal method we start by thinking of a spline $f \in \mathbb{S}_{\mathbf{d},t}$ as a series of parametric curves in corresponding high dimensional spaces. We can then perform a parametric knot removal for each parameter direction. The advantage of this approach is that it is easy to implement, since we may use existing knot removal routines for spline curves with only minor modifications.
In the following discussion we let $\varepsilon = \sum_{i=1}^{s} \varepsilon_i$, with $\varepsilon_i > 0$ for all $i$, be a given tolerance. Also let $f(\mathbf{x}) = \sum_{\mathbf{i} \le \mathbf{m}} f_{\mathbf{i}} B_{\mathbf{i},\mathbf{d},t}(\mathbf{x}) = B_t^T \mathbf{f}$ be a spline in $\mathbb{S}_{\mathbf{d},t} = \bigotimes_{k=1}^{s} \mathbb{S}_{d_k, t^k}$ with $B_t = \bigotimes_{k=1}^{s} B_{t^k}$ and $\mathbf{f} = \operatorname{vec}(F)$. We start by identifying a series of parametric curves which may be naturally associated with this tensor product spline. We say that the spline $f$ consists of the curves $f_k(x_k)$, for $k = 1, \ldots, s$, where $f_k(x_k)$ is the parametric curve in $\mathbb{R}^{M_k}$ for $M_k = \big( \prod_{i=1}^{k-1} m_i \big) \big( \prod_{i=k+1}^{s} m_i \big)$, given by
$$f_k(x_k) = \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{m_i} \Big) \otimes B_{t^k}^T \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \mathbf{f}.$$
We now return to the problem of finding a preferably short knot vector $\tau \subset t$ and a spline $g(\mathbf{x}) = \sum_{\mathbf{j} \le \mathbf{n}} c_{\mathbf{j}} B_{\mathbf{j},\mathbf{d},\tau}(\mathbf{x}) \in \mathbb{S}_{\mathbf{d},\tau} = \bigotimes_{k=1}^{s} \mathbb{S}_{d_k, \tau^k}$ with the property that $\| f - g \|_{\ell^\infty, t} \le \varepsilon$. To apply knot removal to $f \in \mathbb{S}_{\mathbf{d},t}$ we can now go through the following steps for $k = 1, \ldots, s$.

1. Apply parametric knot removal with the tolerance $\varepsilon_k$ to the parametric curve
$$f_k(x_k) = \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{n_i} \Big) \otimes B_{t^k}^T \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \mathbf{f}_{k-1},$$
defined on $t^k$, starting with $\mathbf{f}_0 = \mathbf{f}$.

2. This will produce a new parametric curve $\tilde{f}_k$ defined on the knot vector $\tau^k \subset t^k$, with coefficients $\mathbf{f}_k = \operatorname{vec}(F_k)$ for $F_k \in \mathbb{R}^{n_1, \ldots, n_k, m_{k+1}, \ldots, m_s}$.

3. We also have that
$$\tilde{f}_k(x_k) = \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{n_i} \Big) \otimes B_{\tau^k}^T \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \mathbf{f}_k = \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{n_i} \Big) \otimes B_{t^k}^T \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{n_i} \Big) \otimes A_k \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \mathbf{f}_k,$$
where $A_k$ is the knot insertion matrix from $\tau^k$ to $t^k$.

4. Consequently,
$$\big\| f_{k-1} - \tilde{f}_k \big\|_{\ell^\infty, t^k} = \Big\| \mathbf{f}_{k-1} - \Big[ \Big( \bigotimes_{i=1}^{k-1} I_{n_i} \Big) \otimes A_k \otimes \Big( \bigotimes_{i=k+1}^{s} I_{m_i} \Big) \Big] \mathbf{f}_k \Big\|_{\ell^\infty} \le \varepsilon_k.$$
Finally we let the coefficients of the function $g(\mathbf{x}) = B_\tau^T \mathbf{c} \in \mathbb{S}_{\mathbf{d},\tau}$ be $\mathbf{c} = \operatorname{vec}(F_s)$, and we have the following result.

Theorem 4.1 If we let $f(\mathbf{x}) = B_t^T \mathbf{f} \in \mathbb{S}_{\mathbf{d},t}$ and $g(\mathbf{x}) = B_\tau^T \mathbf{c} \in \mathbb{S}_{\mathbf{d},\tau}$ be the tensor product splines from the discussion above, then we have $\| f - g \|_{\ell^\infty, t} \le \varepsilon$.

Proof: Let $A = \bigotimes_{k=1}^{s} A_k$ be the knot insertion matrix from $\tau$ to $t$, and let $f_0(\mathbf{x}) = B_t^T \mathbf{f}_0$ be equal to $f$ and $f_s(\mathbf{x}) = B_\tau^T \mathbf{f}_s$ be equal to $g$, i.e. $\mathbf{f}_0 = \mathbf{f}$ and $\mathbf{f}_s = \mathbf{c}$. Writing $P_k = \big( \bigotimes_{i=1}^{k-1} I_{n_i} \big) \otimes A_k \otimes \big( \bigotimes_{i=k+1}^{s} I_{m_i} \big)$, the mixed-product rule gives $A = P_1 P_2 \cdots P_s$, so that
$$\| f - g \|_{\ell^\infty, t} = \| \mathbf{f}_0 - A \mathbf{f}_s \|_{\ell^\infty} = \Big\| \sum_{k=1}^{s} P_1 \cdots P_{k-1} \big( \mathbf{f}_{k-1} - P_k \mathbf{f}_k \big) \Big\|_{\ell^\infty} \le \sum_{k=1}^{s} \big\| \mathbf{f}_{k-1} - P_k \mathbf{f}_k \big\|_{\ell^\infty, t^k} \le \sum_{k=1}^{s} \varepsilon_k = \varepsilon,$$
where the first inequality uses that each knot insertion matrix, and hence each $P_j$, has non-negative entries with row sums equal to one. □

5 Examples
The knot removal methods presented above have been implemented and tested on a
computer. In this section we present trivariate examples from this implementation and
propose different knot removal strategies depending on the problem at hand. See [3] for
a detailed description of this implementation.
Example 5.1 In this first example we will compare two different strategies for searching through a list of approximations $\{ G_f(\tau_j) \}_{j=0}^{N}$, introduced above. We will consider the knot removal method treating one parameter direction at a time, which means that we end up solving a parametric knot removal problem with tolerance $\varepsilon_i = \varepsilon/3$, $i = 1, 2, 3$, for each of the three parameter directions.

To improve efficiency, the parametric knot removal routine implemented is constructed in a way that lets it abort the computation if an approximation for any component of the parametric curve fails to lie within the specified tolerance. This fact suggests a search strategy where we compute successive approximations to the initial spline by adding one interior knot at a time, starting with zero interior knots, and where each intermediate approximation is given by the first of these approximation processes to be completed. Intuitively we would expect such a sequential search strategy to perform best for "large" tolerances and/or large problems, where there is more to be gained by aborting an approximation process. In this example we have compared this search strategy with a strategy proposed in [6] using a binary search.
In all the tests we have used an initial trilinear spline constructed by sampling the function given by $f(x, y, z) = \frac{1}{3}[\sin(2\pi x) + \sin(2\pi y) + \sin(2\pi z)]$ in the points specified by a uniform 3-dimensional grid on the domain $\Omega = [0, 1]^3$, for four selected grid sizes. Each spline was reduced by using both of the search strategies mentioned above, for tolerances varying from $\varepsilon = 0.001$ to $\varepsilon = 0.01$. Both of the search strategies produced approximately the same end grid size in each test.
In Figure 1 the CPU-time of the two search strategies is plotted against the tolerance for the selected grid sizes. We observe that the reductions utilizing a binary search perform best on small problems, while the sequential search strategy turns out to be superior for large problems.
FIG. 1. A comparison of two different search strategies (binary search vs. sequential search): CPU-time against tolerance for problem sizes (a) 25³, (b) 100³, (c) 250³ and (d) 400³.
Example 5.2 In this example we compare the two different knot removal methods presented in this paper. Here we have used an initial trilinear spline constructed by sampling an exponential test function $f(x, y, z)$ in the points specified by a uniform 3-dimensional grid on the domain $\Omega = [0, 1]^3$, for varying grid sizes. Each spline was reduced by both the method based on the symmetric approach and the method treating one parameter direction at a time.

The results are presented in Table 1. We see that in our implementation the method using the symmetric approach is by far the slowest. However, at least for the type of function considered in this example, the method based on the symmetric approach gives a much better reduction than the other.
Knot removal for trilinear splines, tolerance ε = 0.005

Start grid |    Parametric, binary search         |    Symmetric, binary search
           | CPU     End grid      Error          | CPU     End grid      Error
100³       | 16.53   72 × 65 × 65  4.93800 · 10⁻³ | 63.23   54 × 53 × 53  4.92080 · 10⁻³
150³       | 56.44   81 × 71 × 71  4.80243 · 10⁻³ | 122.2   51 × 49 × 49  4.77236 · 10⁻³
200³       | 99.48   68 × 66 × 66  4.91142 · 10⁻³ | 300.9   54 × 50 × 51  4.98275 · 10⁻³
250³       | 165.3   74 × 62 × 62  4.74970 · 10⁻³ | 584.8   61 × 56 × 56  4.85916 · 10⁻³
300³       | 256.8   72 × 62 × 62  4.85316 · 10⁻³ | 1094    60 × 54 × 53  4.81551 · 10⁻³
350³       | 391.4   75 × 65 × 63  4.77028 · 10⁻³ | 1312    54 × 50 × 50  4.92422 · 10⁻³
400³       | 494.6   71 × 59 × 63  4.79631 · 10⁻³ | 1865    54 × 50 × 50  4.81064 · 10⁻³

TAB. 1. Knot removal for the trilinear splines of Example 5.2.
Bibliography

1. Arge, E., Dæhlen, M., Lyche, T. and Mørken, K. (1990). Constrained spline approximation of functions and data based on constrained knot removal. In: Algorithms for Approximation II, J. C. Mason and M. G. Cox (eds.), Chapman and Hall, London, 4-20.
2. Brenna, T. (1998). Knot removal for multivariate tensor product splines. Master thesis, part I. Dept. of Informatics, Univ. of Oslo.
3. Brenna, T. (1998). Knot removal for linear, bilinear and trilinear splines. Master thesis, part II. Dept. of Informatics, Univ. of Oslo.
4. Graham, A. (1981). Kronecker Products and Matrix Calculus With Applications. Ellis Horwood Series, Mathematics and its Applications.
5. Lyche, T. and Mørken, K. (1986). Knot removal for parametric B-spline curves and surfaces. Computer Aided Geometric Design, 4, 217-230.
6. Lyche, T. and Mørken, K. (1987). A data reduction strategy for splines with applications to the approximation of functions and data. IMA Journal of Numerical Analysis, 8, 185-208.
Fixed- and free-knot univariate least-squares data
approximation by polynomial splines
Maurice Cox, Peter Harris and Paul Kenward
National Physical Laboratory, Teddington, Middlesex, TW11 0LW, UK
maurice.cox@npl.co.uk, peter.harris@npl.co.uk, paul.kenward@npl.co.uk
Abstract
Fixed- and free-knot least-squares data approximation by polynomial splines is considered. Classes of knot-placement algorithms are discussed. A practical example of knot
placement is presented, and future possibilities in free-knot spline approximation are
addressed.
1 Introduction
The representation of univariate polynomial splines in terms of B-splines is reviewed (Section 2), leading to the problem of obtaining fixed- and free-knot $\ell_2$ spline approximations (Section 3). The accepted approach to the fixed-knot case is recalled (Section 4) and the manner in which spline uncertainties can be evaluated given (Section 5). The importance of families of spline approximants is emphasised (Section 6). The free-knot problem is formulated (Section 7) and several of the established and some lesser-known knot-placement strategies reviewed (Section 8). Conclusions are drawn and future possibilities indicated (Section 9).
2 Univariate polynomial splines
Let $I := [x_{\min}, x_{\max}]$ be an interval of the $x$-axis, and $x_{\min} = \lambda_0 \le \lambda_1 \le \lambda_2 \le \cdots \le \lambda_{N-1} \le \lambda_N \le \lambda_{N+1} = x_{\max}$ a partition of $I$. A spline $s(x)$ of order $n$ (degree $n - 1$) on $I$ is a piecewise polynomial of order $n$ on $(\lambda_j, \lambda_{j+1})$, $j = 0, \ldots, N$. The spline $s$ is $C^{n-k-1}$ at $\lambda_j$ if $\operatorname{card}(\lambda_i = \lambda_j,\ i \in \{1, \ldots, N\}) = k$. The partition points $\lambda = \{\lambda_j\}_1^N$ are the (interior) knots of $s$. To specify the complete set of knots needed to define $s$ on $I$ in terms of B-splines, the knots $\{\lambda_j\}_1^N$ are augmented by knots $\{\lambda_j\}_{1-n}^{0}$ and $\{\lambda_j\}_{N+2}^{q}$, $q = N + n$, satisfying
$$\lambda_{1-n} \le \cdots \le \lambda_0, \qquad \lambda_{N+1} \le \cdots \le \lambda_q.$$
For many purposes, a good choice [10] of additional knots is
$$\lambda_{1-n} = \cdots = \lambda_0, \qquad \lambda_{N+1} = \cdots = \lambda_q.$$
It readily permits derivative boundary conditions to be incorporated in spline approximants [7]. On $I$, $s(x)$ has the B-spline representation [5]
$$s(x) := s(c, \lambda; x) = \sum_{j=1}^{q} c_j N_{n,j}(\lambda; x), \tag{2.1}$$
where $N_{n,j}(\lambda; x)$ is the B-spline [5,12] of order $n$ with knots $\{\lambda_k\}_{j-n}^{j}$ and $c = (c_1, \ldots, c_q)^T$ are the B-spline coefficients of $s$. Each $N_{n,j}(\lambda; x)$ is a spline with knots $\lambda$, is non-negative and has compact support. Specifically,
$$N_{n,j}(\lambda; x) > 0, \quad x \in (\lambda_{j-n}, \lambda_j), \qquad \operatorname{supp}(N_{n,j}(\lambda; x)) = [\lambda_{j-n}, \lambda_j]. \tag{2.2}$$
The B-spline basis $\{N_{n,j}(\lambda; x)\}_{j=1}^{q}$ for splines of order $n$ with knots $\lambda$ is generally very well-conditioned [10]. Moreover, the basis functions for any $x \in [x_{\min}, x_{\max}]$ can be formed in an unconditionally stable manner using a three-term recurrence relation [5,12]. Specifically, the relative errors in the values $fl(N_{n,j}(\lambda; x))$ of the basis functions computed using IEEE floating-point arithmetic [18] satisfy
$$\big| fl(N_{n,j}(\lambda; x)) - N_{n,j}(\lambda; x) \big| \le C\, n\, N_{n,j}(\lambda; x)\, \eta,$$
where $C$ is a constant that is a small multiple of unity and $\eta$ is the unit roundoff of the floating-point processor [5]. The B-spline basis for splines of order 3 with interior knots at $\lambda = (1, 2, 5)^T$ and coincident end knots at $x = 0$ and $10$ is shown in Figure 1.

FIG. 1. The B-spline basis for splines of order 3 for some nonuniformly spaced knots. The first three B-spline basis functions are shown as solid lines and the remaining three as dotted lines.
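A minimal sketch of such a recurrence (ours, not the paper's code): the unoptimised de Boor-Cox recursion, using the convention that the order-1 splines are 1 on the half-open interval $[\lambda_j, \lambda_{j+1})$, so the right endpoint $x = x_{\max}$ would need special handling.

```python
import numpy as np

def bspline_basis(knots, n, x):
    # All basis values N_{n,j}(x) of order n (degree n - 1), built up
    # one order at a time; 0/0 terms at coincident knots are taken as 0.
    t = np.asarray(knots, dtype=float)
    q = len(t) - n                        # number of basis functions
    N = np.array([1.0 if t[j] <= x < t[j + 1] else 0.0
                  for j in range(len(t) - 1)])
    for order in range(2, n + 1):
        for j in range(len(t) - order):
            left = right = 0.0
            if t[j + order - 1] > t[j]:
                left = (x - t[j]) / (t[j + order - 1] - t[j]) * N[j]
            if t[j + order] > t[j + 1]:
                right = (t[j + order] - x) / (t[j + order] - t[j + 1]) * N[j + 1]
            N[j] = left + right
    return N[:q]

# Order 3 with interior knots (1, 2, 5), coincident end knots at 0 and 10
knots = [0, 0, 0, 1, 2, 5, 10, 10, 10]
vals = bspline_basis(knots, 3, 1.5)
assert abs(vals.sum() - 1.0) < 1e-12      # partition of unity
assert (vals >= 0).all()                  # non-negativity
```

The non-negativity and partition-of-unity checks reflect the properties quoted above; the stability bound holds because the recurrence only adds non-negative quantities.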
Valuable properties of $s$ can be deduced [12] from those of the B-splines. A useful property is that, for any $x \in I$, $s(x)$ is a convex combination of the coefficients of the B-splines whose support contains $x$. Thus, local bounds for $s$ can readily be found:
$$\min_{j < k \le j+n} c_k \le s(x) \le \max_{j < k \le j+n} c_k, \qquad x \in [\lambda_j, \lambda_{j+1}].$$
These bounds imply a mimicking property for $s$, viz., that the elements of $c$ tend to vary in much the same way that $s$ varies. Figure 2 depicts a spline curve $s$ of order 4 with "non-polynomial" shape having interior knots at $x = (1, 2, 5)^T$, coincident end knots at $x = 0$ and $10$, and B-spline coefficients $(0.00, 0.20, 0.60, 0.22, 0.18, 0.14, 0.12)^T$.
To reproduce this shape to visual accuracy with a polynomial would require a high
degree and hence many more defining coefficients. The mimicking property is evident:
successive elements of c rise, fall sharply and then gently, behaving in a similar way to s.
FIG. 2. A spline curve with "non-polynomial" shape illustrating the mimicking property.
3 Fixed- and free-knot approximation
Two types of data approximation (or data modelling) in the $\ell_2$ norm by splines are regularly considered. One is the determination of the B-spline coefficients $c$ for given data, a prescribed order $n$ and prescribed knots $\lambda$. The other is the determination of $c$ and $\lambda$ for given data and spline order $n$. The former problem is linear with respect to the parameters of the spline, just $c$ being regarded as unknown. The latter is nonlinear, both $c$ and $\lambda$ being unknown.

The linear case is well understood, with highly satisfactory algorithms [10] and software implementations [1, 16] available. The nonlinear case remains a research problem, although useful algorithms (Section 8) have been proposed, implemented and used. Many of these algorithms "iterate" with respect to $\lambda$, where for each choice of knots the resulting linear problem is solved for $c$. Thus, the linear problem (Section 4) is important in its own right and as part of the solution strategy for knot-placement algorithms.
4 Least-squares data approximation by splines with fixed knots
The $\ell_2$ data approximation problem for splines with fixed knots can be posed as follows. Given are data points $\{(x_i, y_i)\}_1^m$, with $x_1 < \cdots < x_m$, and corresponding weights $\{w_i\}_1^m$ or standard uncertainties $\{u_i\}_1^m$. The $w_i$ reflect the relative quality of the $y_i$.¹ $u_i$ is the standard uncertainty of $y_i$ and corresponds to the standard deviation of possible "measurements" at $x = x_i$ of the function underlying the data, $y_i$ being one realisation. Given also are the $N$ knots $\lambda = \{\lambda_j\}_1^N$ and the order $n$ of the spline $s$.

When weights are specified, the problem is to determine the spline $s(x)$ of order $n$, with knots $\lambda$, such that the two-norm of $\{w_i e_i\}_1^m$ is minimised with respect to $c$.
¹The $x_i$ are taken as exact for the treatment here. A generalised treatment is possible, in which the $x_i$ are also regarded as inexact. The problem becomes nonlinear (in $c$).
When standard uncertainties are specified, the two-norm of $\{u_i^{-1} e_i\}_1^m$ is minimised with respect to $c$. If $w_i = u_i^{-1}$, $i = 1, \ldots, m$, the two formulations are identical in terms of the spline produced. When weights are specified, $s$ is referred to as a spline approximant. When uncertainties are prescribed, $s$ is known as a spline model. There are differences (Section 5) in interpretation in terms of the statistical uncertainties associated with the solution and in terms of validating the spline model so obtained.
The use of a formulation in terms of standard uncertainties, together with the B-spline representation (2.1) of $s$, gives the linear algebraic formulation²
$$\min_c e^T V_y^{-1} e, \qquad e = y - Ac, \tag{4.1}$$
where $y = (y_1, \ldots, y_m)^T$, $A$ is an $m \times q$ matrix with $a_{i,j} = N_{n,j}(x_i)$, and $V_y = \operatorname{diag}(u_1^2, \ldots, u_m^2)$. Matrix computational methods can be applied to this formulation. As a consequence of property (2.2) of the B-splines, $A$ is a rectangular banded matrix of bandwidth $n$ [8].
The linear algebraic solution can be effected using Givens rotations to triangularise the system, back-solution then yielding the coefficients $c$ [6]. The number of floating-point operations (flops) required is to first order $O(mn^2)$, i.e., independent of the number of knots. Hence computing a spline model for many knots is hardly more expensive than one for a few knots. Moreover, since for many problems cubic splines ($n = 4$) yield a good balance between approximation properties and smoothness (continuity class $C^2$), regarding the order as fixed gives a flop count $O(m)$.
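A minimal sketch of the weighted solve (ours, not the paper's code): a Vandermonde matrix stands in for the banded B-spline observation matrix, and NumPy's dense QR stands in for the Givens-rotation triangularisation; the uncertainty formulation is handled by scaling each equation by $1/u_i$.

```python
import numpy as np

def weighted_fit(A, y, u):
    # Solve min (y - Ac)^T Vy^{-1} (y - Ac), Vy = diag(u_i^2), by scaling
    # each equation by 1/u_i and triangularising with an orthogonal
    # factorization, then back-solving for c.
    Aw = A / u[:, None]
    Q, R = np.linalg.qr(Aw)
    return np.linalg.solve(R, Q.T @ (y / u))

# Stand-in observation matrix: a polynomial basis instead of B-splines.
x = np.linspace(0.0, 1.0, 20)
A = np.vander(x, 5)
c_true = np.array([0.1, -0.4, 0.6, 0.2, 1.0])
u = np.full(20, 0.1)             # equal standard uncertainties
y = A @ c_true                   # noise-free data for the check
c = weighted_fit(A, y, u)
assert np.allclose(c, c_true)
```

For the real B-spline matrix the same factorization is applied to the banded structure, which is what brings the cost down to $O(mn^2)$.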
The vector $c$ is unique [11] if there is a strictly ordered subset $t = \{t_j\}_1^q$ of $x$ such that the Schoenberg-Whitney conditions [21]
$$t_j \in \operatorname{supp}(N_{n,j}(\lambda; x)), \qquad j = 1, \ldots, q, \tag{4.2}$$
hold. In a case where the conditions (4.2) do not hold³, an appropriate member can be selected from the space of possible solutions. Such a selection is also advisable if the conditions are in a practical sense "close" to being violated. A particular solution can be determined by augmenting the least-squares formulation by a minimal number of equality constraints for $c$ such that $A$ has full column rank [10].
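A greedy check of conditions (4.2) can be sketched as follows (our illustration, not from the paper; supports are treated as closed intervals, a simplification that is slightly permissive when a data point coincides with an interior knot):

```python
import numpy as np

def schoenberg_whitney_ok(x, knots, n):
    # For each basis function N_{n,j}, whose support is
    # [knots[j], knots[j + n]] in 0-based indexing, take the smallest
    # unused abscissa lying in that support; the supports are ordered,
    # so this greedy matching succeeds iff a valid subset t exists.
    q = len(knots) - n
    i = 0
    for j in range(q):
        lo, hi = knots[j], knots[j + n]
        while i < len(x) and x[i] < lo:
            i += 1
        if i == len(x) or x[i] > hi:
            return False        # no admissible data point remains
        i += 1                  # consume this abscissa as t_j
    return True

# Cubic case (n = 4), interior knots (1, 2, 5), coincident end knots
knots = [0, 0, 0, 0, 1, 2, 5, 10, 10, 10, 10]
x_good = np.array([0.0, 0.5, 1.2, 2.5, 3.0, 6.0, 9.0])
x_bad = np.array([0.0, 5.5, 6.0, 7.0, 8.0, 9.0, 9.5])   # no data over (0, 2)
print(schoenberg_whitney_ok(x_good, knots, 4))   # True
print(schoenberg_whitney_ok(x_bad, knots, 4))    # False
```

Such a test flags knot sets (e.g. from an automatic placement procedure) for which the augmented-constraint treatment described above is needed.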
An instance of the type of data set to which the algorithms of this paper are addressed is shown in Figure 3. Such a data set (cf. Section 2) has a variety of behaviour that cannot readily be reproduced by some other classes of approximating functions.
5 Spline uncertainties
Once a valid spline model has been obtained, the uncertainties associated with the spline can be evaluated [9]. Uncertainty evaluations are essential in metrology, where all measurement results are to be accompanied by a quantification of their reliability [2], and important in other fields. The key entity is the covariance matrix $V_c$ of the spline

²A further generalisation is possible in which mutual dependencies are permitted among the measurement errors. In this case, $V_y$ is non-diagonal.

³A set of knots giving rise to this circumstance may be a consequence of an automatic knot-placement procedure.
FIG. 3. A data set representing heat flow as a function of temperature. Such data forms the basis of the determination of thermophysical properties of materials under test. For clarity only every fifth data point is shown.
coefficients $c$. Using recognised procedures of linear algebra,
$$V_c = (A^T V_y^{-1} A)^{-1}. \tag{5.1}$$
From this result, the standard uncertainty of any quantity that depends on $c$ can be evaluated. Specifically, for a given constant vector $p$, the standard uncertainty $u(p^T c)$ of $p^T c$ is given by
$$u^2(p^T c) = p^T V_c p.$$
By setting $p$ to contain the values of the B-spline basis at a point $x \in I$, the standard uncertainty of $s(x)$ can be formed. The standard uncertainty of a nonlinear function of $c$ can be estimated by first linearising the expression about the solution value of $c$.

If weights rather than uncertainties are specified for the data, (5.1) takes the form
$$V_c = \hat{\sigma}^2 (A^T W^2 A)^{-1},$$
where $\hat{\sigma}$ estimates the standard deviation of the weighted residuals $\{w_i e_i\}_1^m$, $W = \operatorname{diag}(w_1, \ldots, w_m)$, and $e$ is evaluated at the solution.
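The covariance computation can be sketched directly from (5.1) (our illustration, with a made-up polynomial observation matrix in place of the B-spline matrix $A$ and hypothetical basis values in $p$):

```python
import numpy as np

# Covariance of the fitted coefficients, Vc = (A^T Vy^{-1} A)^{-1}, and
# the standard uncertainty u(p^T c) = sqrt(p^T Vc p).
A = np.vander(np.linspace(0.0, 1.0, 15), 4)   # stand-in for B-spline values
u = np.full(15, 0.05)                         # standard uncertainties u_i
Vy_inv = np.diag(1.0 / u**2)
Vc = np.linalg.inv(A.T @ Vy_inv @ A)

# p holds the basis values at some point x; then u(s(x)) = sqrt(p^T Vc p)
p = np.array([0.001, 0.01, 0.1, 1.0])         # hypothetical basis values
u_sx = float(np.sqrt(p @ Vc @ p))
assert u_sx > 0.0
```

In production code the explicit inverse would be avoided in favour of working with the triangular factor from the least-squares solve, but the quadratic form $p^T V_c p$ is the same.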
6 Families of approximants
When dealing with certain classes of approximating function it is natural and useful to consider families of approximants. A simple example is polynomial approximation, for polynomials $p_j(x)$ of order $j = 1, 2, \ldots, N$, for some maximum order $N$. Each member of the family "contains" the previous member. It is then meaningful to consider the approximation measure, e.g., the $\ell_2$-norm here, with respect to indices denoting members. Thus, the value of the $\ell_2$-norm for the polynomial approximant of order $j$ can be inspected with respect to index $j$ for $j = 1, 2, \ldots, N$. For data approximation, it is more meaningful to use as the measure the root-mean-square residual given by dividing the $\ell_2$-norm by $(m - j)^{1/2}$. For representative data, the expectation is that as $j$ increases this quantity should stabilise to an essentially constant value. This property provides a useful validation procedure. If weights $u_i^{-1}$ are used as in Section 4 this measure should settle to the value unity. Thus the approximant with index $j$ (normally the smallest such) that achieves the value one is sought.

Within most of the strategies outlined in Section 8 it is possible to produce results for $N = 1, 2, \ldots$ knots, and thus to study the effect of the number of knots on the quality of the approximant. From such information it may be possible to select an acceptable solution. If, for each number of knots, the knots contain those for the previous number, and an $\ell_2$ approximant is determined, the sequence of approximants for $N = 1, 2, \ldots$ knots forms a family. A family has the property that the sequence of values of the $\ell_2$-norm is monotonically decreasing.
7 Least-squares data approximation by splines with free knots
The problem of least-squares data approximation by splines with free knots can be formulated in the same way as that for fixed knots (Section 4), except that the knots are not specified a priori, either in location or number. The formulation (4.1) no longer yields a linear problem, since the matrix $A$ of B-spline values is now a function of $\lambda$. Instead, $e(\lambda) = y - A(\lambda)c$, and it is required to solve
$$\min_{\lambda;\, c} e^T(\lambda) V_y^{-1} e(\lambda). \tag{7.1}$$
In order to reflect the fact that for any given knot set the B-spline coefficients are given by solving a relatively simple, linear problem, formulation (7.1) can be expressed as
$$\min_{\lambda} \Big( \min_{c} e^T(\lambda) V_y^{-1} e(\lambda) \Big). \tag{7.2}$$
Extensive use is made of this elementary result.
8 Knot-placement strategies
Many knot-placement strategies have been proposed and used. Some of these strategies
are outlined and their properties indicated. Several of the strategies generate a family of
candidate spline approximants, with advantages for model validation.
8.1 Manual methods
Manual methods can be classed as those methods for which the user examines the general
"shape" of the function underpinning the data, selecting the number and location of the
knots on this basis. With practice and visual aids, acceptable solutions can often be
obtained [6]. Naturally, knots are chosen to be more concentrated where "things are
happening" in contrast to regions where the underpinning behaviour is innocuous.
8.2 Strategies that depend only on abscissa values
Strategies based on the manner in which the values of the independent variable are
distributed may be used to place the knots (at points that are not necessarily the data
336
M. Cox, P. Harris and P. Kenward
abscissae themselves). A facility in DASL (the NPL Data Approximation Subroutine
Library) [1] provides one such strategy, based on the Schoenberg-Whitney conditions
(4.2) in the following way. Intuitively, these conditions imply that there is no region where
there are "too many" knots compared with the number of data points. Mathematically,
these conditions guarantee uniqueness. Numerically, their satisfaction does not ensure
that the solution is well-defined. If the conditions are "close" to being violated, c will be
sensitive to perturbations in the data. In particular, since the behaviour of c "controls"
that of s (Section 2), the spline is likely to exhibit spurious behaviour such as large
undesirable oscillations if ||cl|2 > ||y||2It follows that a sensible choice of knots would be such that the Schoenberg-Whitney
conditions are satisfied "as well as possible" for a data subset. Such a choice is made in
DASL [1] for spline approximation of arbitrary order. It is also made in a cubic spline
interpolation routine in the NAG Library [16], regarding spline interpolation as a special
case of spline approximation in which $q = m$ and $N = m - n$. The choice made is seen most simply by first applying it to spline interpolation. Consider the choice
$$\lambda_j = \tfrac{1}{2}\big( x_{j + \lfloor n/2 \rfloor} + x_{j + \lfloor (n+1)/2 \rfloor} \big), \qquad j = 1, \ldots, m - n,$$
where $\lfloor v \rfloor$ is the largest integer no larger than $v$. For $n$ even, $\lambda_j = x_{j + n/2}$; thus the choice $t_j = \lambda_{j - n/2}$ would be made. However (Section 2), $\operatorname{supp}(N_{n,j}) = [\lambda_{j-n}, \lambda_j]$. Thus, index-wise, the Schoenberg-Whitney conditions are satisfied as well as possible in the sense that the index of $\lambda_{j - n/2}$ falls halfway between the indices of the support endpoints $\lambda_{j-n}$ and $\lambda_j$. Comparable considerations apply for $n$ odd. Precisely this choice is recommended [14, 16] in the context of cubic spline interpolation. It is the "not a knot" criterion, a practical alternative to the classical use of boundary derivatives. A knot is placed at each "interior" data value $x_i$ apart from $x_2$ and $x_{m-1}$.

The above choice can be interpreted as follows. Consider the graph $x = F(\ell)$ given by the join of the points $\{(i, x_i)\}_1^m$. The $j$th interior knot, $\lambda_j$, for $j = 1, \ldots, m - n$, is given by $F(j + n/2)$. The successive spacings between the index arguments of $F$ for $j = 0, \ldots, N + 1$, using $F(0) = x_{\min}$ and $F(N + 1) = x_{\max}$, are therefore
$$1 + n/2, \underbrace{1, \ldots, 1}_{N-1}, 1 + n/2.$$
For approximation, these successive spacings are proportionally increased to account for the fact that there are fewer knots. The resulting expression for the $j$th interior knot is
$$\lambda_j = F\big( 1 + (m - 1)(j + n/2 - 1)/(q - 1) \big), \qquad j = 1, \ldots, N.$$
The choice can be interpreted as placing the interior knots such that there is an approximately equal number of data points in each knot interval (the interval between adjacent knots), except that in the first and the last interval there are approximately $n/2$ times as many points. The strategy [1] has the property that when $N$ is such that the data is interpolated, the choice of knots agrees with one of the recommended choices for spline interpolation.*
⁴The approach tends to give better knot locations if the data is gathered in a manner which ensures that the local density of the data is greater in regions where the behaviour of y is more marked.

Data approximation by polynomial splines
M. Cox, P. Harris and P. Kenward
Figure 4 illustrates the above strategy for a spline interpolant and approximant of
order 4 to data with abscissae x = (0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 3, 4, 5, 7.5, 10)ᵀ.
Each figure shows the graph x = F(ξ). For the interpolant (left-hand graph), ten knots
are chosen to coincide with the abscissa values x₃, …, x₁₂. For the approximant (right-hand
graph), four knots are chosen such that there are two points in each interval,
excepting the first and last interval where there are four points, i.e., n/2 = 2 times as
many. The distribution of the knots reflects that of the abscissa values.
FIG. 4. A knot placement strategy depending only on the abscissa values.
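The strategy above reduces to a few lines of code. The sketch below is an illustrative helper (not NPL or NAG library code), assuming 1-based data indices, q = N + n B-spline coefficients, and piecewise-linear interpolation for F:

```python
import numpy as np

def interior_knots(x, n, N):
    """Interior knots from the abscissae only, via the graph x = F(xi),
    the piecewise-linear join of the points (i, x_i), i = 1, ..., m."""
    x = np.asarray(x, dtype=float)
    m = len(x)
    q = N + n  # number of B-spline coefficients (assumption: q = N + n)
    j = np.arange(1, N + 1)
    # index arguments 1 + (m - 1)(j + n/2 - 1)/(q - 1), mapped through F
    t = 1 + (m - 1) * (j + n / 2 - 1) / (q - 1)
    return np.interp(t, np.arange(1, m + 1), x)
```

For the order-4 example above, `interior_knots(x, 4, 10)` reproduces the interpolation choice x₃, …, x₁₂, and `interior_knots(x, 4, 4)` gives four knots with roughly two data points per interior knot interval and four in the end intervals.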
A simpler strategy is to select uniformly spaced knots. The Schoenberg–Whitney
conditions will not automatically be satisfied by such a choice, and the spline
approximant would therefore not necessarily be unique, although the approach indicated
at the end of Section 4 could be applied.
8.3 Sequential knot-insertion strategies
In a sequential knot-insertion strategy, a succession of approximants is obtained, in which
for each approximant a knot is inserted in the knot interval that gives rise to the greatest
contribution to the ℓ₂ error. A knot interval is an interval between adjacent knots, where
the endpoints of I count as knots for this purpose. Previously inserted knots are retained
undisturbed. Several variants are possible (also see Section 8.10), e.g.:
• Start the process with a number of knots already in place, perhaps obtained from
information specific to the application.
• Candidate positions for a new knot are
* The continuum of points within the interval. The approach gives rise to the
minimisation of a univariate function that may possess local minima.
* The subset within the interval of a discrete set of points chosen a priori, e.g., the
data abscissae themselves or a uniformly spaced set of x-values. The approach
gives rise to a finite computation for the globally-best choice of knot, relative
to the discretisation, with respect to previous knots.
• More than one knot can be inserted at a time. Doing so gives an approach that
is intermediate between full optimisation (Section 8.6) and sequential (single) knot
insertion. Computation times rise rapidly with the number of "simultaneous" knots
so inserted, so in practice only a small number, say two or three, might be feasible.
The "upper set" of crosses in Figure 5 shows the root-mean-square residual as a function
of the number of knots for the application of this strategy to the thermophysical data
of Figure 3.
FIG. 5. The root-mean-square residual as a function of the number of knots for the
application of knot-insertion and knot-removal strategies to the thermophysical data of
Figure 3 (horizontal axis: N, the number of knots). The "upper set" of crosses indicates
the values obtained for knot insertion and the lower for knot removal. The knot-removal
strategy starts with the knot set provided by the knot-insertion strategy, which was
terminated after 81 knots had been placed. The figure depicts the root-mean-square
residual on a logarithmic scale; its value varies by a factor of 1000 from 1 to 81 knots.
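One step of this strategy, in its discrete-candidate form, can be sketched with SciPy's fixed-knot least squares solver. The helper below is a hypothetical illustration rather than any published implementation; it assumes sorted abscissae, simple interior knots and (by default) cubic splines:

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

def insert_knot(x, y, interior, k=3):
    """One sequential knot-insertion step: find the knot interval contributing
    most to the l2 error, then try each data abscissa inside it as the new knot
    and keep the one giving the smallest l2 error (Section 8.3)."""
    def fit(knots):
        # full knot vector: boundary knots taken with multiplicity k + 1
        t = np.concatenate(([x[0]] * (k + 1), np.sort(knots), [x[-1]] * (k + 1)))
        return make_lsq_spline(x, y, t, k)
    interior = np.asarray(interior, dtype=float)
    r2 = (y - fit(interior)(x)) ** 2
    edges = np.concatenate(([x[0]], np.sort(interior), [x[-1]]))
    sums = [r2[(x >= a) & (x <= b)].sum() for a, b in zip(edges[:-1], edges[1:])]
    a, b = edges[np.argmax(sums)], edges[np.argmax(sums) + 1]
    best = None
    for c in x[(x > a) & (x < b)]:  # candidate positions: data abscissae
        try:
            err = ((y - fit(np.append(interior, c))(x)) ** 2).sum()
        except Exception:            # e.g. Schoenberg-Whitney violated
            continue
        if best is None or err < best[0]:
            best = (err, c)
    if best is None:                 # no admissible candidate in this interval
        return np.sort(interior)
    return np.sort(np.append(interior, best[1]))
```

Repeated calls implement the sequential strategy; starting knots can be supplied through `interior`, matching the first variant listed above.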
8.4 Sequential knot-removal strategies
In a sequential knot-removal strategy, the starting point is an initial spline approximant
having a "large" number of knots that typically would be regarded as an acceptable
approximant to the data and that contains (perhaps many) more knots than desired. Also
see Section 8.10. Each successive approximant is obtained from the previous approximant
by deleting one (or more) knots. The knot selected for removal is chosen as that having
least effect in terms of the change in the ℓ₂ error. The process is continued until an
acceptable approximant is no longer obtained.
The initially large number of knots (Section 8.10) provides an appreciable number
of candidate knots for removal and thus greater flexibility. The rationale is that in
contrast to successive knot insertion, a succession of acceptable approximants is obtained,
as opposed to a succession of unacceptable approximants, until the final "solution" is
provided. There are variants, as with sequential knot insertion. For example, several
provided. There are variants, as with sequential knot insertion. For example, several
knots can be removed at each stage.
A different class of knot removal algorithms [20] is based on a general class of ℓp norms.
It is not concerned specifically with data approximation, but with replacing an initial
spline approximant (that may itself be an interpolant) by one that is acceptably
close according to the measure.
The "lower set" of crosses in Figure 5 shows the root-mean-square residual as a
function of the number of knots for the application of this strategy to the thermophysical
data part-depicted in Figure 3.
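One removal step of such a strategy might be sketched as follows; `remove_one_knot` is a hypothetical helper, assuming SciPy's fixed-knot least squares solver, sorted abscissae and cubic splines:

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

def remove_one_knot(x, y, interior, k=3):
    """One sequential knot-removal step: delete the interior knot whose
    removal least increases the l2 error (Section 8.4)."""
    def l2_error(knots):
        t = np.concatenate(([x[0]] * (k + 1), np.sort(knots), [x[-1]] * (k + 1)))
        s = make_lsq_spline(x, y, t, k)
        return ((y - s(x)) ** 2).sum()
    interior = np.asarray(interior, dtype=float)
    trials = [(l2_error(np.delete(interior, i)), i) for i in range(len(interior))]
    err, i = min(trials)
    return np.delete(interior, i), err
```

Iterating until the returned error exceeds the acceptance threshold reproduces the stopping rule described above.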
8.5 Theory-based approaches
The distance of a spline s(x) with knots λ from a sufficiently differentiable function
f(x) is proportional to hⁿ|f⁽ⁿ⁾(ξ)|, where h is the local knot spacing and ξ is a value of
x [14]. Consider inverting this expression in order approximately to equalise the error
with respect to x. The lengths of the knot intervals should consequently be chosen to be
proportional to |f⁽ⁿ⁾(ξ)|^(−1/n), where ξ is a value in the neighbourhood of the respective
knot interval. Consider the function

    F(x) = ∫_{x_min}^{x} |f⁽ⁿ⁾(t)|^{1/n} dt / ∫_{x_min}^{x_max} |f⁽ⁿ⁾(t)|^{1/n} dt.   (8.1)

Take knots given by

    F(λ_j) = j/(N + 1),   j = 1, …, N.   (8.2)

This result corresponds to dividing the range of the monotonically increasing function
F(x), for x ∈ I, into N + 1 contiguous subranges of equal length, taking the values of x
corresponding to the subrange endpoints as the knots.
In practice f, let alone F, is unknown. Various efforts have been made to estimate f
and hence F from the data points. For instance, if the data is approximated by a spline of
order n + 1, its nth derivative, a piecewise-constant function, can be used to estimate F
[3]. It is then straightforward to form the required knots. The approach begs the question
in the case of data: in order to estimate knots for a spline of order n, it is first necessary
to construct a spline approximant of order n + 1 for the data, the construction of which
itself requires a choice of knots.
Alternatively [13], a spline approximant of order n for the data can be constructed
for some convenient choice of knots. Its nth derivative is of course zero (except at the
knots). However, its (n — l)th derivative is piecewise constant, a function that can be
approximated by the join of the mean values at the knots of the constant pieces to the
immediate right and left, with special consideration at the endpoints of /. The derivative
of this piecewise-linear function then provides a piecewise-constant representation of the
nth derivative, that can be used as before. Knots can then be deduced from this form
as above. The advantage of this approach is that it can be iterated [13]. If the process
"converges", the result can be used to provide the required knot set. The process can
work well, but is capable of producing disappointing results. Several variants of the basic
concept are possible. The approach warrants careful re-visiting.
8.6 "Overall" optimisation approaches
For any given value of N, the problem is regarded as an optimisation problem with
respect to the overall error measure. It is necessary to provide a sensible initial estimate of
the knot positions. Local solutions, which may be grossly inferior to the global solution, are
possible [4]. At an optimal solution, knots may coalesce, thus reducing the continuity of
the spline at such points [19]; the same comment applies to the sequential-knot-insertion
and optimisation approach (Section 8.7).
8.7 Sequential knot insertion and optimisation
Sequential knot insertion with optimisation is identical to the sequential knot-insertion
strategy (Section 8.3) except that, after each knot is inserted, all previously-inserted
knots are adjusted such that the complete set of knots at that stage are (locally) optimal
with respect to the overall error measure. One such strategy [15] carries out the optimisation
at each stage by adjusting in turn each knot in the current knot set in order to
achieve a satisfactory reduction in the ℓ₂ norm, repeating the complete adjustment as
necessary. This strategy is not as poor as the traditional one-variable-at-a-time strategy
for nonlinear optimisation because knots far from the newly-inserted knot tend to have
little effect on the error measure.
Buffering to prevent knots coalescing and reducing the continuity of the approximant
can be used. Various features can be incorporated to improve computational efficiency,
including the use of contemporary nonlinear least-squares optimisation. It is emphasised
that for each choice of knots the problem is linear (cf. Section 7).
8.8 Optimal discontinuous piecewise-polynomial approximation
Consider the class S_N of splines having N interior knots of multiplicity n (i.e., nN
interior knots in all, counting coincidences). An s ∈ S_N will in general be discontinuous
at these knots. It is possible to determine the globally optimal locations of such knots,
using the principle of dynamic programming [4]. The approach is based on the fact that
the best approximant s_N ∈ S_N to the leading p (> nN) data points is given by the best,
over q = nN − n + 1, nN − n + 2, …, p − N, of s_{N−1} ∈ S_{N−1} for the leading q ≤ p − N
points, together with a polynomial piece of order n over points q + 1 to p. By this simple
recursive means the globally best knots for splines of any order that are discontinuous
at any number of knots can be computed.
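The recursion lends itself to a compact dynamic programme over the data indices. The sketch below restricts breaks to fall between data points and fits each piece by ordinary polynomial least squares; it illustrates the principle of [4] rather than reproducing that algorithm:

```python
import numpy as np

def optimal_breaks(x, y, n, N):
    """Globally optimal discontinuous piecewise polynomials of order n
    (degree n - 1) with N breaks, by dynamic programming over the data
    indices. Returns (total squared error, sorted break indices)."""
    m = len(x)

    def sse(i, j):
        """Least squares error of one polynomial piece over points i..j-1."""
        if j - i <= n:
            return 0.0  # the piece can interpolate these points exactly
        r = np.polyval(np.polyfit(x[i:j], y[i:j], n - 1), x[i:j]) - y[i:j]
        return float(r @ r)

    INF = float("inf")
    # cost[p][i]: best error for the leading i points using p pieces
    cost = [[INF] * (m + 1) for _ in range(N + 2)]
    back = [[0] * (m + 1) for _ in range(N + 2)]
    cost[1] = [sse(0, i) for i in range(m + 1)]
    for p in range(2, N + 2):
        for i in range(m + 1):
            for q in range(i + 1):
                c = cost[p - 1][q] + sse(q, i)
                if c < cost[p][i]:
                    cost[p][i], back[p][i] = c, q
    breaks, i = [], m
    for p in range(N + 1, 1, -1):
        i = back[p][i]
        breaks.append(i)  # break placed before the data point with this index
    return cost[N + 1][m], sorted(breaks)
```

The cost is cubic in m for fixed N, which is affordable for the small local problems in which this construction is used.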
Such a solution may not be suitable as the final result in an application. However,
it can be useful as part of a knot placement strategy. For example, suppose good knots
for a spline of order n are required. An approach would be to determine an optimal
discontinuous spline of order n + 1. Use this spline to estimate f in expression (8.1). The
integral in the numerator of (8.1) will be continuous piecewise linear, and estimates of
the optimal knots for a C⁽ⁿ⁻²⁾ spline are readily obtained from (8.2). Mixed results have
informally been obtained by the authors with an implementation of this approach. It is
suggested that it be revisited.
8.9 Knot dispersion
A set of knots of multiplicity n is positioned using an appropriate strategy, such as
that in Section 8.8, and a C⁽⁻¹⁾(I) spline with these knots determined. Each of these
multiple knots is "dispersed", viz., replaced by n nearby simple knots, and a replacement
C⁽ⁿ⁻²⁾(I) spline computed. A careful strategy for knot dispersion is required. Again,
informal experiments have been made by the authors and mixed results obtained.
8.10 Knot initialisation and candidate knot locations
Several of the above procedures require or can benefit from an initial placement of the
knots. Some make use of "candidate knot locations".
The solution to the free-knot spline approximation problem returned by iterative
algorithms typically depends on the starting set of knots. Although an algorithm may
return a result that satisfies the necessary and sufficient conditions for a solution [17], this
result may be locally rather than globally optimal. There is no known characterisation of
a globally optimal solution. The careful interpretation of solutions is therefore important.
The use of candidate knot positions can be helpful. For instance, it may be decided
that for splines of even order, only knots that coincide with data abscissae are in the
candidate set, or, for splines of odd order, only knots at points mid-way between adjacent
data abscissae may be so regarded. Such criteria are consistent with the choice for
interpolating splines and the generalisation covered in Section 8.2. The Lyche–Mørken
knot removal algorithms [20] use data abscissae as candidate knots. The use of a finite
number of candidate knot locations helps to reduce the dimensionality of the problem:
there can then only be a finite number of possible knot sets. For large N this number can
be extremely large, making it prohibitive to examine all possibilities. However, for small
N, e.g., 1, 2 and 3, it may indeed be possible, and can pay dividends. Knot insertion
and knot removal algorithms can also implement the concept. For example, at each stage
of a knot insertion strategy, two or three knots can be inserted "simultaneously". By
the method of their introduction these new knots will be optimal relative to the knots
previously used and the available candidate knot locations.
Another aspect of a candidate knot set is that if it is sufficiently dense it will contain,
to a degree of approximation dictated by its "spacing", the optimal knots for the given
data set [19]. For instance, consider a set of m ≫ 100 data points specified over an interval
I normalised to [−1, 1]. Take 100 uniformly spaced points spanning this interval. This set
will contain, to approximately two figures, each globally optimal knot set having N ≤ 98
knots⁵ (assuming all knots are simple). If a spline based on these 98 candidate interior
knots provided a valid model, a suitable knot removal algorithm might be expected to be
able to identify reasonably closely the optimal knot sets. Work is required to determine
the degree of success in this regard.
9 Conclusions, discussion and future possibilities
There are theoretical difficulties associated with the existence, uniqueness and characterisation of best free-knot ℓ₂ spline approximants, which influence practical considerations.
⁵The two endpoints do not constitute interior knots.
A best spline in the class of splines required may not exist. Take as {x_i}₁^m, m = 21,
uniformly spaced values in [−1, 1] and y_i = |x_i|. To see that a best ℓ₂ spline s of order 4
with three interior knots for this data may not exist, consider the choice λ₁ = −ε, λ₂ = 0
and λ₃ = ε. The ℓ₂ error can be made smaller than any given δ > 0 for some ε > 0.
However, if the ℓ₂ error is made zero by the choice ε = 0, the resulting three coincident
knots at x = 0 mean that s has lower continuity than the class of splines considered. In
practice, allowing knots to come "too close" together can introduce undesirable "sharpness"
into the approximant. Buffering of knots [15], to ensure a minimal separation, helps
in this regard. The use of a candidate knot set introduces a form of buffering. In some
circumstances the coalescing of knots would be ideal in terms of the resulting closeness
of s to the data. In some applications the loss of smoothness would be unacceptable.
Therefore, whether buffering is appropriate depends on the use to be made of s.
The solution may not be unique. Figure 6 shows a set of 201 uniformly spaced points
in [−1, 1] taken from f(x) = sign(x) min(|x|, 1/2). Figure 7 shows the root-mean-square
residual as a function of knot location for ℓ₂ splines of order 4 with one interior knot.
There are two best approximants, one with its knot at x = −0.63 and the other at
x = +0.63. One of the two approximants is shown in Figure 6. The other spline is its
skew-symmetric counterpart.
FIG. 6. 201 uniformly spaced points in [−1, 1] taken from f(x) = sign(x) min(|x|, 1/2)
and a best ℓ₂ spline approximant with one knot.
It is rarely required to determine an ℓ₂ spline approximant that is globally or even
locally optimal with respect to its knots. An approximant that meets some closeness
requirement with the smallest possible number of knots is an academic rather than a
pragmatic objective. Today, the more important consideration is to obtain an approximant
that represents the data in that its smoothness is consistent with that of the function
underlying the data and the uncertainties in the data. (This statement must be qualified
for situations where the continuity class of splines is a consideration, as discussed
above.) These ends may be achieved by seeking an approximant with a reasonable but
not necessarily optimal number of knots.

FIG. 7. The root-mean-square residual as a function of knot location for ℓ₂ spline
approximants with one knot to the data of Figure 6.
The use of knot removal strategies is likely to attract research effort in the future.
One reason for this statement is that the need to work with large initial knot sets is
not as computationally prohibitive with today's powerful personal and other computers.
Another reason is that the approach can be expected to produce better approximants,
i.e., smaller ℓ₂ errors for the same number of knots.
The two sets of crosses in Figure 5 correspond to the values of the root-mean-square
residual as a function of the number of knots for the application of the knot-insertion
strategy followed by the knot-removal strategy for the thermophysical data of Figure 3.
The two sets, where the "progress" takes place from left to right along the "top set",
followed by right to left along the "bottom set", constitute a form of hysteresis. The
behaviour in the two directions is distinctly different. In particular, the figure indicates
that once an acceptable approximation has been obtained by knot insertion, the use
of knot removal can deliver an approximation of comparable quality with many fewer
knots, or alternatively, for the same number of knots, an appreciably better approximation
can be obtained. In this case, with 30 knots, knot removal gives an ℓ₂ error that is one
quarter of that for knot insertion. For an ℓ₂ error of 0.005, 30 knots are required using
knot removal and 43 using knot insertion.
Large data sets, as are now frequently being produced in metrology from computer-controlled
measuring systems, are ideal for the purpose of obtaining a sound initial
approximant in the form of a valid model containing possibly many more knots than
the minimum possible. Their size permits initial approximants to be obtained, even with
large numbers of uniformly spaced knots, that provide valid but highly redundant models
for the data. Because of the manner in which they are gathered, such sets do not contain
"appreciable gaps"; this fact, together with the quantity of data far outweighing the
initial number of knots, goes a long way towards ensuring that the initial approximant
is valid. There is much scope for an appreciable number of knots to
be removed. The initial large number of knots may also have been obtained by the use
of a knot insertion strategy. It is the experience of the authors that knot insertion can
introduce appreciably more knots than given by the optimal choice.
Because the early approximants may be far from optimal, an insertion algorithm
can produce knots that are totally different from those in an optimal approximant. In
contrast, a knot removal algorithm has the possibility of obtaining good knots. (See Section
8.10.) For instance, because of the sequential manner in which knots are inserted, there
may be two or more close or even coincident knots, although a good knot set might not
have this property. It is also possible that such knots, although not part of an optimal
set, are influential in their effect on a knot removal algorithm, with the result that they
appear in the "final" approximant.
The problem of data containing wild points is not addressed satisfactorily by existing
knot placement algorithms. Because such points are responsible for a large contribution
to the ℓ₂ error, more knots would be placed in the neighbourhood of such a point than
would otherwise have been the case. The knot placement strategy can then be influenced
more by the errors in the data than by the properties of the underlying function. Formulations,
and hence algorithms, are needed that have greater resilience to such effects.
In solving the fixed-knot spline approximation problem as part of the free-knot problem,
a knot set differs from a previous knot set only by the addition or removal of a
small number of knots. In linear algebraic terms the "new" matrix A(λ′), say, differs in
only a few rows from the previous matrix A(λ). Considerable gains in computational
efficiency can be obtained by accounting for this fact. This paper has not addressed this
issue, concentrating more on the concepts in the area. There is much scope, however,
for the application of the recognised stable updating and downdating techniques of linear algebra [17]. Their application will not reduce the computational complexity of a
procedure, but could reduce computation times for large problems by an appreciable
factor.
The work described here was supported by the National Measurement System Policy
Unit of the UK Department of Trade and Industry as part of its NMS Software Support
for Metrology programme. The referee provided carefully considered comments that
permitted the paper to be improved.
Bibliography
1. G. T. Anthony and M. G. Cox. The National Physical Laboratory's Data Approximation Subroutine Library. In J. C. Mason and M. G. Cox, editors, Algorithms for
Approximation, pages 669-687, Oxford, 1987. Clarendon Press.
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, and OIML. Guide to the Expression of
Uncertainty in Measurement, 1995. ISBN 92-67-10188-9, Second Edition.
3. H. G. Burchard. On the degree of convergence of piecewise polynomial approximations on optimal meshes. Trans. Amer. Math. Soc., 234:531-559, 1977.
4. M. G. Cox. Curve fitting with piecewise polynomials. J. Inst. Math. Appl., 8:36-52,
1971.
5. M. G. Cox. The numerical evaluation of B-splines. J. Inst. Math. Appl., 10:134-149,
1972.
6. M. G. Cox. A survey of numerical methods for data and function approximation. In
D. A. H. Jacobs, editor, The State of the Art in Numerical Analysis, pages 627-668,
London, 1977. Academic Press.
7. M. G. Cox. The incorporation of boundary conditions in spline approximation problems. In G. A. Watson, editor, Lecture Notes in Mathematics 630: Numerical
Analysis, pages 51-63, Berlin, 1978. Springer-Verlag.
8. M. G. Cox. The least squares solution of overdetermined linear equations having
band or augmented band structure. IMA J. Numer. Anal., 1:3-22, 1981.
9. M. G. Cox. The NPL Data Approximation Subroutine Library: current and planned
facilities. NAG Newsletter, 2/87:3-16, 1987.
10. M. G. Cox. Algorithms for spline curves and surfaces. In L. Piegl, editor, Fundamental Developments of Computer-Aided Geometric Modelling, pages 51-76, London, 1993. Academic Press.
11. M. G. Cox and J. G. Hayes. Curve fitting: a guide and suite of algorithms for
the non-specialist user. Technical Report NAG 26, National Physical Laboratory,
Teddington, UK, 1973.
12. C. de Boor. On calculating with B-splines. J. Approx. Theory, 6:50-62, 1972.
13. C. de Boor. Good approximation by splines with variable knots II. In G. A. Watson,
editor, Numerical Solution of Differential Equations, Lecture Notes in Mathematics
No. 363, pages 12-20. Springer-Verlag, Berlin, 1974.
14. C. de Boor. A Practical Guide to Splines. Springer-Verlag, New York, 1978.
15. C. de Boor and J. R. Rice. Least squares cubic spline approximation II - variable
knots. Technical Report CSD TR 21, Purdue University, 1968.
16. B. Ford, J. Bentley, J. J. du Croz, and S. J. Hague. The NAG Library 'machine'.
Software - Practice and Experience, 9:56-72, 1979.
17. P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Academic Press,
London, 1981.
18. IEEE. IEEE standard for binary floating-point arithmetic. Technical Report
ANSI/IEEE standard 754-1985, IEEE, IEEE Computer Society, New York, USA,
1985.
19. D. Jupp. Non-linear least square spline approximation. Technical report, Flinders
University, Australia, 1971.
20. T. Lyche and K. M0rken. A discrete approach to knot removal and degree reduction
for splines. In J. C. Mason and M. G. Cox, editors, Algorithms for Approximation,
pages 67-82, Oxford, 1987. Clarendon Press.
21. I. J. Schoenberg and Anne Whitney. On Pólya frequency functions III. Trans. Amer.
Math. Soc., 74:246-259, 1953.
On the approximation power of local
least squares polynomials
Oleg Davydov
Universität Giessen, Mathematisches Institut, D-35392 Giessen, Germany.
oleg.davydov@math.uni-giessen.de
Abstract
We discuss the relationship between the norm of the local discrete least squares polynomial
approximation operator, the minimal singular value σ_min(P_Ξ) of the matrix P_Ξ
of the evaluations of the basis polynomials, and the norming constant of the set of data
points Ξ with respect to the space of polynomials. Since these three quantities are equivalent
up to bounded constants, and since σ_min(P_Ξ) can be efficiently computed, it is
feasible to use σ_min(P_Ξ) as a tool for distinguishing good local point constellations, which
is useful for scattered data fitting. In addition, we give a simple new proof of a bound
by Reimer for the norm of the interpolation operators on the sphere and extend it to
discrete least squares operators.
1 Introduction
Let Ω be a bounded domain in ℝᵈ, d ≥ 1, and let Ξ = {ξ₁, …, ξ_m} be a set of scattered
points in Ω. Given the values f|_Ξ = (f(ξ₁), …, f(ξ_m))ᵀ of an otherwise unknown function
f : Ω → ℝ, we want to reconstruct f from these data. The least squares method consists
in choosing some linearly independent functions p₁, …, p_n on Ω, n ≤ m, and computing
the coefficients a₁, …, a_n ∈ ℝ that minimise the ℓ₂ norm of the residual on Ξ,

    ||f|_Ξ − p|_Ξ||₂ = ( Σ_{i=1}^m |f(ξ_i) − p(ξ_i)|² )^{1/2},

with p = a₁p₁ + ⋯ + a_np_n ∈ 𝒫 := span{p₁, …, p_n}. Let 𝒫|_Ξ := span{p₁|_Ξ, …, p_n|_Ξ}.
If dim 𝒫|_Ξ = n, then the least squares solution is unique, and we denote it by L_{𝒫,Ξ}f.
Note that the minimum norm solution available in the case of a rank deficient problem
(dim 𝒫|_Ξ < n) seems less useful since in general it does not reproduce the elements of 𝒫
exactly.
The computation of the least squares approximation L_{𝒫,Ξ}f of f is expensive if m and
n are large. To obtain a scattered data fitting algorithm with linear complexity with
respect to the size of the data, a two-stage method [8] can be employed which consists in
1) covering the original domain Ω with a number of subdomains Ω_k, each containing
only a small subset Ξ_k = Ξ ∩ Ω_k of Ξ, and computing local approximations to the data in
Ξ_k, and 2) using the information obtained from these local approximations to build the
final approximation of the (possibly huge) original data set. The least squares method
can be employed in the local approximation stage, especially to deal with "real world"
data usually contaminated with errors or just containing undesirable "high frequency"
components.
If 𝒫 is chosen to be the space Π_q^d of algebraic polynomials in d variables of a suitable
degree q, then n = (q+d)!/(q! d!). To achieve high approximation order, it is desirable to choose
q such that n is only a little smaller than m. However, this is not always possible due to
the rank deficiency or ill-conditioning of the least squares problem, which is especially
difficult to control if ξ₁, …, ξ_m ∈ Ξ_k are unevenly distributed in Ω_k. This difficulty can
in principle be overcome by constructing, for each Ξ_k, a suitable subspace of higher
degree polynomials (least interpolation space [2]). If, however, the polynomial degree
is not allowed to exceed a fixed small value, then a common practical approach is to
choose larger sets Ξ_k ⊂ Ξ, with m substantially greater than n; see e.g. [4], where it is
suggested to use for local least squares approximation m = 11 points if 𝒫 = Π₂² with
n = 6 and m = 15 points if 𝒫 = Π₃² with n = 10. However, even these higher m provide
no guarantee that the matrix

    P_{Ξ_k} := [ p_j(ξ_i) : i = 1, …, m, j = 1, …, n ]

of the local least squares problem will always be well-conditioned. Moreover, for some
data, this method may lead to the use of inappropriately distant points for the local
approximation.
The purpose of this paper is to draw attention to the fact that the conditioning of
the matrix P_{Ξ_k} is not only an issue of numerical stability of the computation of least
squares. Indeed, the reciprocal of the minimal singular value σ_min(P_Ξ) of P_Ξ provides
a bound for the norm of the least squares operator L_{𝒫,Ξ} if both m and n are small.
Therefore, the approximation power of local least squares depends on σ_min(P_Ξ) and
the best approximation from 𝒫. Since σ_min(P_Ξ) can be efficiently computed for a small
matrix P_Ξ by well known numerical algorithms, it is feasible to use it as a tool to
decide whether a particular portion of data is suitable for building a local least squares
approximation from 𝒫 with reasonable approximation power. If σ_min(P_Ξ) is too small,
then either Ξ or 𝒫 should be modified, e.g. by adding more points to Ξ or using an
appropriate subspace of 𝒫. A two-stage algorithm for fitting large irregularly distributed
scattered data sets employing the conditioning of the local observation matrices P_{Ξ_k} is
studied in [3, 5].
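The test just described is straightforward to realise numerically. The sketch below is illustrative only: it fixes d = 2 and uses the monomial basis as a stand-in for the unspecified basis p₁, …, p_n of Π_q²:

```python
import numpy as np

def sigma_min(points, q):
    """Minimal singular value of the collocation matrix P_Xi whose columns
    are the bivariate monomials of total degree <= q at the given points."""
    pts = np.asarray(points, dtype=float)
    cols = [pts[:, 0] ** i * pts[:, 1] ** (j - i)
            for j in range(q + 1) for i in range(j + 1)]
    return np.linalg.svd(np.column_stack(cols), compute_uv=False)[-1]
```

Six points on a straight line are degenerate for Π₂², since a polynomial of degree at most 2 vanishes on the line, so σ_min is numerically zero and the constellation should be rejected; well-spread points give a clearly positive value.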
The paper is organized as follows. In Section 2 we discuss the relationship between
the norm of the discrete least squares approximation operator, the minimal singular
value σ_min(P_Ξ), and the norming constant ν(𝒫, Ξ). As a by-product, we obtain a new
proof of a known bound for the norm of the interpolation operators on the sphere [7],
and extend it to the discrete least squares operators. Section 3 illustrates the above
concepts in the univariate case, when they are also related to the separation distance of
Ξ, while Section 4 is devoted to a discussion of least squares multivariate polynomial
approximation.
2 Bounds for ||L_{𝒫,Ξ}|| and approximation error
Let p₁, …, p_n be linearly independent continuous functions on Ω ⊂ ℝᵈ spanning a linear
space 𝒫. Since all norms on a finite dimensional linear space are equivalent, there are
positive constants K₁, K₂ such that

    K₁||a||₂ ≤ ||a₁p₁ + ⋯ + a_np_n||_{C(Ω)} ≤ K₂||a||₂   (2.1)

for any coefficient vector a = (a₁, …, a_n)ᵀ ∈ ℝⁿ.
Given Ξ = {ξ₁, …, ξ_m} ⊂ Ω, we consider the matrix P_Ξ ∈ ℝ^{m×n} as defined in the
introduction. Obviously, rank P_Ξ = dim 𝒫|_Ξ. If P_Ξ has full rank, then dim 𝒫|_Ξ = n,
and the least squares approximation L_{𝒫,Ξ}f is uniquely determined, giving rise to the
operator L_{𝒫,Ξ} : C(Ω) → 𝒫 ⊂ C(Ω).

It is easy to see that L_{𝒫,Ξ} exactly reproduces the elements of 𝒫, i.e.,

    L_{𝒫,Ξ}p = p,   all p ∈ 𝒫.   (2.2)

Therefore, a standard argument shows that

    ||f − L_{𝒫,Ξ}f||_{C(Ω)} ≤ (1 + ||L_{𝒫,Ξ}||) E(f, 𝒫)_{C(Ω)},   (2.3)

where E(f, 𝒫)_{C(Ω)} denotes the error of the best approximation of f from 𝒫 in the
Chebyshev norm,

    E(f, 𝒫)_{C(Ω)} := inf_{p ∈ 𝒫} ||f − p||_{C(Ω)}.

Thus, an estimate for ||L_{𝒫,Ξ}|| immediately gives an upper bound for ||f − L_{𝒫,Ξ}f||_{C(Ω)}.
The norming constant ν(𝒫, Ξ) of Ξ with respect to 𝒫 [6] can be defined by

    ν(𝒫, Ξ) = min_{p ∈ 𝒫} ||p|_Ξ||_∞ / ||p||_{C(Ω)}.   (2.4)
Given any matrix A, we denote by σ_min(A) the minimal singular value

    σ_min(A) = min_{||x||₂ = 1} ||Ax||₂.

Recall that if A has full rank, then σ_min^{−1}(A) = ||A⁺||₂, where A⁺ is the pseudoinverse
of A, see e.g. [1].
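This identity is easy to verify numerically; a small illustrative check (not from the paper):

```python
import numpy as np

# For a full-rank A, the smallest singular value satisfies
# 1/sigma_min(A) = ||A^+||_2, the spectral norm of the pseudoinverse.
A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
smin = np.linalg.svd(A, compute_uv=False)[-1]
assert abs(1.0 / smin - np.linalg.norm(np.linalg.pinv(A), 2)) < 1e-12
```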
Theorem 2.1  If rank P_Ξ = n, then

    K₁/σ_min(P_Ξ) ≤ ||L_{𝒫,Ξ}|| ≤ K₂ √m / σ_min(P_Ξ),   (2.5)

    1/ν(𝒫, Ξ) ≤ ||L_{𝒫,Ξ}|| ≤ √m / ν(𝒫, Ξ),   (2.6)

    K₁ ν(𝒫, Ξ) ≤ σ_min(P_Ξ) ≤ K₂ √m ν(𝒫, Ξ).   (2.7)
Proof: We first prove (2.5). Let Lp^zf = Yll=i C'jPj- It follows by a well-known result
in numerical linear algebra that the vector a = (oi,... ,a„)^ can be computed as the
product of the pseudoinverse Pj" of PE with the vector /|=. Therefore,
IHb =
||PEV|H||2
<
||PE+I|2||/|H||2
=CT-J„(PE)||/iE||2.
Approximation power of local least squares
349
Since $\|L_{\mathcal{P},\Xi}f\|_{C(\Omega)} \le K_2\|a\|_2$ and $\|f|_\Xi\|_2 \le \sqrt{m}\,\|f|_\Xi\|_\infty \le \sqrt{m}\,\|f\|_{C(\Omega)}$, the upper bound in (2.5) follows. To prove the lower bound in (2.5), we choose a function $f \in C(\Omega)$ such that
$$ \|P_\Xi^+ f|_\Xi\|_2 = \|P_\Xi^+\|_2\,\|f|_\Xi\|_2, \qquad \|f|_\Xi\|_\infty = \|f\|_{C(\Omega)}, $$
which is obviously possible. Then by (2.1) we have
$$ \|L_{\mathcal{P},\Xi}f\|_{C(\Omega)} \ge K_1\|P_\Xi^+ f|_\Xi\|_2 = K_1\,\sigma_{\min}^{-1}(P_\Xi)\,\|f|_\Xi\|_2, $$
which implies the desired lower bound since $\|f|_\Xi\|_2 \ge \|f|_\Xi\|_\infty = \|f\|_{C(\Omega)}$.
Since $\|L_{\mathcal{P},\Xi}f\|_{C(\Omega)} \le \nu^{-1}(\mathcal{P},\Xi)\,\|(L_{\mathcal{P},\Xi}f)|_\Xi\|_\infty$, the upper bound in (2.6) follows by
$$ \|(L_{\mathcal{P},\Xi}f)|_\Xi\|_\infty \le \|(L_{\mathcal{P},\Xi}f)|_\Xi\|_2 \le \|f|_\Xi\|_2 \le \sqrt{m}\,\|f\|_{C(\Omega)}. $$
To prove the lower bound, we denote by $p$ an element of $\mathcal{P}$ for which the minimum in (2.4) is attained and choose a function $f \in C(\Omega)$ such that $f|_\Xi = p|_\Xi$ and $\|f\|_{C(\Omega)} = \|f|_\Xi\|_\infty$. Then by (2.2),
$$ \|L_{\mathcal{P},\Xi}f\|_{C(\Omega)} = \|p\|_{C(\Omega)} = \nu^{-1}(\mathcal{P},\Xi)\,\|p|_\Xi\|_\infty = \nu^{-1}(\mathcal{P},\Xi)\,\|f\|_{C(\Omega)}, $$
which implies $\|L_{\mathcal{P},\Xi}\| \ge \nu^{-1}(\mathcal{P},\Xi)$.
We finally establish (2.7). For any $p \in \mathcal{P}$, let $p = \sum_{j=1}^n a_j p_j$ and $a = (a_1,\dots,a_n)^T$. Then $p|_\Xi = P_\Xi a$ and hence
$$ \|p|_\Xi\|_\infty \le \|P_\Xi a\|_2 \le \sqrt{m}\,\|p|_\Xi\|_\infty. $$
Since $\sigma_{\min}(P_\Xi) = \min_{a \neq 0} \|P_\Xi a\|_2/\|a\|_2$, (2.7) follows by (2.1). $\Box$
In view of (2.3), the upper bound in (2.5) implies
$$ \|f - L_{\mathcal{P},\Xi}f\|_{C(\Omega)} \le \big(1 + K_2\sqrt{m}/\sigma_{\min}(P_\Xi)\big)\,E(f,\mathcal{P})_{C(\Omega)}, \qquad (2.8) $$
which shows that the approximation power of discrete least squares is reduced proportionally if $\sigma_{\min}(P_\Xi)$ (or $\nu(\mathcal{P},\Xi)$) is small. We will discuss some practical consequences of this fact in the next two sections.
Although $\nu(\mathcal{P},\Xi)$ gives tighter bounds for $\|L_{\mathcal{P},\Xi}\|$, $\sigma_{\min}(P_\Xi)$ has a clear practical advantage: it is easily computable, e.g. by the singular value decomposition of the small "local" matrix $P_\Xi$. On the other hand, the norming constants were used in [6, 9] to derive estimates for the approximation error of radial basis function interpolation and moving least squares, respectively.
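As an illustration of this computability, the following sketch (our own example: a monomial basis on $[-1,1]$, hypothetical point sets, and $K_2$ taken to be 1) estimates $\sigma_{\min}(P_\Xi)$ by the SVD and evaluates the growth factor appearing in (2.8):

```python
import numpy as np

def collocation_matrix(points, degree):
    """Columns are the monomials 1, t, ..., t^degree evaluated at the points."""
    return np.vander(points, degree + 1, increasing=True)

def sigma_min(P):
    """Minimal singular value, computed from the SVD of the local matrix."""
    return np.linalg.svd(P, compute_uv=False)[-1]

# Six well-spread points versus six points forming two tight clusters in [-1, 1].
spread = np.linspace(-1.0, 1.0, 6)
clustered = np.array([-1.0, -0.999, -0.998, 0.998, 0.999, 1.0])

s_spread = sigma_min(collocation_matrix(spread, 2))
s_clustered = sigma_min(collocation_matrix(clustered, 2))

# Growth factor 1 + K2*sqrt(m)/sigma_min from (2.8), with K2 = 1 for illustration.
m = 6
print(1.0 + np.sqrt(m) / s_spread)     # modest
print(1.0 + np.sqrt(m) / s_clustered)  # huge: the clusters act like two points
```

The clustered configuration drives $\sigma_{\min}$ toward zero even though the points are formally distinct, which is exactly the "hidden redundancy" situation discussed later.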
Remark 2.2 If $p_1,\dots,p_n$ is an orthonormal basis for $\mathcal{P}$, then $\|a\|_2 = \|p\|_{L_2(\Omega)}$, $p = \sum_{j=1}^n a_j p_j$, and the constants $K_1, K_2$ in (2.1) are closely related to the Nikolskii constants of the space $\mathcal{P}$, namely,
$$ K_1 = N_{2,\infty}^{-1}(\mathcal{P}), \qquad K_2 = N_{\infty,2}(\mathcal{P}), $$
where
$$ N_{q_1,q_2}(\mathcal{P}) := \max_{p\in\mathcal{P}} \|p\|_{L_{q_1}(\Omega)}/\|p\|_{L_{q_2}(\Omega)}, \qquad 1 \le q_1, q_2 \le \infty. $$
350
Oleg Davydov
In particular, if $\Omega = S^{d-1}$, the unit sphere in $\mathbb{R}^d$, and $\{p_1,\dots,p_n\}$ is the set of spherical harmonics forming an orthonormal basis for the space $\mathcal{P} = \mathcal{H}_q^d$ of spherical polynomials of degree $q$ in $d$ variables, then it is not difficult to prove that $K_2 = N_{\infty,2}(\mathcal{H}_q^d) = \sqrt{n/|S^{d-1}|}$, where $|S^{d-1}|$ denotes the surface area of $S^{d-1}$. Therefore, for any set $\Xi \subset S^{d-1}$ with $\#\Xi = m \ge n$, we have by (2.5),
$$ \|L_{\mathcal{H}_q^d,\Xi}\| \le \sqrt{nm/|S^{d-1}|}\,/\,\sigma_{\min}(P_\Xi), \qquad (2.9) $$
which recovers in the case of interpolation ($m = n$) an error bound by Reimer [7] originally proved by using Lagrangian square sums (see also [10]).
3 Univariate polynomials
Let $\Omega$ be an interval $[-h,h]$ on the real line $\mathbb{R}$, and let $p_1,\dots,p_n$ be a basis of polynomials of degree at most $n-1$. Then $\mathcal{P}$ is the restriction to $[-h,h]$ of the space $\Pi^1_{n-1}$ of all univariate polynomials of degree at most $n-1$. By the well-known interpolation properties of univariate polynomials, $\operatorname{rank} P_\Xi = n$ for any $\Xi = \{\xi_1,\dots,\xi_m\} \subset [-h,h]$, $m \ge n$, with distinct $\xi_i$'s.
For any $\Xi' = \{\xi_{i_1},\dots,\xi_{i_n}\} \subset \Xi$, let $q_{\Xi'}$ denote the separation distance,
$$ q_{\Xi'} := \min_{j \neq k} |\xi_{i_j} - \xi_{i_k}|. $$
The Lebesgue constant $\|L_{\mathcal{P},\Xi'}\|$ of the corresponding interpolation scheme can be easily estimated by bounding the Lagrange fundamental polynomials, which gives
$$ \|L_{\mathcal{P},\Xi'}\| \le \frac{1}{(n-1)!}\left(\frac{4h}{q_{\Xi'}}\right)^{n-1}. $$
Since $\Xi'$ may be any subset of $\Xi$ of cardinality $n$ and since $\nu(\mathcal{P},\Xi) \ge \|L_{\mathcal{P},\Xi'}\|^{-1}$, we get
$$ \nu^{-1}(\mathcal{P},\Xi) \le \frac{1}{(n-1)!}\left(\frac{4h}{q_{\Xi,n}}\right)^{n-1}, $$
where
$$ q_{\Xi,n} := \max_{\Xi' \subset \Xi,\ \#\Xi' = n} q_{\Xi'}. $$
Hence, by (2.3) and (2.6),
$$ \|f - L_{\Pi^1_{n-1},\Xi}f\|_{C[-h,h]} \le \left(1 + \frac{\sqrt{m}}{(n-1)!}\left(\frac{4h}{q_{\Xi,n}}\right)^{n-1}\right) E(f,\Pi^1_{n-1})_{C[-h,h]}. \qquad (3.1) $$
This last estimate shows that the univariate least squares polynomials have the approximation power of the best local polynomial approximation as $h \to 0$ provided $h/q_{\Xi,n}$ remains bounded. However, if the scattered points $\xi_1,\dots,\xi_m \in [-h,h]$ are clustered together in at most $n-1$ very tight groups, then $q_{\Xi,n}$ may be arbitrarily small, thus forcing the right hand side of (3.1) to blow up. To see what happens to $\|f - L_{\Pi^1_{n-1},\Xi}f\|_{C[-h,h]}$ in these circumstances, we consider the following example.
Let $h = 1$, $n = 2$, $f(t) = t^2 - 1/2$, and $\Xi = \{-\xi, 0, \xi\}$ for some $0 < \xi \le 1$. It is easy to see that $L_{\Pi^1_1,\Xi}f = -1/2 + 2\xi^2/3$. Since $E(f,\Pi^1_1)_{C[-1,1]} = 1/2$, we have
$$ \|f - L_{\Pi^1_1,\Xi}f\|_{C[-1,1]} = 1/2 + |1/2 - 2\xi^2/3| \le 2\,E(f,\Pi^1_1)_{C[-1,1]} $$
even though, by a simple calculation,
$$ \|L_{\Pi^1_1,\Xi}\| = 1/3 + 1/\xi, \qquad \sqrt{2}/\sigma_{\min}(P_\Xi) = 1/\nu(\mathcal{P},\Xi) = 1/\xi \to \infty \quad \text{as } \xi \to 0. $$
This may contribute to the opinion that $\|L_{\Pi^1_1,\Xi}\|$, $\sigma_{\min}(P_\Xi)$, $\nu(\mathcal{P},\Xi)$ and $q_{\Xi,n}$ are not the right quantities to describe the behaviour of the approximation. Indeed, as the three points $-\xi, 0, \xi$ coalesce, $L_{\Pi^1_1,\Xi}f$ converges to a Hermite interpolation polynomial provided the entries of $P_\Xi$ as well as the values of $f|_\Xi$ are exact. However, if we simulate "real world" data by adding to $f(-\xi), f(0), f(\xi)$ normally distributed errors with standard deviation $10^{-4}$, then the picture changes substantially. Table 1 shows that $\|f - L_{\Pi^1_1,\Xi}f\|_{C[-1,1]}$ does blow up in this case. For comparison we also include in the table the error $\|f - L_{\Pi^1_0,\Xi}f\|_{C[-1,1]}$ for the same contaminated data.
TAB. 1  Average ($d_{\mathrm{mean}}$) and maximum ($d_{\max}$) of $\|f - L_{\Pi^1_1,\Xi}f\|_{C[-1,1]}$, as well as the maximum of $\|f - L_{\Pi^1_0,\Xi}f\|_{C[-1,1]}$ ($\tilde d_{\max}$), in 1000 tests with contaminated data

   xi       d_mean    d_max     d~_max
   10^-3    1.06      1.24      1.00018
   10^-4    1.56      3.39      1.00018
   10^-5    6.63      24.9      1.00018
   10^-6    57.3      240       1.00018
   10^-7    564       2390      1.00018
   10^-8    5630      23900     1.00018
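The blow-up shown in Table 1 can be reproduced in a few lines. The following Monte Carlo sketch is our reading of the experiment (the random seed and the grid used to estimate the Chebyshev norm are our own choices), for $\xi = 10^{-5}$:

```python
import numpy as np

# f(t) = t^2 - 1/2 is sampled at -xi, 0, xi, contaminated with Gaussian noise of
# standard deviation 1e-4, and fitted by least squares polynomials.
rng = np.random.default_rng(0)
f = lambda t: t ** 2 - 0.5
xi, noise_std = 1e-5, 1e-4
pts = np.array([-xi, 0.0, xi])
tt = np.linspace(-1.0, 1.0, 2001)  # grid estimating the C[-1,1] norm

errs_deg1, errs_deg0 = [], []
for _ in range(1000):
    vals = f(pts) + noise_std * rng.standard_normal(3)
    a = vals.mean()                      # constant term (points are symmetric)
    b = (vals[2] - vals[0]) / (2 * xi)   # least squares slope: noise / (2 xi)
    errs_deg1.append(np.max(np.abs(f(tt) - (a + b * tt))))
    errs_deg0.append(np.max(np.abs(f(tt) - a)))

print(np.mean(errs_deg1), np.max(errs_deg1))  # blows up like noise_std / xi
print(np.max(errs_deg0))                      # stays close to 1
```

The degree-1 error grows like $10^{-4}/\xi$, matching the corresponding row of the table, while the degree-0 fit is insensitive to the clustering.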
Thus, if $q_{\Xi,n}$ is too small, we cannot practically achieve with least squares the approximation order of $E(f,\Pi^1_{n-1})_{C[-h,h]}$, simply because the points lying too close to each other carry redundant information and we have at most $n-1$ clusters of such points. Therefore, we should adjust the polynomial degree to the given data, paying attention to the trade-off between the higher approximation power of higher degree polynomials and the "pollution" caused by the factor $q_{\Xi,n}^{-(n-1)}$ that increases with $n$. In practice one may choose the maximal $n$ such that $h/q_{\Xi,n}$ is smaller than a prescribed tolerance value $0 < E < \infty$.
4 Multivariate polynomials
The situation becomes substantially more complicated when we turn to multivariate polynomials. Let $\Omega$ be a bounded domain in $\mathbb{R}^d$ and let $\{p_1,\dots,p_n\}$, $n = \binom{q+d}{d}$, be a basis of the space $\mathcal{P} = \Pi_q^d$ of polynomials in $d$ variables of total degree $q$ satisfying (2.1) on $\Omega$. (For example, we may consider a properly scaled standard power basis with the centre at a point in $\Omega$, or the Bernstein-Bezier basis with respect to some simplex overlapping $\Omega$ or a significant part of it.) Let, furthermore, $\Xi$ be an arbitrary finite set of points in $\Omega$ such that $m = \#\Xi \ge n$.
The first problem we face in the case $d \ge 2$ is that the matrix $P_\Xi$ may be rank deficient. It is clear, however, that there is no practical difference between this situation and the one when $P_\Xi$ has full rank but is extremely ill-conditioned, i.e., $\sigma_{\min}(P_\Xi)$ is very small. Moreover, (2.8) shows that even a moderately small $\sigma_{\min}(P_\Xi)$ may significantly reduce the approximation power of $L_{\mathcal{P},\Xi}$. Clearly, the same can also happen in the univariate case if $q_{\Xi,n}$ is too small. The real difficulty of the multivariate case seems to be that simple characteristics of $\Xi$, like the separation distance $q_{\Xi,n}$, do not give much information about the norm of $L_{\mathcal{P},\Xi}$. For example, six equidistant points on the unit circle in $\mathbb{R}^2$ are well separated and look reasonably distributed. However, they are not good for least squares approximation from the space $\Pi_2^2$, since the matrix $P_\Xi$ is rank deficient. Suitably perturbed, these points will give rise to a least squares operator $L_{\Pi_2^2,\Xi}$ with a very large norm. More generally, the norm of $L_{\Pi_q^d,\Xi}$ will be large if the points in $\Xi \subset \mathbb{R}^d$ lie "too close" to an algebraic hypersurface of order $q$.
If the data are comparatively dense in $\Omega$, namely if the fill distance
$$ h_{\Xi,\Omega} := \sup_{x\in\Omega}\ \min_{\xi\in\Xi} |x - \xi| $$
does not exceed some small positive constant depending on $\Omega$ and the polynomial degree, then the estimates of the norming constant $\nu(\Pi_q^d|_\Omega,\Xi)$ given in [9] provide a bound for $\|L_{\Pi_q^d|_\Omega,\Xi}\|$ in view of (2.6). For example, if $\Omega$ is a ball of radius $r$, then $\nu(\Pi_q^d|_\Omega,\Xi) \ge 1/2$ if $h_{\Xi,\Omega} \le c_0\,r/q^2$ for a suitable absolute constant $c_0$.
On the other hand, without any density assumptions we can always rely on (2.8), where $\sigma_{\min}(P_\Xi)$ can be efficiently computed by well known algorithms of numerical linear algebra. In some sense, a small $\sigma_{\min}(P_\Xi)$ indicates that the local data has "hidden redundancies" (e.g. too many points lying very close to the same straight line or the same ellipse) that prevent it from carrying enough information for a "full power" approximation of the underlying function from $\Pi_q^d$. Similarly to the univariate case, but using $\sigma_{\min}(P_\Xi)$ instead of $q_{\Xi,n}$, we can adaptively choose the polynomial degree according to the following algorithm, which has proven to be useful for scattered data fitting [3, 5].
Let $\Omega \subset \mathbb{R}^d$, $\Xi \subset \Omega$, $\#\Xi = m$. Denote by $P_\Xi^q$ the matrix of the evaluations of appropriate basis functions for $\Pi_q^d$, $q \ge 0$, at the points $\xi \in \Xi$.
Algorithm 4.1 Starting with some $q = q_0 \ge 0$ such that $\binom{q_0+d}{d} \le m$, compute $\sigma_{\min}(P_\Xi^{q_0})$. If $1/\sigma_{\min}(P_\Xi^{q_0})$ is smaller than a prescribed tolerance $E < \infty$, then compute the least squares $\Pi_{q_0}^d$-approximation to the data in $\Xi$ and accept it as a reliable approximation on $\Omega$. Otherwise, repeat the same with $q = q_0 - 1$ and successively reduce the degree $q$ to $q_0 - 2, \dots, 0$, while $1/\sigma_{\min}(P_\Xi^q) > E$. For $q = 0$ no comparison of $1/\sigma_{\min}(P_\Xi^0)$ with $E$ is needed since $\|L_{\Pi_0^d|_\Omega,\Xi}\|$ is bounded for any $\Omega$ and $\Xi$.
Note that, optionally, the condition number $\|P_\Xi^q\|_2/\sigma_{\min}(P_\Xi^q)$ can be used in the above algorithm instead of $1/\sigma_{\min}(P_\Xi^q)$, as it was formulated in [5].
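A minimal sketch of Algorithm 4.1 for $d = 2$, assuming a plain (unscaled) monomial basis and hypothetical point sets, might read:

```python
import numpy as np
from math import comb

def poly_basis_matrix(points, q):
    """Bivariate monomials x^i y^(d-i), total degree d <= q (a stand-in for the
    properly scaled bases discussed above)."""
    x, y = points[:, 0], points[:, 1]
    cols = [x ** i * y ** (d - i) for d in range(q + 1) for i in range(d + 1)]
    return np.column_stack(cols)

def choose_degree(points, q0, E):
    """Sketch of Algorithm 4.1: accept the largest degree q <= q0 for which
    1/sigma_min of the local collocation matrix stays below the tolerance E."""
    m = len(points)
    for q in range(q0, -1, -1):
        if comb(q + 2, 2) > m:          # more basis functions than points: skip
            continue
        P = poly_basis_matrix(points, q)
        smin = np.linalg.svd(P, compute_uv=False)[-1]
        if q == 0 or smin > 1.0 / E:    # i.e. 1/sigma_min < E
            return q
    return 0

rng = np.random.default_rng(1)
scattered = rng.random((12, 2))
collinear = np.column_stack([np.linspace(0.0, 1.0, 12), np.zeros(12)])
print(choose_degree(scattered, 2, 1e6))  # 2: the data supports a full quadratic
print(choose_degree(collinear, 2, 1e6))  # 0: the points hide on a line
```

The collinear configuration makes every degree $q \ge 1$ collocation matrix rank deficient, so the sketch falls back to the always-safe constant approximation, exactly as the algorithm prescribes.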
Bibliography
1. A. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.
2. C. de Boor and A. Ron, On multivariate polynomial interpolation, Constr. Approx. 6 (1990), 287-302.
3. O. Davydov and F. Zeilfelder, Scattered data fitting by direct extension of local polynomials with bivariate splines, 2002, preprint.
4. T. A. Foley, Scattered data interpolation and approximation with error bounds, Comput. Aided Geom. Design 3 (1986), 163-177.
5. J. Haber, F. Zeilfelder, O. Davydov and H.-P. Seidel, Smooth approximation and rendering of large scattered data sets, in Proceedings of IEEE Visualization 2001, Th. Ertl, K. Joy and A. Varshney (eds), IEEE, 2001, 341-347, 571.
6. K. Jetter, J. Stöckler and J. Ward, Error estimates for scattered data interpolation on spheres, Math. Comp. 68 (1999), 733-747.
7. M. Reimer, Interpolation on the sphere and bounds for the Lagrangian square sums, Results in Mathematics 11 (1987), 144-164.
8. L. L. Schumaker, Two-stage methods for fitting surfaces to scattered data, in Quantitative Approximation, R. Schaback and K. Scherer (eds), Lecture Notes 556, Springer, Berlin, 1976, 378-389.
9. H. Wendland, Local polynomial reproduction and moving least squares approximation, IMA J. Numer. Anal. 21 (2001), 285-300.
10. R. S. Womersley and I. H. Sloan, How good can polynomial interpolation on the sphere be?, Advances in Comp. Math. 14 (2001), 195-226.
A wavelet-based preconditioning method for dense
matrices with block structure
Judith M. Ford* and Ke Chen
Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK.
Judyford@liv.ac.uk, k.chen@liv.ac.uk
Abstract
In recent years application of a discrete wavelet transform (DWT) has become an established tool for the design of preconditioners for smooth, dense matrices, such as those
that arise in the solution of certain integral equations. In this paper we consider the
higher dimensional case, where the matrix A is not itself smooth, but has a smooth block
structure. To precondition such matrices, we use repeated application of a level 1 blockwise DWT to exploit the fact that corresponding entries in adjacent blocks are close in
value. We illustrate the effectiveness of our methods by means of numerical examples.
1 Introduction
We have previously ([9]) considered wavelet-based preconditioning methods for dense matrices having the property that the entries vary smoothly (that is to say, adjacent entries are close in value) apart from known areas of singularity, for example a non-smooth diagonal band. The main idea is to use wavelet compression (see, for example, [14]) to convert "smoothness" in the original matrix into "smallness" in the transformed matrix, and then to approximate the transformed matrix by dropping small entries.
Smooth matrices arise in a range of applications (see, for example, [6, 8, 10]) involving an essentially 1-dimensional discretization process. In higher-dimensional cases the corresponding matrices have a block structure: each block is smooth and corresponding entries in different blocks vary smoothly; but discontinuities at the edges of the blocks mean that standard application of the DWT does not give good compression. In this paper we extend the ideas of [9] to enable preconditioners to be designed for such matrices. Throughout we use Daubechies wavelets, which are orthogonal and have compact support.
2 DWT-based preconditioners
We are interested in fast solution of linear systems
$$ Ax = b, \qquad x, b \in \mathbb{C}^n, \quad A \in \mathbb{C}^{n\times n}, \qquad (2.1) $$
where $A$ is a large, dense matrix. Krylov subspace iterative methods, such as GMRES (described in [13]), can be used to solve (2.1), but in most cases preconditioning is
* The first author was supported by the Engineering and Physical Sciences Research Council, UK.
required in order to obtain good convergence. One method of preconditioning is to seek a matrix $M \approx A$ such that $M^{-1}v$ can be calculated cheaply for any vector $v$. For smooth dense $A$ the task is usually made easier by transforming (2.1) into a wavelet basis (see e.g. [4, 5, 6, 10, 11]). When a DWT is applied to such an $A$, the resulting matrix $\tilde A$ has many small elements. A sparse $\bar A \approx \tilde A$ can be obtained by setting the small elements to zero. This is the main idea underlying most wavelet-based preconditioners.
2.1 Preconditioners for 1-D problems
Typically $A$ is smooth apart from a narrow diagonal band. When a level $k$ standard DWT is applied, $\tilde A$ has a 'finger' pattern of large entries (caused by the non-smooth diagonal feature) and an $n/2^k \times n/2^k$ block of large entries at the top-left corner. Here $n$ should be a power of 2. We can form a preconditioner $M \approx \tilde A$ by setting to zero entries that fall below some chosen threshold, but, because of the finger pattern, a large amount of fill-in occurs under LU factorization. To avoid this problem $M$ can be obtained by setting to zero the entries in $\tilde A$ that fall outside of a diagonal band. We describe this approach as a "band cut".
The finger pattern can be avoided by using DWTPer (DWT with permutations, first
proposed in [6], see also [7]), which centres the fingers to form a sparse diagonal band
whose width can be predicted accurately. M can then be formed by applying a band cut
to A and (optionally) imposing a threshold.
An alternative way of avoiding the creation of a finger pattern matrix is to use the
Non-Standard-forms (NS-forms) of Beylkin, Coifman and Rokhlin (see [3]) to represent
A in terms of the blocks of a larger matrix. In [9] we presented a new way of using the
NS-form submatrices to precondition A based on the Schur complement and recursive
application of a flexible GMRES iteration. We compared four alternative DWT-based
preconditioning methods:
P1: standard DWT preconditioner with band cut ([5]),
P2: DWTPer preconditioner with band cut ([6, 10]),
P3: NS-form preconditioner with threshold ([3, 11]),
P4: recursive Schur complement preconditioner ([9]),
and found that, for smooth matrices with a diagonal singularity, P4 gave consistently good performance, P1 performed well for moderate singularities, and P2 was best when the diagonal singularity was very pronounced. When we came to consider 2-D problems, the robustness of P4 encouraged us to consider ways of extending it to higher dimensions.
2.2 Extension to matrices with block structure
In the 2-dimensional case we are concerned with matrices that have a smooth block
structure. We can compress dense block matrices of this type using two different types
of DWT. The block DWT has a transform matrix of the form
$$ W_b^{(m,n)} = \begin{pmatrix} W^{(n)} & 0 & \cdots & 0 \\ 0 & W^{(n)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & W^{(n)} \end{pmatrix} = I_m \otimes W^{(n)}, \qquad (2.2) $$
where $W^{(n)}$ is a standard $n \times n$ DWT matrix and $0$ is the $n \times n$ zero matrix. It exploits smoothness within blocks. The Big Block DWT (BBDWT) exploits smoothness between blocks. It has a transform matrix of the form
$$ W_B^{(m,n)} = \begin{pmatrix}
h_0 I & h_1 I & \cdots & h_{D-1} I & 0 & \cdots & 0 \\
0 & 0 & h_0 I & h_1 I & \cdots & h_{D-1} I & 0 \\
\vdots & & & \ddots & & & \vdots \\
h_2 I & \cdots & h_{D-1} I & 0 & \cdots & h_0 I & h_1 I \\
g_0 I & g_1 I & \cdots & g_{D-1} I & 0 & \cdots & 0 \\
0 & 0 & g_0 I & g_1 I & \cdots & g_{D-1} I & 0 \\
\vdots & & & \ddots & & & \vdots \\
g_2 I & \cdots & g_{D-1} I & 0 & \cdots & g_0 I & g_1 I
\end{pmatrix} = W^{(m)} \otimes I_n, \qquad (2.3) $$
with the rows wrapping around periodically,
where $h_0, \dots, h_{D-1}$ and $g_0, \dots, g_{D-1}$ are the low-pass and high-pass filter coefficients respectively ($D$ being the order of the wavelet transform), $I$ is the $n \times n$ identity matrix and $0$ is the $n \times n$ zero matrix. The resulting transformed matrix has a 'finger' structure of
blocks, each with a diagonal structure. We can avoid the finger pattern by permuting the
rows and columns of the transformed matrix so as to centre the blocks containing large
entries. We call this modified big block transform BBDWTPer, because it is a big block
version of the DWTPer transform described in [10]. We anticipate that BBDWTPer
may be useful for preconditioning block matrices with a very strong block diagonal
singularity (see the comparison of DWTPer and other DWT-based preconditioners in
[9]), but we have not yet found example matrices for which BBDWTPer provides a good
preconditioner. Preconditioners based on BBDWT and BBDWTPer are tested in Section
4; we now present a more effective method.
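Before moving on, the Kronecker-product structure of the two transforms above can be checked numerically. The sketch below uses the Haar wavelet ($D = 2$) and small hypothetical dimensions:

```python
import numpy as np

# Haar filters (order D = 2): one low-pass and one high-pass coefficient pair.
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)

def haar_dwt_matrix(m):
    """Level 1 Haar DWT matrix W^(m): averaging rows stacked over differencing rows."""
    W = np.zeros((m, m))
    for r in range(m // 2):
        W[r, 2 * r:2 * r + 2] = h
        W[m // 2 + r, 2 * r:2 * r + 2] = g
    return W

m, n = 4, 4
W_block = np.kron(np.eye(m), haar_dwt_matrix(n))  # block DWT (2.2): I_m (x) W^(n)
W_big = np.kron(haar_dwt_matrix(m), np.eye(n))    # BBDWT (2.3): W^(m) (x) I_n

# Both transforms are orthogonal, so W W^T = I.
print(np.allclose(W_block @ W_block.T, np.eye(m * n)),
      np.allclose(W_big @ W_big.T, np.eye(m * n)))  # prints: True True
```

The BBDWT mixes corresponding entries of adjacent blocks (scalar filter taps multiply whole identity blocks), which is exactly how it converts smoothness between blocks into smallness.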
3 Recursive BBDWT-based preconditioning
An alternative way of avoiding the 'finger' pattern is to use a 'Big Block' version of the NS-forms presented in [3]. We define the Big Block NS-form (BBNS-form) of a matrix as follows. To transform a matrix consisting of $m^2$ blocks, each of dimension $n$ (where $m$ and $n$ are powers of 2), we define $P_i$, $Q_i$ to be the $mn/2^i \times mn/2^{i-1}$ matrices such that
$$ W_B^{(m/2^{i-1},\,n)} = \begin{pmatrix} P_i \\ Q_i \end{pmatrix}. \qquad (3.1) $$
Given an $mn \times mn$ matrix $A$, define $T_0 = A$ and, for $i \ge 1$,
$$ T_i = P_i T_{i-1} P_i^T, \quad A_i = Q_i T_{i-1} Q_i^T, \quad B_i = Q_i T_{i-1} P_i^T, \quad C_i = P_i T_{i-1} Q_i^T, \qquad (3.2) $$
so that $T_i$ is represented, after one level of the BBDWT, by the block matrix
$$ \begin{pmatrix} A_{i+1} & B_{i+1} \\ C_{i+1} & T_{i+1} \end{pmatrix}. \qquad (3.3) $$
The level $k$ BBNS-form of $A$ comprises $T_k$ together with $A_i, B_i, C_i$, $i = 1, 2, \dots, k$. (The blocks in (3.3) are arranged differently from those of the standard level 1 DWT of $T_i$. We have used this ordering in order to be consistent with the notation of [3].)
We propose to use banded approximations to the submatrices of the BBNS-form as the basis for our preconditioner. If the blocks of $A$ vary smoothly apart from a diagonal block band, then each of $A_{i+1}, B_{i+1}, C_{i+1}$ will have small entries except for a wrap-round diagonal block band. So we can approximate them by $\bar A_{i+1}, \bar B_{i+1}, \bar C_{i+1}$, formed by cutting to a block band, giving an approximation $\bar T_i$ to $T_i$:
$$ \bar T_i = \begin{pmatrix} \bar A_{i+1} & \bar B_{i+1} \\ \bar C_{i+1} & T_{i+1} \end{pmatrix}. \qquad (3.4) $$
(In practice, it is unnecessary to compute $A_{i+1}, B_{i+1}, C_{i+1}$ and then to set the entries outside the block band to zero; instead we can compute only the non-zero entries of $\bar A_{i+1}, \bar B_{i+1}, \bar C_{i+1}$. This enables us to reduce the cost of forming $\bar T_i$.)
We now show how this can help us to solve (2.1). We use a flexible GMRES iteration (see [12]) preconditioned by approximate solution of an equation of the form $Ay = v$ at each step. To do this we first apply a level 1 BBDWT with a block band cut to give
$$ \begin{pmatrix} \bar A_1 & \bar B_1 \\ \bar C_1 & T_1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \qquad (3.5) $$
where $y_1 = Q_1 y$, $y_2 = P_1 y$, $v_1 = Q_1 v$, $v_2 = P_1 v$. We solve this equation using the Schur complement $S_1 = T_1 - \bar C_1 \bar A_1^{-1} \bar B_1$. This requires us to solve an equation of the form
$$ S_1 y_2 = w_2, \qquad (3.6) $$
which we do by a further GMRES iteration. We expect that $\bar T_1$ will be an effective preconditioner for $S_1$ (see [1], Section 9.3), so we now seek a cheap way of applying $\bar T_1^{-1}$ to a vector. To do this we repeat the process of applying a level 1 BBDWT and using the Schur complement. In summary, during the solution of (3.6) we solve a preconditioning equation of the form
$$ \bar T_1 y = v, \qquad y, v \in \mathbb{C}^{mn/2}. \qquad (3.7) $$
To do this cheaply we repeat the process of applying a level 1 BBDWT and using the Schur complement and obtain an equation of the form
$$ S_2 z = w, \qquad z, w \in \mathbb{C}^{mn/4}. \qquad (3.8) $$
This in turn can be solved using flexible GMRES preconditioned by $\bar T_2$. We continue recursively, solving equations of the form
$$ S_i z = w, \qquad z, w \in \mathbb{C}^{mn/2^i}, \qquad (3.9) $$
iteratively, preconditioning by solving equations of the form
$$ \bar T_i y = v, \qquad y, v \in \mathbb{C}^{mn/2^i}, \qquad (3.10) $$
until the matrix $T_i$ is small enough that $T_i^{-1}$ can be applied directly by means of LU factorization at low cost. Therefore, at level $i$, each GMRES iteration requires a preconditioning step that in turn calls for iterative solution by GMRES of a coarser level equation. At the coarsest level the preconditioner is applied directly using an LU factorization of $T_{k+1}$. This process is summarized in Algorithm 3.1.
Algorithm 3.1 Approximate solution of $\bar T_i y = v$.
(1) Compute $v_1 = Q_{i+1}v$, $v_2 = P_{i+1}v$.
(2) Solve $\bar A_{i+1} w_1 = v_1$ for $w_1$.
(3) Set $w_2 = v_2 - \bar C_{i+1} w_1$.
(4) Define $S_{i+1} = T_{i+1} - \bar C_{i+1} \bar A_{i+1}^{-1} \bar B_{i+1}$.
(5) Solve $S_{i+1} y_2 = w_2$ for $y_2$ by flexible GMRES iteration, preconditioning with $\bar T_{i+1}$, using Algorithm 3.1 if $i+1 < k$ and using the matrices $L_{i+1}$, $U_{i+1}$ otherwise.
(6) Set $y_1 = w_1 - \bar A_{i+1}^{-1} \bar B_{i+1} y_2$.
(7) Set $y = Q_{i+1}^T y_1 + P_{i+1}^T y_2$.
To solve equation (2.1), we start the solution process at level $i = 0$ and apply a GMRES iteration with the preconditioner $\bar T_0$ to the Schur complement of the transformed $T_0 = A$. The overall method is presented in Algorithm 3.2.
Algorithm 3.2 Solution of $Ax = b$ by recursively preconditioned flexible GMRES.
(1) Set up:
(a) Input matrix $A$, vector $b$, tolerance $t$.
(b) Decide on values for:
• maximum wavelet level, $k$,
• tolerance $t_i$ for inner iterations,
• block band width for approximating the submatrices.
(c) Set $T_0 = A$ and $i = 0$.
(d) Recursively, for $i = 1, \dots, k+1$, compute $\bar T_i$, $\bar A_i$, $\bar B_i$, $\bar C_i$, and factorize $\bar A_i$.
(e) Factorize $T_{k+1}$ into $L_{k+1}$, $U_{k+1}$.
(2) Solve $T_0 x = b$ by flexible GMRES preconditioned using Algorithm 3.1.
Note that the relatively expensive step of computing the BBNS-form matrices Ai,Bi, Ci,Ti
is done only once.
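Steps (1)-(6) of Algorithm 3.1 amount to a block solve via the Schur complement. The following sketch replaces the inner flexible GMRES iterations and LU factors by direct solves, on randomly generated well-conditioned blocks of our own choosing:

```python
import numpy as np

def schur_block_solve(A, B, C, T, v1, v2):
    """Solve [[A, B], [C, T]] [y1; y2] = [v1; v2] via the Schur complement of A,
    mirroring steps (1)-(6) of Algorithm 3.1 (direct solves stand in for the
    inner GMRES iterations and LU factors of the full method)."""
    w1 = np.linalg.solve(A, v1)               # step (2)
    w2 = v2 - C @ w1                          # step (3)
    S = T - C @ np.linalg.solve(A, B)         # step (4): Schur complement
    y2 = np.linalg.solve(S, w2)               # step (5), solved directly here
    y1 = w1 - np.linalg.solve(A, B @ y2)      # step (6)
    return y1, y2

# Demo on random, diagonally dominant blocks.
rng = np.random.default_rng(7)
k = 5
A = np.eye(k) * 4 + 0.1 * rng.standard_normal((k, k))
B = 0.1 * rng.standard_normal((k, k))
C = 0.1 * rng.standard_normal((k, k))
T = np.eye(k) * 4 + 0.1 * rng.standard_normal((k, k))
v = rng.standard_normal(2 * k)
y1, y2 = schur_block_solve(A, B, C, T, v[:k], v[k:])
print(np.allclose(np.concatenate([y1, y2]),
                  np.linalg.solve(np.block([[A, B], [C, T]]), v)))  # True
```

In the full method the expensive pieces, the solve with $S$ and the solve with $\bar T$, are themselves approximated recursively, which is what keeps the overall cost low.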
4 Numerical results
Here we illustrate the effectiveness of our method, and compare it with some alternative approaches, by considering two example $mn \times mn$ matrices:
$$ A_{ni+j,\,nk+l} = \begin{cases} c, & i = k \text{ and } j = l, \\ \log\big((i-k)^2 + (j-l)^2\big), & \text{otherwise}, \end{cases} \qquad (4.1) $$
for $i, k = 0, 1, \dots, m-1$; $j, l = 0, 1, \dots, n-1$; $c$ a constant;
$$ B_{ni+j,\,nk+l} = e^{-((i-k)^2 + (j-l)^2)} \qquad (4.2) $$
for $i, k = 0, 1, \dots, m-1$; $j, l = 0, 1, \dots, n-1$.
Tables 1 and 2 give typical results for the matrices $A$ and $B$ respectively. The cost of reducing the relative residual norm below a fixed tolerance is shown for matrices of various sizes using the following preconditioners:
P1: simple band preconditioner,
P2: standard BBDWT + band cut preconditioner,
P3: BBDWTPer + band cut preconditioner,
P4: recursive BBDWT-based preconditioner.
In each case GMRES was restarted after 10 iterations. '*' denotes non-convergence of GMRES(10). Unpreconditioned GMRES(10) failed to converge to the required tolerance for any size of matrix, so it is omitted from the tables.
TAB. 1  Cost of solving $Ax = b$.

                    P1             P2             P3             P4          Direct
  m   n   N=mn   its. Mflops   its. Mflops   its. Mflops   its. Mflops      Mflops
  8   8     64    30    0.65    49    1.2     38    0.99     6    0.32        0.21
 16  16    256    58   17        *     *       *     *       7    5.5       12
 32  32   1024    86  393        *     *       *     *       7  150         720
 64  64   4096     *    *        *     *       *     *       7  6300      46000

TAB. 2  Cost of solving $Bx = b$.

                    P1             P2             P3             P4          Direct
  m   n   N=mn   its. Mflops   its. Mflops   its. Mflops   its. Mflops      Mflops
  8   8     64     8    0.19     8    0.26     8    0.26     4    0.26        0.21
 16  16    256    62   19       66   21       63   21        6    5.6       12
 32  32   1024    67  310       76  380       74  370        6  120         720
 64  64   4096    69 5000       78 6000       78 6000        6 1700       46000
Clearly the recursive BBDWT approach gives better performance than the alternative preconditioners that we tested and offers substantial savings compared with direct solution.
5 Conclusion and future work
We have designed a preconditioning method that exploits smoothness between the blocks of a class of dense matrices, giving useful savings compared with both direct solution and preconditioned GMRES using band preconditioners. In the future we plan to explore a number of ways of further improving our methods, including: (a) using a block DWT, in addition to the BBDWT, to exploit smoothness within each block; (b) using biorthogonal wavelets or multiwavelets (particularly the new supercompact Haar multiwavelets presented in [2]) to give improved compression; (c) preprocessing the matrix to enhance smoothness.
Bibliography
1. O. Axelsson. Iterative Solution Methods. Cambridge University Press, Cambridge, UK, 1996.
2. R. M. Beam and R. F. Warming. Multiresolution analysis and supercompact multiwavelets. SIAM J. Sci. Comput., 22:1238-1268, 2000.
3. G. Beylkin, R. R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms I. Comm. Pure Appl. Math., XLIV:141-183, 1991.
4. T. F. Chan and K. Chen. Two-stage preconditioners using wavelet band splitting and sparse approximation. Report CAM 00-26, UCLA, 2000.
5. K. Chen. On a class of preconditioning methods for dense linear systems from boundary elements. SIAM J. Sci. Comput., 20:684-698, 1998.
6. K. Chen. Discrete wavelet transforms accelerated sparse preconditioners for dense boundary element systems. Electron. Trans. Numer. Anal., 8:138-153, 1999.
7. J. Ford and K. Chen. An algorithm for accelerated computation of DWTPer-based band preconditioners. Numer. Algorithms, 26(2):167-172, 2001.
8. J. Ford and K. Chen. Wavelet-based preconditioners for dense matrices with non-smooth local features. BIT, 41(2):282-307, 2001.
9. J. Ford, K. Chen, and D. Evans. On a recursive Schur preconditioner for iterative solution of a class of dense matrix problems. Int. J. Comput. Math., 79: to appear.
10. J. Ford, K. Chen, and L. Scales. A new wavelet transform preconditioner for iterative solution of elastohydrodynamic lubrication problems. Int. J. Comput. Math., 75:497-513, 2000.
11. D. Gines, G. Beylkin, and J. Dunn. LU factorization of non-standard forms and direct multiresolution solvers. Appl. Comput. Harmon. Anal., 5:156-201, 1998.
12. Y. Saad. A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Comput., 14(2):461-469, 1993.
13. Y. Saad. Iterative Methods for Sparse Linear Systems. PWS, Boston, 1996.
14. G. Strang and T. Nguyen. Wavelets and Filter Banks. Wellesley-Cambridge Press, USA, 1996.
Some properties of the perturbed Haar wavelets
A. L. Gonzalez
Departamento de Matemática, Universidad Nacional de Mar del Plata, Funes 3350,
7600 Mar del Plata, Argentina.
algonzal@mdp.edu.ar
R. A. Zalik
Department of Mathematics, Auburn University, AL 36849-5310.
zalik@auburn.edu
Abstract
One of the authors has studied the properties of a family of Riesz bases obtained by perturbing the Haar function using B-splines. Although these bases cannot be obtained by multiresolution analyses, they have other interesting properties. The present paper discusses how a discrete signal $\{a_r;\ 0 \le r \le N-1\}$ can be studied by considering a suitable function of the form $f(t) := \sum_{r=0}^{N-1} a_r f_r(t)$, so that the existing theory for functions defined over a continuous domain can be applied.
1 Introduction
In what follows $\mathbb{Z}$ will denote the integers and $\mathbb{R}$ the real numbers; $t$ and $x$ will always denote real variables. The support of a function $f$ will be denoted by $\mathrm{supp}(f)$, its quadratic norm by $\|f\|$, and if $f \in L^1(\mathbb{R})$ its Fourier transform is defined by
$$ \hat f(x) := \int_{\mathbb{R}} e^{-ixt} f(t)\,dt. $$
In [3] we found a family of affine wavelet Riesz bases of $L^2(\mathbb{R})$, of bounded support and arbitrary degrees of smoothness, obtained by smoothing the discontinuities of the Haar function using B-splines. Although these bases are not orthogonal, they are symmetric, a feature that is lacking in orthogonal wavelets. Our bases can be constructed so that the difference between the frame bounds (which are given explicitly) can be made as small as desired. In general, orthogonal wavelets are represented by infinite series, and for computational purposes values are generated over a discrete set using the cascade algorithm [2, 5]. Our bases, on the other hand, are given in closed form. We now briefly describe how these wavelets are defined, introduce additional notation, and make assumptions that will be used in the subsequent discussion.
Let $N_m(t)$ denote the B-spline of order $m$ ($m \ge 2$) ([1], Chapter 4), and $\chi_{[0,m-1]}(t)$ the characteristic function of $[0, m-1]$. Define
$$ g(t) := \chi_{[0,m-1]}(t)\sum_{k=0}^{m-2} N_m(t-k), \qquad g_1(t) := g(t-m+1), $$
$$ h(t) := \frac{1}{2}\sum_{k=0}^{m-2} N_m(t-k), $$
and $q(t) := g_1(t) - h(t)$. For $0 < \delta < 1/2$, let $\alpha_1 = -\alpha_2 = -\alpha_3 = \alpha_4 = 2(m-1)/\delta$, $\beta_1 = 2(m-1)$, $\beta_2 = 2(m-1)(1+\delta)/\delta$, $\beta_3 = -\beta_4 = (m-1)/\delta$,
$$ p^{(i)}(t) := (-1)^{i-1} q(\alpha_i t + \beta_i), \quad i = 1, 2, 3, 4, $$
$$ p^{(5)}(t) := -\big(\chi_{[1/2-\delta,\,1/2)}(t) - \chi_{[1/2,\,1/2+\delta)}(t)\big), $$
$$ p^{(6)}(t) := \chi_{[0,1/2)}(t) - \chi_{[1/2,1)}(t), $$
and
$$ \psi(t) := \sum_{i=1}^{6} p^{(i)}(t). $$
We will call $\psi$ the perturbed Haar wavelet. In [3] we proved that $\mathrm{supp}(\psi) \subset [-\delta, 1+\delta]$, $\psi \in C^{m-2}(\mathbb{R})$, and that if $\psi_{j,k}(t) := 2^{j/2}\psi(2^j t - k)$, then $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ is a Riesz basis, and we provided explicit upper and lower frame bounds. Moreover, in [7] we showed that given a function $u$, the wavelet coefficients $(u, \psi_{j,k})$ can be computed in $O(N)$ steps (where $N$ is the sample size), just as in the orthogonal case.
In this paper we will discuss the application of the perturbed Haar wavelet to the study of discrete signals. Let us first look at the orthogonal case for comparison.
Let $\eta$ be an orthogonal wavelet associated with a multiresolution analysis $\{V_j;\ j \in \mathbb{Z}\}$ and a scaling function $\phi$, with the caveat that the definition of multiresolution analysis that we are adopting is that of [1] and [4], and therefore $V_j \subset V_{j+1}$, $j \in \mathbb{Z}$, whereas other authors, like [2] and [5], assume that $V_{j+1} \subset V_j$. If $a := \{a_r;\ 0 \le r \le N-1\}$ is an arbitrary sequence of real or complex numbers, then this discrete signal is transformed into a continuous one by considering the function $v(t) := \sum_{r=0}^{N-1} a_r \phi(t-r)$.
The study of the signal $v(t)$ has two stages: the analysis stage consists in computing the wavelet coefficients, whereas the synthesis stage consists in reconstructing the signal from the wavelet coefficients. If $W_j$ denotes the closure of the linear span of the functions $\eta_{j,k}$, $k \in \mathbb{Z}$, then the $W_j$ are mutually orthogonal and $V_0 = \bigoplus_{j<0} W_j$. Since $v \in V_0$, it turns out that the wavelet coefficients $(v, \eta_{j,k})$ vanish for $j \ge 0$. Moreover, since $v(t)$ has compact support, for each $j < 0$ there is only a finite number of nonzero wavelet coefficients.
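For orientation, the analysis and synthesis stages of the orthogonal Haar case (the classical situation, not the perturbed basis of this paper) can be sketched as follows; each level costs $O(N)$ operations:

```python
import numpy as np

def haar_analysis(v):
    """One level of Haar analysis: O(N) averaging/differencing."""
    s = (v[0::2] + v[1::2]) / np.sqrt(2.0)   # scaling (coarse) coefficients
    d = (v[0::2] - v[1::2]) / np.sqrt(2.0)   # wavelet (detail) coefficients
    return s, d

def haar_synthesis(s, d):
    """Invert one analysis level, reconstructing the finer signal exactly."""
    v = np.empty(2 * len(s))
    v[0::2] = (s + d) / np.sqrt(2.0)
    v[1::2] = (s - d) / np.sqrt(2.0)
    return v

# Full decomposition of a length-8 signal and exact reconstruction.
v = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 0.0, 3.0, 7.0])
levels = []
s = v
while len(s) > 1:
    s, d = haar_analysis(s)
    levels.append(d)
for d in reversed(levels):
    s = haar_synthesis(s, d)
print(np.allclose(s, v))  # True: perfect reconstruction
```

In this orthogonal setting a compactly supported signal produces finitely many nonzero coefficients per level; the difficulties described next arise precisely because the perturbed basis loses the orthogonality that makes this bookkeeping trivial.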
With the perturbed Haar wavelet we face an additional problem: the spaces $W_j$ are no longer orthogonal, and we can therefore no longer assume that all the wavelet coefficients corresponding to positive values of $j$ must vanish. Moreover, we may not even have a scaling function: in [8] we showed that if $\delta = 2^{\ell}$, where $\ell$ is a negative integer, then the perturbed Haar wavelet $\psi$ that corresponds to this value of $\delta$ cannot be generated by a multiresolution analysis.
To overcome these difficulties, we proceed as follows. Let $n \in \mathbb{Z}$ be such that $2^n \ge 4(m-1)N$, and set $b^{(1)}(t) := \chi_{[0,2(m-1))}(t)\,q(t)$, $b^{(2)}(t) := q(4(m-1)-t)$, $b(t) := b^{(1)}(t) + b^{(2)}(t)$, $f_r(t) := a_r\,b(2^n t - 4(m-1)r)$, and $f(t) := \sum_{r=0}^{N-1} f_r(t)$. By a direct application of [3], Lemma 6, we obtain the following
Lemma 1 The function $b(t)$ has the following properties:
(a) $\mathrm{supp}(b) \subset [0, 4(m-1)]$,
(b) $b \in C^{m-2}(\mathbb{R})$,
(c) $b(2(m-1)) = 1$,
(d) $\frac{d^k}{dt^k}b(0) = \frac{d^k}{dt^k}b(2(m-1)) = \frac{d^k}{dt^k}b(4(m-1)) = 0$, $1 \le k \le m-2$,
(e) the total variation of $b$ does not exceed $4(m-1)$,
(f) $|b(t)| \le 1$.
From the preceding lemma we conclude that $\mathrm{supp}(f) \subset [0,1]$, and that the functions $f_r$ have disjoint supports. This implies that $\|f\|^2 = \|b\|^2\|a\|^2 2^{-n}$, where $\|a\|^2 := \sum_{r=0}^{N-1} |a_r|^2$. We will also use the $\ell_1$ norm: $\|a\|_1 := \sum_{r=0}^{N-1} |a_r|$. Note, moreover, that $f \in C^{m-2}(\mathbb{R})$, and that $f(2^{1-n}(m-1)(2r+1)) = f_r(2^{1-n}(m-1)(2r+1)) = a_r\,b(2(m-1)) = a_r$.
In theory, given all its wavelet coefficients, the function $f$ can be reconstructed using the frame algorithm or other, even faster, algorithms [5]. However, since there may be an infinite number of nonzero wavelet coefficients, the application of such algorithms may not always be practical. We will adopt an approximation approach. If $A = A(\delta,m)$ and $B = B(\delta,m)$ are respectively the lower and upper frame bounds of the Riesz basis generated by $\psi$, $h_{j,k} := \langle f, \psi_{j,k}\rangle$, and $Lf := \sum_{j,k\in\mathbb{Z}} h_{j,k}\psi_{j,k}$, then from the error estimates for the frame algorithm we know that $\|Lf - f\| \le \big((B-A)/(B+A)\big)\|f\|$. Since, as remarked above, we can make $A$ and $B$ as close to 1 as we want by making $\delta$ sufficiently small, we conclude that for every $\varepsilon > 0$ there is a $\delta_0$ such that if $0 < \delta < \delta_0$, then $\|Lf - f\| \le \varepsilon\|f\|$. To approximate $f$ using the wavelet coefficients it will therefore suffice to approximate $Lf$ by an operator of the form
\[ Ef := \sum_{|j|\le J}\;\sum_{k\in\mathbb{Z}} h_{j,k}\,\psi_{j,k}. \]
Observe that since $f$ has bounded support, $Ef$ reduces to a finite sum. Our objective will be accomplished by showing that there is a constant $K$ such that, for all $j$,
\[ \Big\|\sum_{k\in\mathbb{Z}} h_{j,k}\,\psi_{j,k}\Big\| \le K\,\|a\|\;2^{-|j|/2}. \]
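The frame-algorithm error estimate quoted above can be seen at work already for a finite frame in $\mathbb{R}^2$. The following sketch (with an arbitrarily chosen frame and test vector, not taken from the paper) iterates the standard frame-algorithm update and checks the $((B-A)/(B+A))^n$ error decay:

```python
import numpy as np

# Rows of E form a frame {e_k} of R^2; S = E^T E is the frame operator,
# and the frame bounds A, B are its extreme eigenvalues.
E = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
S = E.T @ E
A, B = np.linalg.eigvalsh(S)[[0, -1]]

f = np.array([2.0, -1.0])
c = E @ f                      # frame coefficients <f, e_k>
g = np.zeros(2)                # initial guess
for _ in range(25):
    # frame-algorithm update: g <- g + 2/(A+B) * S(f - g), where S f = E^T c
    g += (2.0 / (A + B)) * (E.T @ c - S @ g)

err = np.linalg.norm(f - g)
bound = ((B - A) / (B + A)) ** 25 * np.linalg.norm(f)
```

For this frame the contraction factor is $(B-A)/(B+A) = 1/2$, so 25 iterations already drive the error below $10^{-5}$.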
But first we need to prove several lemmas of some independent interest. We begin with

Lemma 2  Let $\{a_k;\,k\in\mathbb{Z}\}$ and $\{b_k;\,k\in\mathbb{Z}\}$ be increasing sequences such that $a_k \le b_{k-1} \le a_{k+1}$, $k\in\mathbb{Z}$. Assume that $f_k \in L^2(\mathbb{R})$ and that $\operatorname{supp}(f_k) \subseteq [a_k,b_k]$, and let $f := \sum_{k\in\mathbb{Z}} f_k$. Then $\|f\|^2 \le 2\sum_{k\in\mathbb{Z}}\|f_k\|^2$.

Proof: If $r < k-1$ then $b_r \le b_{k-2} \le a_k$, whereas if $r > k+1$ then $a_r \ge a_{k+2} \ge b_k$. This implies that if $r \ne k-1,\,k$ then $f_r(t) = 0$ on $[a_k,b_k]$, and we readily see that
\[ \|f\|^2 \le 2\sum_{k\in\mathbb{Z}}\int_{a_k}^{b_k}|f_k(t)|^2\,dt = 2\sum_{k\in\mathbb{Z}}\|f_k\|^2. \qquad\Box \]
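Lemma 2 is easy to test numerically. The following sketch builds a chain of overlapping hat-shaped bumps satisfying the support condition (the particular bumps are illustrative, not from the paper) and compares $\|f\|^2$ with $2\sum_k\|f_k\|^2$:

```python
import numpy as np

xs = np.linspace(0.0, 10.0, 100001)

def bump(c, w):
    # hat-shaped bump supported on [c - w, c + w]
    return np.maximum(0.0, 1.0 - np.abs(xs - c) / w)

# supports [k - 3/4, k + 3/4]: a_k <= b_{k-1} <= a_{k+1}, as in Lemma 2
fs = [bump(k, 0.75) for k in range(1, 9)]
f = sum(fs)

def norm2(y):
    # squared L^2 norm by the trapezoidal rule
    return float(np.sum((y[1:] ** 2 + y[:-1] ** 2) / 2 * np.diff(xs)))

lhs = norm2(f)
rhs = 2 * sum(norm2(g) for g in fs)
```

Only neighbouring bumps overlap, so the cross terms are controlled exactly as in the proof and the inequality holds with room to spare.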
Some properties of the perturbed Haar wavelets

Lemma 3  Let $u \in L^2(\mathbb{R})$ be a function with support in an interval $[a,b]$ with $b-a \le 1$. If $j \le 0$, then
\[ \sum_{k\in\mathbb{Z}}|\langle u,\psi_{j,k}\rangle|^2 \le 3\cdot2^{j}\,\|u\|^2. \]

Proof: Let $j \le 0$ be arbitrary but fixed, and define $I(k) := \operatorname{supp}(\psi_{j,k}) \cap [a,b]$. Then $I(k) \subseteq [2^{-j}(k-\delta),\,2^{-j}(k+\delta+1)] \cap [a,b]$. If $I(k) = \emptyset$ then either $2^{-j}(k+\delta+1) \le a$ or $2^{-j}(k-\delta) \ge b$. This implies that if $I(k) \ne \emptyset$, then $k \in (2^ja-\delta-1,\;2^jb+\delta)$. Since the length of this interval is less than 3, we conclude that there are at most three values of $k$ for which $I(k) \ne \emptyset$; in other words, there are at most three values of $k$ for which $\langle u,\psi_{j,k}\rangle \ne 0$. Since $|\psi(t)| \le 1$, for any such $k$ we have, by the Cauchy–Schwarz inequality,
\[ |\langle u,\psi_{j,k}\rangle|^2 = \Big|2^{j/2}\int_{I(k)}u(t)\,\psi(2^jt-k)\,dt\Big|^2 \le 2^{j}\int_{I(k)}|u(t)|^2\,dt\int_{I(k)}|\psi(2^jt-k)|^2\,dt \le (b-a)\,2^{j}\int_{I(k)}|u(t)|^2\,dt \le 2^{j}\,\|u\|^2. \qquad\Box \]
Lemma 4  Let $\alpha,\beta,\gamma,\sigma \in \mathbb{R}$, with $\alpha,\gamma \ne 0$, and define $c(t) := q(\alpha t+\beta)$, $d(t) := q(\gamma t+\sigma)$, and
\[ K = 2\left\{\big[25/64 + (25/192)^{2/3}\big](m-1)^4 + (m-1)^2/1024\right\}. \]
If $j \ge 0$ and $i = 5,6$, then
\[ \text{(a)}\ \ \sum_{k\in\mathbb{Z}}|\langle d,c_{j,k}\rangle|^2 \le 2\big(4\sqrt K\,\alpha^{-2}+1/3\big)^2\,2^{-j}; \qquad \text{(b)}\ \ \sum_{k\in\mathbb{Z}}\big|\langle d,p^{(i)}_{j,k}\rangle\big|^2 \le 2\big(\sqrt2+1/3\big)^2\,2^{-j}. \]
Proof: (a) From [3] p. 3367 (bearing in mind the slightly different definition of the Fourier transform used there), we have
\[ \hat g(x) = \frac{i}{x}\,e^{-(m-1)xi/2}\Big[e^{-(m-1)xi/2} - \big((2/x)\sin x/2\big)^{m-1}\Big]. \]
From [1] p. 56 (3.2.16),
\[ \hat N_m(x) = e^{-(1/2)mxi}\big[(2/x)\sin x/2\big]^{m}. \tag{1.1} \]
Let
\[ s(x) := \big[(2/x)\sin x/2\big]^{m-1}. \]
Then
\[ \hat g_1(x) = e^{-(m-1)xi}\,\hat g(x) = \frac{i}{x}\Big[e^{-2(m-1)xi} - e^{-(3/2)(m-1)xi}s(x)\Big]. \]
Since
\[ \hat h(x) = \frac12\sum_{k=0}^{m-2}e^{-kxi}\,\hat N_m(x), \]
a straightforward computation yields
\[ \hat h(x) = -\frac{i}{2x}\Big[e^{-(1/2)(m-1)xi} - e^{-(3/2)(m-1)xi}\Big]s(x), \]
whence
\[ \hat q(x) = \frac{i}{2x}\,e^{-(3/2)(m-1)xi}\Big[\cos(m-1)x - s(x)\cos\tfrac12(m-1)x + i\big(2s(x)\sin\tfrac12(m-1)x - \sin(m-1)x\big)\Big]. \tag{1.2} \]
This implies that
\[ |\hat q(x)|^2 \le 8x^{-2}, \qquad x \ne 0. \tag{1.3} \]
On the other hand,
\[ \hat q(x) = \frac{i}{2x}\,e^{-(3/2)(m-1)xi}\,\big[(v_1+v_2) + i(v_3+v_4)\big], \]
where
\[ v_1 := \cos(m-1)x - \cos\tfrac12(m-1)x, \qquad v_2 := [1-s(x)]\cos\tfrac12(m-1)x, \]
\[ v_3 := s(x)\big[2\sin\tfrac12(m-1)x - \sin(m-1)x\big], \qquad v_4 := [s(x)-1]\sin(m-1)x. \]
A Maclaurin expansion shows that $|v_1| \le (5/8)(m-1)^2x^2$. Since $1-u^{m-1} = (1-u)\sum_{k=0}^{m-2}u^k$ and $|\sin u| \le |u|$, we infer that
\[ |1-s(x)| \le (m-1)\,\big|1-(2/x)\sin x/2\big| = (m-1)(2/x)\,\big|x/2 - \sin x/2\big|. \]
Since $|u-\sin u| \le |u|^3/6$, we conclude that $|1-s(x)| \le (m-1)x^2/48$. Thus
\[ |v_2(x)| \le (m-1)x^2/48 \qquad\text{and}\qquad |v_4(x)| \le (m-1)x^2/48. \]
Another Maclaurin expansion yields $|v_3| \le (5/24)(m-1)^3|x|^3$. Clearly $|v_3| \le 3$; thus $|v_3| = |v_3|^{2/3}|v_3|^{1/3} \le (25/192)^{1/3}(m-1)^2x^2$. Since
\[ |\hat q(x)|^2 = \tfrac14 x^{-2}\big[(v_1+v_2)^2 + (v_3+v_4)^2\big] \le \tfrac12 x^{-2}\big[v_1^2+v_2^2+v_3^2+v_4^2\big], \]
we deduce that
\[ |\hat q(x)|^2 \le Kx^2. \tag{1.4} \]
From Plancherel's identity we have:
\[ \langle d,c_{j,k}\rangle = 2^{j/2}\int_{\mathbb{R}} d(t)\,c(2^jt-k)\,dt = \frac{2^{j/2}}{2\pi}\int_{\mathbb{R}} e^{kxi}\,\hat c(x)\,\hat d(2^jx)\,dx = \frac{2^{j/2}}{2\pi}\int_0^{2\pi} e^{kxi}\sum_{r\in\mathbb{Z}}\hat c(x+2\pi r)\,\hat d\big(2^j(x+2\pi r)\big)\,dx. \]
This means that $\{2^{-j/2}\langle d,c_{j,k}\rangle;\ k\in\mathbb{Z}\}$ is the sequence of Fourier coefficients of the function $\sum_{r\in\mathbb{Z}}\hat c(x+2\pi r)\,\hat d(2^j(x+2\pi r))$. Thus, applying Bessel's identity and then the Cauchy–Schwarz inequality twice (once for sums and once for integrals), we have:
\[ 2\pi\,2^{-j}\sum_{k\in\mathbb{Z}}|\langle d,c_{j,k}\rangle|^2 = \int_0^{2\pi}\Big|\sum_{r\in\mathbb{Z}}\hat c(x+2\pi r)\,\hat d\big(2^j(x+2\pi r)\big)\Big|^2dx \le \big(S_1^{1/2}+S_2^{1/2}+S_3^{1/2}\big)^2, \]
where
\[ S_1 := \int_0^{2\pi}\big|\hat c(x)\,\hat d(2^jx)\big|^2dx, \qquad S_2 := \int_0^{2\pi}\big|\hat c(x-2\pi)\,\hat d\big(2^j(x-2\pi)\big)\big|^2dx, \]
\[ S_3 := \int_0^{2\pi}\Big|\sum_{r\ne0,-1}\hat c(x+2\pi r)\,\hat d\big(2^j(x+2\pi r)\big)\Big|^2dx. \]
Since $\hat c(x) = \alpha^{-1}e^{(\beta/\alpha)xi}\,\hat q(\alpha^{-1}x)$, (1.3) implies that
\[ |\hat c(x+2\pi r)|^2 \le 8\,|x+2\pi r|^{-2}, \qquad x \ne 2\pi r, \tag{1.5} \]
whereas from (1.4) we see that
\[ |\hat c(x+2\pi r)|^2 \le K\alpha^{-4}\,|x+2\pi r|^{2}. \tag{1.6} \]
Since $\hat d(x) = \gamma^{-1}e^{(\sigma/\gamma)xi}\,\hat q(\gamma^{-1}x)$, (1.3) also implies that
\[ \big|\hat d\big(2^j(x+2\pi r)\big)\big|^2 \le 8\cdot4^{-j}\,|x+2\pi r|^{-2}, \qquad x \ne 2\pi r. \tag{1.7} \]
Since $S_1$ is obtained by integrating the product of the left-side members of (1.6) and (1.7) (with $r = 0$) over an interval of length $2\pi$, we readily see that
\[ S_1 \le 16\pi K\alpha^{-4}4^{-j}. \tag{1.8} \]
A similar argument yields
\[ S_2 \le 16\pi K\alpha^{-4}4^{-j}. \tag{1.9} \]
From Minkowski's inequality,
\[ S_3 \le \int_0^{2\pi}\sum_{r\ne0,-1}|\hat c(x+2\pi r)|^2\,\sum_{r\ne0,-1}\big|\hat d\big(2^j(x+2\pi r)\big)\big|^2\,dx. \]
If $x\in[0,2\pi]$ and $r\ge1$, then from (1.5) we have:
\[ \sum_{r\ge1}|\hat c(x+2\pi r)|^2 \le 2\pi^{-2}\sum_{r\ge1}r^{-2} = 1/3, \]
whereas (1.7) implies that
\[ \sum_{r\ge1}\big|\hat d\big(2^j(x+2\pi r)\big)\big|^2 \le 2\cdot4^{-j}\pi^{-2}\sum_{r\ge1}r^{-2} = 4^{-j}/3. \]
Similarly,
\[ \sum_{r\le-2}|\hat c(x+2\pi r)|^2 \le 2\pi^{-2}\sum_{r\ge1}r^{-2} = 1/3 \qquad\text{and}\qquad \sum_{r\le-2}\big|\hat d\big(2^j(x+2\pi r)\big)\big|^2 \le 2\cdot4^{-j}\pi^{-2}\sum_{r\ge1}r^{-2} = 4^{-j}/3, \]
whence we conclude that $S_3 \le (4\pi/9)4^{-j}$. Combining (1.8), (1.9) and the preceding inequality, the assertion follows.
(b) Note that $p^{(6)}$ is $p^{(5)}$ with $\delta = 1/2$. Since
\[ \hat p^{(5)}(x) = 2ix^{-1}e^{-(1/2)\delta xi}\,(1-\cos\delta x), \]
we see that
\[ \big|\hat p^{(5)}(x+2\pi r)\big|^2 \le 16\,|x+2\pi r|^{-2}, \qquad x \ne 2\pi r. \tag{1.10} \]
On the other hand, the inequality $|1-\cos\delta x| \le (1/2)\delta^2x^2$ implies that $|\hat p^{(5)}(x)| \le \delta^2|x|$; therefore
\[ \big|\hat p^{(5)}(x+2\pi r)\big|^2 \le \delta^4\,|x+2\pi r|^{2}. \tag{1.11} \]
We now repeat the argument employed in (a), using (1.10) instead of (1.5), (1.11) instead of (1.6), and bearing in mind that $\delta \le 1/2$.
□
We now find bounds for the quadratic norms of $\psi(t)$ and $b(t)$.

Lemma 5  (a) $\|\psi\| \le 1$;  (b) $\|b\|^2 \le 2(m-1)$.

Proof: (a) [1] Theorem 4.3 implies that the functions $N_m$ are nonnegative. This implies that both $g$ and $h$ are nonnegative. In the proof of [3] Lemma 6(f) we show that
\[ \int_{\mathbb{R}} g(t)\,dt = \int_{\mathbb{R}} h(t)\,dt = (m-1)/2, \]
whence
\[ \int_{\mathbb{R}}|q(t)|\,dt \le m-1. \]
Moreover, $|q(t)| \le 1$ ([3] Lemma 6(h)). Thus,
\[ \int_{\mathbb{R}}|q(t)|^2\,dt \le \int_{\mathbb{R}}|q(t)|\,dt \le m-1. \]
Therefore,
\[ \int_{\mathbb{R}}\big|p^{(i)}(t)\big|^2\,dt = \frac{\delta}{2(m-1)}\int_{\mathbb{R}}|q(t)|^2\,dt \le \delta/2, \qquad i = 1,2,3,4. \]
This implies that
\[ \int_{\mathbb{R}}|\psi(t)|^2\,dt \le 4\,\delta/2 + \int_{\mathbb{R}}\big|p^{(5)}(t)-p^{(6)}(t)\big|^2\,dt = 2\delta + (1-2\delta) = 1. \]
(b)
\[ \int_{\mathbb{R}}|b(t)|^2\,dt \le \int_{\mathbb{R}}|b(t)|\,dt = 2\int_{\mathbb{R}}|q(t)|\,dt \le 2(m-1). \qquad\Box \]
Theorem 1
(a) If $j \le 0$,
\[ \Big\|\sum_{k\in\mathbb{Z}}\langle f,\psi_{j,k}\rangle\psi_{j,k}\Big\| \le 2\sqrt6\,(m-1)\,\|a\|\,2^{(j-n)/2}. \]
(b) Let $K$ be defined as in Lemma 4. If $j > 0$,
\[ \Big\|\sum_{k\in\mathbb{Z}}\langle f,\psi_{j,k}\rangle\psi_{j,k}\Big\| \le \Big[\sqrt2\big(4\sqrt K\,2^{-2n}+1/3\big) + \sqrt2 + 1/3\Big]\,\|a\|_1\,2^{-j/2}. \]

Proof: Assume first that $j \le 0$. Applying Lemma 2, Lemma 3, and Lemma 5, we have:
\[ \Big\|\sum_{k\in\mathbb{Z}}\langle f,\psi_{j,k}\rangle\psi_{j,k}\Big\|^2 \le 2\sum_{k\in\mathbb{Z}}\big\|\langle f,\psi_{j,k}\rangle\psi_{j,k}\big\|^2 = 2\|\psi\|^2\sum_{k\in\mathbb{Z}}|\langle f,\psi_{j,k}\rangle|^2 \le 2\sum_{k\in\mathbb{Z}}|\langle f,\psi_{j,k}\rangle|^2 \le 6\|f\|^2\,2^{j} \le 6\|b\|^2\|a\|^2\,2^{j-n} \le 24(m-1)^2\|a\|^2\,2^{j-n}. \]
Assume now that $j > 0$. Setting $b_r^{(i)}(t) := a_r\,b^{(i)}\big(2^nt - 4(m-1)r\big)$, we see that $f_r(t) = b_r^{(1)}(t) + b_r^{(2)}(t)$. Thus,
\[ \Big\|\sum_{k\in\mathbb{Z}}\langle f,\psi_{j,k}\rangle\psi_{j,k}\Big\| \le \sum_{\ell=1}^{6}\sum_{i=1}^{2}\sum_{r=0}^{N-1}\Big\|\sum_{k\in\mathbb{Z}}\big\langle b_r^{(i)},p^{(\ell)}_{j,k}\big\rangle\psi_{j,k}\Big\|. \]
Applying Lemma 2 and Lemma 5 as above, we see that
\[ \Big\|\sum_{k\in\mathbb{Z}}\big\langle b_r^{(i)},p^{(\ell)}_{j,k}\big\rangle\psi_{j,k}\Big\|^2 \le 2\sum_{k\in\mathbb{Z}}\big|\big\langle b_r^{(i)},p^{(\ell)}_{j,k}\big\rangle\big|^2. \]
Since the Fourier transforms of $q(t)$ and $\chi_{[0,2(m-1))}(t)\,q(t)$ are identical, and the functions $b_r^{(i)}$ are of the form $a_r\,q(\alpha t+\beta)$ or $a_r\,\chi_{[0,2(m-1))}(\alpha t+\beta)\,q(\alpha t+\beta)$ with $|\alpha| = 2^n$, from Lemma 4 we have:
\[ \sum_{k\in\mathbb{Z}}\big|\big\langle b_r^{(i)},p^{(\ell)}_{j,k}\big\rangle\big|^2 \le 2|a_r|^2\big(4\sqrt K\,2^{-2n}+1/3\big)^2\,2^{-j}, \qquad \ell = 1,2,3,4, \]
and
\[ \sum_{k\in\mathbb{Z}}\big|\big\langle b_r^{(i)},p^{(\ell)}_{j,k}\big\rangle\big|^2 \le 2|a_r|^2\big(\sqrt2+1/3\big)^2\,2^{-j}, \qquad \ell = 5,6, \]
whence the assertion readily follows. $\Box$
Bibliography
1. C. K. Chui, An Introduction to Wavelets, Academic Press, San Diego, 1992.
2. I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, 1992.
3. N. K. Govil and R. A. Zalik, Perturbations of the Haar wavelet, Proc. Amer. Math. Soc. 125 (1997), 3363-3370.
4. E. Hernandez and G. Weiss, A First Course on Wavelets, CRC Press, Boca Raton, FL, 1996.
5. S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, San Diego, 1997.
6. R. M. Young, An Introduction to Nonharmonic Fourier Series, Academic Press, New York, 1980.
7. R. A. Zalik, A class of quasi-orthogonal wavelet bases, in Wavelets, Multiwavelets and their Applications (A. Aldroubi and E. B. Lin, eds.), Contemporary Mathematics, Vol. 216, American Mathematical Society, Providence, RI, 1998, pp. 81-94.
8. R. A. Zalik, Riesz bases and multiresolution analyses, Appl. Comput. Harmon. Anal. 7 (1999), 315-331.
An example concerning the Lp-stability
of piecewise linear B-wavelets
Peeter Oja
Department of Mathematics, Tartu University, Liivi 2, Tartu, Estonia.
Peeter.Oja@ut.ee¹
Ewald Quak
SINTEF Applied Mathematics, P.O. Box 124 Blindern, 0314 Oslo, Norway.
Ewald.Quak@math.sintef.no²
Abstract
In this paper we consider B-wavelets of order 2, i.e. piecewise linear spline prewavelets of smallest support, over nonuniform knot sequences. We discuss an example showing that for $1 < p \le \infty$ there is no absolute $L_p$-stability for these B-wavelets. This means that regardless of what specific scaling of the B-wavelets is chosen, the corresponding stability constants cannot be made independent of the knot sequences involved.
1  Introduction
Polynomial splines are fundamental tools in numerous branches of applied mathematics,
and for spline spaces defined over a given knot sequence, the basis of choice is provided by
B-splines, which possess a lot of attractive properties for numerical computations. One
of these important properties of B-splines is their absolute stability. Given a B-spline
basis $\{B_i^{\mathbf t}\}_{i\in\mathbb{Z}}$ of polynomial order $d$ over a valid knot sequence $\mathbf t$, a classical result by de Boor [1] states that properly normalized B-splines are stable in the sense that for each set $\{b_i\}_{i\in\mathbb{Z}}$ of real coefficients it holds that
\[ C_d^{-1}\,\|\mathbf b\|_p \le \Big\|\sum_{i\in\mathbb{Z}} b_i\,\delta_i^{-1/p}B_i^{\mathbf t}\Big\|_p \le \|\mathbf b\|_p. \tag{1.1} \]
Here $\|\cdot\|_p$ denotes the standard integral and discrete $p$-norms for $1 \le p \le \infty$, respectively, and the normalizing factor $\delta_i$ for each B-spline is the length of its support divided by the order $d$. The important point is that the positive constant $C_d$ depends on the order $d$ alone, and not in any way on the underlying knot sequence $\mathbf t$.
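For hat functions (order $d = 2$) the two-sided estimate (1.1) is easy to check numerically. The sketch below uses $p = 2$, a uniform knot sequence, and interior hats only, and replaces $C_2$ by the crude bound 3 — all of these choices are ours, made for the illustration:

```python
import numpy as np

n, h = 12, 1.0 / 12                     # uniform knots on [0, 1]
x = np.linspace(0.0, 1.0, 20001)

def hat(c):
    # interior hat with peak c and support [c - h, c + h]; here delta_i = h
    return np.maximum(0.0, 1.0 - np.abs(x - c) / h)

rng = np.random.default_rng(0)
b = rng.normal(size=n - 1)              # coefficients for the interior hats
s = sum(bi * h ** (-0.5) * hat((i + 1) * h) for i, bi in enumerate(b))

# integral 2-norm of the normalized expansion, by the trapezoidal rule
norm = np.sqrt(np.sum((s[1:] ** 2 + s[:-1] ** 2) / 2 * np.diff(x)))
bnorm = np.linalg.norm(b)               # discrete 2-norm of the coefficients
```

For uniform knots the Gram matrix of the hats has eigenvalues between $h/3$ and $h$, so the normalized norm lands between $\|\mathbf b\|_2/\sqrt3$ and $\|\mathbf b\|_2$, independently of $n$.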
Since nested knot sequences give rise to nested spline spaces, spline functions have also become a focus of attention within the theory of wavelets and multiresolution analysis, starting with cardinal spline wavelets on infinite equally spaced and uniformly refined knot sequences, for which Fourier transform techniques are available; see [3] and the references therein.

¹Research supported by the Estonian Science Foundation, Grant no. 3926.
²Research supported by the EU Research and Training Network MINGLE, RTN1-1999-00117.
The study of spline wavelets on bounded intervals, for arbitrary knot sequences and
nonuniform refinement began with the papers [4], [5] and [2], respectively. The construction of so-called minimally supported B-wavelets for a given spline order d and two nested
knot sequences to provide a basis of the relative orthogonal complement (wavelet) space
is described in detail in [6]. This means that given the coarse and fine knot sequence,
there exist explicit algorithms to determine the supports of the B-wavelet functions, the
so-called minimal intervals, and also to compute the corresponding wavelet functions,
though only up to a normalization constant.
One open problem, however, is how to fix the normalization factor for each B-wavelet
function to achieve best possible stability for the whole B-wavelet basis. We provide an
example for the case of piecewise linear wavelets, i.e. polynomial order 2, that shows that for $1 < p \le \infty$ there is no absolute stability of B-wavelets, meaning that there is no choice of normalization that provides absolute stability constants which are completely independent of the underlying knot sequences. $L_p$-stability estimates involving a quantity dependent on the knot sequences for $1 < p \le \infty$, and showing absolute stability for $p = 1$, are given in [7].
2  Piecewise linear B-wavelets
The theory of B-wavelets [6] covers general cases of knot refinement, such as situations where several or no knots at all are inserted into an old knot interval, or where the multiplicity of an existing knot is increased. For our purposes, however, it is sufficient to consider what one might call the standard setting, where all knots are simple except at the interval endpoints, which we can count as double knots, and where exactly one new knot is inserted strictly between two old ones.
Our notation is as follows for the closed interval $[0,1]$. We have a coarse knot sequence with $n-1$ interior knots, namely
\[ \tau:\ 0 = \tau_0 < \tau_1 < \cdots < \tau_n = 1. \]
Strictly between each pair of coarse knots $\tau_{i-1}$ and $\tau_i$ we insert a new knot $s_i$ at an arbitrary location, i.e.
\[ \tau_{i-1} < s_i < \tau_i \quad\text{for each } i = 1,\dots,n. \]
Thus we have a sequence $s$ of new knots
\[ s:\ 0 < s_1 < \cdots < s_n < 1. \]
The fine knot sequence $t = \tau \cup s$, when ordered appropriately, is given as
\[ t:\ 0 = t_0 < t_1 < \cdots < t_{2n} = 1, \]
where the even-numbered knots in $t$ correspond to old knots in $\tau$, while the odd-numbered knots represent the newly inserted knots from $s$. To account for the boundary, we treat the interval endpoints as double knots by setting $\tau_{-1} = t_{-1} = 0$ and $\tau_{n+1} = t_{2n+1} = 1$.
For our investigations it is necessary to introduce also some notation related to the knot spacings. We set
\[ d_i = t_{i+1}-t_i \ \text{ for } i = 0,\dots,2n-1, \qquad \delta_i = t_{i+1}-t_{i-1} \ \text{ for } i = 0,\dots,2n, \]
which means $\delta_0 = d_0 = t_1-t_0$ and $\delta_{2n} = d_{2n-1} = t_{2n}-t_{2n-1}$ at the boundary. Thus $\delta_i$ is the distance between two consecutive old knots if $i$ is odd, and between two consecutive new knots if $i$ is even (and not at the boundary).
We also introduce the index sets
\[ \Omega = \{1,3,\dots,2n-1\} \quad\text{and}\quad \Omega_0 = \{3,5,\dots,2n-3\}. \]
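These spacings are easy to compute from a fine knot sequence. The sketch below uses $n = 3$ with illustrative knot positions of our own choosing, together with the double-knot convention $t_{-1} = 0$, $t_{2n+1} = 1$:

```python
import numpy as np

# fine knots t_0, ..., t_6 for n = 3 (positions are illustrative)
t = np.array([0.0, 1/3 - 0.05, 1/3, 1/3 + 0.02, 2/3, 5/6, 1.0])
d = np.diff(t)                              # d_i = t_{i+1} - t_i
te = np.concatenate(([t[0]], t, [t[-1]]))   # double knots at the boundary
delta = te[2:] - te[:-2]                    # delta_i = t_{i+1} - t_{i-1}
```

Note that `delta[0] == d[0]` and `delta[-1] == d[-1]`, exactly the boundary identities stated above, and that `delta[2]` is the gap between the two new knots surrounding $t_2$.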
The piecewise linear functions on the knot sequences $\tau \subset t$ form nested linear spaces $V_0 \subset V_1$ of dimensions $n+1$ and $2n+1$, respectively. The corresponding piecewise linear B-splines forming a basis of these spaces are simple hat functions. We denote them as $\varphi_j$ and $\gamma_i$ for $\tau$ and $t$, respectively, where, with the necessary adjustments at the endpoints,
\[ \varphi_j(x) = \begin{cases} (x-\tau_{j-1})/\delta_{2j-1} & \text{if } x\in[\tau_{j-1},\tau_j],\\ (\tau_{j+1}-x)/\delta_{2j+1} & \text{if } x\in[\tau_j,\tau_{j+1}],\\ 0 & \text{otherwise}, \end{cases} \qquad j = 0,\dots,n, \tag{2.1} \]
\[ \gamma_i(x) = \begin{cases} (x-t_{i-1})/d_{i-1} & \text{if } x\in[t_{i-1},t_i],\\ (t_{i+1}-x)/d_i & \text{if } x\in[t_i,t_{i+1}],\\ 0 & \text{otherwise}, \end{cases} \qquad i = 0,\dots,2n. \tag{2.2} \]
Using for any two functions $f,g \in V_1$ the standard inner product
\[ \langle f,g\rangle = \int_0^1 f(t)\,g(t)\,dt, \]
we can write
\[ V_1 = V_0 \oplus W, \]
where $W$ is the relative orthogonal complement of $V_0$ in $V_1$, and $\oplus$ denotes orthogonal summation. The dimension of $W$ is $n$, so that there is a basis function $\psi_k$ for every index $k \in \Omega$, or in other words for each newly inserted knot.
Nonzero functions $\psi_k \in W$ with minimal support are called B-wavelets. The general theory for B-wavelets developed in [6] establishes in this special case that there are $n$ different piecewise linear B-wavelets which form a basis of the wavelet space $W$. Each such B-wavelet is uniquely determined up to a constant multiple. There are two boundary B-wavelets $\psi_1$ and $\psi_{2n-1}$ and $n-2$ interior B-wavelets $\psi_k$ for $k \in \Omega_0$, which we will consider first. Each interior B-wavelet has support $[t_{k-3},t_{k+3}]$, so that
\[ \psi_k(x) = \sum_{i=k-2}^{k+2} q_i^k\,\gamma_i(x) \qquad\text{for } x \in [0,1], \]
with the coefficients determined by $\psi_k \in W$, or in other words
\[ \langle\psi_k,\varphi_j\rangle = 0 \qquad\text{for } j = 0,\dots,n. \]
For the boundary wavelets $\psi_1$ and $\psi_{2n-1}$ we have to make some minor modifications. Their supports are $[t_0,t_4]$ and $[t_{2n-4},t_{2n}]$, respectively, so that
\[ \psi_1(x) = \sum_{i=0}^{3} q_i^1\,\gamma_i(x) \quad\text{and}\quad \psi_{2n-1}(x) = \sum_{i=2n-3}^{2n} q_i^{2n-1}\,\gamma_i(x) \qquad\text{for } x \in [0,1]. \]
In the paper [7] the values of all B-wavelet coefficients $q_i^k$ are given explicitly in terms of the knot locations for the standard setting described here. In the same paper, estimates for the coefficients are used to derive $L_p$-stability estimates for these B-wavelets.
3  Stability of B-wavelets
Our aim in this paper is to establish

Theorem 3.1  Given the B-wavelet basis $\{\psi_k\}_{k\in\Omega}$, then for $1 < p \le \infty$ there are no sets of weights $\alpha_{k,p}$, $k \in \Omega$, such that
\[ K_1\,\|c\|_p \le \Big\|\sum_{k\in\Omega} c_k\,\alpha_{k,p}\,\psi_k\Big\|_p \le K_2\,\|c\|_p \tag{3.1} \]
holds for any wavelet coefficients $(c_1,c_3,\dots,c_{2n-1})$ and with absolute constants $K_1 > 0$ and $K_2 > 0$ which are completely independent of the choice of knot sequences $\tau$ and $s$.
Due to the finite dimension of $W$ it is clear that stability constants $K_1$ and $K_2$ exist, as any two norms on $W$ are equivalent. The pertinent question is how the weights could be chosen to achieve that the constants are actually independent of the dimension, the $p$-norm and, if possible, the choice of new knots $s$. We will prove the assertion by assuming that the estimate (3.1) holds with constants independent of the knot sequences. Then the following special case serves as a counterexample to this assertion.
The old knot sequence $\tau$ consists of the equally spaced points
\[ \tau_0 = 0, \quad \tau_1 = 1/3, \quad \tau_2 = 2/3, \quad \tau_3 = 1. \]
We want to investigate what happens if two newly inserted points are positioned ever more closely, so we introduce the new knots as
\[ s_1 = 1/3 - \varepsilon, \quad s_2 = 1/3 + \eta, \quad s_3 = 5/6, \qquad\text{for } 0 < \varepsilon, \eta < 1/3, \]
in order to find out what happens if both $\varepsilon \to 0^+$ and $\eta \to 0^+$. Thus the fine knot sequence $t$ is
\[ t_0 = 0, \quad t_1 = 1/3-\varepsilon, \quad t_2 = 1/3, \quad t_3 = 1/3+\eta, \quad t_4 = 2/3, \quad t_5 = 5/6, \quad t_6 = 1. \]
The fine interval lengths are
\[ d_0 = 1/3-\varepsilon, \quad d_1 = \varepsilon, \quad d_2 = \eta, \quad d_3 = 1/3-\eta, \quad d_4 = 1/6, \quad d_5 = 1/6, \]
while
\[ \delta_1 = \delta_3 = \delta_5 = 1/3, \qquad \delta_0 = 1/3-\varepsilon, \quad \delta_2 = \varepsilon+\eta, \quad \delta_4 = 1/2-\eta, \quad \delta_6 = 1/6. \]
In this setting any wavelet
\[ \psi = \sum_{i=0}^{6} q_i\,\gamma_i \in W \]
must be orthogonal to the coarse hat functions $\varphi_0,\dots,\varphi_3$. This actually means that the column vector $q$ of coefficients $q_i$ must satisfy the matrix equation
\[ Aq = 0, \tag{3.2} \]
where the entries of $A$ are the inner products of the coarse and fine hat functions, i.e.
\[ a_{j,i} = \langle\varphi_j,\gamma_i\rangle, \qquad j = 0,\dots,3, \quad i = 0,\dots,6. \]
Direct computations using (2.1) and (2.2) yield as the only nonzero entries
1
1
oo,o
1
1
1
ai,o
' r^f-R'
18' «M=-6^+9'
1 o 1
1 o 1
ai,3
02,2
=
11
-77+-,
-g'/-r
g'
11
1 2 1 , 1
a,,i
= -v
"1,*2'' -i^+Ti3
1
1-1,1
L
6'
13
22
a2,4 = 4^'-^+72'
1
^2,5
^3,4
12'
1
72'
1
''''' = 72'
1
"^■^=12'
''^■'^"72-
We now investigate the B-wavelets $\psi_1$ and $\psi_3$ in detail, corresponding to $s_1 = t_1$ and $s_2 = t_3$. Specializing the results from [7] then yields all necessary B-wavelet coefficients for this setting up to a scaling factor. Note, however, that it is straightforward to check that the corresponding coefficient vectors satisfy the matrix equation (3.2).
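The matrix equation (3.2) can also be explored numerically. The following sketch assembles $A = (\langle\varphi_j,\gamma_i\rangle)$ by quadrature for the illustrative values $\varepsilon = 0.01$, $\eta = 0.02$ (our choice) and confirms that $A$ has rank 4, so that its nullspace — the wavelet space $W$ in fine-hat coordinates — has dimension $n = 3$:

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 400001)
eps, eta = 0.01, 0.02

def hat(l, m, r):
    # piecewise linear hat with peak at m and support [l, r];
    # degenerate sides (l == m or m == r) handle the double boundary knots
    y = np.zeros_like(xs)
    if m > l:
        asc = (xs >= l) & (xs <= m)
        y[asc] = (xs[asc] - l) / (m - l)
    if r > m:
        dsc = (xs > m) & (xs <= r)
        y[dsc] = (r - xs[dsc]) / (r - m)
    return y

tau = [0.0, 1/3, 2/3, 1.0]
t = [0.0, 1/3 - eps, 1/3, 1/3 + eta, 2/3, 5/6, 1.0]
phi = [hat(tau[max(j - 1, 0)], tau[j], tau[min(j + 1, 3)]) for j in range(4)]
gam = [hat(t[max(i - 1, 0)], t[i], t[min(i + 1, 6)]) for i in range(7)]

def ip(f, g):
    # inner product <f, g> on [0, 1] by the trapezoidal rule
    y = f * g
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(xs)))

A = np.array([[ip(p, g) for g in gam] for p in phi])
```

The computed entries agree with the closed forms above (e.g. $a_{1,1} = 1/9 - \varepsilon/6$ and $a_{2,5} = 1/12$) to quadrature accuracy.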
The coefficients of the boundary wavelet $\psi_1$ and of the interior B-wavelet $\psi_3$ are explicit rational functions of $\varepsilon$ and $\eta$. The ones that matter for the argument below are the coefficients of the narrow hat function $\gamma_2$,
\[ q_2^1 = \frac{1+3\eta}{\varepsilon+\eta+6\varepsilon\eta}, \qquad q_2^3 = \frac{1+3\varepsilon}{\varepsilon+\eta+6\varepsilon\eta}, \]
so that
\[ q_2^1 - q_2^3 = \frac{3(\eta-\varepsilon)}{\varepsilon+\eta+6\varepsilon\eta}. \]
We first provide estimates for the $p$-norms of these B-wavelets.

Proposition 3.2  For small enough $\varepsilon$ and $\eta$, it holds for $1 < p \le \infty$ that
\[ \|\psi_1\|_p \ge \frac{16}{45}\Big(\frac12\Big)^{1/p}(\varepsilon+\eta)^{1/p-1} \qquad\text{and}\qquad \|\psi_3\|_p \ge \frac{16}{45}\Big(\frac12\Big)^{1/p}(\varepsilon+\eta)^{1/p-1}. \]

Proof: For all $0 < \varepsilon, \eta < 1/3$ we find that
\[ |q_2^1|\,(\varepsilon+\eta) \ge \inf_{0<\varepsilon,\eta<1/3}\frac{(1+3\eta)(\varepsilon+\eta)}{\varepsilon+\eta+6\varepsilon\eta} = \frac89, \]
and, similarly,
\[ |q_2^3|\,(\varepsilon+\eta) \ge \frac89. \]
Note that instead of $8/9$ we may write $1-\sigma$ for any $\sigma > 0$ if $\varepsilon$ and $\eta$ are small enough, or even 1 if $\varepsilon = \eta$.

In the process $\varepsilon, \eta \to 0^+$ all other coefficients $q_i^1$ and $q_i^3$ have finite limits. This means that for small enough $\varepsilon+\eta$
\[ \max_i|q_i^1| = |q_2^1| \ge \frac89(\varepsilon+\eta)^{-1}, \qquad \max_i|q_i^3| = |q_2^3| \ge \frac89(\varepsilon+\eta)^{-1}. \]
The absolute stability of piecewise linear B-splines (1.1) yields, with $C_2 \le 5/2$ (see [1]) and $\delta_2 = \varepsilon+\eta$,
\[ \|\psi_1\|_p = \Big\|\sum_{i=0}^{3}q_i^1\gamma_i\Big\|_p \ge \frac25\Big(\frac12\Big)^{1/p}\big\|\big(q_0^1\delta_0^{1/p},\,q_1^1\delta_1^{1/p},\,q_2^1\delta_2^{1/p},\,q_3^1\delta_3^{1/p}\big)\big\|_p \ge \frac25\Big(\frac12\Big)^{1/p}|q_2^1|\,\delta_2^{1/p} \ge \frac{16}{45}\Big(\frac12\Big)^{1/p}(\varepsilon+\eta)^{1/p-1}. \]
Analogously we get
\[ \|\psi_3\|_p = \Big\|\sum_{i=1}^{5}q_i^3\gamma_i\Big\|_p \ge \frac{16}{45}\Big(\frac12\Big)^{1/p}(\varepsilon+\eta)^{1/p-1}, \]
to complete the proof. $\Box$
Proof of Theorem 3.1: Let us now assume that with some scaling the B-wavelets are absolutely stable in $p$-norm for $1 < p \le \infty$, i.e. there exist weights $\alpha_{k,p}$ so that the inequalities (3.1) hold with constants independent of the specific choice of knot sequences. Choosing in the current setting all coefficients equal to zero except for $c_1 = 1$, the stability inequality (3.1) yields
\[ \|\alpha_{1,p}\psi_1\|_p \le K_2, \]
or in other words, using Proposition 3.2,
\[ |\alpha_{1,p}| \le \frac{45}{16}\,2^{1/p}K_2\,(\varepsilon+\eta)^{1-1/p}, \tag{3.3} \]
and by a similar argument
\[ |\alpha_{3,p}| \le \frac{45}{16}\,2^{1/p}K_2\,(\varepsilon+\eta)^{1-1/p}. \tag{3.4} \]
On the other hand, the stability estimate (3.1) yields for arbitrary $c_1$ and $c_3$, while setting all other $c_k$ to zero, that
\[ \|c_1\alpha_{1,p}\psi_1 + c_3\alpha_{3,p}\psi_3\|_p \ge K_1\big(|c_1|^p+|c_3|^p\big)^{1/p} \ge K_1\max(|c_1|,|c_3|). \]
Let us choose specifically
\[ c_1 = \alpha_{3,p} \quad\text{and}\quad c_3 = -\alpha_{1,p}, \]
which results in
\[ |\alpha_{1,p}\,\alpha_{3,p}|\;\|\psi_1-\psi_3\|_p \ge K_1\max(|\alpha_{1,p}|,|\alpha_{3,p}|), \]
leading with (3.3) and (3.4) to
\[ (\varepsilon+\eta)^{1-1/p}\,\|\psi_1-\psi_3\|_p \ge \Big(\frac12\Big)^{1/p}\frac{16K_1}{45K_2}. \tag{3.5} \]
On the other hand we derive from the absolute stability of linear B-splines (1.1) that
\[ \|\psi_1-\psi_3\|_p = \big\|q_0^1\gamma_0 + (q_1^1-q_1^3)\gamma_1 + (q_2^1-q_2^3)\gamma_2 + (q_3^1-q_3^3)\gamma_3 - q_4^3\gamma_4 - q_5^3\gamma_5\big\|_p \]
\[ \le \big\|\big(q_0^1\delta_0^{1/p},\,(q_1^1-q_1^3)\delta_1^{1/p},\,(q_2^1-q_2^3)\delta_2^{1/p},\,(q_3^1-q_3^3)\delta_3^{1/p},\,q_4^3\delta_4^{1/p},\,q_5^3\delta_5^{1/p}\big)\big\|_p \]
\[ \le 6\max\big\{|q_0^1|\delta_0^{1/p},\,|q_1^1-q_1^3|\delta_1^{1/p},\,|q_2^1-q_2^3|\delta_2^{1/p},\,|q_3^1-q_3^3|\delta_3^{1/p},\,|q_4^3|\delta_4^{1/p},\,|q_5^3|\delta_5^{1/p}\big\}. \]
All the terms
\[ |q_0^1|\delta_0^{1/p}, \quad |q_1^1-q_1^3|\delta_1^{1/p}, \quad |q_3^1-q_3^3|\delta_3^{1/p}, \quad |q_4^3|\delta_4^{1/p}, \quad |q_5^3|\delta_5^{1/p} \]
are in fact bounded from above for $\varepsilon+\eta \to 0^+$, so that the expression $(\varepsilon+\eta)^{1-1/p}$ times each of them tends to zero.
Since $|q_2^1-q_2^3| = 3|\varepsilon-\eta|/(\varepsilon+\eta+6\varepsilon\eta)$, we obtain for the only remaining term
\[ (\varepsilon+\eta)^{1-1/p}\,|q_2^1-q_2^3|\,\delta_2^{1/p} = \frac{3|\varepsilon-\eta|}{1+6\varepsilon\eta/(\varepsilon+\eta)}, \]
which goes to zero as well for $\varepsilon+\eta \to 0^+$. As a consequence
\[ \lim_{\varepsilon+\eta\to0^+}(\varepsilon+\eta)^{1-1/p}\,\|\psi_1-\psi_3\|_p = 0, \]
which contradicts (3.5). $\Box$
Remark 3.3 Although we have chosen an example with one boundary and one interior
B-wavelet, let us remark that the lack of absolute stability is in no way due to a boundary
effect. A completely analogous reasoning is possible if one chooses knot sequences with
more interior knots and studies the behaviour for two interior B-wavelets once two new
knots coalesce. Similarly just two boundary B-wavelets could be used on an even shorter
knot sequence, where there are no interior B-wavelets at all.
Bibliography
1. C. de Boor, The quasi-interpolant as a tool in elementary polynomial spline theory, in Approximation Theory, G. G. Lorentz et al. (eds), Academic Press, 1973, 269-276.
2. M. Buhmann and C. A. Micchelli, Spline prewavelets for non-uniform knots, Numer. Math. 61 (1992), 455-474.
3. C. K. Chui, An Introduction to Wavelets, Academic Press, 1992.
4. C. K. Chui and E. Quak, Wavelets on a bounded interval, in Numerical Methods in Approximation Theory, ISNM 105, D. Braess and L. L. Schumaker (eds), Birkhauser, 1992, 53-75.
5. T. Lyche and K. Morken, Spline-wavelets of minimal support, in Numerical Methods in Approximation Theory, ISNM 105, D. Braess and L. L. Schumaker (eds), Birkhauser, 1992, 177-194.
6. T. Lyche, K. Morken and E. Quak, Theory and algorithms for nonuniform spline wavelets, in Multivariate Approximation Theory, N. Dyn, D. Leviatan, D. Levin and A. Pinkus (eds), Cambridge University Press, 2001, 152-187.
7. J. Mikkelsen, P. Oja and E. Quak, Lp-stability of piecewise linear B-wavelets, preprint.
How many holes can locally linearly independent
refinable function vectors have?
Gerlind Plonka
Institute of Mathematics, University of Duisburg, Germany
plonka@math.uni-duisburg.de
Abstract
In this paper we consider the support properties of locally linearly independent refinable function vectors $\Phi = (\phi_1,\dots,\phi_r)^T$. We propose an algorithm for computing the global support of the components of $\Phi$. Further, for $\Phi = (\phi_1,\phi_2)^T$ we investigate the supports, especially the possibility of holes, of refinable function vectors if local linear independence is assumed. Finally, we give some necessary conditions for local linear independence in terms of rank conditions for special matrices given by the refinement mask. But we are not able to give a final answer to the question whether a locally linearly independent function vector can have more than one hole.
1  Introduction
Let $\Phi = (\phi_1,\dots,\phi_r)^T$, $r \in \mathbb{N}$, be a vector of compactly supported continuous functions on $\mathbb{R}$. The function vector $\Phi$ is said to be refinable if it satisfies a vector refinement equation
\[ \Phi(x) = \sum_{k\in\mathbb{Z}} A(k)\,\Phi(2x-k), \qquad x\in\mathbb{R}, \tag{1.1} \]
where $\{A(k)\}$ is a finitely supported sequence of real $(r\times r)$-matrices.
Refinable function vectors play a basic role in the theory of multiwavelets. In the last years the properties of refinable function vectors have been investigated very extensively. In fact, it is possible to characterize properties like approximation order and regularity of $\Phi$ and $L^2$-stability of the basis generated by $\Phi$ completely by means of the refinement mask $\{A(k)\}$ [1, 6, 7, 11].
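For $r = 1$ the simplest nontrivial instance of (1.1) is the hat function on $[0,2]$, with mask $A(0) = A(2) = 1/2$, $A(1) = 1$; a quick numerical check:

```python
import numpy as np

def phi(x):
    # hat function on [0, 2] with peak at 1
    return np.maximum(0.0, 1.0 - np.abs(x - 1.0))

# refinement equation phi(x) = 0.5 phi(2x) + phi(2x - 1) + 0.5 phi(2x - 2)
x = np.linspace(-1.0, 3.0, 1001)
lhs = phi(x)
rhs = 0.5 * phi(2 * x) + phi(2 * x - 1) + 0.5 * phi(2 * x - 2)
```

The two sides agree pointwise, since both are piecewise linear with the same values on the half-integer grid.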
We say that $\Phi$ is $L^2$-stable if there are constants $0 < A \le B < \infty$ such that for any sequences $c_1,\dots,c_r \in \ell^2(\mathbb{Z})$,
\[ A\sum_{\nu=1}^{r}\sum_{k\in\mathbb{Z}}|c_\nu(k)|^2 \le \Big\|\sum_{\nu=1}^{r}\sum_{k\in\mathbb{Z}}c_\nu(k)\,\phi_\nu(\cdot-k)\Big\|_2^2 \le B\sum_{\nu=1}^{r}\sum_{k\in\mathbb{Z}}|c_\nu(k)|^2. \]
In some applications one needs not only $L^2$-stability of the basis generated by $\Phi$ but other, stronger conditions of linear independence. We say that $\Phi$ is globally linearly independent
if for any sequences $c_1,\dots,c_r$ on $\mathbb{Z}$
\[ \sum_{\nu=1}^{r}\sum_{k\in\mathbb{Z}}c_\nu(k)\,\phi_\nu(\cdot-k) = 0 \ \text{ on } \mathbb{R} \]
implies that $c_\nu(k) = 0$ for all $\nu = 1,\dots,r$ and all $k\in\mathbb{Z}$ (see [8, 5]).
The following definition is even more restrictive: a function vector $\Phi$ is said to be linearly independent on a nonempty open subset $G$ of $\mathbb{R}$ if for any sequences $c_1,\dots,c_r$ on $\mathbb{Z}$
\[ \sum_{\nu=1}^{r}\sum_{k\in\mathbb{Z}}c_\nu(k)\,\phi_\nu(\cdot-k) = 0 \ \text{ on } G \]
implies that $c_\nu(k) = 0$ for all $k \in I_\nu(G)$, $\nu = 1,\dots,r$, where $I_\nu(G)$ contains all $k\in\mathbb{Z}$ with $\phi_\nu(\cdot-k) \not\equiv 0$ on $G$. Finally, $\Phi$ is called locally linearly independent if it is linearly independent on any nonempty open subset $G$ of $\mathbb{R}$.
Obviously, local linear independence of $\Phi$ implies global linear independence, and global linear independence of $\Phi$ implies $L^2$-stability. It has been shown by Sun [12] that for compactly supported, refinable functions ($r = 1$) with dilation factor 2 the notions of local and global linear independence are equivalent. However, this is no longer true for function vectors [4].
For (scalar) refinable functions $\phi$, local linear independence implies that $\phi$ has integer support, i.e., $\operatorname{supp}\phi$ starts and ends with an integer, and $\operatorname{supp}\phi$ does not contain holes, i.e., $\operatorname{supp}\phi$ is an interval.
Now one can ask: is this also true for locally linearly independent refinable function vectors? Unfortunately, this is not the case. In [10] it has been shown that a component of $\Phi$ can have a hole. However, it is not clear whether a refinable, locally linearly independent function vector can also have components with finitely many or even infinitely many holes.
In this paper we want to investigate support properties of locally linearly independent function vectors and consider the 'hole problem' more closely. In the second section we briefly recall a characterization of local linear independence for function vectors in terms of the mask $\{A(k)\}$. In Section 3 we present an algorithm for computing the starting points and endpoints of the support of the components $\phi_\nu$ of $\Phi$.
In the remaining part of the paper we restrict ourselves to the special case $\Phi = (\phi_1,\phi_2)^T$. We collect some observations on function vectors with holes in Section 4 and show that holes can only occur in special situations. In Section 5 we give necessary conditions for local linear independence in terms of rank conditions for matrices formed by the mask $\{A(k)\}$. In Section 6 we prove that the function vector $\Phi$ given in Example 4.1 is continuous and locally linearly independent. Finally, we summarize our findings in the conclusion. However, the question posed in the title of this paper cannot be answered completely. We conjecture that it is not possible to have locally linearly independent function vectors with more than one hole.
2  Preliminaries
Let us start with some notation. For a compactly supported, continuous function $\phi: \mathbb{R} \to \mathbb{R}$ let $\operatorname{supp}\phi$ be the closed subset of $\mathbb{R}$ where $\phi$ does not vanish. Further, let the global support $\operatorname{gsupp}\phi$ be the smallest interval containing $\operatorname{supp}\phi$. The function $\phi$ is said to have a hole if there is an interval $I$ which is a subset of $\operatorname{gsupp}\phi$ of Lebesgue measure greater than zero where $\phi$ is identically zero. The function vector $\Phi$ is said to contain a hole if one of its components has a hole.
For a characterization of locally linearly independent function vectors we briefly recall the result of Goodman, Jia and Zhou [4]. Let $\Phi$ satisfy the refinement equation (1.1), where the mask matrices $A(k)$ are zero matrices for $k < 0$ and for $k > N$. Considering the vector
\[ \Psi(x) = \big(\Phi(x+k)\big)_{k=0}^{N-1} \]
of length $rN$ and the $(rN\times rN)$-block matrices
\[ \mathcal A_0 = \big(A(2k-l)\big)_{k,l=0}^{N-1}, \qquad \mathcal A_1 = \big(A(2k-l+1)\big)_{k,l=0}^{N-1}, \tag{2.1} \]
the refinement equation can equivalently be written as
\[ \Psi(x/2) = \mathcal A_0\,\Psi(x) \qquad\text{and}\qquad \Psi\big((x+1)/2\big) = \mathcal A_1\,\Psi(x), \qquad x\in[0,1]. \]
For $\varepsilon_1,\dots,\varepsilon_n \in \{0,1\}$ it follows that
\[ \Psi\Big(\frac{\varepsilon_1}{2}+\cdots+\frac{\varepsilon_n}{2^n}+\frac{x}{2^n}\Big) = \mathcal A_{\varepsilon_1}\cdots\mathcal A_{\varepsilon_n}\,\Psi(x), \qquad x\in[0,1]. \]
Now let $v_0$ be a right eigenvector of $\mathcal A_0$ to the eigenvalue 1. This eigenvector is unique (up to multiplication with a constant) if $\Phi$ is $L^2$-stable (see [3]). Let $V$ be the minimal common invariant subspace of $\{\mathcal A_0,\mathcal A_1\}$ generated by $v_0$. Then $V$ contains the vectors $\Psi(x)$, $x\in[0,1)$, since $\Psi(0) = cv_0$ with some constant $c$ and each $x\in[0,1)$ can be represented as a limit of a sequence of dyadic numbers $l/2^n$, $l\in\mathbb{Z}$, $n = 1,2,\dots$. Further, let $M$ be an $(rN\times\dim V)$-matrix such that the columns of $M$ form a basis of $V$. Then we have from [4]

Theorem 2.1  Let $\Phi$ be a refinable vector of compactly supported, continuous functions satisfying (1.1) with $A(k) = 0$ for $k < 0$ and $k > N$. Then we have
(1) $\Phi$ is linearly independent on $(0,1)$ if and only if all nonzero rows of $M$ are linearly independent.
(2) $\Phi$ is locally linearly independent if and only if for all $n$ with $0 < n < 2^{rN}$ and all $\varepsilon_1,\dots,\varepsilon_n \in \{0,1\}$ the nonzero rows of $\mathcal A_{\varepsilon_1}\cdots\mathcal A_{\varepsilon_n}M$ are linearly independent.

Remark 2.2  A similar characterization of local linear independence is possible also for $L^p$-solutions of vector refinement equations (1.1) and even for distributions (see [2, 13]). Some examples of locally linearly independent function vectors can be found in [4, 10].
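The construction behind Theorem 2.1 can be carried out by hand for the scalar hat function ($r = 1$, $N = 2$). The sketch below builds $\mathcal A_0$ and $\mathcal A_1$ from the mask, takes the eigenvector $v_0$ of $\mathcal A_0$ for the eigenvalue 1, grows the minimal common invariant subspace $V$, and checks that the nonzero rows of $M$ are linearly independent, as they must be, since the hat function is locally linearly independent:

```python
import numpy as np

mask = {0: 0.5, 1: 1.0, 2: 0.5}          # hat-function mask, r = 1, N = 2
N = 2
A = lambda k: mask.get(k, 0.0)
A0 = np.array([[A(2 * k - l) for l in range(N)] for k in range(N)])
A1 = np.array([[A(2 * k - l + 1) for l in range(N)] for k in range(N)])

# right eigenvector of A0 for the eigenvalue 1 (this is Psi(0) up to scaling)
w, V = np.linalg.eig(A0)
v0 = np.real(V[:, np.argmin(np.abs(w - 1))])

# minimal common invariant subspace of {A0, A1} generated by v0
basis = [v0]
grown = True
while grown:
    grown = False
    for T in (A0, A1):
        for b in list(basis):
            cand = T @ b
            if np.linalg.matrix_rank(np.column_stack(basis + [cand])) > len(basis):
                basis.append(cand)
                grown = True
M = np.column_stack(basis)               # columns form a basis of V
```

Here $V$ turns out to be all of $\mathbb{R}^2$, so $M$ is a $2\times2$ matrix of full rank.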
3  Global support of $\Phi$
Now we want to give an algorithm for computing the global support of the components of refinable function vectors $\Phi$ from the mask. To this end let us assume that the $(r\times r)$-matrices $A(k)$ in (1.1) are of the form $A(k) = (A_{i,j}(k))_{i,j=1}^{r}$. We look for $\alpha_\nu, \beta_\nu \in \mathbb{R}$ with $\operatorname{gsupp}\phi_\nu = [\alpha_\nu,\beta_\nu]$. Let for all pairs $(i,j)$, $i,j = 1,\dots,r$,
\[ s_{i,j} := \min\{k : A_{i,j}(k) \ne 0\}, \qquad g_{i,j} := \max\{k : A_{i,j}(k) \ne 0\}. \]
Observe that $s_{i,j}$, $g_{i,j}$ are integers. The numbers $\alpha_\nu$ can be found by the following algorithm.
Algorithm 3.1
Input: $s_{i,j}$, $i,j = 1,\dots,r$.
(1) Let $p := (p_1,\dots,p_r)$ be a vector of length $r$.
    For $\nu$ from 1 to $r$ do $\alpha_\nu := s_{\nu,\nu}$; $p_\nu := \nu$ enddo.
(2) For $\nu$ from 1 to $r$ do
      For $j$ from 1 to $r$ do
        if $s_{\nu,j} < 2\alpha_\nu - \alpha_j$ then $\alpha_\nu := (s_{\nu,j}+\alpha_j)/2$; $p_\nu := j$ endif
      enddo
    enddo.
(3) Repeat step (2) as long as the vector $p = (p_1,\dots,p_r)$ changes.
(4) Form the $(r\times r)$ coefficient matrix $P$ with
\[ P_{i,j} = \begin{cases} 1 & \text{if } i=j \text{ and } i=p(i),\\ 2 & \text{if } i=j \text{ and } i\ne p(i),\\ -1 & \text{if } i\ne j \text{ and } j=p(i),\\ 0 & \text{elsewhere}, \end{cases} \]
and the vectors $a := (\alpha_1,\dots,\alpha_r)^T$, $s := (s_{1,p_1},\dots,s_{r,p_r})^T$, and solve the linear equation system $Pa = s$.
Output: $a = (\alpha_1,\dots,\alpha_r)^T$.
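Algorithm 3.1 transcribes directly into code; the sketch below is our own, and assumes that the diagonal entries of the input are finite (use infinity for identically vanishing mask entries $A_{i,j}$):

```python
import numpy as np

def support_start(s):
    """Algorithm 3.1: starting points alpha_nu of gsupp(phi_nu).

    s[i][j] = min{k : A_{ij}(k) != 0}; use float('inf') if A_{ij} vanishes
    identically (the diagonal entries must be finite).
    """
    r = len(s)
    alpha = np.array([float(s[v][v]) for v in range(r)])   # step (1)
    p = list(range(r))
    for _ in range(100):                                   # steps (2)-(3)
        p_old = list(p)
        for v in range(r):
            for j in range(r):
                if s[v][j] < 2 * alpha[v] - alpha[j]:
                    alpha[v] = (s[v][j] + alpha[j]) / 2
                    p[v] = j
        if p == p_old:
            break
    P = np.zeros((r, r))                                   # step (4)
    rhs = np.zeros(r)
    for v in range(r):
        P[v, v] += 2.0
        P[v, p[v]] -= 1.0          # gives P[v, v] = 1 when p[v] == v
        rhs[v] = s[v][p[v]]
    return np.linalg.solve(P, rhs)
```

For instance, `support_start([[0, 1], [-1, 0]])` returns `[0, -0.5]`: the equality $2\alpha_2 - \alpha_1 = s_{2,1}$ forces a non-integer starting point for $\phi_2$.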
Analogously we obtain the algorithm for the endpoints $\beta_\nu$:

Algorithm 3.2
Input: $g_{i,j}$, $i,j = 1,\dots,r$.
(1) Let $p := (p_1,\dots,p_r)$ be a vector of length $r$.
    For $\nu$ from 1 to $r$ do $\beta_\nu := g_{\nu,\nu}$; $p_\nu := \nu$ enddo.
(2) For $\nu$ from 1 to $r$ do
      For $j$ from 1 to $r$ do
        if $g_{\nu,j} > 2\beta_\nu - \beta_j$ then $\beta_\nu := (g_{\nu,j}+\beta_j)/2$; $p_\nu := j$ endif
      enddo
    enddo.
(3) Repeat step (2) as long as the vector $p = (p_1,\dots,p_r)$ changes.
(4) Form the $(r\times r)$ coefficient matrix $P$ as defined in Algorithm 3.1, and the vectors $b := (\beta_1,\dots,\beta_r)^T$, $g := (g_{1,p_1},\dots,g_{r,p_r})^T$, and solve the linear equation system $Pb = g$.
Output: $b = (\beta_1,\dots,\beta_r)^T$.
Proof: The refinement equation (1.1) implies for each component (l)„ that
fceKj=i
In particular, it follows from the local hnear independence, that for all k with A^j{k) i^ 0,
gsupp<?ij(2--fe) Cgsuppi?^^,
I/, j = l,...,r,
that is [(aj +fc)/2, (/3i + fe)/2] C [a,, /3,]. Using the numbers s^j and g,-/defined above,
we obtain (a,- + s.,i)/2 > a, and (/?,- + 5.j)/2 < /3., or equivalent^,
2a^ - aj < s^j
and
2/?^ - /3j > g^j
(3-1)
for all 1/ j = 1,... ,r. In particular, for each fixed v at least one of the r inequalities in
(3.1) for the starting points (and for the endpoints, respectively) must be an equality.
Let us look at the first algorithm, computing the starting points; the second works analogously. In the first step of the algorithm we just put α_ν := s_{ν,ν}. These s_{ν,ν} are upper bounds for the true starting points of φ_ν since, for j = ν, (3.1) implies α_ν ≤ s_{ν,ν}. Hence it is clear that, if 2α_ν − α_j is greater than s_{ν,j} for a fixed ν and some j ∈ {1,…,r}, then α_ν must be reduced, since α_j is already an upper bound for the starting point of φ_j. Putting now α_ν := (s_{ν,j} + α_j)/2 in step 2, we obtain again an upper bound for α_ν. Repeating the second step of the algorithm, we obtain decreasing sequences for the α_ν (being dyadic rationals, and) approaching the exact starting values. However, if the exact starting values are not dyadic rationals, then they cannot be obtained by a finite number of repetitions of step 2. That is why we consider the vector p, which stores for each ν an index j = p_ν for which the inequality in (3.1) is even an equality. Then step 2 need only be repeated a few times in order to find the correct vector p. Now we can use the r equalities

    2α_ν − α_{p_ν} = s_{ν,p_ν},   ν = 1,…,r,
in order to compute the α_ν directly. By a suitable rearrangement of the equations one obtains an (r×r)-coefficient matrix

    P := ( P_1  0  ...  0   0 )
         ( 0   P_2 ...  0   0 )
         ( ...              0 )     (3.2)
         ( 0   0  ...  P_K  0 )
         ( R          ...   D )

where P_l, l = 1,…,K, are circulant matrices of the form

    P_l = (  2 −1  0 ...  0 )
          (  0  2 −1 ...  0 )
          ( ...          −1 )
          ( −1  0 ...  0  2 ),
Locally linearly independent function vectors
D is a diagonal matrix with diagonal elements 1 or 2, and R is a matrix of dimension dim D × (r − dim D) with at most one nonvanishing entry in each row. For example, in the case p = (1,2,…,r), P is just the (r×r)-identity matrix, i.e., dim D = r and the matrices P_l and R do not occur in P. For p = (2,3,…,r,1) we find P = P_1, and D as well as R vanish. If p contains smaller 'cycles' of the form (p_{n_1},…,p_{n_h}) with p_{n_j} = n_{j+1}, j = 1,…,h−1, and p_{n_h} = n_1, then each cycle corresponds to a circulant matrix P_l in P. Since the circulant matrices P_l are invertible, the equation system is uniquely solvable.
□
Example 3.3 Let r = 4 and let the values s_{i,j}, i,j = 1,2,3,4, be given by the matrix

    (s_{i,j})_{i,j=1}^{4} = ( 1  2  1  0 )
                            ( 1  1  1  3 )
                            ( 1  1  1  1 )
                            ( 3  0  1  1 ).
Algorithm 3.1 gives
step 1: a^T = (α_1,α_2,α_3,α_4) = (1,1,1,1) and p = (1,2,3,4),
step 2: a^T = (α_1,α_2,α_3,α_4) = (1/2,3/4,3/4,3/8) and p = (4,1,1,2),
step 3: one repetition of step 2:
        a^T = (α_1,α_2,α_3,α_4) = (3/16,19/32,19/32,19/64) and p = (4,1,1,2).
Since p did not change, no further repetition of step 2 is necessary.
step 4: We obtain

    P = (  2  0  0 −1 )
        ( −1  2  0  0 )
        ( −1  0  2  0 )
        (  0 −1  0  2 ),

which can be simply changed into a matrix of the form (3.2) by rearranging the equations for the vector ã = (α_1,α_4,α_2,α_3)^T. The system P a = s with s = (0,1,1,0)^T gives a = (1/7,4/7,4/7,2/7)^T.
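As an aside, Algorithm 3.1 can be sketched in a few lines of exact-arithmetic Python. This sketch is not part of the original paper; the function names are ours, and the input matrix is the one of Example 3.3. Step (4) is carried out by rational Gaussian elimination.

```python
from fractions import Fraction

def support_starts(s):
    """Algorithm 3.1: starting points alpha_nu of gsupp(phi_nu) from the
    minimal mask indices s[i][j] = min{k : A_ij(k) != 0} (0-based here)."""
    r = len(s)
    alpha = [Fraction(s[v][v]) for v in range(r)]      # step (1)
    p = list(range(r))
    while True:                                        # steps (2)-(3)
        p_old = p[:]
        for v in range(r):
            for j in range(r):
                if s[v][j] < 2 * alpha[v] - alpha[j]:
                    alpha[v] = (s[v][j] + alpha[j]) / 2
                    p[v] = j
        if p == p_old:
            break
    # step (4): coefficient matrix P and right-hand side s_{v,p_v}
    P = [[Fraction(0)] * r for _ in range(r)]
    rhs = [Fraction(s[v][p[v]]) for v in range(r)]
    for v in range(r):
        P[v][v] = Fraction(1) if p[v] == v else Fraction(2)
        if p[v] != v:
            P[v][p[v]] -= 1
    return solve(P, rhs)

def solve(P, b):
    """Gauss-Jordan elimination over the rationals."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(P)]
    for c in range(n):
        piv = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

# the data of Example 3.3:
s = [[1, 2, 1, 0],
     [1, 1, 1, 3],
     [1, 1, 1, 1],
     [3, 0, 1, 1]]
a = support_starts(s)
# a == [Fraction(1, 7), Fraction(4, 7), Fraction(4, 7), Fraction(2, 7)]
```

The same routine run on the g_{i,j} (with the inequality reversed, as in Algorithm 3.2) yields the endpoints β_ν.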
Remark 3.4 In [10] it has been shown that for locally linearly independent refinable function vectors Φ = (φ_1,…,φ_r)^T the starting points and the endpoints of gsupp φ_ν, ν = 1,…,r, are rational numbers of the form k + c_r, where k ∈ Z and c_r ∈ J_r with

    J_r := { k / ((2^l − 1) 2^{r−l}) : l = 1,…,r, k = 0,1,…,(2^l − 1) 2^{r−l} − 1 }.
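The set J_r is easy to enumerate. The following snippet is ours (not the paper's) and assumes the form of J_r as stated in Remark 3.4; for r = 2 it recovers the set {0, 1/2, 1/3, 2/3} that appears later in the proof of Lemma 4.5.

```python
from fractions import Fraction

def J(r):
    """Candidate fractional parts c_r of the support endpoints:
    k / ((2**l - 1) * 2**(r - l)) for l = 1..r and 0 <= k < (2**l - 1)*2**(r-l)."""
    out = set()
    for l in range(1, r + 1):
        q = (2 ** l - 1) * 2 ** (r - l)
        out.update(Fraction(k, q) for k in range(q))
    return out
```

For instance, J(2) consists exactly of the four fractions 0, 1/2, 1/3, 2/3.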
4 Function vectors with holes
In contrast with the scalar case, where a locally linearly independent refinable function cannot have a hole, for function vectors this need no longer be true.
Example 4.1 Let Φ = (φ_1, φ_2)^T satisfy

    Φ(x) = ( 1/9  2/9 ) Φ(2x) + ( 1/3  1/3 ) Φ(2x − 1) + ( 2/3  0 ) Φ(2x − 2) + ( 0  0 ) Φ(2x − 7).
           ( 1/3  1/3 )         (  1    0  )             ( 1/3  0 )             ( 1  0 )
FIG. 1. Locally linearly independent function vector Φ = (φ_1, φ_2)^T with a hole.
Hence A_0 and A_1 in (2.1) are (14×14)-matrices. The function vector Φ is uniquely determined by the refinement equation (up to multiplication by a constant). Further, gsupp φ_1 = [0,3] and gsupp φ_2 = [0,5], and φ_2 possesses a hole of length 1, namely φ_2(x) = 0 for x ∈ (5/2, 7/2) (cf. Figure 1). As we shall show in Section 6, Φ is continuous and locally linearly independent.
Further, one can simply find function vectors Φ with infinitely many holes (but not being locally linearly independent).
Example 4.2 Let Φ = (φ_1, φ_2)^T with

    φ_1(x) = (1/2) φ_1(2x) + φ_1(2x − 1) + (1/2) φ_1(2x − 2),
    φ_2(x) = (1/2) φ_2(2x) + φ_1(2x − 4).

FIG. 2. Function vector Φ = (φ_1, φ_2)^T with infinitely many holes.
Here A_0, A_1 in (2.1) are (8×8)-matrices. Observe that φ_1 is just the hat function with supp φ_1 = [0,2] and φ_2 is a fractal function with gsupp φ_2 = [0,3], formed by infinitely many 'hats' of support length 2^{−j}, j = 0,1,…, and with infinitely many holes of the form 2^{−j}(3/2, 2), j = 0,1,… (cf. Figure 2). Of course, this function vector is not locally linearly independent, since φ_1 is refinable by itself (see also the proof of Theorem 4.3).
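Unrolling the second refinement relation of Example 4.2 gives the series φ_2(x) = Σ_{j≥0} 2^{−j} φ_1(2^{j+1}x − 4), which can be evaluated directly. The following check is ours, not the paper's; the truncation is harmless for any fixed x > 0, since only finitely many hats meet it.

```python
def hat(t):
    """phi_1: the hat function supported on [0, 2] with peak hat(1) = 1."""
    return max(0.0, 1.0 - abs(t - 1.0))

def phi2(x, terms=40):
    """phi_2(x) = sum_j 2**(-j) * phi_1(2**(j+1) * x - 4); the j-th hat
    is supported on 2**(-j) * [2, 3]."""
    return sum(2.0 ** (-j) * hat(2.0 ** (j + 1) * x - 4.0) for j in range(terms))
```

For instance phi2(2.5) = 1 (peak of the largest hat) and phi2(0.625) = 1/4 (peak of the hat on 2^{−2}[2,3]), while phi2 vanishes inside the holes, e.g. at 1.75 ∈ (3/2, 2) and 0.875 ∈ 2^{−1}(3/2, 2).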
We want to consider the support properties of function vectors Φ more closely and investigate in which cases the components of Φ can have holes.
In the remaining part of the paper we only investigate the case r = 2, i.e., Φ = (φ_1, φ_2)^T.
Theorem 4.3 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions with gsupp φ_ν = [α_ν, β_ν], and let l_ν = β_ν − α_ν, ν = 1,2, be the lengths of the global supports with l_1 ≤ l_2. Suppose that Φ contains holes. Then we have:
(1) The support lengths satisfy l_2/2 < l_1 < l_2.
(2) There exist compactly supported, continuous functions f_1, f_2 such that φ_2 = f_1 + f_2 and the vector (φ_1, f_1, f_2)^T is refinable.
Proof: Since Φ contains holes, there exists an open interval I = (γ_1, γ_2) of greatest length and a ν ∈ {1,2} with I ⊂ gsupp φ_ν, where φ_ν vanishes on I. If there are several intervals of greatest length (biggest holes) we just choose one of them. Refinability implies for x ∈ I

    φ_ν(x) = 0 = Σ_k A_{ν,1}(k) φ_1(2x − k) + A_{ν,2}(k) φ_2(2x − k).

Since Φ is locally linearly independent, it follows that

    A_{ν,1}(k) = 0   for   supp φ_1(2· − k) ∩ I ≠ ∅,
    A_{ν,2}(k) = 0   for   supp φ_2(2· − k) ∩ I ≠ ∅.

The choice of I as the greatest interval now implies that we can replace supp φ_j by gsupp φ_j, such that

    A_{ν,1}(k) = 0   for   2γ_1 − β_1 < k < 2γ_2 − α_1,
    A_{ν,2}(k) = 0   for   2γ_1 − β_2 < k < 2γ_2 − α_2.     (4.1)
Let now f_1 := φ_ν χ_{[α_ν, γ_1]} and f_2 := φ_ν χ_{[γ_2, β_ν]}, where χ_{[a,b]} denotes the characteristic function of the interval [a,b]. Then φ_ν = f_1 + f_2, and from refinability and from (4.1) it follows that

    f_1(x) = Σ_{k ≤ 2γ_1 − β_1} A_{ν,1}(k) φ_1(2x − k) + Σ_{k ≤ 2γ_1 − β_2} A_{ν,2}(k) φ_2(2x − k),
    f_2(x) = Σ_{k ≥ 2γ_2 − α_1} A_{ν,1}(k) φ_1(2x − k) + Σ_{k ≥ 2γ_2 − α_2} A_{ν,2}(k) φ_2(2x − k).
If the hole I were in φ_1, then at least one of the two functions f_1, f_2 would have a global support length less than l_1/2 and hence would vanish, since gsupp φ_1(2· − k) and gsupp φ_2(2· − k) have a length ≥ l_1/2. Thus the hole must be in φ_2, i.e., φ_2 = f_1 + f_2. For l_1 = l_2 we obtain a contradiction, since, with the same argument as before, one of the two functions f_1, f_2 vanishes. Hence l_2 > l_1. In this case (φ_1, f_1, f_2)^T is obviously a refinable vector of continuous functions.
It remains to show that l_2/2 ≥ l_1 leads to a contradiction. For l_2/2 ≥ l_1, φ_1 must be refinable by itself, since gsupp φ_2(2· − k) cannot be contained in gsupp φ_1 for any k ∈ Z. In particular, from local linear independence we know that then [α_1, β_1] is an integer interval and that φ_1 has no holes. Further, since at least one of the two functions f_1, f_2
has a global support length less than l_2/2, it follows that this function is representable by φ_1(2· − k), k ∈ Z, only. Without loss of generality let

    f_1(x) = Σ_{k ≤ 2γ_1 − β_1} A_{2,1}(k) φ_1(2x − k),   x ∈ R.     (4.2)

Considering Ψ_1 = (φ_1(· + k))_{k=0}^{l_1 − 1}, local linear independence implies that the space V_1 = span{Ψ_1(x) : x ∈ [0,1)} has full dimension l_1. Further, we consider the vector Ψ := ((φ_1(· + k))_{k=0}^{l_1 − 1}, (f_1(· + k))_{k=⌊α_2⌋}^{⌈γ_1⌉ − 1}, (f_2(· + k))_{k=⌊γ_2⌋}^{⌈β_2⌉ − 1})^T. (Here, for x ∈ R, ⌊x⌋ denotes the greatest integer less than or equal to x and ⌈x⌉ denotes the smallest integer greater than or equal to x.) Now, choosing a matrix M of basis vectors of the space V = span{Ψ(x) : x ∈ [0,1)}, then, because of (4.2), the rows of M corresponding to f_1 depend on the first l_1 rows (corresponding to φ_1). However, not all f_1-rows can be zero rows, since f_1 is not a zero function. But this contradicts the local linear independence condition by Theorem 2.1.
□
Corollary 4.4 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions with gsupp φ_ν = [α_ν, β_ν] and l_ν = β_ν − α_ν, ν = 1,2. Suppose that l_1 ≤ l_2. Then we have: if l_1 = l_2 or l_1 ≤ l_2/2, then φ_1, φ_2 do not possess holes.
Lemma 4.5 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions. Then Φ has no holes that start or end with an integer.
Proof: Suppose Φ has a hole which ends with an integer. Choose a hole (γ_1, γ_2) of this type with biggest length. Without loss of generality assume that this hole is in φ_2. Then, at least in a small right neighborhood of 0, φ_2(· + γ_2) is representable only by φ_1(2· + α_1) and φ_2(2· + α_2). Recall from [10] that the supports gsupp φ_1 = [α_1, β_1], gsupp φ_2 = [α_2, β_2] satisfy

    α_ν = k + c_2,   β_ν = l + c_2,   k, l ∈ Z,   c_2 ∈ {0, 1/2, 1/3, 2/3}.

Now, if both α_1 and α_2 are integers, then φ_1(x + α_1), φ_2(x + α_2), φ_2(x + γ_2) are linearly dependent in some suitable interval x ∈ [0, ε), ε > 0, since they can be represented by the two functions φ_1(2x + α_1), φ_2(2x + α_2). This is a contradiction to the local linear independence. If only one α_ν, ν ∈ {1,2}, is an integer, then φ_ν(x + α_ν) and φ_2(x + γ_2) are representable only by φ_ν(2x + α_ν) in some interval x ∈ [0, ε) as before, and we again obtain a contradiction. If neither α_1 nor α_2 is an integer, then φ_2(x + γ_2) cannot be represented by integer translates of φ_ν(2x), ν = 1,2, contradicting the refinability.
Analogously, the contradiction follows for holes starting with an integer.
□
Let us call a hole (γ_1, γ_2) in Φ a biggest hole if there is no other hole in Φ of double size of the form (2γ_1 + k, 2γ_2 + k) for some k ∈ Z.
Lemma 4.6 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions. Then there is at most one biggest hole in Φ.
Proof: Assume that Φ has two biggest holes. Let again l_1, l_2 denote the lengths of the global supports of φ_1, φ_2 and suppose that l_1 ≤ l_2. Then φ_1 cannot have a biggest hole by Theorem 4.3. Hence the two holes must be in φ_2, and we get a partition φ_2 = f_1 + f_2 + f_3, analogously as in the proof of Theorem 4.3, such that (gsupp f_1) ∪ (gsupp f_2) ∪ (gsupp f_3) ⊆ gsupp φ_2. Further, by refinability, each function f_1, f_2, f_3 can be represented by φ_1(2· − k), φ_2(2· − k), k ∈ Z. Moreover, at least one of the three functions f_1, f_2, f_3 must contain a translate of φ_2(2·), since otherwise at least two of the functions f_1, f_2, f_3 would be linearly dependent in a suitable interval inside the starting intervals, because φ_1(2· − k) either starts at Z + α_1/2 or at Z + (α_1 + 1)/2 (depending on whether k is even or odd). Hence, for the support lengths, l_2 ≥ l_2/2 + 2·(l_1/2), i.e., l_1 ≤ l_2/2. But this contradicts Corollary 4.4.
□
Remark 4.7 All results in this section can be generalized to r > 2 and to L_p-integrable functions, if the characterization of local linear independence in [2] is used.
5 Rank conditions for matrices formed by the refinement mask
We again restrict ourselves to the case that Φ = (φ_1, φ_2)^T is a vector of compactly supported, continuous functions satisfying the refinement equation (1.1) with A(k) = 0 for k < 0 and k > N.
Let us consider the matrices A_0 and A_1 in (2.1) and the minimal common invariant subspace V of {A_0, A_1} generated by v_0 as defined in Section 2. Recall that V contains Ψ(x), x ∈ [0,1). Let M be an (rN × dim V)-matrix such that the columns of M form a basis of V. Now delete all components in the vector Ψ = (Φ(x + k))_{k=0}^{N−1} corresponding to zero rows in M in order to get Ψ̃. Further, delete the corresponding rows and columns in the matrices A_0 and A_1 in (2.1) in order to obtain Ã_0 and Ã_1 with

    Ψ̃(x/2) = Ã_0 Ψ̃(x),   Ψ̃((x + 1)/2) = Ã_1 Ψ̃(x),   x ∈ [0,1].     (5.1)

Deleting the zero rows and the corresponding columns in M, we obtain M̃.
Example 5.1 Let us consider Example 4.1. Here Ψ is a vector of length 14 and V = span{Ψ(x) : x ∈ [0,1)}. Since supp φ_1 = [0,3] and supp φ_2 ⊆ [0,5], it follows that the rows of M corresponding to φ_1(x + j), j = 3,4,5,6, and φ_2(x + j), j = 5,6, are zero rows. Indeed, these are all zero rows of M, i.e., V has dimension 8. We delete these components of Ψ(x) and obtain

    Ψ̃(x) = (φ_1(x), φ_2(x), φ_1(x+1), φ_2(x+1), φ_1(x+2), φ_2(x+2), φ_2(x+3), φ_2(x+4))^T
as well as

    9Ã_0 = ( 1 2 0 0 0 0 0 0 )        9Ã_1 = ( 3 3 1 2 0 0 0 0 )
           ( 3 3 0 0 0 0 0 0 )               ( 9 0 3 3 0 0 0 0 )
           ( 6 0 3 3 1 2 0 0 )               ( 0 0 6 0 3 3 2 0 )
           ( 3 0 9 0 3 3 0 0 )               ( 0 0 3 0 9 0 3 0 )
           ( 0 0 0 0 6 0 3 2 )               ( 0 0 0 0 0 0 0 3 )
           ( 0 0 0 0 3 0 0 3 )               ( 0 0 0 0 0 0 0 0 )
           ( 0 0 0 0 0 0 0 0 )               ( 9 0 0 0 0 0 0 0 )
           ( 0 0 9 0 0 0 0 0 )               ( 0 0 0 0 9 0 0 0 ).
Let us call a row of Ã_0 (resp. Ã_1) a φ_1-row if it corresponds to a φ_1-entry in Ψ̃, and a φ_2-row if it corresponds to a φ_2-entry.
Let n be the length of the new vector Ψ̃; hence Ã_0, Ã_1 are (n×n)-matrices. If Φ is a locally linearly independent vector, then Theorem 2.1 implies that M̃ is an invertible (n×n)-matrix.
Deleting the first φ_1-row and the first φ_2-row and the corresponding columns in Ã_0, we obtain a new matrix B of dimension (n−2)×(n−2). The same matrix B is obtained if we delete the last φ_1-row and the last φ_2-row and corresponding columns in Ã_1. Further, the structure of Ã_0, Ã_1 implies that

    spec Ã_0 = spec J_0 ∪ spec B,   spec Ã_1 = spec J_1 ∪ spec B,

where J_0 (resp. J_1) is a (2×2)-matrix containing the entries of Ã_0 (resp. Ã_1) lying at the same time in the first φ_1- or φ_2-row (resp. last φ_1- or φ_2-row) and in the first φ_1- or φ_2-column (resp. last φ_1- or φ_2-column). (Here spec A denotes the set of eigenvalues of a matrix A.)
Example 5.2 For Φ = (φ_1, φ_2)^T in Example 5.1 we obtain the matrix B after deleting the first and second row and corresponding columns in Ã_0, or by deleting the 5th and 8th row and corresponding columns in Ã_1. Hence,

    9B = ( 3 3 1 2 0 0 )        J_0 = (1/9) ( 1 2 )        J_1 = (1/9) ( 0 3 )
         ( 9 0 3 3 0 0 )                    ( 3 3 ),                   ( 9 0 ),
         ( 0 0 6 0 3 2 )
         ( 0 0 3 0 0 3 )
         ( 0 0 0 0 0 0 )
         ( 9 0 0 0 0 0 ),

where J_0 and J_1 are invertible.
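These identities are easy to check mechanically. The sketch below is our own code, assuming the integer entries of 9Ã_0 and 9Ã_1 as reconstructed in Example 5.1; it builds B both ways and verifies that the two constructions agree and that J_0, J_1 are invertible (the factor 1/9 is immaterial for that).

```python
# integer matrices 9*A0~ and 9*A1~ (assumed entries, cf. Example 5.1)
A0 = [[1,2,0,0,0,0,0,0],
      [3,3,0,0,0,0,0,0],
      [6,0,3,3,1,2,0,0],
      [3,0,9,0,3,3,0,0],
      [0,0,0,0,6,0,3,2],
      [0,0,0,0,3,0,0,3],
      [0,0,0,0,0,0,0,0],
      [0,0,9,0,0,0,0,0]]
A1 = [[3,3,1,2,0,0,0,0],
      [9,0,3,3,0,0,0,0],
      [0,0,6,0,3,3,2,0],
      [0,0,3,0,9,0,3,0],
      [0,0,0,0,0,0,0,3],
      [0,0,0,0,0,0,0,0],
      [9,0,0,0,0,0,0,0],
      [0,0,0,0,9,0,0,0]]

def minor(M, drop):
    """Delete the listed rows and the same-numbered columns (0-based)."""
    keep = [i for i in range(len(M)) if i not in drop]
    return [[M[i][j] for j in keep] for i in keep]

B_from_A0 = minor(A0, {0, 1})   # first phi1-row and first phi2-row
B_from_A1 = minor(A1, {4, 7})   # last phi1-row (5th) and last phi2-row (8th)
J0 = [[A0[i][j] for j in (0, 1)] for i in (0, 1)]   # times 9
J1 = [[A1[i][j] for j in (4, 7)] for i in (4, 7)]   # times 9
det = lambda J: J[0][0] * J[1][1] - J[0][1] * J[1][0]
```

Both constructions give the same 6×6 matrix 9B, and det(9 J_0) = −3, det(9 J_1) = −27 are nonzero.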
We obtain
Theorem 5.3 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions. Further, let Ã_0, Ã_1 and B be given as above. Then we have:
(1) rank(J_0) ≥ 1 and rank(J_1) ≥ 1,
(2) rank(B) ≥ n − 3,
(3) rank(Ã_0) ≥ n − 2 and rank(Ã_1) ≥ n − 2,
(4) |rank(Ã_0) − rank(Ã_1)| ≤ 1.
Proof: (1) First observe that J_0 and J_1 at least have rank 1, since otherwise a component of Ψ̃(x), x ∈ [0,1), would completely vanish, contradicting the definition of Ψ̃.
Let gsupp φ_1 = [α_1, β_1] and gsupp φ_2 = [α_2, β_2]. Then one simple eigenvalue zero of J_0 implies that α_1 ∈ Z, α_2 ∈ Z + 1/2, or vice versa. If J_0 has two eigenvalues 0, then the geometric multiplicity of 0 must be 1, and we obtain α_1 ∈ Z + 1/3, α_2 ∈ Z + 2/3, or vice versa. Analogously, a corresponding behavior of J_1 implies β_1 ∈ Z + 1/2, β_2 ∈ Z or vice versa, and β_1 ∈ Z + 2/3, β_2 ∈ Z + 1/3 or vice versa, respectively.
(2) If the matrix B possesses the eigenvalue zero, then both Ã_0 and Ã_1 possess the eigenvalue zero. Hence Ã_0 M̃ and Ã_1 M̃ are not invertible, while M̃ is an invertible matrix. Thus, by Theorem 2.1, Ã_0 and Ã_1 have a zero row, but not the first or last φ_1- or φ_2-row. Hence B also has a zero row and, by construction, if Ã_0 has the zero row in the l-th φ_i-row, i ∈ {1,2}, then Ã_1 must have a zero row in the (l−1)-th φ_i-row. This means by (5.1) that the two zero rows imply a hole in Φ containing the interval (k − 1/2, k + 1/2) for some k ∈ Z. This hole must be a biggest hole. If B has the eigenvalue zero with geometric multiplicity greater than 1, then with the same arguments one obtains a second biggest hole in Φ. But this contradicts the local linear independence by Lemma 4.6. Hence rank(B) ≥ n − 3.
(3) The above considerations directly imply that rank(Ã_0) ≥ n − 2 and rank(Ã_1) ≥ n − 2.
(4) Now, if Ã_0 has rank n − 2, then B has rank n − 3, and hence Ã_1 can have rank n − 1 at most. Analogously, rank(Ã_1) = n − 2 implies rank(Ã_0) ≤ n − 1.
□
From Theorem 5.3 it follows that we have to investigate the following five cases:
(1) rank(Ã_0) = rank(Ã_1) = n,
(2) rank(Ã_0) = rank(Ã_1) = n − 1,
(3) rank(Ã_0) = rank(Ã_1) = n − 2,
(4) rank(Ã_0) = n − 1, rank(Ã_1) = n,
(5) rank(Ã_0) = n − 1, rank(Ã_1) = n − 2.
All further cases can be reduced to one of the above. However, some of these cases may contradict the local linear independence assumption for Φ.
Considering the first two cases, we obtain a partial answer to the question of whether the support of φ_i, i = 1,2, can have holes. Moreover, we obtain sufficient conditions for the local linear independence of Φ in terms of rank conditions for Ã_0, Ã_1.
For the first case we obtain:
Theorem 5.4 Let Φ = (φ_1, φ_2)^T be a refinable vector of compactly supported, continuous functions. Let the space V = span{Ψ(x) : x ∈ [0,1)} have full dimension, i.e., the matrix M̃ formed by basis vectors of V is an invertible (n×n)-matrix. Let Ã_0, Ã_1 be given as above. Then rank(Ã_0) = rank(Ã_1) = n implies that Φ is locally linearly independent and has no holes.
Proof: The assertion on local linear independence is already proved in [4], Theorem 3.2. Since Ã_0, Ã_1 are invertible, the matrix Ã_{ε_1}···Ã_{ε_n} M̃ never has a zero row; hence from

    Ψ̃(ε_1/2 + ε_2/4 + ··· + ε_n/2^n + x/2^n) = Ã_{ε_1}···Ã_{ε_n} Ψ̃(x),   x ∈ [0,1),     (5.2)

it follows that there is no dyadic interval where φ_1 or φ_2 vanishes. Thus Φ has no holes.
□
For the second case we find
Theorem 5.5 Let Φ = (φ_1, φ_2)^T be a refinable vector of compactly supported, continuous functions. Let the space V = span{Ψ(x) : x ∈ [0,1)} have full dimension, i.e., the matrix M̃ formed by basis vectors of V is an invertible (n×n)-matrix. Let Ã_0, Ã_1 and B be given as above. Further, let rank(Ã_0) = rank(Ã_1) = n − 1, and let each of these matrices have one zero row. Then we have:
(1) If rank(B) = n − 2 and the four matrices Ã_0Ã_0, Ã_0Ã_1, Ã_1Ã_0, Ã_1Ã_1 have rank n − 1, then Φ is locally linearly independent and has no holes.
(2) If rank(B) = n − 3 and the four matrices Ã_0Ã_0, Ã_0Ã_1, Ã_1Ã_0, Ã_1Ã_1 have rank n − 1, then Φ is locally linearly independent and has one hole of the form (k − 1/2, k + 1/2) for some k ∈ Z.
Proof: (1) We consider the first case. Since rank(B) = n − 2, it follows that B is invertible and the zero row of Ã_0 must be the first φ_1-row or the first φ_2-row. Analogously, the zero row of Ã_1 must be the last φ_1- or φ_2-row. Since rank(Ã_0Ã_0) = rank(Ã_1Ã_1) = n − 1, it follows that J_0 and J_1 only have a simple eigenvalue zero, and the assumptions (1) of the theorem imply that all matrix products Ã_{ε_1}···Ã_{ε_n} M̃, n ∈ N, have rank n − 1 and one zero row, namely the same as Ã_0 if ε_1 = 0 and the same as Ã_1 if ε_1 = 1. The assumption on V in the theorem already ensures that Φ is linearly independent on (0,1). Now the above observations also imply that, by Theorem 2.1, Φ is locally linearly independent.
The zero row in Ã_0 implies that the support of one component of Φ starts with an integer and the support of the other with a half integer. Considering the zero row in Ã_1, we also find that the support of one component ends with an integer and the support of the other with a half integer. In particular, from (5.2) it follows that Φ cannot have holes.
(2) We consider the second case. Since rank(B) = n − 3, it follows that B possesses the eigenvalue zero and the zero rows of Ã_0 and Ã_1 are not the first or the last φ_1- or φ_2-rows. Moreover, as shown in the proof of Theorem 5.3, if the l-th φ_i-row, i ∈ {1,2}, of Ã_0 is a zero row, then the (l−1)-th φ_i-row of Ã_1 is also a zero row, and this implies by (5.1) a hole of the form (k − 1/2, k + 1/2) for some k ∈ Z in φ_i. Further, the rank conditions (2) of the theorem imply that all matrix products Ã_{ε_1}···Ã_{ε_n} M̃, n ∈ N, have rank n − 1 and either a zero row in the l-th or in the (l−1)-th row. Thus, by Theorem 2.1, Φ is locally linearly independent and has only one hole.
□
Remark 5.6 Example 4.1 satisfies the assumptions of Theorem 5.5 (2). An example satisfying Theorem 5.5 (1) can be found in [10].
Observe that case (2) is not completely settled by Theorem 5.5, since for rank(Ã_0) = rank(Ã_1) = n − 1 some of the four matrices Ã_0Ã_0, Ã_0Ã_1, Ã_1Ã_0, Ã_1Ã_1 can also have rank n − 2. Indeed, there exist locally linearly independent function vectors where rank(Ã_0) = rank(Ã_1) = n − 1 and rank(Ã_0Ã_0) = rank(Ã_1Ã_1) = n − 2; see [10]. The remaining cases are more complicated to handle, and we cannot give a final answer to the question of whether a locally linearly independent refinable vector Φ can have more than one hole.
6 Proof of the example
In this section we want to verify the assertion that the function vector Φ given by the refinement mask in Example 4.1 is continuous and locally linearly independent. Let us first prove that Φ is continuous. To this end we use the following observation by Jia, Riemenschneider and Zhou [9]:
Let {A(k)}_{k=0}^{N} be a real refinement mask satisfying the following properties:
(1) (1/2) Σ_{k=0}^{N} A(k) has one eigenvalue 1 and all further eigenvalues are inside the unit circle.
(2) The matrices A_0 and A_1 both have the simple eigenvalue 1, and there is a vector e_1 ∈ R^{rN} with e_1^T A_0 = e_1^T A_1 = e_1^T.
(3) Considering the space U = {u ∈ R^{rN} : e_1^T u = 0}, the joint spectral radius of A_0|_U and A_1|_U satisfies ρ(A_0|_U, A_1|_U) < 1.
Then the subdivision scheme associated with {A(k)}_{k=0}^{N} converges in the maximum norm, and hence the solution vector Φ of the refinement equation is continuous.
Here the joint spectral radius satisfies, for any matrix norm,

    ρ(A_0|_U, A_1|_U) = inf_{n≥1} ( max{ ‖A_{ε_1}|_U ··· A_{ε_n}|_U‖ : ε_i ∈ {0,1}, i = 1,…,n } )^{1/n}.
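Since the infimum may be taken with any submultiplicative matrix norm, each fixed product length n yields a rigorous upper bound on the joint spectral radius by brute force. The sketch below is our own illustration on a toy pair of 2×2 matrices (not the 13-dimensional restrictions used in the example that follows), using the maximum-row-sum norm.

```python
from itertools import product

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(X):
    """Maximum absolute row sum; any submultiplicative norm would do."""
    return max(sum(abs(x) for x in row) for row in X)

def jsr_upper_bound(mats, n):
    """(max over all 2**n products of length n of the norm)**(1/n)."""
    best = 0.0
    for word in product(mats, repeat=n):
        P = word[0]
        for M in word[1:]:
            P = mat_mul(P, M)
        best = max(best, norm_inf(P))
    return best ** (1.0 / n)

# toy pair with joint spectral radius 1/2:
A = [[0.5, 0.0], [0.0, 0.5]]
B = [[0.5, 0.5], [0.0, 0.5]]
# the bounds decrease towards 1/2 as n grows:
# jsr_upper_bound([A, B], 1) == 1.0, jsr_upper_bound([A, B], 6) ~ 0.69
```

In the paper's example the same computation is done with the spectral norm and n = 3, which already gives a bound below 1.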
For our example we find:
1) (1/2) Σ_{k=0}^{7} A(k) = ( 5/9  5/18 )
                            ( 4/3  1/6  )
possesses the eigenvalues 1 and −5/18.
2) The matrices A_0 and A_1 both have the simple eigenvalue 1 with the left eigenvector
e_1^T = (3,1,3,1,3,1,3,1,3,1,3,1,3,1).
3) The space U = {u ∈ R^{14} : e_1^T u = 0} has dimension 13 and we find the orthonormal basis of U:
    u_1  = 28^{−1/2} (4,0,0,0,−3,−1,0,0,0,−1,0,0,0,−1)^T,
    u_2  = 110^{−1/2} (0,0,0,0,−3,−1,0,0,0,0,0,0,0,10)^T,
    u_3  = 130^{−1/2} (−3,0,0,0,−3,−1,−3,0,0,−1,0,0,10,−1)^T,
    u_4  = 132^{−1/2} (0,0,0,0,−3,−1,0,0,0,11,0,0,0,−1)^T,
    u_5  = 70^{−1/2} (−3,0,0,0,−3,−1,7,0,0,−1,0,0,0,−1)^T,
    u_6  = 208^{−1/2} (−3,0,0,0,−3,−1,−3,0,0,−1,13,0,−3,−1)^T,
    u_7  = 3540^{−1/2} (−3,−1,−3,0,−3,−1,−3,59,0,−1,−3,−1,−3,−1)^T,
    u_8  = 3660^{−1/2} (−3,−1,−3,60,−3,−1,−3,−1,0,−1,−3,−1,−3,−1)^T,
    u_9  = 2352^{−1/2} (−3,48,0,0,−3,−1,−3,0,0,−1,−3,0,−3,−1)^T,
    u_10 = 3422^{−1/2} (−3,−1,−3,0,−3,−1,−3,0,0,−1,−3,58,−3,−1)^T,
    u_11 = 4270^{−1/2} (−9,−3,−9,−3,−9,−3,−9,−3,61,−3,−9,−3,−9,−3)^T,
    u_12 = 10^{−1/2} (0,0,0,0,1,−3,0,0,0,0,0,0,0,0)^T,
    u_13 = 2842^{−1/2} (−9,−3,49,0,−9,−3,−9,0,0,−3,−9,0,−9,−3)^T.
The matrix representations of A_0|_U, A_1|_U under this basis are A_0|_U = ((A_0 u_j)^T u_k)_{j,k=1}^{13} and A_1|_U = ((A_1 u_j)^T u_k)_{j,k=1}^{13}, and a computation with Maple gives for the spectral norm

    ( max{ ‖A_{ε_1}|_U A_{ε_2}|_U A_{ε_3}|_U‖_2 : ε_1, ε_2, ε_3 ∈ {0,1} } )^{1/3} < 0.95.
Hence Φ is continuous.
Let us prove the local linear independence of Φ. Here we use Theorem 2.1 and a procedure proposed by Goodman, Jia and Zhou [4]. The space V ⊂ R^{14} (as given in Section 2) is spanned by the vector v_0 = (0,0,9/5,38/15,6/5,1,0,0,0,9/5,0,0,0,0)^T and by A_1v_0, A_0A_1v_0, A_1A_1v_0, A_0A_0A_1v_0, A_1A_0A_1v_0, A_0A_0A_0A_1v_0, A_1A_0A_0A_1v_0. Here v_0 is a right eigenvector of A_0 to the eigenvalue 1. Hence dim V = 8. Forming the matrix M, we observe that the 7th, the 9th and the last four rows of M are zero rows. Hence gsupp φ_1 = [0,3] and gsupp φ_2 = [0,5]. The remaining 8 rows of M are linearly independent. Thus Φ is linearly independent on (0,1) by Theorem 2.1.
We can restrict our considerations to the shortened matrices Ã_0, Ã_1 as given in Example 5.1. Further, we can choose the matrix M̃ as the identity matrix. The procedure proposed in [4] gives rank Ã_0 = rank Ã_0Ã_0 = rank Ã_0Ã_1 = 7, where the 7th rows are zero, and rank Ã_1 = rank Ã_1Ã_1 = rank Ã_1Ã_0 = 7, where the 6th rows are zero.
Hence Φ is locally linearly independent. Moreover, φ_2 possesses a hole of length 1, namely φ_2(x) = 0 for x ∈ (5/2, 7/2).
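The rank statements above can be checked with exact integer arithmetic. The following sketch is ours, assuming the entries of 9Ã_0, 9Ã_1 as reconstructed in Example 5.1 (scaling by 9 does not change ranks or zero rows).

```python
from fractions import Fraction

A0 = [[1,2,0,0,0,0,0,0],
      [3,3,0,0,0,0,0,0],
      [6,0,3,3,1,2,0,0],
      [3,0,9,0,3,3,0,0],
      [0,0,0,0,6,0,3,2],
      [0,0,0,0,3,0,0,3],
      [0,0,0,0,0,0,0,0],
      [0,0,9,0,0,0,0,0]]
A1 = [[3,3,1,2,0,0,0,0],
      [9,0,3,3,0,0,0,0],
      [0,0,6,0,3,3,2,0],
      [0,0,3,0,9,0,3,0],
      [0,0,0,0,0,0,0,3],
      [0,0,0,0,0,0,0,0],
      [9,0,0,0,0,0,0,0],
      [0,0,0,0,9,0,0,0]]

def rank(M):
    """Rank via row reduction over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

ranks0 = [rank(A0), rank(mul(A0, A0)), rank(mul(A0, A1))]   # expected all 7
ranks1 = [rank(A1), rank(mul(A1, A1)), rank(mul(A1, A0))]   # expected all 7
zero_row = lambda M, i: all(x == 0 for x in M[i])
```

With these entries, ranks0 == ranks1 == [7, 7, 7], the 7th rows (index 6) of Ã_0 and its products are zero, and the 6th rows (index 5) of Ã_1 and its products are zero, in agreement with the procedure of [4].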
7 Conclusions
In Section 3 we have presented an algorithm to compute the global supports of the r components of a compactly supported refinable function vector Φ from the refinement mask. The rest of the paper was restricted to r = 2.
While in the scalar case local linear independence of a refinable function φ guarantees that the support of φ is an integer interval without holes, this is no longer the case for r > 1. As we have seen in Section 4, a function vector Φ = (φ_1, φ_2)^T can only have holes if the lengths l_1 and l_2 of the global supports of φ_1, φ_2 satisfy l_2/2 < l_1 < l_2. As another property, it has been shown that the endpoints of a hole cannot be integers. Further, Φ can have at most one biggest hole.
In Section 5 we have investigated matrices derived from the refinement mask. In Theorem 5.3 some results on the rank of these matrices are obtained, leaving five different cases to be investigated. The first case has been solved completely in Theorem 5.4. The second case has been settled partially in Theorem 5.5. For the other cases we cannot give a final answer. However, if Ã_0 and Ã_1 have different rank (as in case (4) and case (5)), then one can show by Theorem 2.1 that Φ must have infinitely many holes. In case (4) this can be seen as follows. Since rank(Ã_0) = n − 1, it follows that rank(Ã_1^k Ã_0) = n − 1 for k = 0,1,…. Hence, by Theorem 2.1, Ã_1^k Ã_0 has a zero row for all k = 0,1,…, implying that Ψ̃ contains vanishing intervals of the form [l_k + (2^k − 1)/2^k, l_k + (2^k − 1/2)/2^k) with suitable integers l_k. Here l_k cannot be the same integer for all k = 0,1,2,…; in particular one finds l_k ≠ l_{k+1}, k ∈ N. Hence Ψ̃ has infinitely many holes. This observation leads to the following
Conjecture 7.1 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions. Then Φ cannot have more than one but finitely many holes; that is, the number of holes of Φ is either at most one or infinite.
Our numerical computations, however, lead to the hypothesis that the cases (3), (4) and (5) contradict the property of local linear independence. So we obtain:

Conjecture 7.2 Let Φ = (φ_1, φ_2)^T be a refinable, locally linearly independent vector of compactly supported, continuous functions. Then Φ cannot have infinitely many holes.
Acknowledgment The author thanks the referees for their valuable suggestions to
improve the paper.
Bibliography
1. C. de Boor, R. A. DeVore, and A. Ron, Approximation orders of FSI spaces in L_2(R^d), Constr. Approx. 14 (1998), 411-427.
2. H. L. Cheung, C. Tang, and D.-X. Zhou, Supports of locally linearly independent M-refinable functions, attractors of iterated function systems and tilings, preprint, 2001.
3. W. Dahmen and C. A. Micchelli, Biorthogonal wavelet expansions, Constr. Approx. 13 (1997), 293-328.
4. T. N. T. Goodman, R. Q. Jia, and D.-X. Zhou, Local linear independence of refinable vectors of functions, Proc. R. Soc. Edinb. 130 (2000), 813-826.
5. T. A. Hogan, Stability and independence of the shifts of finitely many refinable functions, J. Fourier Anal. Appl. 3 (1997), 757-774.
6. K. Jetter and G. Plonka, A survey on L_2-approximation order from shift-invariant spaces, in Multivariate Approximation and Applications, N. Dyn, D. Leviatan, D. Levin, and A. Pinkus (eds.), Cambridge University Press, 2001, 73-111.
7. R. Q. Jia, Shift-invariant spaces on the real line, Proc. Amer. Math. Soc. 125 (1997), 785-793.
8. R. Q. Jia and C. A. Micchelli, On linear independence of integer translates of a finite number of functions, Proc. Edinburgh Math. Soc. 36 (1992), 69-85.
9. R. Q. Jia, S. D. Riemenschneider, and D.-X. Zhou, Vector subdivision schemes and multiple wavelets, Math. Comp. 67 (1998), 1533-1563.
10. G. Plonka and D.-X. Zhou, Properties of locally linearly independent refinable function vectors, preprint, 2001.
11. A. Ron and Z. Shen, The Sobolev regularity of refinable functions, J. Approx. Theory 106 (2000), 185-225.
12. Q. Y. Sun, Two-scale difference equation: local and global linear independence, manuscript, 1991.
13. J.-Z. Wang, Linear independence relations of the shifts of a vector-valued distribution, manuscript, 2001.
The correlation between the convergence of subdivision processes and solvability of refinement equations
Vladimir Protasov
Department of Mechanics and Mathematics, Moscow State University, Moscow.
protasov@dionis.iasnet.ru
Abstract
We consider the univariate two-scale refinement equation φ(x) = Σ_{k=0}^{N} c_k φ(2x − k), where c_0,…,c_N are complex values and Σ c_k = 2. This paper analyses the correlation between the existence of smooth compactly supported solutions of this equation and the convergence of the corresponding cascade algorithm/subdivision scheme. In the work [11] we have introduced a criterion that expresses this correlation in terms of the mask of the equation. It is shown that the convergence of the subdivision scheme depends on the values that the mask takes at the points of its generalized cycles. In this paper we show that the criterion is sharp in the sense that an arbitrary generalized cycle causes the divergence of a suitable subdivision scheme. To do this we construct a general method to produce divergent subdivision schemes having smooth refinable functions. The criterion therefore establishes a complete classification of divergent subdivision schemes.
1 Introduction
Refinement equations have been studied by many authors in great detail in connection with their role in the theory of wavelets and of subdivision schemes in approximation theory and the design of curves and surfaces (see [1-14]). In this paper we study a criterion for the convergence of subdivision processes having smooth refinable functions. This criterion was presented in the work [11]. In particular, we show that the criterion is sharp in the sense that each of its cases is realized. To do this we provide a general procedure for constructing divergent subdivision schemes (or cascade algorithms) corresponding to smooth refinable functions.
We restrict ourselves to univariate equations with a compactly supported mask. Throughout the paper we denote by T = R/2πZ the unit circle, by H the space of entire functions on C, by C^l the space of l times continuously differentiable functions on R, by C^0 = C the space of continuous functions, by C_0^l the space of compactly supported functions from C^l, and by C_0 the space of compactly supported continuous functions on R. A sequence {f_k} converges to zero in C_0^l if it converges to zero in C^l and the supports of the f_k, k ∈ N, are uniformly bounded.
Consider a refinement equation

    φ(x) = Σ_{k=0}^{N} c_k φ(2x − k),     (1.1)
where c_k ∈ C and Σ_k c_k = 2. The trigonometric polynomial m(ξ) = (1/2) Σ_{k=0}^{N} c_k e^{−ikξ} is the mask of this equation. It is well known that a C_0-solution of this equation (a refinable function), if it exists at all, is unique up to normalization and has its support on the segment [0, N]. For a given mask m we denote by [m] the corresponding refinement equation. Let us also define the following subspaces M^l and E^l of the space C_0: the Fourier transform of a function from M^l has zeros of order ≥ l + 1 at all the points 2πk, k ∈ Z; the Fourier transform of a function from E^l has a zero at the point ξ = 0 and zeros of order ≥ l + 1 at all the points 2πk, k ∈ Z \ {0}. Let us also denote E = E^0 = M^0.
The cascade algorithm for refinement equations is the construction of the sequence f_n = T f_{n−1} for some initial function f_0 ∈ C_0, where T f(x) = Σ_k c_k f(2x − k) is the subdivision operator associated to equation (1.1). This operator is defined on the space C_0 and preserves all the subspaces C_0^l, E^l. If f_n converges in the space C^l to a function φ ∈ C_0^l (l ≥ 0), then obviously it converges in C_0 and φ is the solution of (1.1). Moreover, in that case the function g = f_0 − φ necessarily belongs to E^l (see [1], [5]). Thus we say that the cascade algorithm converges in C^l if T^n g → 0 as n → ∞ for any g ∈ E^l. Properties of cascade algorithms have been studied by many authors in various contexts. The algorithm gives a simple way to approximate refinable functions and wavelets. On the other hand, the convergence of the cascade algorithm is equivalent to the convergence of the corresponding subdivision scheme ([4]). For a given mask m(ξ) we say that the subdivision process {m} converges in C^l if the corresponding cascade algorithm, or the corresponding subdivision scheme, converges in that space.
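The cascade/subdivision iteration is easy to run numerically. Below is a minimal sketch (not from the paper): one subdivision step maps a sequence (data_k) to out_i = Σ_k c_{i−2k} data_k; starting from a delta sequence with the hat-function mask c = (1/2, 1, 1/2) (our choice of test mask), the iterates are exact samples of the hat function φ(x) = max(0, 1 − |x − 1|).

```python
import numpy as np

def subdivide(c, data):
    """One subdivision step: out[i] = sum_k c[i - 2k] * data[k]."""
    c = np.asarray(c, dtype=float)
    out = np.zeros(2 * len(data) + len(c))
    for k, v in enumerate(data):
        out[2 * k : 2 * k + len(c)] += v * c
    return out

# iterate from a delta sequence with the hat mask c = (1/2, 1, 1/2)
vals = np.array([1.0])
for _ in range(5):
    vals = subdivide([0.5, 1.0, 0.5], vals)
# vals now samples the hat function: peak value 1, Riemann sum 1
```

For this mask the scheme converges; a mask violating the convergence criterion discussed in this paper would produce iterates whose sup-norm fails to settle.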
It is clear that if a subdivision process converges in C^l, then the corresponding refinement equation has a C_0^l-solution. In general the converse is not true; corresponding examples are well known (see [1], [2], [13] for general discussions of this aspect). A natural question arises: under which extra conditions does the solvability of a refinement equation imply the convergence of the subdivision process?
1) A necessary condition (first introduced in [6]):

If a subdivision process {m} converges in C^l, then its mask can be factored as

m(ξ) = ((1 + e^{−iξ})/2)^{l+1} a(ξ)     (1.2)

for some trigonometric polynomial a(ξ). In particular, the condition

m(ξ) = ((1 + e^{−iξ})/2) a(ξ), i.e., Σ_k c_{2k} = Σ_k c_{2k+1} = 1,     (1.3)

is necessary for the convergence of the subdivision process in C. Let us recall that for the existence of smooth solutions of a refinement equation this condition is not necessary (there is a weaker condition for this, see [10]).
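Condition (1.3) is a direct check on the coefficients. A minimal sketch (the helper name is ours, not the paper's):

```python
def satisfies_condition_13(c, tol=1e-12):
    """(1.3): the even- and odd-indexed coefficient sums both equal 1,
    equivalently the mask m vanishes at xi = pi."""
    return abs(sum(c[0::2]) - 1) < tol and abs(sum(c[1::2]) - 1) < tol
```

For example, the hat-function mask coefficients (1/2, 1, 1/2) satisfy (1.3), while (2, 0) do not.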
For a given mask m denote by l(m) the maximal integer l such that condition (1.2) is satisfied. So if a subdivision process {m} converges in C^k, then k ≤ l(m).
2) A sufficient condition (introduced in [1], developed in [8],[14],[7],[9]):
Vladimir Protasov
Suppose a mask m satisfying (1.2) for some l ≥ 0 has neither symmetric roots nor cycles; then if the equation [m] has a C_0^l-solution, the process {m} converges in C^l.
Let us recall the notation used in this statement. If, for a trigonometric polynomial p(ξ) and for some α ∈ T, we have p(α/2) = p(π + α/2) = 0, then {α/2, π + α/2} is a pair of symmetric roots for p(ξ). In order that this be well defined, we agree that for any α ∈ T the element α/2 ∈ T has its corresponding real value in the half-interval [0, π). Further, a given set b = {β_1, ..., β_n} ⊂ T, where n ≥ 2, is called cyclic if 2b = b, i.e., 2β_j = β_{j+1} for j = 1, ..., n (we set β_{n+1} = β_1). We consider only irreducible cyclic sets, for which all the elements are different. Note that if two cyclic sets do not coincide, then they are disjoint. A cyclic set b is called a cycle of a trigonometric polynomial p if p(b + π) = 0, i.e., p(β + π) = 0 for all β ∈ b.
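The doubling condition 2b = b can be checked mechanically. A small sketch (ours; the set b = {2π/3, 4π/3} reappears in the examples of this paper):

```python
import math

TWO_PI = 2 * math.pi

def is_cyclic(b, tol=1e-9):
    """A finite set b in [0, 2*pi) is cyclic iff doubling mod 2*pi permutes b (2b = b)."""
    doubled = sorted((2 * x) % TWO_PI for x in b)
    return all(abs(u - v) < tol for u, v in zip(sorted(b), doubled))

b = [2 * math.pi / 3, 4 * math.pi / 3]   # doubling swaps the two elements
```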
It is well known that the sufficient condition (2) for a mask m is equivalent to the stability of the corresponding refinable function (i.e., the integer translates of the refinable function possess the Riesz basis property in L_2(R)). It is also equivalent to saying that the mask satisfies Cohen's criterion (see for example [5, Proposition 2.4]). Actually, condition (2) was formulated for the case l = 0 only, but it can easily be extended to general l. This is seen, for instance, from Theorem 2.2 of this paper.
Thus we have one necessary and one sufficient condition for the convergence of subdivision processes having smooth refinable functions. It was a natural problem to fill this gap and to elaborate a criterion in "if and only if" terms. In 1998 two attempts were made independently of each other and almost simultaneously: the work [9] by M. Neamtu and my work [11]. Those two criteria were very similar, but different. Moreover, it turned out that our results were actually incompatible. We will discuss this aspect after formulating the main result of the work [11].
2 A criterion for convergence
We give a criterion for convergence of a subdivision process under the condition that the corresponding refinement equation has a smooth solution. We will see that symmetric roots of the mask do not influence the convergence of subdivision processes. This means, in particular, that the stability of solutions is not necessary for the convergence. The convergence depends entirely on the values of the mask at the points of so-called generalized cycles.
Everywhere below we consider trigonometric polynomials without positive powers, i.e., polynomials of the form p(ξ) = Σ_{k=0}^{N} a_k e^{−ikξ}. As usual we set deg p = N (assuming a_0 a_N ≠ 0). To a given value α ∈ T we assign a binary tree, denoted in the sequel by T_α. To every vertex of this tree we associate a value from T as follows: put α at the root, then put α/2 and π + α/2 at the vertices of the first level (the level of a vertex is its distance from the root; the root has level 0). If a value γ is associated to a vertex on the n-th level, then the values γ/2 and π + γ/2 are associated to its neighbors on the (n + 1)-st level. Thus the values (α + 2πk)/2^n, k = 0, ..., 2^n − 1, appear on the n-th level of the tree T_α. A set of vertices A of the tree T_α is called a minimal cut set if every infinite path (all the paths are without backtracking) starting at the root includes exactly one element of A. For instance, the one-element set A = {root} is a minimal cut set. Every minimal cut set is finite.

Definition 2.1 A set b = {β_1, ..., β_n} ⊂ T is called a generalized cycle of a polynomial p if it is cyclic and for any j = 1, ..., n the tree T_{β_j+π} possesses a minimal cut set A_j such that p(A_j) = 0.
The family {A_1, ..., A_n} is said to be the sets of zeros of the generalized cycle b. Let us remark that for a given generalized cycle the sets of zeros may not be defined in a unique way. Any (regular) cycle of p(ξ) is also a generalized cycle; in this simplest case each minimal cut set A_j is the root of the corresponding tree T_{β_j+π}. On the other hand, not every generalized cycle is a regular cycle. For example, the polynomial p(ξ) = (e^{−iξ} − e^{−iπ/3})(e^{−2iξ} − e^{iπ/3}) has no regular cycles, but it has a generalized cycle b = {β_1, β_2} = {2π/3, 4π/3}. Indeed, this polynomial has three zeros on the period: π/3, −π/6, 5π/6 ∈ T. The set A_1 = {−π/6, 5π/6} is a minimal cut set for the point β_1 + π, A_2 = {π/3} is a minimal cut set for β_2 + π, and p(A_1) = p(A_2) = 0. Roughly speaking, each cyclic set {β_1, ..., β_n} has a unique corresponding cycle (the family of zeros is {β_1 + π}, ..., {β_n + π}) and a variety of generalized cycles (all possible sets of zeros {A_1, ..., A_n}, where A_j is an arbitrary minimal cut set of the tree T_{β_j+π}, j = 1, ..., n). Note that if at least one set A_j differs from the root β_j + π, then it necessarily contains a pair of symmetric roots of p. Therefore, if the polynomial p has no symmetric roots, then all its generalized cycles, if there are any, are regular cycles.
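The example can be verified numerically. The sketch below (ours) checks that the polynomial of the example vanishes on the cut sets A_1 and A_2 but not at β_1 + π = 5π/3 itself, so b is a generalized cycle without being a regular cycle:

```python
import cmath, math

def p(xi):
    # the polynomial of the example, in the variable z = e^{-i xi}
    return ((cmath.exp(-1j * xi) - cmath.exp(-1j * math.pi / 3))
            * (cmath.exp(-2j * xi) - cmath.exp(1j * math.pi / 3)))

A1 = [-math.pi / 6, 5 * math.pi / 6]   # minimal cut set in the tree at beta_1 + pi
A2 = [math.pi / 3]                     # the root of the tree at beta_2 + pi
```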
For any trigonometric polynomial p and any finite subset Y = {α_1, ..., α_n} ⊂ T we denote ρ_p(Y) = (∏_{j=1}^{n} |p(α_j)|)^{1/n}. This is a multiplicative function of p on the set of trigonometric polynomials: ρ_{pq}(Y) = ρ_p(Y) ρ_q(Y).
Now we formulate the criterion of convergence of subdivision processes.

Theorem 2.2 Suppose a refinement equation [m] has a C_0^l-solution for some l ≥ 0. Then the process {m} converges in C^l if and only if the mask m satisfies (1.2) and for any generalized cycle b of the mask m we have ρ_m(b) < 2^{−l}.

In particular, for l = 0, this means that a subdivision process {m} whose refinement equation has a continuous solution converges if and only if ρ_m(b) < 1 for every generalized cycle b of the mask. Another corollary is condition (2) from Section 1. Indeed, if a mask has neither symmetric roots nor cycles, then it has no generalized cycles either. Hence, by Theorem 2.2, the subdivision process must converge.
Example 2.3 Consider the mask

m(ξ) = (0.2 + 0.5e^{−iξ} + 0.3e^{−2iξ})(e^{−iξ} − e^{−iπ/3})^2 (e^{−2iξ} − e^{iπ/3})^2.     (2.1)

The corresponding equation [m] has a C_0-solution; this is shown in Example 4.5. The polynomial m has a unique generalized cycle b = {2π/3, 4π/3}, the same as in the previous example, with the same sets of zeros A_1 = {−π/6, 5π/6}, A_2 = {π/3}. Actually this is not one but two coinciding generalized cycles, if we count roots with multiplicity. We have

(ρ_m(b))^2 = |m(2π/3) m(4π/3)| = |−0.2 − 0.1√3 i| · |−0.2 + 0.1√3 i| · 16 = 0.07 · 16 = 1.12 > 1.
Hence the subdivision process {m} diverges.
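The divergence computation of Example 2.3 can be reproduced in a few lines (a sketch, not from the paper):

```python
import cmath, math

def m(xi):
    a = 0.2 + 0.5 * cmath.exp(-1j * xi) + 0.3 * cmath.exp(-2j * xi)
    f1 = (cmath.exp(-1j * xi) - cmath.exp(-1j * math.pi / 3)) ** 2
    f2 = (cmath.exp(-2j * xi) - cmath.exp(1j * math.pi / 3)) ** 2
    return a * f1 * f2

b = [2 * math.pi / 3, 4 * math.pi / 3]
rho_sq = abs(m(b[0]) * m(b[1]))        # (rho_m(b))**2, should be 1.12
```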
3 Statement of the problem
Most examples of divergent subdivision schemes (having smooth refinable functions) are constructed for some special class of masks. These are either masks of the form m(ξ) = p(nξ) for some polynomial p and an odd integer n, or at least masks whose associated matrix B = (c_{2i−j})_{i,j∈{0,...,N}} has a multiple eigenvalue 1. The divergence of such schemes is well known and does not require any special criterion. A natural question arises: does one really need the criterion of Theorem 2.2 to detect divergent processes? Maybe the family of generalized cycles is too wide to describe unstable subdivision schemes. In general there is no evidence that the condition ρ_m(b) > 1 can be combined with the existence of a smooth solution for the mask m. In this paper we are going to show that Theorem 2.2 indeed characterizes the family of unstable subdivision processes properly. We show that each generalized cycle can cause the divergence of a suitable scheme. On the other hand, we will see that every converging subdivision scheme can be "spoiled" by some generalized cycle.
4 Preliminary results. Reductions of masks
To construct examples of divergent processes we need some auxiliary results. The first of them establishes two properties of cyclic sets. The proof of this lemma is an easy exercise for the reader.

Lemma 4.1 a) Let b be a cyclic set and α ∈ T. Then for the polynomials p_1(ξ) = e^{−iξ} − e^{−iα} and p_2(ξ) = e^{−2iξ} − e^{−iα} we have ρ_{p_1}(b) = ρ_{p_2}(b).

b) Let b_1 and b_2 be cyclic sets and p(ξ) = ∏_{β∈b_1}(e^{−iξ} + e^{−iβ}). Then ρ_p(b_2) = 1 if b_1 ≠ b_2, and ρ_p(b_2) = 2 if b_1 = b_2.

Now turn back to the subdivision schemes. For a given integer l ≥ 0, a mask m and a function f ∈ C^l, denote ν_l(m, f) = −lim sup_{n→∞} (1/n) log_2 ‖T^n f‖, where T is the subdivision operator associated to m (we set log_2 0 = −∞). The value ν_l(m) = inf_{f∈E^l} ν_l(m, f) is the degree of convergence of the process {m} in the space C^l.
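Both parts of Lemma 4.1 are easy to confirm numerically on the cyclic set b = {2π/3, 4π/3}. A small sketch (ours):

```python
import cmath, math

b = [2 * math.pi / 3, 4 * math.pi / 3]            # a cyclic set

def rho(p, Y):
    """rho_p(Y) = (prod over Y of |p(y)|) ** (1/|Y|)."""
    prod = 1.0
    for y in Y:
        prod *= abs(p(y))
    return prod ** (1.0 / len(Y))

alpha = 1.234                                      # an arbitrary point of T
p1 = lambda xi: cmath.exp(-1j * xi) - cmath.exp(-1j * alpha)
p2 = lambda xi: cmath.exp(-2j * xi) - cmath.exp(-1j * alpha)

def pb(xi):                                        # the polynomial of part b), with b1 = b
    out = 1 + 0j
    for beta in b:
        out *= cmath.exp(-1j * xi) + cmath.exp(-1j * beta)
    return out
```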
For every mask m we have ν_l(m) ≤ l + 1 (see [3]). Furthermore, it was shown in [3] and [2] that a process {m} converges in C^l if and only if ν_l(m) > l. In particular, the inequality ν_0(m) > 0 means that {m} converges in C. Let L be the maximal integer such that {m} converges in C^L (if the process {m} does not converge in C, then we nevertheless set L = 0). The value ν_L(m) is said to be the degree of convergence of the process {m} and is denoted in the sequel by ν(m). If ν(m_1) = ν(m_2), then ν_l(m_1) = ν_l(m_2) for all l.
For a given refinement equation [m] denote by L(m) the maximal integer L such that the corresponding refinable function φ belongs to C^L. If this equation has no continuous compactly supported solution, we set L(m) = −1. The smoothness of the refinable function φ is the value s(m) = L + h, where h is the Hölder exponent of the L-th derivative φ^{(L)} on R. It is well known that a refinable function belongs to C^l if and only if s(m) > l (the equality s(m) = l is impossible). In particular, a refinement equation has a C_0-solution if and only if s(m) > 0.
Now we can describe the procedure of reduction of subdivision schemes introduced
in [11]. This reduction makes it possible to get rid of both symmetric roots and cycles.
4.1 Elimination of symmetric roots
Let p(ξ) be a given trigonometric polynomial (recall that we consider polynomials without positive powers). Assume that p possesses a pair of symmetric roots {α/2, π + α/2}. The transfer from p(ξ) to the polynomial p_α(ξ) = p(ξ)(e^{−iξ} − e^{−iα})/(e^{−2iξ} − e^{−iα}) is said to be a transfer to the previous level. The inverse transfer, from p_α to p, is a transfer to the next level. So a transfer to the previous level reduces a pair of symmetric roots {α/2, π + α/2} to the single root α.
Proposition 4.2 Let a mask m̃ be obtained from a mask m by a transfer to the previous level. Then s(m̃) = s(m). Moreover, ν(m̃) = ν(m) whenever l(m̃) = l(m).

(The constant l(m) responsible for condition (1.2) was defined in Section 1.) This implies, in particular, that the reduced equation [m̃] possesses a smooth compactly supported solution if and only if the initial equation [m] does, and the same is true for the convergence of the corresponding subdivision schemes. Thus a transfer to the next (previous) level does not change the smoothness of solutions. It also respects the rate of convergence of subdivision processes, provided the transfer does not violate condition (1.2) (a transfer to the previous level may increase the value l(m)). Using this proposition one can successively eliminate all symmetric roots of a given mask.
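A transfer to the previous level is plain polynomial division in the variable z = e^{−iξ}. The sketch below (ours; note that numpy's poly routines use descending coefficient order) is one way to implement it:

```python
import numpy as np

def transfer_previous_level(p_asc, alpha):
    """Replace the symmetric-root factor z**2 - e^{-i alpha} of p (z = e^{-i xi})
    by the single-root factor z - e^{-i alpha}. Coefficients ascending in z."""
    w = np.exp(-1j * alpha)
    quot, rem = np.polydiv(p_asc[::-1], np.array([1, 0, -w]))  # descending order
    if not np.allclose(rem, 0):
        raise ValueError("mask has no symmetric pair {alpha/2, pi + alpha/2}")
    return np.polymul(quot, np.array([1, -w]))[::-1]

# demo: p = (z - e^{-0.7i})(z^2 - e^{-2.0i}) reduces to (z - e^{-0.7i})(z - e^{-2.0i})
w0, w1 = np.exp(-1j * 0.7), np.exp(-1j * 2.0)
p = np.polymul([1, -w0], [1, 0, -w1])[::-1]
reduced = transfer_previous_level(p, 2.0)
```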
4.2 Elimination of regular cycles
Let a polynomial p possess a cycle b. The transfer from p(ξ) to the polynomial p̃(ξ) = p(ξ)/∏_{β∈b}(e^{−iξ} + e^{−iβ}) is called an elimination of the cycle.

Proposition 4.3 Let a mask m̃ be obtained from a mask m by eliminating a cycle b. Then s(m̃) = s(m) and 2^{−ν(m)} = max{2^{−ν(m̃)}, ρ_m(b)}.

Thus the equation [m] possesses a smooth compactly supported solution if and only if the equation [m̃] does. Moreover, the process {m} converges in C^l if and only if the process {m̃} does and in addition ρ_m(b) < 2^{−l}.
See [11] for the proofs of Propositions 4.2 and 4.3. Now it becomes clear how to establish Theorem 2.2. First we successively eliminate all symmetric roots. By Proposition 4.2 this changes neither the smoothness of the solution nor the rate of convergence (if the initial mask satisfied condition (1.2)). Moreover, by Lemma 4.1 this process respects the constants ρ_m(b) for all cyclic sets b. The final mask has no symmetric roots, hence it can have only regular cycles. Then we eliminate all regular cycles (referring to Proposition 4.3) and obtain a mask satisfying Cohen's criterion, whose subdivision process does converge. This line of reasoning also allows us to eliminate directly all generalized cycles, as follows.
4.3 Elimination of generalized cycles
Let a polynomial p possess a generalized cycle b with corresponding sets of zeros A_1, ..., A_n. The transfer from p(ξ) to the polynomial p̃(ξ) = p(ξ)/∏_{α∈A_j, j=1,...,n}(e^{−iξ} − e^{−iα}) is called an elimination of the generalized cycle.
Proposition 4.4 Let a mask m̃ be obtained from a mask m by eliminating a generalized cycle b. Then s(m̃) = s(m) and 2^{−ν(m)} = max{2^{−ν(m̃)}, ρ_m(b)}.
Proof: After a suitable sequence of transfers to the previous level all the sets of zeros A_1, ..., A_n drop to the corresponding roots β_1 + π, ..., β_n + π, and b becomes a regular cycle. By Lemma 4.1 this does not change the value ρ_m(b). Now it remains to apply Proposition 4.3. □
Example 4.5 Consider again the mask m(ξ) from Example 2.3. After eliminating the generalized cycle b = {2π/3, 4π/3} we obtain the mask m̃(ξ) = 0.2 + 0.5e^{−iξ} + 0.3e^{−2iξ}. Since all the coefficients of m̃ are positive, it follows that the equation [m̃] has a C_0-solution and, moreover, the corresponding subdivision process {m̃} converges (see, for instance, [1]). Now, applying Proposition 4.4, we see that the initial process {m} diverges, since ρ_m(b) = √1.12 > 1. Let us note that the matrix B corresponding to the mask m (B = (c_{2i−j})_{i,j∈{0,...,8}}) has the eigenvalue 1 with multiplicity one and has no other eigenvalues on the unit circle. So the divergence of the subdivision scheme in this case does not follow from the well-known argument of multiple eigenvalues.
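The elimination in Example 4.5, and the presence of the eigenvalue 1 of B, can be confirmed with numpy (a sketch, not the paper's computation; the stronger spectral claims are not re-checked here). Coefficients are in descending order, as numpy's poly routines expect:

```python
import numpy as np

w = np.exp(-1j * np.pi / 3)        # e^{-i pi/3}
u = np.exp(1j * np.pi / 3)         # e^{+i pi/3}

a = np.array([0.3, 0.5, 0.2])      # 0.2 + 0.5 z + 0.3 z^2, z = e^{-i xi}
fac = np.polymul(np.polymul([1, -w], [1, -w]),          # (z - e^{-i pi/3})^2
                 np.polymul([1, 0, -u], [1, 0, -u]))    # (z^2 - e^{i pi/3})^2
m = np.polymul(a, fac)             # the degree-8 mask of Example 2.3

# eliminating the (doubled) generalized cycle divides out `fac` exactly
quot, rem = np.polydiv(m, fac)

# refinement coefficients c_k (m(xi) = (1/2) sum_k c_k z^k) and the matrix B
c = 2 * m[::-1]
B = np.array([[c[2 * i - j] if 0 <= 2 * i - j <= 8 else 0
               for j in range(9)] for i in range(9)])
eig = np.linalg.eigvals(B)
```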
5 Unimprovability of the criterion. Examples of divergent schemes
Now we are going to see that Theorem 2.2 gives a full description of divergent subdivision schemes having smooth refinable functions. This means that all possible cases of the criterion of convergence are realized on suitable masks. For the sake of simplicity we formulate this result for the convergence in the space C, i.e., for the case l = 0.

Theorem 5.1 Let b = {β_1, ..., β_n} be a cyclic set and let A_1, ..., A_n be arbitrary minimal cut sets of the trees T_{β_1+π}, ..., T_{β_n+π}, respectively. Then there exists a mask m(ξ) such that

1) m(A_j) = 0, j = 1, ..., n, i.e., b is a generalized cycle of the mask m, and the A_j are its sets of zeros;

2) the equation [m] has a C_0-solution, but the subdivision process {m} does not converge in C;

3) after elimination of the generalized cycle b this process becomes convergent in C.
Proof: Consider a mask p(ξ) = ((1 + e^{−iξ})/2) a(ξ) such that deg a ≥ 2 and the subdivision process {p} converges in C. To obtain such a mask it suffices to take an arbitrary polynomial a(ξ) with positive coefficients such that a(0) = 1. Now we use the fact that if the process {p} converges in C, then it will still converge in this space after any sufficiently small perturbation of the coefficients of a(ξ) preserving the condition a(0) = 1 (see [3]). Thus, after a possible perturbation of the coefficients, we may assume that the trigonometric polynomial a has no real roots and that the value ρ_a(b) is irrational. Such a perturbation exists by the mean value theorem, because ρ_a(b) is a continuous function of the coefficients of a(ξ). This implies, in particular, that ρ_a(b) > 0 and hence ρ_p(b) > 0. Now take the polynomial q(ξ) = ∏_{α∈A_j, j=1,...,n}(e^{−iξ} − e^{−iα}). By Lemma 4.1 we have ρ_{pq^r}(b) = 2^r ρ_p(b) for every r ≥ 0. Consequently there exists a nonnegative integer r such that ρ_{pq^r}(b) > 1. Take the smallest such integer r_0 and denote ā = a q^{r_0−1}, p̄ = p q^{r_0−1} (if r_0 = 0, then we put ā = a, p̄ = p). Let us remark that the case ρ_{p̄}(b) = 1 is impossible, because this value is not rational; therefore ρ_{p̄}(b) < 1. Since b is the only generalized cycle of the polynomial p̄, it follows by Proposition 4.4 that the subdivision process {p̄} converges. Now make a small perturbation of the coefficients of the polynomial ā after which the process {p̄} still converges and the value ρ_{p̄q}(b) is still bigger than 1, but the polynomial ā has no real roots. Then denote m̃ = p̄ and m = m̃q. We see that the mask m has a unique generalized cycle b, and this cycle has sets of zeros A_1, ..., A_n. Since ρ_m(b) > 1, the process {m} diverges; however, removing this generalized cycle, we obtain the converging process {m̃}. This proves the theorem. □
Bibliography
1. D. Cavaretta, W. Dahmen, C. Micchelli, Stationary subdivision, Mem. Amer. Math. Soc. 93 (1991), 1-186.
2. D. Colella and C. Heil, Characterization of scaling functions. I. Continuous solutions, SIAM J. Matrix Anal. Appl. 15 (1994), 496-518.
3. I. Daubechies and J. Lagarias, Two-scale difference equations. I. Global regularity of solutions, SIAM J. Math. Anal. 22 (1991), 1388-1410.
4. I. Daubechies and J. Lagarias, Two-scale difference equations. II. Local regularity, infinite products of matrices and fractals, SIAM J. Math. Anal. 23 (1992), 1031-1079.
5. S. Durand, Convergence of the cascade algorithms introduced by I. Daubechies, Numer. Algorithms 4 (1993), 307-322.
6. N. Dyn, J. A. Gregory and D. Levin, Analysis of linear binary subdivision schemes for curve design, Constr. Approx. 7 (1991), 127-147.
7. L. Hervé, Régularité et conditions de bases de Riesz pour les fonctions d'échelle, C. R. Acad. Sci. Paris, Sér. I 315 (1992), 1029-1032.
8. R. Q. Jia and J. Wang, Stability and linear independence associated with wavelet decompositions, Proc. Amer. Math. Soc. 117 (1993), 1115-1124.
9. M. Neamtu, Convergence of subdivisions versus solvability of refinement equations, East J. Approx. 5 (1999), 183-210.
10. V. Protasov, A complete solution characterizing smooth refinable functions, SIAM J. Math. Anal. 31 (1999), 1332-1350.
11. V. Protasov, The stability of subdivision operator at its fixed point, SIAM J. Math. Anal. 33 (2001), 448-460.
12. L. Villemoes, Wavelet analysis of refinement equations, SIAM J. Math. Anal. 25 (1994), 1433-1460.
13. Y. Wang, Two-scale dilation equations and the cascade algorithm, Random Comput. Dynam. 3 (1995), 289-307.
14. D.-X. Zhou, Stability of refinable functions, multiresolution analysis, and Haar bases, SIAM J. Math. Anal. 27 (1996), 891-904.
Accurate approximation of functions with discontinuities, using low order Fourier coefficients

R. K. Wright

Department of Mathematics and Statistics, UVM, Burlington, VT, 05445 USA.
wright@emba.uvm.edu
Abstract
In previous work we introduced a method of using polynomial splines with appropriate
discontinuities to approximate a piecewise smooth function f with jump discontinuities of f and f′. The information used is the location of discontinuities, and low order, possibly noisy
Fourier coefficients. The number of discontinuities was limited to two at most, and the
discontinuities needed to lie at meshpoints in a uniform mesh. We showed that the linear
operator corresponding to the method is L2-bounded with a modest bound, and thus
that the method is L2-robust in the presence of noise. In the present paper we develop
a new method of analysis which enables us to determine operator bounds that are valid
for arbitrarily many discontinuities. The new analysis allows discontinuities to be placed
arbitrarily. Given a placement, an initially uniform spline mesh of width h must be used
such that nearest meshpoints to discontinuities are at least 4h apart (discontinuities then
replace these meshpoints); the number of available Fourier coefficients must be at least
three times the number of mesh intervals in a period. The previous work was restricted
to quadratic splines; the present work includes cubic splines. Much of the analysis uses
exact computations with a computer algebra system. We give an example to illustrate
the accuracy of the method using noisy Fourier coefficients.
1 Introduction
We consider approximating a function f when the information consists of low order, possibly noisy Fourier coefficients, and knowledge that f is smooth except for jumps of f or f′ at known locations but unknown magnitudes. We will work with a method, introduced
in [10], which amounts to linear least squares fitting of the available coefficients with the
coefficients of splines with appropriately placed discontinuities. Since we anticipate applications to ill-posed problems where boundedness of the solution operator is crucial,
we develop a method for bounding the norm of this operator. The bounding method
depends heavily on exact computations in certain spline spaces. These computations are
fundamentally finite dimensional linear algebra with rational integer coefficients. Their
goal is to develop upper bounds for the norms of certain projector operators whose
norms are naturally expressed in terms of generalized eigenvalues, and to prove by exact
computation that the bounds are correct. A computer algebra system is used for the
computations. The programming is detailed in [9].
In [10] we obtained bounds under much more restrictive conditions than in the present paper. In [10] the splines were quadratic only, while here results are also given for cubic splines. The analysis in [10] required all knots of the approximating splines to be uniformly spaced, and since the discontinuities are at the knots, the location of discontinuities was limited. Further, in [10] the estimation process is linear in the total number of discontinuities, and produces results unacceptably large for cases with more than one discontinuity of f and two of f′.
Others ([2, 3, 4, 5]) have addressed questions of accurate approximations to functions
with discontinuities given Fourier coefficients as information. In [8] we give examples
which show that those methods can substantially magnify noise in the coefficients; our
main concern here is to prove robustness of our method. We illustrate with an example
in Section 5.
2 General linear space-theoretic results
Let V be a real Hilbert space with inner product ( , ). We will denote the norm associated with ( , ) by ‖ ‖. Let P and Q be closed subspaces of V; by a slight abuse of notation, P will also denote the orthogonal projector onto P. Here, as in [10], we deal with the approximation f* obtained as the solution to the constrained least squares problem

min ‖Pr − Pf‖, r ∈ Q.

Assuming that P is invertible as a mapping on Q, we denote by P⁺ the mapping from P(Q) to Q which inverts P. It is not hard to verify that f* = P⁺RPf, where R is the orthogonal projector on P(Q). Let A denote the operator that takes f to f*.
Theorem 2.1 Let C be a mapping from V to Q. Let e be T-periodic and in L_2(0,T). Then

‖A(Pf + e) − f‖ ≤ (‖P⁺‖ + 1)‖Cf − f‖ + ‖P⁺‖‖e‖.

Proof: The approximation computed from the data Pf + e is Af + P⁺Re, and ‖P⁺Re‖ ≤ ‖P⁺‖‖e‖. Further, ‖Af − f‖ ≤ ‖Af − Cf‖ + ‖Cf − f‖ = ‖A(f − Cf)‖ + ‖Cf − f‖ ≤ (‖A‖ + 1)‖f − Cf‖, since ACf = Cf. Finally, ‖A‖ = ‖P⁺RP‖ ≤ ‖P⁺‖ because P and R are orthogonal projections. □
A main objective of the following work will be to bound ‖P⁺‖. This will be done by establishing upper bounds for ‖I − P‖ as a mapping on Q. From these, bounds can easily be derived for ‖P⁺‖.
Theorem 2.2 Let η < 1 exist such that ‖(I − P)q‖ ≤ η‖q‖ for all q ∈ Q. Then P is injective as a mapping on Q and for all h ∈ P(Q), P⁺, the inverse of the restriction of P to Q, satisfies

‖P⁺h‖² ≤ ‖h‖²/(1 − η²).
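Theorem 2.2 can be sanity-checked in finite dimensions. The sketch below (ours) uses random subspaces of R^12, with the restricted projector represented in orthonormal bases:

```python
import numpy as np

rng = np.random.default_rng(0)
d, qdim, pdim = 12, 4, 8
Qb = np.linalg.qr(rng.standard_normal((d, qdim)))[0]  # orthonormal basis of Q
Pb = np.linalg.qr(rng.standard_normal((d, pdim)))[0]  # orthonormal basis of range(P)
P = Pb @ Pb.T                                         # orthogonal projector

M = P @ Qb                                # P restricted to Q, in Q's orthonormal basis
eta = np.linalg.norm(Qb - M, 2)           # eta = ||(I - P) restricted to Q||
pinv_norm = np.linalg.norm(np.linalg.pinv(M), 2)      # ||P^+||
```

Generically eta < 1 here, and the computed ‖P⁺‖ respects the bound 1/√(1 − η²).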
We will obtain bounds for ‖I − P‖ by considering the projector perpendicular to a spline space G which is more tractable than PV, and on which I − P is small. In the next section, Q is the approximating spline space, S a subspace of maximally continuous splines, and G is a space of maximally continuous splines whose knots are in a mesh refining the mesh for the members of S. S and G have orthogonal projectors S and G, respectively. The following result estimates ‖I − P‖ in terms of ‖I − G‖.
Theorem 2.3 Suppose ‖(I − P)g‖ ≤ η_0‖g‖ for all g ∈ G. Suppose ‖(I − G)q‖ ≤ η_1‖q‖ for all q ∈ Q. Then ‖(I − P)q‖ ≤ (η_0 + η_1)‖q‖ for all q ∈ Q.

Proof: For q ∈ Q, ‖(I − P)q‖ ≤ ‖(I − P)Gq‖ + ‖(I − P)(I − G)q‖. Now ‖(I − P)Gq‖ ≤ η_0‖Gq‖ ≤ η_0‖q‖, and ‖(I − P)(I − G)q‖ ≤ ‖(I − G)q‖ ≤ η_1‖q‖. □
Theorem 2.4 enables us to bound ‖I − G‖ on Q by instead bounding projectors orthogonal to small subspaces of G, restricted to small subspaces of Q.

Theorem 2.4 Let Q and S be closed subspaces of V with S ⊂ G ∩ Q. Let V_1, V_2, ..., V_r be nonzero mutually orthogonal subspaces of V. Let Q_i ⊂ Q ∩ V_i, 1 ≤ i ≤ r, be nonzero closed subspaces such that Q = S + Q_1 + Q_2 + ... + Q_r. Let G_i ⊂ G ∩ V_i, H_i ⊂ S⊥ ∩ V_i, 1 ≤ i ≤ r, be nonzero closed subspaces with orthogonal projectors G_i, H_i. Let ν be a constant such that ‖(I − G_i)q_i‖² ≤ ν‖H_i q_i‖² for all q_i ∈ Q_i, 1 ≤ i ≤ r. Then ‖(I − G)q‖² ≤ ν‖q‖² for all q ∈ Q.

Proof: q ∈ Q can be written q = s + v where s ∈ S and v = q_1 + q_2 + ... + q_r, q_i ∈ Q_i, 1 ≤ i ≤ r. ‖(I − G)q‖ = ‖(I − G)v‖ since S ⊂ G. Let F = G_1 + G_2 + ... + G_r. Since G_1 + G_2 + ... + G_r ⊂ G, ‖(I − G)v‖² ≤ ‖(I − F)v‖² = Σ_{i=1}^{r} ‖(I − G_i)q_i‖², the latter equality because of the orthogonality of the Q_i. Also ‖q‖² ≥ ‖(I − S)v‖² ≥ ‖Σ_{i=1}^{r} H_i q_i‖² = Σ_{i=1}^{r} ‖H_i q_i‖², since Σ_i H_i q_i ∈ S⊥ and the H_i are orthogonal. If all H_i q_i = 0, the hypothesis implies all (I − G_i)q_i = 0. The above then implies (I − G)q = 0, and the conclusion is true. We proceed assuming H_i q_i ≠ 0 for some i and let N be the set of all those i. Then

‖(I − G)q‖²/‖q‖² ≤ (Σ_{i∈N} ‖(I − G_i)q_i‖²)/(Σ_{i∈N} ‖H_i q_i‖²).

An elementary argument shows the quotient of sums is ≤ ν, since for each i ∈ N, ‖(I − G_i)q_i‖²/‖H_i q_i‖² ≤ ν.
□

3 Bounds for restricted projectors
Below we specialize the spaces of the last section and get our main results. Let T > 0 be a fixed period. We take V to be the space of real-valued T-periodic functions which belong to L_2(I) for some, and thus every, period interval I. On V and its subspaces we define the inner product (f, g) = ∫_I f(t)g(t) dt, I a period interval. The other realizations are defined in the statements and proofs of the following results. Lemma 3.1 sets up an application of Theorem 2.4; Theorem 3.2 uses this, together with Theorem 2.2, to get our main result.
Lemma 3.1 Let X be a finite set of points in [0, T). Let N > 4 be an integer. Let K = {iT/N, 0 ≤ i < N}; for each x ∈ X, let k_x be a member of K closest to x, where 0 is identified with T. Assume N large enough that between any two distinct k_x there are at least three other members of K. Let K_X result from substituting in K each x ∈ X for its k_x. For m = 3, 4 let Q be the space of m-th order T-periodic polynomial splines with K_X as knots and with continuity C^{m−2} at all knots except the x ∈ X, where no continuity is required. Let G be the space of m-th order periodic splines with knots in [0, T) at the points {iT/(3N), 0 ≤ i < 3N}, and let G be the orthogonal projector on G. Then I − G restricted to Q satisfies ‖I − G‖₂² ≤ .69 if m = 3, and ‖I − G‖₂² ≤ .9 if m = 4.
Proof: Let S be the subspace of Q consisting of those splines which are C^∞ at the x ∈ X. Clearly S ⊂ Q. Let h = T/N. Fix x = x_i ∈ X = {x_1, x_2, ..., x_r} and let y_0 = x_i, y_a = k_{x_i} + ah, a = −2, −1, 1, 2. Take V_i to be the subspace of V consisting of those functions with support in [y_{−2}, y_2] and its T-translates.

For m = 3 let j_1 and j_2 be the B-splines with knots y_{−1}, y_0, y_0, y_0 and y_0, y_0, y_0, y_1; let j_3 be the difference of the B-splines with knots y_{−2}, y_{−1}, y_0, y_1 and y_{−1}, y_0, y_1, y_2 (see [1] for the relation between knot multiplicity and degree of continuity). For m = 4 let j_1 and j_2 be the B-splines with knots y_{−1}, y_0, y_0, y_0, y_0 and y_0, y_0, y_0, y_0, y_1; let j_3 be the difference of the B-splines with knots y_{−2}, y_{−1}, y_0, y_0, y_1 and y_{−1}, y_0, y_0, y_1, y_2; and let j_4 be the B-spline with knots y_{−2}, y_{−1}, y_0, y_1, y_2. Since y_2 − y_{−2} < T we may identify the j_a with their T-periodic extensions.

Let Q_i be the space of splines whose generic member is q_i = Σ_{a=1}^{m} c_a j_a for constants c_a. For each i, nonzero members of Q_i have continuity ranging from C^{m−2} through full discontinuity at x_i, while members of S are C^∞ at x_i. It follows that S ∩ (Q_1 + Q_2 + ... + Q_r) = 0 and Q = S + Q_1 + ... + Q_r.

Let G_i be the subspace of G with basis the C^{m−2} periodic B-splines whose knots in the period containing [y_{−2}, y_2] are length m + 1 sublists of consecutive knots from the list (ah/3 + k_x, −6 ≤ a ≤ 6). Let H_i be the space of those m-th order periodic splines which in [−T/2 + k_x, T/2 + k_x] have support in [y_{−2}, y_2], which have knots at the y_i, i ≠ 0, and at x, are C^{m−2} at y_{−1} and y_1, may be fully discontinuous at y_{−2}, y_2, and x, and are orthogonal to all members of S. The quantity ‖(I − G_i)q_i‖²/‖H_i q_i‖² is a ratio of quadratic forms in the c_a. An upper bound ν for it can be obtained as an upper bound for the eigenvalues of the pencil A − λB, where a_{αβ} = ((I − G_i)j_α, (I − G_i)j_β), b_{αβ} = (H_i j_α, H_i j_β), 1 ≤ α, β ≤ m.
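The step from a ratio of quadratic forms to the eigenvalues of the pencil A − λB is the generalized Rayleigh-quotient bound. A generic numerical sketch (ours, with random symmetric matrices, B positive definite):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A0 = rng.standard_normal((n, n))
A = A0 @ A0.T                          # symmetric positive semidefinite
B0 = rng.standard_normal((n, n))
B = B0 @ B0.T + n * np.eye(n)          # symmetric positive definite

# eigenvalues of the pencil A - lambda*B; real since B is positive definite
lam_max = max(np.linalg.eigvals(np.linalg.solve(B, A)).real)

# the Rayleigh-quotient bound: c^T A c / c^T B c <= lam_max for every c != 0
for _ in range(100):
    c = rng.standard_normal(n)
    assert c @ A @ c <= lam_max * (c @ B @ c) + 1e-9
```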
In [9] explicit bases for the spaces G_i and H_i are calculated as m-th order splines. From their definitions ([1]), B-splines are rational functions of the knots, and thus so are inner products of B-splines. The null-basis and orthogonal projection calculations in [9] use standard methods which involve only rational operations. Thus the (I − G_i)j_α and H_i j_α, and then the a_{αβ} and b_{αβ}, are rational functions of the knots of q_i, so long as x remains in [k_x, k_x + h/3]. When x crosses into [k_x + h/3, k_x + h/2], thus crossing knots for splines in G_i, the rational functions change, so in general the matrix entries are piecewise rational functions of x.
Let ν be a conjectured upper bound for the maximum eigenvalue λ_max of A − λB (in [9] a floating point approximation to λ_max is plotted as a function of x; ν is determined from inspecting this plot). For computational convenience, in [9] we represent x as 2εh/3 + k_x, 0 ≤ ε ≤ 1/2, for x ≤ k_x + h/3, and as (1 + ε)h/3 + k_x, 0 ≤ ε ≤ 1/2, for k_x + h/3 < x ≤ k_x + h/2. For further convenience we take k_x = 0, clearly losing no generality. We have represented only x ≥ k_x, but because of symmetry, x < k_x produces the same bounds.
Since h is a linear factor in all knots in the calculation, we see that a_{αβ} and b_{αβ} can be written as h multiplying piecewise rational functions of ε (with rational integer coefficients). The determinant of A − νB is thus h^m times a piecewise rational function of ε. The MAXRAT algorithm ([9]) proves that its reciprocal is bounded as a function of ε in the appropriate ranges, so the determinant itself is bounded away from 0. In [9], ε is then set equal to 0 in A − τB, and the determinant of that matrix is then shown to have m sign changes as τ decreases from ν. Thus the conjectured value ν bounds all eigenvalues of A − λB for all values of x. The upper bounds thus obtained are ν = .69 for m = 3 and ν = .9 for m = 4. We emphasize that the B-splines, matrix entries, and determinants are all calculated exactly, using the Maple ([6, 7]) computer algebra system, so the bounding property of ν is rigorously proven. Since the bounds we obtain apply to the spaces Q_i and H_i associated with any one of the x_i, they satisfy the hypotheses of Theorem 2.4, which now provides our conclusions. □
Our main result now follows.
Theorem 3.2 Let the hypotheses be those of Lemma 3.1. In addition, let $P$ be the orthogonal projector onto the space of $n$-th order real-valued $T$-periodic trigonometric polynomials, where $n \ge 3N$. If $m = 3$, we have $\|P^+\|_2 \le 2.4$, while if $m = 4$, we have $\|P^+\|_2 \le 4.5$.
Proof: The space $Q$ in Lemma 3.1 consists of periodic splines with uniformly spaced knots. Theorem 3.1 of [10] implies that $\|I - P\|_2 \le (a/(1 + a))^{1/2}$ where
$$a = 4\sum_{r=1}^{\infty}\left(\frac{1}{1 + 2r}\right)^{2m}.$$
In [9] we use this formula to get upper bounds of .076 when $m = 3$ and .025 when $m = 4$ for $\|I - P\|_2$. Taking these bounds as $\eta_0$ in Theorem 2.3 and taking the bounds from Lemma 3.1 as $\eta_1$ in Theorem 2.3, we obtain from that theorem bounds for $\|I - P\|_2$ of .907 for $m = 3$ and .974 for $m = 4$. Theorem 2.2 now applies to produce the present results.
□
Above, we required $n \ge 3N$; under this condition we can get our simplest and most comprehensive results. Since we contemplate applying our results where the number $n$ of useful coefficients may be limited, we have tried to get versions of Theorem 3.2 where $n$ is smaller compared with $N$. We have no useful versions for $n < 3N$ and $m = 4$ (cubic splines). The following result for quadratic splines may be useful. To formulate it, let $e_1 = \max\{|x - k_x|\,N/T\}$. In the previous results, the separation of the values $x$ from their nearest uniform mesh points $k_x$ was unrestricted, which corresponds to $e_1 = 1/2$. Here, we can get results for quadratic splines and $n \ge 2N$, provided the $x$ are more restricted; our methods of analysis "blow up" for $n \ge 2N$ as $e_1$ approaches a number slightly larger than .25.
Accurate approximation, discontinuities
Theorem 3.3 Let $m = 3$ (quadratic splines); let $n \ge 2N$. Otherwise, let the hypotheses be those of Theorem 3.2. Corresponding to the list 0, .1, .2, .25 of values of $e_1$, we have the list of values 1.7, 2.1, 3.9, 16 as bounds for $\|P^+\|_2$.
Proof: For each of the cases for $e_1$, an argument similar to the proof of Lemma 3.1 applies to produce a bound $\eta_1$ for $\|I - G\|_2$, where $Q$ now is defined using the uniform knot spacing $1/(2N)$ rather than $1/(3N)$. The only difference in the argument is that here, a discontinuity location $x$ always stays in the interval $[k_x, k_x + e_1 h]$, where $h = T/N$, so the matrix entries and determinants can be treated as functions of $e$ in $[0, e_1]$. Each bound $\eta_1$ now is used just as in the proof of Theorem 3.2, to get the present bounds for $\|P^+\|_2$.
□
4
Uniform norm bounds
Using representers of point evaluation, as in [8], we can get uniform norm bounds for $P^+$, and thus for $A$. The arguments are similar to those in [8]. The main difference is that there the mesh is uniform and the order $m$ is 3. The constructions of representers extend fairly easily to the present case: here the norms of representers are functions both of the evaluation point and of the location of the discontinuity nearest to the evaluation point. One can show that for each point $t \in [0, T)$, a spline $r_t$ exists in a space $U$ containing $Q$, such that $(r_t, q) = q(t)$ for each $q \in Q$, and such that $\|r_t\|_2 \le k/\sqrt{h}$, where $k = 5$ for $m = 3$ and $k = 7$ for $m = 4$; $h = T/N$ as before. The computations for the construction and bound calculations are in [9]. Noting that $\sqrt{T}/\sqrt{h} = \sqrt{N}$, we have
$$\|Af\|_\infty \le \max_t \|r_t\|_2\,\|Af\|_2 \le (k/\sqrt{h})\,\|P^+\|_2\,\sqrt{T}\,\|f\|_\infty \le k\sqrt{N}\,\|P^+\|_2\,\|f\|_\infty\,.$$
When $N \le 100$ and the hypotheses are those of Lemma 3.1, this gives $\|Af\|_\infty \le 120\|f\|_\infty$ for $m = 3$, and $\|Af\|_\infty \le 315\|f\|_\infty$ for $m = 4$.
5
Example
FIG. 1. Plots of the exact $f$ and of the errors $f - Af$ for noise-free and for 1% noisy Fourier coefficients.
We illustrate the method using an example where the function $f$ is $2\pi$-periodic and on $[0, 2\pi)$ consists of the function $e^{-x/2}$ with a piecewise quadratic added, so as to produce discontinuities at 0, .5, 1.5, 2.5, and 4. $f$ is a modification of an example in [2]; for convenience we have shifted that example left by 1 unit, and we have added the exponential term because our method can represent a piecewise quadratic exactly in the absence of noise. Exact (up to 17-decimal-digit floating point error) Fourier coefficients are derived from $f$ by exact integration using the Maple ([6, 7]) system. Noisy approximate coefficients are also derived by sampling $f$ at 1024 equidistant values in $[0, 2\pi]$, adding uniformly distributed pseudo-random noise to the samples, and taking the discrete Fourier transform of the samples. In effect, we work with $f + e$ where $e$ is a perturbing function. The level of the noise is set so that the discrete $L_2$-norm of the noise vector is 1% of the discrete $L_2$-norm of the vector of samples of $f$. $N = 45$ and thus $n = 135$ are the smallest values of $N$ and $n$ for which the hypotheses of the previous section are satisfied. Using these values, we proceed with $m = 4$ (cubic splines) for each of these cases for Fourier coefficients. Plots of $f$ and of the error for the two cases appear in the figure.
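The noisy-coefficient setup can be sketched as follows. This is our own minimal illustration (not the authors' Maple code), with a simple discontinuous stand-in for $f$ and the 1% noise scaling described above:

```python
import numpy as np

# Sample a 2*pi-periodic discontinuous function (a stand-in for the f above)
M = 1024
t = 2 * np.pi * np.arange(M) / M
f = np.where(t < 2.5, np.exp(-t / 2), np.exp(-t / 2) + 1.0)  # one jump, at t = 2.5

# Add uniformly distributed noise, scaled so its discrete L2-norm is 1% of ||f||
rng = np.random.default_rng(0)
e = rng.uniform(-1, 1, M)
e *= 0.01 * np.linalg.norm(f) / np.linalg.norm(e)

# Approximate Fourier coefficients via the DFT of the noisy samples
c = np.fft.fft(f + e) / M                     # c[k] approximates the k-th coefficient
n = 135                                       # number of useful coefficients (n = 3N, N = 45)
coeffs = np.concatenate([c[:n + 1], c[-n:]])  # keep only the low-order modes
```

The low-order coefficients `coeffs` then play the role of the given data; the spline-projection step of the method itself is not reproduced here.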
The ratio $\|f - A(f + e)\|_2 / \|f\|_2$ is about .005 for the case of 1% noise. In [9] we develop a probabilistic estimate of .0037 for the ratio $\|e\|_2 / \|f\|_2$. This estimate indicates an $L_2$-norm noise magnification of about 1.35-fold, compared with the upper bound of 4.5 given in Theorem 3.2. The uniform error, for noise-free coefficients, is about $10^{-5}$; computational experiments show this is dominated by truncation error in approximating the exponential term. In [9] we do the corresponding calculations for $m = 3$, and find similar results for 1% noise, with larger, but still small, error for noise-free coefficients.
In [9], we implement Eckhoff's method as described in [3], applied to the above data. For noiseless data, the results are comparable to those reported by Eckhoff for similar examples. The uniform norm error seems to be about .06, with errors at jumps somewhat smaller. For 1% noise, the results of Eckhoff's method are about 750-fold in error.
Bibliography
1. C. de Boor, A Practical Guide to Splines, Springer-Verlag, New York (1978).
2. K. Eckhoff, Accurate and efficient reconstruction of discontinuous functions from truncated series expansions, Math. Comp. 61 (1993), 745-763.
3. K. Eckhoff, Accurate reconstructions of functions of finite regularity from truncated Fourier series expansions, Math. Comp. 64 (1995), 671-690.
4. D. Gottlieb and C.-W. Shu, On the Gibbs phenomenon and its resolution, SIAM Review 39 (1997), 644-667.
5. D. Gottlieb, C.-W. Shu, A. Solomonoff and H. Vandeven, On the Gibbs phenomenon I: Recovering exponential accuracy from the Fourier partial sum of a nonperiodic analytic function, J. Comput. Appl. Math. 43 (1992), 81-98.
6. K. M. Heal, M. L. Hansen, and K. M. Rickard, Maple V Learning Guide, Springer-Verlag, New York (1998).
7. M. B. Monagan, K. O. Geddes, K. M. Heal, G. Labahn and S. M. Vorkoetter, Maple V Programming Guide, Springer-Verlag, New York (1998).
8. R. K. Wright, A robust method for accurately representing nonperiodic functions given Fourier coefficient information, J. Comput. Appl. Math. 140 (2002), 837-848.
9. R. K. Wright, Computations and examples for spline approximation of discontinuous functions using low order Fourier coefficients, UVM Math/Stat Department Technical Report 2001.2.
10. R. K. Wright, Spline fitting discontinuous functions given just a few Fourier coefficients, Numerical Algorithms 9 (1995), 157-169.
Chapter 7
General Approximation
Remarks on delay approximations based on feedback
Alessandro Beghi and Antonio Lepschy
Dipartimento di Elettronica e Informatica, Università di Padova, Padova, Italy.
{beghi,lepschy}@dei.unipd.it
Wieslaw Krajewski
Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland.
krajewsk@ibspan.waw.pl
Umberto Viaro
Dipartimento di Ingegneria Elet., Mecc. e Gest., Università di Udine, Italy.
viaro@uniud.it
Abstract
The response of a unity-feedback system with a delay element in the forward path exhibits a periodic component that can be approximated by truncating its harmonic expansion. Rational approximants of the transfer function $e^{-sT}$ of such an element can simply be obtained from this closed-loop approximation. A unifying approach to recent methods based on this criterion [2, 3] is presented, which allows us to point out their respective features. The standard Padé technique and a heuristic method described in [5] are also considered.
1
Introduction and problem statement
In modelling dynamic systems for control purposes, it is often necessary to account for
time delays due, e.g., to transport phenomena or distributed-parameter components.
The response of an ideal delay element (delayor) to an input $u(t)$, identically equal to 0 for $t < 0$, is $y(t) = u(t - T)$, $T > 0$, where $T$ indicates the time delay. By denoting with $U(s)$ the Laplace transform of $u(t)$, the Laplace transform of $y(t)$ is $Y(s) = e^{-sT}U(s)$. Therefore the transfer function of the delayor is the transcendental function $e^{-sT}$.
The problem of approximating $e^{-sT}$ by means of a rational function has a long history (see, e.g., [1]) but is still important from both the computational and the conceptual point of view; a few recent contributions on the subject are quoted in [2]. In many practical applications the physical realizability and the stability of the approximant limit the choice of the approximant to proper rational functions with real coefficients and a Hurwitz denominator. These requirements are satisfied by Blaschke products, i.e., functions of the form:
$$B(s) = \frac{\prod_{i=1}^{n}(s - a_i)}{\prod_{i=1}^{n}(s + \bar{a}_i)}\,, \qquad \operatorname{Re}(a_i) > 0. \tag{1.1}$$
This has the desirable property that $|B(j\omega)| = |e^{-j\omega T}| = 1$, $\forall\omega$, and $\arg[B(j\omega)]$ is monotonically decreasing with $\omega$ like $\arg[e^{-j\omega T}] = -T\omega$. On the other hand, the step response of a system with transfer function $B(s)$ starts from +1 or -1, whereas the step response of an ideal delayor obviously starts from 0.
The most widely adopted method to form a rational approximant of a delay element is based on the Padé technique, which does not always guarantee stability (even though biproper Padé models are necessarily stable). Since such a technique leads to the retention of the first Maclaurin expansion coefficients of $e^{-sT}$, the resulting approximation is best in the neighbourhood of $\omega = 0$. In different frequency bands, other types of models may be preferred.
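The Padé construction mentioned here amounts to a small linear solve for the denominator coefficients. The following is a generic sketch of the $[m/n]$ technique (our own illustration, not code from [2], [3] or [5]):

```python
import numpy as np
from math import factorial

def pade(c, m, n):
    """[m/n] Pade approximant p/q (ascending coefficients) of the series c."""
    # Denominator q (q[0] = 1) from: sum_j q[j]*c[k-j] = 0, k = m+1 .. m+n
    A = np.array([[c[k - j] if 0 <= k - j <= m + n else 0.0
                   for j in range(1, n + 1)] for k in range(m + 1, m + n + 1)])
    rhs = -np.array([c[k] for k in range(m + 1, m + n + 1)])
    q = np.concatenate([[1.0], np.linalg.solve(A, rhs)])
    # Numerator p = (c * q) truncated to degree m
    p = np.array([sum(q[j] * c[k - j] for j in range(min(k, n) + 1))
                  for k in range(m + 1)])
    return p, q

# Maclaurin coefficients of e^{-s} (T = 1) and its [2/2] approximant
c = [(-1) ** k / factorial(k) for k in range(5)]
p, q = pade(c, 2, 2)

s = 0.3
approx = np.polyval(p[::-1], s) / np.polyval(q[::-1], s)
```

For the diagonal case $m = n$ the approximant of $e^{-s}$ is all-pass (here $q(s) = p(-s)$), consistent with the biproper stable models mentioned above.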
In [3] a unity-feedback system whose forward path consists of a delayor is analysed. In the case of negative feedback, the unit step response is a piecewise constant function taking on the value 0 for $2kT \le t < (2k + 1)T$ and the value 1 for $(2k + 1)T \le t < (2k + 2)T$, $k \ge 0$, which can be decomposed into a step of amplitude $\frac{1}{2}$ and a square wave of amplitude $\frac{1}{2}$ starting from $-\frac{1}{2}$ at $t = 0$.
In the case of positive feedback, similar considerations allow us to decompose the unit step response into a linear ramp of slope $\frac{1}{T}$, a step of amplitude $-\frac{1}{2}$, and a saw-tooth wave that linearly decreases from $\frac{1}{2}$ to $-\frac{1}{2}$ in every period from $kT$ to $(k + 1)T$.
In both cases, the periodic component can easily be expressed as a series of harmonic terms (for t > 0). It is therefore natural to approximate the step response of the
unity-feedback system by retaining the non-periodic component together with a suitable
number of the first harmonics of the periodic component.
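For the negative-feedback case, the square-wave decomposition gives the classical expansion $y(t) = \frac{1}{2} - \frac{2}{\pi}\sum_{k\ge1}\sin((2k-1)\pi t/T)/(2k-1)$; a quick numerical check of the truncated expansion (our own illustration of standard Fourier analysis, not quoted from [3], with an arbitrary truncation):

```python
import numpy as np

def step_response_neg(t, T, K):
    """Truncated harmonic expansion of the negative-feedback unit step response:
    a step of amplitude 1/2 plus a square wave of amplitude 1/2 starting at -1/2."""
    t = np.asarray(t, dtype=float)
    y = 0.5 * np.ones_like(t)
    for k in range(1, K + 1):
        y -= (2 / np.pi) * np.sin((2 * k - 1) * np.pi * t / T) / (2 * k - 1)
    return y

T = 1.0
t = np.array([0.5, 1.5])          # midpoints of the "0" and "1" plateaus
y = step_response_neg(t, T, 200)  # should be close to 0 and 1 respectively
```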
A rational approximation $W_a(s)$ of the transcendental transfer function $W(s)$ of the above-mentioned feedback system is obtained by dividing the Laplace transform of the approximate step response by the Laplace transform $\frac{1}{s}$ of the step input. The rational approximant $G_a(s)$ of the delayor transfer function is then determined as
$$G_a(s) = \frac{W_a(s)}{1 \mp W_a(s)}\,,$$
where the minus sign applies to the case of negative feedback and the plus sign to that of positive feedback. It turns out [3] that $G_a(s)$ is a stable biproper rational function having the form of a Blaschke product; precisely, negative feedback supplies even-order approximants and positive feedback produces odd-order approximants.
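Numerically, the negative-feedback construction amounts to truncating the harmonic series for $W(s)$ after $K$ terms and forming $G_a = W_a/(1 - W_a)$. The sketch below (our notation, $K$ chosen arbitrarily) also checks the all-pass property claimed for $G_a$:

```python
import numpy as np

def W_a(s, T, K):
    """Truncation after K terms of the harmonic expansion
    W(s) = 1/2 - (2/T) * sum_k s / (s^2 + ((2k-1)*pi/T)^2)."""
    s = np.asarray(s, dtype=complex)
    out = 0.5 * np.ones_like(s)
    for k in range(1, K + 1):
        p_k = (2 * k - 1) * np.pi / T
        out -= (2 / T) * s / (s ** 2 + p_k ** 2)
    return out

def G_a(s, T, K):
    # Negative feedback: G_a = W_a / (1 - W_a)
    w = W_a(s, T, K)
    return w / (1 - w)

T = 1.0
w_grid = np.linspace(0.1, 5.0, 50)
mag = np.abs(G_a(1j * w_grid, T, 2))  # should be ~1: G_a is all-pass (Blaschke)
```

On the imaginary axis $W_a(j\omega) = \frac{1}{2} + jb$ with $b$ real, so $|G_a(j\omega)| = 1$ exactly for every truncation order $K$, and the low-frequency phase approximates $-\omega T$.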
Obviously, the same result could be achieved by referring to different inputs (even
an impulse), but the choice of the unit step is particularly convenient. According to
the terminology suggested in [4], the rationale of such a procedure consists in retaining
the "input component" (and the "resonant component", if any) and in truncating the
periodic "system component" of the response.
In [2] a feedback structure is used as well, but another approximation criterion is
adopted, which leads to different models depending on the chosen input. In particular,
the family of inputs considered in [2] is $\{u(t) = t^m,\ m \in \mathbb{N},\ t > 0\}$, and the procedure
exploits several properties of Bernoulli numbers and polynomials.
In the following, the above approaches are presented in a unified form which allows us to point out their respective features and to derive the related approximants in an easier way. Finally, criteria are given to choose the approximation that is most suited to the application at hand, also taking into account the standard Padé approximation and a further approximation presented in [5].
2
Derivation of the approximant
For the sake of simplicity, we shall almost exclusively refer to the case of negative feedback; only a brief mention will be made of the case of positive feedback.
2.1
Negative feedback
The transfer function $W(s)$ of the negative feedback system with forward-path transfer function $G(s) = e^{-sT}$ is
$$W(s) = \frac{e^{-sT}}{1 + e^{-sT}}\,, \tag{2.1}$$
whose singularities (poles) are the roots of $e^{sT} = -1$, i.e.,
$$s = \pm j p_k\,, \qquad p_k = \frac{(2k - 1)\pi}{T}\,, \quad k = 1, 2, \ldots$$
$W(s)$ can also be interpreted as the Laplace transform of the sequence of positive and negative impulses forming the derivative of the step response described in the introduction. Therefore, it is the sum of a constant equal to $\frac{1}{2}$ (corresponding to the step component in the just-mentioned step response) and a series of "harmonic" terms associated with the above poles:
$$W(s) = \frac{1}{2} + \sum_{k=1}^{\infty}\left[\frac{r_k}{s - j p_k} + \frac{\bar{r}_k}{s + j p_k}\right],$$
where the bar denotes the conjugate and, using the standard formula for the residues,
$$r_k = \lim_{s \to j p_k}\,(s - j p_k)\,W(s) = -\frac{1}{T}\,.$$
It follows that
$$W(s) = \frac{1}{2} - \frac{2}{T}\sum_{k=1}^{\infty}\frac{s}{s^2 + p_k^2}\,. \tag{2.2}$$
In order to compare the results in [2] and [3], let us consider a canonical input of the form
$$u_i(t) = \frac{t^{i-1}}{(i-1)!}\,, \qquad t > 0, \tag{2.3}$$
whose Laplace transform is
$$U_i(s) = \frac{1}{s^i}\,.$$
(In [3] only the case of $i = 1$ is considered, whereas the inputs used in [2] differ from (2.3) by a scaling factor which is irrelevant for the following considerations.)
On the basis of (2.2) the Laplace transform of the (forced) response to (2.3),
$$Y_i(s) = \frac{1}{s^i}\,W(s),$$
can be rewritten as
$$Y_i(s) = \sum_{h=0}^{i-1}\frac{c_h}{s^{i-h}} + \sum_{k=1}^{\infty}\frac{\alpha_{ki} + \beta_{ki}\,s}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\,,$$
where for $i$ even,
$$\beta_{ki} = (-1)^{\frac{i}{2}}\,\frac{2}{T}\left[\frac{T}{(2k-1)\pi}\right]^{i}, \qquad \alpha_{ki} = 0, \tag{2.4i}$$
and for $i$ odd,
$$\alpha_{ki} = (-1)^{\frac{i-1}{2}}\,\frac{2}{T}\left[\frac{T}{(2k-1)\pi}\right]^{i-1}, \qquad \beta_{ki} = 0. \tag{2.4ii}$$
Therefore, $W(s)$ can also be presented in the alternative form
$$W(s) = s^i\,Y_i(s) = \sum_{h=0}^{i-1} c_h\,s^h + \sum_{k=1}^{\infty}\frac{(\alpha_{ki} + \beta_{ki}\,s)\,s^i}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\,. \tag{2.5}$$
Each term of the series in (2.5) is given by the sum of a polynomial of degree $i - 1$ (quotient of the division of its numerator by its denominator) and a strictly proper rational function (whose numerator is the remainder of the division). Therefore, (2.5) becomes
$$W(s) = \sum_{h=0}^{i-1} c_h\,s^h + \sum_{k=1}^{\infty}\left[\sum_{h=0}^{i-1} d_{ki,h}\,s^h + \frac{\gamma_{ki} + \delta_{ki}\,s}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\right], \tag{2.6}$$
which can be rewritten as
$$W(s) = \sum_{h=0}^{i-1}\left(c_h + \sum_{k=1}^{\infty} d_{ki,h}\right) s^h + \sum_{k=1}^{\infty}\frac{\gamma_{ki} + \delta_{ki}\,s}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\,. \tag{2.7}$$
By comparing (2.7) with (2.2), one finds that
$$c_0 + \sum_{k=1}^{\infty} d_{ki,0} = \frac{1}{2}\,, \tag{2.8}$$
$$c_h + \sum_{k=1}^{\infty} d_{ki,h} = 0, \quad h > 0, \ \forall i, \tag{2.9}$$
$$\gamma_{ki} = 0, \qquad \delta_{ki} = -\frac{2}{T}\,, \qquad \forall k, i.$$
The procedure suggested in [3] could alternatively be presented with reference to expression (2.7), where coefficients related to the specific input appear. Precisely, the approximant $W_a(s)$ is obtained in this case by adding to the exact value $\frac{1}{2}$ of the first sum (cf. (2.8) and (2.9)) the first $K$ (harmonic) terms of the second summation,
$$-\frac{2}{T}\sum_{k=1}^{K}\frac{s}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\,,$$
which is independent of the input $u_i(t)$.
The procedure suggested in [2] refers instead to expressions (2.5) or (2.6), and the approximation consists in truncating the summation over $k$, where each addendum is formed by a polynomial and a strictly proper harmonic term. Therefore the resulting $W_a(s)$ is
$$W_a(s) = \sum_{h=0}^{i-1} c_h\,s^h + \sum_{k=1}^{K}\left[\sum_{h=0}^{i-1} d_{ki,h}\,s^h + \frac{\gamma_{ki} + \delta_{ki}\,s}{s^2 + (2k-1)^2\frac{\pi^2}{T^2}}\right], \tag{2.10}$$
which does depend on $i$ and is not proper because the part added to the harmonic terms does not reduce to the constant $\frac{1}{2}$, as is instead the case in $W(s)$. Nevertheless, the approximant $G_a(s) = W_a(s)/(1 - W_a(s))$ of $e^{-sT}$ turns out to be biproper.
As concerns the computation of the above approximants, the suggested approach seems to be preferable to that adopted in [2] because
(i) coefficients $c_h$, which correspond to the first $i$ Maclaurin expansion coefficients of
$$W(s) = \frac{e^{-sT}}{1 + e^{-sT}}\,,$$
can easily be evaluated using the classic Padé procedure, and
(ii) formulae (2.4i) and (2.4ii) immediately supply coefficients $\alpha_{ki}, \beta_{ki}$.
2.2
Positive feedback
Considerations analogous to those of Section 2.1 lead to the following transfer function in the case of positive feedback:
$$W(s) = \frac{e^{-sT}}{1 - e^{-sT}} = \frac{1}{Ts} - \frac{1}{2} + \frac{2}{T}\sum_{k=1}^{\infty}\frac{s}{s^2 + 4k^2\frac{\pi^2}{T^2}}\,, \tag{2.11}$$
so that $Y_i(s) = W(s)\,U_i(s)$ can be separated into a (harmonic) series associated with the imaginary conjugate poles of $W(s)$ and a strictly proper fraction with denominator $s^{i+1}$.
Using the terminology in [4], the mentioned series corresponds to the "system component" of the forced response and the fraction corresponds to its "interaction component", because the poles of the latter are common to $W(s)$ and $U_i(s)$ (no "input component" is present in this case since $U_i(s)$ does not exhibit poles different from those of $W(s)$).
As shown in [3], the truncation of the series in (2.2) results in even-order biproper approximants $G_a(s)$, whereas the truncation of the series in (2.11) results in odd-order biproper approximants $G_a(s)$.
Instead, as shown in [2], truncating the series in (2.5) leads to odd-order approximants, whereas truncating the analogous series corresponding to positive feedback leads to even-order approximants.
2.3
Stability and approximation error
It has been proved [3] that the even-order rational approximants $G_a(s)$ of $e^{-sT}$ obtained from (2.1), as well as the odd-order ones obtained by truncating (2.11), are stable. Instead, as explicitly stated in [2] for inputs $t^m$, $m \ge 2$ (i.e., using the previous notation, $u_i(t)$ with $i \ge 3$), the "alternating sign of the Bernoulli numbers makes the approximation in general unstable [...]. Hence, from a practical point of view, any improvement with respect to the approximants obtained in [3] is to be found with $p = 1$", i.e., $i = 2, 3$.
The approximation accuracy can be evaluated by referring, e.g., to the "closed-loop error"
$$E(s) := W(s) - W_a(s).$$
From (2.1) we get
$$E(s) = E_1(s) := -\frac{2}{T}\sum_{k=K+1}^{\infty}\frac{s}{s^2 + p_k^2}\,,$$
whereas from (2.10) we have
$$E(s) = E_2(s) := \sum_{h=0}^{i-1}\,\sum_{k=K+1}^{\infty} d_{ki,h}\,s^h + E_1(s)\,.$$
Since $E(s)$ is a complex quantity, $|E_2(s)|$ may well be smaller than $|E_1(s)|$ for certain values of $s$ (or $j\omega$).
3
Alternative approximants
As already pointed out, the procedure suggested in [2] leads to approximants that depend on the chosen canonical input. To improve the approximation within suitable frequency bands not centred at the origin, it is reasonable to resort to non-canonical inputs whose spectrum has larger amplitude there. A simple choice corresponds, e.g., to
$$U(s) = \frac{1}{s\left(1 + 2\zeta\dfrac{s}{\omega_n} + \dfrac{s^2}{\omega_n^2}\right)}\,,$$
in which $\omega_n$ is at the centre of the band and $\zeta$ is suitably small.
The choice of the form of the input (as well as the order of the canonical input) is somewhat arbitrary and is influenced, in practice, by empiric considerations. Therefore, it makes sense to compare the results of the above procedures with those obtained in [5] using a heuristic procedure based on the direct approximation of the phase Bode diagram of $e^{-j\omega T}$ by means of a Blaschke product $B_n(j\omega)$ of order $n$. For $n$ odd, the first factor of $B_n(s)$ has the form
$$G_1(s) = \frac{1 - \tau s}{1 + \tau s}\,, \qquad \tau > 0,$$
and the others have the form
$$G_i(s) = \frac{1 - 2\zeta_i\dfrac{s}{\omega_{ni}} + \dfrac{s^2}{\omega_{ni}^2}}{1 + 2\zeta_i\dfrac{s}{\omega_{ni}} + \dfrac{s^2}{\omega_{ni}^2}}\,, \qquad 1 > \zeta_i > 0, \quad \omega_{ni} > 0, \tag{3.1}$$
whereas for $n$ even all factors have form (3.1).
All the considered techniques produce unit-magnitude all-pass frequency responses, so that the approximation they afford can be judged with reference to the phase deviation $\Delta(j\omega)$ from $-T\omega$ only. As $\omega \to \infty$, $|\Delta(j\omega)| \to \infty$ in all cases. Therefore, reasonable criteria for choosing the method most suited to the specific application are: (i) the bandwidth $B_\epsilon$ where $|\Delta(j\omega)|$ is less than a specified value $\epsilon$, or (ii) the maximum $\Delta_B$ of $|\Delta(j\omega)|$ in a prescribed band $B$.
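Criterion (i) is easy to evaluate numerically. As an illustration (our own sketch, using the first-order all-pass $(1 - Ts/2)/(1 + Ts/2)$, i.e. the [1/1] Padé approximant of $e^{-sT}$, not any of the curves discussed below):

```python
import numpy as np

T = 1.0
tau = T / 2                    # (1 - tau*s)/(1 + tau*s), the [1/1] Pade of e^{-sT}
w = np.linspace(0.0, 20.0, 2001)

# Phase deviation Delta(jw) = arg G(jw) - (-T*w) = T*w - 2*arctan(tau*w)
delta = T * w - 2 * np.arctan(tau * w)

# Criterion (i): bandwidth B_eps up to which |Delta| stays below eps
eps = np.deg2rad(10.0)
exceeded = np.abs(delta) >= eps
B_eps = w[np.argmax(exceeded) - 1] if exceeded.any() else w[-1]
```

For this simple approximant the deviation grows monotonically from 0, so `B_eps` is the last grid frequency below the prescribed tolerance.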
By way of example, Fig. 1 shows $\Delta(j\omega)$ vs $\omega$ for the 4-th order all-pass approximants of $e^{-sT}$ ($T = 1$) obtained according to (2.1) with $K = 2$ (curve a), to the procedure suggested in [2] for $u(t) = t^2$ (curve b), to the standard Padé procedure (curve c), and to the heuristic method in [5] (curve d). For instance, with reference to criterion (i) above, the Padé approximant is best for $\epsilon$ very small, the method suggested in [2] is optimal for $\epsilon \approx 10°$, and the heuristic method and the method suggested in [3] are preferable for $\epsilon > 45°$.
Analogous results are obtainable for approximants of different order.
FIG. 1. Phase deviations $\Delta(j\omega)$ for the considered 4-th order approximants (curves a, b, c, d).
4
Conclusions
The approximation procedures presented in [2] and [3] have been embedded in a unified frame which points out well their respective features and allows us to determine the parameters of the approximants in an easier way. Criteria have been provided for choosing the approximation method that is most suited to the specific application.
Bibliography
1. O. Perron, Die Lehre von den Kettenbrüchen, Teubner, Stuttgart, 1913; 3rd ed. 1957. In German.
2. C. Battle and A. Miralles, "On the approximation of delay elements by feedback," Automatica, vol. 36, pp. 659-664, 2000.
3. A. Beghi, A. Lepschy, and U. Viaro, "Approximating delay elements by feedback," IEEE Trans. Circ. Sys. I, vol. 44, pp. 824-828, 1997.
4. P. Dorato, A. Lepschy, and U. Viaro, "Some comments on steady-state and asymptotic responses," IEEE Trans. Education, vol. 37, pp. 264-268, 1994.
5. A. Beghi, A. Lepschy, and U. Viaro, "On the simplification of the mathematical model of a delay element," in E. Kuljanic, ed., Advanced Manufacturing Systems and Technology, Springer Verlag, 1996, pp. 617-624.
Point shifts in rational interpolation
with optimized denominator
Jean-Paul Berrut
Département de Mathématiques, Université de Fribourg, Switzerland
jean-paul.berrut@unifr.ch
Hans D. Mittelmann
Department of Mathematics, Arizona State University, Tempe, USA
mittelmann@asu.edu
Abstract
In previous work we have suggested obtaining rational interpolants of a function / by
attaching optimally placed poles to its interpolating polynomials. For a large number of
interpolation points these polynomials are well-known to be good approximants only if
the nodes tend to cluster near the endpoints of the interval, as with Cebysev or Legendre
points. In practice, however, one would prefer to have them closer to equidistant. This
will in particular be the case when the difficult portion of / lies well within the interior
of the interval, or when approximating derivatives of $f$, as in the solution of differential
equations. To address this difficulty, we use here a conformal change of variable to shift
the points from the Cebysev position toward a more equidistant distribution in a way
that should maintain the exponential convergence when / is analytic. Numerical examples
demonstrate the resulting improvement in the quality of the approximation.
1
Introduction
We are concerned here with rational approximation of a continuous function $f$ on an interval $[a, b]$, which we may take as $[-1, 1] =: I$ after a linear change of variable when necessary. We further assume that the approximant $r$ should interpolate $f$ between a finite number, say $N + 1$, of distinct points (nodes) $x_0, x_1, \ldots, x_N$ in $I$. In a similar way as in [5], $r$ will be constructed by attaching a certain number of poles to an interpolating polynomial.
In some applications, such as the numerical solution of two-point boundary value
problems (see, e.g., [6]), one may choose the points more or less at will; in that case,
one will place them so as to reach the best compromise between two often conflicting
goals: points good for interpolation, on one side, and points favourable for the condition
of the problem to be solved, on the other side. In [5], we have considered equidistant
and Cebysev points, the first for their regularity, the second for the condition of the
interpolation and for the fast convergence of the interpolant for very smooth functions.
For the solution of two-point boundary problems in [6] we have merely used Cebysev
points.
There is in general no reason besides the problem condition for accumulating the
nodes toward the boundary, as with Cebysev or Legendre points. Moreover, one of the
reasons for using rational instead of polynomial interpolation is its better suitability for
approximating functions with large slopes. Here too, shifting the points away from the
center may not be appropriate.
Another odd consequence of accumulating interpolation points toward the extremities
is the consequent ill-conditioning of the derivatives of the interpolating polynomials [7,1].
This worsens the stability properties of time-stepping in the solution of time evolution
problems with the method of lines [13] as well as the convergence of iterative methods
for solving discretized stationary problems [3].
To address these difficulties, we will take advantage here of the fact that the fast convergence of the interpolant can be maintained while shifting the points with a conformal map $g$ (independent of $N$) toward an equidistant position. This, however, requires an important change to the method in [5], because this point shift ruins the exponential convergence of the Cebysev interpolating polynomial. We therefore use here as the starting interpolant the polynomial interpolating $f(g^{-1})$ in the domain of the inverse $g^{-1}$ of the conformal map employed for the point shift, and attach poles to this polynomial.
Section 2 reviews the formulae and advantages of shifting Cebysev points conformally
toward the center of the interval when interpolating functions, and Section 3 briefly recalls the method of optimally attaching poles to the interpolating polynomial introduced
in our earlier work. In Section 4 we describe how to take advantage of the better conditioning of derivatives induced by the conformal point shift; the corresponding practical
improvements are finally documented with numerical examples.
2
Rational interpolation with a variable change for point shifts
Let $\mathcal{P}_m$ and $\mathcal{R}_{m,n}$, respectively, denote the linear space of all polynomials of degree at most $m$ and the set of all rational functions with numerator in $\mathcal{P}_m$ and denominator in $\mathcal{P}_n$; furthermore, denote by $f_k$ the interpolated values $f(x_k)$, $k = 0(1)N$, of $f$. Then the unique polynomial $p \in \mathcal{P}_N$ that interpolates $f$ at the $x_k$'s,
$$p(x) = \sum_{k=0}^{N} f_k L_k(x), \qquad L_k(x) := \prod_{\substack{i=0 \\ i \neq k}}^{N}(x - x_i)\Big/\prod_{\substack{i=0 \\ i \neq k}}^{N}(x_k - x_i),$$
can be written in its barycentric form [9]
$$p(x) = \sum_{k=0}^{N}\frac{w_k}{x - x_k}\,f_k \Big/ \sum_{k=0}^{N}\frac{w_k}{x - x_k}\,, \tag{2.1}$$
where the so-called weight $w_k$ corresponding to the point $x_k$ is given by
$$w_k = 1\Big/\prod_{\substack{i=0 \\ i \neq k}}^{N}(x_k - x_i).$$
Despite its appearance, (2.1) determines a polynomial of degree at most $N$: the $w_k$ are precisely the numbers which guarantee this [4]. By choosing other $w_k$'s, a rational interpolant is constructed.
The barycentric formula has several advantages over other representations of the
interpolating polynomial ([4] p. 357). One of them is the fact that the weights appear in
both the numerator and the denominator, so that they can be divided by any common
factor. For example, simplified weights for Cebysev points of the first kind $x_k^{(1)} := \cos\phi_k$, where $\phi_k := \frac{(2k+1)\pi}{2N+2}$ and $k = 0, \ldots, N$, are given by $w_k^{(1)} = (-1)^k \sin\phi_k$ ([9] p. 249), while for the Cebysev points of the second kind $x_k^{(2)} := \cos\frac{k\pi}{N}$ - which will be used here - one simply has Salzer's formula ([9] p. 252)
$$w_k^{(2)} = (-1)^k \delta_k\,, \qquad \delta_k := \begin{cases} 1/2, & k = 0 \text{ or } k = N, \\ 1, & \text{otherwise.} \end{cases}$$
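Formula (2.1) with Salzer's weights is straightforward to implement; a minimal sketch (our own code, with the usual special-casing when an evaluation point coincides exactly with a node):

```python
import numpy as np

def bary_cheb2(fvals, x):
    """Barycentric interpolation (2.1) at Cebysev points of the second kind,
    using Salzer's weights w_k = (-1)^k * delta_k."""
    N = len(fvals) - 1
    nodes = np.cos(np.arange(N + 1) * np.pi / N)
    w = (-1.0) ** np.arange(N + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    x = np.asarray(x, dtype=float)
    num = np.zeros_like(x)
    den = np.zeros_like(x)
    exact = np.full(x.shape, np.nan)
    for k in range(N + 1):
        diff = x - nodes[k]
        hit = diff == 0.0
        exact[hit] = fvals[k]      # x coincides with node k
        diff[hit] = np.inf         # avoid division by zero below
        num += w[k] / diff * fvals[k]
        den += w[k] / diff
    result = num / den
    result[~np.isnan(exact)] = exact[~np.isnan(exact)]
    return result

N = 40
nodes = np.cos(np.arange(N + 1) * np.pi / N)
fvals = np.exp(nodes)              # analytic test function
x = np.linspace(-1, 1, 101)
err = np.max(np.abs(bary_cheb2(fvals, x) - np.exp(x)))
```

For an entire function such as $e^x$ the error is at rounding level already for moderate $N$, illustrating the exponential convergence discussed next.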
These points are, together with Legendre's, the most used nodes for global polynomial interpolation for large $N$. They achieve exponential convergence of $p$ toward $f$ if the latter is analytic in an ellipse $E_\rho$ with foci at $\pm 1$ and the sum of its axes equal to $2\rho$, $\rho > 1$. However, this fast convergence comes at the cost of a concentration of the nodes in the vicinity of the extremities of $I$. As mentioned above, this accumulation may have drawbacks, such as poor spreading of the information about $f$ over the interval and ill-conditioning of the derivatives near the endpoints.
With a suitable choice of the interpolant, one may conformally shift the nodes toward an equidistant position (though not all the way) without losing the exponential convergence. For that purpose, one considers, beside the $x$-space in which $f$ is to be approximated, another space, denoted by $y$, say, and the $N + 1$ Cebysev points of the second kind
$$y_k := \cos\frac{k\pi}{N}\,, \qquad k = 0, \ldots, N,$$
in the interval $J := [-1, 1]$ in this $y$-space. Let $g$ be a conformal map from a domain $\mathcal{D}_1$ containing $J$ (in the $y$-space) to a domain $\mathcal{D}_2$ containing $I$ (in the $x$-space); moreover, suppose that $f$ is a function $\mathcal{D}_2 \mapsto \mathbb{C}$ such that the composition $f \circ g : \mathcal{D}_1 \mapsto \mathbb{C}$ is analytic in an ellipse $E_\rho$, as defined above. With this map we may define new interpolation points on $I$, $x_k = g(y_k)$, as well as the conformal transplantation $F(y) := f(x)$ [10] of $f$ into the $y$-space.
Then, with the polynomial interpolating $F(y)$ at the $y_k$,
$$A_N(y) := \sum_{k=0}^{N} F(y_k)L_k(y) = \sum_{k=0}^{N} f(x_k)L_k(g^{-1}(x)) =: a_N(x), \tag{2.2}$$
one has
$$|a_N(x) - f(x)| = O(\rho^{-N}), \qquad x \in [-1, 1].$$
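This construction is easy to try numerically. In the sketch below the map $g$ is our own illustrative choice, an arcsine-type stretch $g(y) = \arcsin(\alpha y)/\arcsin\alpha$ that moves Cebysev nodes toward equidistant as $\alpha \to 1$; the map actually used in the paper may differ:

```python
import numpy as np

def bary(nodes, fvals, x):
    # Barycentric formula (2.1) with Salzer weights (2nd-kind Cebysev nodes);
    # assumes the evaluation points x avoid the nodes exactly.
    N = len(nodes) - 1
    w = (-1.0) ** np.arange(N + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    C = w / (x[:, None] - nodes[None, :])
    return (C @ fvals) / C.sum(axis=1)

alpha = 0.95
g = lambda y: np.arcsin(alpha * y) / np.arcsin(alpha)   # point-shift map (our choice)
g_inv = lambda x: np.sin(np.arcsin(alpha) * x) / alpha

N = 60
y = np.cos(np.arange(N + 1) * np.pi / N)   # Cebysev points of the 2nd kind, y-space
x_nodes = g(y)                             # shifted, more nearly equidistant nodes
f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)  # Runge-type test function

x = np.linspace(-0.999, 0.999, 500)        # evaluation points
a_N = bary(y, f(x_nodes), g_inv(x))        # a_N(x) = A_N(g^{-1}(x)), cf. (2.2)
err = np.max(np.abs(a_N - f(x)))
```

Despite the nodes being pushed away from the endpoints, the interpolant of the transplanted function still converges geometrically for this analytic $f$.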
Rational interpolation with all poles prescribed is very simple in the barycentric setting [5]: the $P$ poles $z_i$ are attached to (2.1) by replacing $w_k$ with
$$b_k = w_k d_k\,, \qquad d_k := \prod_{i=1}^{P}(x_k - z_i).$$
If $N > P$ this results in a rational interpolant in $\mathcal{R}_{N,P}$ with poles at $z_i$, $i = 1, \ldots, P$ (when such an interpolant exists, see [5]).
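A minimal numerical sketch of this pole attachment (our own code; the test function and pole locations are arbitrary). With the modified weights $b_k = w_k d_k$ the barycentric quotient still interpolates, and its denominator vanishes at the prescribed $z_i$:

```python
import numpy as np

N = 20
nodes = np.cos(np.arange(N + 1) * np.pi / N)   # Cebysev points of the 2nd kind
w = (-1.0) ** np.arange(N + 1)
w[0] *= 0.5
w[-1] *= 0.5

z = np.array([1.5, -1.5])                      # P = 2 prescribed real poles outside [-1, 1]
b = w * np.prod(nodes[:, None] - z[None, :], axis=1)   # b_k = w_k * d_k

f = lambda x: np.abs(x) ** 3                   # test function with limited smoothness
fvals = f(nodes)

def r(x):
    # Barycentric rational interpolant with the modified weights b_k
    C = b / (np.asarray(x)[:, None] - nodes[None, :])
    return (C @ fvals) / C.sum(axis=1)

x = np.linspace(-0.95, 0.95, 500)
err = np.max(np.abs(r(x) - f(x)))
den_at_pole = np.sum(b / (z[0] - nodes))       # should vanish: z[0] is a pole of r
```

Here the pole positions are fixed by hand; the optimization of their location is the subject of the next section.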
Remark 2.1 Exponential convergence of interpolation at the shifted points is also attained with the rational function given by (2.1) with $w_k = w_k^{(2)}$ [2]. However, this is in general a rational function in $\mathcal{R}_{N,V}$, $V > N - P$: there is not enough defect in the denominator degree for the weights $w_k^{(2)} d_k$ to warrant the presence of the $P$ poles $z_i$.
We then use $a_N$ as the starting interpolant, to which we attach the poles $v_i$ in the $y$-space. This yields
$$R(y) := \frac{\displaystyle\sum_{k=0}^{N}\frac{w_k\prod_{i=1}^{P}(y_k - v_i)}{y - y_k}\,F(y_k)}{\displaystyle\sum_{k=0}^{N}\frac{w_k\prod_{i=1}^{P}(y_k - v_i)}{y - y_k}} = \frac{\displaystyle\sum_{k=0}^{N}\frac{w_k\prod_{i=1}^{P}\bigl(g^{-1}(x_k) - g^{-1}(z_i)\bigr)}{g^{-1}(x) - g^{-1}(x_k)}\,f(x_k)}{\displaystyle\sum_{k=0}^{N}\frac{w_k\prod_{i=1}^{P}\bigl(g^{-1}(x_k) - g^{-1}(z_i)\bigr)}{g^{-1}(x) - g^{-1}(x_k)}} =: r(x).$$
If a rational interpolant with these poles exists, it is given in the $y$-space by $R$, and $r$ is a rational function in the argument $g^{-1}(x)$. Its poles are at $z_i = g(v_i)$.
3
Construction of the optimal interpolant
Our method consists in optimizing the position of the $v_i$'s so as to minimize
$$\|R - F\|_\infty = \|r - f\|_\infty\,,$$
as described in §3 of [5]. Optimal $v_i$'s always exist, but these are not unique in general. Whether the optimal $R$ is unique is an open question; however, for every optimized pole $v_i$ an indicator may be calculated which, if nonzero, guarantees that $v_i$ is indeed a pole of $R$.
In the practical computations documented in §5 the optimization of the $v_i$'s was performed using the same two algorithms as in [5]: for small $N$ we used a discrete differential correction algorithm according to [11], while for larger $N$ the simulated annealing method of [8] was applied. Both methods will in principle locate a desired global maximum. The first method achieves it in a systematic and guaranteed way by evaluating the error not continuously but on a fine grid; the simulated annealing method cannot be guaranteed to find the global extremum but, when used for an extensive search, will produce a reasonable approximation of it.
As mentioned in [5], our way of attaching poles to the interpolating polynomial has a very nice property: the approximation error can only decrease, or at worst stay constant, as the number of poles grows, in sharp contrast with classical rational interpolation; when a new unknown, say $v_j$, is added to the set of variables $\{v_1, \dots, v_{j-1}\}$, the optimal values of the latter are a feasible vector for the higher-dimensional optimization.
Jean-Paul Berrut and Hans D. Mittelmann

Let us conclude this section with a comment on the use of the nomenclature "attaching the poles". In classical rational interpolation, the poles of the interpolant are determined by the data. There too, however, one sometimes wishes to prescribe the location of the poles (with a corresponding decrease of the number of degrees of freedom):
many authors then speak of "assigning", or "prescribing" the poles. In that sense one
cannot "assign" poles to a polynomial, which obviously cannot have poles. We thus
start with the interpolating polynomial and its poles at infinity and make it a rational
interpolant by bringing the poles into an optimal position in C. We call this procedure
"attaching poles", to distinguish it from the process of forcing a rational function to
have a pole at a particular place.
4  Derivatives of the optimal interpolant with shifted points
As mentioned in §1, one of the reasons for shifting the points from their Cebysev position toward the interior of the interval is the improvement of the condition of the derivatives resulting from such a shift. Besides $r$, we will evaluate also $r'$ and $r''$ as approximants of $f'$, resp. $f''$, and estimate $\|r' - f'\|_\infty$ and $\|r'' - f''\|_\infty$.

Schneider and Werner [14] have noticed that every rational interpolant $R \in \mathcal{R}_{N,N}$, written in its barycentric form, can easily be differentiated. The formulae for the first two derivatives read
\[
R'(y) = \begin{cases}
\displaystyle \sum_{k=0}^{N} \frac{u_k}{y - y_k}\, R[y, y_k] \Bigg/ \sum_{k=0}^{N} \frac{u_k}{y - y_k}, & y \neq y_i, \\[2ex]
\displaystyle -\Bigl( \sum_{k=0,\, k \neq i}^{N} u_k R[y_i, y_k] \Bigr) \Big/ u_i, & y = y_i,
\end{cases}
\]

and

\[
R''(y) = \begin{cases}
\displaystyle 2 \sum_{k=0}^{N} \frac{u_k}{y - y_k}\, R[y, y, y_k] \Bigg/ \sum_{k=0}^{N} \frac{u_k}{y - y_k}, & y \neq y_i, \\[2ex]
\displaystyle -2 \Bigl( \sum_{k=0,\, k \neq i}^{N} u_k R[y_i, y_i, y_k] \Bigr) \Big/ u_i, & y = y_i,
\end{cases}
\]

with $R[y, y, y_k] = \dfrac{R'(y) - R[y, y_k]}{y - y_k}$. The chain rule then yields, for $r(x) = R(g^{-1}(x))$,

\[
r'(x) = R'(y)\,[g^{-1}(x)]' = \frac{R'(y)}{g'(y)}, \qquad
r''(x) = \frac{R''(y)}{[g'(y)]^2} - R'(y)\, \frac{g''(y)}{[g'(y)]^3}. \tag{4.1}
\]
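At the nodes no limits are needed, which is what makes the Schneider–Werner formula $R'(y_i) = -\bigl(\sum_{k \neq i} u_k R[y_i, y_k]\bigr)/u_i$ attractive in computations. A small self-contained check of it (our own; simplified Chebyshev weights, test function $\exp$ chosen by us):

```python
import numpy as np

N = 24
k = np.arange(N + 1)
y = np.cos(k * np.pi / N)     # Chebyshev points of the second kind
u = (-1.0) ** k               # simplified barycentric weights
u[0] *= 0.5
u[-1] *= 0.5

f = np.exp(y)                 # data; R is the polynomial interpolant

# Schneider-Werner at the nodes: R'(y_i) = -(sum_{k != i} u_k R[y_i,y_k]) / u_i
dR = np.empty_like(f)
for i in range(N + 1):
    m = np.arange(N + 1) != i
    divdiff = (f[i] - f[m]) / (y[i] - y[m])   # divided differences R[y_i, y_k]
    dR[i] = -np.dot(u[m], divdiff) / u[i]

err = np.max(np.abs(dR - np.exp(y)))          # exact derivative of exp is exp
```

With $N = 24$ the nodal derivative agrees with the exact one to roughly the rounding level of spectral differentiation.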
Specifically, in our calculations we have used the map suggested by Kosloff and Tal-Ezer [12],

\[ g(y) = \frac{\arcsin(\alpha y)}{\arcsin \alpha}, \qquad 0 < \alpha < 1. \]
  α      P = 0      P = 2      P = 4      P = 6      P = 8
 0.0     6.37e-5    1.42e-6    5.83e-8    9.38e-9    1.30e-9
 0.5     3.11e-5    6.69e-7    2.48e-8    4.21e-9    4.23e-10
 0.75    8.06e-6    1.60e-7    5.50e-9    9.47e-10   1.27e-10
 0.9     1.12e-6    1.97e-8    5.90e-10   3.94e-11   2.05e-11
 0.95    2.78e-7    4.47e-9    1.29e-10   1.36e-11   3.82e-12
 0.96    1.85e-7    2.93e-9    8.27e-11   4.20e-12   3.88e-12

TAB. 1. Errors when approximating f with increasing P and α in Example 1.
In the limiting cases, $\alpha \to 0$ keeps the points at their Cebysev position, whereas $\alpha \to 1$ renders them equidistant. The derivatives of $g$ are given by

\[
g'(y) = \frac{\alpha}{\arcsin\alpha\, \sqrt{1 - (\alpha y)^2}}, \qquad
g''(y) = \frac{\alpha^3 y}{\arcsin\alpha\, \bigl(1 - (\alpha y)^2\bigr)^{3/2}},
\]

so that in (4.1)

\[
\frac{g''(y)}{[g'(y)]^3} = (\arcsin^2 \alpha)\, y.
\]
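Both derivatives of the map and the identity $g''(y)/[g'(y)]^3 = (\arcsin^2\alpha)\,y$ can be verified numerically. The following sketch is our own check, with an arbitrarily chosen $\alpha$: it compares $g'$ against a central difference and evaluates both sides of the identity.

```python
import numpy as np

def g(y, a):
    """Kosloff-Tal-Ezer map g(y) = arcsin(a*y)/arcsin(a)."""
    return np.arcsin(a * y) / np.arcsin(a)

def gp(y, a):
    """First derivative g'(y)."""
    return a / (np.arcsin(a) * np.sqrt(1.0 - (a * y) ** 2))

def gpp(y, a):
    """Second derivative g''(y)."""
    return a**3 * y / (np.arcsin(a) * (1.0 - (a * y) ** 2) ** 1.5)

a = 0.9
y = np.linspace(-0.95, 0.95, 9)
h = 1e-6
fd = (g(y + h, a) - g(y - h, a)) / (2.0 * h)       # central difference for g'
err_fd = np.max(np.abs(fd - gp(y, a)))
# identity used in (4.1): g''(y)/[g'(y)]^3 = (arcsin^2 a) * y
err_id = np.max(np.abs(gpp(y, a) / gp(y, a) ** 3 - np.arcsin(a) ** 2 * y))
```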
5  Numerical evidence
We now report on practical computations, performed on two examples, which demonstrate the efficiency of point shifts for improving the rational interpolants with optimized denominators. These examples share the property that the difficult part of $f$ lies in the center of $I$, so that the shift of the points toward a more equidistant position naturally improves the quality of the information provided to the interpolation method.
  α      P = 0      P = 2      P = 4      P = 6      P = 8
 0.0     5.27e-3    1.26e-3    4.85e-6    8.69e-7    1.40e-7
 0.5     2.67e-3    5.87e-4    2.33e-6    4.03e-7    4.63e-8
 0.75    7.47e-4    1.49e-5    5.16e-7    9.44e-8    1.30e-8
 0.9     1.14e-4    2.01e-6    6.56e-8    4.28e-9    2.16e-9
 0.95    2.97e-5    4.99e-7    1.48e-8    1.59e-9    4.52e-10
 0.96    2.01e-5    3.24e-7    9.52e-9    4.80e-10   4.70e-10

TAB. 2. Errors when approximating f′ with increasing P and α in Example 1.
The sup-norm $\|\cdot\|_\infty$ has thereby been estimated by considering 1000 equally spaced points $x_i$, $i = 1(1)1000$, on the interval $[-5/4, 5/4]$ and computing the maximal absolute value of the error at those $x_i$ lying in $[-1, 1]$.
Example 5.1 We have first revisited Example 3 of [5], which displays in the center of $I$ a slope increasing with a positive parameter, here denoted by $\epsilon$:

\[
f(x) = \cos \pi x + \frac{\operatorname{erf}(\delta x)}{\operatorname{erf}(\delta)}, \qquad \delta = \sqrt{45\epsilon},
\]

where erf denotes the error function (see [5] for a graph).
In Table 1 we give the results obtained with $\epsilon = 500$ and $N = 81$, increasing numbers $P$ of poles and increasing $\alpha$. Tables 2 and 3 display the same information for the approximation of $f'$ and $f''$ with $r'$ and $r''$ as given by the formulae (4.1). The combination of extra poles and a point shift brings about 7 digits of accuracy, where the point shift alone makes only for 2–3. The improvement in the derivatives is especially remarkable: the error in the second derivative decreases from the useless value of 9.26 to about $10^{-7}$.
  α      P = 0      P = 2      P = 4      P = 6      P = 8
 0.0     9.26       4.05e-2    4.82e-4    7.85e-5    1.46e-5
 0.5     4.26       2.07e-2    2.18e-4    3.75e-5    4.91e-6
 0.75    9.50e-1    5.48e-3    6.25e-5    9.53e-6    1.26e-6
 0.9     9.30e-2    6.49e-4    8.86e-6    4.93e-7    2.34e-7
 0.95    1.59e-2    1.23e-4    1.88e-6    1.75e-7    5.31e-8
 0.96    9.18e-3    7.36e-5    1.29e-6    6.00e-8    5.57e-8

TAB. 3. Errors when approximating f″ with increasing P and α in Example 1.
Example 5.2 Example 3 in [5] has demonstrated that the attachment of poles may be very effective in improving the approximation of oscillatory functions. Here we change the function to

\[ h(x) = e^{-a x^2} \sin bx, \qquad a > 0,\; b > 0, \]

so that the most oscillatory part lies in the center of the interval.

Results with $a = 5$, $b = 25$, $N = 31$, $P = 0$ and $P = 2$ are given in Table 4. In contrast with the preceding example, here the point shift brings much more improvement than the attachment of poles, about 6–7 digits, an especially heartening fact for the derivatives, to which the interpolants without shift are useless approximants.
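For $P = 0$ and polynomial barycentric weights $w_k$ in the $y$-space, the interpolant reduces to the polynomial interpolant of $F(y) = h(g(y))$ between Chebyshev points, evaluated at $y = g^{-1}(x) = \sin(x \arcsin\alpha)/\alpha$. Under that assumption — this is our reconstruction of the $P = 0$ case, not the authors' code — the gain reported in the first column of Table 4 can be reproduced qualitatively:

```python
import numpy as np

def bary_eval(y, nodes, fvals, w):
    """Barycentric evaluation of the polynomial interpolant."""
    diff = y[:, None] - nodes[None, :]
    hit = diff == 0.0
    diff[hit] = 1.0
    terms = w / diff
    vals = (terms @ fvals) / terms.sum(axis=1)
    i, j = np.nonzero(hit)
    vals[i] = fvals[j]
    return vals

def shifted_interp_error(h, N, alpha, x):
    """Max error of the P = 0 shifted-point interpolant of h on the grid x."""
    k = np.arange(N + 1)
    y_nodes = np.cos(k * np.pi / N)          # Chebyshev points in y-space
    w = (-1.0) ** k
    w[0] *= 0.5
    w[-1] *= 0.5
    F = h(np.arcsin(alpha * y_nodes) / np.arcsin(alpha))   # F(y) = h(g(y))
    y = np.sin(x * np.arcsin(alpha)) / alpha               # y = g^{-1}(x)
    return np.max(np.abs(bary_eval(y, y_nodes, F, w) - h(x)))

h = lambda x: np.exp(-5.0 * x**2) * np.sin(25.0 * x)       # Example 5.2
x = np.linspace(-1.0, 1.0, 1001)
err_cheb = shifted_interp_error(h, 31, 1e-8, x)    # alpha -> 0: Chebyshev
err_shift = shifted_interp_error(h, 31, 0.96, x)   # strong point shift
```

The shifted points resolve the central oscillations far better: the error drops by several orders of magnitude, in line with the first column of Table 4.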
Acknowledgement: The authors wish to thank Peter Graves-Morris for his comments which have enhanced the present text.
Bibliography
1. R. Baltensperger and J.-P. Berrut, The errors in calculating the pseudospectral differentiation matrices for Cebysev-Gauss-Lobatto points, Comput. Math. Applic. 37 (1999), 41-48. Errata: 38 (1999), 119.
2. R. Baltensperger, J.-P. Berrut, and B. Noël, Exponential convergence of a linear rational interpolant between transformed Chebyshev points, Math. Comp. 68 (1999), 1109-1120.
  α       h, P = 0   h, P = 2   h′, P = 0  h′, P = 2  h″, P = 0  h″, P = 2
 0.0      4.12e-2    2.49e-3    2.03       1.36e-1    1.43e+3    9.51e+1
 0.5      1.66e-2    8.68e-4    8.90e-1    6.08e-2    5.63e+2    3.84e+1
 0.75     1.97e-3    7.95e-5    1.17e-1    7.73e-3    5.98e+1    3.98
 0.9      1.91e-5    4.20e-7    1.09e-3    4.56e-5    3.97e-1    1.68e-2
 0.92     4.57e-6    7.78e-8    2.48e-4    8.20e-6    8.24e-2    2.75e-3
 0.94     6.56e-7    7.18e-9    3.26e-5    5.69e-7    9.56e-3    1.66e-4
 0.96     3.03e-8    2.39e-9    1.81e-6    5.49e-7    4.71e-4    1.62e-4

TAB. 4. Change in the errors induced by the introduction of two poles in Example 2.
3. J.-P. Berrut and R. Baltensperger, The linear rational collocation method for boundary value problems, BIT 41 (2001), 868-879.
4. J.-P. Berrut and H. D. Mittelmann, Matrices for the direct determination of the
barycentric weights of rational interpolation, J. Comput. Appl. Math. 78 (1997),
355-370.
5. J.-P. Berrut and H. D. Mittelmann, Rational interpolation through the optimal attachment of poles to the interpolating polynomial, Numerical Algorithms 23 (2000), 315-328.
6. J.-P. Berrut and H. D. Mittelmann, The linear rational collocation method with
iteratively optimized poles for two-point boundary value problems, SIAM J. Scient.
Comput. 23 (2001), 961-975.
7. K. S. Breuer and R. M. Everson, On the errors incurred calculating derivatives using
Chebyshev polynomials, J. Comput. Phys. 99 (1992), 56-67.
8. A. Corana, M. Marchesi, C. Martini, and S. Ridella, Minimizing multimodal functions of continuous variables with the "Simulated Annealing" algorithm, ACM
Trans. Math. Software 13 (1987) 262-280.
9. P. Henrici, Essentials of Numerical Analysis, Wiley, New York, 1982.
10. P. Henrici, Applied and Computational Complex Analysis Vol. 3, Wiley, New York,
1986.
11. E. H. Kaufman Jr, D. J. Leeming, and G. D. Taylor, Uniform rational approximation
by differential correction and Remes-differential correction. Int. J. Numer. Meth.
Engin. 17 (1981), 1273-1278.
12. D. Kosloff and H. Tal-Ezer, A modified Chebyshev pseudospectral method with an $O(N^{-1})$ time step restriction, J. Comput. Phys. 104 (1993), 457-469.
13. S. C. Reddy and L. N. Trefethen, Lax-stability of fully discrete spectral methods
via stability regions and pseudo-eigenvalues, Comput. Methods Appl. Mech. Engrg.
80 (1990), 147-164.
14. C. Schneider and W. Werner, Some new aspects of rational interpolation, Math. Comp. 47 (1986), 285-299.
An application of a mathematical blood flow model
Michael Breuß, Andreas Meister
Department of Mathematics, University of Hamburg, Germany.
breuss@math.uni-hamburg.de, meister@math.uni-hamburg.de
Bernd Fischer
Mathematical Institute, Medical University of Lübeck, Germany.
fischer@math.mu-luebeck.de
Abstract
Mathematical models of blood flow are inevitably embedded in models of human thermoregulation because they take the role of the most significant heat distributor in models
of the human thermal system [14, 6]. Models of human thermoregulation have a wide
range of applications, e.g. for the prediction of the impact of accidents, diseases and clinical treatments (see [14] and the references therein). The application of our interest is the
prediction of the influence of cooling on the heat distribution in premature infants, see
Section 2. In Section 3 we discuss the requirements of a reliable thermoregulation model
while the governing equation is described in paragraph four. The employed blood flow
model is discussed within Section 5. Section 6 deals with numerical results, followed by
concluding remarks in the last paragraph.
1  Motivation
Lack of oxygen of the fetus or newborn is known to be an important cause of injuries of the developing brain [9]. Experimental studies have shown that the neuronal loss evolves over several days after such an incident [8]. An important factor influencing the degree and distribution of neuronal loss is the cerebral temperature, i.e. lowering the cerebral temperature can prevent much damage [5].

The question arises whether it is possible to lower the cerebral temperature of an infant by 2–3 K by manipulating the environment inside an incubator while the rest of the body maintains a pleasant temperature. The objective of this paper is to discuss the mathematical means which can be used to predict an answer to that question by the use of numerical simulations.
2  Modeling the thermoregulation of premature infants
The term thermoregulation stands for the measures taken by the body to hold a pleasant temperature [4]. Models for thermoregulation consist of two parts: the active and the
passive system [6]. The active system consists of the regulatory mechanisms shivering
(heat production within the muscles attached to the skeleton), vasomotion (control over
the degree of blood flow within the skin) and sweating (control over the degree of effectiveness of heat transfer between the infant and the surrounding air). The passive system
is the combination of the physical human body and the heat transfer in it and at its
surface. The idea behind this distinction is that the active system has a controlling influence over the passive system. Naturally, only results obtained by the complete model
can be compared with available real life data.
Concerning premature infants, it is known that shivering and sweating are not of
importance for the modelling process [4, 13], while vasomotion should not be of great
concern for our special application [13]. The modeling of the passive system demands the
discretization of the body and the modeling of metabolic heat production and blood flow.
We do not consider phenomena which are related to environmental conditions, namely
the response to air convection, the possibility to gain or lose heat due to radiation and
heat loss due to evaporation in dependence on pressure, temperature and humidity of
the surrounding air, assuming that these are controllable by the use of an incubator [13].
In order to give an answer to the defined question by use of numerical simulations,
a model needs to deliver detailed temperature profiles within the head and a detailed
resolution of the heat transfer processes in the body. It should be applicable to different size neonates, whereby aspects like the anatomy and the thermal maturity have to be
considered. With the exception of the blood flow model, these aspects can be defined via
a suitable geometry and the use of real life data for spatially dependent rates of metabolic
heat production within a numerical method [7, 2]. This also implies that existing numerical methods made for the simulation of thermoregulation of adults are of no use in the given context, since studies have shown [3] that a detailed modeling of geometry and tissue composition is necessary in order to obtain relevant temperature profiles. As can be shown experimentally [7, 2], in agreement with theoretical discussions concerning thermoregulation models of adults [6, 14], the use of a blood flow model greatly affects
the computed numerical solutions.
3  Analysis of the blood flow model
The bio-heat equation derived by Pennes [10] forms the basis of the majority of models
for human thermoregulation in use today [14, 6]. It describes the dissipation of heat in
a homogeneous, infinite tissue volume. For two spatial dimensions, it can be written in
the form
\[ c(\mathbf{x})\,\rho(\mathbf{x})\,\partial_t T(\mathbf{x},t) = \operatorname{div}\bigl[\lambda(\mathbf{x})\,\nabla T(\mathbf{x},t)\bigr] + f(\mathbf{x},t). \tag{3.1} \]

Thereby, the temperature $T$ depends on the spatial variable $\mathbf{x} = (x_1, x_2)^T$ as well as on time $t$. Furthermore, $\lambda(\mathbf{x})$, $c(\mathbf{x})$ and $\rho(\mathbf{x})$ denote the heat conductivity, specific heat capacity and density of the tissue, respectively. The term $f(\mathbf{x},t)$ can be decomposed via $f(\mathbf{x},t) = Q_M(\mathbf{x}) + Q_B(\mathbf{x},t)$ into parts corresponding to metabolic heat production $Q_M(\mathbf{x})$ and blood flow $Q_B(\mathbf{x},t)$.
As already indicated, the term $Q_M(\mathbf{x})$ can be defined by the use of real life data [7]. The formulation of the source term due to blood flow is based on variations of the
following procedure [6, 14]. The idea is that the body is supplied from a central pool of
blood by the major arteries. Before the tissue is perfused, the temperature of the arterial
blood mixes with the temperature of venous blood flowing in adjacent veins. After that,
the arterial blood exchanges heat with the tissue in the capillaries and becomes venous
blood. The venous blood is collected in the major veins and its temperature mixes with
the temperature of arterial blood in the adjacent arteries before it flows back into the
blood pool.
Since equation (3.1) deals with the change of thermal energy per unit volume, the term $Q_B(\mathbf{x},t)$ takes the form

\[ Q_B(\mathbf{x},t) = c_B \rho_B\, CCX(\mathbf{x})\, BF(\mathbf{x})\, \bigl[T_B(t) - T(\mathbf{x},t)\bigr], \tag{3.2} \]

whereby $T_B(t)$ denotes the time-dependent mean value of the temperature of the blood within the blood pool. We also assume that the specific density of the blood $\rho_B$ and the specific heat capacity of the blood $c_B$ are constant.
The described modeling results in a differential equation for the temporal evolution of the temperature within the blood pool, namely in

\[ m_B c_B\, \frac{d}{dt} T_B(t) = \int_D \rho_B c_B\, CCX(\mathbf{x})\, BF(\mathbf{x})\, d\mathbf{x}\; \bigl[T_V(t) - T_B(t)\bigr]. \tag{3.3} \]

Thereby, the total blood mass $m_B$, the time-dependent mean value of the temperature of the venous blood $T_V(t)$, and locally defined tissue-dependent measures for the blood perfusion $BF(\mathbf{x})$ and the counter-current heat exchange $CCX(\mathbf{x})$ are introduced.
Equation (3.3) shows that the temporal change of the blood pool temperature is proportional to the difference to the temperature of the venous blood. The outlined idea leads to the modeling of the temperature of the venous blood as

\[ T_V(t) = \frac{\int_D CCX(\mathbf{x})\, BF(\mathbf{x})\, T(\mathbf{x},t)\, d\mathbf{x}}{\int_D CCX(\mathbf{x})\, BF(\mathbf{x})\, d\mathbf{x}}, \tag{3.4} \]

which is also usable when only steady states are considered [7]. The crucial terms in the order of importance are the blood perfusion $BF(\mathbf{x})$ and the counter-current heat exchange $CCX(\mathbf{x})$.
There is much debate about the choice of these functions in the literature [14, 6]. This debate arises because the representation of blood circulation is substituted by a rather simple model formulation. The cure to this disadvantage is generally sought by exploring more and more detailed models of microstructure, organs, etc., or by a better modeling of control mechanisms of the active system in the case of adults [14, 6].
The main drawback of the described blood flow model is given by the blood pool idea itself. To our knowledge, this has up to now not been outlined in any mathematical description of this model within the literature; it can be illustrated as follows. Let a detailed geometry be given with a stationary temperature distribution together with a homogeneous neutral temperature at the whole boundary as initial state. Let us assume that we start a numerical computation where a selective cooling at the neck is employed. By heat conduction of the tissue, the effect of cooling computed with the help of the discretization of heat gradient and heat conductivity of the local tissue propagates into the inner part of the domain. Concerning the blood flow, the averaging step within (3.4) captures the local cooling effect, which results in a slightly cooler average temperature of the venous blood within the whole domain than in the initial state. Employing this value in (3.3) results in a slight negative change of the blood pool temperature. Taking account of the
evaluation of the source term (3.2) for the control volumes located in the vicinity of the neck, we notice that a strong cooling is locally equalized by the combination of a) the source term due to blood flow, which is mostly influenced by the neutral blood temperature in the rest of the body, and b) the source term due to metabolic heat production, which was not influenced at all by the change in the boundary temperature. The result is that the effect of a local cooling mechanism is instantly distributed over the whole domain, while a weighted mean value of the temperature over the domain equalizes local cooling mechanisms. The validity of this reasoning is verified by numerical results [7, 2] and by an exemplary result shown in Section 6.
The non-local nature of the described blood flow model can directly be seen by applying an implicit time stepping strategy. Due to the integration over the whole computational domain in (3.4), one ends up with a fully occupied matrix after the usual linearization step, which was already recognized in [7] in the context of steady state calculations.
We now illuminate a further property of the blood flow model. Therefore, let the abbreviations $\alpha = \rho_B c_B$, $\beta = \int_D CCX(\mathbf{x})\, BF(\mathbf{x})\, d\mathbf{x}$ and $\gamma = \rho_B/m_B$ hold. A straightforward computation gives

\[ T_B(t) = T_V(t) - \frac{1}{\beta\gamma}\, \frac{d}{dt} T_B(t). \tag{3.5} \]

Note that $\alpha$, $\beta$ and $\gamma$ are positive constants. Consider a steady state situation as initial state, i.e. $T_B = T_V$ holds. If the body is heated, the temperature within the body increases and so $T_V$ will increase. This has the effect that the blood pool temperature $T_B$ will increase in the near future, i.e. $T_B'(t) > 0$. We now investigate the net effect of the blood flow. Integration of the source over the computational domain $D$ results in

\[ \int_D Q_B(\mathbf{x},t)\, d\mathbf{x} = \alpha \Bigl[ \beta\, T_B(t) - \int_D CCX(\mathbf{x})\, BF(\mathbf{x})\, T(\mathbf{x},t)\, d\mathbf{x} \Bigr] \overset{(3.4),(3.5)}{=} -\frac{\alpha}{\gamma}\, \frac{d}{dt} T_B(t). \]

When employing $T_B'(t) > 0$ we see that the total of all sources in the body is negative, i.e. while the blood in the blood pool cools the increasingly warm body in the mean if the body is exposed to heat, it also takes over heat from it. The blood pool and the body are to be seen as two separate systems which are connected via heat fluxes, and so one can consider the blood pool as a regulator.
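The equalizing mechanism described above can be reproduced with a deliberately crude one-dimensional analogue of (3.1)–(3.4): uniform tissue, uniform perfusion weights, one cooled boundary, explicit Euler time stepping. All constants below are illustrative assumptions of ours, not physiological data.

```python
import numpy as np

# 1-D analogue of (3.1)-(3.4) on [0,1]; all constants are illustrative.
n = 101
dx = 1.0 / (n - 1)
kappa = 1.0e-3     # lambda/(c*rho): scaled heat conductivity
w = 0.05           # c_B*rho_B*CCX*BF/(c*rho): scaled perfusion, uniform in x
mB = 0.2           # scaled blood mass, so (3.3) reads TB' = (w/mB)*(TV - TB)

T = np.full(n, 37.0)               # neutral stationary initial state
TB = 37.0                          # blood pool in equilibrium, TB = TV
dt = 0.4 * dx**2 / kappa           # below the explicit stability bound

for _ in range(20000):
    TV = T.mean()                  # (3.4) with uniform CCX*BF weights
    lap = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx**2
    T[1:-1] += dt * (kappa * lap + w * (TB - T[1:-1]))   # (3.1) + (3.2)
    T[0] = 27.0                    # selective cooling at the left boundary
    T[-1] = 37.0                   # neutral temperature at the right boundary
    TB += dt * (w / mB) * (TV - TB)                      # (3.3)
```

After the run the pool temperature has dropped well below the neutral 37.0, and even the tissue adjacent to the warm boundary sits below 37.0: the local cooling has been spread over the whole domain through the blood pool, which is exactly the non-local behaviour criticized above.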
4  Numerical method and experiments
The following numerical approximation of the unsteady bio-heat equation (3.1) represents a convenient extension of the finite volume method developed in [7], which has been proven to be a robust, accurate and reliable algorithm in the context of steady state temperature distributions. However, finite volume schemes are categorically based on the integral form of the governing equation. In order to apply Gauss's integral theorem it is necessary to write the equation in divergence form. Therefore, we introduce the auxiliary variable $k(\mathbf{x}) = \rho(\mathbf{x})c(\mathbf{x})$ and the auxiliary temperature $\tilde{T}(\mathbf{x},t) = k(\mathbf{x})T(\mathbf{x},t)$
into the governing equation; consequently, the bio-heat equation (3.1) writes

\[ \frac{d}{dt} \int_\sigma \tilde{T}(\mathbf{x},t)\, d\mathbf{x} = \oint_{\partial\sigma} \Bigl( \frac{\lambda(\mathbf{x})}{k(\mathbf{x})}\, \nabla\tilde{T}(\mathbf{x},t) - \frac{\lambda(\mathbf{x})\,\tilde{T}(\mathbf{x},t)}{k(\mathbf{x})^2}\, \nabla k(\mathbf{x}) \Bigr) \cdot \mathbf{n}(\mathbf{x})\, ds + \int_\sigma f(\mathbf{x},t)\, d\mathbf{x} \tag{4.1} \]

for all control volumes $\sigma \subset D$, see [2]. In order to solve equation (4.1) numerically,
the space part $D$ is decomposed into a finite number of sub-domains. We start from an arbitrary conforming triangulation $\mathcal{T}^h$ of the domain $D$, which is called the primary mesh and consists of finitely many triangles $\mathcal{D}_i$; the corresponding nodes are abbreviated by $x_i$. Based on the triangulation, a discrete control volume $\sigma_i$ is defined as the open subset of $\mathbb{R}^2$ including the node $x_i$ and bounded by the straight lines which connect the midpoints of the edges of the corresponding triangles $\mathcal{D}_j$ (i.e. $x_i \in \partial\mathcal{D}_j$) with their barycentres (see Figure 1).

FIG. 1. General form of a control volume of the triangulation (left) and its boundary (right).

The union $B^h$ of all boxes is called the secondary mesh. A finite volume method represents a discretization of the evolutionary equation (4.1) for cell averages defined by $(M\tilde{T})(t)|_{\sigma} = (1/|\sigma|) \int_\sigma \tilde{T}(\mathbf{x},t)\, d\mathbf{x}$, where $|\sigma|$ denotes the volume of the box $\sigma$. With respect to the secondary mesh $B^h$ we can write the integral form (4.1) as
\[ \frac{d}{dt} (M\tilde{T})(t)\Big|_{\sigma_i} = \frac{1}{|\sigma_i|} \Bigl[ \oint_{\partial\sigma_i} \Bigl( \frac{\lambda(\mathbf{x})}{k(\mathbf{x})}\, \nabla\tilde{T}(\mathbf{x},t) - \frac{\lambda(\mathbf{x})\,\tilde{T}(\mathbf{x},t)}{k(\mathbf{x})^2}\, \nabla k(\mathbf{x}) \Bigr) \cdot \mathbf{n}(\mathbf{x})\, ds + \int_{\sigma_i} Q_B(\mathbf{x},t)\, d\mathbf{x} + \int_{\sigma_i} Q_M(\mathbf{x})\, d\mathbf{x} \Bigr], \quad \forall \sigma_i \in B^h. \tag{4.2} \]
Corresponding to a finite element method, the evaluation of the boundary integral is performed by using a piecewise constant distribution of the heat coefficient $\lambda$ and a piecewise linear distribution of the auxiliary temperature $\tilde{T}$ with respect to the triangles of the triangulation used. Note that the source term remains unchanged and the calculation is
given by

\[ \int_{\sigma_i} Q_M(\mathbf{x})\, d\mathbf{x} = |\sigma_i|\, Q_M(x_i) \]

and

\[ \int_{\sigma_i} Q_B(\mathbf{x},t)\, d\mathbf{x} = |\sigma_i|\, c_B \rho_B\, CCX(x_i)\, BF(x_i)\, \bigl[T_B(t) - T(x_i,t)\bigr]. \]

The computation of the blood pool temperature is directly performed by an explicit time discretization of equation (3.3). Thereby, the temperature of the venous blood is given by equation (3.4).
It is remarkable that the method degenerates to the scheme presented in [7] in the context of a steady state solution, and therefore the excellent properties, like the discrete min-max principle, are maintained in such a situation.

FIG. 2. Primary mesh and tissue layers in the head region.

Due to the space available
we restrict ourselves to the consideration of steady state calculations using the described method. Thereby, we distinguish layers of skin, fat, bone and kernel by different rates of metabolism, specific heat capacity and blood perfusion associated with the regions depicted in Figure 2. As boundary conditions we employ a comfortable boundary temperature of 309.15 K at head, back, legs, and belly, while we set 299.15 K at the neck, i.e. we selectively cool the neck. In reality, this corresponds to the situation where the infant is wearing a water-filled collar with the purpose of cooling the blood flowing into the brain through the arteries adjacent to the skin.
In Figure 3 (a) we can see the temperature distribution in the two-dimensional discretized idealization of the body of a premature infant. Thereby, no blood flow and no
metabolic heat production is applied, so that the depicted distribution of heat is only
influenced by the heat conductivity of the employed tissues. The situation where tissue
dependent metabolic heat production is taken into account is shown in Figure 3 (b).
Note that the heat sources visualized within the picture not only have local effects, they
also influence the mean value of the temperature of the blood pool. Within Figure 3 (c),
blood flow is additionally given.
It is evident that the blood flow has the effect outlined in Section 5. In particular, the numerical solution incorporates no hint of the fact that in reality there is a transport of cool blood to the brain and also a transport of blood by the veins coming from the brain.
FIG. 3. Comparison of steady state situations: (a) only with heat conduction, (b) with heat conduction and metabolic heat production, and (c) with blood flow additionally taken into account (from top to bottom). Colour scale: temperature in K.
5  Concluding remarks
The range of applicability of the described blood flow model is restricted to situations
where it makes sense to employ a mean value of the whole blood, e.g. if the whole body is
exposed for a longer time to the same temperature. For a clinical application where the
effects of local cooling or heating have to be studied, caution is required when dealing
with the results achieved by employing variations of the described model.
Bibliography
1. B. Fischer, M. Breuß and A. Meister, The unsteady thermoregulation of premature infants — a model and its application, in Discrete Modelling and Discrete Algorithms in Continuum Mechanics, Proceedings of the GAMM Workshop, Th. Sonar and I. Thomas (eds.), 2000.
2. B. Fischer, M. Breuß and A. Meister, The numerical simulation of unsteady heat
conduction in a premature infant, in Numerical Methods for Fluid Dynamics,
M.J. Baines (editor), ICFD, Oxford University Computing Laboratory 7 (2001).
3. M. Buse and J. Werner, Heat balance of the human body: influence of variations of
locally distributed parameters, in Journal of Theoretical Biology 114 (1985), 34-51.
4. O. Bußmann, A model for the thermoregulation of premature infants and neonates under consideration of the thermal maturity, PhD Thesis, Medical University of Lübeck, (2000), in German.
5. R. Busto et al., The importance of brain temperature in cerebral ischemic injury,
in Stroke 20 (1989), 1114-1134.
6. D. Fiala, K.J. Lomas and M. Stohrer, A computer model of human thermoregulation
for a wide range of environmental conditions: the passive system, in Journal of
Applied Physiology 87 No. 5 (1999), 1957-1972.
7. B. Fischer, M. Ludwig and A. Meister, The thermoregulation of infants: Modeling and numerical simulation, in BIT 41 No. 5 (2001), 950-966.
8. P.D. Gluckman and C.E. Williams, When and why do brain cells die?, in Dev. Med. Child Neurol. 34 (1992), 1010-1014.
9. E.G. Mallard et al., Neuronal damage in the developing brain following intrauterine
asphyxia, in Reprod. Fertil. Dev. 7 (1995), 647-653.
10. H.H. Pennes, Analysis of Tissue and Arterial Blood Temperatures in the Resting
Human Forearm, in Journal of Applied Physiology 1 (1948), 93-122.
11. G. Simbruner, Thermodynamic Models for Diagnostic Purposes in the Newborn and
Fetus, Facultas Verlag, Wien, 1983.
12. T. Sonar, On the Construction of Essentially Non-Oscillatory Finite Volume Approximations to Hyperbolic Conservation Laws on General Triangulations: Polynomial Recovery, Accuracy, and Stencil Selection, in Comp. Meth. Appl. Mech. Eng.
140 (1997), 157-181.
13. K. Thomas, Back to Basics: Thermoregulation in Neonates, in Neonatal Network
13 No. 2 (1994), 15-22.
14. J. Werner, Thermoregulatory models. Recent research, current applications and
future development, in Scand. J. Work Environ. Health 15 Suppl. 1 (1989), 34-46.
15. J. Werner and P. Webb, A six-cylinder model of human thermoregulation for general
use on personal computers, in Ann. Physiol. Anthrop. 12 No. 3 (1993), 123-134.
16. E.H. Wissler, A mathematical model of the human thermal system, in Bulletin of Mathematical Biophysics 26 (1964), 147-166.
Zeros of the hypergeometric polynomial F{—n,b; c; z)
K. Driver* and K. Jordaan
School of Mathematics, University of the Witwatersrand, Johannesburg, South Africa.
036kad@cosmos.wits.ac.za, 036jord@cosmos.wits.ac.za
Abstract
Our interest lies in describing the zero behaviour of Gauss hypergeometric polynomials $F(-n, b; c; z)$ where $b$ and $c$ are arbitrary parameters. In general, this problem has not been solved, and even when $b$ and $c$ are both real, the only cases that have been fully analysed impose additional restrictions on $b$ and $c$. We review recent results that have been proved for the zeros of several classes of hypergeometric polynomials $F(-n, b; c; z)$ where $b$ and $c$ are real. We show that the number of real zeros of $F(-n, b; c; z)$ for arbitrary real values of the parameters $b$ and $c$, as well as the intervals in which these zeros (if any) lie, can be deduced from corresponding results for Jacobi polynomials.
1  Introduction
The Gauss hypergeometric function, or $_2F_1$, is defined by

\[ F(a, b; c; z) = \sum_{k=0}^{\infty} \frac{(a)_k (b)_k}{(c)_k} \frac{z^k}{k!}, \]

where $a$, $b$ and $c$ are complex parameters and

\[ (a)_k = a(a+1)\cdots(a+k-1) = \Gamma(a+k)/\Gamma(a) \]

is Pochhammer's symbol. When $a = -n$ is a negative integer, the series terminates and reduces to a polynomial of degree $n$, called a hypergeometric polynomial. Our focus lies in the location of the zeros of $F(-n, b; c; z)$ for real values of $b$ and $c$.
Hypergeometric polynomials are connected with several different types of orthogonal polynomials, notably Chebyshev, Legendre, Gegenbauer and Jacobi polynomials. In the cases of Chebyshev and Legendre polynomials, the connection demands fixed special values of the parameters $b$ and $c$, namely (cf. [1], p. 561)

\[ F\bigl(-n, n; \tfrac{1}{2}; z\bigr) = T_n(1 - 2z) \quad\text{and}\quad F(-n, n+1; 1; z) = P_n(1 - 2z), \]
*Research of the first author is supported by the John Knopfmacher Centre for Applicable Analysis and Number Theory, University of the Witwatersrand.
respectively. However, in the cases of Gegenbauer and Jacobi polynomials, we have

\[ F\bigl(-n, n + 2\lambda; \lambda + \tfrac{1}{2}; z\bigr) = \frac{n!}{(2\lambda)_n}\, C_n^{\lambda}(1 - 2z) \tag{1.1} \]

and

\[ F(-n, \alpha + \beta + 1 + n; \alpha + 1; z) = \frac{n!}{(\alpha+1)_n}\, P_n^{(\alpha,\beta)}(1 - 2z), \tag{1.2} \]

respectively. Since the zeros of orthogonal polynomials are well understood, we expect
the connections (1.1) and (1.2) to be very useful in analysing the zeros of $F(-n, b; c; z)$. Conversely, if the zeros of $F(-n, b; c; z)$ are known, this leads to new information about the zero distribution of Gegenbauer or Jacobi polynomials for values of their parameters that lie outside the range of orthogonality of these polynomials.
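The Chebyshev and Legendre connections stated above are easy to verify numerically, since the series terminates. The sketch below is our own illustration (degree and grid chosen arbitrarily): it sums the finite series for $F(-n, b; c; z)$ and compares $F(-n, n; \tfrac12; z)$ with $T_n(1-2z)$ and $F(-n, n+1; 1; z)$ with $P_n(1-2z)$.

```python
import numpy as np

def hyp_poly(n, b, c, z):
    """Terminating 2F1(-n, b; c; z), summed term by term."""
    z = np.asarray(z, dtype=float)
    s = np.ones_like(z)
    t = np.ones_like(z)
    for k in range(n):
        # t_{k+1} = t_k * (-n+k)(b+k) / ((c+k)(k+1)) * z
        t = t * ((-n + k) * (b + k)) / ((c + k) * (k + 1)) * z
        s = s + t
    return s

n = 6
z = np.linspace(0.0, 1.0, 101)
x = 1.0 - 2.0 * z                      # the argument 1 - 2z lies in [-1, 1]

cheb = np.cos(n * np.arccos(x))        # T_n(1 - 2z)
leg = np.polynomial.legendre.legval(x, [0.0] * n + [1.0])   # P_n(1 - 2z)

err_cheb = np.max(np.abs(hyp_poly(n, n, 0.5, z) - cheb))
err_leg = np.max(np.abs(hyp_poly(n, n + 1, 1.0, z) - leg))
```

Both differences vanish to rounding level, as the identities predict.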
This paper is organized as follows. In Section 2 we give a self-contained review of recent results regarding the zeros of several special classes of hypergeometric polynomials. Section 3 contains results originally due to Klein [9] which detail the numbers and location of real zeros of $F(-n, b; c; z)$ for arbitrary real values of $b$ and $c$. We provide simple proofs using results proved in [13].
2  Zeros of special classes of hypergeometric polynomials
We begin with a few general remarks. Since we shall assume throughout our discussion that $b$ and $c$ are real parameters, we know that all non-real zeros of $F(-n, b; c; z)$ must occur in complex conjugate pairs. In particular, if $n$ is odd, $F$ must always have at least one real zero. Further, if $b = -m$ where $m < n$, $m \in \mathbb{N}$, $F(-n, b; c; z)$ reduces to a polynomial of degree $m$. However, since we are interested in the behaviour of the zeros of $F(-n, b; c; z)$ as $b$ and/or $c$ vary through real values, we shall adopt the convention that $F(-n, -m; c; z) = \lim_{b \to -m} F(-n, b; c; z)$. This ensures that the zeros of $F$ vary continuously with $b$ and $c$. Note also that $F(-n, b; c; z)$ is not defined when $c = 0, -1, \dots, -n+1$. Regarding the multiplicity of zeros, a hypergeometric function $w = F(a, b; c; z)$ satisfies the differential equation

\[ z(1-z)w'' + [c - (a + b + 1)z]w' - abw = 0, \]

so if $w(z_0) = w'(z_0) = 0$ at some point $z_0 \neq 0$ or $1$, it would follow that $w \equiv 0$. Thus multiple zeros of $F(-n, b; c; z)$ can only occur at $z = 0$ or $1$.
2.1  Quadratic transformations
The class of hypergeometric polynomials that admit a quadratic transformation is specified by a necessary and sufficient condition due to Kummer (cf. [1], p.560). There are
K. Driver and K. Jordan
twelve polynomials in this class (cf. [14], p.124):

F(-n, b; 2b; z),            F(-n, b; 1/2; z),           F(-n, b; 3/2; z),
F(-n, b; -2n; z),           F(-n, b; b + n + 1; z),     F(-n, b; -n - b + 1; z),
F(-n, b; (-n + b + 1)/2; z), F(-n, b; -n + b + 1/2; z), F(-n, b; -n + b - 1/2; z),
F(-n, -n + 1/2; c; z),      F(-n, -n - 1/2; c; z),      F(-n, n + 1; c; z).
The most important polynomial in this class is F(-n, b; 2b; z), because a complete analysis of its zero distribution for all real values of b (cf. [4], [5]) leads to corresponding results for the zeros of the Gegenbauer polynomials C_n^λ(z) for all real values of the parameter λ (cf. [6]).
Theorem 2.1. Let F = F(-n, b; 2b; z) where b is real.

(i) For b > -1/2, all zeros of F(-n, b; 2b; z) are simple and lie on the circle |z - 1| = 1.

(ii) For -1/2 - j < b < 1/2 - j, j = 1, 2, ..., [n/2] - 1, (n - 2j) zeros of F lie on the circle |z - 1| = 1. If j = 2k is even, there are k non-real zeros of F in each of the four regions bounded by the circle |z - 1| = 1 and the real axis. If j = 2k + 1 is odd, there are k non-real zeros of F in each of the four regions described above and the remaining two zeros are real.

(iii) If n is even, for -[n/2] < b < -[n/2] + 1/2, no zeros of F lie on |z - 1| = 1. If n = 4k, all zeros of F are non-real, whereas if n = 4k + 2, two zeros of F are real and 4k are non-real. If n is odd, for -1/2 - [n/2] < b < -[n/2] + 1/2, only the fixed real zero of F at z = 2 lies on |z - 1| = 1. If n = 4k + 1, n - 1 = 4k zeros of F are non-real, whereas if n = 4k + 3, two further zeros are real and the remaining 4k are non-real.

(iv) For j - n < b < j - n + 1, j = 1, 2, ..., [n/2] - 1, (n - 2j) zeros of F are real and greater than 1. If j = 2k is even, all remaining 2j zeros of F are non-real, with k zeros in each of the regions described above; while if j = 2k + 1, 4k zeros are non-real as before and 2 are real.

(v) For b < 1 - n, all zeros of F(-n, b; 2b; z) are real and greater than 1. As b → -∞, all the zeros of F converge to the point z = 2.
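Part (i) of Theorem 2.1 can be observed numerically. The following sketch is our own illustration; the helper builds the polynomial from the terminating series F(-n, b; c; z) = Σ_k (-n)_k (b)_k / ((c)_k k!) z^k and checks that all roots lie on |z - 1| = 1 for a sample b > -1/2.

```python
# Check Theorem 2.1(i): for b > -1/2, all zeros of F(-n, b; 2b; z)
# lie on the circle |z - 1| = 1.
import numpy as np

def hyp_coeffs(n, b, c):
    """Ascending coefficients of F(-n, b; c; z) = sum_k (-n)_k (b)_k / ((c)_k k!) z^k."""
    coeffs, term = [1.0], 1.0
    for k in range(n):
        term *= (-n + k) * (b + k) / ((c + k) * (k + 1))
        coeffs.append(term)
    return coeffs

n, b = 6, 0.25                                  # any b > -1/2
roots = np.roots(hyp_coeffs(n, b, 2*b)[::-1])   # np.roots expects descending order
assert np.allclose(np.abs(roots - 1.0), 1.0)    # all roots on |z - 1| = 1
```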
An analogous theorem which describes the behaviour of the zeros of C_n^λ(z) can be found in [6], Section 3, or [7], Theorem 1.2.

For the polynomial F(-n, b; 1/2; z) the following result has been proved in [7], Theorem 2.3.
Theorem 2.2. Let F = F(-n, b; 1/2; z) with b real.

(i) For b > n - 1/2, all n zeros of F are real and simple and lie in (0,1).

(ii) For n - 1/2 - j < b < n + 1/2 - j, j = 1, 2, ..., n - 1, (n - j) zeros of F lie in (0,1) and the remaining j zeros of F form [j/2] non-real complex conjugate pairs of zeros and one real zero lying in (1, ∞) when j is odd.

(iii) For 0 < b < 1/2, F has [n/2] non-real complex conjugate pairs of zeros, with one real zero in (1, ∞) when n is odd.

(iv) For -j < b < -j + 1, j = 1, 2, ..., n - 1, F has exactly j real negative zeros. There is exactly one further real zero greater than 1 only when (n - j) is odd, and all the remaining zeros of F are non-real.

(v) For b < 1 - n, all zeros of F are real and negative, and converge to zero as b → -∞.
A very similar theorem is proved for the zeros of F(-n, b; 3/2; z) in [7], Theorem 2.4, with only minor differences of detail.

For the hypergeometric polynomial F(-n, b; -2n; z), less complete results have been proved. We have (cf. [8], Theorem 3.1 and Corollary 3.2) the following.
Theorem 2.3. Let F = F(-n, b; -2n; z) with b real.

(i) For b > 0, F has n non-real zeros if n is even, whereas if n is odd, F has exactly one real negative zero and the remaining (n - 1) zeros of F are all non-real.

(ii) For -n < b < 0, if -k < b < -k + 1, k = 1, ..., n, F has k real zeros in the interval (1, ∞). In addition, if (n - k) is even, F has (n - k) non-real zeros, whereas if (n - k) is odd, F has one real negative zero and (n - k - 1) non-real zeros.

(iii) For -n > b > -2n, if -n - k > b > -n - k - 1, k = 0, 1, ..., n - 1, F has (n - k) real zeros in the interval (1, ∞). In addition, if k is even, F has k non-real zeros, while if k is odd, F has one real zero in (0,1) and (k - 1) non-real zeros.

(iv) For b < -2n, all n zeros of F are non-real for n even, whereas for n odd, F has exactly one real zero, which lies in the interval (0,1).
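A quick numerical spot-check of part (i), again only our own illustration: for n even and b > 0, every zero of F(-n, b; -2n; z) should come out non-real.

```python
# Theorem 2.3(i) for n even: all zeros of F(-n, b; -2n; z) are non-real.
import numpy as np

def hyp_coeffs(n, b, c):
    """Ascending coefficients of the terminating series F(-n, b; c; z)."""
    coeffs, term = [1.0], 1.0
    for k in range(n):
        term *= (-n + k) * (b + k) / ((c + k) * (k + 1))
        coeffs.append(term)
    return coeffs

n, b = 6, 1.5
roots = np.roots(hyp_coeffs(n, b, -2*n)[::-1])
assert np.all(np.abs(roots.imag) > 1e-6)        # no real zeros
```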
The identities (cf. [7], Lemma 2.1)

F(-n, b; c; 1 - z) = ((c - b)_n / (c)_n) F(-n, b; 1 - n + b - c; z)    (2.1)

and

F(-n, b; c; z) = ((b)_n / (c)_n) (-z)^n F(-n, 1 - c - n; 1 - b - n; 1/z)    (2.2)

hold for b and c real, c ∉ {0, -1, ..., -n + 1}. Applying (2.1) and (2.2) to each of the polynomials F(-n, b; 2b; z), F(-n, b; 1/2; z), F(-n, b; 3/2; z) and F(-n, b; -2n; z)
in turn, we obtain the remaining eight polynomials in the quadratic class. It is then an
easy task to deduce analogous results for their zero distribution.
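As an illustration (ours, using a direct evaluation of the terminating series), identity (2.1) can be spot-checked at a sample parameter point:

```python
# Spot-check of identity (2.1):
#   F(-n, b; c; 1-z) = ((c-b)_n / (c)_n) F(-n, b; 1-n+b-c; z)
import numpy as np

def F(n, b, c, z):
    """The terminating series F(-n, b; c; z), summed directly."""
    total, term = 1.0, 1.0
    for k in range(n):
        term *= (-n + k) * (b + k) / ((c + k) * (k + 1)) * z
        total += term
    return total

def poch(x, m):
    """Pochhammer symbol (x)_m."""
    p = 1.0
    for k in range(m):
        p *= x + k
    return p

n, b, c, z = 4, 1.3, 2.7, 0.37
lhs = F(n, b, c, 1 - z)
rhs = poch(c - b, n) / poch(c, n) * F(n, b, 1 - n + b - c, z)
assert np.isclose(lhs, rhs)
```

Identity (2.2) can be checked the same way, with the (-z)^n factor and argument 1/z.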
A similar set of results has been proved for the sixteen hypergeometric polynomials
in the cubic class. Again, this class arises from a necessary and sufficient condition (cf.
[2], p.67) and details can be found in [7].
3  The real zeros of F(-n, b; c; z) for b and c real
The results proved below are due to Klein [9], who considered the zeros of more general hypergeometric functions (not necessarily polynomials). Klein's proof is geometric and difficult to penetrate. A more transparent perspective in the polynomial case may be provided by the approach given here.
The classical equation linking the hypergeometric polynomial F(-n, b; c; z) with the Jacobi polynomials P_n^(α,β)(z) is given by (1.2). We will find an alternative expression (cf. [12], p.464, eqn. (142))

F(-n, b; c; z) = (n! z^n / (c)_n) P_n^(α,β)(1 - 2/z),    (3.1)

where α = -n - b and β = b - c - n, more suited to our analysis. The numbers of real zeros of P_n^(α,β)(x) in the intervals (-1,1), (-∞,-1) and (1,∞) are given by the Hilbert-Klein formulas (cf. [13], p.145, Theorem 6.72), also known to Stieltjes. We use Klein's symbol

E(u) = 0 if u ≤ 0;  E(u) = [u] if u > 0, u not an integer;  E(u) = u - 1 if u = 1, 2, 3, ....
Noting that under the linear fractional transformation w = 1 - 2/z, the intervals 1 < w < ∞, -∞ < w < -1 and -1 < w < 1 correspond to -∞ < z < 0, 0 < z < 1 and 1 < z < ∞ respectively, we can use equation (3.1) to rephrase the Hilbert-Klein formulas for hypergeometric polynomials.
Theorem 3.1. Let b, c ∈ ℝ with b, c, c - b ≠ 0, -1, ..., -n + 1. Let

X = E( (1/2)( |1 - c| - |n + b| - |b - c - n| + 1 ) ),    (3.2)

Y = E( (1/2)( -|1 - c| + |n + b| - |b - c - n| + 1 ) ),    (3.3)

Z = E( (1/2)( -|1 - c| - |n + b| + |b - c - n| + 1 ) ).    (3.4)

Then the numbers of zeros of F(-n, b; c; z) in the intervals (1, ∞), (0,1) and (-∞, 0) respectively are

N_1 = 2[(X + 1)/2] if (-1)^n C(-b, n) C(b - c, n) > 0,  N_1 = 2[X/2] + 1 if (-1)^n C(-b, n) C(b - c, n) < 0;    (3.5)

N_2 = 2[(Y + 1)/2] if C(-c, n) C(b - c, n) > 0,  N_2 = 2[Y/2] + 1 if C(-c, n) C(b - c, n) < 0;    (3.6)

N_3 = 2[(Z + 1)/2] if C(-b, n) C(-c, n) > 0,  N_3 = 2[Z/2] + 1 if C(-b, n) C(-c, n) < 0,    (3.7)

where C(x, n) = x(x - 1)···(x - n + 1)/n! denotes the generalized binomial coefficient.
Proof: The expressions all follow immediately from the Hilbert-Klein formulas (cf. [13], p.145, Thm. 6.72) together with equation (3.1). □
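As an illustration (ours, not part of the paper), the counts N_1, N_2, N_3 can be implemented directly, writing C(x, n) for the generalized binomial coefficient whose sign enters (3.5)-(3.7), and compared with numerically computed roots:

```python
# Klein/Hilbert real-root counts (Theorem 3.1) versus numerical roots.
import numpy as np
from math import floor

def E(u):
    """Klein's symbol."""
    if u <= 0:
        return 0
    return int(u) - 1 if float(u).is_integer() else floor(u)

def binom_sign(x, n):
    """Sign of the generalized binomial coefficient C(x, n)."""
    p = 1.0
    for k in range(n):
        p *= x - k
    return 1 if p > 0 else -1

def klein_counts(n, b, c):
    X = E(0.5 * ( abs(1-c) - abs(n+b) - abs(b-c-n) + 1))
    Y = E(0.5 * (-abs(1-c) + abs(n+b) - abs(b-c-n) + 1))
    Z = E(0.5 * (-abs(1-c) - abs(n+b) + abs(b-c-n) + 1))
    count = lambda W, s: 2*((W+1)//2) if s > 0 else 2*(W//2) + 1
    N1 = count(X, (-1)**n * binom_sign(-b, n) * binom_sign(b-c, n))
    N2 = count(Y, binom_sign(-c, n) * binom_sign(b-c, n))
    N3 = count(Z, binom_sign(-b, n) * binom_sign(-c, n))
    return N1, N2, N3            # zeros in (1,oo), (0,1), (-oo,0)

def numeric_counts(n, b, c):
    coeffs, term = [1.0], 1.0
    for k in range(n):
        term *= (-n + k) * (b + k) / ((c + k) * (k + 1))
        coeffs.append(term)
    r = np.roots(coeffs[::-1])
    real = r[np.abs(r.imag) < 1e-8].real
    return (int(np.sum(real > 1)),
            int(np.sum((0 < real) & (real < 1))),
            int(np.sum(real < 0)))

assert klein_counts(5, 3.7, 1.2) == numeric_counts(5, 3.7, 1.2) == (0, 3, 0)
```

The sample parameters fall in case (ii) of Theorem 3.2 below with j = 3, so three zeros lie in (0,1) and the remaining two are non-real.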
Theorem 3.2. Let F = F(-n, b; c; z) where b, c ∈ ℝ and c > 0.

(i) For b > c + n, all zeros of F are real and lie in the interval (0,1).

(ii) For c < b < c + n, c + j - 1 < b < c + j, j = 1, 2, ..., n, F has j real zeros in (0,1). The remaining (n - j) zeros of F are all non-real if (n - j) is even, while if (n - j) is odd, F has (n - j - 1) non-real zeros and one additional real zero in (1, ∞).

(iii) For 0 < b < c, all the zeros of F are non-real if n is even, while if n is odd, F has one real zero in (1, ∞) and the other (n - 1) zeros are non-real.

(iv) For -n < b < 0, -j < b < -j + 1, j = 1, 2, ..., n, F has j real negative zeros. The remaining (n - j) zeros of F are all non-real if (n - j) is even, while if (n - j) is odd, F has (n - j - 1) non-real zeros and one additional real zero in (1, ∞).

(v) For b < -n, all zeros of F are real and negative.
Proof: We use the identity (cf. [1], p.559, (15.3.4))

F(-n, b; c; z) = (1 - z)^n F(-n, c - b; c; z/(z - 1))    (3.8)

to show that (i) ⇒ (v) and (ii) ⇒ (iv), so that it will suffice to prove (i), (ii) and (iii) above.

(i) ⇒ (v): If b < -n then c - b > c + n and, by (i), all zeros of F(-n, c - b; c; w) are real and lie in the interval (0,1). Since w = z/(z - 1) maps (-∞, 0) to (0,1), (v) follows from (3.8).

(ii) ⇒ (iv): If -j < b < -j + 1, j = 1, 2, ..., n, then c + j - 1 < c - b < c + j, j = 1, 2, ..., n. By (ii), since w = z/(z - 1) maps (-∞, 0) to (0,1) and (1, ∞) to (1, ∞), (iv) follows again from (3.8).

To prove (i), (ii) and (iii), we note that in each part b > 0 (and of course c > 0 by assumption), so that

sign C(-b, n) = (-1)^n = sign C(-c, n).    (3.9)
(i) Suppose b > c + n. Then b - c > n and

sign C(b - c, n) > 0 for all n.    (3.10)

Considering (3.5), (3.6) and (3.7) with (3.9) and (3.10), we observe that

N_1 = 2[(X + 1)/2],  N_3 = 2[(Z + 1)/2],  and  N_2 = 2[(Y + 1)/2] for n even,  N_2 = 2[Y/2] + 1 for n odd.

Assume now that c > 1. Then for b > c + n, we have from (3.2), (3.3) and (3.4) that X = 0, Y = n, Z = 0. Substituting these values into N_1, N_2 and N_3 yields the result. A similar calculation shows that the same result is obtained when 0 < c < 1.
(ii) For c + j - 1 < b < c + j, j = 1, 2, ..., n, we find that sign C(b - c, n) = (-1)^(n-j). Then from (3.5), (3.6), (3.7) we see that

N_1 = 2[(X + 1)/2] for (n - j) even,  N_1 = 2[X/2] + 1 for (n - j) odd;
N_2 = 2[(Y + 1)/2] for j even,  N_2 = 2[Y/2] + 1 for j odd;
N_3 = 2[(Z + 1)/2].

It follows from (3.2), (3.3) and (3.4) by an easy calculation that X = 0, Y = j, Z = 0, and we deduce that N_1 = 0 if (n - j) is even, N_1 = 1 if (n - j) is odd, N_2 = j and N_3 = 0, which proves (ii).
(iii) For 0 < b < c, sign C(b - c, n) = (-1)^n. Then N_1 = 2[(X + 1)/2] if n is even, N_1 = 2[X/2] + 1 if n is odd; N_2 = 2[(Y + 1)/2], N_3 = 2[(Z + 1)/2]. Also, we find X = 0, Y = 0 and Z = 0, which completes the proof of (iii) and hence the theorem. □
For c < 0, the range of values of b and c that have to be considered can be reduced if we use the identities (2.1) and (2.2). Since the real zeros of F(-n, b; c; z) are now known for all c > 0 and b ∈ ℝ from Theorem 3.2, it follows from (2.1) that we need only consider c - b > 1 - n. Similarly, from (2.2) and Theorem 3.2, we can assume b > 1 - n. We split the result for c < 0 into the cases b > 0 and 1 - n < b < 0.
Theorem 3.3. Let F = F(-n, b; c; z). Suppose that c < 0, b > 0, c - b > 1 - n. Then

(i) 1 - n < c - b < 0 and 0 < b < n - 1 and 1 - n < c < 0.

(ii) If -k < c < -k + 1, k = 1, ..., n - 1, and -j < c - b < -j + 1, j = 1, ..., n - 1, then F(-n, b; c; z) has (j - k) ≥ 0 real zeros in (0,1). For the remaining (n - j + k) zeros of F:

(a) (n - j + k) are non-real if (n - j) and k are even;

(b) (n - j + k - 1) are non-real and one real zero lies in (1, ∞) if (n - j) is odd and k is even;

(c) (n - j + k - 1) are non-real if (n - j) is even and k is odd, and one zero is real and negative;

(d) (n - j + k - 2) are non-real if (n - j) is odd and k is odd, with one real negative zero and one real zero in (1, ∞).
Proof:

(i) This follows immediately from c < 0, b > 0, c - b > 1 - n.

(ii) For c < 0, b > 0, c - b > 1 - n, we have

|1 - c| = 1 - c,  |b + n| = b + n,  |b - c - n| = c - b + n,

and it follows from (3.2), (3.3) and (3.4) that

X = E(1 - c - n),  Y = E(b),  Z = E(c - b).

Since 1 - c - n < 0 and c - b < 0, X = Z = 0. Now sign C(-b, n) = (-1)^n, and for k = 1, ..., n - 1, -k < c < -k + 1 implies sign C(-c, n) = (-1)^(n-k), while for -j < c - b < -j + 1, j = 1, ..., n - 1, sign C(b - c, n) = (-1)^(n-j). Therefore, from (3.5), (3.6) and (3.7),

N_1 = 0 if (n - j) is even,  N_1 = 1 if (n - j) is odd;    (3.11)

N_2 = 2[(Y + 1)/2] if (j + k) is even,  N_2 = 2[Y/2] + 1 if (j + k) is odd;    (3.12)

N_3 = 0 if k is even,  N_3 = 1 if k is odd.    (3.13)

Now j > b - c > j - 1 and -k < c < -k + 1 give b ∈ (j - k - 1, j - k + 1), with j - k = 0, 1, ..., n - 2. If b ∈ (j - k - 1, j - k), then Y = E(b) = j - k - 1, whereas if b ∈ (j - k, j - k + 1), then Y = E(b) = j - k. Considering the cases (j - k) even and (j - k) odd, it is straightforward to check that for all j, k ∈ N with j - k = 0, 1, ..., n - 2, we have

N_2 = j - k.    (3.14)

Equations (3.11), (3.12), (3.13) and (3.14) complete the proof of (ii). □
By virtue of Theorem 3.3 and the identities (2.1), (2.2) and (3.8), it is easy to see that we only have one possibility left that has not been analysed, namely

1 - n < c - b < 0,  1 - n < b < 0,  1 - n < c < 0.    (3.15)
Theorem 3.4. Let F = F(-n, b; c; z) where b and c satisfy condition (3.15). If -j < b < -j + 1, j = 1, ..., n - 1; -k < c < -k + 1, k = 1, ..., n - 1; and -ℓ < c - b < -ℓ + 1, ℓ = 1, ..., n - 1, then F has no real zeros if n + j + ℓ, k + ℓ and j + k are all even, one real zero in (1, ∞) if n + j + ℓ is odd, one real zero in (0,1) if k + ℓ is odd, and one real negative zero if j + k is odd.
Proof: Under the restrictions (3.15), we have

|1 - c| = 1 - c,  |b + n| = b + n,  |b - c - n| = c - b + n.

Then from (3.2), (3.3) and (3.4),

X = E(1 - c - n),  Y = E(b),  Z = E(c - b),

and it follows from (3.15) that X = Y = Z = 0. Also, sign C(-b, n) = (-1)^(n-j), sign C(-c, n) = (-1)^(n-k) and sign C(b - c, n) = (-1)^(n-ℓ). The stated result then follows immediately from (3.5), (3.6) and (3.7). □
Remark 3.1. We have not considered the asymptotic zero distribution of F(-n, b; c; z) as n → ∞. There are recent interesting results in this regard using different approaches, namely complex analysis techniques [10], matrix theoretic tools [11], asymptotic analysis of the Euler integral representation [3] and analysis of coefficients [8].
Bibliography
1. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, (Dover, New
York, 1965).
2. Bateman Manuscript Project, Higher Transcendental Functions, Volume I, (A. Erdélyi, editor; McGraw-Hill, New York, 1953).
3. K. Driver and P. Duren, "Asymptotic zero distribution of hypergeometric polynomials", Numerical Algorithms, 21 (1999), 147-156.
4. K. Driver and P. Duren, "Zeros of the hypergeometric polynomials F(-n, b; 2b; z)",
Indag. Math., 11 (1) (2000), 43-51.
5. K. Driver and P. Duren, "Trajectories of the zeros of hypergeometric polynomials F(-n, b; 2b; z) for b < -1/2", Constr. Approx., 17 (2001), 169-179.
6. K. Driver and P. Duren, "Zeros of ultraspherical polynomials and the Hilbert-Klein
formulas", J. Comput. and Appl. Math., 135 (2001), 293-301.
7. K. Driver and M. Möller, "Quadratic and cubic transformations and the zeros of hypergeometric polynomials", J. Comput. and Appl. Math., to appear.
8. K. Driver and M. Möller, "Zeros of the hypergeometric polynomials F(-n, b; -2n; z)", J. Approx. Th., 110 (2001), 74-87.
9. F. Klein, "Über die Nullstellen der hypergeometrischen Reihe", Mathematische Annalen, 37 (1890), 573-590.
10. A.B.J. Kuijlaars and W. van Assche, "The asymptotic zero distribution of orthogonal polynomials with varying weights", J. Approx. Th., 99 (1999), 167-197.
11. A.B.J. Kuijlaars and S. Serra Capizzano, "Asymptotic zero distribution of orthogonal polynomials with discontinuously varying recurrence coefficients", J. Approx.
Th., to appear.
12. A.P. Prudnikov, Yu. A. Brychkov and O.I. Marichev, Integrals and Series, Volume
3, (Moscow, "Nauka", 1986 (in Russian); English translation, Gordon & Breach,
New York, 1988); Errata in Math. Comp., 65 (1996), 1380-1384.
13. G. Szegő, Orthogonal Polynomials, (American Mathematical Society, New York, 1959).
14. N. Temme, Special Functions: An Introduction to the Classical Functions of Mathematical Physics, (Wiley, New York, 1996).
Approximation error maps
A. Gomide and J. Stolfi
Institute of Computing, University of Campinas, Brazil.
anamaria@ic.unicamp.br, stolfi@ic.unicamp.br
Abstract
In order to analyze the accuracy of a fixed, finite-dimensional approximation space which is not uniform over its domain Ω, we define the approximation error map, a description of how the error is distributed over Ω, not for a single test function but for a general class of such functions. We show how to compute such a map from the best approximations to an orthonormal basis of the target function space.
1  Introduction
The expected accuracy of a finite-dimensional approximation space (e.g. a polynomial spline space, or a finite wavelet decomposition) will often vary over its domain Ω. Indeed, adaptive-resolution schemes are based on the premise that refining the element grid in a particular region of Ω will improve the approximation accuracy in that region.

Knowledge of how the expected approximation error varies over the domain Ω is obviously relevant to the evaluation of an approximation space, and to the tuning of knot locations, grid geometry, refinement thresholds and other parameters. Towards that goal, we introduce the concept of the approximation error map, a description of how the error is distributed over Ω, not for a single test function, but for all functions in some specified space F. We then show how to compute such a map from the best approximations to an orthonormal basis of F.
1.1  Notation and definitions
Let F and A be two fixed, finite-dimensional vector spaces, not necessarily disjoint, of functions defined on some domain Ω with values in R. Let ||·|| be a vector semi-norm for the space A + F. For any function f ∈ F, we define its best approximation as the function f^A ∈ A that minimizes the error ||f - f^A||.

We refer to A and F as the approximation and gauge spaces, respectively. We assume that the ||·||-balls in the subspace A are strictly convex, ensuring that the best approximation always exists and is unique. Since (αf)^A = α(f^A) and ||αf|| = |α| ||f|| for any real constant α, we can confine the analysis of approximation errors to the unit F-sphere F_1 = { f ∈ F : ||f|| = 1 }.
1.2  Global error measures
Usually, the effectiveness of the approximation space A is measured by a single number ||f - f^A||, either for the worst-case function f ∈ F_1, or by the root-mean-power average over all functions f ∈ F_1:

σ_{p,A,F} = [ ∫_{F_1} ||f - f^A||^p df / ∫_{F_1} df ]^{1/p}.    (1.1)

Note that the integrals are taken over the function space F_1, not over the domain Ω.
The worst-case error is the limit

μ_{A,F} = lim_{p→∞} σ_{p,A,F} = sup{ ||f - f^A|| : f ∈ F_1 }.    (1.2)

1.3  Uniform approximation spaces
A global error measure such as μ_{A,F} or σ_{p,A,F} is generally sufficient when all points of Ω are equivalent with respect to the quality of approximation. More formally, we say that a normed function space X is uniform over Ω if there is some family Φ of maps from Ω to Ω that preserves X and its norm ||·||, and which can take any point of Ω to any other point. A natural example is H_n, the set of all harmonic functions on the sphere S^2 of a given maximum order n, with any L_p norm; this space is preserved by the family of rigid rotations of S^2. Obviously, if both A and F are uniform under the same family Φ, then A approximates F equally well at all points of Ω. (Of course, for any specific function f ∈ F, the error f - f^A will usually vary over Ω.)

There are however many important approximation spaces A which are not uniform. A familiar example is the space of polynomials or trigonometric series defined on a bounded region Ω ⊂ R^n. Another example is the space of piecewise polynomial splines of fixed order and continuity defined over a fixed grid G. Wavelet spaces truncated to a fixed order provide yet another example. For such spaces, the expected approximation error usually varies over Ω, even when the functions to be approximated are drawn from a uniform space.
2  Approximation error map
We define the root mean power approximation error map of F by A as the function σ_{p,A,F} from Ω to R defined by

σ_{p,A,F}(x) = [ ∫_{F_1} |f(x) - f^A(x)|^p df / ∫_{F_1} df ]^{1/p}.    (2.1)

As before, the integrals are taken over the function space F_1, not over the domain Ω. Note that σ_{p,A,F}(x) is not the error for a specific function f, but rather the average error at the point x for a generic function f in F_1. As a limiting case, we define also the worst-case approximation error map of F by A as the function

μ_{A,F}(x) = lim_{p→∞} σ_{p,A,F}(x) = sup{ |f(x) - f^A(x)| : f ∈ F_1 }.    (2.2)

Again note that the supremum is taken over F_1, not over Ω, and that μ_{A,F}(x) is not the error at x for a single function f, but rather the error for the function f in F_1 that is worst for that particular x. A plot of σ_{p,A,F}(x) or μ_{A,F}(x) over Ω should show at a
glance how well A approximates F in different parts of the domain, for all functions of F at once.
3  Computing the approximation error map
Formulas (2.1)-(2.2) become more tractable when the function metric ||·|| is the L_2 norm ||f|| = [ ∫_Ω |f(x)|^2 dx ]^{1/2} defined on the space A + F; in other words, when ||f||^2 = ⟨f, f⟩, where ⟨f, g⟩ = ∫_Ω f(x)g(x) dx. We make this assumption in the remainder of this section. In that case, f^A is a linear function of f, namely the orthogonal projection of f onto the subspace A; and μ_{A,F} is simply |sin θ|, where θ is the angle between the two subspaces.
3.1  Explicit formula for σ
Let us suppose that A and F are disjoint, and let φ_1, ..., φ_n be an orthonormal basis for F. Let α_i = φ_i^A for all i, and let ε_i = φ_i - α_i. We will call φ, α, and ε the gauge, approximation, and error bases, respectively (even though the α_i and ε_i need not be independent). The average error map σ_{p,A,F}(x) can be expressed in terms of the error basis:

σ_{p,A,F}(x) = [ (1/A_n) ∫_{S^{n-1}} | Σ_i c_i ε_i(x) |^p dc ]^{1/p},    (3.1)

where A_n = 2π^{n/2}/Γ(n/2) is the measure of S^{n-1}.
Note that Σ_i c_i ε_i(x) is the dot product of the unit vector c = (c_1, c_2, ..., c_n) and the vector ε(x) = (ε_1(x), ε_2(x), ..., ε_n(x)); it depends only on |ε(x)| and on the angle θ between those two vectors, and is constant over the slice of S^{n-1} where θ is constant. The measure of that slice is A_{n-1} |sin θ|^{n-2} dθ. Therefore,
σ_{p,A,F}(x) = [ (A_{n-1}/A_n) ∫_0^π ( |ε(x)| |cos θ| )^p |sin θ|^{n-2} dθ ]^{1/p}
            = |ε(x)| [ Γ(n/2) Γ((p+1)/2) / ( √π Γ((n+p)/2) ) ]^{1/p}.    (3.2)
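The closed form on the right of (3.2) can be spot-checked by Monte-Carlo integration over the unit sphere (our own check, not part of the paper; it assumes NumPy and SciPy):

```python
# Monte-Carlo check of (3.2): for a fixed vector eps in R^n, the p-th root
# of the mean of |c . eps|^p over the uniform unit sphere S^{n-1} equals
# |eps| * ( Gamma(n/2) Gamma((p+1)/2) / (sqrt(pi) Gamma((n+p)/2)) )^{1/p}.
import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(1)
n, p = 5, 2
eps = rng.normal(size=n)                       # stands in for (eps_1(x),...,eps_n(x))
c = rng.normal(size=(200_000, n))
c /= np.linalg.norm(c, axis=1, keepdims=True)  # uniform samples of S^{n-1}

mc = np.mean(np.abs(c @ eps)**p)**(1/p)
exact = np.linalg.norm(eps) * (gamma(n/2) * gamma((p+1)/2)
                               / (np.sqrt(np.pi) * gamma((n+p)/2)))**(1/p)
assert abs(mc - exact) / exact < 0.02
```

For p = 2 the bracketed constant reduces to 1/n, i.e. exact = |eps|/√n.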
3.2  Explicit formula for μ
The worst-case error map μ_{A,F} can be obtained by taking p to the limit +∞ in formula (3.2), or directly, as follows. From formula (2.2),

μ_{A,F}(x) = sup{ | (Σ_i c_i φ_i)(x) - (Σ_i c_i φ_i)^A(x) | : ||Σ_i c_i φ_i|| = 1 }
           = sup{ | Σ_i c_i ε_i(x) | : c ∈ S^{n-1} }.    (3.3)
By considering the effect of negating each c_i, it is easy to see that the absolute value in the last formula is superfluous, i.e.

μ_{A,F}(x) = sup{ Σ_i c_i ε_i(x) : c ∈ S^{n-1} }.    (3.4)
Formula (3.4) is the supremum of a linear functional with coefficients ε_i(x) over the sphere S^{n-1}, which is achieved at the point c*(x) of S^{n-1} that is collinear with the coefficient vector, namely c*_i(x) = ε_i(x) / √( Σ_j (ε_j(x))^2 ), whence

μ_{A,F}(x) = Σ_i c*_i(x) ε_i(x) = √( Σ_i (ε_i(x))^2 ) = |ε(x)|.    (3.5)
In summary, the error maps σ_{p,A,F}(x) and μ_{A,F}(x) (which differ only by a constant factor) can be derived from the approximation errors ε_i(x) of the individual basis functions φ_i(x), combined in the norm |ε(x)| = √( Σ_i (ε_i(x))^2 ).
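To make the recipe concrete, here is a toy computation of our own (not from the paper): Ω = [-1,1], the gauge space spanned by the orthonormalized Legendre polynomials of degree ≤ 3, and a one-dimensional approximation space spanned by a(x) = |x|, which is disjoint from the gauge space.

```python
# Worst-case error map mu(x) = |eps(x)| (formula (3.5)) for a toy pair of spaces.
import numpy as np
from numpy.polynomial import legendre

x = np.linspace(-1, 1, 4001)
dx = x[1] - x[0]
w = np.full_like(x, dx); w[0] = w[-1] = dx / 2    # trapezoid quadrature weights
inner = lambda f, g: np.sum(f * g * w)            # discrete L2 inner product

# Gauge basis: orthonormalized Legendre polynomials P_0..P_3 on [-1,1]
phi = [legendre.Legendre.basis(i)(x) * np.sqrt((2*i + 1) / 2) for i in range(4)]

# Approximation space A = span{a}, a(x) = |x|; f^A is the L2 projection onto A
a = np.abs(x)
proj = lambda f: inner(f, a) / inner(a, a) * a

eps = [p - proj(p) for p in phi]                  # error basis eps_i = phi_i - alpha_i
mu = np.sqrt(sum(e**2 for e in eps))              # worst-case error map over Omega
```

By (3.2), the root-mean-power map σ_{p,A,F}(x) is this same function times a constant depending only on n and p.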
4  Practical considerations

4.1  Connection between the function and point norms
The maps (2.1) and (2.2) will be more useful when there is a direct connection between the function-space norm ||·|| and the absolute value |·| used to compare function values at a given point x, as in formulas (2.1)-(2.2); namely, when

||f|| = [ ∫_Ω |f(x)|^q dx ]^{1/q}.    (4.1)
More generally, the function values at x could be compared with a norm which could depend on x, or take derivatives of the function into account. We will not pursue such extensions in this paper.

Connection (4.1) is not strictly necessary, at least when A and F are finite-dimensional. However, it may not make much sense to choose the approximant f^A so as to minimize the function norm ||·||, and then analyze its accuracy using some other norm |·|, if there is no connection between the two.

Considering that the error map is relatively easy to compute when ||·|| is the L_2 norm (see Section 3), and probably intractable otherwise, the connection expressed by formula (4.1) will probably hold in practice (with q = 2).
4.2  Choice of the gauge space
The approximation error map depends not only on the space A, but also on the gauge space F and the error metric ||·||. Therefore, the choice of F and ||·|| must be guided by the intended application.

For example, suppose the domain Ω is the circle S^1 or the sphere S^2, and the application does not specify a preferred direction. Then we should choose F and ||·|| so that they are invariant under rotations of Ω; otherwise, any inhomogeneity in them may produce irrelevant artifacts in the error map. Also, if the functions to be approximated are expected to be smooth, and/or only their low frequencies are important, then the functions in F should be smooth too. A natural choice for F, in this case, are the circular or spherical harmonics up to a certain maximum order, and the metric ||·|| can simply be the L_q norm over Ω.
4.3 Essential dimensions
We will argue next that, for the L_2 function norm, the "interesting" part of the error map is determined by two "essential" subspaces F' ⊆ F and A' ⊆ A, which are disjoint and such that dim F' ≥ dim A'.

First, if the spaces A and F have a non-trivial intersection V, and we split a function f ∈ F into its components g ∈ V and h ⊥ V, we find that f^A = g + h^A, and that h^A is itself orthogonal to V. Therefore, we can confine our attention to the complements F' and A' of V relative to F and A, which are disjoint.

Let us then suppose that A and F are disjoint. If dim F < dim A, let A' ⊆ A be the projection of F onto A, which contains all optimum approximants. Obviously, for any function f, we have f^A = f^{A'}, so we can confine our attention to the space A', which is still disjoint from F and satisfies dim F ≥ dim A'.
5  Examples

5.1  Trigonometric splines on the circle
Consider the approximation of a function by continuous trigonometric splines of maximum frequency r = 2, defined on a partition T of S^1 into n = 8 unequal intervals. This space coincides with the space P_2[T] of non-homogeneous polynomial splines of R^2, restricted to S^1, with C_0 continuity constraints [2].
For the gauge space F, we will use the family of trigonometric series truncated after a suitable maximum frequency s ≥ r, which coincides with the space of general spherical polynomials (not splines) P_s for some s ≥ r. The norm is ||f|| = √⟨f, f⟩, where ⟨f, g⟩ = ∫_{S^1} f(φ)g(φ) dφ. Specifically, T consists of the intervals I_j = [t_j, t_{j+1}], j = 0, ..., 7, with breakpoints t_0 = 0 < t_1 < ··· < t_8 = 2π.
Within each interval I_j, the generic approximant is a linear combination g_j of the Fourier basis functions φ_i, for -r ≤ i ≤ +r. These partial functions are constrained to be continuous across the interval boundaries, i.e. g_{j-1}(t_j) = g_j(t_j) for each j in {0, ..., n-1} (where all indices are taken modulo n). These equations turn out to be independent; therefore the dimension of A is n(2r+1) - n = 32.
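The independence claim is easy to verify numerically. The sketch below is our own illustration (the breakpoints are arbitrary distinct values, since only their distinctness matters for the count): it builds the n continuity constraints and checks that the constraint matrix has full rank, so dim A = n(2r+1) - n.

```python
# Rank check for the continuity constraints of the C_0 trigonometric splines.
import numpy as np

n_int, r = 8, 2                    # 8 intervals, maximum frequency 2
m = 2*r + 1                        # coefficients per interval
t = np.sort(np.random.default_rng(0).uniform(0, 2*np.pi, n_int))  # breakpoints

def basis(th):
    """Fourier basis values (phi_{-r}(th), ..., phi_{r}(th))."""
    return np.array([np.sin(i*th + np.pi/4) / np.sqrt(np.pi)
                     for i in range(-r, r + 1)])

# One row per constraint g_{j-1}(t_j) - g_j(t_j) = 0; the unknowns are the
# m coefficients of each of the n_int intervals.
C = np.zeros((n_int, n_int * m))
for j in range(n_int):
    jm1 = (j - 1) % n_int
    C[j, jm1*m:(jm1 + 1)*m] += basis(t[j])
    C[j, j*m:(j + 1)*m] -= basis(t[j])

assert np.linalg.matrix_rank(C) == n_int   # the n constraints are independent
dim_A = n_int * m - n_int                  # = 32
```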
For the gauge space F, we will use the trigonometric polynomials of some order s ≥ r, i.e. linear combinations of the basis functions φ_i for -s ≤ i ≤ +s, where φ_i(θ) = (1/√π) sin(iθ + π/4). As observed in Section 4.3, we can ignore the subspace F ∩ A of A generated by φ_{-r}, ..., φ_r. Moreover, in order to use all of A, we need dim F ≥ dim A, i.e. 2s + 1 ≥ 32, implying s ≥ 16. See Figure 1. The resulting error map μ_{A,F}(x) is shown in Figure 2.
FIG. 1. The functions φ_i(t), α_i(t), and ε_i(t), for selected values of i.

FIG. 2. The error map μ_{A,F}(t) for continuous (C_0) trigonometric splines on eight unequal intervals, tested with the space of trigonometric polynomials of order 16.
5.2  Spherical splines on a uniform mesh
For the examples in this section, the approximating functions are spherical polynomial splines [1, 2, 3, 4] of continuity class zero and various degrees, homogeneous and non-homogeneous, defined on some triangulation T of the sphere S^2.

Figure 3 (left) shows the approximation error map μ_{A,F}(p) for the homogeneous spherical spline space A = H_0[T]/S^2, which has dimension 252. In Figure 3 (right), A is the non-homogeneous spherical spline space P_0[T]/S^2, which has dimension 254. In both cases, the gauge space F is the family H_15 of spherical harmonics of maximum order 15, which has dimension 256. The intersection F ∩ H_0[T]/S^2 is the family of spherical harmonics of odd order ≤ 5 (dimension 21), whereas F ∩ P_0[T]/S^2 is the full harmonic space H_4 (dimension 25). The level curves are logarithmically spaced, five per decade.
FIG. 3. Error maps μ_{A,F}(p) for the approximation spaces A = H_0[T]/S^2 (left) and A = P_0[T]/S^2 (right). The maximum errors are 13.5 and 9.37, respectively.
5.3  Spherical splines on a variable mesh
In the following examples, the approximating functions are again spherical polynomial splines, but the vertices of the triangulation T have been displaced so as to create regions of very different sizes (still with icosahedral topology).

Figure 4 (left) shows the approximation error map μ_{A,F}(p) for the space of homogeneous spherical splines A = H_0[T]/S^2, which has dimension 252. In Figure 4 (right), A is the space of non-homogeneous spherical splines P_0[T]/S^2, which has dimension 254. In both cases, the gauge space F is the family H_15 of spherical harmonics of maximum order 15, which has dimension 256, as before. The level curves are logarithmically spaced (5 per decade).
6  Conclusion
Asymptotic error analysis is not very helpful when comparing two fixed finite-dimensional approximation spaces of similar dimensions, such as a spline space against a wavelet space, or two spline spaces with different grid geometries. Approximation errors computed for individual test functions are difficult to interpret and may not be representative of the average or worst cases. We expect that the approximation error map will be a useful analysis tool for those situations, especially for domains that admit natural uniform target spaces, such as spheres (including the circle) and tori.
Acknowledgments. This research was supported in part by CAPES, FINEP, and
CNPq (PRONEX-SAI).
FIG. 4. Error maps μ_{A,F}(p) (front and rear views) for the approximation spaces A = H_0[T]/S^2 (left) and A = P_0[T]/S^2 (right). The maximum errors are 17.1 and 17.9, respectively.
Bibliography
1. P. Alfeld, M. Neamtu, and L. L. Schumaker. Dimension and local bases of homogeneous spline spaces. SIAM Journal on Mathematical Analysis, 27(5):1482-1501, Sept. 1996.
2. A. Gomide and J. Stolfi. Non-homogeneous polynomial C^k splines on the sphere S^n. Technical Report IC-00-10, Institute of Computing, Univ. of Campinas, July 2000.
3. A. Gomide and J. Stolfi. Bases for non-homogeneous polynomial C^k splines on the sphere. In Lecture Notes in Computer Science 1380: Proc. LATIN'98 - Latin American Theoretical Informatics Conference, pages 133-140. Springer, Apr. 1998.
4. A. Gomide. Splines Polinomiais Não Homogêneos na Esfera. PhD thesis, Institute of Computing, University of Campinas, May 1999. (In Portuguese).
Approximation by perceptron networks
Věra Kůrková
Institute of Computer Science, Academy of Sciences of the Czech Republic
Pod vodárenskou věží 2, P.O. Box 5, 182 07 Prague 8, Czechia
vera@cs.cas.cz
1  Introduction
The classical perceptron, proposed by Rosenblatt [22] as a simplified model of a neuron, computes a weighted sum of its inputs and, after comparing it with a threshold, applies an activation function representing a rate of neuron firing. To model this rate, Rosenblatt used the Heaviside discontinuous threshold function, which still is, together with its various continuous approximations, the most widespread type of activation used in neurocomputing. Formally, a perceptron with the Heaviside activation function computes a characteristic function of a half-space of R^d, which is for practical reasons (all inputs are bounded) restricted to a box, usually [0,1]^d. Thus the theoretical study of perceptron networks leads to various questions concerning approximation of functions by a special class of plane waves formed by linear combinations of characteristic functions of half-spaces (corresponding to the simplest model of perceptron network, called the one-hidden-layer network with a linear output unit).
Although Rosenblatt's model was inspired biologically, plane waves (sometimes called ridge functions) have been studied for a long time by mathematicians motivated by various problems from
physics. In contrast to integration theory, where functions are approximated by linear combinations
of characteristic functions of boxes (simple functions), the theory of perceptron networks studies
approximation of multivariable functions by linear combinations of characteristic functions of half-spaces. Expressions in terms of such functions exhibit the strength and weakness of the plane-wave methods described by Courant and Hilbert [4], page 676: "But always the use of plane waves fails to
exhibit clearly the domains of dependence and the role of characteristics. This shortcoming, however,
is compensated by the elegance of explicit results."
In this paper we survey our recent results on properties of approximation by linear combinations of characteristic functions of half-spaces. We focus on the existence of a best approximation, the impossibility of choosing a continuous mapping among best approximations, estimates of rates of approximation by linear combinations of n characteristic functions of half-spaces, and an integral representation as a linear combination of a continuum of characteristic functions of half-spaces.
This work was partially supported by GA CR 201/99/0092 and 201/02/0428.
2
Preliminaries
A perceptron with an activation function ψ : R → R (where R denotes the set of real numbers) computes real-valued functions on R^d × R^{d+1} of the form ψ(v · x + b), where x ∈ R^d is an input vector, v ∈ R^d is an input weight vector and b ∈ R is a bias.
The most common activation functions are sigmoidals, i.e., functions with an S-shaped graph. Both continuous and discontinuous sigmoidals are used. Here, we study networks based on the discontinuous Heaviside function θ defined by θ(t) = 0 for t < 0 and θ(t) = 1 for t ≥ 0. Let H_d denote the set of functions on [0,1]^d computable by Heaviside perceptrons, i.e.,

H_d = { f : [0,1]^d → R : f(x) = θ(v · x + b), v ∈ R^d, b ∈ R }.

Notice that H_d is the set of characteristic functions of half-spaces of R^d restricted to [0,1]^d.
For all positive integers d, H_d is compact in (L_p([0,1]^d), ||.||_p) with p ∈ [1, ∞) (see, e.g., [8]). This can be verified easily once the set H_d is reparameterized by elements of the unit sphere S^d in R^{d+1}. Indeed, a function θ(v · x + b) with a non-zero parameter vector (v_1, ..., v_d, b) ∈ R^{d+1} is equal to θ(v' · x + b'), where (v'_1, ..., v'_d, b') ∈ S^d is obtained from (v_1, ..., v_d, b) by normalization.
The simplest type of multilayer feedforward network has one hidden layer and one linear output unit. Such networks with Heaviside perceptrons in the hidden layer compute functions of the form

sum_{i=1}^{n} w_i θ(v_i · x + b_i),

where n is the number of hidden units, w_i ∈ R are output weights and v_i ∈ R^d and b_i ∈ R are input weights and biases, respectively. The set of all such functions, i.e., of all linear combinations of n elements of H_d, is denoted by span_n H_d.
For all positive integers d, the union of span_n H_d over all n ∈ N_+ (where N_+ denotes the set of all positive integers) is dense in (C([0,1]^d), ||.||_C), the linear space of all continuous functions on [0,1]^d with the supremum norm, as well as in (L_p([0,1]^d), ||.||_p) with p ∈ [1, ∞) (see, e.g., [5, 9]).
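Concretely, an element of span_n H_d is cheap to evaluate. The following minimal sketch (the function and variable names are ours, purely illustrative) computes the output of a one-hidden-layer Heaviside network at a point of [0,1]^d:

```python
import numpy as np

def heaviside(t):
    # Heaviside threshold: theta(t) = 0 for t < 0 and 1 for t >= 0
    return np.where(t >= 0.0, 1.0, 0.0)

def perceptron(x, v, b):
    # Characteristic function of the half-space {x : v.x + b >= 0},
    # restricted in practice to the box [0,1]^d
    return heaviside(np.dot(v, x) + b)

def network(x, weights, vs, bs):
    # Element of span_n H_d: sum_i w_i * theta(v_i . x + b_i)
    return sum(w * perceptron(x, v, b) for w, v, b in zip(weights, vs, bs))

# A point in [0,1]^2 and a network with n = 2 hidden Heaviside units
x = np.array([0.25, 0.5])
vs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
bs = [-0.2, -0.8]
weights = [2.0, -1.0]
print(network(x, weights, vs, bs))  # 2*theta(0.05) - 1*theta(-0.3) = 2.0
```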
3
Existence of a best approximation
A subset M of a normed linear space (X, ||.||) is called proximinal if for every f ∈ X the distance ||f − M|| = inf_{g ∈ M} ||f − g|| is achieved for some element of M, i.e., ||f − M|| = min_{g ∈ M} ||f − g|| (see, e.g., [23]). Clearly, a proximinal subset must be closed.
A sufficient condition for proximinality of a subset M of a normed linear space (X, ||.||) is compactness or bounded compactness. However, by extending H_d to span_n H_d for any positive integer n we lose compactness. Nevertheless, compactness can be replaced by a weaker property that requires only those sequences that "minimize" the distance of an element of X from M to have convergent subsequences. More precisely, a subset M of a normed linear space (X, ||.||) is called approximatively compact if for each f ∈ X and any sequence {g_i : i ∈ N_+} ⊆ M such that lim_{i→∞} ||f − g_i|| = ||f − M||, there exists g ∈ M such that {g_i : i ∈ N_+} converges subsequentially to g (see, e.g., [23], p. 368). The following theorem is from [16].
Theorem 3.1 For all positive integers n, d, span_n H_d is an approximatively compact subset of (L_p([0,1]^d), ||.||_p) with p ∈ [1, ∞).
The proof is based on an argument showing that any sequence of elements of span_n H_d has a subsequence that converges either to an element of span_n H_d or to a Dirac delta distribution, and the latter case cannot occur when such a sequence "minimizes" the distance from some function in L_p([0,1]^d).
It follows directly from the definitions that each approximatively compact subset is proximinal.
Corollary 3.2 For all positive integers n, d, span_n H_d is a proximinal subset of (L_p([0,1]^d), ||.||_p) with p ∈ [1, ∞).
Thus, for any fixed number n, every function in L_p([0,1]^d) has a best approximation among the functions computable by linear combinations of n characteristic functions of half-spaces.
4
Uniqueness and continuity of a best approximation
Let M be a subset of a normed linear space (X, ||.||) and let P(M) denote the set of all subsets of M. The set-valued mapping P_M : X → P(M) defined by P_M(f) = {g ∈ M : ||f − g|| = ||f − M||} is called the metric projection of X onto M, and P_M(f) is called the projection of f onto M.
Let F : X → P(M) be a set-valued mapping. A selection from F is a mapping φ : X → M such that for all f ∈ X, φ(f) ∈ F(f). A mapping φ : X → M is called a best approximation operator from X to M if it is a selection from P_M.
When M is proximinal, P_M(f) is non-empty for all f ∈ X and so there exists a best approximation mapping from X to M. The best approximation need not be unique. When it is unique, M is called a Chebyshev set (or "unicity" set). Thus M is Chebyshev if for all f ∈ X the projection P_M(f) is a singleton.
Recall that a normed linear space (X, ||.||) is called strictly convex (also called "rotund") if for all f ≠ g in X with ||f|| = ||g|| = 1 we have ||(f + g)/2|| < 1. It is well known that for all p ∈ (1, ∞), (L_p([0,1]^d), ||.||_p) is strictly convex.
The following theorem from [13] implies, for p in the open interval (1, ∞), that if among the best approximations to span_n H_d (the existence of which is guaranteed by Corollary 3.2) there is a continuous one, then span_n H_d must be a Chebyshev set.
Theorem 4.1 In a strictly convex normed linear space, any subset with a continuous selection from its metric projection is Chebyshev.
We shall combine this theorem with the following geometric characterization of Chebyshev sets
with a continuous best approximation from [24].
Theorem 4.2 In a Banach space with strictly convex dual, every Chebyshev subset with continuous
metric projection is convex.
It is well known that L_p-spaces with p ∈ (1, ∞) satisfy the assumptions of this theorem (since the dual of L_p is L_q, where 1/p + 1/q = 1 and q ∈ (1, ∞)) (see, e.g., [7], p. 160). Hence, to show the non-existence of a continuous selection, it is sufficient to verify that span_n H_d is not convex.
Proposition 4.3 For all positive integers n, d, span_n H_d is not convex.
Indeed, consider 2n parallel half-spaces with the characteristic functions g_i(x) = θ(v · x + b_i), where 0 > b_1 > ... > b_{2n} > −1 and v = (1, 0, ..., 0) ∈ R^d. Then (1/2) sum_{i=1}^{2n} g_i is a convex combination of two elements of span_n H_d, namely sum_{i=1}^{n} g_i and sum_{i=n+1}^{2n} g_i, but it is not in span_n H_d, since its restriction to the one-dimensional set {(t, 0, ..., 0) ∈ R^d : t ∈ [0,1]} has 2n discontinuities.
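The non-convexity argument of Proposition 4.3 can be checked numerically: restricted to the segment {(t, 0, ..., 0) : t ∈ [0,1]}, each of the two sums is a step function with n jumps, while their midpoint has 2n jumps and hence cannot be computed by n Heaviside units. A small sketch for d = 1, n = 3 (all names are ours, purely illustrative):

```python
import numpy as np

def count_jumps(f, grid):
    # Count the discontinuities of a step function along a 1-D grid
    vals = np.array([f(t) for t in grid])
    return int(np.sum(np.abs(np.diff(vals)) > 1e-12))

n = 3
bs = -np.linspace(0.1, 0.9, 2 * n)            # 0 > b_1 > ... > b_{2n} > -1
g = lambda t, b: 1.0 if t + b >= 0 else 0.0   # theta(v.x + b) with v = (1,0,...,0)

A = lambda t: sum(g(t, b) for b in bs[:n])    # element of span_n H_d
B = lambda t: sum(g(t, b) for b in bs[n:])    # element of span_n H_d
mid = lambda t: 0.5 * (A(t) + B(t))           # their convex combination

grid = np.linspace(0.0, 1.0, 2001)
print(count_jumps(A, grid), count_jumps(B, grid), count_jumps(mid, grid))
# A and B each have n = 3 jumps; their midpoint has 2n = 6 jumps,
# so it cannot lie in span_3 H_1.
```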
Summarizing results of this section and the previous one, we get the following corollary.
Corollary 4.4 In (L_p([0,1]^d), ||.||_p) with p ∈ (1, ∞), for all positive integers n, d there exists a best approximation mapping from L_p([0,1]^d) to span_n H_d, but no such mapping is continuous.
Thus convenient properties of projection operators such as uniqueness and continuity are not satisfied by span_n H_d. These properties would allow one to estimate worst-case errors using methods of algebraic topology (see, e.g., [6]). In linear approximation theory, application of such methods shows that some sets of functions defined by smoothness conditions exhibit the curse of dimensionality: the approximants converge at a rate of order n^{−1/d}, where d is the number of variables and n is the dimension of the approximating linear space (see, e.g., [20]). Our results show that these arguments are not applicable to approximation by span_n H_d.
5
Rates of approximation
Let (X, ||.||) be a normed linear space and let G be its subset; then G-variation (variation with respect to G) is defined as the Minkowski functional of the set cl conv (G ∪ −G), i.e.,

||f||_G = inf { c ∈ R_+ : f/c ∈ cl conv (G ∪ −G) }.
Variation with respect to G is a norm on the subspace { f ∈ X : ||f||_G < ∞ } ⊆ X. The closure in its definition depends on the topology induced on X by the norm ||.||. When X is finite-dimensional, G-variation does not depend on the choice of a norm on X, since all norms on a finite-dimensional space are topologically equivalent.
Variation with respect to G was introduced in [17] as an extension of the concept of H_d-variation, called variation with respect to half-spaces, from [1]. For functions of one variable, variation with respect to half-spaces coincides, up to a constant, with the notion of total variation studied in integration theory (see [1]). For G countable and orthonormal, it coincides with the ℓ_1-norm with respect to G (see [18]).
The following theorem from [17] is a reformulation of the Maurey-Jones-Barron theorem (see [2], [10], [21]) on estimates of rates of approximation of order O(1/√n).
Theorem 5.1 Let (X, ||.||) be a Hilbert space, G its subset and s_G = sup_{g ∈ G} ||g||. Then for every f ∈ X and every positive integer n,

||f − span_n G|| ≤ sqrt( s_G² ||f||_G² − ||f||² ) / √n.

Corollary 5.2 For all positive integers d, n and every f ∈ (L_2([0,1]^d), ||.||_2),

||f − span_n H_d||_2 ≤ ||f||_{H_d} / √n.
Thus the worst-case error in approximating functions from the unit ball in H_d-variation by linear combinations of characteristic functions of n half-spaces of [0,1]^d is at most 1/√n. Estimates derived from Theorem 5.1 are sometimes called "dimension-independent", which is misleading, since with an increasing number of variables the condition of being in the unit ball in G-variation becomes more and more constraining. See [19] for examples of smooth functions with H_d-variation growing exponentially with the number of variables d. However, such exponentially growing lower bounds on variation with respect to half-spaces are merely lower bounds on upper bounds on rates of approximation by span_n H_d; they do not prove that such functions cannot be approximated at rates faster than ||f||_{H_d}/√n. Finding whether these exponentially large upper bounds are tight seems to be a difficult task, related to some open problems in the theory of complexity of Boolean circuits.
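The bound of Corollary 5.2 reflects the probabilistic argument behind the Maurey-Jones-Barron theorem: if f is a convex combination of elements of G, the average of n elements sampled from that combination has expected L_2 error of order 1/√n. A Monte Carlo sketch of this sampling argument for d = 1 (illustrative only; the data and all names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 512)          # discretized [0,1]

# f is a convex combination of m Heaviside steps theta(t - c_j),
# so f lies in the unit ball in variation with respect to half-spaces.
m = 200
cs = rng.uniform(0.0, 1.0, m)
steps = (grid[None, :] >= cs[:, None]).astype(float)   # m x 512
f = steps.mean(axis=0)

def mc_error(n, trials=200):
    # Average L2([0,1]) error of the empirical mean of n sampled steps
    errs = []
    for _ in range(trials):
        idx = rng.integers(0, m, n)
        approx = steps[idx].mean(axis=0)
        errs.append(np.sqrt(np.mean((approx - f) ** 2)))
    return float(np.mean(errs))

for n in (4, 16, 64):
    print(n, mc_error(n))
# The error shrinks roughly like 1/sqrt(n), as Theorem 5.1 predicts.
```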
Some insight into the behavior of H_d-variation is given by its geometric characterization, derived in [19] using the Hahn-Banach theorem.
Theorem 5.3 Let (X, ||.||) be a Hilbert space and G its nonempty subset. Then for every f ∈ X,

||f||_G = sup_{h ∈ S} |f · h| / sup_{g ∈ G} |g · h|,  where S = { h ∈ X − G^⊥ : ||h|| = 1 }.

Thus functions that are "almost orthogonal" to H_d (i.e., have small inner products with characteristic functions of half-spaces) have large H_d-variation.
6
Integral representation
The following theorem from [14] shows that a smooth real-valued function on R^d with compact support can be represented as an integral combination of characteristic functions of half-spaces. By H^−_{e,b} is denoted the half-space { x ∈ R^d : e · x + b < 0 }.
Theorem 6.1 Let d be a positive integer and let f : R^d → R be compactly supported and (d+2)-times continuously differentiable. Then

f(x) = ∫_{S^{d−1} × R} w_f(e, b) θ(e · x + b) de db,

where for d odd

w_f(e, b) = a_d ∫_{H^−_{e,b}} (Δ^{k_d} f)(y) dy,

k_d = (d + 1)/2, and a_d is a constant independent of f, while for d even

w_f(e, b) = a_d ∫_{R^d} (Δ^{k_d} f)(y) α(e · y + b) dy,

where α(t) = −t log |t| + t for t ≠ 0 and α(0) = 0, k_d = (d + 2)/2, and a_d is a constant independent of f.
The assumption that f is compactly supported can be replaced by the weaker assumption that f vanishes sufficiently rapidly at infinity. The integral representation also applies to certain nonsmooth functions that generate tempered distributions.
By an approach reminiscent of the Radon transform but based directly on distributional techniques from Courant and Hilbert [4], it was shown in [11] that if f is a compactly supported function on R^d with continuous d-th order partial derivatives, where d is odd, then f can be represented as

f(x) = ∫_{S^{d−1} × R} v_f(e, b) θ(e · x + b) de db,

where v_f(e, b) = a_d ∫ (D_e^{(d)} f)(y) dy, the integral being taken over the hyperplane { y ∈ R^d : e · y + b = 0 }, a_d = (−1)^{k−1} (1/2) (2π)^{1−d} for d = 2k + 1, D_e^{(d)} f is the directional derivative of f in the direction e iterated d times, de is the (d − 1)-dimensional volume element on S^{d−1}, and dy is likewise on a hyperplane. Although the coefficients v_f are obtained by integration over hyperplanes, while the w_f arise from integration over half-spaces, these coefficients can be shown to coincide by an application of the Divergence Theorem ([3], p. 423) to the half-spaces H^−_{e,b}.
Theorem 6.1 extends the representation of [11] to even values of d and to target functions f which are not compactly supported but which decrease sufficiently rapidly at infinity.
For w ∈ C(S^{d−1} × R) and f ∈ D(R^d) define

T_H(w)(x) = ∫_{S^{d−1} × R} w(e, b) θ(e · x + b) de db,

S_H(f)(e, b) = w_f(e, b).

Theorem 6.1 shows that for each f ∈ D(R^d), T_H(S_H(f)) = f. This theorem can also be used to estimate variation with respect to half-spaces by the L_1-norm of the weighting function w_f = v_f. It is shown in [11] that, for any f to which the above representation applies,

||f||_{H_d} ≤ ∫_{S^{d−1} × R} |w_f(e, b)| de db.
Combining this upper bound on H_d-variation with Corollary 5.2, we get a smoothness condition that defines sets of functions that can be approximated by span_n H_d at rates of order 1/√n.
Bibliography
1. Barron, A. R. (1992). Neural net approximation, in Proceedings of the 7th Yale Workshop on Adaptive and Learning Systems (pp. 69-72).
2. Barron, A. R. (1993). Universal approximation bounds for superpositions of a sigmoidal function, IEEE Transactions on Information Theory 39, 930-945.
3. Buck, R. C. (1965). Advanced Calculus, McGraw-Hill: New York.
4. Courant, R. and Hilbert, D. (1962). Methods of Mathematical Physics, vol. 2, Wiley: New York.
5. Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2, 303-314.
6. DeVore, R., Howard, R. and Micchelli, C. (1989). Optimal nonlinear approximation, Manuscripta Mathematica 63, 469-478.
7. Friedman, A. (1982). Foundations of Modern Analysis, Dover: New York.
8. Gurvits, L. and Koiran, P. (1997). Approximation and learning of convex superpositions, Journal of Computer and System Sciences 55, 161-170.
9. Hornik, K., Stinchcombe, M. and White, H. (1989). Multilayer feedforward networks are universal approximators, Neural Networks 2, 251-257.
10. Jones, L. K. (1992). A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Annals of Statistics 20, 608-613.
11. Kůrková, V., Kainen, P. C. and Kreinovich, V. (1997). Estimates of the number of hidden units and variation with respect to half-spaces, Neural Networks 10, 1061-1068.
12. Kainen, P. C., Kůrková, V. and Vogt, A. (1999). Approximation by neural networks is not continuous, Neurocomputing 29, 47-56.
13. Kainen, P. C., Kůrková, V. and Vogt, A. (2000). Geometry and topology of continuous best and near best approximations, Journal of Approximation Theory 105, 252-262.
14. Kainen, P. C., Kůrková, V. and Vogt, A. (2000). An integral formula for Heaviside neural networks, Neural Network World 10, 313-319.
15. Kainen, P. C., Kůrková, V. and Vogt, A. (2000). Best approximation by Heaviside perceptron networks, Neural Networks 13, 645-647.
16. Kainen, P. C., Kůrková, V. and Vogt, A. (2001). Best approximation by linear combinations of characteristic functions of half-spaces (submitted to J. of Approx. Theory).
17. Kůrková, V. (1997). Dimension-independent rates of approximation by neural networks, in Computer-Intensive Methods in Control and Signal Processing: Curse of Dimensionality (Eds. Warwick, K., Kárný, M.) (pp. 261-270). Birkhäuser: Boston.
18. Kůrková, V. and Sanguineti, M. (2001). Bounds on rates of variable-basis and neural network approximation, IEEE Transactions on Information Theory 47, 2659-2665.
19. Kůrková, V., Savický, P. and Hlaváčková, K. (1998). Representations and rates of approximation of real-valued Boolean functions by neural networks, Neural Networks 11, 651-659.
20. Pinkus, A. (1986). n-Widths in Approximation Theory, Springer: Berlin.
21. Pisier, G. (1981). Remarques sur un résultat non publié de B. Maurey, in Séminaire d'Analyse Fonctionnelle I, no. 12, École Polytechnique, 1980-81.
22. Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review 65, 386-408.
23. Singer, I. (1970). Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces, Springer: Berlin.
24. Vlasov, L. P. (1970). Almost convex and Chebyshev sets, Math. Notes Acad. Sci. USSR 8, 776-779.
25. Zemanian, A. H. (1987). Distribution Theory and Transform Analysis, Dover: New York.
Eye-ball rebuilding using splines with a view to refractive
surgery simulation
Mathieu Lamard
Laboratoire de Traitement de l'Information Médicale, École Nationale Supérieure des
Télécommunications de Bretagne, F-29609 Brest Cedex, France.
Mathieu.Lamard@enst-bretagne.fr
Béatrice Cochener
CHU de Brest Ophtalmologie, 5 avenue Foch, 29609 Brest Cedex, France.
Beatrice.Cochener-Lainard@chu-brest.fr
Alain Le Méhauté
Département de Mathématiques, UMR 6629, CNRS, Université de Nantes,
BP 92208, F-44072 Nantes Cedex 3, France
Alain.Le-Mehaute@math.univ-nantes.fr
Abstract
In this paper we present a use of splines in the biomedical field.
1
Introduction
In the surgical field of ophthalmology, refractive surgery has undergone an important expansion for about fifteen years. It allows surgeons to correct different refractive errors (myopia, hyperopia, astigmatism), aiming to reduce or eliminate the use of optical equipment such as glasses and contact lenses. Many surgical techniques are available to experts today, each with its specific indications. Development of these methods commonly takes time and requires many research studies on animals before any clinical use. Overall, nomograms are established for all procedures; they provide the surgeon with rules for carrying out the surgery. These nomograms are usually based on statistical analysis of the first large series of operated patients. However, up to now, no technique has been able to take into account the individual variability of eyes (morphology, physiology).
The purpose of the present article is to take this variability into account by building a 3-dimensional numerical model of the eye and then applying to it various simulations of surgical techniques in order to measure their effects.
2
Eye and vision
2.1
The eye anatomy
Schematically, the eyeball has a quasi-spherical shape with a vertical diameter of approximately 23 mm and an antero-posterior diameter (the axial length) about 2 mm longer. Its average volume is 6.5 cm³ for a weight of 7 grams.
2.2
Refractive errors
When parallel rays reach a normal eye, they are refracted and converge without accommodation on the retina (this is called emmetropia). Errors of refraction come from a disparity between the refractive capacity of the anterior segment of the eye and the length of the eye: the light rays are no longer focused on the retina. This is called ametropia, and it is mainly of three types: myopia, hyperopia and astigmatism.
3
Correction of ametropia
3.1
Optical equipment
Glasses or contact lenses represent the traditional method. Glasses are safe and reversible for the correction of most refractive errors, but they can be responsible for visual field reduction and prismatic aberrations. They can also be a source of discomfort and cosmetic impairment for the wearer. Contact lenses have solved most of the problems associated with glasses, but require very strict hygiene to avoid severe complications. Refractive surgery can provide an answer to these various problems.
3.2
Refractive surgery
Many techniques are available today in refractive surgery. Most of them reshape the cornea using an excimer laser (193 nm). This laser (emitting in the far UV) is used in two distinct surgeries:
— Photo-Refractive Keratectomy (PRK)
— Laser-Assisted In Situ Keratomileusis (LASIK).
The PRK technique removes corneal tissue at its surface by breaking molecular bonds. The depth and size of the ablation are determined as a function of the attempted correction. In LASIK the ablation is performed after the cut of a thin corneal flap (160 μm); this flap is then replaced over the area of stromal ablation. In general PRK is used for correction of low ametropia and LASIK for low and medium corrections. For high corrections other concepts have been developed (additive surgery).
4
Data acquisition
In order to reconstruct the eyeball in 3D, data from the eye under consideration are
needed. Numerous modalities allow us to obtain information about the eye anatomy.
4.1
Ultrasound
Ultrasound scanning uses ultrasound waves to investigate human tissues in vivo. Nowadays in ophthalmology it is a routine exam for the posterior segment of the eye, especially in the search for intra-ocular foreign bodies. The reasons for this intense use are multiple, including the non-invasive procedure, speed and low cost. But problems remain, which define the current limits of ultrasound. Multiple phenomena of reflection (between two internal interfaces, or between an interface and the transducer itself) create false echoes. Inaccuracies quickly increase with the depth of the investigation because of all sources of "background noise", such as diffraction, diffusion and refraction. The advantages of ultrasound allowed us to use it without constraint to obtain maximum image quality. Our first task was to set up a high-quality image acquisition protocol. The protocol favours the underwater method to obtain a good acoustic coupling between the probe and the eye. The patient lies on his back, wearing on his face a diving mask without a pane; this mask is filled with physiological serum. The probe, equipped with a lighting target, is plunged into the liquid. The patient fixates the target, so that under these conditions the images provided are along the optical axis. The operator turns the probe manually and regularly around the optical axis, and obtains a volume of data. A computer equipped with an image acquisition board saves all images to a hard disk. The image resolution depends on the probe and on the frequency of the ultrasound used.
4.2
MRI
MRI, which localizes hydrogen nuclei by measuring their magnetization, provides a true grey-scale cartography of the proton concentration of the various examined structures [1]. The resolution of the resulting data volume depends on the acquisition time, which currently represents one of the most important limitations of this technique. Besides the high quality of the images obtained, MRI probably has no harmful effect because it does not use ionising radiation.
4.3
Computerized corneal topography
The anterior surface of the cornea is one fundamental element of refraction. Any modification or abnormality of this surface modifies the visual acuity, so knowledge of this shape is extremely important. Traditionally, Javal's keratometer is used to measure the refractive power of the cornea at a few points. In the last few years ophthalmologists have become used to another system, computerized corneal topography [2]. This technique, based on the reflection and the analysis of the deformation of Placido's discs, allows us to obtain numerous data on the topology of the cornea. The curvature of the cornea is represented on a colored map.
4.4
Visible Human images
The images of the Visible Human project (the photographic modality) have great spatial resolution. They allow us to perform reconstruction tests without acquisition problems.
5
Data segmentation
The purpose of this section is to assign a weight to each pixel of the image. The greater the weight, the greater the contribution of this pixel to the reconstruction of the edge will be.
5.1 Pretreatments
Little pretreatment was done on the images from the various modalities. Speckle filtering or the use of a contrast-enhancement filter has a clear visual effect, but the reconstruction does not seem to be affected in our specific case. The only pretreatment used is a manual one: the ophthalmologist places four points on each image to isolate the lens and hence helps the treatment filters.
5.2 Treatments
To assign a weight to each pixel of the image, numerous edge detection filters were tested, using different methods: LoG, Canny-Deriche, Shen-Castan, and the operator based on geometrical moments. The most convincing results were obtained with the Canny operator, which was derived as the solution of a constrained optimization problem [3]. This filter is supposed to be an optimal compromise between the following criteria: localization, detection and uniqueness. We have to note that this filter is optimized for images corrupted by white, Gaussian, additive noise, which is not the case for most of the data used.
This filter is currently one of the references in edge detection for the quality of its results; it is regularly used in the literature for the evaluation of new filters. A recursive implementation of this operator was developed by [4], allowing an important performance gain. The three-dimensional filter is obtained by supposing the filter separable and taking a convolution product. This choice is an easy one but it introduces anisotropies. The resulting images are difficult to use and, as recommended by [5], we extract their local maxima. This method consists in estimating the gradient direction and keeping only the local maxima of the gradient magnitude along that direction.
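The local-maxima extraction just described is commonly known as non-maximum suppression: the gradient direction is quantized and a pixel is kept only when its gradient magnitude is not dominated by its two neighbours along that direction. A generic sketch of this step follows (our own minimal implementation, not the authors' code):

```python
import numpy as np

def nonmax_suppress(img):
    # Gradient by central differences (axis 0 = rows -> gy, axis 1 = cols -> gx)
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    out = np.zeros_like(mag)
    H, W = mag.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            if mag[i, j] == 0.0:
                continue
            # Quantize the gradient direction to one of 4 axes
            angle = np.rad2deg(np.arctan2(gy[i, j], gx[i, j])) % 180.0
            if angle < 22.5 or angle >= 157.5:        # horizontal gradient
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif angle < 67.5:                        # diagonal
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            elif angle < 112.5:                       # vertical gradient
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                                     # anti-diagonal
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            # Keep the pixel only if it is a ridge point of the magnitude
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out

# A vertical step edge: only the magnitude ridge survives the suppression
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = nonmax_suppress(img)
```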
5.3 Post treatments
The previous stages can be applied to any type of image without taking its content into account. Two types of post-treatment are presented to take into account the peculiarities of the eye contents. The first post-treatment takes into account the particularities of ultrasound and MRI images: the center of the eye has no edges, and generally the first visible edge is the good one. The "Visible Human" project images [10] have specific characteristics. They are in fact photos of frozen tissues; crystals of ice are clearly visible in the vitreous, while it is uniform in the other modalities. A simple threshold is ineffective. The hysteresis threshold, introduced by [3], takes into account the connectivity and luminance (grey levels) of the edges and gives good results on such images.
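The hysteresis threshold can be sketched compactly: pixels above a high threshold act as seeds, and pixels above a low threshold are kept only if they are connected to a seed. A minimal illustration of the idea (our own implementation, not the authors'):

```python
import numpy as np
from collections import deque

def hysteresis(mag, low, high):
    # Pixels >= high are seeds; pixels >= low are kept only if connected
    # (8-connectivity) to a seed through other above-low pixels.
    strong = mag >= high
    weak = mag >= low
    out = np.zeros(mag.shape, dtype=bool)
    q = deque(zip(*np.nonzero(strong)))
    for i, j in q:
        out[i, j] = True
    H, W = mag.shape
    while q:                      # breadth-first growth from the seeds
        i, j = q.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < H and 0 <= nj < W and weak[ni, nj] and not out[ni, nj]:
                    out[ni, nj] = True
                    q.append((ni, nj))
    return out

# A bright edge fragment with a faint continuation is kept whole, while an
# isolated faint blob is correctly discarded
mag = np.zeros((5, 7))
mag[2, 1] = 0.9                  # strong seed
mag[2, 2] = mag[2, 3] = 0.4      # weak but connected -> kept
mag[0, 6] = 0.4                  # weak and isolated -> dropped
edges = hysteresis(mag, low=0.3, high=0.7)
```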
6
Eyeball rebuilding with splines
The most widely used techniques for edge reconstruction in medical images are snakes (active contour models) [7, 8]. A shape approximating the organ to be reconstructed is initialized, then deformed locally to fit the data. These deformations generally use physical properties of elastic materials. These various methods allow the edge reconstruction of organs of varied forms such as bones, heart, brain, etc. This type of reconstruction is effective, but numerous parameters must be set. We opted for a different technique. The edge to be reconstructed in our case is a quasi-spherical shape, and we reconstruct it using B-splines. Their mathematical properties allow us to reconstruct the edge in an effective and fast way, adjusting only a few parameters.
6.1
Principle
For a B-spline (1D) on R = [a, b] we have to set:
— the degree k of the spline,
— the position and number of the knots (λ_i, i = 0, ..., g + 1),
— the coefficients c_i of the spline representation

s(x) = sum_{i=−k}^{g} c_i N_{i,k+1}(x),

where N_{i,k+1}(x) is the B-spline basis function.
We have chosen to set the degree of the spline to 3; tests indicate this is a good compromise between computation time and result quality. The determination of the other parameters depends on the approximation criterion used and the position of the control knots.
6.1.1
The Dierckx criteria [6]
The Dierckx approximation criterion determines a spline as the solution of a constrained minimization problem:

minimize  η := sum_j ( s^(k)(λ_j+) − s^(k)(λ_j−) )²

subject to the constraint

δ := sum_{r=1}^{m} ( w_r (y_r − s(x_r)) )² ≤ S,

where (x_r, y_r) are the coordinates of the m data points and w_r are the associated weights.
6.1.2
Control knot number
As the number of control knots increases, the smoothness of the curvature decreases. Using this property we set up an iterative algorithm to compute the spline. After an initialization with few control knots (we set for example λ_0 = a, λ_2 = b and λ_1 = (a + b)/2), the spline is computed. If the spline is still too smooth (i.e., the estimate δ exceeds the bound S), we add some control knots and restart the estimation; otherwise we stop the algorithm. At each iteration we can insert one or more control knots. The distribution of the control knots is recomputed at each iteration; they can be linearly distributed over R = [a, b].
This method can be generalized to surfaces without difficulty (see [6]) using spherical coordinates and periodic boundary conditions.
6.2
Results
Different results are presented in either 3D or 2D views. In the 2D views, the spline is drawn in red and represents the intersection of the reconstructed spline surface with the image plane of the data volume. The main reconstruction errors are due to segmentation errors; moreover, the more data available, the better the quality of the reconstruction. The reconstruction from the Visible Human images (22 slices) is better than from the MRI (8 slices) and the ultrasound images (4 slices).
FIG. 1. Reconstruction using photographic images.

FIG. 2. Reconstruction using MRI.
FIG. 3. Reconstruction using ultrasound images.

7
Elastic modelling of surgery
7.1 Method used
The finite element method is used to simulate surgery and solve the elasticity problem. Actually, knowledge of the constitutive law of the eyeball tissues is the main limitation of this problem. The literature reports a wide range of coefficients to describe these tissues; in fact they seem to show individual variability. So we use the approximation of [9] for the elasticity coefficients, which uses three parameters: the internal pressure, the radius of the eye and the thickness of the wall. The use of more complex models does not offer much more information because of the low precision of the data that we used.
7.2 Results
Numerous simulations have been carried out. The results seem good in spite of the approximate constitutive law and the duration of the finite element computation. The results are represented with a color map of the eye showing the radius of curvature, as the ophthalmologist does.
FIG. 4. Excimer laser simulation: before (left) and after (right).
Eye-ball rebuilding using splines with a view to refractive surgery simulation
8 Conclusion
This article presents a modular approach to the modelling of refractive surgery. Each part of this work can be modified independently and can be adapted to another organ. All this work has been validated by ophthalmologists. The eye-ball reconstruction using splines appears to be an efficient method with a low CPU time. The mechanical modelling provides proper results despite several approximations. This study may be useful to the medical doctor, and also for testing new surgical techniques.
Bibliography
1. C. Dupas, La RMN au service de la médecine, La Recherche 81 (1977), 778-781.
2. S.D. Klyce, Computer assisted corneal topography: High resolution graphic presentation and analysis of keratoscopy, Invest. Ophthalmol. Vis. Sci. 25 (1984), 1426-145.
3. J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 8 (1986), 679-698.
4. R. Deriche, Fast algorithms for low level vision, IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (1990), 78-87.
5. R. Deriche, Techniques d'extraction de contours, Cours de l'INRIA Sophia-Antipolis, 1998. (http://www-sop.inria.fr/robotvis/personnel/der/der-eng.html)
6. P. Dierckx, Curve and Surface Fitting with Splines, Clarendon Press, Oxford, 1995.
7. M. Kass, A. Witkin, and D. Terzopoulos, Snakes: active contour models, Int. J. Comput. Vision 1 (1988), 321-331.
8. D. Metaxas, Physics-Based Deformable Models: Applications to Computer Vision, Graphics and Medical Imaging, Kluwer Academic, 1996.
9. P.P. Purslow and W.S. Karwatowski, Ocular elasticity, Ophthalmology 103 (1996), 1686-1692.
10. http://www.nlm.nih.gov/research/visible/visible-human.html.
A robust algorithm for least absolute deviations
curve fitting
Dongdong Lei, Iain J Anderson
University of Huddersfield, Huddersfield, UK.
d.lei@hud.ac.uk
Maurice G Cox
National Physical Laboratory, Teddington, UK.
Maurice.Cox@npl.co.uk
Abstract
The least absolute deviations criterion, or the $\ell_1$ norm, is frequently used for approximation where the data may contain outliers or 'wild points'. One of the most popular methods for solving the least absolute deviations data fitting problem is the Barrodale and Roberts (BR) algorithm (1973), which is based on linear programming techniques and the use of a modified simplex method [1]. This algorithm is particularly efficient. However, since it is based upon the simplex method, it can be susceptible to the accumulation of unrecoverable rounding errors caused by using an inappropriate pivot. In this paper we show how a numerically stable form of the simplex method can be extended to the special case of $\ell_1$ approximation whilst still maintaining the efficiency of the Barrodale and Roberts algorithm. This extension is achieved by using the $\ell_1$ characterization to rebuild the relevant parts of the simplex tableau at each iteration. The advantage of this approach is demonstrated most effectively when the observation matrix of the approximation problem is sparse, as is the case when using compactly supported basis functions such as B-splines. Under these circumstances the new method is considerably more efficient than the Barrodale and Roberts algorithm as well as being more robust.
1 Introduction
Given a set of $m$ data points $\{(x_i, y_i)\}_{i=1}^{m}$, the $\ell_1$, or least absolute deviations, curve-fitting problem seeks $c \in \mathbb{R}^n$ to solve the optimization problem
$$\min_{c} \|y - Ac\|_1 = \sum_{i=1}^{m} |r_i|, \qquad (1.1)$$
where $A$ is an $m \times n$ observation matrix and $r_i$ denotes the residual of the $i$th point. Another way of stating the $\ell_1$, or least absolute deviations, curve-fitting problem is by the characterization theory of an $\ell_1$ solution [8], which may be given in different forms. The following is perhaps the most commonly used.
A vector $c \in \mathbb{R}^n$ solves the minimization problem (1.1) if and only if there exists $\lambda \in \mathbb{R}^m$ such that
$$A^T \lambda = 0 \quad \text{with} \quad \begin{cases} |\lambda_i| \le 1, & i \in Z, \\ \lambda_i = \operatorname{sign}(r_i), & i \notin Z, \end{cases} \qquad (1.2)$$
where $Z$ represents the set of indices for which $r_i = 0$.
One of the popular methods designed for solving the $\ell_1$ approximation problem is the Barrodale and Roberts (BR) algorithm. It replaces the unconstrained variables $c$ and $r$ in (1.1) by nonnegative variables $c^+$, $c^-$, $u$ and $v$, and considers the linear programming problem
$$\min_{c^+, c^-, u, v} \; e^T u + e^T v \quad \text{subject to} \quad A c^+ - A c^- + u - v = y, \qquad c^+, c^-, u, v \ge 0. \qquad (1.3)$$
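On a problem this small, the $\ell_1$ fit defined by (1.3) can be computed directly from the interpolation characterization of an $\ell_1$ solution. The following Python sketch is our own illustration, not part of the BR algorithm: the data are synthetic, and the brute-force search over candidate interpolation sets would be hopeless at scale, but it shows the robustness of the criterion to a wild point.

```python
import numpy as np
from itertools import combinations

# Tiny l1 line fit: columns of A are the basis [1, x].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(20)
y[5] += 5.0                                  # one wild point
A = np.column_stack([np.ones_like(x), x])
m, n = A.shape

# An l1 solution interpolates (at least) n of the m points, so for a
# problem this small we can simply try every interpolation set Z.
best, best_c = np.inf, None
for Z in combinations(range(m), n):
    AZ = A[list(Z)]
    if abs(np.linalg.det(AZ)) < 1e-12:
        continue
    c = np.linalg.solve(AZ, y[list(Z)])
    err = np.abs(y - A @ c).sum()
    if err < best:
        best, best_c = err, c

print(np.round(best_c, 2))   # close to the true coefficients [1, 2]
```

The outlier contributes its full residual to the objective but barely moves the fitted line, which is exactly the behaviour that motivates the $\ell_1$ criterion.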
Much of the reason for the popularity of the BR algorithm is that it exploits the characteristics of the $\ell_1$ approximation in order to solve the problem in a more efficient manner than the general simplex approach. However, it is a simplex based method, and so it is susceptible to numerical instabilities caused by using inappropriate pivots. The new method presented here uses matrix factorization instead of simplex pivoting. This approach allows numerically stable updates to be made, thus avoiding the unnecessary build-up of rounding errors. This method is particularly efficient when the observation matrix is large and sparse [5].
Bartels [2] and Gill and Murray [4] presented methods that concentrate on avoiding the inherent instability of the simplex method. However, these methods are designed for a general linear programming problem, and if we were to employ these techniques for the special case of the $\ell_1$ problem, the storage requirements and computational workload of the method would be unnecessarily large compared with those of the highly efficient BR algorithm.
The $\ell_1$ problem is, in essence, an interpolation problem. The aim of any iterative procedure for the $\ell_1$ problem is to find an optimal set of interpolation points. Indeed, this is how the BR algorithm solves the $\ell_1$ problem. It begins with all coefficients, $c$, set to zero (being non-basic variables), and during each iteration of stage one, one of the residuals, $r_i$, becomes non-basic by making the corresponding point an interpolation point (i.e., the coefficients are altered so that $r_i = 0$). At the end of stage one, the current estimate interpolates $n$ distinct points. During stage two, the interpolation points are exchanged one at a time with non-interpolation points until an optimal solution is achieved.
In fact, the new algorithm is effectively identical to the BR algorithm in the sense that
we use exactly the same pivoting strategy. However, we start with a predetermined set
of interpolation points and do not store the simplex tableau directly. In each iteration,
we only reconstruct the parts of the simplex tableau that are needed by the more stable
approach employed.
2 A more stable computational approach
The linear programming formulation of the least absolute deviations curve-fitting problem is given in (1.3). It is a standard linear programming problem of dimension $m \times (2m + 2n)$. The robust approaches of Bartels and of Gill and Murray can be applied to solve it; they involve the factorization of an $m \times m$ matrix. On the other hand, the BR algorithm only deals with an $m \times n$ matrix in each iteration, so if $m \gg n$ the direct use of these stable approaches is less efficient. We shall show next that the factorization of an $n \times n$ matrix is all that is required at each iteration.
We split the data points based on the interpolation set $Z$, and let $A_Z$, $y_Z$, $u_Z$ and $v_Z$ be the parts of $A$, $y$, $u$ and $v$ in (1.3) corresponding to the set $Z$. Their complementary matrix and vectors are denoted by $A_{\bar Z}$, $y_{\bar Z}$, $u_{\bar Z}$ and $v_{\bar Z}$, so that $A_Z$ and $A_{\bar Z}$ together comprise $A$, etc. Problem (1.3) can then be expressed as
$$\min \; e^T (u_Z + u_{\bar Z}) + e^T (v_Z + v_{\bar Z}) \quad \text{subject to} \quad
\begin{aligned}
A_Z c^+ - A_Z c^- + u_Z - v_Z &= y_Z,\\
A_{\bar Z} c^+ - A_{\bar Z} c^- + u_{\bar Z} - v_{\bar Z} &= y_{\bar Z},\\
c^+, c^-, u_Z, u_{\bar Z}, v_Z, v_{\bar Z} &\ge 0.
\end{aligned} \qquad (2.1)$$
Since the coefficients for $c_j^-$ are just the negatives of the coefficients for $c_j^+$, $j = 1, 2, \ldots, n$, it is possible to suppress $c_j^-$ and let $c$ represent the unconstrained variable. The initial simplex tableau associated with problem (2.1) can be constructed in matrix form by Table 1, where $e_k$, $k = m, n, m - n$, are $k \times 1$ vectors with all components equal to one.
BV           | $c$                                 | $u_Z$ | $u_{\bar Z}$ | $v_Z$      | $v_{\bar Z}$   | r.h.s.
$u_Z$        | $A_Z$                               | $I$   | $0$          | $-I$       | $0$            | $y_Z$
$u_{\bar Z}$ | $A_{\bar Z}$                        | $0$   | $I$          | $0$        | $-I$           | $y_{\bar Z}$
objective    | $e_m^T \binom{A_Z}{A_{\bar Z}}$     | $0$   | $0$          | $-2e_n^T$  | $-2e_{m-n}^T$  | $e_m^T y$

TAB. 1. The initial simplex tableau of the $\ell_1$ fitting problem.
As we know, the simplex method is an iterative procedure in which each iteration is characterized by specifying which $m$ of the $2m + n$ variables are basic. For the $\ell_1$ approximation, we are only concerned with those vertices which are formed by a set of interpolation points. For $n$ interpolation points, the basic variables consist of the $n$ coefficient parameters $c$ and the $m - n$ parameters $u_{\bar Z}$ corresponding to the non-interpolation points. Let $B$ be the $m \times m$ basis matrix whose columns are the $m$ columns associated with the basic variables. Then
BV           | $u_Z$                                       | r.h.s.
$c$          | $A_Z^{-1}$                                  | $A_Z^{-1} y_Z$
$u_{\bar Z}$ | $-A_{\bar Z} A_Z^{-1}$                      | $r_{\bar Z}$
objective    | $-e_{m-n}^T (A_{\bar Z} A_Z^{-1}) - e_n^T$  | $e_{m-n}^T r_{\bar Z}$

TAB. 2. The condensed simplex tableau associated with a set of interpolation points.
$$B = \begin{pmatrix} A_Z & 0 \\ A_{\bar Z} & I \end{pmatrix}. \qquad (2.2)$$
It is readily verified that the inverse of $B$ can be written in the form (2.3), as long as $A_Z$ is invertible:
$$B^{-1} = \begin{pmatrix} A_Z^{-1} & 0 \\ -A_{\bar Z} A_Z^{-1} & I \end{pmatrix}. \qquad (2.3)$$
Equation (2.3) shows that the explicit inverse computation of an $m \times m$ matrix of the form (2.2) can be achieved by dealing with the inverse of an $n \times n$ matrix, and in general $n \ll m$.
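The block structure of the basis matrix (2.2) and its inverse (2.3) is easy to verify numerically. A minimal Python sketch, with arbitrary random data purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3
A = rng.standard_normal((m, n))
AZ, AZbar = A[:n], A[n:]                     # interpolation rows / remaining rows

# Basis matrix as in (2.2) and its claimed inverse as in (2.3).
B = np.block([[AZ, np.zeros((n, m - n))],
              [AZbar, np.eye(m - n)]])
AZinv = np.linalg.inv(AZ)
Binv = np.block([[AZinv, np.zeros((n, m - n))],
                 [-AZbar @ AZinv, np.eye(m - n)]])

print(np.allclose(B @ Binv, np.eye(m)))      # True: only an n x n inverse was needed
```

Only the small $n \times n$ block is ever inverted; the remaining blocks of $B^{-1}$ are formed by a matrix product, which is the source of the savings when $m \gg n$.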
To make these $m$ variables basic, we multiply the whole simplex tableau by $B^{-1}$ and omit the identity and zero matrices. The new simplex tableau is given in Table 2.
An arbitrary choice of the interpolation set $Z$ may cause some of the values in the right-hand side column to become negative. Although it is permissible for the coefficient parameters $c$ to be negative, for those rows having negative residuals $r_{\bar Z}$ we restore feasibility by exchanging the corresponding $u_{\bar Z}$ for $v_{\bar Z}$. This exchange can be made by subtracting twice those rows from the objective row and changing the sign of the original rows [1].
Such an exchange process can be expressed in matrix terms by introducing a sign vector
$$\lambda_{\bar Z} = \operatorname{sign}(r_{\bar Z}).$$
Let $A_{\bar Z s}$ represent the matrix obtained by multiplying those rows of $A_{\bar Z}$ associated with negative residuals by $-1$,
$$A_{\bar Z s} = \operatorname{diag}(\lambda_{\bar Z})\, A_{\bar Z}.$$
BV           | $u_Z$                                                | r.h.s.
$c$          | $A_Z^{-1}$                                           | $A_Z^{-1} y_Z$
$u_{\bar Z}$ | $-A_{\bar Z s} A_Z^{-1}$                             | $|r_{\bar Z}|$
objective    | $-\lambda_{\bar Z}^T (A_{\bar Z} A_Z^{-1}) - e_n^T$  | $\lambda_{\bar Z}^T r_{\bar Z}$

TAB. 3. Restoration of feasibility of the simplex tableau.
The simplex tableau after restoring feasibility is shown in Table 3.
The point to be removed from $Z$ is decided by the values of the objective row. Each time the maximum value of the objective row (including the suppressed columns) is chosen; we let the index of this element be $k$. In order to choose which new point is to join the set $Z$, we compute the values of the pivotal column, the $k$th column of the simplex tableau. Since the simplex tableau is of the form
$$\begin{pmatrix} A_Z^{-1} \\ -A_{\bar Z s} A_Z^{-1} \end{pmatrix},$$
the $k$th column can be obtained by using $A_{\bar Z s}$ and the $k$th column of $A_Z^{-1}$.
The BR algorithm pivoting strategy is adopted to decide which new point is to be added to the interpolation set, whereupon a new set of indices $Z$ is generated. We repeat the process in an iterative manner until the optimal solution is achieved.
Table 3 is in fact identical to the simplex tableau of the BR algorithm in stage two. The difference here is that the BR algorithm is implemented by a simplex pivoting approach, while the transformation of the simplex tableau into the form of Table 3 can be accomplished in a numerically more stable manner.
3 The improved method
The improved method starts with a predetermined interpolation set $Z$, the minimum requirement for $Z$ being that it forms a well-behaved matrix $A_Z$. For B-spline basis functions, we can choose any set of points satisfying the Schoenberg-Whitney condition [6]. For a Chebyshev polynomial basis, points close to the $n$ Chebyshev zeros can be taken as the initial interpolation set. In other cases, we can choose points close to them or even uniformly distributed points.
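As an illustration of the Chebyshev choice, the following Python sketch picks the data abscissae closest to the $n$ Chebyshev zeros. The helper name and the snapping-to-data strategy are our own, not taken from the paper; any selection giving a nonsingular $A_Z$ would do.

```python
import numpy as np

def initial_interpolation_set(x, n):
    """Indices of the data abscissae closest to the n Chebyshev zeros on [min(x), max(x)].

    Hypothetical helper illustrating one way to seed the interpolation set Z
    for a Chebyshev (or general polynomial) basis.
    """
    a, b = x.min(), x.max()
    # Chebyshev zeros of degree n, mapped from [-1, 1] to [a, b].
    t = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
    targets = 0.5 * (a + b) + 0.5 * (b - a) * t
    return sorted({int(np.argmin(np.abs(x - s))) for s in targets})

x = np.linspace(-1.0, 1.0, 101)
print(initial_interpolation_set(x, 4))
```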
If we denote the vector of $\lambda_i$, $i \in Z$, by $\lambda_Z$, we can rewrite the characterization equation (1.2) as
$$A_Z^T \lambda_Z = -A_{\bar Z}^T \lambda_{\bar Z}, \qquad (3.1)$$
and $\lambda_Z$ can be obtained mathematically from
$$\lambda_Z = -(A_Z^T)^{-1} \left( A_{\bar Z}^T \lambda_{\bar Z} \right). \qquad (3.2)$$
Table 3 shows that the objective row can be computed as
$$\text{objective row} = -\left( \lambda_{\bar Z}^T A_{\bar Z} \right) A_Z^{-1} - e_n^T. \qquad (3.3)$$
Thus, using (3.2), we conclude that
$$\text{objective row} = \lambda_Z^T - e_n^T. \qquad (3.4)$$
We know that at the $\ell_1$ solution all the values in the objective row lie in the range $[-2, 0]$, and also that $|\lambda| \le 1$. The latter result can be explained in terms of the former by the relationship (3.4).
Equation (3.4) is useful because it can be used to verify whether an interpolation set forms an optimal solution, or to compute $\lambda$ from the values of the objective row. We use it to compute the values of the objective row.
The improved method can be summarized as follows:
(1) Choose an initial set of interpolation points and form the set $Z$.
(2) Construct $A_Z$, $y_Z$ and their counterparts $A_{\bar Z}$, $y_{\bar Z}$ accordingly.
(3) Solve the equation $A_Z c = y_Z$ for $c$, and compute
$$r_{\bar Z} = y_{\bar Z} - A_{\bar Z} c$$
and
$$\lambda_{\bar Z} = \operatorname{sign}(r_{\bar Z}).$$
(4) Obtain the values of $\lambda_Z$ from the equation
$$A_Z^T \lambda_Z = -A_{\bar Z}^T \lambda_{\bar Z}. \qquad (3.5)$$
(5) If $|\lambda_Z| \le 1$ holds, the current solution is optimal and the algorithm terminates. Otherwise, continue.
(6) Obtain the objective row of the BR simplex tableau from
$$\text{objective row} = \lambda_Z^T - e_n^T.$$
(7) Examine the values of the objective row; the point associated with the maximum value of the objective row is chosen to leave the set $Z$.
(8) Decide the point to add by the BR pivoting strategy. Obtain a new set of indices $Z$, and repeat from step (2).
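Steps (1)-(5), the optimality test, can be sketched in a few lines of Python. The example below is our own illustration (the function name is ours): with a constant basis the $\ell_1$ fit of an odd sample is its median, so interpolating the middle point should pass the test $|\lambda_Z| \le 1$ even in the presence of a wild point.

```python
import numpy as np

def l1_optimal(A, y, Z):
    """Steps (3)-(4): given an interpolation set Z, return (c, lambda_Z).

    Z is optimal, step (5), iff max(|lambda_Z|) <= 1.
    """
    Z = np.asarray(Z)
    Zbar = np.setdiff1d(np.arange(len(y)), Z)
    c = np.linalg.solve(A[Z], y[Z])                        # step (3): A_Z c = y_Z
    r_bar = y[Zbar] - A[Zbar] @ c                          # residuals off the set
    lam_bar = np.sign(r_bar)
    lam_Z = np.linalg.solve(A[Z].T, -A[Zbar].T @ lam_bar)  # step (4), eq. (3.5)
    return c, lam_Z

# Median as the simplest l1 fit: A = ones, interpolating the middle point.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
A = np.ones((5, 1))
c, lam = l1_optimal(A, y, [2])
print(c[0], np.max(np.abs(lam)) <= 1.0)
```

Passing a non-median index (say Z = [0]) yields $|\lambda_Z| > 1$, signalling that the exchange steps (6)-(8) are still needed.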
4 Practical considerations and application to the $\ell_1$ spline approximation
The robustness of the above algorithm stems from the reliable updating of the relevant parts of the simplex tableau in each iteration. The major computational work is obtaining (explicitly or implicitly) the inverse of the $n \times n$ matrix $A_Z$. It can be calculated and stored explicitly by using an LU or QR factorization, or preferably it can be expressed as a product of factors. Since $A_Z$ differs from its predecessor by only one row, savings can be made by reusing results from the previous step. Material on the stable implementation of this row updating procedure is available [4, 7].
m = 512

n  | Iterations (New) | Iterations (BR) | Time (New, s) | Time (BR, s)
44 | 57               | 125             | 1.6           | 14.7
49 | 75               | 111             | 2.2           | 13.4
54 | 71               | 134             | 2.4           | 20.2
59 | 83               | 156             | 3.0           | 26.8
64 | 78               | 160             | 3.1           | 32.'
69 | 88               | 194             | 4.0           | 42.4
74 | 75               | 165             | 3.7           | 36.0
79 | 87               | 189             | 4.8           | 48.1

TAB. 4. The number of iterations and execution time taken by the algorithm of this paper and the Barrodale and Roberts algorithm for a set of 512 response data points provided by the National Physical Laboratory.
Sparsity is almost always more important than matrix dimension. Additional savings can be made if the observation matrix $A$ is sparse or structured. Approximation using a B-spline basis often occurs in practical applications. In such cases $A$ is block banded, and $A_Z$ can be triangularized using $O(n)$ flops [3]. Similarly, the sparsity of $A$ can be exploited to compute the other relevant parts of the simplex tableau efficiently.
We have applied our method to solve least absolute deviations curve-fitting problems with B-splines using various numbers of interior knots. All software was written in MATLAB and run on a Sun workstation. The initial interpolation points are chosen to be the points corresponding to the maximum value in each column of the observation matrix $A$.
Some of our computational results are reported in Tables 4 and 5. Each table presents the outcomes for a particular set of data points using the new method and the BR algorithm.
All the experimental results exhibit the effectiveness of the improved method on large,
sparse systems. Although these tables show that the improved method is faster than the
BR algorithm, it would be unfair to judge the convergence speed purely based upon
the time taken, since the improved method embodies some MATLAB built-in functions,
while the BR algorithm uses only user-defined functions. However, on average, the new
method requires far fewer iterations than the BR algorithm, and is competitive with the
BR algorithm both in efficiency and accuracy for a structured system.
Further work to be addressed by the authors will involve a definitive implementation
of this algorithm in Fortran, and development of an error analysis for both the improved
method and the BR algorithm.
m = 1200

q  | Iterations (New) | Iterations (BR) | Time (New, s) | Time (BR, s)
50 | 82               | 143             | 4.0           | 58.7
56 | 105              | 165             | 5.2           | 85.8
62 | 113              | 190             | 6.1           | 110.2
68 | 131              | 189             | 7.6           | 110.4
74 | 121              | 223             | 7.8           | 157.9
80 | 132              | 216             | 9.2           | 163.2
86 | 155              | 245             | 11.8          | 209.8
92 | 173              | 252             | 14.0          | 241.8
98 | 153              | 272             | 13.6          | 292.6

TAB. 5. The number of iterations and execution time taken by the algorithm of this paper and the Barrodale and Roberts algorithm for a set of 1200 data points, generated by the MATLAB commands x = linspace(1, 10, 1200)'; y = log(x) + randn(1200, 1).
Bibliography
1. I. Barrodale and F. D. K. Roberts. An improved algorithm for discrete $\ell_1$ linear approximation. SIAM Journal on Numerical Analysis 10, 839-848, 1973.
2. R. H. Bartels. A stabilization of the simplex method. Numerische Mathematik 16, 414-434, 1971.
3. M. G. Cox. The least squares solution of overdetermined linear equations having band or augmented band structure. IMA Journal of Numerical Analysis 1, 3-22, 1981.
4. P. E. Gill and W. Murray. A numerically stable form of the simplex algorithm. Linear Algebra and its Applications 7, 99-138, 1973.
5. D. Lei, I. J. Anderson, and M. G. Cox. An improved algorithm for approximating data in the $\ell_1$ norm. In P. Ciarlini, M. G. Cox, E. Filipe, F. Pavese, and D. Richter, editors, Advanced Mathematical and Computational Tools in Metrology V, 247-250, Singapore, 2001. World Scientific Publishing.
6. M. J. D. Powell. Approximation Theory and Methods. Cambridge University Press, Cambridge, UK, 1981.
7. R. J. Vanderbei. Linear Programming: Foundations and Extensions. Kluwer Academic Publishers, Boston, MA, US, 1997.
8. G. A. Watson. Approximation Theory and Numerical Methods. Wiley, New York, US, 1980.
Tomographic reconstruction using Cesaro-means and
Newman-Shapiro operators
Ulrike Maier
Mathematisches Institut, Justus-Liebig University, 35392 Giessen, Germany
Ulrike.Maier@math.uni-giessen.de
Abstract
Tomography is well known because of its many applications. Although theoretically solved, the numerical implementation of tomographic reconstruction algorithms is still a difficult problem. In this article the numerical implementation of a reconstruction method using Cesaro-means and Newman-Shapiro operators is described. The key point is the use of suitable quadrature formulae on the sphere. It turns out that in the context described, product Gaussian formulae are best suited. The algorithm is tested on the so-called Shepp-Logan phantom, a three-dimensional model of a human head.
1 Introduction and notation
The problem in tomography is to reconstruct a function $F$ sufficiently well from its Radon transform. Since certain classes of functions can be expanded into series of orthogonal polynomials, it is essential to exploit the action of the Radon transform on orthogonal polynomials and on polynomials in general. This approach is all the more interesting since the inverse of the Radon transform for polynomials is known explicitly.
The convergence of orthogonal expansions to the given function is often achieved only
by applying a summability method. The application of such methods can be interpreted
as a kind of "filter technique" which is necessary for sufficiently good reconstructions. The
combination of an expansion of the function and the application of suitable summability
methods leads to promising reconstruction algorithms.
In this article two examples of summability methods and their implementation are presented: the Cesaro-means and the Newman-Shapiro-means. After some introductory remarks on Laplace series at the end of this section, Section 2 presents the theory of summability methods needed here. In Section 3 this theory is applied to the reconstruction of functions from their Radon transform. Section 4 describes the numerical implementation of the reconstruction formula, which is tested on the so-called Shepp-Logan phantom of a head in Section 5.
In this article the following notation is used. Let $B^r$ denote the unit ball in $\mathbb{R}^r$, $S^{r-1}$ the unit sphere, and $Z^r := [-1, 1] \times S^{r-1}$; $xy$ denotes the Euclidean inner product of $x, y \in \mathbb{R}^r$.
The spaces of restrictions of $r$-variate polynomials, homogeneous polynomials and homogeneous harmonic polynomials of degree $\mu \in \mathbb{N}_0$ onto a subset $X \subset \mathbb{R}^r$ ($X = S^{r-1}$ or $X = B^r$) are denoted by $\mathbb{P}_\mu^r(X)$, $\mathbb{P}_\mu^{r,*}(X)$ and $\mathbb{H}_\mu^r(X)$, respectively. The space $C(S^{r-1})$ of all continuous functions is provided with the inner product $\langle F, G \rangle := \int_{S^{r-1}} F(x) G(x)\, dx$. The surface measure of the sphere is denoted by $\omega_{r-1} = \langle 1, 1 \rangle$.
Let $C_\mu^\lambda$ denote the Gegenbauer polynomial of degree $\mu$ and index $\lambda$, and $\hat C_\mu^\lambda = C_\mu^\lambda / C_\mu^\lambda(1)$ the normalized Gegenbauer polynomial. The reproducing kernel function of $\mathbb{H}_\mu^r(S^{r-1})$ is given by
$$G_\mu(xy) = \frac{2\mu + r - 2}{r - 2}\, C_\mu^{(r-2)/2}(xy),$$
and the normalized reproducing kernel $\hat G_\mu$ is defined by $\hat G_\mu := G_\mu / G_\mu(1)$.
Let $Y \in \{C(S^{r-1}), L^2(S^{r-1}), L^p(S^{r-1})\}$. For $f \in Y$ let
$$L(f, x) = \sum_{\mu=0}^{\infty} (K_\mu f)(x) = \sum_{\mu=0}^{\infty} \int_{S^{r-1}} f(y)\, G_\mu(xy)\, dy \qquad (1.1)$$
be the Laplace series of $f$, where $(K_\mu f)(x) := \int_{S^{r-1}} f(y)\, G_\mu(xy)\, dy$ is the orthogonal projection of $f$ onto $\mathbb{H}_\mu^r(S^{r-1})$, and the partial sums $L_\mu(f, x) = \sum_{\nu=0}^{\mu} (K_\nu f)(x)$ are the orthogonal projections of $f$ onto $\mathbb{P}_\mu^r(S^{r-1})$.
Whereas for $Y = L^2(S^{r-1})$ it is known that the partial sums $L_\mu(f, x)$ converge to $f$ in norm, no convergence is obtained for $Y = C(S^{r-1})$ or $Y = L^p(S^{r-1})$ for $p > \frac{2(r-1)}{r-2}$ and $p < \frac{2(r-1)}{r}$ (see e.g. [1], p. 211). Applying a summability method changes the situation.
2 Summability methods
Let $A = (a_{\mu\nu})_{\mu,\nu \in \mathbb{N}_0}$ be an infinite matrix whose elements $a_{\mu\nu} \in \mathbb{R}$ fulfil the following properties:
(i) $a_{\mu\nu} = 0$ for $\nu > \mu$,
(ii) $\lim_{\mu \to \infty} a_{\mu\nu} = 1$ for $\nu \in \mathbb{N}_0$,
(iii) $K_\mu(\xi) \ge 0$ for $-1 \le \xi \le 1$, where $K_\mu := \sum_{\nu=0}^{\mu} a_{\mu\nu} G_\nu$.
If, with the aid of a summability method, the kernel $G_\mu$ in (1.1) is substituted by a kernel
$$K_\mu = \sum_{\nu=0}^{\mu} a_{\mu\nu} G_\nu, \qquad (2.1)$$
then the operator $L^A$ defined by the transformed series
$$L^A(f, x) = \lim_{\mu \to \infty} \int_{S^{r-1}} f(y)\, K_\mu(x, y)\, dy \qquad (2.2)$$
can be shown to converge pointwise to the identity, provided that the properties (i)-(iii) of the matrix $A$ are valid for the kernel $K_\mu$.
Remark 2.1 The coefficients $a_{\mu\nu}$ can be obtained from
$$a_{\mu\nu} = (L^A \hat G_\nu(t\,\cdot))(t) = \int_{S^{r-1}} \hat G_\nu(tx)\, K_\mu(tx)\, d\omega(x), \quad t \in S^{r-1}.$$
For $A$ being the matrix of the Cesaro-means the proof was first given by Kogbetliantz [4]. Berens et al. [1] give a proof for Cesaro-means as well as for Abel-Poisson-means. They also prove results on the order of convergence and the corresponding saturation classes. The convergence proof for the Newman-Shapiro operators ($Y = C(S^{r-1})$) can be found in Reimer [7].
2.1 Cesaro-means
For Cesaro-means the coefficients $a_{\mu\nu}$ in the summability method have to be chosen as
$$a_{\mu\nu} = \frac{(1)_\mu\, (k+1)_{\mu-\nu}}{(k+1)_\mu\, (1)_{\mu-\nu}}, \qquad (2.3)$$
where $(p)_q = p \cdot (p+1) \cdots (p+q-1)$ denotes the Pochhammer symbol. Then the kernels $K_\mu$ in (iii) take on the form
$$K_\mu = \sum_{\nu=0}^{\mu} \frac{(1)_\mu\, (k+1)_{\mu-\nu}}{(k+1)_\mu\, (1)_{\mu-\nu}}\, G_\nu. \qquad (2.4)$$
Convergence of the transformed Laplace series (2.2) is valid for $k > (r-2)/2$; for $k \ge r - 1$ the operators are even positive (see Kogbetliantz [4]).
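The Cesaro coefficients (2.3) are cheap to evaluate directly from the Pochhammer symbols. A small Python sketch (our own illustration) checks property (ii), i.e. that for fixed $\nu$ the coefficients tend to $1$ as $\mu$ grows:

```python
from math import prod

def poch(p, q):
    """Pochhammer symbol (p)_q = p (p+1) ... (p+q-1)."""
    return prod(p + i for i in range(q))

def a(mu, nu, k):
    """Cesaro (C, k) summability coefficients as in (2.3)."""
    return (poch(1, mu) * poch(k + 1, mu - nu)) / (poch(k + 1, mu) * poch(1, mu - nu))

# Property (i), a_{mu nu} = 0 for nu > mu, holds by convention;
# for (ii), a_{mu 0} = 1 exactly and a_{mu nu} -> 1 as mu grows.
k = 4
print(round(a(10, 0, k), 3), round(a(200, 3, k), 3))
```

For $\nu = 3$, $k = 4$ the value at $\mu = 200$ is already close to $1$, and it keeps increasing towards $1$ with $\mu$.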
2.2 Newman-Shapiro summability method
In [8] Reimer considers kernel polynomials
$$K_{2\mu+1}(\xi) := K_{2\mu}(\xi) := g_{\mu+1} \left[ \frac{\hat G_{\mu+1}(\xi)}{\xi - \eta_{\mu+1}} \right]^2 \qquad (2.5)$$
as used by Newman and Shapiro [5]. Here, $\eta_{\mu+1}$ is the largest root of $\hat G_{\mu+1}$, and $g_{\mu+1}$ is an explicit normalization constant (2.6) whose value is given in [8]. The coefficients $a_{\mu\nu}$ in the Newman-Shapiro operators can be calculated explicitly (2.7) as a finite sum of ratios of Pochhammer symbols involving a Kronecker delta $\delta_{\nu, j+l-2k}$ and $\lambda = \frac{r-2}{2}$; the formula is given in [8].
The matrix $A$ defined by the Newman-Shapiro operators fulfils the properties (i)-(iii) (see Reimer [8]).
Remark 2.2 The corresponding partial sum operators $L_\mu^A$ are nonnegative with positive $a_{\mu\nu}$. For continuous and differentiable functions even more is valid (see Reimer [8]): whereas for continuous functions the approximation error still tends to zero, functions $F \in C^j(S^{r-1})$, $j \in \{1, 2\}$, have an error of order $O(\mu^{-j})$.
3 Application to tomography
The Radon transform $\mathcal{R} : C(B^r) \to C(Z^r)$ is defined by
$$(\mathcal{R}F)(s, t) := \int_{t^\perp} F(st + v)\, dv, \quad F \in C(B^r), \ (s, t) \in Z^r, \qquad (3.1)$$
which means that the Radon transform $\mathcal{R}F$ of $F$ is determined by integrating $F$ over all hyperplanes of dimension $r - 1$. This map can also be defined for functions in $L^2(\mathbb{R}^r)$, $L^2(B^r)$, the Schwartz space $\mathcal{S}(\mathbb{R}^r)$ or some Sobolev spaces. $\mathcal{R}$ is continuous on all of these spaces, whereas the inverse $\mathcal{R}^{-1}$ is only continuous on $\mathcal{S}(\mathbb{R}^r)$ and on the Sobolev spaces.
For polynomials it is known that
$$\left( \mathcal{R}\, \hat C_\mu^{r/2}(a\, \cdot) \right)(s, t) = \hat C_\mu^{r/2}(s)\, \hat C_\mu^{r/2}(at), \quad a \in S^{r-1}, \ (s, t) \in Z^r \qquad (3.2)$$
(see Davison and Grünbaum [2]) and, more generally,
$$(\mathcal{R} P_m)(s, t) = \hat C_\mu^{r/2}(s)\, P_m(t), \quad (s, t) \in Z^r, \qquad (3.3)$$
where the polynomials $P_m \in \mathbb{P}_\mu^{r,*}(S^{r-1})$, $|m| = \mu$, are generated by expanding the Gegenbauer polynomial $\hat C_\mu^{r/2}(ax)$ with respect to $a$. These polynomials $P_m$ are known to constitute a basis for $\mathbb{P}_\mu^{r,*}(S^{r-1})$.
Let $V_\mu := \operatorname{span}\{P_m : |m| = \mu\}$. Since the Gegenbauer polynomials $C_\nu^{r/2}$ can also be interpreted as the reproducing kernels of $\mathbb{H}_\nu^{r+2}(S^{r+1})$, the orthogonal projection $F_\nu$ of $F \in C(B^r)$ onto $V_\nu(B^r)$ can be identified with the orthogonal projection of $F$ onto $\mathbb{H}_\nu^{r+2}(S^{r+1})$ (see Reimer [7] for details). Thus the theory of Laplace series can be used here for the reconstruction of $F$ from its Radon transform.
Let $A$ be a matrix transformation as introduced in Section 2 and let $F_\nu$ be the orthogonal projection of $F$ onto $V_\nu(B^r)$. Then, according to the summability theory of Laplace series, $F = \lim_{\mu \to \infty} \sum_{\nu=0}^{\mu} a_{\mu\nu} F_\nu$. Since the Radon transform is linear and continuous, $\mathcal{R}F = \lim_{\mu \to \infty} \sum_{\nu=0}^{\mu} a_{\mu\nu}\, \mathcal{R}F_\nu$.
It can be shown (see Reimer [7]) that
$$F_\nu(x) = \lambda_{\nu,r} \int_{Z^r} (\mathcal{R}F)(s, t)\, \hat C_\nu^{r/2}(s)\, \hat C_\nu^{r/2}(tx)\, d(s, t), \qquad (3.4)$$
where $\lambda_{\nu,r}$ is an explicit constant, involving $(r-1)\, C_\nu^{r/2}(1)$ and the surface measures $\omega_{r-1}$ and $\omega_{r-2}$, whose value (3.5) is given in [7].
From this, after some lengthy calculation using the adjoint operator of $\mathcal{R}$ (which essentially is the inverse operator of $\mathcal{R}$), the reconstruction formula follows:
$$F(x) = \lim_{\mu \to \infty} \sum_{\nu=0}^{\mu} a_{\mu\nu}\, \lambda_{\nu,r} \int_{-1}^{1} \int_{S^{r-1}} (\mathcal{R}F)(s, t)\, \hat C_\nu^{r/2}(s)\, \hat C_\nu^{r/2}(tx)\, ds\, dt. \qquad (3.6)$$
Because of the identification of the orthogonal projection of $F$ onto $V_\nu(B^r)$ with that onto $\mathbb{H}_\nu^{r+2}(S^{r+1})$, convergence of the Cesaro-means follows for $k > r/2$, and positivity of the operators is valid for $k \ge r + 1$. For the same reason the coefficients $a_{\mu\nu}$ in the Newman-Shapiro summability method have to be calculated for $\lambda = \frac{(r+2)-2}{2} = \frac{r}{2}$.
4 Numerical implementation
For the reconstruction of $F$, formula (3.6) was used. As soon as the Radon transform of $F$ is known, the numerical implementation in principle reduces to a stable evaluation of the Gegenbauer polynomials and a suitable approximation of the integrals in (3.6). The Gegenbauer polynomials were evaluated by their recurrence relation (see Szegő [11]), which is known to be numerically very stable. The coefficients $a_{\mu\nu}$ for the Cesaro-means and the Newman-Shapiro operators were computed with the aid of formulae (2.3) and (2.7), respectively. The factor $\lambda_{\nu,r}$ was obtained from (3.5). Since the calculation of $a_{\mu\nu}$ for the Newman-Shapiro operators is very time consuming (more than 10 hours for $\mu > 100$), these coefficients were stored before the main computation was started.
Since the integrand in (3.6) is a polynomial of degree $\nu + 2$ with respect to $s$ (see (3.6) together with (5.1)), $\int_{-1}^{1} \cdot\; ds$ was approximated by a Gauss-Legendre quadrature with $\mu/2 + 1$ points. This choice ensures that enough evaluations with respect to $s$ are performed for the evaluation of $\mathcal{R}F(s, t)$ and that the integral is evaluated exactly within numerical precision.
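The exactness argument can be illustrated in Python: Gauss-Legendre quadrature with $M$ nodes integrates polynomials of degree up to $2M - 1$ exactly. The polynomial below is arbitrary and serves only as a check of the rule.

```python
import numpy as np

# Gauss-Legendre with M nodes is exact for polynomials of degree <= 2M - 1,
# so M = deg/2 + 1 nodes suffice for an even degree `deg`.
deg = 10
M = deg // 2 + 1
nodes, weights = np.polynomial.legendre.leggauss(M)

coeffs = np.arange(1.0, deg + 2)             # some degree-10 polynomial
quad = np.dot(weights, np.polyval(coeffs, nodes))
exact = np.diff(np.polyval(np.polyint(coeffs), [-1.0, 1.0]))[0]
print(np.isclose(quad, exact))               # True
```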
For the quadrature on $S^{r-1}$, an interpolatory quadrature as introduced in [6], p. 132, was used first. The weights of such a quadrature formula are obtained as the solution of a linear system of equations $GA = e$, where $e = (1, \ldots, 1)^T \in \mathbb{R}^N$, $N = \dim \mathbb{P}_\mu^{r,*}(S^{r-1})$, $A = (A_1, \ldots, A_N)^T$ is the vector of weights and
$$G = \left( \hat G_\mu(x_j x_k) + \hat G_{\mu-1}(x_j x_k) \right)_{j,k=1}^{N}.$$
The points were chosen to be regularly distributed on latitudes of the sphere.
For $\mu > 70$, computational problems occurred in the computation of the weights because of a lack of memory. Apart from this problem, several weights turned out to be negative, which led to oscillations in the reconstruction. Therefore, this interpolatory quadrature was substituted by a product Gauss formula for the sphere $S^{r-1}$ as suggested by Stroud [10], p. 41. The points and weights of the Gaussian quadrature were computed by the MATLAB program qrule.m, which is available via the internet from The MathWorks, Inc. The number of points of the product Gauss formula is $N = 2M^{r-1}$, where $M = \mu/2 + 1$ is the number of points used in each direction, i.e. $N = 2M^2$ for $r = 3$.
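A product Gauss formula of exactly this shape for $r = 3$ (Gauss-Legendre in $\cos\theta$, $2M$ equally spaced points in $\varphi$, hence $N = 2M^2$ nodes) can be sketched in Python; as a check it integrates the polynomial $z^2$ over $S^2$, whose true integral is $4\pi/3$. This is our own illustration, not the qrule.m code used in the paper.

```python
import numpy as np

# Product Gauss formula on S^2: Gauss-Legendre in u = cos(theta) with M
# points, 2M equally spaced points in phi, i.e. N = 2M^2 nodes in total.
M = 4
u, w = np.polynomial.legendre.leggauss(M)
phi = 2.0 * np.pi * np.arange(2 * M) / (2 * M)
U, PHI = np.meshgrid(u, phi)
W = np.repeat(w[None, :], 2 * M, axis=0) * (2.0 * np.pi / (2 * M))
X = np.sqrt(1 - U**2) * np.cos(PHI)
Y = np.sqrt(1 - U**2) * np.sin(PHI)
Z = U

# The rule integrates low-degree polynomials on the sphere exactly:
# the integral of z^2 over S^2 is 4*pi/3.
approx = np.sum(W * Z**2)
print(np.isclose(approx, 4 * np.pi / 3))     # True
```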
All codes were written in MATLAB 6. The actual computation took place on a SUN Ultra 10 with 256 MB main memory, 691 MB virtual memory and the SunOS operating system, release 5.7. To increase the computational speed, all parts of the MATLAB code were written with as few for-loops as possible. This gave an improvement in speed by a factor of more than 500.
5 Computational results
The theoretical results have been applied to the so-called Shepp-Logan phantom, which is commonly used as a test function for tomographic reconstruction algorithms. It is a three-dimensional model of a human head consisting of 10 ellipsoids (see Shepp [9]), which were shrunk here to fit into the unit sphere $S^2$. Figure 1 shows a cut at $x_3 = 0.2721$.
Let $a_1^{(j)}, a_2^{(j)}, a_3^{(j)}$, $j = 1, \ldots, 10$, denote the axes of the $j$th ellipsoid, $d^{(j)}$ its density value and $s_2^{(j)} - s_1^{(j)}$ the diameter of the ellipsoid in the direction of $t \in S^2$. Since the Radon transform is linear, the Radon transform of the Shepp-Logan phantom can be calculated to be
$$\mathcal{R}F(s, t) = \sum_{j=1}^{10} \pi\, d^{(j)} a_1^{(j)} a_2^{(j)} a_3^{(j)}\, (s - s_1^{(j)})(s_2^{(j)} - s) \left[ \left( \frac{s_2^{(j)} - s_1^{(j)}}{2} \right)^2 \right]^{-3/2}. \qquad (5.1)$$
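As a consistency check of the ellipsoid Radon transform used here: for a single axis-aligned ellipsoid with semi-axes $a_1, a_2, a_3$ and direction $t = e_3$, the plane section at height $s$ is an ellipse of area $\pi a_1 a_2 (1 - s^2/a_3^2)$, and the product form $\pi\, d\, a_1 a_2 a_3\, (s - s_1)(s_2 - s)\, ((s_2 - s_1)/2)^{-3}$ reproduces this. The sketch below is our own check; the grouping of terms in the damaged original formula was partly reconstructed, so the exponent is applied to the half-diameter directly.

```python
import numpy as np

# Single axis-aligned ellipsoid, direction t = e_3: the section at height s
# is an ellipse of area pi * a1 * a2 * (1 - s^2 / a3^2).
a1, a2, a3, d, s = 1.5, 0.8, 0.6, 1.0, 0.3
s1, s2 = -a3, a3                             # support interval in direction t

formula = np.pi * d * a1 * a2 * a3 * (s - s1) * (s2 - s) * ((s2 - s1) / 2) ** -3
exact = np.pi * a1 * a2 * (1 - s**2 / a3**2)
print(np.isclose(formula, exact))            # True
```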
Figure 2 shows the reconstruction results according to formula (3.6) for Cesaro-means of index $k = 4$ and for Newman-Shapiro operators. The values $k = 1.6$ and $k = 2$ were tested too, but for high degrees $\mu$ no convergent behaviour could be observed. For Cesaro-means with $k = 4$ and for Newman-Shapiro operators, Figure 2 clearly shows an improving behaviour of the reconstructions with increasing $\mu$. The Newman-Shapiro operators show better convergence, and for $\mu \ge 150$ even the small structures in the original head can be detected in the reconstruction. It can be expected that for higher degrees $\mu$ this behaviour will become more evident.
FIG. 1. Shepp-Logan phantom.
Unfortunately, for $\mu > 170$ the computation of the coefficients $a_{\mu\nu}$ for the Newman-Shapiro operators caused some numerical problems, so the calculations were stopped at $\mu = 160$. Although the numerical results look quite promising, the drawback of the reconstruction is the computational time. For $\mu = 160$ the computation took 27.5 hours for the Radon transform and 31 hours for the evaluation at the points $x \in [-1, 1]^2$. The evaluation was done on an equidistant grid of $200 \times 200$ points.
FIG. 2. Reconstruction of the Shepp-Logan phantom for $\mu = 40$, $100$ and $160$.
In principle there is no problem in producing three-dimensional reconstructions; the evaluation points $x$ only have to be chosen from a grid in $[-1, 1]^3$. Because of the time-consuming calculations this has not been done here yet.
Bibliography
1. H. Berens, P.L. Butzer, S. Pawelke, Limitierungsverfahren von Reihen mehrdimensionaler Kugelfunktionen und deren Saturationsverhalten, Publ. Res. Inst. Math. Sci. Ser. A 4 (1969) 201-268.
2. M.E. Davison, F.A. Grünbaum, Tomographic reconstruction with arbitrary directions, Comm. Pure Appl. Math. 34 (1981) 77-119.
3. S.R. Deans, The Radon Transform and Some of Its Applications, Wiley & Sons, New York, 1983.
4. E. Kogbetliantz, Recherches sur la sommabilité des séries ultrasphériques par la méthode des moyennes arithmétiques, J. de Math. Pures et Appl. 9(3) (1924) 107-187.
5. D.J. Newman, H.S. Shapiro, Jackson's theorem in higher dimensions, in: On Approximation Theory, eds. P.L. Butzer, J. Korevaar, Birkhäuser Verlag, Basel, 1964, pp. 208-219.
6. M. Reimer, Constructive Theory of Multivariate Functions, BI Wissenschaftsverlag, Mannheim, Wien, Zürich, 1990.
7. M. Reimer, Radon-transform, Laplace-series and matrix-transforms, Comm. Appl. Analysis 1 (1997) 337-349.
8. M. Reimer, Generalized hyperinterpolation on the sphere and the Newman-Shapiro operators, submitted.
9. L.A. Shepp, Computerized tomography and nuclear magnetic resonance, J. Comp. Ass. Tomography 4 (1980) 94-107.
10. A.H. Stroud, Approximate Calculation of Multiple Integrals, Prentice Hall, Englewood Cliffs, NJ, 1971.
11. G. Szegő, Orthogonal Polynomials, Amer. Math. Soc., Providence, 1991.
485
A unified approach to fast algorithms of discrete trigonometric transforms

Manfred Tasche
University of Rostock, Department of Mathematics, D-18051 Rostock, Germany.
manfred.tasche@mathematik.uni-rostock.de

Hansmartin Zeuner
Medical University of Lübeck, Institute of Mathematics, Wallstraße 40,
D-23560 Lübeck, Germany.
zeuner@math.mu-luebeck.de
Abstract
We present a unified approach to fast algorithms of various discrete trigonometric transforms. With the help of so-called Euler formulas we describe an elegant and useful connection between Fourier matrices and trigonometric matrices. It is known that FFTs
are closely related to the factorizations of the unitary Fourier matrix into a product of
unitary sparse matrices. Using these Euler formulas and FFTs, we obtain fast algorithms
of discrete trigonometric transforms. As a further consequence of these Euler formulas
and Gaussian sums, we compute all eigenvalues of some trigonometric matrices.
1 Introduction
The fast Fourier transform (FFT) and related algorithms for orthogonal trigonometric
transforms are essential tools for practical computations. Special discrete trigonometric transforms are the discrete Hartley transforms (DHT), discrete cosine transforms
(DCT), and the discrete sine transforms (DST) of various types. These transforms have
found important applications in approximation methods with Chebyshev polynomials,
quadrature methods of Clenshaw-Curtis type (see [3]), signal processing, and image
compression (see [4, 6, 9]).
Euler formulas describe the algebraic connection between Fourier matrices of a certain type and corresponding cosine and sine matrices. Using these formulas, FFTs can
be transformed into fast and stable algorithms for the DCT and DST. Further, from
these Euler formulas the orthogonality of various trigonometric matrices follows immediately. For simplicity we consider only symmetric trigonometric matrices, i.e. Fourier
and Hartley matrices of type I and IV as well as cosine and sine matrices of type I, IV,
V and VIII.
This paper is organized as follows: first we introduce generalized Fourier matrices.
New Euler formulas for these matrices describe a close connection with various orthogonal Hartley, cosine and sine matrices. These results simplify and extend former results
of [9], pp. 83-96. Applying these Euler formulas and FFTs, we obtain fast algorithms
of discrete trigonometric transforms. As a further consequence of these formulas and
Gaussian sums, we can compute all eigenvalues of orthogonal symmetric trigonometric
matrices.
2 Euler formulas for Fourier matrices of type I
Let $N \ge 2$ be a given integer. The Fourier matrix of type I is the classical Fourier matrix defined in unitary form

$$F_N^{\mathrm{I}} := \frac{1}{\sqrt{N}}\left(\omega_N^{jk}\right)_{j,k=0}^{N-1}$$

with $\omega_N := \exp(-2\pi i/N)$. Note that the Gaussian sum (see [5], pp. 326-330) yields the trace of $F_N^{\mathrm{I}}$:

$$\operatorname{tr} F_N^{\mathrm{I}} = \frac{1}{\sqrt{N}}\sum_{k=0}^{N-1}\omega_N^{k^2} = \frac{1+i^N}{1+i}. \tag{2.1}$$
Closely related with type I Fourier matrices are the cosine and sine matrices of types I and V:

$$C_{N+1}^{\mathrm{I}} := \sqrt{\tfrac{2}{N}}\left(\varepsilon_j^N\varepsilon_k^N\cos\frac{jk\pi}{N}\right)_{j,k=0}^{N}, \qquad S_{N-1}^{\mathrm{I}} := \sqrt{\tfrac{2}{N}}\left(\sin\frac{(j+1)(k+1)\pi}{N}\right)_{j,k=0}^{N-2},$$

$$C_{N+1}^{\mathrm{V}} := \frac{2}{\sqrt{2N+1}}\left(\varepsilon_j^{N+1}\varepsilon_k^{N+1}\cos\frac{2jk\pi}{2N+1}\right)_{j,k=0}^{N}, \qquad S_N^{\mathrm{V}} := \frac{2}{\sqrt{2N+1}}\left(\sin\frac{2(j+1)(k+1)\pi}{2N+1}\right)_{j,k=0}^{N-1}.$$

Here we set $\varepsilon_j^N := \sqrt{2}/2$ for $j \in \{0, N\}$ and $\varepsilon_j^N := 1$ for $j \in \{1,\dots,N-1\}$. In this notation a subscript of a matrix denotes its order, while a superscript signifies its type. In the following, $I_N$ denotes the identity matrix and $J_N$ the counteridentity matrix, which has the columns of $I_N$ in reverse order. Blanks in a block matrix indicate blocks of zeros. The direct sum of matrices $A$, $B$ will be denoted by $A \oplus B$. Defining the orthogonal matrices
$$P_{2N}^{\mathrm{I}} := \frac{1}{\sqrt{2}}\begin{pmatrix}\sqrt{2} & & & \\ & I_{N-1} & & I_{N-1} \\ & & \sqrt{2} & \\ & J_{N-1} & & -J_{N-1}\end{pmatrix}, \qquad P_{2N+1}^{\mathrm{I}} := \frac{1}{\sqrt{2}}\begin{pmatrix}\sqrt{2} & & \\ & I_N & I_N \\ & J_N & -J_N\end{pmatrix},$$
we obtain for Fourier matrices of type I the following Euler formulas:
Theorem 2.1 Depending on whether the order of the Fourier matrix of type I is even
or odd, we have
$$(P_{2N}^{\mathrm{I}})^T F_{2N}^{\mathrm{I}} P_{2N}^{\mathrm{I}} = C_{N+1}^{\mathrm{I}} \oplus (-i)S_{N-1}^{\mathrm{I}}, \tag{2.2}$$
$$(P_{2N+1}^{\mathrm{I}})^T F_{2N+1}^{\mathrm{I}} P_{2N+1}^{\mathrm{I}} = C_{N+1}^{\mathrm{V}} \oplus (-i)S_N^{\mathrm{V}}. \tag{2.3}$$
Proof: It is obvious that $(P_{2N}^{\mathrm{I}})^T P_{2N}^{\mathrm{I}} = I_{2N}$. Splitting $F_{2N}^{\mathrm{I}}$ into the four blocks

$$F_{2N}^{\mathrm{I}} = \frac{1}{\sqrt{2N}}\begin{pmatrix}\left(\omega_{2N}^{jk}\right)_{j,k=0}^{N-1} & \left(\omega_{2N}^{j(k+N)}\right)_{j,k=0}^{N-1} \\ \left(\omega_{2N}^{(j+N)k}\right)_{j,k=0}^{N-1} & \left(\omega_{2N}^{(j+N)(k+N)}\right)_{j,k=0}^{N-1}\end{pmatrix}$$

and using the classical Euler formula $\exp(-ix) = \cos x - i\sin x$, we obtain (2.2) by blockwise computation of $(P_{2N}^{\mathrm{I}})^T F_{2N}^{\mathrm{I}} P_{2N}^{\mathrm{I}}$. The proof of (2.3) is similar.
□
Remark 2.2 An analogous result to (2.2) can be found in [9], pp. 85-90, but with a complex matrix instead of $P_{2N}^{\mathrm{I}}$. Compare also with [1]. The Euler formula (2.3) is new.
Note that the results and their proofs are simpler than in [9], pp. 85-90 and [1].
Corollary 2.3 The matrices $C_{N+1}^{\mathrm{I}}$, $S_{N-1}^{\mathrm{I}}$, $C_{N+1}^{\mathrm{V}}$, $S_N^{\mathrm{V}}$ are orthogonal.
Proof: Since $F_{2N}^{\mathrm{I}}$ is unitary and $P_{2N}^{\mathrm{I}}$ is orthogonal, $C_{N+1}^{\mathrm{I}} \oplus (-i)S_{N-1}^{\mathrm{I}}$ is unitary by (2.2). Hence the real matrices $C_{N+1}^{\mathrm{I}}$ and $S_{N-1}^{\mathrm{I}}$ are orthogonal. Other proofs can be found in
[4], pp. 12-16 and [6].
The proof for the type V matrices uses (2.3) and follows similar lines.
□
Remark 2.4 Results analogous to (2.2) and (2.3) are true for the Hartley matrix of type I (see [9], pp. 77-80 and [8], pp. 224-227)

$$H_N^{\mathrm{I}} := \frac{1}{\sqrt{N}}\left(\operatorname{cas}\frac{2jk\pi}{N}\right)_{j,k=0}^{N-1}$$

with $\operatorname{cas} x := \cos x + \sin x$. Then we obtain the formulas

$$(P_{2N}^{\mathrm{I}})^T H_{2N}^{\mathrm{I}} P_{2N}^{\mathrm{I}} = C_{N+1}^{\mathrm{I}} \oplus S_{N-1}^{\mathrm{I}}, \tag{2.4}$$
$$(P_{2N+1}^{\mathrm{I}})^T H_{2N+1}^{\mathrm{I}} P_{2N+1}^{\mathrm{I}} = C_{N+1}^{\mathrm{V}} \oplus S_N^{\mathrm{V}}. \tag{2.5}$$
The Euler formula (2.2) can be used for fast and numerically stable computations of DCTs and DSTs of type I: Let $x \in \mathbb{R}^{N+1}$ and $y \in \mathbb{R}^{N-1}$ with $N = 2^t$ ($t \ge 2$), and set $z := \binom{x}{y} \in \mathbb{R}^{2N}$. Since $P_{2N}^{\mathrm{I}} z$ is real, we can apply Edson's algorithm for the FFT of real data (see [8], pp. 215-223 and [7]). The output of the conjugate even result is of the form $U_{2N} F_{2N}^{\mathrm{I}}(P_{2N}^{\mathrm{I}} z)$ where $U_{2N} := (I_{N+1} \oplus (-i)I_{N-1})(P_{2N}^{\mathrm{I}})^T$. Therefore, by

$$U_{2N} F_{2N}^{\mathrm{I}} P_{2N}^{\mathrm{I}} z = \left(C_{N+1}^{\mathrm{I}} \oplus (-1)S_{N-1}^{\mathrm{I}}\right) z = \begin{pmatrix}C_{N+1}^{\mathrm{I}} x \\ -S_{N-1}^{\mathrm{I}} y\end{pmatrix},$$

we have calculated $C_{N+1}^{\mathrm{I}} x$ and $S_{N-1}^{\mathrm{I}} y$ simultaneously using about $5Nt$ flops.
If we have to use an FFT with complex data, we combine real data vectors $x, x' \in \mathbb{R}^{N+1}$ and $y, y' \in \mathbb{R}^{N-1}$ into $z' := \binom{x + ix'}{y + iy'}$. Then we can compute two DCTs $C_{N+1}^{\mathrm{I}} x$, $C_{N+1}^{\mathrm{I}} x'$ and two DSTs $S_{N-1}^{\mathrm{I}} y$, $S_{N-1}^{\mathrm{I}} y'$ simultaneously via an FFT of length $2N$ applied to the complex input vector $P_{2N}^{\mathrm{I}} z'$. In a similar way, the Euler formula (2.3) can be used for fast computations of DCTs and DSTs of type V: For given $x, x' \in \mathbb{R}^{N+1}$ and $y, y' \in \mathbb{R}^N$, the transformed vectors
and DSTs of type V: For given a;, a;' 6 K^+^ and y,y' e K^ the transformed vectors
$C_{N+1}^{\mathrm{V}} x$, $C_{N+1}^{\mathrm{V}} x'$, $S_N^{\mathrm{V}} y$, and $S_N^{\mathrm{V}} y'$ can be calculated at the same time as components of $(P_{2N+1}^{\mathrm{I}})^T F_{2N+1}^{\mathrm{I}} P_{2N+1}^{\mathrm{I}} z' = (C_{N+1}^{\mathrm{V}} \oplus (-i)S_N^{\mathrm{V}}) z'$, where we use an FFT of length $2N+1$ with complex data $P_{2N+1}^{\mathrm{I}} z'$. If $2N+1 = 3^t$ or, more generally, if $2N+1$ is a product of small primes (see [8], pp. 76-101 and [7]), the FFT of length $2N+1$ can be computed very efficiently.
3 Euler formulas for Fourier matrices of type IV
The Fourier matrix of type IV, defined by

$$F_N^{\mathrm{IV}} := \frac{1}{\sqrt{N}}\left(\omega_{4N}^{(2j+1)(2k+1)}\right)_{j,k=0}^{N-1},$$

is related to the Fourier matrix of type I by the formula

$$F_N^{\mathrm{IV}} = \omega_{4N}\, W_N F_N^{\mathrm{I}} W_N \tag{3.1}$$

with $W_N := \operatorname{diag}(\omega_{2N}^k)_{k=0}^{N-1}$, and is therefore unitary. If $N$ is a power of 2 or 3, then $F_N^{\mathrm{IV}}$ can be factorized into a product of sparse unitary matrices.
Lemma 3.1 The trace of the Fourier matrix of type IV is equal to

$$\operatorname{tr} F_N^{\mathrm{IV}} = \frac{1}{\sqrt{N}}\sum_{k=0}^{N-1}\omega_{4N}^{(2k+1)^2} = \frac{1-i^N}{1+i}.$$

Proof: We begin with the generalized Gaussian sum (see [5], p. 330)

$$\sum_{j=0}^{2N-1}\omega_{4N}^{j^2} = (1-i)\sqrt{N},$$

which we split into two sums containing even and odd $j$, respectively. Then

$$\sum_{j=0}^{2N-1}\omega_{4N}^{j^2} = \sum_{k=0}^{N-1}\omega_N^{k^2} + \sum_{k=0}^{N-1}\omega_{4N}^{(2k+1)^2},$$

and the result follows by (2.1). □
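Both trace formulas, $\operatorname{tr} F_N^{\mathrm{I}} = (1+i^N)/(1+i)$ and $\operatorname{tr} F_N^{\mathrm{IV}} = (1-i^N)/(1+i)$, are easy to confirm numerically; a small pure-Python check (function names ours):

```python
import cmath
import math

def trace_F1(N):
    # tr F^I_N = N^{-1/2} * sum_k exp(-2*pi*i*k^2/N), the classical Gauss sum
    return sum(cmath.exp(-2j * math.pi * k * k / N) for k in range(N)) / math.sqrt(N)

def trace_F4(N):
    # tr F^IV_N = N^{-1/2} * sum_k exp(-2*pi*i*(2k+1)^2/(4N))
    return sum(cmath.exp(-2j * math.pi * (2 * k + 1) ** 2 / (4 * N))
               for k in range(N)) / math.sqrt(N)

for N in range(2, 40):
    assert abs(trace_F1(N) - (1 + 1j ** N) / (1 + 1j)) < 1e-9
    assert abs(trace_F4(N) - (1 - 1j ** N) / (1 + 1j)) < 1e-9
```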
Now we introduce the cosine and sine matrices of types IV and VIII, which are closely related with the Fourier matrix of type IV:

$$C_N^{\mathrm{IV}} := \sqrt{\tfrac{2}{N}}\left(\cos\frac{(2j+1)(2k+1)\pi}{4N}\right)_{j,k=0}^{N-1}, \qquad S_N^{\mathrm{IV}} := \sqrt{\tfrac{2}{N}}\left(\sin\frac{(2j+1)(2k+1)\pi}{4N}\right)_{j,k=0}^{N-1},$$

$$C_N^{\mathrm{VIII}} := \frac{2}{\sqrt{2N+1}}\left(\cos\frac{(2j+1)(2k+1)\pi}{2(2N+1)}\right)_{j,k=0}^{N-1}, \qquad S_{N+1}^{\mathrm{VIII}} := \frac{2}{\sqrt{2N+1}}\left(\varepsilon_{j+1}^{N+1}\varepsilon_{k+1}^{N+1}\sin\frac{(2j+1)(2k+1)\pi}{2(2N+1)}\right)_{j,k=0}^{N}.$$
As above we define the orthogonal matrices

$$P_{2N}^{\mathrm{IV}} := \frac{1}{\sqrt{2}}\begin{pmatrix}I_N & I_N \\ -J_N & J_N\end{pmatrix}, \qquad P_{2N+1}^{\mathrm{IV}} := \frac{1}{\sqrt{2}}\begin{pmatrix}I_N & I_N & \\ & & \sqrt{2} \\ -J_N & J_N & \end{pmatrix}.$$
Theorem 3.2 For the Fourier matrix of type IV and even resp. odd order, we obtain the following Euler formulas:

$$(P_{2N}^{\mathrm{IV}})^T F_{2N}^{\mathrm{IV}} P_{2N}^{\mathrm{IV}} = C_N^{\mathrm{IV}} \oplus (-i)S_N^{\mathrm{IV}}, \tag{3.3}$$
$$(P_{2N+1}^{\mathrm{IV}})^T F_{2N+1}^{\mathrm{IV}} P_{2N+1}^{\mathrm{IV}} = C_N^{\mathrm{VIII}} \oplus (-i)S_{N+1}^{\mathrm{VIII}}. \tag{3.4}$$
Proof: Similar to that of Theorem 2.1.
□
Corollary 3.3 The matrices $C_N^{\mathrm{IV}}$, $S_N^{\mathrm{IV}}$, $C_N^{\mathrm{VIII}}$ and $S_{N+1}^{\mathrm{VIII}}$ are orthogonal.
Remark 3.4 An analogous result to (3.3) can be found in [9], pp. 94-96. Compare also with [1]. Formula (3.4) is new. A different proof of the orthogonality of $C_N^{\mathrm{IV}}$ and $S_N^{\mathrm{IV}}$ can be found in [6].
Remark 3.5 Similar formulas as in Theorem 3.2 are true for the Hartley matrix of type IV (see [1, 2])

$$H_N^{\mathrm{IV}} := \frac{1}{\sqrt{N}}\left(\operatorname{cas}\frac{(2j+1)(2k+1)\pi}{2N}\right)_{j,k=0}^{N-1}.$$

Then we have

$$(P_{2N}^{\mathrm{IV}})^T H_{2N}^{\mathrm{IV}} P_{2N}^{\mathrm{IV}} = C_N^{\mathrm{IV}} \oplus S_N^{\mathrm{IV}}, \tag{3.5}$$
$$(P_{2N+1}^{\mathrm{IV}})^T H_{2N+1}^{\mathrm{IV}} P_{2N+1}^{\mathrm{IV}} = C_N^{\mathrm{VIII}} \oplus S_{N+1}^{\mathrm{VIII}}. \tag{3.6}$$
The Euler formulas can be used for a fast and numerically stable computation of the DCT and DST of types IV and VIII: Using (3.3) and (3.1), for arbitrary $x, x', y, y' \in \mathbb{R}^N$ the DCTs $C_N^{\mathrm{IV}} x$, $C_N^{\mathrm{IV}} x'$ and DSTs $S_N^{\mathrm{IV}} y$, $S_N^{\mathrm{IV}} y'$ can be calculated via one FFT of length $2N$ with complex data $P_{2N}^{\mathrm{IV}} z'$ and $z' := \binom{x + ix'}{y + iy'}$. If $N = 2^t$, this procedure requires about $10Nt$ operations. Likewise by (3.4), for $x, x' \in \mathbb{R}^N$ and $y, y' \in \mathbb{R}^{N+1}$, the DCTs of type VIII, $C_N^{\mathrm{VIII}} x$, $C_N^{\mathrm{VIII}} x'$, and the DSTs $S_{N+1}^{\mathrm{VIII}} y$, $S_{N+1}^{\mathrm{VIII}} y'$ can be calculated via one FFT of length $2N+1$ with complex data $P_{2N+1}^{\mathrm{IV}} z'$.

Remark 3.6 The sine, cosine, Hartley, and Fourier matrices considered above enjoy the interesting intertwining relations (see [2]):
$$C_N^{\mathrm{IV}} J_N = \Sigma_N S_N^{\mathrm{IV}}, \qquad S_N^{\mathrm{IV}} J_N = \Sigma_N C_N^{\mathrm{IV}}, \qquad H_N^{\mathrm{IV}} J_N = J_N H_N^{\mathrm{IV}}, \qquad F_N^{\mathrm{IV}} J_N = J_N F_N^{\mathrm{IV}}, \tag{3.7}$$
$$H_N^{\mathrm{I}} \Gamma_N = \Gamma_N H_N^{\mathrm{I}}, \qquad F_N^{\mathrm{I}} \Gamma_N = \Gamma_N F_N^{\mathrm{I}},$$

with the diagonal matrix $\Sigma_N := \operatorname{diag}((-1)^k)_{k=0}^{N-1}$ and the reflection matrix $\Gamma_N := 1 \oplus J_{N-1}$. Therefore, applying (3.7) in the above algorithm, it is also possible to compute
four DCTs (or four DSTs) of type IV and order N via one FFT of length 2N with
complex data.
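Two such intertwining relations, $C_N^{\mathrm{IV}} J_N = \Sigma_N S_N^{\mathrm{IV}}$ (with $\Sigma_N := \operatorname{diag}((-1)^j)$) and $F_N^{\mathrm{IV}} J_N = J_N F_N^{\mathrm{IV}}$, can be spot-checked numerically straight from the definitions; the following pure-Python sketch does so with matrices as nested lists (helper names are ours).

```python
import cmath
import math

def mat_C4(N):
    # C^IV_N from its definition
    return [[math.sqrt(2 / N) * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * N))
             for k in range(N)] for j in range(N)]

def mat_S4(N):
    # S^IV_N from its definition
    return [[math.sqrt(2 / N) * math.sin((2 * j + 1) * (2 * k + 1) * math.pi / (4 * N))
             for k in range(N)] for j in range(N)]

def mat_F4(N):
    # F^IV_N = N^{-1/2} (omega_{4N}^{(2a+1)(2b+1)})
    return [[cmath.exp(-2j * math.pi * (2 * a + 1) * (2 * b + 1) / (4 * N)) / math.sqrt(N)
             for b in range(N)] for a in range(N)]

def flip_cols(A):  # right multiplication by the counteridentity J
    return [row[::-1] for row in A]

def flip_rows(A):  # left multiplication by J
    return A[::-1]

def close(A, B, tol=1e-9):
    return all(abs(x - y) < tol for r, s in zip(A, B) for x, y in zip(r, s))

N = 7
C, S, F = mat_C4(N), mat_S4(N), mat_F4(N)
# C^IV J = Sigma S^IV, where Sigma multiplies row j by (-1)^j
assert close(flip_cols(C), [[(-1) ** j * v for v in S[j]] for j in range(N)])
# F^IV commutes with J
assert close(flip_cols(F), flip_rows(F))
```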
4 Eigenvalues of trigonometric matrices
Finally we determine the eigenvalues of the trigonometric matrices introduced above. Since the cosine and sine matrices of types I, IV, V and VIII, and the Hartley matrices of types I and IV are real, symmetric and orthogonal, only 1 and −1 are possible eigenvalues. For $x \in \mathbb{R}$ we denote by $\lfloor x \rfloor$ resp. $\lceil x \rceil$ the integer $k \in \mathbb{Z}$ with $k \le x < k+1$ resp. $k-1 < x \le k$.
Theorem 4.1 The sine and cosine matrices $C_N^{\mathrm{I}}, S_N^{\mathrm{I}}, C_N^{\mathrm{IV}}, S_N^{\mathrm{IV}}, C_N^{\mathrm{V}}, S_N^{\mathrm{V}}, C_N^{\mathrm{VIII}}$ and $S_N^{\mathrm{VIII}}$ of order $N \ge 2$ possess the eigenvalues 1 and −1 with multiplicities

$$m(1) = \lceil N/2 \rceil, \qquad m(-1) = \lfloor N/2 \rfloor.$$
Proof: Since $C_N^{\mathrm{I}}$ is symmetric and orthogonal, only 1 and −1 can be eigenvalues. Their multiplicities fulfil

$$m(1) + m(-1) = N.$$

On the other hand, since $C_N^{\mathrm{I}}$ and $S_{N-2}^{\mathrm{I}}$ are real, it follows from (2.2) and the trace formula (2.1) that

$$m(1) - m(-1) = \operatorname{tr} C_N^{\mathrm{I}} = \operatorname{Re}\left(\operatorname{tr} F_{2N-2}^{\mathrm{I}}\right) = \operatorname{Re}\frac{1+i^{2N-2}}{1+i} = \begin{cases}1 & \text{for odd } N, \\ 0 & \text{for even } N.\end{cases}$$

From these two linear equations we obtain $m(1) = \lceil N/2 \rceil$ and $m(-1) = \lfloor N/2 \rfloor$. In the other cases, the proof is similar. □
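The trace argument in this proof is easy to reproduce numerically for $C_N^{\mathrm{I}}$: computing $\operatorname{tr} C_N^{\mathrm{I}}$ from the definition and combining it with $m(1)+m(-1)=N$ recovers the stated multiplicities. A pure-Python sketch (function names ours):

```python
import math

def trace_C1(N):
    # trace of the order-N cosine matrix C^I_N = sqrt(2/M)(eps_j eps_k cos(jk*pi/M)),
    # j,k = 0..M with M := N-1 and eps_j = sqrt(2)/2 for j in {0, M}
    M = N - 1
    eps = lambda j: math.sqrt(0.5) if j in (0, M) else 1.0
    return sum(math.sqrt(2 / M) * eps(j) ** 2 * math.cos(j * j * math.pi / M)
               for j in range(M + 1))

for N in range(2, 30):
    t = round(trace_C1(N))                      # 1 for odd N, 0 for even N
    m_plus, m_minus = (N + t) // 2, (N - t) // 2
    assert m_plus == math.ceil(N / 2) and m_minus == N // 2
```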
From Theorem 4.1 and the Euler formulas (2.2)-(2.3) and (3.3)-(3.4) it follows immediately:
Corollary 4.2 The Fourier matrices of types I and IV have only the eigenvalues $1, -1, i, -i$, with multiplicities:

$$\begin{array}{c|cccc}
 & m(1) & m(-1) & m(i) & m(-i) \\\hline
F_{2N}^{\mathrm{I}} & \lfloor N/2\rfloor + 1 & \lceil N/2\rceil & \lceil N/2\rceil - 1 & \lfloor N/2\rfloor \\
F_{2N+1}^{\mathrm{I}} & \lfloor N/2\rfloor + 1 & \lceil N/2\rceil & \lfloor N/2\rfloor & \lceil N/2\rceil \\
F_{2N}^{\mathrm{IV}} & \lceil N/2\rceil & \lfloor N/2\rfloor & \lfloor N/2\rfloor & \lceil N/2\rceil \\
F_{2N+1}^{\mathrm{IV}} & \lceil N/2\rceil & \lfloor N/2\rfloor & \lceil N/2\rceil & \lfloor N/2\rfloor + 1
\end{array}$$
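The rows for $F^{\mathrm{I}}$ can also be checked numerically: since $(F_N^{\mathrm{I}})^4 = I_N$, the four multiplicities are determined by the order together with $\operatorname{tr} F$ and $\operatorname{tr} F^2$. A pure-Python sketch (function names ours):

```python
import cmath
import math

def fourier1(M):
    # unitary Fourier matrix of type I and order M
    return [[cmath.exp(-2j * math.pi * j * k / M) / math.sqrt(M)
             for k in range(M)] for j in range(M)]

def multiplicities(M):
    # tr F = m(1)-m(-1) + i(m(i)-m(-i)) and tr F^2 = m(1)+m(-1)-m(i)-m(-i),
    # together with m(1)+m(-1)+m(i)+m(-i) = M, determine all four values
    F = fourier1(M)
    trF = sum(F[j][j] for j in range(M))
    trF2 = sum(sum(F[j][k] * F[k][j] for k in range(M)) for j in range(M)).real
    m1 = round((M + trF2) / 4 + trF.real / 2)
    m_1 = round((M + trF2) / 4 - trF.real / 2)
    mi = round((M - trF2) / 4 + trF.imag / 2)
    m_i = round((M - trF2) / 4 - trF.imag / 2)
    return m1, m_1, mi, m_i

# compare with the table rows F^I_{2N} and F^I_{2N+1}
for N in range(2, 12):
    assert multiplicities(2 * N) == (N // 2 + 1, math.ceil(N / 2),
                                     math.ceil(N / 2) - 1, N // 2)
    assert multiplicities(2 * N + 1) == (N // 2 + 1, math.ceil(N / 2),
                                         N // 2, math.ceil(N / 2))
```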
From Theorem 4.1 and formulas (2.4)-(2.5) and (3.5)-(3.6) it follows:
Corollary 4.3 The Hartley matrices of types I and IV have only the eigenvalues 1 and −1, with the following multiplicities:

$$\begin{array}{c|cc}
 & m(1) & m(-1) \\\hline
H_{2N}^{\mathrm{I}} & 2\lfloor N/2\rfloor + 1 & 2\lceil N/2\rceil - 1 \\
H_{2N+1}^{\mathrm{I}} & N+1 & N \\
H_{2N}^{\mathrm{IV}} & 2\lceil N/2\rceil & 2\lfloor N/2\rfloor \\
H_{2N+1}^{\mathrm{IV}} & N+1 & N
\end{array}$$
Bibliography
1. V. BRITANAK AND K. R. RAO, The fast generalized discrete Fourier transform: A
unified approach to the discrete sinusoidal transform computation, Signal Process.,
79(1999), pp. 135-150.
2. G. HEINIG AND K. ROST, Hartley transform representations of inverses of real
Toeplitz-plus-Hankel matrices, Numer. Funct. Anal. Optim., 21 (2000), pp. 175-189.
3. J. C. MASON AND E. VENTURINO, Integration methods of Clenshaw-Curtis type,
based on four kinds of Chebyshev polynomials, in Multivariate Approximation and
Splines, G. Nürnberger, J. W. Schmidt, and G. Walz, eds., Basel, 1997, Birkhäuser,
pp. 153-165.
4. K. R. RAO AND P. YIP, Discrete Cosine Transform: Algorithms, Advantages, and
Applications, Academic Press, San Diego, 1990.
5. R. REMMERT, Funktionentheorie I, Springer, Berlin, 1992.
6. G. STRANG, The discrete cosine transform, SIAM Rev., 41 (1999), pp. 135-147.
7. M. TASCHE AND H. ZEUNER, Roundoff error analysis for fast trigonometric transforms, in Handbook of Analytic-Computational Methods in Applied Mathematics,
G. Anastassiou, ed., CRC Press, Boca Raton, 2000, pp. 357-406.
8. C. VAN LOAN, Computational Frameworks for the Fast Fourier Transform, SIAM,
Philadelphia, 1992.
9. M. V. WICKERHAUSER, Adapted Wavelet Analysis from Theory to Software, A K
Peters, Wellesley, 1994.
This volume contains the proceedings of an International Symposium on Algorithms for Approximation IV (A4A4), held at the University of Huddersfield from July 15th to 20th, 2001, and attended by 106 people from no fewer than 32 countries. The 54 papers submitted cover a broad range of topics in approximation theory, metrology, orthogonal polynomials, splines, wavelets, radial basis functions, approximation on manifolds, and applications in medical modelling and the solution of integral and differential equations. All papers were refereed meticulously.