Texts in Applied Mathematics 68

Armin Iske

Approximation
Theory and
Algorithms for
Data Analysis
Texts in Applied Mathematics

Volume 68

Editors-in-chief
S. S. Antman, University of Maryland, College Park, USA
A. Bloch, University of Michigan, Ann Arbor, USA
A. Goriely, University of Oxford, Oxford, UK
L. Greengard, New York University, New York, USA
P. J. Holmes, Princeton University, Princeton, USA

Series editors
J. Bell, Lawrence Berkeley National Lab, Berkeley, USA
R. Kohn, New York University, New York, USA
P. Newton, University of Southern California, Los Angeles, USA
C. Peskin, New York University, New York, USA
R. Pego, Carnegie Mellon University, Pittsburgh, USA
L. Ryzhik, Stanford University, Stanford, USA
A. Singer, Princeton University, Princeton, USA
A. Stevens, Max-Planck-Institute for Mathematics, Leipzig, Germany
A. Stuart, University of Warwick, Coventry, UK
T. Witelski, Duke University, Durham, USA
S. Wright, University of Wisconsin, Madison, USA
The mathematization of all sciences, the fading of traditional scientific boundaries,
the impact of computer technology, the growing importance of computer modeling
and the necessity of scientific planning all create the need both in education and
research for books that are introductory to and abreast of these developments. The
aim of this series is to provide such textbooks in applied mathematics for the student
scientist. Books should be well illustrated and have clear exposition and sound
pedagogy. A large number of examples and exercises at varying levels is
recommended. TAM publishes textbooks suitable for advanced undergraduate and
beginning graduate courses, and complements the Applied Mathematical Sciences
(AMS) series, which focuses on advanced textbooks and research-level monographs.

More information about this series at http://www.springer.com/series/1214


Armin Iske

Approximation Theory
and Algorithms for Data
Analysis

Armin Iske
Department of Mathematics
University of Hamburg
Hamburg, Germany

ISSN 0939-2475 ISSN 2196-9949 (electronic)


Texts in Applied Mathematics
ISBN 978-3-030-05227-0 ISBN 978-3-030-05228-7 (eBook)
https://doi.org/10.1007/978-3-030-05228-7
Library of Congress Control Number: 2018963282

Mathematics Subject Classification (2010): 41-XX, 42-XX, 65-XX, 94A12

Original German edition published by Springer-Verlag GmbH, Heidelberg, 2017. Title of German
edition: Approximation.
© Springer Nature Switzerland AG 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This textbook offers an elementary introduction to the theory and numerics
of approximation methods, combining classical topics of approximation with
selected recent advances in mathematical signal processing, and adopting a
constructive approach, in which the development of numerical algorithms for
data analysis plays an important role.
Although the title may suggest otherwise, this textbook is not a result
of the current hype on big data science. Nevertheless, both classical and
contemporary topics of approximation include the analysis and representation
of functions (e.g. signals), where suitable mathematical tools (e.g. Fourier
transforms) are essential for the analysis and synthesis of the data. As such,
the subject of data analysis is a central topic within approximation, as we
will discuss in further detail.
Prerequisites. This textbook is suitable for undergraduate students who
have a sound background in linear algebra and analysis. Further relevant
basics on numerical methods are provided in Chapter 2, so that this textbook
can be used by students attending a course on numerical mathematics. For
others, the material in Chapter 2 offers a welcome review of basic numerical
methods. The text of this work is suitable for courses, seminars, and distance
learning programs on approximation.
Contents and Standard Topics. The central theme of approximation
is the characterization and construction of best approximations in normed
linear spaces. Readers are introduced to this standard topic (in Chapter 3),
before approximation in Euclidean spaces (in Chapter 4) and Chebyshev ap-
proximation (in Chapter 5) are addressed. These are followed by asymptotic
results concerning the approximation of univariate continuous functions by
algebraic and trigonometric polynomials (in Chapter 6), where the asymp-
totic behaviour of Fourier partial sums is of primary importance. The core
topics of Chapters 3-6 should be an essential part of any introductory course
on approximation theory.


More Advanced Topics. Chapters 7-9 discuss more advanced topics
and address recent developments in modern approximation and its relevant
applications. To this end, Chapter 7 explains the basic concepts of signal
approximation using Fourier and wavelet methods. This is followed by a
more comprehensive introduction to multivariate approximation by meshfree
positive definite kernels in Chapter 8. The material in Sections 8.4-8.5 pro-
vides more recent results concerning relevant aspects of convergence, stability,
and update strategies for kernel-based approximation. Moreover, Section 8.6
presents basic facts on kernel-based learning. Lastly, Chapter 9 focuses on
mathematical methods of computerized tomography, exploring this impor-
tant application field from the viewpoint of approximation. In particular,
new convergence results concerning the approximation of bivariate functions
from Radon data are proven in Section 9.4.
For those who have studied Chapters 3-6, any of the three more advanced
topics in Chapters 7-9 could seamlessly be included in an introductory course
on approximation. Nevertheless, it is strongly recommended that readers first
study the Fourier basics presented in Sections 7.1-7.4, since much of the
subsequent material in Chapters 8-9 relies on Fourier techniques.
Exercises and Problem Solving. Active participation in exercise
classes is generally an essential requirement for the successful completion of a
mathematics course, and a (decent) course on approximation is certainly no
exception. As such, each of the Chapters 3-9 includes a separate section with
exercises. To enhance learning, readers are strongly encouraged to work on
these exercises, which have different levels of complexity and difficulty. Some
of the exercise problems are suitable for group work in class, while others
should be assigned for homework. Although a number of the exercise prob-
lems may appear difficult, they can be solved using the techniques explained
in this book. Further hints and comments are available on the website
www.math.uni-hamburg.de/home/iske/approx.en.html.

Biographical Data. To allow readers to appreciate the historical con-
text of the presented topics and their developments, we decided to provide
footnotes, where we refer to those whose names are linked with the corres-
ponding results, definitions, and terms. For a better overview, we have also
added a name index. The listed biographical data mainly relies on the online
archive MacTutor History of Mathematics [55] and on the free encyclopedia
Wikipedia [73], where more detailed information can be found.

Acknowledgement. The material of this book has grown over many
years out of the courses on approximation and mathematical signal processing
that I taught at the universities of Hamburg (Germany), Lund (Sweden),
and Padua (Italy). I thank the participating students for their constructive
feedback, which has added great didactical value to this textbook. More-
over, I would like to thank my (post)doctoral students Dr Adeleke Bankole,
Dr Matthias Beckmann, Dr Benedikt Diederichs, and Niklas Wagner for their
careful proofreading. Additional comments and suggestions from Dr Matthias
Beckmann and Dr Benedikt Diederichs concerning conceptional and didac-
tical aspects as well as the technical details of the presentation are grate-
fully appreciated. Last but not least, I would like to thank Dr Martin Peters
(SpringerSpektrum, Heidelberg) for his support and encouragement, which
led to the initiation of the book project.

Hamburg, October 2018 Armin Iske


iske@math.uni-hamburg.de
Table of Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Preliminaries, Definitions and Notations . . . . . . . . . . . . . . . . . . . 2
1.2 Basic Problems and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Approximation Methods for Data Analysis . . . . . . . . . . . . . . . . . 7
1.4 Hints on Classical and More Recent Literature . . . . . . . . . . . . . 8

2 Basic Methods and Numerical Algorithms . . . . . . . . . . . . . . . . 9


2.1 Linear Least Squares Approximation . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Regularization Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Interpolation by Algebraic Polynomials . . . . . . . . . . . . . . . . . . . . 19
2.4 Divided Differences and the Newton Representation . . . . . . . . . 28
2.5 Error Estimates and Optimal Interpolation Points . . . . . . . . . . 41
2.6 Interpolation by Trigonometric Polynomials . . . . . . . . . . . . . . . . 47
2.7 The Discrete Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3 Best Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.1 Existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2 Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.3 Dual Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.4 Direct Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

4 Euclidean Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103


4.1 Construction of Best Approximations . . . . . . . . . . . . . . . . . . . . . 104
4.2 Orthogonal Bases and Orthogonal Projections . . . . . . . . . . . . . . 107
4.3 Fourier Partial Sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.4 Orthogonal Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

5 Chebyshev Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139


5.1 Approaches to Construct Best Approximations . . . . . . . . . . . . . 140
5.2 Strongly Unique Best Approximations . . . . . . . . . . . . . . . . . . . . . 152
5.3 Haar Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.4 The Remez Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179


6 Asymptotic Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185


6.1 The Weierstrass Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.2 Complete Orthogonal Systems and Riesz Bases . . . . . . . . . . . . . 195
6.3 Convergence of Fourier Partial Sums . . . . . . . . . . . . . . . . . . . . . . 204
6.4 The Jackson Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
6.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

7 Basic Concepts of Signal Approximation . . . . . . . . . . . . . . . . . . 237


7.1 The Continuous Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . 239
7.2 The Fourier Transform on L2 (R) . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.3 The Shannon Sampling Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 255
7.4 The Multivariate Fourier Transform . . . . . . . . . . . . . . . . . . . . . . 257
7.5 The Haar Wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

8 Kernel-based Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275


8.1 Multivariate Lagrange Interpolation . . . . . . . . . . . . . . . . . . . . . . 276
8.2 Native Reproducing Kernel Hilbert Spaces . . . . . . . . . . . . . . . . . 283
8.3 Optimality of the Interpolation Method . . . . . . . . . . . . . . . . . . . 289
8.4 Orthonormal Systems, Convergence, and Updates . . . . . . . . . . 293
8.5 Stability of the Reconstruction Scheme . . . . . . . . . . . . . . . . . . . . 302
8.6 Kernel-based Learning Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 306
8.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

9 Computerized Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317


9.1 The Radon Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
9.2 The Filtered Back Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
9.3 Construction of Low-Pass Filters . . . . . . . . . . . . . . . . . . . . . . . . . 329
9.4 Error Estimates and Convergence Rates . . . . . . . . . . . . . . . . . . . 335
9.5 Implementation of the Reconstruction Method . . . . . . . . . . . . . 338
9.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349

Subject Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353

Name Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357


1 Introduction

Contemporary applications in computational science and engineering, as well
as in finance, require powerful mathematical methods to analyze big data sets.
Due to the rapidly growing complexity of relevant application data at limi-
ted computational (hardware) capacities, efficient numerical algorithms are
required for the simulation of complex systems with only a few parameters.
Both the parameter identification and the data assimilation are based on
high-performance computational methods to approximately represent mathe-
matical functions.
This textbook gives an introduction to the theory and the numerics of
approximation methods, where the approximation of real-valued functions

f : Ω −→ R

on compact parameter domains Ω ⊂ Rd , d ≥ 1, plays a central role.


But we do not only restrict ourselves to the approximation of functions.
We rather work with the more general assumption, where f lies in a linear
space F, i.e., f ∈ F. For the construction of a concrete approximation method
for elements f ∈ F, we first fix a suitable subset S ⊂ F, from which we seek
to select an approximation s∗ ∈ S to f . But the selection of s∗ requires
particular care. For the ”best possible” representation of f by s∗ ∈ S we are
interested in the selection of a best approximation s∗ ∈ S to f ,

s∗ ≈ f,

i.e., an element s∗ ∈ S which, among all elements s ∈ S, is closest to f .


In this introduction, we first explain important concepts and notions of
approximation, but only very briefly. Later in this chapter, we discuss con-
crete examples for relevant function spaces F and suitable subsets S ⊂ F.
For further motivation and outlook, we finally sketch selected questions and
results of approximation, which we will later address in more detail.


1.1 Preliminaries, Definitions and Notations


For the construction of best approximations to f ∈ F, we necessarily need to
measure distances between f and its approximations s ∈ S. To this end, we
introduce a norm for F, where throughout this text we assume that F is a
linear space (i.e., vector space) over the real numbers R or over the complex
numbers C.

Definition 1.1. For a linear space F, a mapping ‖·‖ : F −→ [0, ∞) is said
to be a norm for F, if the following properties are satisfied.
(a) ‖u‖ = 0, if and only if u = 0 (definiteness)
(b) ‖αu‖ = |α| ‖u‖ for all u ∈ F and all α ∈ R (or, α ∈ C) (homogeneity)
(c) ‖u + v‖ ≤ ‖u‖ + ‖v‖ for all u, v ∈ F (triangle inequality).
In this case, F with the norm ‖·‖, (F, ‖·‖), is called a normed space.

For the approximation of functions, infinite-dimensional linear spaces F


are of particular interest. Let us consider one relevant example: For a compact
domain Ω ⊂ Rd , d ∈ N,

C (Ω) := {u : Ω −→ R | u continuous on Ω}

denotes the linear space of all continuous functions on Ω. Recall that C (Ω)
is a linear space of infinite dimension. When equipped with the maximum
norm ‖·‖_∞, defined as

    ‖u‖_∞ := max_{x∈Ω} |u(x)|   for u ∈ C(Ω),

C(Ω) is a normed linear function space (or, in short: a normed space). The
normed space C(Ω), equipped with the maximum norm ‖·‖_∞, is complete,
i.e., (C(Ω), ‖·‖_∞) is a Banach space. We note this important result as follows.

Theorem 1.2. For compact Ω ⊂ Rd, (C(Ω), ‖·‖_∞) is a Banach space. □

Further important examples for norms on C(Ω) are the p-norms ‖·‖_p,
1 ≤ p < ∞, defined as

    ‖u‖_p := ( ∫_Ω |u(x)|^p dx )^{1/p}   for u ∈ C(Ω).

Example 1.3. For 1 ≤ p < ∞, (C(Ω), ‖·‖_p) is a normed space. ♦

We remark that the case p = 2 is of particular interest: In this case, the
2-norm ‖·‖_2 on C(Ω) is generated by the inner product (·, ·),

    (u, v) := ∫_Ω u(x) v(x) dx   for u, v ∈ C(Ω),

via ‖·‖_2 = (·, ·)^{1/2}, so that

    ‖u‖_2 = ( ∫_Ω |u(x)|² dx )^{1/2}   for u ∈ C(Ω).

More generally, a linear space F, equipped with an inner product
(·, ·) : F × F −→ R, is with ‖·‖ := (·, ·)^{1/2} a normed space, in which case
we say that F is a Euclidean space.

Example 1.4. The normed space (C(Ω), ‖·‖_2) is a Euclidean space. ♦
We analyze the approximation in Euclidean spaces in detail in Chapter 4.
As we will prove, the smoothness of a target function f to be approximated
influences the resulting approximation quality, where we quantify the
smoothness of f by its differentiation order k ∈ N0 . For this reason, the linear
subspaces

C k (Ω) = {u : Ω −→ R | u has k continuous derivatives on Ω} ⊂ C (Ω)

are of particular interest. The function spaces C k (Ω) form a nested sequence

C ∞ (Ω) ⊂ C k+1 (Ω) ⊂ C k (Ω) ⊂ C k−1 (Ω) ⊂ · · · ⊂ C 1 (Ω) ⊂ C 0 (Ω) = C (Ω)

of infinite-dimensional linear subsets of C (Ω), where



    C^∞(Ω) := ⋂_{k∈N0} C^k(Ω)

is the linear space of functions with arbitrary differentiation order on Ω.


For the construction of approximation methods, finite-dimensional linear
subspaces S ⊂ F are useful. To this end, let {s1 , . . . , sn } ⊂ S be a fixed
basis of S, where n = dim(S) ∈ N. In this case, any s ∈ S can uniquely be
represented by a linear combination

    s = Σ_{j=1}^{n} cj sj

by n parameters c1 , . . . , cn ∈ R. As we will see later, the assumption of


finite dimension for S will help simplify our computations, especially for the
coding and the evaluation of best approximations s∗ ∈ S to f . But the finite
dimension of S will also be useful in theory, in particular when it comes to
discussing the existence of best approximations.
For the special case of univariate functions, i.e., for Ω = [a, b] ⊂ R com-
pact, we consider the approximation of continuous functions f ∈ C [a, b] by
using algebraic polynomials. In this case, we choose S = Pn , for a fixed de-
gree n ∈ N0 , so that Pn is the linear space of all univariate polynomials of
degree at most n.

For the representation of algebraic polynomials from Pn the monomial


basis {1, x, x2 , . . . , xn } is particularly popular, where in this case any p ∈ Pn
is represented by a unique linear combination of the monomial form
p(x) = a0 + a1 x + a2 x2 + . . . + an xn for x ∈ R
with real coefficients a0 , . . . , an . Note that dim(Pn ) = n + 1.
Further relevant examples for infinite-dimensional linear spaces of uni-
variate functions F are the 2π-periodic continuous functions,
    C2π := {u ∈ C(R) | u(x) = u(x + 2π) for all x ∈ R} ⊂ C(R),

and their linear subspaces

    C2π^k := C^k(R) ∩ C2π   for k ∈ N0 ∪ {∞}.
Example 1.5. For k ∈ N0 ∪ {∞} and 1 ≤ p < ∞, the function space C2π^k,
equipped with the p-norm

    ‖u‖_p := ( ∫_0^{2π} |u(x)|^p dx )^{1/p}   for u ∈ C2π^k,

is a normed linear space. For p = 2 the function space C2π^k is Euclidean by
the inner product

    (u, v) := ∫_0^{2π} u(x) v(x) dx   for u, v ∈ C2π^k.

Finally, the function space C2π^k, equipped with the maximum norm

    ‖u‖_∞ := max_{x∈[0,2π]} |u(x)|   for u ∈ C2π^k,

is a Banach space. ♦

The approximation of functions from C2π plays an important role in math-


ematical signal processing, where trigonometric polynomials of the form
    T(x) = a0/2 + Σ_{j=1}^{n} [aj cos(jx) + bj sin(jx)]   for x ∈ R

with Fourier coefficients a0 , . . . , an , b1 , . . . , bn ∈ R are used. In this case, we


choose S = Tn , for n ∈ N0 , where
Tn = span {1, sin(x), cos(x), . . . , sin(nx), cos(nx)} ⊂ C2π
is the linear space of all real-valued trigonometric polynomials of degree at
most n ∈ N0 . Note that dim(Tn ) = 2n + 1.
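To make this representation concrete, the following minimal Python sketch (not part of the book; the function name and the coefficient layout are choices made here for illustration) evaluates a trigonometric polynomial T ∈ Tn from given coefficients a0, . . . , an and b1, . . . , bn.

import numpy as np

def trig_poly(x, a, b):
    # Evaluate T(x) = a[0]/2 + sum_{j=1}^{n} (a[j]*cos(j*x) + b[j-1]*sin(j*x)),
    # with coefficients a = (a_0, ..., a_n) and b = (b_1, ..., b_n).
    x = np.asarray(x, dtype=float)
    T = np.full_like(x, a[0] / 2.0)
    for j in range(1, len(a)):
        T += a[j] * np.cos(j * x) + b[j - 1] * np.sin(j * x)
    return T

# example member of T_2: T(x) = 1/2 + cos(x) + 2 sin(2x)
print(trig_poly(np.linspace(0.0, 2.0 * np.pi, 5), a=[1.0, 1.0, 0.0], b=[0.0, 2.0]))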

We will discuss further relevant examples for normed spaces (F,  · ) and
approximation spaces S ⊂ F later. In this short introduction, we will only
touch on a few more important aspects of approximation, as an outlook.

1.2 Basic Problems and Outlook


For the analysis of best approximations, the following questions are relevant.
• For given f ∈ F, does there exist a best approximation s∗ ∈ S to f ?
• Is a best approximation s∗ for f unique?
• Are there necessary/sufficient conditions for a best approximation s∗ ?
• How can we compute a best approximation s∗ analytically or numerically?
The answers to the above questions depend on the properties of the linear
space F, its norm  · , as well as on the chosen approximation space S ⊂ F.
We will give satisfying answers to the above questions. In Chapter 3 we first
provide general answers that do not depend on the particular choices of F,
 · , and S, but rather on their structural properties. Then we analyze the
special case of the Euclidean norm (in Chapter 4) and that of the maximum
norm  · ∞ , also referred to as Chebyshev norm (in Chapter 5).
Later in Chapter 6, we study the asymptotic behaviour of approximation
methods. In that discussion, we ask the question, how well we can approxi-
mate a target f ∈ F by certain sequences of approximations to f . To further
explain on this, suppose f ∈ F, the norm  · , and the approximation space
S are fixed. Then we can quantify the approximation quality by the minimal
distance
    η ≡ η(f, S) = inf_{s∈S} ‖s − f‖ = ‖s∗ − f‖

between f and S. In relevant application scenarios, we wish to approximate


f arbitrarily well. For fixed S, this will not work, however, since in that case
the minimal distance η(f, S) is already the best possible.
Therefore, we work with nested sequences of approximation spaces

S 0 ⊂ S1 ⊂ . . . ⊂ Sn ⊂ F for n ∈ N0

where we also regard the corresponding sequence of minimal distances

η(f, S0 ) ≥ η(f, S1 ) ≥ . . . ≥ η(f, Sn ) ≥ 0,

whose asymptotic behaviour we will analyze. Now if we wish to approximate


f ∈ F arbitrarily well, then the minimal distances must necessarily be a zero
sequence, i.e.,
η(f, Sn ) −→ 0 for n → ∞.
This leads us to the following fundamental question of approximation.

Question: Is there, for any f ∈ F and any ε > 0, an n ∈ N satisfying

    η(f, Sn) = ‖s∗n − f‖ < ε,

where s∗n ∈ Sn is a best approximation to f from Sn? ♦

If the answer to the above question is positive, then the union



    S = ⋃_{n≥0} Sn ⊂ F

is called dense in F with respect to the norm ‖·‖, or, a dense subset of F.


We are particularly interested in the approximation of continuous func-
tions, where we first study the approximation of univariate continuous func-
tions (in Chapters 2-6), before we later turn to multivariate approximation
(in Chapters 8-9).
For an outlook, we quote two classical results from univariate approxi-
mation, which are studied in Chapter 6. The following result of Weierstrass
(dating back to 1885) is often referred to as the ”birth of approximation”.

Theorem 1.6. (Weierstrass, 1885). For a compact interval [a, b] ⊂ R, the
set of algebraic polynomials P is dense in C[a, b] with respect to the maximum
norm ‖·‖_∞. In other words, for any f ∈ C[a, b] and ε > 0 there is an algebraic
polynomial p satisfying

    ‖p − f‖_{∞,[a,b]} = max_{x∈[a,b]} |p(x) − f(x)| < ε.

The above version of the Weierstrass theorem is an algebraic one. But


there is also a trigonometric version of the Weierstrass theorem, according
to which the set of trigonometric polynomials T is dense in C2π with respect
to the maximum norm ‖·‖_∞. We will prove both versions of the Weierstrass
theorem in Section 6.1. Moreover, in Sections 6.3 and 6.4 we will, for f ∈ C2π,
analyze decay rates for the minimal distances

    η(f, Tn) := inf_{T∈Tn} ‖T − f‖   and   η_∞(f, Tn) := inf_{T∈Tn} ‖T − f‖_∞

with respect to both the Euclidean norm ‖·‖ and the maximum norm ‖·‖_∞.
The latter will lead us to the Jackson theorems, one of which is as follows.

Theorem 1.7. (Jackson). For f ∈ C2π^k we have

    η_∞(f, Tn) ≤ ( π / (2(n + 1)) )^k · ‖f^(k)‖_∞ = O(n^{−k})   for n → ∞.


From this result, we see that the power of the approximation method does
not only depend on the approximation spaces Tn but also and essentially on
the smoothness of the target f . Indeed, the following principle holds:
The smoother the target function f ∈ C2π, the faster the convergence
of the minimal distances η(f, Tn), or η_∞(f, Tn), to zero.
We will prove this and other classical results concerning the asymptotic
behaviour of minimal distances in Chapter 6.

1.3 Approximation Methods for Data Analysis

Having studied classical topics of approximation (in Chapters 3-6) we will
address more recent developments and trends of approximation. To this end,
we develop and analyze specific approximation methods for data analysis,
where relevant applications in signal processing play an important role.
We first introduce (in Chapter 7) basic concepts of Fourier analysis. Fur-
ther, in Section 7.3 we prove the Shannon sampling theorem, Theorem 7.34,
which is a fundamental result of signal theory. According to the Shannon
sampling theorem, any band-limited signal f ∈ L2 (R) (i.e., f has limited fre-
quency density) can be recovered exactly from its values taken on an infinite
discrete sampling mesh of constant mesh width. Our proof of the Shannon
sampling theorem will demonstrate the relevance and the fundamental im-
portance of the introduced Fourier methods.
Further advanced topics of approximation comprise wavelets and
kernel-based methods for multivariate approximation. In this introductory
text, however, we can only cover a few selected theoretical and numerical
aspects of these multifaceted topics. Therefore, as regards wavelet methods
(in Section 7.5) we restrict ourselves to an introduction of the Haar wavelet.
Moreover, the subsequent discussion on basic concepts of kernel-based ap-
proximation (in Chapter 8) is based on positive definite kernels. Among our
addressed applications in multivariate data analysis are kernel-based methods
in machine learning (in Section 8.6). For further details on this subject, we
refer to our references in the following Section 1.4.
Another important application is the approximation of bivariate signals
in computer tomography, as addressed in Chapter 9, where we analyze theore-
tical aspects of this inverse problem rigorously from the viewpoint of approxi-
mation. This finally leads us to novel error estimates and convergence rates,
as developed in Section 9.4. The constructive account taken here provides a
new concept of evaluation methods for low-pass filters. We finally discuss the
implementation of the filtered back projection formula (in Section 9.5).

1.4 Hints on Classical and More Recent Literature


Approximation theory is a vivid research area of mathematics with a long
history [55]. More recent developments have provided powerful numerical ap-
proximation methods aiming to address challenging application problems in
the relevant areas of data and computer science, natural science and engi-
neering. This has led to a large variety of diverse contributions to the approx-
imation literature by research monographs and publications that can hardly
be overviewed. In fact, it is obviously impossible to cover all relevant aspects
of approximation in broad width and depth in one textbook. For this ele-
mentary introduction, we decided to first treat selected theoretical aspects of
classical approximation, before we turn to more recent concepts of numerical
approximation.
For further reading, we refer to a selection of classical and contemporary
sources on approximation theory and numerical methods. Likewise, the list
of references cannot be complete, and in fact we can only give a few hints,
although the selection of more recent texts on approximation is rather limited.
As regards classical texts on approximation (from the second half of the last
century) we refer to [11, 12, 19, 50, 56, 70]. Further material on more advanced
topics, including nonlinear approximation, can be found in [9, 21, 43].
For a more recent introduction to approximation theory with accentuated
links to relevant applications in geomathematics we refer to [51]. A modern
account of approximation with pronounced algorithmic and numerical ele-
ments is provided by the teaching concept of [68].
Literature on more specific topics of approximation deals
with spline approximation [20, 36, 64, 65], wavelets [14, 18, 49] and ra-
dial basis functions [10, 24, 25, 27, 38, 72]. Since spline approximation is a
well-established topic in standard courses on numerical mathematics [57], we
decided to omit a discussion on splines in this work.
2 Basic Methods and Numerical Algorithms

In this chapter, we discuss basic mathematical methods and numerical
algorithms for interpolation and approximation of functions in one variable.
The concepts and principles which we address here should already be known
from numerical mathematics. Nevertheless, the material of this chapter will
be necessary for our subsequent discussion. Therefore, a repetition of selected
elements from numerical mathematics should be most welcome.
For the sake of preparation, let us first fix a few notations. We denote
by f : [a, b] −→ R a continuous function on a compact interval [a, b] ⊂ R,
f ∈ C [a, b]. Moreover,
X = {x0 , x1 , . . . , xn } ⊂ [a, b] for n ∈ N0
is a set of |X| = n + 1 pairwise distinct interpolation knots. We collect the
function values fj = f (xj ) of f on X in one data vector,
fX = (f0 , f1 , . . . , fn )T ∈ Rn+1 .
For the approximation of f , we will later specify suitable linear subspaces of
continuous functions, S ⊂ C [a, b], of finite dimension dim(S) ≤ n + 1.
We first consider linear least squares approximation. In this problem, we
seek an approximation s∗ ∈ S to f which minimizes among all s ∈ S the sum
of pointwise square errors on X, so that
 
    Σ_{x∈X} |s∗(x) − f(x)|² ≤ Σ_{x∈X} |s(x) − f(x)|²   for all s ∈ S.   (2.1)

Moreover, we discuss numerical algorithms for interpolation, which could


be viewed as a special case of linear least squares approximation. To this end,
we first consider using algebraic polynomials, where S = Pn . To compute a
solution s ∈ Pn of the interpolation problem sX = fX , i.e.,
s(xj ) = f (xj ) for all 0 ≤ j ≤ n, (2.2)
we develop efficient and numerically stable algorithms. Finally, we address
interpolation of periodic functions by using trigonometric polynomials, where
S = Tn . This leads us directly to the discrete Fourier transform (DFT), which
will be of primary importance later in this book. We show how the DFT can
be computed efficiently by using the fast Fourier transform (FFT).


2.1 Linear Least Squares Approximation


The following discussion on linear least squares approximation leads us to a
first concrete example of an approximation problem. As a starting point for
our investigations, we regard the minimization problem (2.1), whose solution
we wish to construct. To this end, we first fix a set B = {s1 , . . . , sm } ⊂ C [a, b]
of m ≤ n + 1 linearly independent continuous functions. This immediately
leads us to the linear approximation space
    S = span{s1, . . . , sm} := { Σ_{j=1}^{m} cj sj | c1, . . . , cm ∈ R } ⊂ C[a, b]

with basis B and of dimension dim(S) = m. In typical applications of linear


least squares approximation the number n + 1 of given function values in fX
is assumed to be much larger than the dimension m of S. Indeed, we prefer
to work with a simple model for S which in particular is generated by only
a few basis functions B. We use the notation m  n + 1 to indicate that m
is assumed to be much smaller than n + 1.
But the following method for solving the minimization problem (2.1) can
be applied for all m ≤ n + 1. We formulate the linear least squares approxi-
mation problem (2.1) more precisely as follows.

Problem 2.1. Compute from a given set X = {x0 , . . . , xn } ⊂ [a, b] of n + 1


pairwise distinct points and a data vector fX = (f0 , . . . , fn )T ∈ Rn+1 a con-
tinuous function s∗ ∈ S = span{s1 , . . . , sm }, for m ≤ n + 1, which minimizes
among all s ∈ S the pointwise sum of square errors on X, so that

    ‖s∗X − fX‖_2² ≤ ‖sX − fX‖_2²   for all s ∈ S.   (2.3)

To solve the minimization in Problem 2.1 we represent s∗ ∈ S as a unique
linear combination

    s∗ = Σ_{j=1}^{m} c∗j sj   (2.4)

of the basis functions in B. Thereby, the linear least squares approximation


problem can be reformulated as an equivalent minimization problem of the
form
    ‖Bc − fX‖_2² −→ min_{c∈R^m} ! ,   (2.5)

where the design matrix

    B ≡ B_{B,X} := ( s1(x0) · · · sm(x0) )
                   (   ⋮            ⋮    ) ∈ R^{(n+1)×m}
                   ( s1(xn) · · · sm(xn) )

contains all evaluations of the basis functions from B at the points in X. To


solve the minimization problem (2.5), we regard for the multivariate function
F : Rm −→ [0, ∞), defined as

    F(c) = ‖Bc − fX‖_2² = (Bc − fX)^T (Bc − fX) = c^T B^T B c − 2 c^T B^T fX + fX^T fX,

its gradient

    ∇F(c) = 2 B^T B c − 2 B^T fX

and its (constant) Hessian¹ matrix

    ∇²F(c) = 2 B^T B.

Recall that any local minimum of F can be characterized via the solution
of the linear equation system

B T Bc = B T fX , (2.6)

referred to as Gaussian2 normal equation. If B has full rank, i.e., rank(B) =


m, then the symmetric matrix B T B is positive definite. In this case, the Gaus-
sian normal equation (2.6) has a unique solution c∗ = (c∗1 , . . . , c∗m )T ∈ Rm
satisfying
F (c∗ ) < F (c) for all c ∈ Rm \ {c∗ }.
The solution c∗ ∈ Rm yields the sought coefficient vector for s∗ in (2.4).
Hence, our first approximation problem, Problem 2.1, is solved.
However, our suggested solution via the Gaussian normal equation (2.6)
is problematic from a numerical viewpoint: If B has full rank, then the spec-
tral condition numbers of the matrices B T B and B are related by (cf. [57,
Section 3.1])
    κ2(B^T B) = (κ2(B))².
Therefore, the spectral condition number κ2 (B T B) of the matrix B T B grows
quadratically proportional to the reciprocal of the smallest singular value of
B. For matrices B arising in relevant applications of linear least squares
approximation, however, its smallest singular value is typically very small,
whereby the condition number κ2 (B T B) of B T B is even worse. In fact, the
condition number of linear least squares approximation problems is, especially
for very small residuals ‖Bc − fX‖_2, very critical, so that a solution via the
Gaussian normal equation (2.6) should be avoided for the sake of numerical
stability (see [28, Section 6.2]). A more comprehensive error analysis on linear
least squares approximation can be found in the textbook [7].
Instead of this, a numerically stable solution for the linear least squares
approximation problem works with a QR factorization
¹ Ludwig Otto Hesse (1811-1874), German mathematician
² Carl Friedrich Gauß (1777-1855), German mathematician and astronomer

B = QR (2.7)

of the design matrix B ∈ R(n+1)×m , where Q ∈ R(n+1)×(n+1) is an orthogonal


matrix and R is an upper triangular matrix of the form
    R = ( S )
        ( 0 ) ∈ R^{(n+1)×m},   with upper triangular block S = (s_{jk}) ∈ R^{m×m}.   (2.8)

Note that matrix B has full rank, rank(B) = m, if and only if no diagonal
entry skk , 1 ≤ k ≤ m, in the upper triangular matrix S ∈ Rm×m vanishes.
A numerically stable solution for the minimization problem (2.5) relies on
the alternative representation

    F(c) = ‖Bc − fX‖_2² = ‖QRc − fX‖_2² = ‖Rc − Q^T fX‖_2²,   (2.9)

where we use the isometry of the inverse Q^{−1} = Q^T with respect to the
Euclidean norm ‖·‖_2, i.e.,

    ‖Q^T y‖_2 = ‖y‖_2   for all y ∈ R^{n+1}.

Now the vector QT fX ∈ Rn+1 can be partitioned into two blocks g ∈ Rm


and h ∈ Rn+1−m , so that
 
    Q^T fX = ( g )
             ( h ) ∈ R^{n+1}.   (2.10)

Therefore, the representation for F (c) in (2.9) can be rewritten as a sum


of the form
    F(c) = ‖Sc − g‖_2² + ‖h‖_2²,   (2.11)
where we use the partitioning (2.8) for R and that in (2.10) for QT fX . The
minimum of F (c) in (2.11) can finally be computed via the solution of the
triangular linear system
Sc = g
by a backward substitution. The solution c∗ of this linear system is unique,
if and only if B has full rank.
In conclusion, the described procedure provides a numerically stable algo-
rithm to compute the solution c∗ of the minimization problem (2.5), and this
yields the coefficient vector c∗ of s∗ in (2.4). For the approximation error, we
obtain
    F(c∗) = ‖Bc∗ − fX‖_2² = ‖h‖_2².

This solves the linear least squares approximation problem, Problem 2.1.
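The following minimal Python sketch summarizes the QR-based procedure just described (the code and the function name are editorial illustrations, not the book's; NumPy's reduced QR factorization is used, so only the block g from (2.10) is formed explicitly).

import numpy as np

def lstsq_qr(B, fX):
    # Solve min_c ||B c - fX||_2 via a reduced QR factorization B = QR
    # (assumes rank(B) = m). R plays the role of the triangular block S,
    # and Q.T @ fX is the block g from (2.10).
    Q, R = np.linalg.qr(B, mode='reduced')
    c = np.linalg.solve(R, Q.T @ fX)          # backward substitution S c = g
    residual = np.linalg.norm(B @ c - fX)     # equals ||h||_2
    return c, residual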
For further illustration, we discuss one example of linear regression.
(a) noisy observations (X, f˜X)          (b) regression line s∗ ∈ P1

Fig. 2.1. (a) We take 26 noisy samples f˜X = fX + εX from f(x) = 1 + 3x. (b) We
compute the regression line s∗(x) = c∗0 + c∗1 x, with c∗0 ≈ 0.9379 and c∗1 ≈ 3.0617, by
using linear least squares approximation (cf. Example 2.2).

Example 2.2. We assume a linear model, i.e., we approximate f ∈ C [a, b]


by a linear function s(x) = c0 +c1 x, for c0 , c1 ∈ R. Moreover, we observe noisy
measurements f˜X , taken from f at n + 1 sample points X = {x0 , x1 , . . . , xn },
so that
f˜(xj ) = f (xj ) + εj for 0 ≤ j ≤ n,
where εj is the error for the j-th sample. We collect the error terms in one
vector εX = (ε0 , ε1 , . . . , εn )T ∈ Rn+1 . Figure 2.1 (a) shows an example for
noisy observations (X, f˜X ), where

f˜X = fX + εX .

We finally display the assumed linear relationship between the sample


points X (of the input data sites) and the observations f˜X (of the target
object). To this end, we fix the basis functions s1 ≡ 1 and s2 (x) = x, so that
span{s1 , s2 } = P1 . Now the solution of the resulting minimization problem
in (2.5) has the form
    ‖Bc − f˜X‖_2² −→ min_{c∈R²} !   (2.12)

with the design matrix B ∈ R^{(n+1)×2}, where

    B^T = ( 1   1  · · ·  1  )
          ( x0  x1 · · ·  xn ) ∈ R^{2×(n+1)}.

The minimization problem (2.12) can be solved via a QR decomposition of


B in (2.7) to obtain a numerically stable solution c∗ = (c∗0 , c∗1 )T ∈ R2 . The
resulting regression line is given by s∗ (x) = c∗0 + c∗1 x, see Figure 2.1 (b). ♦
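A short numerical illustration of this regression example in Python follows; the noise level, random seed, and sample values below are invented for the sketch, so the coefficients will be close to, but not identical with, those reported in Figure 2.1.

import numpy as np

rng = np.random.default_rng(7)                     # hypothetical noise model
x = np.linspace(0.0, 1.0, 26)
f_noisy = 1.0 + 3.0 * x + 0.25 * rng.standard_normal(x.size)

B = np.column_stack([np.ones_like(x), x])          # design matrix for span{1, x} = P_1
c, *_ = np.linalg.lstsq(B, f_noisy, rcond=None)    # c[0] ~ 1, c[1] ~ 3
print(c)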

2.2 Regularization Methods


Next, we develop a relevant extension of linear least squares approximation,
Problem 2.1. To this end, we augment the data error

    ηX(f, s) = ‖sX − fX‖_2²

of the cost function by a regularization term, given by a suitable functional

J : S −→ R,

where J(s) quantifies, for instance, the smoothness, the variation, the energy,
or the oscillation of s ∈ S. By the combination of the data error ηX and the
regularization functional J, being balanced by a fixed parameter α > 0, this
leads us to an extension of linear least squares approximation, Problem 2.1,
giving a regularization method that is described by the minimization problem

    ‖sX − fX‖_2² + α J(s) −→ min_{s∈S} !   (2.13)

By choice of the regularization parameter α, we aim to compromise be-


tween the approximation quality ηX (f, s∗α ) of a solution s∗α ∈ S for (2.13)
and its regularity, being measured by J(s∗α ). We can further explain this as
follows. On the one hand, for very small values α the error term ηX (f, s) will
be dominant over the regularization term J(s), which improves the approxi-
mation quality of a solution s∗α for (2.13). On the other hand, for very large
values α the regularization term J(s) will be dominant. In this case, however,
we wish to avoid overfitting of a solution s∗α . But the selection of J : S −→ R
requires particular care, where a problem-adapted choice of J relies on spe-
cific model assumptions from the application addressed. In applications of
information technology, for instance, regularization methods are applied for
smoothing, deblurring or denoising image and signal data (see [34]).
Now we discuss one relevant special case, where the functional J : S −→ R
is being defined by a symmetric positive definite matrix A ∈ Rm×m . To
further explain the definition of J, for a fixed basis B = {s1 , . . . , sm } of S,
each element
    s = Σ_{j=1}^{m} cj sj ∈ S

is mapped onto the A-norm (i.e., the norm being induced by A)

    ‖c‖_A² := c^T A c   (2.14)

of its coefficients c = (c1 , . . . , cm )T ∈ Rm .


Starting from our above discussion on linear least squares approximation,
this particular choice of J leads to a class of regularization methods termed
Tikhonov3 regularization. We formulate the problem of Tikhonov regu-
larization as follows.

Problem 2.3. Under the assumption α > 0 and A ∈ Rm×m symmetric


positive definite, compute on given function values fX ∈ Rn+1 and design
matrix B = (sj (xk ))0≤k≤n;1≤j≤m ∈ R(n+1)×m , m ≤ n + 1, a solution for the
minimization problem

    ‖Bc − fX‖_2² + α ‖c‖_A² −→ min_{c∈R^m} !   (2.15)

Note that Problem 2.3 coincides for α = 0 with the linear least squares
approximation problem. As we show in the following, the minimization prob-
lem (2.15) of Tikhonov regularization has for any α > 0 a unique solution,
in particular for the case, where the design matrix B has no full rank. We
further remark that the linear least squares approximation problem, Prob-
lem 2.1, has for rank(B) < m non-unique solutions. However, as we will show,
³ Andrey Nikolayevich Tikhonov (1906-1993), Russian mathematician

the solution s∗α ∈ S converges for α ↘ 0 to a norm-minimal solution s∗ of


linear least squares approximation.
Now we regard, for fixed α > 0, the cost function Fα : Rm −→ [0, ∞),

    Fα(c) = ‖Bc − fX‖_2² + α‖c‖_A² = c^T (B^T B + αA) c − 2 c^T B^T fX + fX^T fX,

its gradient

    ∇Fα(c) = 2 (B^T B + αA) c − 2 B^T fX

and the (constant) positive definite Hessian matrix

    ∇²Fα = 2 (B^T B + αA).   (2.16)

Note that the function Fα has one unique stationary point c∗α ∈ R^m satisfying
the necessary condition ∇Fα(c) = 0. Therefore, c∗α can be characterized as the
unique solution of the minimization problem (2.15) via the unique solution
of the linear system

    (B^T B + αA) c∗α = B^T fX,

i.e., c∗α = (B^T B + αA)^{−1} B^T fX. Due to the positive definiteness of the
Hessian ∇²Fα in (2.16), c∗α is a local minimum of Fα. Moreover, in this case
Fα is convex, and so c∗α is the unique minimum of Fα on R^m.
is convex, and so c∗α is the unique minimum of Fα on Rm .


Now we explain how c∗α can be computed by a stable numerical algorithm.
By the spectral theorem, there is an orthogonal matrix U ∈ Rm×m satisfying

A = U ΛU T ,

where Λ = diag(λ1 , . . . , λm ) ∈ Rm×m is a diagonal matrix containing the


positive eigenvalues λ1 , . . . , λm > 0 of A. This allows us to define the square
root of A by letting
    A^{1/2} := U Λ^{1/2} U^T,

where Λ^{1/2} = diag(√λ1, . . . , √λm) ∈ R^{m×m}. Note that the square root A^{1/2}
is, like A, also symmetric positive definite, and we have

    α‖c‖_A² = α c^T A c = (√α A^{1/2} c)^T (√α A^{1/2} c) = ‖√α A^{1/2} c‖_2².

This implies

    ‖Bc − fX‖_2² + α‖c‖_A² = ‖Bc − fX‖_2² + ‖√α A^{1/2} c‖_2²
                           = ‖ ( B ; √α A^{1/2} ) c − ( fX ; 0 ) ‖_2² ,

where ( · ; · ) denotes vertically stacked blocks.

Using the notation


   
    Bα = ( B ; √α A^{1/2} ) ∈ R^{((n+1)+m)×m}   and   gX = ( fX ; 0 ) ∈ R^{(n+1)+m},

we can reformulate (2.15) by the linear least squares approximation problem



    ‖Bα c − gX‖_2² −→ min_{c∈R^m} ! ,   (2.17)

whose solution c∗α can be computed by a stable algorithm via a QR factori-


zation of Bα , as already explained in the previous section for the numerical
solution of the linear least squares approximation problem, Problem 2.1.
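A minimal Python sketch of this stacked least squares formulation follows (an editorial illustration, not the book's algorithm; instead of the symmetric square root A^{1/2} used in the text, a Cholesky factor of A is taken, which yields the same minimizer since c^T A c = ‖L^T c‖_2² for A = L L^T).

import numpy as np

def tikhonov(B, fX, A, alpha):
    # Solve min_c ||B c - fX||_2^2 + alpha * c^T A c (Problem 2.3) via the
    # stacked least squares problem (2.17), using a Cholesky factor A = L L^T
    # in place of A^(1/2).
    L = np.linalg.cholesky(A)
    B_alpha = np.vstack([B, np.sqrt(alpha) * L.T])
    g_X = np.concatenate([fX, np.zeros(B.shape[1])])
    Q, R = np.linalg.qr(B_alpha, mode='reduced')    # stable solution via QR (Section 2.1)
    return np.linalg.solve(R, Q.T @ g_X)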
Next, we aim to characterize the asymptotic behaviour of s∗α for α ↘ 0
and for α → ∞. To this end, we first develop a suitable representation for
the solution c∗α to (2.15). Since A is symmetric positive definite, we have

    ‖Bc − fX‖_2² + α‖c‖_A² = ‖B A^{−1/2} A^{1/2} c − fX‖_2² + α‖A^{1/2} c‖_2².

By using the notation

    C = B A^{−1/2} ∈ R^{(n+1)×m}   and   b = A^{1/2} c ∈ R^m

we can rewrite (2.15) as

    ‖Cb − fX‖_2² + α‖b‖_2² −→ min_{b∈R^m} !   (2.18)

For the solution of (2.18), we employ the singular value decomposition of C,

C = V ΣW T ,

where V = (v1 , . . . , vn+1 ) ∈ R(n+1)×(n+1) and W = (w1 , . . . , wm ) ∈ Rm×m


are orthogonal, and where the matrix Σ has the form
    Σ = ( diag(σ1, . . . , σr)  0 )
        (          0           0 ) ∈ R^{(n+1)×m}

with singular values σ1 ≥ . . . ≥ σr > 0, where r = rank(C) ≤ m.

But this implies

    ‖Cb − fX‖_2² + α‖b‖_2² = ‖V Σ W^T b − fX‖_2² + α‖b‖_2²
                           = ‖Σ W^T b − V^T fX‖_2² + α‖W^T b‖_2²

and, moreover, we obtain for a = W^T b ∈ R^m the representation

    ‖Cb − fX‖_2² + α‖b‖_2² = ‖Σ a − V^T fX‖_2² + α‖a‖_2²
                           = Σ_{j=1}^{r} (σj aj − vj^T fX)² + Σ_{j=r+1}^{n+1} (vj^T fX)² + α Σ_{j=1}^{m} aj².

For the minimization of this expression, we first let



    aj := 0   for r + 1 ≤ j ≤ m,

so that it remains to solve the minimization problem

    Σ_{j=1}^{r} [ (σj aj − vj^T fX)² + α aj² ] −→ min_{a1,...,ar∈R} !   (2.19)

Since all terms of the cost function in (2.19) are non-negative, the minimiza-
tion problem (2.19) can be split into the r independent subproblems

    gj(aj) = (σj aj − vj^T fX)² + α aj² −→ min_{aj∈R} !   for 1 ≤ j ≤ r   (2.20)

with the scalar-valued cost functions gj : R −→ R, for 1 ≤ j ≤ r. Since

    gj'(aj) = 2((σj² + α) aj − σj vj^T fX)   and   gj''(aj) = 2(σj² + α) > 0,

the function gj is a convex parabola. The unique minimum a∗j of gj in (2.20)
is given by

    a∗j = σj / (σj² + α) · vj^T fX   for all 1 ≤ j ≤ r.

Therefore, for the unique solution b∗ of (2.18) we have

    b∗ = W a∗ = Σ_{j=1}^{r} σj / (σj² + α) · (vj^T fX) wj.

From this we obtain


    b∗ −→ 0   for α −→ ∞

and

    b∗ −→ b∗0 := Σ_{j=1}^{r} (1/σj) (vj^T fX) wj = C⁺ fX   for α ↘ 0

for the asymptotic behaviour of b∗, where C⁺ denotes the pseudoinverse of C.
Therefore, b∗0 is the unique norm-minimal solution of the linear least squares
approximation problem

    ‖Cb − fX‖_2² −→ min_{b∈R^m} !

Therefore, for the solution c∗ = A^{−1/2} b∗ to (2.15) we get

    c∗ −→ 0   for α −→ ∞

and

    c∗ −→ c∗0 = A^{−1/2} b∗0   for α ↘ 0,

where c∗0 ∈ R^m denotes that solution of the linear least squares problem

    ‖Bc − fX‖_2² −→ min_{c∈R^m} !

which minimizes the norm ‖·‖_A. For the solution s∗α ∈ S of (2.13), we obtain

    s∗α −→ 0   for α −→ ∞

and

    s∗α −→ s∗0   for α ↘ 0,

where s∗0 ∈ S is that solution for the linear least squares problem

    ‖sX − fX‖_2² −→ min_{s∈S} !

whose coefficients c ∈ R^m minimize the norm ‖·‖_A.
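The singular value representation just derived can be checked numerically. The following Python sketch (an editorial illustration with invented random data, not part of the book) computes b∗ from the filter factors σj/(σj² + α) and verifies that for α ↘ 0 it approaches the pseudoinverse solution C⁺ fX.

import numpy as np

def tikhonov_svd(C, fX, alpha):
    # b* of (2.18) via the SVD C = V Sigma W^T:
    # b* = sum_j sigma_j / (sigma_j^2 + alpha) * (v_j^T fX) * w_j
    V, sigma, Wt = np.linalg.svd(C, full_matrices=False)
    return Wt.T @ (sigma / (sigma**2 + alpha) * (V.T @ fX))

# for alpha -> 0 the solution tends to the pseudoinverse solution C^+ fX
rng = np.random.default_rng(0)
C, fX = rng.standard_normal((12, 4)), rng.standard_normal(12)
print(np.allclose(tikhonov_svd(C, fX, 1e-12), np.linalg.pinv(C) @ fX))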

2.3 Interpolation by Algebraic Polynomials


In this section, we work with algebraic polynomials for the interpolation of a
continuous function f ∈ C [a, b]. Algebraic polynomials p : R −→ R are often
represented as linear combinations

    p(x) = Σ_{k=0}^{n} ak x^k = a0 + a1 x + a2 x² + . . . + an x^n   (2.21)

of monomials 1, x, x2 , . . . , xn with coefficients a0 , . . . , an ∈ R. Recall that


n ∈ N0 denotes the degree of p, provided that the leading coefficient an ∈ R
does not vanish. We collect all algebraic polynomials of degree at most n in
the linear space
Pn := span{1, x, x2 , . . . , xn } for n ∈ N0 .
Now we consider the following interpolation problem.
Problem 2.4. Compute from a given set X = {x0 , . . . , xn } ⊂ R of n + 1
pairwise distinct points and a data vector fX = (f0 , . . . , fn )T ∈ Rn+1 an
algebraic polynomial p ∈ Pn satisfying pX = fX , i.e.,
p(xj ) = fj for all 0 ≤ j ≤ n. (2.22)

If we represent p ∈ Pn as a linear combination of monomials (2.21), then
the interpolation conditions (2.22) lead to a linear system of the form
    a0 + a1 x0 + a2 x0² + . . . + an x0^n = f0
    a0 + a1 x1 + a2 x1² + . . . + an x1^n = f1
        ⋮                                  ⋮
    a0 + a1 xn + a2 xn² + . . . + an xn^n = fn,

or, in the shorter matrix-vector notation,


    VX · a = fX   (2.23)

with coefficient vector a = (a0, . . . , an)^T ∈ R^{n+1} and the Vandermonde⁴ matrix

    VX = ( 1  x0  x0²  . . .  x0^n )
         ( 1  x1  x1²  . . .  x1^n )
         ( ⋮   ⋮    ⋮            ⋮ ) ∈ R^{(n+1)×(n+1)}.   (2.24)
         ( 1  xn  xn²  . . .  xn^n )
Now the interpolation problem pX = fX in (2.22) has a unique solution, if
and only if the linear equation system (2.23) has a unique solution. Therefore,
it remains to investigate the regularity of the Vandermonde matrix VX , where
we can rely on a classical result from linear algebra.
Theorem 2.5. For the determinant of the Vandermonde matrix VX , we have

    det(VX) = Π_{0≤j<k≤n} (xk − xj).

Proof. We prove the statement by induction on n ∈ N, where we only use ele-


mentary properties of the determinant, in particular its linearity with respect
to the matrix rows.
Initial step: For n = 1 we have det(VX ) = x1 − x0 for X = {x0 , x1 }.
Induction hypothesis: Assume the statement is true for n points {x1 , . . . , xn }.
Induction step: For X = {x0 , x1 , . . . , xn } we have
    det(VX) = det ( 1  x0  . . .  x0^{n−1}  x0^n )   = det ( 1     x0       . . .     x0^n     )
                  ( 1  x1  . . .  x1^{n−1}  x1^n )         ( 0  x1 − x0     . . .  x1^n − x0^n )
                  ( ⋮   ⋮           ⋮        ⋮   )         ( ⋮      ⋮                  ⋮      )
                  ( 1  xn  . . .  xn^{n−1}  xn^n )         ( 0  xn − x0     . . .  xn^n − x0^n )

            = det ( x1 − x0  . . .  x1^n − x0^n )
                  (    ⋮                ⋮      )
                  ( xn − x0  . . .  xn^n − x0^n )

            = det ( x1 − x0   x1² − x0 x1   . . .   x1^n − x0 x1^{n−1} )
                  (    ⋮           ⋮                       ⋮          )
                  ( xn − x0   xn² − x0 xn   . . .   xn^n − x0 xn^{n−1} )

            = (x1 − x0) · . . . · (xn − x0) · det ( 1  x1  . . .  x1^{n−1} )
                                                  ( ⋮   ⋮            ⋮    )
                                                  ( 1  xn  . . .  xn^{n−1} )

            = (x1 − x0) · . . . · (xn − x0) · det(V_{X\{x0}})

            = (x1 − x0) · . . . · (xn − x0) · Π_{1≤j<k≤n} (xk − xj)

            = Π_{0≤j<k≤n} (xk − xj),

which already completes our proof. □

⁴ Alexandre-Théophile Vandermonde (1735-1796), French mathematician


We can conclude that for any set X = {x0 , x1 , . . . , xn } of n + 1 pairwise
distinct interpolation points the Vandermonde matrix VX is regular. This
gives an answer to our initial question for the existence and uniqueness of a
solution to Problem 2.4. We summarize our discussion on this as follows.
Corollary 2.6. The interpolation problem (2.22), Problem 2.4, has a unique
solution p ≡ pf,X ∈ Pn whose coefficients with respect to its monomial repre-
sentation in (2.21) are given by the solution of the linear system (2.23). 
We remark that the computation of the coefficient vector a in (2.23) is
rather problematic. This is because heterogeneous distributions of interpo-
lation points X typically yield ill-conditioned Vandermonde matrices VX .
Moreover, the computation of the coefficients in a ∈ Rn+1 via (2.23) is too
costly. Therefore, we prefer to avoid the linear system (2.23) for the solution
of Problem 2.4, mainly for the sake of numerical stability and computational
efficiency (see [28, Section 4.2]).
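The ill-conditioning mentioned above is easy to observe numerically. The following small Python sketch (an editorial illustration; the choice of equidistant points in [0, 1] is merely one example) prints the spectral condition numbers of Vandermonde matrices of growing size.

import numpy as np

# condition numbers of Vandermonde matrices for equidistant points in [0, 1]
for n in (5, 10, 15, 20):
    x = np.linspace(0.0, 1.0, n + 1)
    V = np.vander(x, increasing=True)      # rows (1, x_j, x_j^2, ..., x_j^n)
    print(n, np.linalg.cond(V))            # grows rapidly with n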
In fact, the above method for the solution to Problem 2.4 via (2.23) is
rather naive. We remark that the solution to the interpolation problem (2.22)
does not require a linear system at all. By choosing a suitable polynomial basis
we can immediately give the solution to Problem 2.4.
To this end, we consider the Lagrange5 polynomials
    Lj(x) = [(x − x0) · . . . · (x − x_{j−1}) · (x − x_{j+1}) · . . . · (x − xn)]
            / [(xj − x0) · . . . · (xj − x_{j−1}) · (xj − x_{j+1}) · . . . · (xj − xn)]

          = Π_{k=0, k≠j}^{n} (x − xk)/(xj − xk)   for 0 ≤ j ≤ n.

Any Lj is a polynomial of degree n, Lj ∈ Pn , for 0 ≤ j ≤ n, and we have

    Lj(xk) = 1 for k = j   and   Lj(xk) = 0 for k ≠ j,   for all 0 ≤ j, k ≤ n.
Therefore, the Lagrange polynomials L0 , . . . , Ln are a basis of the polynomial
space Pn . Moreover, the solution p ∈ Pn to the interpolation problem (2.22)
is in its Lagrange representation given as
⁵ Joseph-Louis Lagrange (1736-1813), mathematician and astronomer


    p(x) = f0 L0(x) + . . . + fn Ln(x) = Σ_{j=0}^{n} fj Lj(x).   (2.25)
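A direct Python sketch of the Lagrange form (2.25) follows (an editorial illustration; the function name is not from the book). It uses the data of Example 2.8 below to check that the interpolant reproduces the prescribed values at the points of X.

import numpy as np

def lagrange_eval(x, X, fX):
    # Evaluate p(x) = sum_j fX[j] * L_j(x) in the Lagrange form (2.25),
    # with L_j(x) = prod_{k != j} (x - X[k]) / (X[j] - X[k]).
    p = 0.0
    for j in range(len(X)):
        L = 1.0
        for k in range(len(X)):
            if k != j:
                L *= (x - X[k]) / (X[j] - X[k])
        p += fX[j] * L
    return p

# data of Example 2.8 below: X = {0, pi, 3pi/2, 2pi}, fX = (1, -1, 0, 1)
X = [0.0, np.pi, 1.5 * np.pi, 2.0 * np.pi]
print([lagrange_eval(xj, X, [1.0, -1.0, 0.0, 1.0]) for xj in X])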

Now let us consider a first and very simple example.


Example 2.7. For two distinct interpolation points in X = {x0 , x1 } the two
corresponding Lagrange polynomials L0 , L1 ∈ P1 are
    L0(x) = (x1 − x)/(x1 − x0)   and   L1(x) = (x − x0)/(x1 − x0)   for x ∈ R.
Note that L0 + L1 ≡ 1. The Lagrange representation of the unique inter-
polation polynomial p ∈ P1 satisfying pX = fX , for given function values
fX = (f0 , f1 )T ∈ R2 , is given by the affine combination
    p(x) = (x1 − x)/(x1 − x0) · f0 + (x − x0)/(x1 − x0) · f1   for x ∈ R
of the function values f0 and f1 . ♦
Now let us turn to a concrete interpolation problem.
Example 2.8. In this example, we consider interpolating the trigonometric
function f (x) = cos(x) by a cubic polynomial for the set of interpolation
points X = {0, π, 3π/2, 2π}. This yields the data vector fX = (1, −1, 0, 1)T ,
see Figure 2.2 (a). The cubic Lagrange polynomials for the points in X are
 
    L0(x) = (x − π)(x − 3π/2)(x − 2π) / [(0 − π)(0 − 3π/2)(0 − 2π)]
          = − 1/(3π³) · (x − π)(x − 3π/2)(x − 2π)

    L1(x) = (x − 0)(x − 3π/2)(x − 2π) / [(π − 0)(π − 3π/2)(π − 2π)]
          = 2/π³ · x (x − 3π/2)(x − 2π)

    L2(x) = (x − 0)(x − π)(x − 2π) / [(3π/2 − 0)(3π/2 − π)(3π/2 − 2π)]
          = − 8/(3π³) · x (x − π)(x − 2π)

    L3(x) = (x − 0)(x − π)(x − 3π/2) / [(2π − 0)(2π − π)(2π − 3π/2)]
          = 1/π³ · x (x − π)(x − 3π/2).
The function graphs of L0 , L1 , L2 , L3 are shown in Figures 2.3 and 2.4.
The unique solution to the interpolation problem pX = fX is
    p(x) = L0(x) − L1(x) + L3(x)
         = − 1/(3π³) · (x − π)(x − 3π/2)(x − 2π)
           − 2/π³ · x (x − 3π/2)(x − 2π)
           + 1/π³ · x (x − π)(x − 3π/2).
The function graph of p is shown in Figure 2.2 (b). ♦

(a) f(x) = cos(x) and data X, fX          (b) Interpolant p ∈ P3 satisfying pX = fX

Fig. 2.2. For X = {0, π, 3π/2, 2π} and fX = (1, −1, 0, 1)^T the cubic polynomial
p = L0 − L1 + L3 solves the interpolation problem pX = fX from Example 2.8.

[Plots of L0(x) = −1/(3π³) · (x − π)(x − 3π/2)(x − 2π) (top) and
L1(x) = 2/π³ · x (x − 3π/2)(x − 2π) (bottom)]

Fig. 2.3. The Lagrange polynomials L0, L1 ∈ P3 for X = {0, π, 3π/2, 2π}.


Fig. 2.4. The Lagrange polynomials L2, L3 ∈ P3 for X = {0, π, 3π/2, 2π}, with L2(x) = −(8/(3π³)) · x(x − π)(x − 2π) and L3(x) = (1/π³) · x(x − π)(x − 3π/2).



Although the Lagrange representation in (2.25) leads us directly to the


solution of the interpolation problem (2.22), this Lagrangian interpolation
scheme is not our preferred solution in practice. In fact, from a numeri-
cal viewpoint, the evaluation and updating of the interpolation polynomial
in (2.25) is far too costly (see [28, Section 4.2] and [57, Section 8.2]).
An efficient and numerically stable method to evaluate interpolation poly-
nomials is based on a recursive representation, which we explain in the fol-
lowing discussion. To this end, pk,j ∈ Pj denotes, for 0 ≤ j ≤ k ≤ n, the
unique polynomial of degree at most j satisfying the interpolation conditions

pk,j(xℓ) = fℓ   for all k − j ≤ ℓ ≤ k. (2.26)

For fixed x ∈ R the values pk,j (x) can be computed recursively. This is done
by using the Aitken6 lemma.
Lemma 2.9. For the interpolation polynomials pk,j ∈ Pj satisfying (2.26)
we have the recursion

pk,0(x) = fk   for 0 ≤ k ≤ n,

pk,j(x) = pk,j−1(x) + ((x − xk)/(xk−j − xk)) · (pk−1,j−1(x) − pk,j−1(x))
        = ((xk−j − x)/(xk−j − xk)) · pk,j−1(x) + ((x − xk)/(xk−j − xk)) · pk−1,j−1(x)   for k ≥ j > 0.
Proof. We prove the statement by induction on j ≥ 0.
Initial step: For j = 0 we have pk,0 ≡ fk ∈ P0 , for 0 ≤ k ≤ n.
Induction hypothesis: Suppose pk,j−1 ∈ Pj−1 interpolates the data

(xk−j+1 , fk−j+1 ), . . . , (xk , fk )

and pk−1,j−1 ∈ Pj−1 is the interpolation polynomial for the data

(xk−j , fk−j ), . . . , (xk−1 , fk−1 ).

Induction step (j − 1 −→ j): Note that the right hand side of the recursion,
q(x) := pk,j−1(x) + ((x − xk)/(xk−j − xk)) · (pk−1,j−1(x) − pk,j−1(x)),
is a polynomial of degree at most j, i.e., q ∈ Pj . From the stated recursion
and by using the induction hypothesis we can conclude that q, as well as pk,j ,
interpolates the data
(xk−j , fk−j ), . . . , (xk , fk ).
Therefore, we have q ≡ pk,j by uniqueness of the interpolant pk,j . 
6
Alexander Craig Aitken (1895-1967), New Zealand mathematician

By the recursion of the Aitken lemma, Lemma 2.9, we can, on given in-
terpolation points X and function values fX , recursively evaluate the unique
interpolation polynomial p ≡ pn,n ∈ Pn at any point x ∈ R. To this end, we
organize the values pk,j ≡ pk,j (x), for 0 ≤ j ≤ k ≤ n, in a triangular scheme
as follows.
f0 = p0,0
f1 = p1,0 p1,1
f2 = p2,0 p2,1 p2,2
.. .. .. . .
. . . .
fn = pn,0 pn,1 pn,2 · · · pn,n
The values in the first column of the triangular scheme are the given function
values pk,0 = fk , for 0 ≤ k ≤ n. The values of the subsequent columns can be
computed, according to the recursion in the Aitken lemma, from two values in
the previous column. In this way, we can compute all entries of the triangular
scheme, column-wise from left to right, and so we obtain the sought function
value p(x) = pn,n .
To compute the entry pk,j we merely need (besides the interpolation
points xk−j and xk ) the two entries pk−1,j−1 and pk,j−1 from the previ-
ous column. If we compute the entries in each column from the bottom to
the top, then we can delete, in each step one entry, pk,j−1 , since pk,j−1 is no
longer needed in the subsequent computations.
This leads us to the Neville7 -Aitken algorithm, Algorithm 1, giving a
memory-efficient variant of the Aitken recursion in Lemma 2.9. The Neville-
Aitken algorithm operates on the input data vector fX = (f0 , . . . , fn )T re-
cursively as shown in Algorithm 1.

Algorithm 1 Neville-Aitken algorithm


1: function Neville-Aitken(X,fX , x)
2: input: Interpolation points X = {x0 , . . . , xn };
3: Function values fX = (f0 , . . . , fn )T ∈ Rn+1 ;
4: Evaluation point x ∈ R for p ∈ Pn ;
5: for j = 1, . . . , n do
6: for k = n, . . . , j do
7: let fk := fk + ((x − xk)/(xk−j − xk)) · (fk−1 − fk);
8: end for
9: end for
10: output: p(x) = fn .
11: end function
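For illustration, here is a short Python sketch of Algorithm 1 (not part of the original text; the function name is our own choice), which operates in place on a copy of the data vector fX, exactly as in the pseudocode above.

import numpy as np

def neville_aitken(X, fX, x):
    """Evaluate the interpolation polynomial p in Pn at x (Algorithm 1)."""
    X = np.asarray(X, dtype=float)
    f = np.array(fX, dtype=float)          # work on a copy of fX
    n = len(X) - 1
    for j in range(1, n + 1):
        for k in range(n, j - 1, -1):      # k = n, ..., j
            f[k] = f[k] + (x - X[k]) / (X[k - j] - X[k]) * (f[k - 1] - f[k])
    return f[n]                            # p(x)

# evaluation of the cubic interpolant from Example 2.8 at x = 1.0
print(neville_aitken([0, np.pi, 1.5 * np.pi, 2 * np.pi], [1, -1, 0, 1], 1.0))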

7
Eric Harold Neville (1889-1961), English mathematician

2.4 Divided Differences and the Newton Representation


Now we use the Aitken recursion in Lemma 2.9 to elaborate a suitable repre-
sentation for interpolation polynomials. To this end, we consider for a fixed
set X = {x0 , . . . , xn } of interpolation points the Newton8 polynomials


k−1
ωk (x) = (x − xj ) ∈ Pk for 0 ≤ k ≤ n. (2.27)
j=0

For the Newton polynomials we have





ωk(xℓ) = 0   for ℓ < k,
ωk(xℓ) = ∏_{j=0}^{k−1} (xℓ − xj) ≠ 0   for ℓ ≥ k.

The Newton polynomials are obviously linearly independent, so that they


are a basis for the polynomial space Pn . Therefore, for any vector of function
values fX = (f0 , . . . , fn )T ∈ Rn+1 , the interpolation polynomial pn ∈ Pn to
fX has unique Newton coefficients b0 , b1 , . . . , bn ∈ R, so that

pn(x) = ∑_{k=0}^{n} bk ωk(x) = b0 + b1(x − x0) + . . . + bn(x − x0) · . . . · (x − xn−1). (2.28)

The form of the polynomial pn in (2.28) is called Newton representation.


Next, we turn to the computation of the Newton coefficients in (2.28).
We start with the following scheme involving the function values of pn on X.

f0 = pn (x0 ) = b0
f1 = pn (x1 ) = b0 + b1 (x1 − x0 )
f2 = pn (x2 ) = b0 + b1 (x2 − x0 ) + b2 (x2 − x0 )(x2 − x1 )
.. ..
. .
fn = pn (xn ) = b0 + . . . + bn (xn − x0 ) · . . . · (xn − xn−1 ).

Note that the Newton coefficients bk of pn can be determined recursively by


bk = (1/ωk(xk)) · ( fk − ∑_{j=0}^{k−1} bj ωj(xk) )   for k = 0, . . . , n. (2.29)

Further note that for the computation of bk we only need the first k + 1 data
8
Sir Isaac Newton (1643-1727), English philosopher and scientist

(x0 , f0 ), (x1 , f1 ), . . . , (xk , fk ).

This gives the Newton representation an important advantage concer-


ning updating: If we add one datum (xn+1 , fn+1 ) to Xn = {x0 , . . . , xn } and
fXn , then it will be rather simple to update the interpolation polynomial pn
in (2.28). In fact, for the interpolation polynomial pn+1 ∈ Pn+1 from data
Xn+1 = {x0 , . . . , xn , xn+1 } and fXn+1 we have the representation


pn+1(x) = pn(x) + bn+1 ∏_{k=0}^{n} (x − xk) = pn(x) + bn+1 ωn+1(x),

where under the additional interpolation condition pn+1(xn+1) = fn+1 we immediately get

bn+1 = (fn+1 − pn(xn+1)) / ωn+1(xn+1).
Now in order to develop a systematic scheme for computing the Newton
coefficients of interpolation polynomials we introduce divided differences.

Definition 2.10. On given data X = {x0 , . . . , xn } ⊂ R and fX ∈ Rn+1 let



p(x) = ∑_{k=0}^{n} ak x^k ∈ Pn (2.30)

be the unique interpolation polynomial satisfying pX = fX . Then, the leading


coefficient an ∈ R of p in (2.30) is called the n-th divided difference of
f with respect to X, where we use the notation

an = [x0 , . . . , xn ](f ). (2.31)

The linear mapping [x0 , . . . , xn ] : C (R) −→ R is referred to as the divided


difference operator, or the difference operator, with respect to X.

Before we discuss relevant properties of divided differences, we first make


a remark for further clarification.

Remark 2.11. Note that the n-th divided difference [x0 , . . . , xn ](f ) in Defi-
nition 2.10 is the leading coefficient of the interpolation polynomial p for f
on X with respect to its monomial representation (2.30). We remark that
the leading coefficient of p with respect to its monomial representation (2.30)
coincides with the leading coefficient of p with respect to its Newton repre-
sentation in (2.28) so that we have

p(x) = [x0, . . . , xn](f) x^n + a_{n−1} x^{n−1} + . . . + a1 x + a0 (2.32)

with the coefficients a0 , . . . , an−1 of the monomial representation of p and

p(x) = [x0 , . . . , xn ](f )ωn (x) + bn−1 ωn−1 (x) + . . . + b1 ω1 (x) + b0 (2.33)

with the coefficients b0 , . . . , bn−1 of the Newton representation. This proper-


ty follows directly from the structure of the Newton polynomials ωk ∈ Pk
in (2.27), with the leading Newton polynomial


ωn(x) = ∏_{j=0}^{n−1} (x − xj) ∈ Pn

as a product of the n linear factors x − xj , for 0 ≤ j ≤ n − 1. Indeed, note


that the leading coefficient of ωn (with respect to the monomial basis) is one.
Therefore, the leading coefficients of p, as in the Newton representation (2.33)
and in the monomial representation (2.32), must be equal.
In hindsight, we could as well have introduced the n-th divided difference
[x0 , . . . , xn ](f ) in Definition 2.10 as the leading coefficient bn of the interpo-
lation polynomial p in its Newton representation in (2.28). Nevertheless, we
have decided to follow the common standard from the literature. We finally
remark that in the following recursive evaluation of [x0 , . . . , xn ](f ) by divided
differences (of a smaller order than n), the monomial representation (2.32)
of the interpolation polynomial p will be quite useful. 
In our discussion so far, we have not made any assumptions on the or-
dering of the interpolation points in X. Since the interpolation polynomial
p is, on given data fX , always unique, we can conclude that the leading co-
efficient an of p in its monomial representation (2.30) is independent of the
interpolation points’ order in X. We formulate this observation as follows.
Proposition 2.12. For X = {x0 , . . . , xn } and fX ∈ Rn+1 the divided
difference [x0 , . . . , xn ](f ) is independent of the order of interpolation points
x0 , . . . , xn in X, i.e., for any permutation σ of the indices {0, . . . , n} we have

[x0 , . . . , xn ](f ) = [xσ(0) , . . . , xσ(n) ](f ).


As we show now, all coefficients in the Newton representation (2.28) of
the interpolation polynomial p are divided differences.
Theorem 2.13. For X = {x0 , . . . , xn } and fX ∈ Rn+1 ,

p(x) = ∑_{k=0}^{n} [x0, . . . , xk](f) · ωk(x) ∈ Pn (2.34)

is the unique interpolation polynomial satisfying pX = fX .


Proof. We prove the statement by induction on n.
Initial step: For n = 0 we have p ≡ f0 = [x0 ](f ) for X = {x0 } and f0 ∈ R.
Induction hypothesis: Assume that, on given data X = {x0 , . . . , xn−1 } and
fX ∈ Rn , n ≥ 1,


p = ∑_{k=0}^{n−1} [x0, . . . , xk](f) · ωk ∈ Pn−1

is the unique interpolation polynomial in Pn−1 satisfying pX = fX .


Induction step (n−1 −→ n): On given data X = {x0 , . . . , xn } and fX ∈ Rn+1
the unique interpolation polynomial p ∈ Pn has, according to Remark 2.11,
the representations (2.32) and

p(x) = [x0 , . . . , xn ](f ) · ωn (x) + qn−1 (x) (2.35)

with qn−1 ∈ Pn−1 , where the latter follows directly from (2.33). Since

qn−1(xk) = p(xk) − [x0, . . . , xn](f) · ωn(xk) = p(xk)   (since ωn(xk) = 0)

for all 0 ≤ k ≤ n − 1, we see that the polynomial qn−1 ∈ Pn−1 is the


unique interpolation polynomial to f from Pn−1 on the interpolation points
x0 , . . . , xn−1 . By the induction hypothesis, qn−1 has the representation


qn−1 = ∑_{k=0}^{n−1} [x0, . . . , xk](f) · ωk.

This in combination with (2.35) completes our proof already, since


p = [x0, . . . , xn](f) · ωn + ∑_{k=0}^{n−1} [x0, . . . , xk](f) · ωk = ∑_{k=0}^{n} [x0, . . . , xk](f) · ωk.

We finally turn to the computation of the divided differences. To this end,


we rely on the Aitken recursion in Lemma 2.9.

Theorem 2.14. For X = {x0 , . . . , xn } and fX ∈ Rn+1 the recursion

[xj, . . . , xk](f) = ([xj+1, . . . , xk](f) − [xj, . . . , xk−1](f)) / (xk − xj)   for 0 ≤ j < k ≤ n,
[xj](f) = f(xj)   for 0 ≤ j ≤ n

holds.

Proof. For n ≥ k ≥ j ≥ 0, let pj,k ∈ Pk−j be the unique interpolation poly-


nomial to f from Pk−j on the interpolation points {xj , . . . , xk }. Moreover,
for k > j, let pj+1,k ∈ Pk−j−1 be the unique interpolation polynomial to f on
{xj+1 , . . . , xk } and pj,k−1 ∈ Pk−j−1 be the unique interpolation polynomial
to f on {xj , . . . , xk−1 }.
Then, by the Aitken recursion in Lemma 2.9, we have the representation

pj,k(x) = ((xj − x) pj+1,k(x) − (xk − x) pj,k−1(x)) / (xj − xk). (2.36)

If we compare the leading coefficients in (2.36), then we get

[xj, . . . , xk](f) = (−[xj+1, . . . , xk](f) + [xj, . . . , xk−1](f)) / (xj − xk)
                   = ([xj+1, . . . , xk](f) − [xj, . . . , xk−1](f)) / (xk − xj)
for n ≥ k > j ≥ 0. For j = k we get [xj ](f ) = f (xj ), for 0 ≤ j ≤ n. 
Example 2.15. For X = {x0, x1} ⊂ R and fX = (f0, f1)^T ∈ R², the first order divided difference yields the difference quotient

[x0, x1](f) = ([x1](f) − [x0](f)) / (x1 − x0) = (f1 − f0) / (x1 − x0).

If f is differentiable at x0, i.e., f ∈ C¹(x0 − ε, x0 + ε), for ε > 0, then we have

lim_{x1 → x0} [x0, x1](f) = f′(x0).

Therefore, we allow coinciding interpolation points for f ∈ C¹, where we let

[x, x](f) := f′(x).


By the recursion in Theorem 2.14 we can view the n-th divided difference
[x0 , . . . , xn ](f ) as a discretization of the n-th derivative of f ∈ C n . We will
be more precise on this observation later in this section.

Table 2.1. Organization of divided differences, on input data X = {x0 , . . . , xn }


and fX = (f0 , . . . , fn )T ∈ Rn+1 , in a triangular scheme.

X fX
x0 f0
x1 f1 [x0 , x1 ](f )
x2 f2 [x1 , x2 ](f ) [x0 , x1 , x2 ](f )
.. .. .. .. ..
. . . . .
xn fn [xn−1 , xn ](f ) [xn−2 , xn−1 , xn ](f ) ··· [x0 , . . . , xn ](f )

On given points X = {x0 , . . . , xn } and values fX = (f0 , . . . , fn )T ∈ Rn+1


we can evaluate all divided differences [xj , . . . , xk ](f ), for 0 ≤ j ≤ k ≤ n,

by using the efficient and stable recursion of Theorem 2.14. To this end, we
organize the divided differences in a triangular scheme, as shown in Table 2.1.
The organization of the data in Table 2.1 reminds us of the triangular
scheme of the Neville-Aitken algorithm, Algorithm 1. In fact, to compute
the Newton coefficients [x0 , . . . , xk ](f ) in (2.34), we can (similarly as in Al-
gorithm 1) process the data in Table 2.1 by a memory-efficient algorithm
operating only on the data vector fX = (f0 , . . . , fn )T , see Algorithm 2.

Algorithm 2 Computation of Newton coefficients [x0 , . . . , xk ](f )


1: function Divided Differences(X,fX )
2: input: interpolation points X = {x0 , x1 , . . . , xn };
3: function values fX = (f0 , f1 , . . . , fn )T ∈ Rn+1 ;
4: for j = 1, . . . , n do
5: for k = n, . . . , j do
6: let fk := (fk − fk−1) / (xk − xk−j);
7: end for
8: end for
9: output: (f0 , . . . , fn ) = ([x0 ](f ), [x0 , x1 ](f ), . . . , [x0 , . . . , xn ](f ))T ∈ Rn+1 .
10: end function
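As an illustration, the following Python sketch (not part of the original text; function names are our own choices) computes the Newton coefficients with Algorithm 2 and evaluates the Newton form (2.28) by nested multiplication.

import numpy as np

def divided_differences(X, fX):
    """Newton coefficients [x0](f), [x0,x1](f), ..., [x0,...,xn](f) (Algorithm 2)."""
    X = np.asarray(X, dtype=float)
    b = np.array(fX, dtype=float)              # work on a copy of fX
    n = len(X) - 1
    for j in range(1, n + 1):
        for k in range(n, j - 1, -1):          # k = n, ..., j
            b[k] = (b[k] - b[k - 1]) / (X[k] - X[k - j])
    return b

def newton_eval(X, b, x):
    """Evaluate p(x) = sum_k b_k omega_k(x) by a Horner-like scheme."""
    p = b[-1]
    for k in range(len(b) - 2, -1, -1):
        p = p * (x - X[k]) + b[k]
    return p

X = [0, np.pi, 1.5 * np.pi, 2 * np.pi]
b = divided_differences(X, [1, -1, 0, 1])      # Example 2.16: b = (1, -2/pi, 8/(3 pi^2), -4/(3 pi^3))
print(newton_eval(X, b, 1.0))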

For further illustration, we make an example that is linked to Example 2.8.

Example 2.16. We consider interpolating the function f (x) = cos(x) on


interpolation points X3 = {0, π, 3π/2, 2π}. By fX3 = (1, −1, 0, 1) we get the
following divided differences in the triangular scheme of Table 2.1, for n = 3.

X3      fX3
0       1
π       −1      −2/π
3π/2    0       2/π      8/(3π²)
2π      1       2/π      0         −4/(3π³)

The Newton polynomials ω0 , . . . , ω3 for the point set X3 are given by


 
ω0 ≡ 1,   ω1(x) = x,   ω2(x) = x(x − π),   ω3(x) = x(x − π)(x − 3π/2).

Therefore, the cubic polynomial

p3(x) = 1 − (2/π) x + (8/(3π²)) x(x − π) − (4/(3π³)) x(x − π)(x − 3π/2) (2.37)

is the unique interpolation polynomial in P3 satisfying pX3 = fX3 .


The leading coefficient of the interpolation polynomial p3 in its Newton
representation (2.37) coincides with that of its monomial representation (see
Remark 2.11). The leading coefficient of p3 with respect to its monomial
representation can also be obtained by the sum of the coefficients of its La-
grange representation (see Example 2.8), i.e.,
−1/(3π³) − 2/π³ + 1/π³ = −4/(3π³).
On the downside, the approximation quality of the cubic interpolation
polynomial p3 for f on X3 is rather bad, see Figure 2.5 (a), where we find
‖p3 − f‖∞,[0,2π] ≈ 1.1104 for the approximation error. To improve on the approximation quality we add one interpolation point x4 = π/4 and so we obtain X4 = {0, π, 3π/2, 2π, π/4} for the updated set of interpolation points and fX4 = (1, −1, 0, 1, 1/√2) for the updated data vector of function values.
To compute the interpolation polynomial p4 ∈ P4 we update the triangular
scheme (see Table 2.1, for n = 4) as follows.

X4      fX4
0       1
π       −1      −2/π
3π/2    0       2/π      8/(3π²)
2π      1       2/π      0         −4/(3π³)
π/4     1/√2    −4(1 − √2)/(7√2 π)    8(2 + 5√2)/(35√2 π²)    −32(2 + 5√2)/(105√2 π³)    −16(16 + 5√2)/(105√2 π⁴)

Therefore, the quartic (i.e., degree four) polynomial

p4(x) = p3(x) − (16(16 + 5√2)/(105√2 π⁴)) · x(x − π)(x − 3π/2)(x − 2π)

is the unique interpolation polynomial in P4 satisfying pX4 = fX4 , where the approximation error ‖p4 − f‖∞,[0,2π] ≈ 0.0736 of p4 ∈ P4 is much smaller than that of p3 ∈ P3 , see Figure 2.5 (b). ♦
Next, we develop a very useful representation for divided differences,
termed the Hermite9 -Genocchi10 formula, whereby divided differences can
be viewed as mean values of derivatives of f over a simplex spanned by the
interpolation points. In the following formulation for the Hermite-Genocchi
formula, we regard the n-dimensional standard simplex
Δn = { λ = (λ1, . . . , λn)^T ∈ R^n | λk ≥ 0 for 1 ≤ k ≤ n and ∑_{k=1}^{n} λk ≤ 1 }. (2.38)
9
Charles Hermite (1822-1901), French mathematician
10
Angelo Genocchi (1817-1889), Italian mathematician
Fig. 2.5. (a) The cubic polynomial p3 ∈ P3 interpolates the trigonometric function f(x) = cos(x) on X3 = {0, π, 3π/2, 2π}, with approximation error ‖p3 − f‖∞,[0,2π] ≈ 1.1104. (b) The quartic polynomial p4 ∈ P4 interpolates f(x) = cos(x) on X4 = {0, π, 3π/2, 2π, π/4}, with approximation error ‖p4 − f‖∞,[0,2π] ≈ 0.0736 (see Example 2.16).

Theorem 2.17. For f ∈ C n , n ≥ 1, the Hermite-Genocchi formula


[x0, . . . , xn](f) = ∫_{Δn} f^{(n)}( x0 + ∑_{k=1}^{n} λk (xk − x0) ) dλ

holds, where Δn is the n-dimensional standard simplex (2.38) in Rn .

Proof. We prove the Hermite-Genocchi formula by induction on n.


Initial step: For n = 1, we have Δ1 = [0, 1] and so

1
f  (x0 + λ1 (x1 − x0 )) dλ1 = (f (x1 ) − f (x0 )) = [x0 , x1 ](f ).
Δ1 x1 − x0

Induction hypothesis: Suppose the Hermite-Genocchi formula holds for n ≥ 1.


Induction step (n − 1 −→ n): For dλ = dλ1 · · · dλn−1 we have
∫_{Δn} f^{(n)}( x0 + ∑_{k=1}^{n} λk (xk − x0) ) dλ dλn
  = ∫_{Δn−1} ∫_0^{1 − ∑_{k=1}^{n−1} λk} f^{(n)}( x0 + ∑_{k=1}^{n−1} λk (xk − x0) + λn (xn − x0) ) dλn dλ
  = (1/(xn − x0)) ∫_{Δn−1} [ ∫_{x0 + ∑_{k=1}^{n−1} λk (xk − x0)}^{xn + ∑_{k=1}^{n−1} λk (xk − xn)} f^{(n)}(z) dz ] dλ
  = (1/(xn − x0)) [ ∫_{Δn−1} f^{(n−1)}( xn + ∑_{k=1}^{n−1} λk (xk − xn) ) dλ − ∫_{Δn−1} f^{(n−1)}( x0 + ∑_{k=1}^{n−1} λk (xk − x0) ) dλ ]
  = (1/(xn − x0)) ( [xn, x1, . . . , xn−1](f) − [x0, . . . , xn−1](f) )
  = (1/(xn − x0)) ( [x1, . . . , xn](f) − [x0, . . . , xn−1](f) )
  = [x0, . . . , xn](f).

Now we can state further properties of divided differences. The following


results are direct consequences from the Hermite-Genocchi formula, Theo-
rem 2.17, and the standard mean value theorem of integration.

Corollary 2.18. The divided differences satisfy the following properties.


(a) For f ∈ C n , n ≥ 0, we have

[x0, . . . , xn](f) = f^{(n)}(τ)/n!   for some τ ∈ [xmin, xmax],

where xmin = min_{0≤k≤n} xk and xmax = max_{0≤k≤n} xk.
For x0 = . . . = xn, we have

[x0, . . . , xn](f) = f^{(n)}(x0)/n!.
(b) For p ∈ Pn−1 , we have [x0 , . . . , xn ](p) = 0 for n ≥ 1.


The discretization of higher order derivatives by divided differences is


consistent with the standard product rule of differentiation. We show this by
proving the Leibniz11 rule.

Corollary 2.19. For arbitrary points x0 , . . . , xn and f, g ∈ C n , n ∈ N0 , the


Leibniz formula

[x0, . . . , xn](f · g) = ∑_{j=0}^{n} [x0, . . . , xj](f) · [xj, . . . , xn](g) (2.39)

holds.

Proof. Suppose that X = {x0 , . . . , xn } is a set of pairwise distinct points.


Moreover, let pf ∈ Pn be the unique interpolation polynomial for f on X
and pg ∈ Pn be the unique interpolation polynomial for g on X. Then, pf
and pg have the representations


pf = ∑_{k=0}^{n} [x0, . . . , xk](f) ωk   and   pg = ∑_{j=0}^{n} [xj, . . . , xn](g) ω̃j

with the Newton polynomials

ωk(x) = ∏_{ℓ=0}^{k−1} (x − xℓ) ∈ Pk   and   ω̃j(x) = ∏_{m=j+1}^{n} (x − xm) ∈ Pn−j

for 0 ≤ k, j ≤ n, where we have used the independence of the divided diffe-


rences on the order of the interpolation points in X (cf. Proposition 2.12).
Now the product
11
Gottfried Wilhelm Leibniz (1646-1716), German philosopher and scientist


p := pf · pg = ∑_{k,j=0}^{n} [x0, . . . , xk](f) ωk · [xj, . . . , xn](g) ω̃j (2.40)

interpolates the function f · g on X. For the Newton polynomials ωk and ω̃j we have

ωk(xi) · ω̃j(xi) = 0   for all 0 ≤ i ≤ n,

for k > j, so that the polynomial p in (2.40) has the representation

p = ∑_{k,j=0; k≤j}^{n} [x0, . . . , xk](f) · [xj, . . . , xn](g) ωk · ω̃j.

Since ωk · ω̃j ∈ P_{n+k−j}, for all 0 ≤ k, j ≤ n, we have p ∈ Pn. Therefore, p is the unique interpolation polynomial in Pn for f · g on X, and so we obtain the stated representation

[x0, . . . , xn](f · g) = ∑_{j=0}^{n} [x0, . . . , xj](f) · [xj, . . . , xn](g) (2.41)

for the case of pairwise distinct points x0 , . . . , xn .


By the Hermite-Genocchi formula, Theorem 2.17, the representation
[x0, . . . , xm](h) = ∫_{Δm} h^{(m)}( x0 + ∑_{k=1}^{m} λk (xk − x0) ) dλ (2.42)

holds for h ∈ C m . Therefore, the divided differences [x0 , . . . , xm ](h) are for
h ∈ C m continuous in X, since the integrand h(m) in (2.42) is continuous in
X. Since f · g ∈ C n , we can conclude that the representation (2.41) holds for
arbitrary point sets X = {x0 , . . . , xn }. 

Remark 2.20. For coincident points x0 = . . . = xn , the Leibniz formula


(2.39), in combination with Corollary 2.18 (a), yields the identity

(f · g)^{(n)}(x0) / n! = ∑_{j=0}^{n} (f^{(j)}(x0) / j!) · (g^{(n−j)}(x0) / (n − j)!)

and so

(f · g)^{(n)}(x0) = ∑_{j=0}^{n} (n! / (j!(n − j)!)) f^{(j)}(x0) g^{(n−j)}(x0) = ∑_{j=0}^{n} \binom{n}{j} f^{(j)}(x0) g^{(n−j)}(x0),

which is the standard product rule for higher derivatives. 



From Corollary 2.18 (a) we see that divided differences are also well-
defined for coincident interpolation points, provided that f has sufficiently
many derivatives. In particular, for the case of coincident interpolation points,
all coefficients in the Newton representation (2.34) are in this case well-defined
(cf. Example 2.15). Now we extend the problem of Lagrange interpolation,
Problem 2.4, to the problem of Hermite interpolation. In the case of Hermite
interpolation, the interpolation conditions contain not only point evaluations
of f , but also derivative values of f . In this case, we require coincident inter-
polation points. To be more precise, we formulate the Hermite interpolation
problem as follows.

Problem 2.21. Let X = {x0 , . . . , xn } be a set of n + 1 pairwise distinct


interpolation points. Moreover, suppose we are given N = μ0 + μ1 + . . . + μn
Hermite data

f^{(ℓ)}(xk)   for 0 ≤ ℓ < μk and 0 ≤ k ≤ n (2.43)

for f ∈ C m−1 , where m = maxk μk and μk ∈ N for 0 ≤ k ≤ n.


Then, the Hermite interpolation problem for (2.43) requires determining
a polynomial p ∈ PN −1 satisfying the Hermite interpolation conditions

p^{(ℓ)}(xk) = f^{(ℓ)}(xk)   for 0 ≤ ℓ < μk and 0 ≤ k ≤ n. (2.44)

Note that Lagrange interpolation, Problem 2.4, is by μk = 1, 0 ≤ k ≤ n,


and N = n + 1 a special case of Hermite interpolation. Further note that the
Hermite data in (2.43) necessarily need to contain, for every interpolation
point xk ∈ X, all derivatives

f(xk), f′(xk), . . . , f^{(μk−1)}(xk)   for k = 0, . . . , n.

In the following solution to Hermite interpolation, Problem 2.21, we first


add interpolation points to X, such that the resulting point set Y contains the
interpolation points xk multiple times, namely according to its multiplicity
μk of the Hermite data in (2.44). This leads us to the extended point set
Y = { x0, . . . , x0 (μ0-fold), x1, . . . , x1 (μ1-fold), . . . , xn, . . . , xn (μn-fold) } = {y0, . . . , yN−1} (2.45)

containing N = μ0 + μ1 + . . . + μn interpolation points (including their


multiplicities), where x0 = y0 = . . . = yμ0 −1 and

xk = yμ0 +...+μk−1 = . . . = yμ0 +...+μk −1 for 1 ≤ k ≤ n.

We can solve the Hermite interpolation problem, Problem 2.21, as follows.



Theorem 2.22. The problem of Hermite interpolation, Problem 2.21, has


a unique solution p ∈ PN −1 . For the extended set of interpolation points
Y = {y0 , . . . , yN −1 } in (2.45) and the divided differences

[y0 , . . . , yk ](f ) for 0 ≤ k < N

the interpolation polynomial p has the Newton representation


p(x) = ∑_{k=0}^{N−1} [y0, . . . , yk](f) ωk(x). (2.46)

Proof. The linear mapping L : PN −1 −→ RN , defined as

p −→ L(p) = (p(x0), . . . , p^{(μ0−1)}(x0), . . . , p(xn), . . . , p^{(μn−1)}(xn))^T ∈ R^N,

is injective, due to the fundamental theorem of algebra. Further, due to the


dimension formula of linear algebra, L is also surjective, and so L is bijective.
The Newton representation (2.46) for p follows directly from our solu-
tion (2.34) to Lagrange interpolation, Problem 2.4, which holds in particular
for the case of coincident interpolation points (with using our results on di-
vided differences for coincident interpolation points). 

Again, we can organize the divided differences of the Newton represen-


tation (2.46) in a triangular scheme (as in Table 2.1). Moreover, we can use
the recursion of Algorithm 2 to compute the scheme’s entries, where for the
case of coincident interpolation points yk = yk−j (see line 6 in Algorithm 2)
we insert the corresponding derivative value f (j) (yk )/j!.
For further illustration, we finally discuss the following example.

Example 2.23. We consider interpolating the sinc function f (x) = sin(x)/x.


We have
x cos(x) − sin(x) 2 sin(x) − 2x cos(x) − x2 sin(x)
f  (x) = and f  (x) = .
x2 x3
For the interpolation of f , we work with the Hermite data
1 2
f (0) = 1, f  (0) = 0, f (π) = 0, f  (π) = − , f  (π) = 2 , f (2π) = 0.
π π
This gives the extended set of interpolation points Y = {0, 0, π, π, π, 2π}.
We display the divided differences of the Newton representation (2.46) in a triangular scheme (as in Table 2.1 for n = 5) as follows, where the inserted derivative values f^{(j)}(yk)/j! are the entries [0, 0](f) = f′(0) = 0, [π, π](f) = f′(π) = −1/π, and [π, π, π](f) = f″(π)/2 = 1/π².

Y     fY
0     1
0     1      0
π     0      −1/π     −1/π²
π     0      −1/π     0        1/π³
π     0      −1/π     1/π²     1/π³     0
2π    0      0        1/π²     0        −1/(2π⁴)    −1/(4π⁵)

Given the above divided differences, we see that the polynomial


p5(x) = 1 − (1/π²) x² + (1/π³) x²(x − π) − (1/(4π⁵)) x²(x − π)³ ∈ P5
is the unique solution to the posed Hermite interpolation problem. ♦
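The recursion of Algorithm 2 extends to repeated knots if, whenever yk = yk−j, the quotient is replaced by the derivative value f^{(j)}(yk)/j!. The following Python sketch (not part of the original text; the function name and the derivs data structure are our own choices) reproduces the Newton coefficients of Example 2.23.

import numpy as np
from math import factorial

def hermite_divided_differences(Y, derivs):
    """Divided differences for the extended points Y (repeated knots allowed).
    derivs[x] = [f(x), f'(x), f''(x), ...] holds the given derivative values at x."""
    N = len(Y)
    table = [[None] * N for _ in range(N)]       # table[k][j] = [y_{k-j}, ..., y_k](f)
    for k in range(N):
        table[k][0] = derivs[Y[k]][0]
    for j in range(1, N):
        for k in range(j, N):
            if Y[k] == Y[k - j]:
                table[k][j] = derivs[Y[k]][j] / factorial(j)   # coincident knots
            else:
                table[k][j] = (table[k][j - 1] - table[k - 1][j - 1]) / (Y[k] - Y[k - j])
    return [table[k][k] for k in range(N)]        # Newton coefficients [y0,...,yk](f)

# Example 2.23: f(x) = sin(x)/x with Y = {0, 0, pi, pi, pi, 2pi}
Y = [0.0, 0.0, np.pi, np.pi, np.pi, 2 * np.pi]
derivs = {0.0: [1.0, 0.0], np.pi: [0.0, -1 / np.pi, 2 / np.pi**2], 2 * np.pi: [0.0]}
print(hermite_divided_differences(Y, derivs))     # ~ [1, 0, -1/pi^2, 1/pi^3, 0, -1/(4 pi^5)]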

2.5 Error Estimates and Optimal Interpolation Points


In this section, we develop error estimates, i.e., upper bounds on the difference

f (x) − p(x) for x ∈ [a, b] (2.47)

between f and the interpolation polynomial p. In the following discussion, we


regard the problem of Lagrange interpolation, Problem 2.4, as a special case of
Hermite interpolation, Problem 2.21. To unify the notations of Problems 2.4
and 2.21 we denote by Y = {y0 , . . . , yN −1 } ⊂ [a, b] the extended set of
interpolation points, where for the case of Hermite interpolation we allow
coincident interpolation points, according to Problem 2.21 and as in (2.45).
We denote the unique solution to the Hermite interpolation problem by pN −1 .
In particular, the Newton representation (2.46) holds for pN −1 ∈ PN −1 .
We can represent the error in (2.47) as follows.
Theorem 2.24. Let pN −1 ∈ PN −1 be the solution to the Hermite interpola-
tion problem, Problem 2.21. Then we have the pointwise error representation


f(x) − pN−1(x) = [y0, . . . , yN−1, x](f) · ∏_{k=0}^{N−1} (x − yk)   for x ∈ R. (2.48)

Proof. The error representation in (2.48) is obviously fulfilled for any x ∈ Y .


Indeed, in this case, we have f (x) = pN −1 (x), and, moreover, the Newtonian
knot polynomial
ωY(x) := ∏_{k=0}^{N−1} (x − yk) (2.49)

vanishes at every interpolation point from Y .


Now for x ∈ R\Y , we extend Y by the interpolation point x. Moreover, we
let pN ∈ PN denote the unique polynomial in PN which satisfies the Hermite
conditions (2.44) and the additional interpolation condition pN (x) = f (x).
In this case, we have the representation


N −1
pN (x) = pN −1 (x) + [y0 , . . . , yN −1 , x](f ) (x − yk )
k=0

and so
, −1
-

N
f (x) − pN −1 (x) = f (x) − pN (x) − [y0 , . . . , yN −1 , x](f ) (x − yk )
k=0

N −1
= [y0 , . . . , yN −1 , x](f ) (x − yk ).
k=0

Theorem 2.24 immediately yields the following upper bound for the in-
terpolation error f − p in (2.47) on the interval [a, b], where we combine the
representation in (2.48) with the result of Corollary 2.18 (a).

Corollary 2.25. Let p ∈ PN −1 denote the unique solution to the Hermite


interpolation problem, Problem 2.21. Then we have for f ∈ C N the pointwise
error estimate
 
|f(x) − p(x)| ≤ (1/N!) · max_{ξ∈[a,b]} |f^{(N)}(ξ)| · | ∏_{k=0}^{N−1} (x − yk) | (2.50)

in x ∈ [a, b]. 

As a direct consequence of Corollary 2.25, the uniform error estimate

‖f − p‖∞ ≤ (‖f^{(N)}‖∞ / N!) · ‖ωY‖∞   for f ∈ C^N[a, b] (2.51)
follows from the pointwise error estimate in (2.50) for any compact interval
[a, b] ⊂ R containing the set of interpolation points Y , i.e., Y ⊂ [a, b].
To reduce the interpolation error in (2.51), we wish to minimize the maximum norm ‖ωY‖∞ of the knot polynomial ωY under variation of the interpolation points in Y ⊂ [a, b]. Without loss of generality, we restrict ourselves to the interval [a, b] = [−1, 1]. This immediately leads us to the nonlinear optimization problem

‖ωX‖∞,[−1,1] −→ min!   over all X ⊂ [−1, 1] with |X| = n + 1. (2.52)

As we show in this section, the minimization problem in (2.52) has a


unique solution X ∗ ⊂ [−1, 1] consisting of n+1 pairwise distinct interpolation
points. This explains our chosen notation X = Y and n = N − 1, which is
in accordance with the Lagrange interpolation problem, Problem 2.4. We
formulate the minimization problem in (2.52) more precisely as follows.
Problem 2.26. Determine for n ∈ N0 a set X ∗ = {x∗0 , . . . , x∗n } ⊂ [−1, 1]
of n + 1 interpolation points, which minimizes the maximum norm of the
corresponding knot polynomial ωX ∗ on [−1, 1], so that the upper bound
‖ωX∗‖∞,[−1,1] ≤ ‖ωX‖∞,[−1,1] (2.53)
holds for all point sets X = {x0 , . . . , xn } ⊂ [−1, 1] of size |X| = n + 1. 
For the solution of the minimization problem, Problem 2.26, we work with
the Chebyshev polynomials
Tn (x) = cos(n arccos(x)) for n ∈ N0 , (2.54)
where in the subsequent discussion we rely on their following properties.
Theorem 2.27. The Chebyshev polynomials are generated by the recursion
Tn+1 (x) = 2xTn (x) − Tn−1 (x) for n ∈ N (2.55)
with the initial values T0 ≡ 1 and T1 (x) = x.
Proof. The initial values T0 ≡ 1 and T1 (x) = x are obviously consistent with
Definition (2.54). For φ = arccos(x), we find the representation
cos((n + 1)φ) = 2 cos(φ) cos(nφ) − cos((n − 1)φ)
from standard trigonometric identities, which implies the recursion (2.55). 
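The three-term recursion (2.55) can be used directly to generate the monomial coefficients of Tn; the following Python sketch (not part of the original text) reproduces the entries of Table 2.2 below.

import numpy as np

def chebyshev_coefficients(n):
    """Monomial coefficients of T_0, ..., T_n via the recursion (2.55).
    Each entry is an array of coefficients, lowest degree first."""
    T = [np.array([1.0]), np.array([0.0, 1.0])]      # T_0 = 1, T_1(x) = x
    for k in range(1, n):
        # T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
        two_x_Tk = np.concatenate(([0.0], 2.0 * T[k]))
        Tk_minus1 = np.concatenate((T[k - 1], np.zeros(len(two_x_Tk) - len(T[k - 1]))))
        T.append(two_x_Tk - Tk_minus1)
    return T[: n + 1]

for k, c in enumerate(chebyshev_coefficients(5)):
    print(f"T_{k}: {c}")   # e.g. T_4: [ 1.  0. -8.  0.  8.] means 8x^4 - 8x^2 + 1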
Corollary 2.28. For n ∈ N0 , the Chebyshev polynomial Tn+1 is an algebraic
polynomial of degree n + 1 with leading coefficient 2n , so that an identity of
the form
Tn+1 (x) = 2n xn+1 + qn (x) (2.56)
holds for some qn ∈ Pn .
Proof. We prove the identity (2.56) by induction on n ∈ N0 . For n = 0, the
statement is trivial. Under the induction hypothesis for n ∈ N0 , the statement
in (2.56) follows, for n + 1, directly from the recursion in (2.55). 
Corollary 2.29. The n + 1 zeros of the Chebyshev polynomial Tn+1 ∈ Pn+1
are, for n ∈ N0 , given by the Chebyshev knots
 
x*k = cos( ((2k + 1)/(2n + 2)) π ) ∈ [−1, 1]   for 0 ≤ k ≤ n. (2.57)

Moreover, all extrema of Tn+1 on [−1, 1] are attained at the n + 2 points

yk = cos( (k/(n + 1)) π ) ∈ [−1, 1]   for 0 ≤ k ≤ n + 1. (2.58)

Proof. For 0 ≤ k ≤ n, we have

Tn+1(x*k) = cos((n + 1) arccos(x*k)) = cos( (n + 1) · ((2k + 1)/(2(n + 1))) π ) = cos( (2k + 1) π/2 ) = 0.
The n + 1 Chebyshev knots x∗k in (2.57) are obviously pairwise distinct.
Therefore, the Chebyshev polynomial Tn+1 ∈ Pn+1 \{0} has no further zeros.
As regards the extrema of Tn+1, we have ‖Tn+1‖∞,[−1,1] ≤ 1 and, moreover,

Tn+1(yk) = cos( (n + 1) arccos( cos( (k/(n + 1)) π ) ) ) = cos(kπ) = (−1)^k,
so that all n + 2 points in Y = {y0 , . . . , yn+1 } ⊂ [−1, 1] are extremal points
for Tn+1 on [−1, 1]. Since Tn+1 is a polynomial of degree n + 1, its derivative T′n+1 has at most n zeros. Therefore, Tn+1 has at most n extrema in the open interval (−1, 1) and at most n + 2 extrema in the closed interval [−1, 1]. But this implies that Y already contains all extrema of Tn+1 on [−1, 1].

Table 2.2. Monomial form of the Chebyshev polynomials Tn ∈ Pn , n = 1, . . . , 12.

T1(x) = x
T2(x) = 2x^2 − 1
T3(x) = 4x^3 − 3x
T4(x) = 8x^4 − 8x^2 + 1
T5(x) = 16x^5 − 20x^3 + 5x
T6(x) = 32x^6 − 48x^4 + 18x^2 − 1
T7(x) = 64x^7 − 112x^5 + 56x^3 − 7x
T8(x) = 128x^8 − 256x^6 + 160x^4 − 32x^2 + 1
T9(x) = 256x^9 − 576x^7 + 432x^5 − 120x^3 + 9x
T10(x) = 512x^10 − 1280x^8 + 1120x^6 − 400x^4 + 50x^2 − 1
T11(x) = 1024x^11 − 2816x^9 + 2816x^7 − 1232x^5 + 220x^3 − 11x
T12(x) = 2048x^12 − 6144x^10 + 6912x^8 − 3584x^6 + 840x^4 − 72x^2 + 1


Fig. 2.6. Chebyshev polynomials Tn ∈ Pn and their n knots, for n = 1, . . . , 12.



Figure 2.6 shows the graphs of the Chebyshev polynomials Tn ∈ Pn ,


for n = 1, . . . , 12, along with their Chebyshev knots (2.57). Moreover, the
monomial representations of Tn , for n = 1, . . . , 12, are in Table 2.2.

Corollary 2.30. For n ∈ N0 , let X ∗ = {x∗0 , . . . , x∗n } ⊂ [−1, 1] denote the set
of Chebyshev knots in (2.57). Then the corresponding knot polynomial ωX ∗
has the representation
ωX∗ = 2^{−n} Tn+1. (2.59)

Proof. The knot polynomial ωX in (2.49) has for any point set X leading
coefficient one, in particular for the set X ∗ of Chebyshev knots. By the
representation in (2.56), the polynomial 2^{−n} Tn+1 ∈ Pn+1 has also leading coefficient one. Therefore, the difference

qn = ωX∗ − 2^{−n} Tn+1

is an algebraic polynomial of degree at most n, i.e., qn ∈ Pn . Since qn (x∗ ) = 0,


for x∗ ∈ X ∗ , the polynomial qn has at least n + 1 zeros, and so qn ≡ 0. 

Now we can solve the minimization problem in (2.52), Problem 2.26.

Theorem 2.31. For n ∈ N0 the Chebyshev knots X ∗ = {x∗0 , . . . , x∗n },


 
x*k = cos( ((2k + 1)/(2n + 2)) π ) ∈ [−1, 1]   for 0 ≤ k ≤ n,

are the unique solution of the minimization problem (2.52).

Proof. By Corollary 2.30 the knot polynomial ωX∗ = 2^{−n} Tn+1 ∈ Pn+1 is


a multiple of Tn+1 . Moreover, due to Corollary 2.29 all extrema of ωX ∗ on
[−1, 1] are attained at the n + 2 points Y = {y0 , . . . , yn+1 } in (2.58), where
we have ‖ωX∗‖∞,[−1,1] = 2^{−n} and

ωX∗(yk) = 2^{−n} (−1)^k   for 0 ≤ k ≤ n + 1.

Now assume that for a point set X = {x0 , . . . , xn } ⊂ [−1, 1] its knot
polynomial ωX ∈ Pn+1 satisfies

‖ωX‖∞,[−1,1] < ‖ωX∗‖∞,[−1,1] = 2^{−n}. (2.60)

Then we have ωX (yk ) < ωX ∗ (yk ), for all even indices k ∈ {0, . . . , n} and
ωX (yk ) > ωX ∗ (yk ), for all odd indices k ∈ {1, . . . , n}. Therefore, the difference

ω = ωX ∗ − ωX

has in any of the n + 1 intervals (y1 , y0 ), (y2 , y1 ), . . . , (yn+1 , yn ) at least one


sign change, i.e., ω has at least n + 1 zeros. Since the knot polynomials
ωX , ωX ∗ ∈ Pn+1 have leading coefficient one, respectively, we see that their

difference ω = ωX ∗ − ωX ∈ Pn is a polynomial of degree at most n. But this


implies ω ≡ 0, i.e.,
ωX ≡ ωX ∗ ∈ Pn+1 ,
in contradiction to (2.60). We can finally conclude that the Chebyshev knots
X ∗ = {x∗0 , . . . , x∗n } ⊂ [−1, 1] are the unique solution to the minimization
problem in (2.52), Problem 2.26. 
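To see the effect numerically, the following Python sketch (not part of the original text) compares ‖ωX‖∞,[−1,1] for equidistant points with the optimal value 2^{−n} attained by the Chebyshev knots.

import numpy as np

def knot_poly_sup_norm(X, grid=np.linspace(-1.0, 1.0, 100001)):
    """Approximate sup-norm of omega_X(x) = prod_k (x - x_k) on [-1, 1]."""
    omega = np.ones_like(grid)
    for xk in X:
        omega *= grid - xk
    return np.max(np.abs(omega))

n = 10
equidistant = np.linspace(-1.0, 1.0, n + 1)
chebyshev = np.cos((2 * np.arange(n + 1) + 1) / (2 * n + 2) * np.pi)   # knots (2.57)

print(knot_poly_sup_norm(equidistant))   # noticeably larger than 2^(-n)
print(knot_poly_sup_norm(chebyshev))     # close to the optimal value below
print(2.0 ** (-n))                       # 0.0009765625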

2.6 Interpolation by Trigonometric Polynomials


In this section, we consider the interpolation of periodic functions.

Definition 2.32. A function f : R −→ R is said to be periodic, if

f (x) = f (x + T ) for all x ∈ R (2.61)

for some T > 0. In this case, f is called a T -periodic function. A minimal


T > 0 satisfying (2.61) is called the period of f .

By (2.61) any T -periodic function f is uniquely determined by its func-


tion values on the interval [0, T ). In the following discussion, we restrict our-
selves to 2π-periodic functions. This is without loss of generality since any
T -periodic function f can be transformed into a 2π-periodic function by sca-
ling its argument with a scaling factor T /(2π), i.e., the function g : R −→ R,
given as

g(x) = f( (T/(2π)) · x )   for all x ∈ R,
is 2π-periodic, if and only if f has period T . We collect all continuous and
2π-periodic functions in the linear space

C2π = {f ∈ C (R) | f (x) = f (x + 2π) for all x ∈ R} .

Now let us turn to the interpolation of periodic functions from C2π . To


this end, we first fix a linear space of interpolants, where it makes sense to
choose a finite-dimensional subspace of C2π . Obvious examples for functions
from C2π are the trigonometric polynomials cos(jx), for j ∈ N0 , and sin(jx),
for j ∈ N. In fact, trigonometric polynomials are suitable choices for the
construction of interpolation spaces contained in C2π . To be more precise on
this, we give the following definition.

Definition 2.33. For n ∈ N0 , we denote by

TnR = spanR {1, cos(j ·), sin(j·) | 1 ≤ j ≤ n} ⊂ C2π (2.62)

the linear space of real trigonometric polynomials of degree at most n.



Clearly, TnR is a finite-dimensional linear space, and, moreover, any real-


valued trigonometric polynomial T ∈ TnR can be represented as a linear com-
bination
T(x) = a0/2 + ∑_{k=1}^{n} [ak cos(kx) + bk sin(kx)] (2.63)

with coefficients a0 , . . . , an , b1 , . . . , bn ∈ R, the Fourier12 coefficients of T . We


will provide supporting arguments in favour of the chosen form in (2.63) for
the interpolating functions, i.e., as a linear combination of the 2n + 1 (basis)
functions in (2.63).
In the following formulation of the interpolation problem we can, due to
the 2π-periodicity of the target function f ∈ C2π , restrict ourselves without
loss of generality to interpolation points from the interval [0, 2π).

Problem 2.34. Compute from a given set X = {x0 , x1 , . . . , x2n } ⊂ [0, 2π) of
2n+1 pairwise distinct interpolation points and corresponding function values
fX = (f0 , f1 , . . . , f2n )T ∈ R2n+1 a real trigonometric polynomial T ∈ TnR
satisfying TX = fX , i.e.,

T (xj ) = fj for all 0 ≤ j ≤ 2n. (2.64)

For the solution to Problem 2.34, our following investigations concerning


interpolation of complex-valued functions will be very useful. To this end, we
distinguish between real (i.e., real-valued) trigonometric polynomials, as for
TnR in (2.62), and complex (i.e., complex-valued) trigonometric polynomials.
In the following, the symbol i denotes as usual the imaginary unit.

Definition 2.35. For N ∈ N0 , the linear space of all complex trigonometric


polynomials of degree at most N is given as

TNC = spanC {exp(ij·) | 0 ≤ j ≤ N }. (2.65)

Theorem 2.36. For N ∈ N0 , the linear space TNC has dimension N + 1.

Proof. A complex-valued trigonometric polynomial p ∈ TNC can be written as


a linear combination of the form

p(x) = ∑_{k=0}^{N} ck e^{ikx} (2.66)

with complex coefficients c0 , . . . , cN ∈ C.


12
Jean Baptiste Joseph Fourier (1768-1830), French mathematician, physicist

We can show the linear independence of the generating function system


{e^{ik·} | 0 ≤ k ≤ N} by a simple argument: For p ≡ 0 we have

0 = ∫_0^{2π} e^{−imx} ∑_{k=0}^{N} ck e^{ikx} dx = ∑_{k=0}^{N} ck ∫_0^{2π} e^{i(k−m)x} dx = 2π cm

for m = 0, . . . , N, whereby c0 = . . . = cN = 0.
By the Euler13 formula

e^{ix} = cos(x) + i sin(x) (2.67)

we can represent any real trigonometric polynomial T ∈ TnR in (2.63) as a complex trigonometric polynomial p ∈ TNC of the form (2.66). Indeed, by using the Euler formula (2.67) we find the standard trigonometric identities

cos(x) = (1/2)(e^{ix} + e^{−ix})   and   sin(x) = (1/(2i))(e^{ix} − e^{−ix}) (2.68)
and so we obtain for any T ∈ TnR the representation
T(x) = a0/2 + ∑_{k=1}^{n} [ak cos(kx) + bk sin(kx)]
     = a0/2 + ∑_{k=1}^{n} [ (ak/2)(e^{ikx} + e^{−ikx}) + (bk/(2i))(e^{ikx} − e^{−ikx}) ]
     = a0/2 + ∑_{k=1}^{n} [ ((ak − ibk)/2) e^{ikx} + ((ak + ibk)/2) e^{−ikx} ]
     = ∑_{k=−n}^{n} ck e^{ikx} = e^{−inx} ∑_{k=0}^{2n} c_{k−n} e^{ikx}

with the complex Fourier coefficients

c0 = a0/2,   ck = (ak − ibk)/2,   c_{−k} = (ak + ibk)/2   for k = 1, . . . , n. (2.69)
Let us now draw an intermediate conclusion.
Proposition 2.37. Any real trigonometric polynomial T ∈ TnR in (2.63) can
be represented as a product
T (x) = e−inx p(x),
where p ∈ TNC is a complex trigonometric polynomial of the form (2.66), for
N = 2n. Moreover, the Fourier coefficients of p are uniquely determined by
c_{n−k} = (ak + ibk)/2,   cn = a0/2,   c_{n+k} = (ak − ibk)/2   for k = 1, . . . , n, (2.70)
where we have applied a periodification to the coefficients ck in (2.69). 
13
Leonhard Euler (1707-1783), Swiss mathematician, physicist, and astronomer

Note that the mapping (2.70) between the real Fourier coefficients ak , bk
of T and the complex Fourier coefficients ck of p is linear,

(a0 , . . . , an , b1 , . . . , bn )T ∈ C2n+1 −→ (c0 , . . . , c2n ) ∈ C2n+1 .

Moreover, this linear mapping is bijective, where its inverse is described as

a0 = 2c0 , ak = cn+k + cn−k , bk = i(cn+k − cn−k ) for k = 1, . . . , n. (2.71)

The Fourier coefficients a0 , . . . , an , b1 , . . . , bn are real, if and only if

c_{n+k} = \overline{c_{n−k}}   for all k = 0, . . . , n.

By the bijectivity of the linear mappings in (2.70) and (2.71) between the
complex and the real Fourier coefficients, we can determine the dimension of
TnR . The following result is a direct consequence of Theorem 2.36.

Corollary 2.38. For n ∈ N0 , the linear space TnR has dimension 2n + 1. 

Now let us return to the interpolation problem, Problem 2.34. For the case
of complex trigonometric polynomials, we can solve Problem 2.34 as follows.

Theorem 2.39. Let X = {x0 , . . . , xN } ⊂ [0, 2π) be a set of N + 1 pairwise


distinct interpolation points and fX = (f0 , . . . , fN )T ∈ CN +1 be a data vector
of complex function values, for N ∈ N0 . Then there is a unique complex
trigonometric polynomial p ∈ TNC satisfying pX = fX , i.e.,

p(xk ) = fk for all 0 ≤ k ≤ N. (2.72)

Proof. We regard the linear mapping L : TNC −→ CN +1 , defined as

p ∈ TNC −→ pX = (p(x0 ), . . . , p(xN ))T ∈ CN +1 ,

which assigns every complex trigonometric polynomial p ∈ TNC in (2.66) to


the data vector pX ∈ CN +1 .
By letting zk = eixk ∈ C, for 0 ≤ k ≤ N , we obtain N +1 pairwise distinct
interpolation points on the boundary of the unit circle, where we have


p(xk) = ∑_{j=0}^{N} cj e^{ijxk} = ∑_{j=0}^{N} cj zk^j.

If L(p) = 0, then the complex polynomial p has at least N + 1 zeros. But


in this case, we have p ≡ 0, due to the fundamental theorem of algebra.
Therefore, the linear mapping L is injective. Due to the dimension formula,
L is also surjective, and thus bijective. This already proves the existence and
uniqueness of the sought polynomial p ∈ TNC . 

We finally turn to the solution of the interpolation problem, Problem 2.34,


by real trigonometric polynomials. The following result is a direct consequence
of Theorem 2.39.
Corollary 2.40. Let X = {x0 , . . . , x2n } ⊂ [0, 2π) be a set of 2n + 1 pairwise
distinct interpolation points and fX = (f0 , . . . , f2n )T ∈ R2n+1 be a data vector
of real function values, for n ∈ N0 . Then there is a unique real trigonometric
polynomial T ∈ TnR satisfying TX = fX .
Proof. Let p ∈ T2nC be the unique complex trigonometric interpolation polynomial satisfying p(xk) = e^{inxk} fk, for 0 ≤ k ≤ 2n, with Fourier coefficients cj, for 0 ≤ j ≤ 2n. Then we have

q(x) := e^{2inx} \overline{p(x)} = ∑_{j=0}^{2n} \overline{cj} e^{i(2n−j)x} = ∑_{j=0}^{2n} \overline{c_{2n−j}} e^{ijx}   for x ∈ [0, 2π)

and, moreover, since fk ∈ R,

q(xk) = e^{2inxk} \overline{p(xk)} = e^{inxk} fk = p(xk)   for all 0 ≤ k ≤ 2n.

Therefore, the complex trigonometric polynomial q ∈ T2nC is also a solution to the interpolation problem q(xk) = e^{inxk} fk for all 0 ≤ k ≤ 2n. From the uniqueness of the interpolation by complex trigonometric polynomials we get q ≡ p, and so in particular

cj = \overline{c_{2n−j}}   for all 0 ≤ j ≤ 2n. (2.73)

The Fourier coefficients of the interpolating real trigonometric polynomial T ∈ TnR can finally be obtained by the inversion of the complex Fourier coefficients in (2.71). Note that the Fourier coefficients a0, . . . , an, b1, . . . , bn of T are real, due to (2.73).

2.7 The Discrete Fourier Transform


In this section, we explain interpolation by trigonometric polynomials. More
specifically, we discuss the special case of N ∈ N equidistant interpolation
points

xk = (2π/N) · k ∈ [0, 2π)   for 0 ≤ k ≤ N − 1.
As we will show, the required Fourier coefficients can be computed efficiently.
In the following discussion, we denote the values of the target function f
by fk = f (xk ), for 0 ≤ k ≤ N − 1. Moreover, we use the notation
ωN = e^{2πi/N}   for N ∈ N (2.74)
for the N -th root of unity.
For further preparation, we make a note of the following observation.

Lemma 2.41. For N ∈ N the N -th root of unity ωN has the property
(1/N) ∑_{j=0}^{N−1} ωN^{(ℓ−k)j} = δ_{ℓk}   for all 0 ≤ ℓ, k ≤ N − 1. (2.75)

Proof. Let 0 ≤ ℓ, k ≤ N − 1. Note that for ℓ = k the statement in (2.75) is trivial. For ℓ ≠ k we have ωN^{ℓ−k} ≠ 1, so that we can use the standard identity

∑_{j=0}^{N−1} (ωN^{ℓ−k})^j = (ωN^{(ℓ−k)N} − 1) / (ωN^{ℓ−k} − 1) = (e^{2πi(ℓ−k)} − 1) / (ωN^{ℓ−k} − 1) = 0

of geometric series. This already completes our proof for (2.75).

Now we are in a position where we can already give the solution to the
posed interpolation problem at equidistant interpolation points.

Theorem 2.42. For N ∈ N equidistant points xℓ = 2πℓ/N ∈ [0, 2π), for 0 ≤ ℓ ≤ N − 1, and function values fX = (f0, . . . , fN−1)^T ∈ C^N the Fourier coefficients of the complex trigonometric interpolation polynomial p ∈ T^C_{N−1} satisfying pX = fX are given as

cj = (1/N) ∑_{k=0}^{N−1} fk ωN^{−jk}   for 0 ≤ j ≤ N − 1. (2.76)

Proof. By using Lemma 2.41, we obtain the identity

p(xℓ) = ∑_{j=0}^{N−1} ( (1/N) ∑_{k=0}^{N−1} fk ωN^{−jk} ) e^{ijxℓ} = ∑_{k=0}^{N−1} fk · (1/N) ∑_{j=0}^{N−1} ωN^{(ℓ−k)j} = fℓ

for all ℓ = 0, . . . , N − 1.

Therefore, the linear mapping in (2.76) yields an automorphism

AN : CN −→ CN ,

which maps the data vector fX = (f0 , . . . , fN −1 )T ∈ CN on the Fourier coef-


ficients c = (c0 , . . . , cN −1 )T ∈ CN of the complex trigonometric interpolation
polynomial p ∈ TNC−1 satisfying pX = fX . The bijective linear mapping AN ,
called discrete Fourier analysis, is represented by the matrix
AN = (1/N) ( ωN^{−jk} )_{0≤j,k≤N−1} ∈ C^{N×N}. (2.77)

We can characterize the inverse of AN as follows. The linear mapping

A_N^{−1} : C^N −→ C^N,

which assigns every vector c = (c0 , . . . , cN −1 )T ∈ CN of Fourier coefficients,


for a complex trigonometric polynomial


p(x) = ∑_{j=0}^{N−1} cj e^{ijx} ∈ T^C_{N−1},

to the complex values

fk = p(xk) = ∑_{j=0}^{N−1} cj e^{ijxk} = ∑_{j=0}^{N−1} cj ωN^{jk}   for k = 0, . . . , N − 1,

i.e., pX = fX, is called discrete Fourier synthesis. Therefore, the linear mapping A_N^{−1} is the inverse of AN, being represented by the matrix

A_N^{−1} = ( ωN^{jk} )_{0≤j,k≤N−1} ∈ C^{N×N}. (2.78)

The discrete Fourier analysis and the Fourier synthesis are usually referred
to as discrete Fourier transform and discrete inverse Fourier transform. In
the following discussion, we derive an efficient method for computing the
discrete (inverse) Fourier transform. But we first give a formal introduction
for the discrete (inverse) Fourier transform.

Definition 2.43. The discrete Fourier transformation (DFT) of

z = (z(0), z(1), . . . , z(N − 1))T ∈ CN

is defined componentwise as


ẑ(j) = ∑_{k=0}^{N−1} z(k) ωN^{−jk}   for 0 ≤ j ≤ N − 1, (2.79)

and the inverse discrete Fourier transform (IDFT) of

ẑ = (ẑ(0), ẑ(1), . . . , ẑ(N − 1))T ∈ CN

is defined componentwise as
z(k) = (1/N) ∑_{j=0}^{N−1} ẑ(j) ωN^{jk}   for 0 ≤ k ≤ N − 1.

The discrete Fourier transform (DFT) and the inverse DFT are repre-
sented by the Fourier matrices FN = N · AN and F_N^{−1} = A_N^{−1}/N, i.e.,

FN = ( ωN^{−jk} )_{0≤j,k≤N−1} ∈ C^{N×N},
F_N^{−1} = (1/N) ( ωN^{jk} )_{0≤j,k≤N−1} ∈ C^{N×N}.

Therefore, with using the notations in Definition 2.43, we have

ẑ = FN z   and   z = F_N^{−1} ẑ   for all z, ẑ ∈ C^N.

This finally leads us to the Fourier inversion formula

z = F_N^{−1} FN z   for all z ∈ C^N.

Now let us make one simple example for further illustration.

Example 2.44. We compute the DFT ẑ ∈ C512 of the vector z ∈ C512 with
components z(k) = 3 sin(2π · 7k/512) − 4 cos(2π · 8k/512). To this end, we
regard the Fourier series (from the Fourier inversion formula)

z(k) = (1/512) ∑_{j=0}^{511} ẑ(j) e^{2πijk/512},

whereby we obtain the unique representation of z ∈ C^{512} in the Fourier basis

{ e^{2πijk/512} | 0 ≤ j ≤ 511 }.

On the other hand, the Euler formula yields the representation

z(k) = 3 sin(2π · 7k/512) − 4 cos(2π · 8k/512)
     = (3/(2i)) ( e^{2πi·7k/512} − e^{−2πi·7k/512} ) − 2 ( e^{2πi·8k/512} + e^{−2πi·8k/512} )
     = −(3i/2) e^{2πi·7k/512} + (3i/2) e^{2πi·(−7+512)k/512} − 2 e^{2πi·8k/512} − 2 e^{2πi·(−8+512)k/512}
     = (1/512) ( −3 · 256i · e^{2πi·7k/512} − 1024 · e^{2πi·8k/512} − 1024 · e^{2πi·504k/512} + 3 · 256i · e^{2πi·505k/512} ).

Therefore, we have

ẑ(7) = −768i, ẑ(8) = −1024, ẑ(504) = −1024, ẑ(505) = 768i,

and, moreover, ẑ(j) = 0 for all j ∈ {0, . . . , 511} \ {7, 8, 504, 505}. Thereby,
the vector z ∈ C512 has a sparse representation by the four non-vanishing
Fourier coefficients ẑ(7), ẑ(8), ẑ(504) and ẑ(505) (see Figure 2.7). ♦
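Example 2.44 is easy to verify numerically, since the DFT convention (2.79) matches the sign convention of numpy.fft.fft. The following short sketch (not part of the original text) checks the four non-vanishing coefficients.

import numpy as np

k = np.arange(512)
z = 3 * np.sin(2 * np.pi * 7 * k / 512) - 4 * np.cos(2 * np.pi * 8 * k / 512)

z_hat = np.fft.fft(z)                # same convention as (2.79)
nonzero = np.where(np.abs(z_hat) > 1e-8)[0]
print(nonzero)                       # [  7   8 504 505]
print(np.round(z_hat[nonzero]))      # approx [-768j, -1024, -1024, 768j]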
Fig. 2.7. Sparse representation of z(k) = 3 sin(2π · 7k/512) − 4 cos(2π · 8k/512): (a) input vector z(k), k = 0, . . . , 511; (b) amplitude spectrum |ẑ(j)| (see Example 2.44).

Remark 2.45. A componentwise computation of the DFT ẑ (or of the


IDFT) according to Definition 2.43 requires asymptotically O(N²) steps,
namely O(N ) steps for each of the N components. 

In the remainder of this section, we explain how to compute the DFT


by an efficient algorithm, termed the fast Fourier transform (FFT), de-
signed by Cooley14 and Tukey15 [16]. The Cooley-Tukey algorithm is based
on a recursion according to a common (political) principle divide et impera
(Latin for divide and conquer) of Machiavelli16 dating back to 1513.
The recursion step of the Cooley-Tukey algorithm relies on the identity
ω_{2N}^2 = ωN,

being applied as follows.


For N = 2^n, n ≥ 1, and 0 ≤ j ≤ N − 1 we have

ẑ(j) = ∑_{k=0}^{N−1} z(k) ωN^{−kj}
     = ∑_{k even} z(k) ωN^{−kj} + ∑_{k odd} z(k) ωN^{−kj}
     = ∑_{k=0}^{N/2−1} z(2k) ωN^{−2kj} + ∑_{k=0}^{N/2−1} z(2k + 1) ωN^{−(2k+1)j}
     = ∑_{k=0}^{N/2−1} z(2k) ωN^{−2kj} + ωN^{−j} ∑_{k=0}^{N/2−1} z(2k + 1) ωN^{−2kj}.

This already yields for M = N/2 the reduction

ẑ(j) = ∑_{k=0}^{M−1} z(2k) ωN^{−2kj} + ωN^{−j} ∑_{k=0}^{M−1} z(2k + 1) ωN^{−2kj}
     = ∑_{k=0}^{M−1} u(k) ω_{N/2}^{−kj} + ωN^{−j} ∑_{k=0}^{M−1} v(k) ω_{N/2}^{−kj}
     = ∑_{k=0}^{M−1} u(k) ωM^{−kj} + ωN^{−j} ∑_{k=0}^{M−1} v(k) ωM^{−kj}

for j = 0, . . . , N − 1, where

u(k) = z(2k) and v(k) = z(2k + 1) for k = 0, 1, . . . , M − 1.


14
James W. Cooley (1926-2016), US American mathematician
15
John Wilder Tukey (1915-2000), US American mathematician
16
Niccolò di Bernardo dei Machiavelli (1469-1527), Florentine philosopher

Therefore, we can, for any input vector z ∈ CN of length N = 2M , reduce


the computation of its DFT ẑ to the computation of two DFTs of half length
M = N/2 each. Indeed, the DFTs of the two vectors u, v ∈ C^M yield the DFT of z by

ẑ(j) = û(j) + ωN^{−j} v̂(j).
From this basic observation, we can already determine the complexity, i.e.,
the asymptotic computational costs, of the fast Fourier transform (FFT).
Theorem 2.46. For N = 2^n, n ∈ N, the discrete Fourier transform of a
vector z ∈ CN is computed by the FFT in asymptotically O(N log(N )) steps.
Proof. In the first reduction step the DFT of z ∈ CN with length N is de-
composed into two DFTs (for u, v ∈ CN/2 ) of length N/2 each. By induction,
in the m-th reduction step for the current 2^m DFTs of length N/2^m, each of these DFTs can be decomposed into two DFTs of length N/2^{m+1}. After
n = log2 (N ) reduction steps we have N atomic DFTs of unit length. But the
DFT for a vector z of unit length is trivial: In this case, we have ẑ(0) = z(0)
for z = z(0) ∈ C1 , and so the recursion terminates. Altogether, N log2 (N )
steps are performed in the recursion. 
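A direct recursive transcription of this divide-and-conquer step reads as follows (a Python sketch, not part of the original text; it assumes that N is a power of two and makes no claim to the efficiency of an optimized FFT library).

import numpy as np

def fft_recursive(z):
    """Radix-2 Cooley-Tukey FFT of z (len(z) must be a power of two)."""
    N = len(z)
    if N == 1:
        return np.array(z, dtype=complex)
    u_hat = fft_recursive(z[0::2])                 # DFT of the even-indexed entries
    v_hat = fft_recursive(z[1::2])                 # DFT of the odd-indexed entries
    idx = np.arange(N // 2)
    twiddle = np.exp(-2.0j * np.pi * idx / N)      # omega_N^{-j}, j = 0, ..., N/2 - 1
    return np.concatenate([u_hat + twiddle * v_hat,
                           u_hat - twiddle * v_hat])

z = np.random.rand(512)
print(np.allclose(fft_recursive(z), np.fft.fft(z)))   # True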
We finally discuss one relevant application of the fast Fourier transform.
In this application, we consider solving a linear equation system of the form
Cx = b (2.80)
efficiently, where C is a cyclic Toeplitz matrix.
Definition 2.47. A cyclic Toeplitz17 matrix has the form
⎡ c0      cN−1    cN−2    · · ·   c1   ⎤
⎢ c1      c0      cN−1    · · ·   c2   ⎥
C = ⎢ c2      c1      c0      · · ·   c3   ⎥ ∈ C^{N×N},
⎢ ⋮       ⋮       ⋮       ⋱      ⋮    ⎥
⎣ cN−1    cN−2    cN−3    · · ·   c0   ⎦
where c = (c0 , . . . , cN −1 )T ∈ CN is called the generating vector of C.
The following observation is quite important for our solution to (2.80).
Proposition 2.48. Let C be a cyclic Toeplitz matrix with generating vector
c ∈ CN . Then C is diagonalized by the discrete Fourier transform FN , so
that
FN C F_N^{−1} = diag(d),
where the eigenvalues d = (d0 , . . . , dN −1 ) ∈ CN of C are given by the discrete
Fourier transform of c, i.e.,
d = FN c.
17
Otto Toeplitz (1881-1940), German mathematician

Proof. For the entries of the Toeplitz matrix C = (Cjk )0≤j,k≤N −1 , we have

Cjk = c_{(j−k) mod N}   for 0 ≤ j, k ≤ N − 1.

We recall the definition of the Fourier matrices

FN = ( ωN^{−jk} )_{0≤j,k≤N−1}   and   F_N^{−1} = (1/N) ( ωN^{jk} )_{0≤j,k≤N−1},

where ωN = e^{2πi/N}. For 0 ≤ ℓ ≤ N − 1 we let

ω^{(ℓ)} = (1/N) ( ωN^{jℓ} )_{0≤j≤N−1} ∈ C^N

denote the ℓ-th column of F_N^{−1}. By using the identity

ωN^{(k−j)ℓ} = ωN^{((k−j) mod N) ℓ}

we obtain

(C ω^{(ℓ)})_j = (1/N) ∑_{k=0}^{N−1} c_{(j−k) mod N} · ωN^{kℓ} = (1/N) ωN^{jℓ} ∑_{k=0}^{N−1} c_{(j−k) mod N} · ωN^{(k−j)ℓ}
             = (1/N) ωN^{jℓ} ∑_{m=0}^{N−1} c_{m mod N} · ωN^{−mℓ} = (1/N) ωN^{jℓ} ∑_{m=0}^{N−1} cm ωN^{−mℓ} = (1/N) ωN^{jℓ} dℓ,

where

dℓ = ∑_{k=0}^{N−1} ck ωN^{−kℓ}   for 0 ≤ ℓ ≤ N − 1

is the ℓ-th component of d = FN c.

Therefore, ω^{(ℓ)} is an eigenvector of C with eigenvalue dℓ, i.e.,

C ω^{(ℓ)} = dℓ ω^{(ℓ)}   for 0 ≤ ℓ ≤ N − 1,

whereby

C F_N^{−1} = F_N^{−1} diag(d)

or

FN C F_N^{−1} = diag(d).


Now we finally regard the linear system (2.80) for a cyclic Toeplitz matrix
C ∈ CN ×N with generating vector c ∈ CN . By application of the discrete
Fourier transform FN to both sides in (2.80) we get the identity

FN C F_N^{−1} FN x = FN b.

Using Proposition 2.48 leads us to the linear system

Dy = r, (2.81)

where we let y = FN x and r = FN b, and where D = diag(d) for d = FN c.


Now the matrix C is non-singular, if and only if none of its eigenvalues in d
vanishes. In this case
y = ( r0/d0, . . . , rN−1/dN−1 )^T ∈ C^N

is the unique solution of the linear system (2.81). By backward transformation


with the inverse discrete Fourier transform F_N^{−1}, we finally obtain the solution of the linear system (2.80) by

x = F_N^{−1} y.

We summarize the proposed solution for the Toeplitz system (2.80) in Al-
gorithm 3. Note that Algorithm 3 can be implemented efficiently by using the
fast Fourier transform (FFT): By Theorem 2.46 the performance of the steps
in lines 5,6 and 8 of Algorithm 3 by the (inverse) FFT costs only O(N log(N ))
operations each. In this case, a total number of only O(N log(N )) operations
are required for the performance of Algorithm 3. In comparison, the solution
of a linear equation system (2.80) via Gauss elimination, requiring O(N³) operations, is far too expensive. But unlike in Algorithm 3, the Toeplitz structure
of the matrix C is not used in the Gauss elimination algorithm.

Algorithm 3 Solution of linear Toeplitz systems Cx = b in (2.80)


1: function Toeplitz-Solution(c,b)
2: input: generating vector c ∈ CN of a non-singular
3: cyclic Toeplitz matrix C ∈ CN ×N ;
4: right hand side b ∈ CN ;
5: compute DFT d = FN c;
6: compute DFT r = FN b;
7: let y := (r0 /d0 , . . . , rN −1 /dN −1 )T ;
−1
8: compute IDFT x = FN y
9: output: solution x ∈ CN of Cx = b.
10: end function
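To illustrate Algorithm 3, the following short Python sketch (not part of the original pseudocode; it assumes NumPy) solves a cyclic Toeplitz system via the FFT. Note that numpy.fft.fft uses the sign convention of FN above (no 1/N factor), so d = FN c corresponds to np.fft.fft(c) and FN⁻¹ to np.fft.ifft.

```python
import numpy as np

def solve_circulant(c, b):
    """Solve C x = b for the cyclic Toeplitz matrix generated by c,
    using the diagonalization F_N C F_N^{-1} = diag(F_N c)."""
    d = np.fft.fft(c)          # eigenvalues d = F_N c   (line 5 of Algorithm 3)
    r = np.fft.fft(b)          # r = F_N b               (line 6)
    if np.any(np.isclose(d, 0.0)):
        raise ValueError("C is singular: some eigenvalue d_l vanishes")
    y = r / d                  # y = (r_0/d_0, ..., r_{N-1}/d_{N-1})^T  (line 7)
    return np.fft.ifft(y)      # x = F_N^{-1} y          (line 8)

# small test against an explicitly assembled cyclic Toeplitz matrix
rng = np.random.default_rng(0)
N = 8
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)
b = rng.standard_normal(N) + 1j * rng.standard_normal(N)
C = np.array([[c[(j - k) % N] for k in range(N)] for j in range(N)])
x = solve_circulant(c, b)
print(np.allclose(C @ x, b))   # True
```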
3 Best Approximations

In this chapter, we analyze fundamental questions of approximation. To this
end, let F be a linear space, equipped with a norm ‖·‖. Moreover, let S ⊂ F be
a non-empty subset of F. To approximate an element f ∈ F \ S by elements from S,
we are interested in finding an s∗ ∈ S whose distance to f is minimal among
all elements from S. This leads us to the definition of best approximations.
Definition 3.1. Let F be a linear space with norm ‖·‖. Moreover, let S ⊂ F
be a non-empty subset of F. For f ∈ F, an element s∗ ∈ S is said to be a
best approximation to f from S with respect to (F, ‖·‖), or in short: s∗ is
a best approximation to f, if
\[
\|s^* - f\| = \inf_{s \in S} \|s - f\|.
\]
Moreover,
\[
\eta \equiv \eta(f, S) = \inf_{s \in S} \|s - f\|
\]
is called the minimal distance between f and S.
In the following investigations, we will first address questions concerning
the existence and uniqueness of best approximations. To this end, we develop
sufficient conditions for the linear space F and the subset S ⊂ F, under which
we can guarantee for any f ∈ F the existence of a best approximation s∗ ∈ S
for f . To guarantee the uniqueness of s∗ , we require strict convexity for the
norm  · .
In the following discussion, we develop suitable sufficient and necessary
conditions to characterize best approximations. To this end, we first derive
dual characterizations for best approximations, giving conditions for the ele-
ments from the topological dual space F  of linear and continuous functionals
on F.
This is followed by direct characterizations of best approximations, where
we use directional derivatives (Gâteaux derivatives) of the norm  · . On that
occasion, we consider computing directional derivatives of relevant norms
explicitly.
To study the material of this chapter (and for the following chapters)
we require knowledge of elementary results from optimization and functional
analysis. Therefore, we decided to explain a selection of relevant results. But
for further reading, we refer to the textbook [33].


Before we address theoretical questions concerning the existence and uniqueness
of best approximations, we first discuss one elementary example, which will
illustrate relevant scenarios and phenomena.

Example 3.2. For F = R², let S = {x = (x₁, x₂) | 2 ≤ ‖x‖₂ < 3} ⊂ R²
be a (half-open) circular ring around the origin. Moreover, let fα = (α, 0) ∈ R², for
α ∈ R. Now we wish to best-approximate fα (according to Definition 3.1) by
elements from S. To do so, we first need to fix a norm on R². To this end,
we work with three different norms on R²:
• the 1-norm ‖·‖₁, defined as ‖x‖₁ = |x₁| + |x₂| for x = (x₁, x₂);
• the Euclidean norm ‖·‖₂, defined as ‖x‖₂² = |x₁|² + |x₂|²;
• the maximum norm ‖·‖∞, defined as ‖x‖∞ = max(|x₁|, |x₂|).
We let S∗p ≡ S∗p(fα) denote the set of best approximations to fα with
respect to ‖·‖ = ‖·‖p, with minimal distance ηp ≡ ηp(fα, S), for p = 1, 2, ∞.
For the construction and characterization of best approximations to fα we
distinguish different cases (see Fig. 3.1).
Case (a): Suppose α ≥ 3. In this case, we have
\[
\eta_p = \inf_{s \in S} \|s - f_\alpha\|_p = \alpha - 3 \qquad \text{for } p = 1, 2, \infty,
\]
where
\[
\|s - f_\alpha\|_p > \inf_{s \in S} \|s - f_\alpha\|_p = \alpha - 3 \qquad \text{for all } s \in S,
\]
i.e., there is no best approximation to fα from S, and so S∗p = ∅.


Case (b): Suppose α ∈ (0, 2). In this case, we have
\[
\eta_1 = \eta_2 = 2 - \alpha
\qquad\text{and}\qquad
\eta_\infty = \frac{\sqrt{8 - \alpha^2} - \alpha}{2}
\]
and, moreover, S∗p = {(2, 0)} for p = 1, 2 and
\[
S_\infty^* = \left\{ \left( \frac{\sqrt{8-\alpha^2} + \alpha}{2},\ \pm \frac{\sqrt{8-\alpha^2} - \alpha}{2} \right) \right\}.
\]
Case (c): Suppose α = 0. In this case, we have
\[
\eta_1 = \eta_2 = 2 \qquad\text{and}\qquad \eta_\infty = \sqrt{2},
\]
where S∗₁ = {(±2, 0), (0, ±2)}, S∗₂ = {x ∈ S | ‖x‖₂ = 2}, and S∗∞ = {(±√2, ±√2)}.
Therefore, there exists, for any of the three norms ‖·‖p, p = 1, 2, ∞, a best
approximation to f₀. In either case, however, the best approximations are
not unique. For ‖·‖₂ there are even uncountably many best approximations
to f₀.
Case (d): For α ∈ [2, 3) we have fα ∈ S and so Sp∗ = {fα } with ηp = 0.
For any other case, i.e., for α < 0, we can analyze the set of best approx-
imations by using one of the (symmetric) cases (a)-(d). ♦
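The closed-form quantities in cases (b) and (c) are easy to check numerically. The following small Python sketch (assuming NumPy; the grid-sampling search is only an illustration, not part of the example) approximates ηp(fα, S) by evaluating ‖s − fα‖p over a fine sampling of the ring S.

```python
import numpy as np

def min_dist(alpha, p, n_r=400, n_t=2000):
    """Approximate eta_p(f_alpha, S) for S = {2 <= |x|_2 < 3} by grid sampling."""
    r = np.linspace(2.0, 3.0 - 1e-9, n_r)           # radii of the ring
    t = np.linspace(0.0, 2.0 * np.pi, n_t)           # angles
    R, T = np.meshgrid(r, t)
    X = np.stack([R * np.cos(T) - alpha, R * np.sin(T)], axis=-1)  # s - f_alpha
    if p == np.inf:
        d = np.max(np.abs(X), axis=-1)
    else:
        d = np.sum(np.abs(X) ** p, axis=-1) ** (1.0 / p)
    return d.min()

alpha = 1.0
print(min_dist(alpha, 1), min_dist(alpha, 2))    # both close to 2 - alpha = 1
print(min_dist(alpha, np.inf))                   # close to (sqrt(7) - 1)/2 ~ 0.8229
print(min_dist(0.0, np.inf))                     # close to sqrt(2) ~ 1.4142
```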
[Fig. 3.1: nine panels showing the approximation of fα for α = 4, 1, 0 (rows) with respect to ‖·‖₁, ‖·‖₂, ‖·‖∞ (columns). Panel captions: for α = 4: S∗p = ∅ and ηp = 1 for p = 1, 2, ∞; for α = 1: S∗₁ = S∗₂ = {(2, 0)} with η₁ = η₂ = 1, and S∗∞ = {((√7+1)/2, ±(√7−1)/2)} with η∞ = (√7−1)/2; for α = 0: S∗₁ = {(±2, 0), (0, ±2)} with η₁ = 2, S∗₂ = {x ∈ R² | ‖x‖₂ = 2} with η₂ = 2, and S∗∞ = {(±√2, ±√2)} with η∞ = √2.]

Fig. 3.1. Approximation of fα = (α, 0) ∈ R², for α = 4, 1, 0, by elements from the approximation set S = {x = (x₁, x₂) | 2 ≤ ‖x‖₂ < 3} ⊂ R² and with respect to the norms ‖·‖p for p = 1, 2, ∞ (see Example 3.2).

3.1 Existence
In the following discussion, the notions of compactness, completeness, and
continuity play an important role. We assume that their definitions and fur-
ther properties are familiar from analysis. Nevertheless, let us recall the con-
tinuity of functionals. Throughout this chapter, F denotes a linear space with
norm ‖·‖.

Definition 3.3. A functional ϕ : F −→ R is said to be continuous at


u ∈ F, if for any convergent sequence (un )n∈N ⊂ F with limit u ∈ F, i.e.,

‖un − u‖ −→ 0 for n → ∞,

we have
ϕ(un ) −→ ϕ(u) for n → ∞.
Moreover, ϕ is called continuous on F, if ϕ is continuous at every u ∈ F.

Now recall that any continuous functional attains its minimum (and its
maximum) on compact sets. Any compact set is closed and bounded. The
converse, however, is only true in finite-dimensional spaces.
For the discussion in this section, we need the continuity of norms. This
requirement is already covered by the following result.

Theorem 3.4. Every norm is continuous.

Proof. Let F be a linear space with norm ‖·‖. Moreover let v ∈ F and
(vn)n∈N ⊂ F be a convergent sequence in F with limit v, i.e.,
\[
\|v_n - v\| \longrightarrow 0 \qquad \text{for } n \to \infty.
\]
Now, by the triangle inequality for the norm ‖·‖, this implies
\[
\bigl|\, \|v_n\| - \|v\| \,\bigr| \le \|v_n - v\| \longrightarrow 0 \qquad \text{for } n \to \infty
\]
and therefore
\[
\|v_n\| \longrightarrow \|v\| \qquad \text{for } n \to \infty,
\]
i.e., ‖·‖ is continuous at v ∈ F. Since we did not pose any further conditions
on v ∈ F, the norm ‖·‖ is continuous on F. □

The above result allows us to prove a first elementary result concerning


the existence of best approximations.

Theorem 3.5. Let S ⊂ F be compact. Then there exists for any f ∈ F a


best approximation s∗ ∈ S to f .

Proof. For f ∈ F the functional ϕ : F −→ [0, ∞), defined as
\[
\varphi(v) = \|v - f\| \qquad \text{for } v \in \mathcal{F},
\]
is continuous on F. In particular, ϕ attains its minimum on the compact set
S, i.e., there is one s∗ ∈ S satisfying
\[
\varphi(s^*) = \|s^* - f\| \le \|s - f\| = \varphi(s) \qquad \text{for all } s \in S. \qquad\square
\]

From this result, we can further conclude as follows.

Corollary 3.6. Let F be finite-dimensional, and S ⊂ F be closed in F.


Then there exists for any f ∈ F a best approximation s∗ ∈ S to f .

Proof. For s₀ ∈ S and f ∈ F the non-empty set
\[
S_0 = S \cap \{ v \in \mathcal{F} \mid \|v - f\| \le \|s_0 - f\| \} \subset S
\]
is closed and bounded, i.e., S₀ ⊂ F is compact. By Theorem 3.5 there is one
best approximation s∗ ∈ S₀ to f from S₀, so that
\[
\|s^* - f\| \le \|s - f\| \qquad \text{for all } s \in S_0.
\]
Moreover, for any s ∈ S \ S₀ we have the inequality
\[
\|s - f\| > \|s_0 - f\| \ge \|s^* - f\|,
\]
so that altogether,
\[
\|s^* - f\| \le \|s - f\| \qquad \text{for all } s \in S,
\]
i.e., s∗ ∈ S₀ ⊂ S is a best approximation to f from S. □

Corollary 3.7. Let S ⊂ F be a closed subset of F. If S is contained in a


finite-dimensional linear subspace R ⊂ F of F, i.e., S ⊂ R, then there exists
for any f ∈ F a best approximation s∗ ∈ S to f .

Proof. Regard, for f ∈ F, the finite-dimensional linear space
\[
\mathcal{R}_f = \operatorname{span}\{ f, r_1, \dots, r_n \} \subset \mathcal{F},
\]
where {r₁, . . . , rₙ} is a basis of R. Then there exists, by Corollary 3.6, a best
approximation s∗ ∈ S to f ∈ Rf, where in particular,
\[
\|s^* - f\| \le \|s - f\| \qquad \text{for all } s \in S. \qquad\square
\]



The result of Corollary 3.7 holds in particular for the case R = S.


Corollary 3.8. Let S ⊂ F be a finite-dimensional subspace of F. Then there
exists for any f ∈ F a best approximation s∗ ∈ S to f . 
In the above results concerning the existence of best approximations, we
require that S ⊂ F is contained in a finite-dimensional linear space. For
the approximation in Euclidean spaces F, we can refrain from using this
restriction. To this end, the following geometric identities are of fundamental
importance.
Theorem 3.9. Let F be a Euclidean space with inner product (·, ·) and norm
‖·‖ = (·, ·)^{1/2}. Then the parallelogram identity
\[
\|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2 \qquad \text{for all } v, w \in \mathcal{F} \tag{3.1}
\]
holds. If F is a Euclidean space over the real numbers R, then the polarization
identity
\[
(v, w) = \frac{1}{4} \bigl( \|v + w\|^2 - \|v - w\|^2 \bigr) \qquad \text{for all } v, w \in \mathcal{F} \tag{3.2}
\]
holds. If F is a Euclidean space over the complex numbers C, then the
polarization identity holds as
\[
(v, w) = \frac{1}{4} \bigl( \|v + w\|^2 - \|v - w\|^2 + i\|v + iw\|^2 - i\|v - iw\|^2 \bigr) \tag{3.3}
\]
for all v, w ∈ F.
Proof. Equations (3.1), (3.2) follow directly from the identities
\[
\|v \pm w\|^2 = (v \pm w, v \pm w) = (v, v) \pm 2(v, w) + (w, w) = \|v\|^2 \pm 2(v, w) + \|w\|^2.
\]
Likewise, the polarization identity (3.3) can be verified by elementary calculations. □
For the geometric interpretation of the parallelogram identity, we make the
following remark: for any parallelogram, the sum of the squared lengths of the
four edges coincides with the sum of the squared lengths of the two diagonals
(see Fig. 3.2).
For the statement in Theorem 3.9, the converse is true, according to the
theorem of Jordan1 and von Neumann2 [40].
Theorem 3.10. (Jordan-von Neumann theorem, 1935).
Let F be a linear space with norm ‖·‖, for which the parallelogram identity (3.1)
holds. Then there is an inner product (·, ·) on F, so that
\[
(v, v) = \|v\|^2 \qquad \text{for all } v \in \mathcal{F}, \tag{3.4}
\]
i.e., F is a Euclidean space.
1
Pascual Jordan (1902-1980), German mathematician and physicist
2
John von Neumann (1903-1957), Hungarian-US American mathematician
[Fig. 3.2: a parallelogram spanned by v and w, with diagonals v + w and v − w.]

Fig. 3.2. On the geometry of the parallelogram identity (see Theorem 3.9).

Proof. Let F be a linear space over R. By using the norm ‖·‖ of F we define
a mapping (·, ·) : F × F −→ R through the polarization identity (3.2), i.e., we
let
\[
(v, w) := \frac{1}{4} \bigl( \|v + w\|^2 - \|v - w\|^2 \bigr) \qquad \text{for } v, w \in \mathcal{F}.
\]
Obviously, we have (3.4) and so (·, ·) is positive definite. Moreover, (·, ·) is
obviously symmetric, so that (v, w) = (w, v) for all v, w ∈ F.
It remains to verify the linearity

(αu + βv, w) = α(u, w) + β(v, w) for all α, β ∈ R, u, v, w ∈ F. (3.5)

To this end, we note the property

(−v, w) = −(v, w) for all v, w ∈ F, (3.6)

which immediately follows from the definition of (·, ·). In particular, we have

(0, w) = 0 for all w ∈ F.

Moreover, by the parallelogram identity (3.1) we obtain
\[
(u, w) + (v, w)
= \frac{1}{4} \Bigl( \|u + w\|^2 - \|u - w\|^2 + \|v + w\|^2 - \|v - w\|^2 \Bigr)
= \frac{1}{2} \Bigl( \bigl\| \tfrac{1}{2}(u + v) + w \bigr\|^2 - \bigl\| \tfrac{1}{2}(u + v) - w \bigr\|^2 \Bigr)
= 2 \Bigl( \tfrac{1}{2}(u + v), w \Bigr),
\]
which, for v = 0, implies
\[
(u, w) = 2 \left( \tfrac{1}{2} u, w \right) \qquad \text{for all } u, w \in \mathcal{F} \tag{3.7}
\]
and thereby the additivity
(u, w) + (v, w) = (u + v, w) for all u, v, w ∈ F. (3.8)
From (3.7), (3.8) we obtain for m, n ∈ N the identities
\[
m\,(u, w) = (m u, w) \qquad \text{for all } u, w \in \mathcal{F}
\]
\[
\frac{1}{2^n}\,(u, w) = \left( \frac{1}{2^n}\, u, w \right) \qquad \text{for all } u, w \in \mathcal{F}
\]
by induction on m ∈ N and by induction on n ∈ N, respectively.
In combination with (3.6) and (3.8) this implies the homogeneity
(αu, w) = α(u, w) for all u, w ∈ F (3.9)
for all dyadic numbers α ∈ Q of the form
\[
\alpha = m + \sum_{k=1}^{n} \frac{\alpha_k}{2^k} \qquad \text{for } m \in \mathbb{Z},\ n \in \mathbb{N},\ \alpha_k \in \{0, 1\},\ 1 \le k \le n.
\]

Since any real number α ∈ R can be approximated arbitrarily well by


a dyadic number, the continuity of the norm  ·  implies the homogeneity
(3.9) even for all α ∈ R. Together with the additivity (3.8) this implies the
linearity (3.5). Therefore, (·, ·) is an inner product over R.
If F is a linear space over C, then we define (·, ·) : F × F −→ C through
the polarization identity (3.3), for which we then verify the properties of an
inner product for (·, ·) in (3.2), by analogy. 
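As a quick sanity check of the parallelogram and polarization identities, the following Python sketch (assuming NumPy; purely illustrative) recovers the standard dot product on Rᵈ from the Euclidean norm alone via (3.2).

```python
import numpy as np

def polarization(v, w):
    """Recover (v, w) from the Euclidean norm via the polarization identity (3.2)."""
    sq = lambda z: np.linalg.norm(z) ** 2
    return 0.25 * (sq(v + w) - sq(v - w))

rng = np.random.default_rng(1)
v, w = rng.standard_normal(5), rng.standard_normal(5)
print(np.isclose(polarization(v, w), np.dot(v, w)))   # True
# parallelogram identity (3.1):
print(np.isclose(np.linalg.norm(v + w)**2 + np.linalg.norm(v - w)**2,
                 2 * np.linalg.norm(v)**2 + 2 * np.linalg.norm(w)**2))  # True
```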
Given the above characterization of Euclidean norms by the parallelogram
identity (3.1) and the polarization identities (3.2),(3.3), approximation in
Euclidean spaces is of particular importance. From the equivalence relation
in Theorems 3.9 and 3.10, we can immediately draw the following conclusion.
Corollary 3.11. Every inner product is continuous.
Proof. Let F be a Euclidean space over R with inner product (·, ·). Moreover,
let (vn )n∈N ⊂ F and (wn )n∈N ⊂ F be convergent sequences in F with limit
elements v ∈ F and w ∈ F. From the polarization identity (3.2) and by the
continuity of the norm ‖·‖ = (·, ·)^{1/2}, from Theorem 3.4, we have
\[
(v_n, w_m) = \frac{1}{4} \bigl( \|v_n + w_m\|^2 - \|v_n - w_m\|^2 \bigr)
\longrightarrow \frac{1}{4} \bigl( \|v + w\|^2 - \|v - w\|^2 \bigr) = (v, w)
\qquad \text{for } n, m \to \infty.
\]
For the case where F is a Euclidean space over C, we show the continuity
of (·, ·) from the polarization identity (3.3), by analogy. 

Now we return to the question for the existence of best approximations. In


Euclidean spaces F we can rely on the parallelogram identity (3.1). Moreover,
we need the completeness of F. On this occasion, we recall the following
definition.
Definition 3.12. A complete Euclidean space is called Hilbert3 space.
Moreover, let us recall the notion of strictly convex sets.
Definition 3.13. A non-empty subset K ⊂ F is called convex, if for any
u, v ∈ K the straight line
[u, v] = {λu + (1 − λ)v | λ ∈ [0, 1]}
between u and v lies in K, i.e., if [u, v] ⊂ K for all u, v ∈ K.
If for any u, v ∈ K, u ≠ v, the open straight line
(u, v) = {λu + (1 − λ)v | λ ∈ (0, 1)}
is contained in the interior of K, then K is called strictly convex.
Now we prove an important result concerning the existence of best ap-
proximations in Hilbert spaces.
Theorem 3.14. Let F be a Hilbert space with inner product (·, ·) and norm
 ·  = (·, ·)1/2 . Moreover, let S ⊂ F be a closed and convex subset of F. Then
there exists for any f ∈ F a best approximation s∗ ∈ S to f .
Proof. Let (sn)n∈N ⊂ S be a minimal sequence in S, i.e.,
\[
\|s_n - f\| \longrightarrow \eta(f, S) \qquad \text{for } n \to \infty
\]
with minimal distance η ≡ η(f, S) = inf_{s∈S} ‖s − f‖.
From the parallelogram identity (3.1) we obtain the estimate
\[
\|s_n - s_m\|^2 = 2\|s_n - f\|^2 + 2\|s_m - f\|^2 - 4 \left\| \frac{s_n + s_m}{2} - f \right\|^2
\le 2\|s_n - f\|^2 + 2\|s_m - f\|^2 - 4\eta^2,
\]
where we use that the midpoint (sn + sm)/2 lies in the convex set S.
Therefore, for any ε > 0 there is one N ≡ N(ε) ∈ N satisfying
\[
\|s_n - s_m\| < \varepsilon \qquad \text{for all } n, m \ge N,
\]
i.e., (sn)n∈N is a Cauchy⁴ sequence in the Hilbert space F, and therefore
convergent in F. Since S is a closed set, the limit element s∗ lies in S, and
we have
\[
\eta = \lim_{n \to \infty} \|s_n - f\| = \|s^* - f\|,
\]
i.e., s∗ ∈ S is a best approximation to f. □
3
David Hilbert (1862-1943), German mathematician
4
Augustin-Louis Cauchy (1789-1857), French mathematician

Remark 3.15. The required convexity for S is necessary for the result of
Theorem 3.14. In order to see this, we regard the sequence space
\[
\ell^2 \equiv \ell^2(\mathbb{R}) = \left\{ x = (x_k)_{k \in \mathbb{N}} \subset \mathbb{R} \;\Big|\; \sum_{k=1}^{\infty} |x_k|^2 < \infty \right\} \tag{3.10}
\]
consisting of all square summable sequences of real numbers. The sequence
space ℓ², being equipped with the inner product
\[
(x, y) = \sum_{k=1}^{\infty} x_k y_k \qquad \text{for } x = (x_k)_{k \in \mathbb{N}},\ y = (y_k)_{k \in \mathbb{N}} \in \ell^2,
\]
is a Hilbert space with the ℓ²-norm
\[
\|x\|_2 := \left( \sum_{k=1}^{\infty} |x_k|^2 \right)^{1/2} \qquad \text{for } x = (x_k)_{k \in \mathbb{N}} \in \ell^2.
\]
Now we regard the subset
\[
S = \left\{ x^{(k)} = \left( 1 + \frac{1}{k} \right) e_k \;\Big|\; k \in \mathbb{N} \right\} \subset \ell^2,
\]
where e_k ∈ ℓ² is the sequence with (e_k)_j = δ_{jk}, for j, k ∈ N. Note that the
elements x^{(k)} ∈ S are isolated in ℓ², and so S is closed. But S is not convex.
Now we have η(0, S) = 1 for the minimal distance between 0 ∈ ℓ² and S,
and, moreover,
\[
\|x^{(k)} - 0\|_2 > 1 \qquad \text{for all } x^{(k)} \in S.
\]
Hence there exists no x^{(k)} ∈ S with unit distance to the origin.
Finally, we remark that the result of Theorem 3.14 does not generalize
to Banach spaces. To see this, a counterexample can for instance be found
in [42, Section 5.2]. □
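The failure of attainment in Remark 3.15 is also easy to see numerically: the distances ‖x^{(k)} − 0‖₂ = 1 + 1/k decrease towards the infimum 1 but never reach it. A small Python sketch (assuming NumPy; only finitely many indices k and truncated sequences are inspected, of course):

```python
import numpy as np

def x_k(k, n=50):
    """The element x^(k) = (1 + 1/k) e_k, truncated to its first n coordinates."""
    v = np.zeros(n)
    v[k - 1] = 1.0 + 1.0 / k
    return v

dists = [np.linalg.norm(x_k(k)) for k in range(1, 30)]
print(dists[:5])          # 2.0, 1.5, 1.333..., 1.25, 1.2 -> decreasing towards 1
print(min(dists) > 1.0)   # True: the infimum 1 is not attained
```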

3.2 Uniqueness
In the following discussion, the notion of (strict) convexity for point sets,
functions, functionals and norms plays an important role. Recall the relevant
definitions for sets (see Definition 3.13) and for functions (see Definition 3.20),
as these should be familiar from analysis.
Now we note some fundamental results, where F denotes, throughout this
section, a linear space with norm  · . We start with a relevant example for
a convex set.

Theorem 3.16. Let S ⊂ F be convex and f ∈ F. Then the set
\[
S^* \equiv S^*(f, S) = \{ s^* \in S \mid \|s^* - f\| = \inf_{s \in S} \|s - f\| \} \subset S
\]
of best approximations s∗ ∈ S to f is convex.
Proof. Let s∗₁, s∗₂ ∈ S∗ be two best approximations to f ∈ F. Then, for any
element
\[
s_\lambda^* = \lambda s_1^* + (1 - \lambda) s_2^* \in [s_1^*, s_2^*] \subset S \qquad \text{for } \lambda \in [0, 1] \tag{3.11}
\]
we have
\[
\|s_\lambda^* - f\| = \| (\lambda s_1^* + (1 - \lambda) s_2^*) - (\lambda + (1 - \lambda)) f \|
= \| \lambda (s_1^* - f) + (1 - \lambda)(s_2^* - f) \|
\le \lambda \|s_1^* - f\| + (1 - \lambda) \|s_2^* - f\|
= \lambda \inf_{s \in S} \|s - f\| + (1 - \lambda) \inf_{s \in S} \|s - f\|
= \inf_{s \in S} \|s - f\|,
\]
i.e., s∗λ = λs∗₁ + (1 − λ)s∗₂ ∈ [s∗₁, s∗₂], for λ ∈ [0, 1], lies in S∗. □

[Fig. 3.3: a non-convex set S, a point f, two best approximations s∗₁, s∗₂ ∈ S, and a point s∗ ∈ S on the segment between them that lies closer to f.]

Fig. 3.3. S is not convex and for s∗ ∈ S we have ‖s∗ − f‖ < η(f, S), cf. Remark 3.17.

We continue with the following remarks concerning Theorem 3.16.


Remark 3.17. If, in the situation of Theorem 3.16, S ⊂ F is not convex,
then any element s∗ ∈ [s∗₁, s∗₂] is still at least as close to f ∈ F as s∗₁ and s∗₂ are,
i.e.,
\[
\|s^* - f\| \le \eta \equiv \eta(f, S) \qquad \text{for all } s^* \in [s_1^*, s_2^*],
\]
but such an s∗ need not lie in S. Possibly, one s∗ ∈ [s∗₁, s∗₂] could even lie closer to f than s∗₁, s∗₂, so that
‖s∗ − f‖ < η, as the example in Figure 3.3 shows. □

Remark 3.18. For a convex subset S ⊂ F, and in the case of non-unique
best approximations, we can explain the situation as follows. If there are
at least two best approximations s∗₁ ≠ s∗₂ to f, then all s∗ ∈ [s∗₁, s∗₂] are
contained in the set of best approximations S∗, and so the distance between
f and elements in [s∗₁, s∗₂] is constant, i.e.,
\[
\|s^* - f\| = \eta(f, S) \qquad \text{for all } s^* \in [s_1^*, s_2^*].
\]


To further illustrate this, let us make one simple example.
Example 3.19. For S = {x ∈ R² | ‖x‖∞ ≤ 1} and f = (2, 0), the set S∗ of
best approximations to f from S with respect to the maximum norm ‖·‖∞
is given by
\[
S^* = \{ (1, \alpha) \in \mathbb{R}^2 \mid \alpha \in [-1, 1] \} \subset S
\]
with the minimal distance
\[
\eta(f, S) = \inf_{s \in S} \|s - f\|_\infty = 1.
\]
For s∗₁, s∗₂ ∈ S∗ every element s∗ ∈ [s∗₁, s∗₂] lies in S∗ (see Figure 3.4). ♦

[Fig. 3.4: the unit square S = {x ∈ R² | ‖x‖∞ ≤ 1}, the point f = (2, 0), and the segment S∗ on the right edge of S.]

Fig. 3.4. S∗ = {(1, α) ∈ R² | α ∈ [−1, 1]} is the set of best approximations to
f = (2, 0) from S = {x ∈ R² | ‖x‖∞ ≤ 1} with respect to ‖·‖∞ (see Example 3.19).

Next, we recall the definition for (strictly) convex functions.


Definition 3.20. A function f : [a, b] −→ R is called convex on an interval
[a, b] ⊂ R, if for all x, y ∈ [a, b] the inequality

f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) for all λ ∈ [0, 1]

holds; f is said to be strictly convex on [a, b], if for all x, y ∈ [a, b], x ≠ y,
we have

f (λx + (1 − λ)y) < λf (x) + (1 − λ)f (y) for all λ ∈ (0, 1).

An important property of convex functions is described by the Jensen5


inequality [39], whereby the value of a convex function, when evaluated at a
finite convex combination of arguments, is bounded above by the correspond-
ing convex combination of function values at these arguments.
Theorem 3.21. (Jensen inequality, 1906).
Let f : [a, b] −→ R be a convex function, and {x₁, . . . , xₙ} ⊂ [a, b] a set of
n ≥ 2 points. Then, the Jensen inequality
\[
f\left( \sum_{j=1}^{n} \lambda_j x_j \right) \le \sum_{j=1}^{n} \lambda_j f(x_j)
\qquad \text{for all } \lambda_j \in (0, 1) \text{ with } \sum_{j=1}^{n} \lambda_j = 1
\]
holds. If f is strictly convex, then equality holds, if and only if all points
coincide, i.e., x₁ = . . . = xₙ.
Proof. We prove the statement of Jensen's inequality by induction on n.
Initial step: For n = 2, the statement of Jensen's inequality is obviously true.
Induction hypothesis: Assume the statement holds for n points {x₁, . . . , xₙ}.
Induction step (n −→ n + 1): For n + 1 points {x₁, . . . , xₙ, xₙ₊₁} ⊂ [a, b] and
\[
\lambda_1, \dots, \lambda_n, \lambda_{n+1} \in (0, 1) \quad \text{with} \quad \sum_{j=1}^{n} \lambda_j = 1 - \lambda_{n+1}
\]
we have
\[
f\left( \sum_{j=1}^{n+1} \lambda_j x_j \right)
= f\left( (1 - \lambda_{n+1}) \sum_{j=1}^{n} \frac{\lambda_j}{1 - \lambda_{n+1}} x_j + \lambda_{n+1} x_{n+1} \right)
\le (1 - \lambda_{n+1}) f\left( \sum_{j=1}^{n} \frac{\lambda_j}{1 - \lambda_{n+1}} x_j \right) + \lambda_{n+1} f(x_{n+1})
\]

5
Johan Ludwig Jensen (1859-1925), Danish mathematician

by the convexity of f. By the induction hypothesis, we can further conclude
\[
f\left( \sum_{j=1}^{n} \frac{\lambda_j}{1 - \lambda_{n+1}} x_j \right)
\le \sum_{j=1}^{n} \frac{\lambda_j}{1 - \lambda_{n+1}} f(x_j)
= \frac{1}{1 - \lambda_{n+1}} \sum_{j=1}^{n} \lambda_j f(x_j) \tag{3.12}
\]
and thus, altogether,
\[
f\left( \sum_{j=1}^{n+1} \lambda_j x_j \right) \le \sum_{j=1}^{n+1} \lambda_j f(x_j). \tag{3.13}
\]
If f is strictly convex, then equality holds in (3.12) only for x₁ = . . . = xₙ
(by induction hypothesis), and, moreover, equality in (3.13) holds only for
\[
x_{n+1} = \sum_{j=1}^{n} \frac{\lambda_j}{1 - \lambda_{n+1}} x_j,
\]
thus altogether only for x₁ = . . . = xₙ = xₙ₊₁. □
thus altogether only for x1 = . . . = xn = xn+1 . 


Next, we introduce the convexity for functionals.
Definition 3.22. A functional ϕ : F −→ R is said to be convex on F, if
for all u, v ∈ F the inequality

ϕ(λu + (1 − λ)v) ≤ λϕ(u) + (1 − λ)ϕ(v) for all λ ∈ [0, 1] (3.14)

holds.
Remark 3.23. Every norm ‖·‖ : F −→ [0, ∞) is a convex functional on F.
Indeed, for any u, v ∈ F we find the inequality
\[
\|\lambda u + (1 - \lambda) v\| \le \lambda \|u\| + (1 - \lambda) \|v\| \qquad \text{for all } \lambda \in [0, 1] \tag{3.15}
\]
due to the triangle inequality and the homogeneity of ‖·‖. Moreover, equality
in (3.15) holds for all pairs of linearly dependent elements u, v ∈ F with
u = αv for positive scalar α > 0, i.e., we have
\[
\|\lambda \alpha v + (1 - \lambda) v\| = \lambda \|\alpha v\| + (1 - \lambda) \|v\| \qquad \text{for all } \lambda \in [0, 1] \tag{3.16}
\]
by the homogeneity of ‖·‖. □
We introduce the notion of a strictly convex norm classically as follows.
Definition 3.24. A norm ‖·‖ is called strictly convex on F, if the unit
ball B = {u ∈ F | ‖u‖ ≤ 1} ⊂ F is strictly convex.
As we will show, not every norm is strictly convex. But before we do
so, our ”classical” introduction for strictly convex norms in Definition 3.24
deserves a comment.

Remark 3.25. Had we introduced the strict convexity of ϕ : F −→ R in


Definition 3.22 in a straightforward manner through the inequality

ϕ(λu + (1 − λ)v) < λϕ(u) + (1 − λ)ϕ(v) for all λ ∈ (0, 1), (3.17)

then no norm would be strictly convex in this particular sense! This important
observation is verified by the counterexample in (3.16). 
When working with strictly convex norms  ·  (according to Defini-
tion 3.24), we can exclude non-uniqueness of best approximations, if S ⊂ F
is convex. To explain this, we need to further analyze strictly convex norms.
To this end, we first prove the following useful characterization.
Theorem 3.26. Let F be a linear space with norm ‖·‖. Then the following
statements are equivalent.
(a) The norm ‖·‖ is strictly convex.
(b) The unit ball B = {u ∈ F | ‖u‖ ≤ 1} ⊂ F is strictly convex.
(c) The inequality ‖u + v‖ < 2 holds for all u ≠ v with ‖u‖ = ‖v‖ = 1.
(d) The equality ‖u + v‖ = ‖u‖ + ‖v‖, v ≠ 0, implies u = αv for some α ≥ 0.
Proof. Note that the equivalence (a) ⇔ (b) holds by Definition 3.24.
(b) ⇒ (c): The strict convexity of B implies ‖(u + v)/2‖ < 1 for u ≠ v with
‖u‖ = ‖v‖ = 1, and so in this case we have ‖u + v‖ < 2.
(c) ⇒ (d): For u = 0 statement (d) holds with α = 0. Now suppose u, v ∈
F \ {0} satisfy ‖u + v‖ = ‖u‖ + ‖v‖. Without loss of generality, we may
assume ‖u‖ ≤ ‖v‖ (otherwise we swap u and v). In this case, in the sequence
of inequalities
\[
2 \ge \left\| \frac{u}{\|u\|} + \frac{v}{\|v\|} \right\|
= \left\| \frac{u + v}{\|u\|} - \left( \frac{v}{\|u\|} - \frac{v}{\|v\|} \right) \right\|
\ge \frac{\|u + v\|}{\|u\|} - \left\| \left( \frac{1}{\|u\|} - \frac{1}{\|v\|} \right) v \right\|
= \frac{\|u\| + \|v\|}{\|u\|} - \left( \frac{1}{\|u\|} - \frac{1}{\|v\|} \right) \|v\| = 2
\]
equality holds everywhere, in particular
\[
\left\| \frac{u}{\|u\|} + \frac{v}{\|v\|} \right\| = 2.
\]
From (c) we can conclude u/‖u‖ = v/‖v‖ and therefore
\[
u = \alpha v \qquad \text{for } \alpha = \frac{\|u\|}{\|v\|} > 0.
\]
(d) ⇒ (b): Suppose u, v ∈ B, u ≠ v, i.e., ‖u‖ ≤ 1 and ‖v‖ ≤ 1. Then we find
for any λ ∈ (0, 1) the inequality
\[
\|\lambda u + (1 - \lambda) v\| \le \lambda \|u\| + (1 - \lambda) \|v\| < 1,
\]
provided that ‖u‖ < 1 or ‖v‖ < 1. Otherwise, i.e., if ‖u‖ = ‖v‖ = 1, we have
\[
\|\lambda u\| + \|(1 - \lambda) v\| = \lambda \|u\| + (1 - \lambda) \|v\| = 1 \qquad \text{for } \lambda \in (0, 1).
\]
If ‖λu + (1 − λ)v‖ = 1, then we have λu = α(1 − λ)v for one α > 0 from (d).
Therefore, we have u = v, since ‖u‖ = ‖v‖. This, however, is in contradiction
to the assumption u ≠ v. Therefore, we have, also for this case,
\[
\|\lambda u + (1 - \lambda) v\| < 1 \qquad \text{for all } \lambda \in (0, 1). \qquad\square
\]
Next, we make explicit examples of strictly convex norms. A first simple


example is the absolute value | · |, taken as a norm on R.

Remark 3.27. The absolute value |·| is a strictly convex norm on R. Indeed,
in the equivalence (c) of Theorem 3.26 we can only use the two points u = −1
and v = 1, where we have |u + v| = 0 < 2. But note that the absolute value,
when regarded as a function | · | : R −→ R is not strictly convex on R. 

Further examples are Euclidean norms.

Theorem 3.28. Every Euclidean norm is strictly convex.

Proof. Let F be a linear space with Euclidean norm ‖·‖ = (·, ·)^{1/2}. By
Theorem 3.9, the parallelogram identity (3.1) holds in F, and so
\[
\left\| \frac{u + v}{2} \right\|^2 + \left\| \frac{u - v}{2} \right\|^2 = \frac{\|u\|^2}{2} + \frac{\|v\|^2}{2} \qquad \text{for all } u, v \in \mathcal{F}.
\]
For u, v ∈ F, u ≠ v, with ‖u‖ = ‖v‖ we thus have
\[
\left\| \frac{u + v}{2} \right\|^2 < \|u\|^2 = \|v\|^2,
\]
or, ‖u + v‖ < 2 for ‖u‖ = ‖v‖ = 1.
By statement (c) in Theorem 3.26, we see that ‖·‖ is strictly convex. □

Next, we regard the linear space of all bounded sequences,
\[
\ell^\infty \equiv \ell^\infty(\mathbb{R}) = \left\{ x = (x_k)_{k \in \mathbb{N}} \subset \mathbb{R} \;\Big|\; \sup_{k \in \mathbb{N}} |x_k| < \infty \right\},
\]
equipped with the ℓ^∞-norm
\[
\|x\|_\infty := \sup_{k \in \mathbb{N}} |x_k| \qquad \text{for } x = (x_k)_{k \in \mathbb{N}} \in \ell^\infty.
\]

Moreover, we regard for 1 ≤ p < ∞ the linear subspaces
\[
\ell^p \equiv \ell^p(\mathbb{R}) = \left\{ x = (x_k)_{k \in \mathbb{N}} \subset \mathbb{R} \;\Big|\; \sum_{k=1}^{\infty} |x_k|^p < \infty \right\} \subset \ell^\infty, \tag{3.18}
\]
equipped with the ℓ^p-norm
\[
\|x\|_p := \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{1/p} \qquad \text{for } x = (x_k)_{k \in \mathbb{N}} \in \ell^p.
\]
To further analyze the ℓ^p-norms we prove the Hölder⁶ inequality.

Theorem 3.29. (Hölder inequality, 1889).
Let 1 < p, q < ∞ satisfy 1/p + 1/q = 1. Then, the Hölder inequality
\[
\|x y\|_1 \le \|x\|_p \, \|y\|_q \qquad \text{for all } x \in \ell^p,\ y \in \ell^q, \tag{3.19}
\]
holds, with equality in (3.19), if and only if either x = 0 or y = 0 or
\[
|x_k|^{p-1} = \alpha \, |y_k| \qquad \text{with } \alpha = \frac{\|x\|_p^{p-1}}{\|y\|_q} > 0 \quad \text{for } y \ne 0. \tag{3.20}
\]

Proof. For 1 < p, q < ∞ with 1/p + 1/q = 1 let
\[
x = (x_k)_{k \in \mathbb{N}} \in \ell^p \qquad \text{and} \qquad y = (y_k)_{k \in \mathbb{N}} \in \ell^q.
\]
For x = 0 or y = 0 the Hölder inequality (3.19) is trivial. Now suppose
x, y ≠ 0. Then, we find for k ∈ N the estimate
\[
- \log \left( \frac{1}{p} \frac{|x_k|^p}{\|x\|_p^p} + \frac{1}{q} \frac{|y_k|^q}{\|y\|_q^q} \right)
\le - \frac{1}{p} \log \left( \frac{|x_k|^p}{\|x\|_p^p} \right) - \frac{1}{q} \log \left( \frac{|y_k|^q}{\|y\|_q^q} \right) \tag{3.21}
\]
by the Jensen inequality, Theorem 3.21, here applied to the strictly convex
function − log : (0, ∞) −→ R. This yields the Young⁷ inequality
\[
\frac{|x_k y_k|}{\|x\|_p \|y\|_q}
= \left( \frac{|x_k|^p}{\|x\|_p^p} \right)^{1/p} \left( \frac{|y_k|^q}{\|y\|_q^q} \right)^{1/q}
\le \frac{1}{p} \frac{|x_k|^p}{\|x\|_p^p} + \frac{1}{q} \frac{|y_k|^q}{\|y\|_q^q}. \tag{3.22}
\]
Moreover, by Theorem 3.21, we have equality in (3.21), and therefore equality
in (3.22), if and only if
\[
\frac{|x_k|^p}{\|x\|_p^p} = \frac{|y_k|^q}{\|y\|_q^q}. \tag{3.23}
\]
By q = p/(p − 1) we see that (3.23) is equivalent to
6
Hölder, Otto (1859-1937), German mathematician
7
William Henry Young (1863-1942), English mathematician
\[
\frac{|x_k|}{\|x\|_p} = \left( \frac{|y_k|}{\|y\|_q} \right)^{1/(p-1)}. \tag{3.24}
\]
Therefore, we have equality in (3.22), if and only if (3.20) holds. Summing
up both sides in the Young inequality (3.22) over k, we find
\[
\sum_{k=1}^{\infty} \frac{|x_k y_k|}{\|x\|_p \|y\|_q}
\le \frac{1}{p} \sum_{k=1}^{\infty} \frac{|x_k|^p}{\|x\|_p^p} + \frac{1}{q} \sum_{k=1}^{\infty} \frac{|y_k|^q}{\|y\|_q^q}
= \frac{1}{p} + \frac{1}{q} = 1,
\]
and this already proves the Hölder inequality (3.19), with equality, if and
only if (3.20) holds for all k ∈ N. □
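A quick numerical illustration of the Hölder inequality (3.19) and its equality case (3.20) is given by the following Python sketch (assuming NumPy; finite vectors are used in place of sequences, which is only an illustration).

```python
import numpy as np

p = 3.0
q = p / (p - 1.0)                       # conjugate exponent, 1/p + 1/q = 1
rng = np.random.default_rng(4)
x = rng.standard_normal(10)
y = rng.standard_normal(10)

lhs = np.sum(np.abs(x * y))                                        # ||xy||_1
rhs = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y)**q)**(1/q)    # ||x||_p ||y||_q
print(lhs <= rhs + 1e-12)               # True: Hölder inequality

# equality case (3.20): choose |y_k| proportional to |x_k|^{p-1}
y_eq = np.sign(x) * np.abs(x)**(p - 1.0)
lhs = np.sum(np.abs(x * y_eq))
rhs = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y_eq)**q)**(1/q)
print(np.isclose(lhs, rhs))             # True
```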

Now we can show the strict convexity of the ℓ^p-norms, for 1 < p < ∞.

Theorem 3.30. For 1 < p < ∞, the ℓ^p-norm ‖·‖_p on ℓ^p is strictly convex.

Proof. For 1 < p < ∞, let 1 < q < ∞ be the conjugate Hölder exponent of p
satisfying 1/p + 1/q = 1.
For
\[
x = (x_k)_{k \in \mathbb{N}} \quad \text{and} \quad y = (y_k)_{k \in \mathbb{N}} \in \ell^p,
\]
where x ≠ y and ‖x‖_p = ‖y‖_p = 1, we wish to prove the inequality
\[
\|x + y\|_p < 2, \tag{3.25}
\]
in which case the norm ‖·‖_p would be strictly convex by the equivalence
statement (c) in Theorem 3.26.
For s_k := |x_k + y_k|^{p−1} and s := (s_k)_{k∈N} ∈ ℓ^q we have
\[
\|x + y\|_p^p = \sum_{k=1}^{\infty} |x_k + y_k|\,|s_k|
\le \sum_{k=1}^{\infty} \bigl( |x_k|\,|s_k| + |y_k|\,|s_k| \bigr) \tag{3.26}
\]
\[
\le \|x\|_p \|s\|_q + \|y\|_p \|s\|_q, \tag{3.27}
\]
where we applied the Hölder inequality (3.19) in (3.27) twice.
By p = (p − 1)q, we have
\[
\|s\|_q = \left( \sum_{k=1}^{\infty} |x_k + y_k|^{(p-1)q} \right)^{1/q}
= \left( \sum_{k=1}^{\infty} |x_k + y_k|^{p} \right)^{\frac{1}{p} \cdot \frac{p}{q}}
= \|x + y\|_p^{p-1}
\]
and this implies, in combination with (3.27), the Minkowski⁸ inequality


8
Hermann Minkowski (1864-1909), German mathematician and physicist

x + yp ≤ xp + yp for all x, y ∈ p


,

in particular,
x + yp ≤ 2 for xp = yp = 1.
If x + yp = 2 for xp = yp = 1, then we have equality in both (3.26)
and (3.27). But equality in (3.27) is by (3.20) equivalent to the two conditions
1
|xk |p−1 = α|sk | and |yk |p−1 = α|sk | with α = ,
sq

which implies
|xk | = |yk | for all k ∈ N.
In this case, we have equality in (3.26), if and only if sgn(xk ) = sgn(yk ), for
all k ∈ N, i.e., equality in (3.26) and (3.27) implies x = y.
Therefore, the inequality (3.25) holds for all x = y with xp = yp = 1.


Remark 3.31. Theorem 3.30 can be generalized to L^p-norms,
\[
\|u\|_p := \left( \int_{\mathbb{R}^d} |u(x)|^p \, dx \right)^{1/p} \qquad \text{for } u \in L^p,
\]
for 1 < p < ∞, where L^p ≡ L^p(R^d) is the linear space of all functions
whose p-th power is Lebesgue⁹ integrable. Indeed, in this case (in analogy to
Theorem 3.29) the Hölder inequality
\[
\|uv\|_1 \le \|u\|_p \|v\|_q \qquad \text{for all } u \in L^p,\ v \in L^q
\]
holds for 1 < p, q < ∞ satisfying 1/p + 1/q = 1. This implies (as in the proof
of Theorem 3.30) the Minkowski inequality
\[
\|u + v\|_p \le \|u\|_p + \|v\|_p \qquad \text{for all } u, v \in L^p,
\]
where for 1 < p < ∞ we have equality, if and only if u = αv for some α ≥ 0
(see [35, Theorem 12.6]). Therefore, the L^p-norm ‖·‖_p, for 1 < p < ∞, is
by equivalence statement (d) in Theorem 3.26 strictly convex. □

We can conclude our statement from Remark 3.31 as follows.

Theorem 3.32. For 1 < p < ∞, the Lp -norm  · p on Lp is strictly convex.




But there are norms that are not strictly convex. Here are two examples.
9
Henri Léon Lebesgue (1875-1941), French mathematician

Example 3.33. The ℓ¹-norm ‖·‖₁ on ℓ¹ in (3.18), defined as
\[
\|x\|_1 = \sum_{k=1}^{\infty} |x_k| \qquad \text{for } x = (x_k)_{k \in \mathbb{N}} \in \ell^1,
\]
is not strictly convex, since the unit ball B₁ = {x ∈ ℓ¹ | ‖x‖₁ ≤ 1} ⊂ ℓ¹ is not
strictly convex. Indeed, for any pair of two unit vectors e_j, e_k ∈ ℓ¹, j ≠ k, we
have ‖e_j‖₁ = ‖e_k‖₁ = 1 and
\[
\|\lambda e_j + (1 - \lambda) e_k\|_1 = \lambda + (1 - \lambda) = 1 \qquad \text{for all } \lambda \in (0, 1).
\]
Thus, by Theorem 3.26, statement (b), the ℓ¹-norm ‖·‖₁ is not strictly convex.
Likewise, we show that for the linear space ℓ^∞ of all bounded sequences
the ℓ^∞-norm ‖·‖∞, defined as
\[
\|x\|_\infty = \sup_{k \in \mathbb{N}} |x_k| \qquad \text{for } x = (x_k)_{k \in \mathbb{N}} \in \ell^\infty,
\]
is not strictly convex. This is because for any e_k ∈ ℓ^∞, k ∈ N, and the
constant sequence 1 = (1)_{k∈N} ∈ ℓ^∞, we have ‖e_k‖∞ = ‖1‖∞ = 1 and
\[
\|\lambda e_k + (1 - \lambda) \mathbf{1}\|_\infty = 1 \qquad \text{for all } \lambda \in (0, 1).
\]

Example 3.34. For the linear space C([0, 1]^d) of all continuous functions
on the unit cube [0, 1]^d ⊂ R^d, the maximum norm ‖·‖∞, defined as
\[
\|u\|_\infty = \max_{x \in [0,1]^d} |u(x)| \qquad \text{for } u \in C([0,1]^d),
\]
is not strictly convex. To see this we take a continuous function u₁ ∈ C([0, 1]^d)
satisfying ‖u₁‖∞ = 1 and another continuous function u₂ ∈ C([0, 1]^d) satisfying
‖u₂‖∞ = 1, so that |u₁| and |u₂| attain their maximum on [0, 1]^d at one and the
same point x∗ ∈ [0, 1]^d, i.e.,
\[
\|u_1\|_\infty = \max_{x \in [0,1]^d} |u_1(x)| = |u_1(x^*)| = |u_2(x^*)| = \max_{x \in [0,1]^d} |u_2(x)| = \|u_2\|_\infty = 1.
\]
This then implies for u_λ = λu₁ + (1 − λ)u₂ ∈ (u₁, u₂), with λ ∈ (0, 1),
\[
|u_\lambda(x)| \le \lambda |u_1(x)| + (1 - \lambda) |u_2(x)| \le 1 \qquad \text{for all } x \in [0,1]^d,
\]
where equality holds for x = x∗, whereby ‖u_λ‖∞ = 1 for all λ ∈ (0, 1).
In this case, the unit ball B = {u ∈ C([0, 1]^d) | ‖u‖∞ ≤ 1} is not strictly
convex, i.e., ‖·‖∞ is not strictly convex by statement (b) in Theorem 3.26.
To make an explicit example for the required functions u₁ and u₂, we take
the geometric mean u_g ∈ C([0, 1]^d) and the arithmetic mean u_a ∈ C([0, 1]^d),
\[
u_g(x) = \sqrt[d]{x_1 \cdot \ldots \cdot x_d} \;\le\; \frac{x_1 + \ldots + x_d}{d} = u_a(x),
\]
for x = (x₁, . . . , x_d) ∈ [0, 1]^d. Obviously, we have ‖u_g‖∞ = ‖u_a‖∞ = 1, where
u_g and u_a attain their unique maximum on [0, 1]^d at 1 = (1, . . . , 1) ∈ [0, 1]^d.

Now we consider the Euclidean space R^d, for d ∈ N, as a linear subspace of
the sequence space ℓ^p, 1 ≤ p ≤ ∞, via the canonical embedding i : R^d → ℓ^p,
\[
x = (x_1, \dots, x_d)^T \in \mathbb{R}^d \;\longmapsto\; i(x) = (x_1, \dots, x_d, 0, \dots, 0, \dots) \in \ell^p.
\]
This allows us to formulate the following statements concerning the strict
convexity of the ℓ^p-norms ‖·‖_p on R^d.

Corollary 3.35. For the ℓ^p-norms ‖·‖_p on R^d, defined as
\[
\|x\|_p^p = \sum_{k=1}^{d} |x_k|^p \quad \text{for } 1 \le p < \infty
\qquad\text{and}\qquad
\|x\|_\infty = \max_{1 \le k \le d} |x_k|,
\]
the following statements are true.
(a) For 1 < p < ∞, the ℓ^p-norms ‖·‖_p are strictly convex on R^d.
(b) For d > 1, the ℓ¹-norm ‖·‖₁ is not strictly convex on R^d.
(c) For d > 1, the ℓ^∞-norm ‖·‖∞ is not strictly convex on R^d. □

Remark 3.36. In statements (b), (c) of Corollary 3.35, we excluded the case
d = 1, since in this univariate setting the norms  · 1 and  · ∞ coincide
with the strictly convex norm | · | on R (see Remark 3.27). 

Now we formulate the main result of this section.

Theorem 3.37. Let F be a linear space, equipped with a strictly convex norm
‖·‖. Moreover, assume S ⊂ F is convex and f ∈ F. If there exists a best
approximation s∗ ∈ S to f, then s∗ is unique.

Proof. Suppose s∗₁, s∗₂ ∈ S are two different best approximations to f from
S, i.e., s∗₁ ≠ s∗₂. Then we have
\[
\|s_1^* - f\| = \|s_2^* - f\| = \inf_{s \in S} \|s - f\|,
\]
which, in combination with the strict convexity of the norm ‖·‖, implies
\[
\left\| \frac{s_1^* + s_2^*}{2} - f \right\|
= \left\| \frac{(s_1^* - f) + (s_2^* - f)}{2} \right\|
< \|s_1^* - f\| = \|s_2^* - f\|. \tag{3.28}
\]
Due to the assumed convexity of S, the element s∗ = (s∗₁ + s∗₂)/2 lies in S.
Moreover, s∗ is closer to f than s∗₁ and s∗₂, by (3.28). But this is in contra-
diction to the optimality of s∗₁ and s∗₂. □

We remark that the strict convexity of the norm · gives, in combination
with the convexity of S ⊂ F, only a sufficient condition for the uniqueness
of the best approximation. Now we show that this condition is not necessary.
To this end, we make a simple example.

Example 3.38. We regard the maximum norm ‖·‖∞ on F = R². Moreover,
we let f = (0, 1) ∈ R² and S = {(α, α) | α ∈ R} ⊂ R². Then, s∗ = (1/2, 1/2) ∈ S
is the unique best approximation to f from S with respect to ‖·‖∞, although
‖·‖∞ is not strictly convex (see Figure 3.5). ♦

[Fig. 3.5: the line S = {(α, α) | α ∈ R}, the point f = (0, 1), and the unique best approximation s∗ = (1/2, 1/2).]

Fig. 3.5. s∗ = (1/2, 1/2) ∈ S = {(α, α) | α ∈ R} is the unique best approximation to
f = (0, 1) w.r.t. ‖·‖∞, although ‖·‖∞ is not strictly convex (see Example 3.38).

We finally summarize our discussion concerning the uniqueness of best ap-


proximations, where we note three immediate conclusions from Theorem 3.37.

Corollary 3.39. Let F be a Euclidean space and S ⊂ F be convex. Then


there is for any f ∈ F at most one best approximation s∗ ∈ S to f . 

Corollary 3.40. Let S ⊂ Lp be convex for 1 < p < ∞. Then there is for
any f ∈ Lp at most one best approximation s∗ ∈ S to f w.r.t.  · p . 

Corollary 3.41. Let S ⊂ p be convex for 1 < p < ∞. Then there is for any
f ∈ p at most one best approximation s∗ ∈ S to f w.r.t.  · p . 

We finally formulate an important result concerning the approximation


of continuous functions from C [−1, 1] by approximation spaces S ⊂ C [−1, 1]
that are invariant under reflections of the argument (in short: reflection-
invariant), i.e., for any s(x) ∈ S the function s(−x) lies also in S. For exam-
ple, the linear space Pn of all algebraic polynomials of degree at most n ∈ N0
is reflection-invariant. For the following observation, the uniqueness of the
best approximation plays an important role.
Proposition 3.42. Let f ∈ C[−1, 1] be an even function and, moreover, let
S ⊂ C[−1, 1] be a reflection-invariant subset of C[−1, 1]. If there exists a
unique best approximation s∗_p ∈ S to f with respect to the L^p-norm ‖·‖_p, for
1 ≤ p ≤ ∞, then s∗_p is an even function.
Proof. Let f ∈ C[−1, 1] be an even function, i.e., f(x) = f(−x) for all
x ∈ [−1, 1]. Moreover, let s∗_p ∈ S be the unique best approximation to f with
respect to ‖·‖_p, for 1 ≤ p ≤ ∞. We regard the reflected function r∗_p of s∗_p,
defined as
\[
r_p^*(x) = s_p^*(-x) \qquad \text{for } x \in [-1, 1].
\]
By our assumption, r∗_p lies in S.
Case p = ∞: For the distance between r∗_∞ and f with respect to ‖·‖∞,
\[
\|r_\infty^* - f\|_\infty = \max_{x \in [-1,1]} |r_\infty^*(x) - f(x)| = \max_{x \in [-1,1]} |r_\infty^*(-x) - f(-x)|
= \max_{x \in [-1,1]} |s_\infty^*(x) - f(x)| = \|s_\infty^* - f\|_\infty = \eta_\infty(f, S),
\]
we obtain the minimal distance between f and S with respect to ‖·‖∞,
i.e., r∗_∞ ∈ S is a best approximation to f. Now the uniqueness of the best
approximation implies our statement s∗_∞(x) = r∗_∞(x), or, s∗_∞(x) = s∗_∞(−x)
for all x ∈ [−1, 1], i.e., s∗_∞ is an even function on [−1, 1].
Case 1 ≤ p < ∞: In this case, we regard the distance between r∗_p and f
in the L^p-norm,
\[
\|r_p^* - f\|_p^p = \int_{-1}^{1} |r_p^*(x) - f(x)|^p \, dx = \int_{-1}^{1} |r_p^*(-x) - f(-x)|^p \, dx
= \int_{-1}^{1} |s_p^*(x) - f(x)|^p \, dx = \|s_p^* - f\|_p^p = \eta_p^p(f, S),
\]
whereby we get the minimal distance η_p(f, S) between f and S with respect
to ‖·‖_p. Again, by the uniqueness of the best approximation, we obtain the
stated result
\[
s_p^*(x) = r_p^*(x) = s_p^*(-x) \qquad \text{for all } x \in [-1, 1]. \qquad\square
\]
For an alternative proof of Proposition 3.42, we refer to Exercise 3.74.

3.3 Dual Characterization


In this section and in the following section, we develop necessary and suf-
ficient conditions to characterize best approximations. We begin with dual
characterizations. To this end, let F be a normed linear space. We introduce
the topological dual, or, the dual space of F as usual by
\[
\mathcal{F}' = \{ \varphi : \mathcal{F} \longrightarrow \mathbb{R} \mid \varphi \text{ linear and continuous} \}.
\]
The elements from the linear space F′ are called dual functionals. On
this occasion, we recall the notions of linearity, continuity and boundedness
of functionals. We start with linearity.

Definition 3.43. A functional ϕ : F −→ R is called linear on F, if

ϕ(αu + βv) = αϕ(u) + βϕ(v) for all u, v ∈ F and all α, β ∈ R.

We had introduced continuity already in Definition 3.3. Next, we turn


to the boundedness of functionals. In the following discussion, F denotes a
linear space with norm  · .

Definition 3.44. A functional ϕ : F −→ R is said to be bounded on F, if
there exists a constant C ≡ Cϕ > 0 satisfying
\[
|\varphi(u)| \le C \|u\| \qquad \text{for all } u \in \mathcal{F}. \tag{3.29}
\]
We call such a constant C an upper bound for ϕ.

Now we can introduce a norm for the dual space F′, by using the norm
‖·‖ of F. To this end, we take for any functional ϕ ∈ F′ the smallest upper
bound C ≡ Cϕ in (3.29). To be more precise, we define by
\[
\|\varphi\| = \sup_{\substack{u \in \mathcal{F} \\ u \ne 0}} \frac{|\varphi(u)|}{\|u\|} = \sup_{\substack{u \in \mathcal{F} \\ \|u\| = 1}} |\varphi(u)|
\]
a mapping ‖·‖ : F′ −→ R. As can be verified by elementary calculations,
‖·‖ is a norm on F′, according to Definition 1.1. In other words, the dual
space F′ is a linear space with norm ‖·‖.
The following result for linear functionals is quite important.
The following result for linear functionals is quite important.

Theorem 3.45. For a linear functional ϕ : F −→ R, the following state-


ments are equivalent.
(a) ϕ is continuous at one u0 ∈ F.
(b) ϕ is continuous on F.
(c) ϕ is bounded on F.

Proof. (a) ⇒ (b): Let ϕ be continuous at u0 ∈ F, and, moreover, let (un )n∈N
be a convergent sequence in F with limit u ∈ F. Then we have

ϕ(un ) = ϕ(un −u+u0 )+ϕ(u−u0 ) −→ ϕ(u0 )+ϕ(u−u0 ) = ϕ(u) for n → ∞.

Therefore, ϕ is continuous at every u ∈ F, i.e., ϕ is continuous on F.


The implication (b) ⇒ (a) is trivial, so the equivalence (a) ⇔ (b) is shown.
(c) ⇒ (b): Let ϕ be bounded on F, i.e., we have (3.29) for some C > 0. This
implies ϕ(un ) −→ 0, n → ∞, for every zero sequence (un )n∈N in F, and so
ϕ is continuous at zero. By the equivalence (a) ⇔ (b) is ϕ continuous on F.
(b) ⇒ (c): Let ϕ be continuous on F. Suppose ϕ is not bounded on F. Then,
there is a sequence (u_n)_{n∈N} in F satisfying
\[
\|u_n\| = 1 \quad\text{and}\quad |\varphi(u_n)| > n \qquad \text{for all } n \in \mathbb{N},
\]
since otherwise there would exist an upper bound N ∈ N for ϕ (i.e., ϕ would
be bounded). In this case, the sequence (v_n)_{n∈N}, defined as
\[
v_n = \frac{u_n}{|\varphi(u_n)|} \qquad \text{for } n \in \mathbb{N},
\]
is a zero sequence in F, by
\[
\|v_n\| = \frac{1}{|\varphi(u_n)|} \longrightarrow 0 \qquad \text{for } n \to \infty,
\]
and so, by continuity of ϕ, we have
\[
\varphi(v_n) \longrightarrow \varphi(0) = 0 \qquad \text{for } n \to \infty.
\]
But this is in contradiction to |ϕ(v_n)| = 1 for all n ∈ N. □


Now we are in a position where we can formulate a sufficient condition
for the dual characterization of best approximations.
Theorem 3.46. Let S ⊂ F be a non-empty subset of F. Moreover, let f ∈ F
and s∗ ∈ S. Suppose that ϕ ∈ F′ is a dual functional satisfying the following
properties.
(a) ‖ϕ‖ = 1.
(b) ϕ(s∗ − f) = ‖s∗ − f‖.
(c) ϕ(s − s∗) ≥ 0 for all s ∈ S.
Then s∗ is a best approximation to f.
Proof. For s ∈ S, we have ϕ(s − f) ≤ ‖s − f‖, due to (a). Moreover, we have
\[
\|s - f\| \ge \varphi(s - f) = \varphi(s - s^*) + \varphi(s^* - f) \ge \|s^* - f\|
\]
by (b) and (c). Therefore, s∗ is a best approximation to f. □



Note that the above characterization in Theorem 3.46 only requires S to


be non-empty. However, if we assume S ⊂ F to be convex, then we can show
that the sufficient condition in Theorem 3.46 is also necessary. To this end, we
need the following separation theorem for convex sets, which can be viewed
as a geometric implication from the well-known Hahn10 -Banach11 theorem
(see [33, Section 16.1]) which was proven by Mazur12 in [3].

Theorem 3.47. (Banach-Mazur separation theorem, 1933).


Let K1 , K2 ⊂ F be two non-empty, disjoint and convex subsets in a normed
linear space F. Moreover, suppose K1 is an open set. Then there exists a
separating functional ϕ ∈ F  for K1 and K2 , i.e., we have

ϕ(u1 ) < ϕ(u2 ) for all u1 ∈ K1 , u2 ∈ K2 .

On the Banach-Mazur separation theorem, we can formulate a sufficient


and necessary condition for the dual characterization of best approximations.

Theorem 3.48. Let S ⊂ F be a convex set in F. Moreover, suppose f ∈
F \ S. Then, s∗ ∈ S is a best approximation to f, if and only if there exists
a dual functional ϕ ∈ F′ satisfying the following properties.
(a) ‖ϕ‖ = 1.
(b) ϕ(s∗ − f) = ‖s∗ − f‖.
(c) ϕ(s − s∗) ≥ 0 for all s ∈ S.

Proof. Note that the sufficiency of the statement is covered by Theorem 3.46.
To prove the necessity, suppose that s∗ ∈ S is a best approximation to f.
Regard the open ball
\[
B_\eta(f) = \{ u \in \mathcal{F} \mid \|u - f\| < \|s^* - f\| \} \subset \mathcal{F}
\]
around f with radius η = ‖s∗ − f‖. Note that for K₁ = B_η(f) and K₂ = S
the assumptions of the Banach-Mazur separation theorem, Theorem 3.47, are
satisfied. Therefore, there is a separating functional ϕ ∈ F′ with
\[
\varphi(u) < \varphi(s) \qquad \text{for all } u \in B_\eta(f) \text{ and } s \in S. \tag{3.30}
\]
Now let (u_n)_{n∈N} ⊂ B_η(f) be a convergent sequence with limit element s∗,
i.e., u_n −→ s∗ for n → ∞. By the continuity of ϕ, this implies
\[
\varphi(u_n) \longrightarrow \varphi(s^*) = \inf_{s \in S} \varphi(s),
\]
10
Hans Hahn (1879-1934), Austrian mathematician and philosopher
11
Stefan Banach (1892-1945), Polish mathematician
12
Stanislaw Mazur (1905-1981), Polish mathematician

i.e., ϕ(s∗) ≤ ϕ(s) for all s ∈ S, and so ϕ has property (c).
To show properties (a) and (b), let v ∈ F with ‖v‖ < 1. Then, u = ηv + f
lies in B_η(f). With (3.30) and by the linearity of ϕ, we have
\[
\varphi(v) = \varphi\left( \frac{u - f}{\|s^* - f\|} \right) < \varphi\left( \frac{s^* - f}{\|s^* - f\|} \right).
\]
This implies
\[
\|\varphi\| = \sup_{\|v\| \le 1} |\varphi(v)| \le \varphi\left( \frac{s^* - f}{\|s^* - f\|} \right)
\]
and, moreover, by using the continuity of ϕ once more, we have
\[
\|\varphi\| = \frac{\varphi(s^* - f)}{\|s^* - f\|}
\quad\Longleftrightarrow\quad
\varphi(s^* - f) = \|\varphi\| \cdot \|s^* - f\|.
\]
If we finally normalize the length of ϕ ≠ 0, by scaling ϕ to unit norm, i.e.,
‖ϕ‖ = 1, then ϕ satisfies properties (a) and (b). □

3.4 Direct Characterization


In this section, we develop necessary and sufficient conditions for the mini-
mization of convex functionals. We then apply these conditions to norms to
obtain useful characterizations for best approximations. To this end, we work
with Gâteaux13 derivatives to compute directional derivatives for relevant
norms. In the following discussion, F denotes a linear space.

Definition 3.49. For a functional ϕ : F −→ R,
\[
\varphi_+(u, v) := \lim_{h \searrow 0} \frac{1}{h} \bigl( \varphi(u + hv) - \varphi(u) \bigr) \qquad \text{for } u, v \in \mathcal{F} \tag{3.31}
\]
is said to be the Gâteaux derivative of ϕ at u in direction v, provided that
the limit on the right hand side in (3.31) exists.

For convex functionals ϕ : F −→ R we can show that the limit on the


right hand side in (3.31) exists.

Theorem 3.50. Let ϕ : F −→ R be a convex functional. Then, the Gâteaux


derivative ϕ+ (u, v) exists for all u, v ∈ F. Moreover, the inequality

−ϕ+ (u, −v) ≤ ϕ+ (u, v) for all u, v ∈ F

holds.
13
René Gâteaux (1889-1914), French mathematician

Proof. Let ϕ : F −→ R be a convex functional. We show that for any u, v ∈ F


the difference quotient Du,v : (0, ∞) −→ R, defined as

1
Du,v (h) = (ϕ(u + hv) − ϕ(u)) for h > 0, (3.32)
h
is a monotonically increasing function in h > 0, which, moreover, is bounded
below. To verify the monotonicity, we regard the convex combination
h 2 − h1 h1
u + h1 v = u + (u + h2 v) for h2 > h1 > 0.
h2 h2
The convexity of ϕ then implies the inequality
h2 − h1 h1
ϕ(u + h1 v) ≤ ϕ(u) + ϕ(u + h2 v)
h2 h2
and, after elementary calculations, the monotonicity
1 1
Du,v (h1 ) = (ϕ(u + h1 v) − ϕ(u)) ≤ (ϕ(u + h2 v) − ϕ(u)) = Du,v (h2 ).
h1 h2
If we now form the convex combination
h2 h1
u= (u − h1 v) + (u + h2 v) for h1 , h2 > 0,
h 1 + h2 h1 + h2
we obtain, by using the convexity of ϕ, the inequality
h2 h1
ϕ(u) ≤ ϕ(u − h1 v) + ϕ(u + h2 v)
h1 + h2 h 1 + h2
and, after elementary calculations, we obtain the estimate
1
−Du,−v (h1 ) = − (ϕ(u − h1 v) − ϕ(u))
h1
1
≤ (ϕ(u + h2 v) − ϕ(u)) = Du,v (h2 ). (3.33)
h2
This implies that the monotonically increasing difference quotient Du,v is
bounded from below for all u, v ∈ F. In particular, Du,−v is a monotoni-
cally increasing function that is bounded from below. Therefore, the Gâteaux
derivatives ϕ+ (u, v) and ϕ+ (u, −v) exist. By (3.33), we finally have

1 1
− (ϕ(u − hv) − ϕ(u)) ≤ −ϕ+ (u, −v) ≤ ϕ+ (u, v) ≤ (ϕ(u + hv) − ϕ(u))
h h
for all h > 0, as stated. 

Now we note a few elementary properties of the Gâteaux derivative.



Theorem 3.51. Let ϕ : F −→ R be a convex functional. Then the Gâteaux


derivative ϕ+ of ϕ has for all u, v, w ∈ F the following properties.
(a) ϕ+ (u, αv) = αϕ+ (u, v) for all α ≥ 0.
(b) ϕ+ (u, v + w) ≤ ϕ+ (u, v) + ϕ+ (u, w).
(c) ϕ+ (u, ·) : F −→ R is a convex functional.

Proof. (a): The case α = 0 is trivial. For α > 0 we have


1
ϕ+ (u, αv) = lim (ϕ(u + hαv) − ϕ(u))
h 0 h
1
= α lim (ϕ(u + hαv) − ϕ(u)) = αϕ+ (u, v).
h 0 hα

(b): The representation


1 1
u + h(v + w) = (u + 2hv) + (u + 2hw),
2 2
in combination with the convexity of ϕ, implies
1
ϕ+ (u, v + w) = lim (ϕ(u + h(v + w)) − ϕ(u))
h h
0
 
1 1 1
≤ lim ϕ(u + 2hv) + ϕ(u + 2hw) − ϕ(u)
h 0 h 2 2
1 1
= lim (ϕ(u + 2hv) − ϕ(u)) + lim (ϕ(u + 2hw) − ϕ(u))
h 0 2h h 0 2h

= ϕ+ (u, v) + ϕ+ (u, w).

(c): For u ∈ F, the Gâteaux derivative ϕ+ (u, ·) : F −→ R is convex, since

ϕ+ (u, λv + (1 − λ)w) ≤ ϕ+ (u, λv) + ϕ+ (u, (1 − λ)w)


= λϕ+ (u, v) + (1 − λ)ϕ+ (u, w)

holds for all λ ∈ [0, 1], by using properties (a) and (b). 

Remark 3.52. By the properties (a) and (b) in Theorem 3.51, we call the
functional ϕ+ (u, ·) : F −→ R sublinear. We can show that the sublinearity of
ϕ+ (u, ·), for all u ∈ F, in combination with the inequality

ϕ+ (u, v − u) ≤ ϕ(v) − ϕ(u) for all u, v ∈ F,

implies the convexity of ϕ. To see this, we refer to Exercise 3.80. 

Now we show further elementary properties of the Gâteaux derivative.



Theorem 3.53. Let ϕ : F −→ R be a continuous functional. Suppose that


for u, v ∈ F the Gâteaux derivative ϕ+ (u, v) exists. Moreover, suppose that
F : R −→ R has a continuous derivative, i.e., F ∈ C 1 (R). Then the Gâteaux
derivative (F ◦ ϕ)+ (u, v) of the composition F ◦ ϕ : F −→ R exists at u in
direction v, and, moreover, the chain rule

(F ◦ ϕ)+ (u, v) = F  (ϕ(u)) · ϕ+ (u, v) (3.34)

holds.

Proof. For x := ϕ(u) and x_h := ϕ(u + hv), for h > 0, we let
\[
G(x_h) := \begin{cases} \dfrac{F(x_h) - F(x)}{x_h - x} & \text{for } x_h \ne x, \\[2mm] F'(x) & \text{for } x_h = x. \end{cases}
\]
By the continuity of ϕ we have x_h −→ x for h ↘ 0. Since F ∈ C¹(R), this
implies
\[
F'(x) = \lim_{x_h \to x} G(x_h) = \lim_{h \searrow 0} G(\varphi(u + hv)) = F'(\varphi(u)).
\]

Moreover, we have

F (xh ) − F (x) = G(xh )(xh − x) for all h > 0.

This finally implies

 1
(F ◦ ϕ)+ (u, v) = lim (F (ϕ(u + hv)) − F (ϕ(u)))
h 0h
1
= lim (F (xh ) − F (x))
h 0 h
1
= lim G(xh ) · lim (xh − x)
h 0 h 0 h
1
= lim G(ϕ(u + hv)) · lim (ϕ(u + hv) − ϕ(u))
h 0 h 0 h

= F  (ϕ(u)) · ϕ+ (u, v),

proving both the existence of (F ◦ ϕ)+ (u, v) and the chain rule in (3.34). 

We now formulate a fundamental sufficient and necessary condition for


the characterization of minima for convex functionals.

Theorem 3.54. Let ϕ : F −→ R be a convex functional. Moreover, let


K ⊂ F be convex and u0 ∈ K. Then the following statements are equivalent.
(a) ϕ(u0 ) = inf u∈K ϕ(u).
(b) ϕ+ (u0 , u − u0 ) ≥ 0 for all u ∈ K.

Proof. (b) ⇒ (a): Suppose ϕ+ (u0 , u−u0 ) ≥ 0 for u ∈ K. Then we have, due to
the monotonicity of the difference quotient Du0 ,u−u0 in (3.32), in particular
for h = 1,

0 ≤ ϕ+ (u0 , u − u0 ) ≤ ϕ(u0 + (u − u0 )) − ϕ(u0 ) = ϕ(u) − ϕ(u0 )

and thus ϕ(u) ≥ ϕ(u0 ).


(a) ⇒ (b): Suppose ϕ(u0 ) = inf u∈K ϕ(u).
Then for u ∈ K and small enough h > 0 the inequality
1
(ϕ(u0 + h(u − u0 )) − ϕ(u0 )) ≥ 0
h
holds, since, due to convexity of K, we have

u0 + h(u − u0 ) = hu + (1 − h)u0 ∈ K for all h ∈ (0, 1).

This finally implies


1
ϕ+ (u0 , u − u0 ) = lim (ϕ(u0 + h(u − u0 )) − ϕ(u0 )) ≥ 0.
h 0 h

Now we wish to use the conditions from Theorem 3.54 to derive direct
characterizations of best approximations to f ∈ F. To this end, we regard
the distance functional ϕ_f : F −→ R, defined as
\[
\varphi_f(v) = \|v - f\| \qquad \text{for } v \in \mathcal{F}.
\]
Note that the distance functional ϕ_f is, as a composition of two continuous
mappings, namely the translation by f and the norm ‖·‖, a continuous
functional. Moreover, ϕ_f : F −→ R is convex. Indeed, for λ ∈ [0, 1] and
v₁, v₂ ∈ F we have
\[
\varphi_f(\lambda v_1 + (1 - \lambda) v_2)
= \|\lambda v_1 + (1 - \lambda) v_2 - f\|
= \|\lambda (v_1 - f) + (1 - \lambda)(v_2 - f)\|
\le \lambda \|v_1 - f\| + (1 - \lambda) \|v_2 - f\|
= \lambda \varphi_f(v_1) + (1 - \lambda) \varphi_f(v_2).
\]
Therefore, ϕ_f has a Gâteaux derivative, for which the chain rule (3.34) holds.
Now the direct characterization from Theorem 3.54 can be applied to the
distance functional ϕ_f. This leads us to a corresponding equivalence, which
is referred to as the Kolmogorov¹⁴ criterion.
For the Gâteaux derivative of the norm ϕ = ‖·‖ : F −→ R we will
henceforth use the notation
\[
\|\cdot\|_+(u, v) := \varphi_+(u, v) \qquad \text{for } u, v \in \mathcal{F}.
\]
14
Andrey Nikolaevich Kolmogorov (1903-1987), Russian mathematician

Corollary 3.55. (Kolmogorov criterion).
For f ∈ F, S ⊂ F convex, and s∗ ∈ S the following statements are equivalent.
(a) s∗ is a best approximation to f.
(b) ‖·‖₊(s∗ − f, s − s∗) ≥ 0 for all s ∈ S.

Proof. Using ϕ(u) = ‖u − f‖ in Theorem 3.54, we see that s∗ ∈ S is a best
approximation to f, if and only if
\[
\varphi_+(s^*, s - s^*) = \lim_{h \searrow 0} \frac{1}{h} \bigl( \varphi(s^* + h(s - s^*)) - \varphi(s^*) \bigr)
= \lim_{h \searrow 0} \frac{1}{h} \bigl( \|s^* + h(s - s^*) - f\| - \|s^* - f\| \bigr)
= \lim_{h \searrow 0} \frac{1}{h} \bigl( \|s^* - f + h(s - s^*)\| - \|s^* - f\| \bigr)
= \|\cdot\|_+(s^* - f, s - s^*) \ge 0 \qquad \text{for all } s \in S. \qquad\square
\]

Remark 3.56. For proving the implication (b) ⇒ (a) in Theorem 3.54 we
did not use the convexity of K. Therefore, we can specialize the equivalence
in Corollary 3.55 to establish the implication
\[
\|\cdot\|_+(s^* - f, s - s^*) \ge 0 \ \text{ for all } s \in S
\quad\Longrightarrow\quad
s^* \text{ is a best approximation to } f
\]
for subsets S ⊂ F that are not necessarily convex. □
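The Kolmogorov criterion is easy to test numerically. The following Python sketch (assuming NumPy; the finite-difference step h and the unit-disk example are illustrative choices, not part of the text) approximates ‖·‖₊(s∗ − f, s − s∗) by a one-sided difference quotient and checks that it is nonnegative at the Euclidean projection of f onto the closed unit disk S ⊂ R², whereas a non-optimal point of S violates the criterion for some directions.

```python
import numpy as np

def gateaux_norm(u, v, h=1e-7):
    """One-sided difference quotient approximating ||.||_+(u, v) for the Euclidean norm."""
    return (np.linalg.norm(u + h * v) - np.linalg.norm(u)) / h

f = np.array([2.0, 1.0])
s_star = f / np.linalg.norm(f)          # best approximation to f from the unit disk S
s_bad = np.array([0.3, -0.4])           # some other point of S

rng = np.random.default_rng(2)
samples = rng.uniform(-1.0, 1.0, size=(1000, 2))
samples = samples[np.linalg.norm(samples, axis=1) <= 1.0]   # points s in S

d_star = [gateaux_norm(s_star - f, s - s_star) for s in samples]
d_bad = [gateaux_norm(s_bad - f, s - s_bad) for s in samples]
print(min(d_star) >= -1e-6)   # True: criterion holds at the best approximation
print(min(d_bad) < 0)         # True: criterion fails at a non-optimal point
```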

Now we use the Gâteaux derivative to prove a characterization concerning


the uniqueness of best approximations. To this end, we introduce a sufficient
condition, which will be more restrictive than that of (plain) uniqueness.

Definition 3.57. Let F be a linear space with norm ‖·‖, S ⊂ F be a subset
of F, and f ∈ F. Then s∗ ∈ S is said to be the strongly unique best
approximation to f, if there is a constant α > 0 satisfying
\[
\|s - f\| - \|s^* - f\| \ge \alpha \|s - s^*\| \qquad \text{for all } s \in S.
\]
Relying on our previous investigations (especially on Theorem 3.54 and
Corollary 3.55), we can characterize the strong uniqueness of best approxi-
mations for convex subsets S ⊂ F directly as follows.

Theorem 3.58. Let F be a linear space with norm ‖·‖ and S ⊂ F be convex.
Moreover, suppose f ∈ F. Then the following statements are equivalent.
(a) s∗ ∈ S is the strongly unique best approximation to f.
(b) There is one α > 0 satisfying ‖·‖₊(s∗ − f, s − s∗) ≥ α‖s − s∗‖ for all s ∈ S.

Proof. Suppose f ∈ F and s∗ ∈ S. For f ∈ F we regard the convex distance
functional ϕ : S −→ [0, ∞), defined as ϕ(s) = ‖s − f‖ for s ∈ S. Now any
element in S \ {s∗} can be written as a convex combination of the form
\[
s^* + h(s - s^*) = h s + (1 - h) s^* \in S \qquad \text{with } s \in S \setminus \{s^*\} \text{ and } 1 \ge h > 0.
\]
Thereby, we can formulate the strong uniqueness of s∗, for some α > 0, as
\[
\frac{1}{\|s - s^*\|} \cdot \frac{1}{h} \bigl( \varphi(s^* + h(s - s^*)) - \varphi(s^*) \bigr) \ge \alpha
\qquad \text{for all } s \in S \setminus \{s^*\} \text{ and } h > 0.
\]
By the monotonicity of the Gâteaux derivative, this is equivalent to
\[
\varphi_+(s^*, s - s^*) = \lim_{h \searrow 0} \frac{1}{h} \bigl( \varphi(s^* + h(s - s^*)) - \varphi(s^*) \bigr) \ge \alpha \|s - s^*\|
\]
for all perturbation directions s − s∗ with s ∈ S, or, in other words, we have
\[
\|\cdot\|_+(s^* - f, s - s^*) \ge \alpha \|s - s^*\| \qquad \text{for all } s \in S. \qquad\square
\]

Next, we add an important stability estimate to our discussion on strongly
unique best approximations. The following result is due to Freud15 .
Theorem 3.59. (Freud).
For a linear space F with norm ‖·‖ and a subset S ⊂ F, let s∗_f ∈ S be the
strongly unique best approximation to f ∈ F with constant α > 0. Moreover,
let s∗_g ∈ S be a best approximation to g ∈ F. Then the stability estimate
\[
\|s_g^* - s_f^*\| \le \frac{2}{\alpha} \|g - f\|
\]
holds.
Proof. By the strong uniqueness of s∗_f, we have the estimate
\[
\|s_g^* - f\| - \|s_f^* - f\| \ge \alpha \|s_g^* - s_f^*\|,
\]
which further implies the inequalities
\[
\|s_g^* - s_f^*\| \le \frac{1}{\alpha} \bigl( \|s_g^* - f\| - \|s_f^* - f\| \bigr)
\le \frac{1}{\alpha} \bigl( \|s_g^* - g\| + \|g - f\| - \|s_f^* - f\| \bigr)
\le \frac{1}{\alpha} \bigl( \|s_f^* - g\| + \|g - f\| - \|s_f^* - f\| \bigr)
\le \frac{1}{\alpha} \bigl( \|s_f^* - f\| + \|f - g\| + \|g - f\| - \|s_f^* - f\| \bigr)
= \frac{2}{\alpha} \|g - f\|,
\]
which already proves the stated stability estimate. □
15
Géza Freud (1922-1979), Hungarian mathematician

Remark 3.60. If the best approximation s∗_g ∈ S is unique for every g ∈ F,
then the mapping g −→ s∗_g is well-defined. Further, due to the Freud theorem,
Theorem 3.59, this mapping is continuous at all elements f ∈ F which have
a strongly unique best approximation s∗_f ∈ S.
If every f ∈ F has a strongly unique best approximation s∗_f ∈ S, such
that the corresponding constants α_f > 0 are uniformly bounded away from
zero on F, i.e., if there is some α₀ > 0 satisfying α_f ≥ α₀ > 0 for all f ∈ F,
then the mapping f −→ s∗_f is, due to the Freud theorem, Theorem 3.59, by
\[
\|s_g^* - s_f^*\| \le \frac{2}{\alpha_0} \|g - f\| \qquad \text{for all } f, g \in \mathcal{F},
\]
Lipschitz continuous on F with Lipschitz constant 2/α₀ (see Definition 6.64). □


Next, we will explicitly compute Gâteaux derivatives for relevant norms
‖·‖. But first we make the following simple observation.

Remark 3.61. For the Gâteaux derivative of ϕ(u) = ‖u‖ at u = 0, we have
\[
\|\cdot\|_+(0, v) = \lim_{h \searrow 0} \frac{1}{h} \bigl( \|0 + h v\| - \|0\| \bigr) = \|v\|
\]
for any direction v ∈ F. □

Now we compute Gâteaux derivatives for Euclidean norms.

Theorem 3.62. The Gâteaux derivative of any Euclidean norm ‖·‖ is given
as
\[
\|\cdot\|_+(u, v) = \left( \frac{u}{\|u\|}, v \right) \qquad \text{for all } u \in \mathcal{F} \setminus \{0\} \text{ and } v \in \mathcal{F}.
\]

Proof. Let F be a Euclidean space with norm ‖·‖ = (·, ·)^{1/2}.
For ϕ(u) = ‖u‖ the chain rule in (3.34) holds in particular for F(x) = x²,
i.e.,
\[
\bigl( \|\cdot\|^2 \bigr)_+(u, v) = 2 \|u\| \cdot \|\cdot\|_+(u, v) \qquad \text{for all } u, v \in \mathcal{F}. \tag{3.35}
\]
Moreover, we have
\[
\bigl( \|\cdot\|^2 \bigr)_+(u, v) = \lim_{h \searrow 0} \frac{1}{h} \bigl( \|u + hv\|^2 - \|u\|^2 \bigr)
= \lim_{h \searrow 0} \frac{1}{h} \bigl( \|u\|^2 + 2h(u, v) + h^2 \|v\|^2 - \|u\|^2 \bigr)
= \lim_{h \searrow 0} \frac{1}{h} \bigl( 2h(u, v) + h^2 \|v\|^2 \bigr)
= 2 (u, v).
\]
This implies, for u ≠ 0 with (3.35),
\[
\|\cdot\|_+(u, v) = \left( \frac{u}{\|u\|}, v \right) \qquad \text{for all } v \in \mathcal{F}. \qquad\square
\]

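The formula of Theorem 3.62 can be checked against the defining limit (3.31). The following Python sketch (assuming NumPy; the step size h is an illustrative choice) compares the one-sided difference quotient with the inner product (u/‖u‖, v) for random u, v ∈ Rᵈ.

```python
import numpy as np

rng = np.random.default_rng(3)
u, v = rng.standard_normal(4), rng.standard_normal(4)

h = 1e-8
quotient = (np.linalg.norm(u + h * v) - np.linalg.norm(u)) / h   # (3.31) with small h
formula = np.dot(u / np.linalg.norm(u), v)                        # Theorem 3.62
print(quotient, formula)                 # nearly identical values
print(abs(quotient - formula) < 1e-6)    # True
```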
The Gâteaux derivative of the absolute value |·| is rather elementary.

Lemma 3.63. For the absolute-value function |·| : R −→ [0, ∞), we have
\[
|\cdot|_+(x, y) = y \,\mathrm{sgn}(x) \qquad \text{for all } x \ne 0 \text{ and } y \in \mathbb{R}.
\]

Proof. First, note that for x ≠ 0 we have
\[
|x + hy| = |x| + h y \,\mathrm{sgn}(x) \qquad \text{for } h |y| < |x|. \tag{3.36}
\]
This implies
\[
|\cdot|_+(x, y) = \lim_{h \searrow 0} \frac{1}{h} \bigl( |x + hy| - |x| \bigr)
= \lim_{h \searrow 0} \frac{1}{h} \bigl( |x| + hy \,\mathrm{sgn}(x) - |x| \bigr) = y \,\mathrm{sgn}(x). \qquad\square
\]

By using the observation (3.36), we can compute the Gâteaux derivatives
of all L^p-norms ‖·‖_p, for 1 ≤ p ≤ ∞, on the linear space
\[
C(\Omega) = \{ u : \Omega \longrightarrow \mathbb{R} \mid u \text{ continuous on } \Omega \}
\]
of all continuous functions on a compact domain Ω ⊂ R^d, d ≥ 1. We begin
with the maximum norm ‖·‖∞ on C(Ω), defined as
\[
\|u\|_\infty = \max_{x \in \Omega} |u(x)| \qquad \text{for } u \in C(\Omega).
\]

Theorem 3.64. Let Ω ⊂ R^d be compact. Then, for the Gâteaux derivative
of the maximum norm ‖·‖ = ‖·‖∞ on C(Ω), we have
\[
\|\cdot\|_+(u, v) = \max_{\substack{x \in \Omega \\ |u(x)| = \|u\|_\infty}} v(x) \,\mathrm{sgn}(u(x))
\]
for any u, v ∈ C(Ω), u ≢ 0.

Proof. Suppose u ∈ C(Ω), u ≢ 0, and v ∈ C(Ω).
"≥": We first show the inequality
\[
\|\cdot\|_+(u, v) \ge \max_{\substack{x \in \Omega \\ |u(x)| = \|u\|_\infty}} v(x) \,\mathrm{sgn}(u(x)).
\]
Suppose x ∈ Ω with |u(x)| = ‖u‖∞. Then, by (3.36) we have the inequality
\[
\frac{1}{h} \bigl( \|u + hv\|_\infty - \|u\|_\infty \bigr)
\ge \frac{1}{h} \bigl( |u(x) + h v(x)| - |u(x)| \bigr)
= \frac{1}{h} \bigl( |u(x)| + h v(x) \,\mathrm{sgn}(u(x)) - |u(x)| \bigr)
= v(x) \,\mathrm{sgn}(u(x))
\]
for h|v(x)| < |u(x)|, which by h ↘ 0 already implies the stated inequality.
"≤": To verify the inequality
\[
\|\cdot\|_+(u, v) \le \max_{\substack{x \in \Omega \\ |u(x)| = \|u\|_\infty}} v(x) \,\mathrm{sgn}(u(x))
\]
we regard a strictly monotonically decreasing zero sequence (h_k)_{k∈N} of posi-
tive real numbers, so that lim_{k→∞} h_k = 0. Note that for any element h_k > 0
there exists some x_{h_k} ∈ Ω satisfying
\[
\|u + h_k v\|_\infty = |u(x_{h_k}) + h_k v(x_{h_k})| \qquad \text{for } k \in \mathbb{N}.
\]
Since Ω is compact, the sequence (x_{h_k})_{k∈N} has a convergent subsequence
(x_{h_{k_\ell}})_{\ell∈N} ⊂ Ω with limit element lim_{\ell→∞} x_{h_{k_\ell}} = x ∈ Ω. For ℓ → ∞ we get
\[
\|u + h_{k_\ell} v\|_\infty = |u(x_{h_{k_\ell}}) + h_{k_\ell} v(x_{h_{k_\ell}})| \longrightarrow \|u\|_\infty = |u(x)| \qquad \text{with } h_{k_\ell} \searrow 0,
\]
i.e., every accumulation point of (x_{h_k})_{k∈N} is an extremum of u in Ω.
Moreover, by (3.36) we obtain the inequality
\[
\frac{1}{h_{k_\ell}} \bigl( \|u + h_{k_\ell} v\|_\infty - \|u\|_\infty \bigr)
\le \frac{1}{h_{k_\ell}} \bigl( |u(x_{h_{k_\ell}}) + h_{k_\ell} v(x_{h_{k_\ell}})| - |u(x_{h_{k_\ell}})| \bigr)
= \frac{1}{h_{k_\ell}} \bigl( |u(x_{h_{k_\ell}})| + h_{k_\ell} v(x_{h_{k_\ell}}) \,\mathrm{sgn}(u(x_{h_{k_\ell}})) - |u(x_{h_{k_\ell}})| \bigr)
= v(x_{h_{k_\ell}}) \,\mathrm{sgn}(u(x_{h_{k_\ell}}))
\]
for h_{k_ℓ}|v(x_{h_{k_ℓ}})| < |u(x_{h_{k_ℓ}})|, whereby
\[
\lim_{\ell \to \infty} \frac{1}{h_{k_\ell}} \bigl( \|u + h_{k_\ell} v\|_\infty - \|u\|_\infty \bigr) = \|\cdot\|_+(u, v) \le v(x) \,\mathrm{sgn}(u(x)),
\]
and where x ∈ Ω is an extremum of u in Ω, i.e., |u(x)| = ‖u‖∞. □


Theorem 3.65. Let Ω ⊂ Rd be compact and suppose u, v ∈ C (Ω). Then,
we have  
+ (u, v) = v(x) sgn(u(x)) dx + |v(x)| dx (3.37)
Ω+ Ω0

for the Gâteaux derivative of the L1 -norm  ·  =  · 1 on C (Ω),



u1 = |u(x)| dx for u ∈ C (Ω),
Ω

where Ω0 := {x ∈ Ω | u(x) = 0} ⊂ Ω and Ω+ := Ω \ Ω0 .


3.4 Direct Characterization 97

Proof. For u, v ∈ C (Ω), we have


1
(u + hv1 − u1 )
h   
1
= |u(x) + hv(x)| dx − |u(x)| dx
h
 Ω Ω

1
= (|u(x) + hv(x)| − |u(x)|) dx + |v(x)| dx. (3.38)
h Ω+ Ω0

By Ωh := {x ∈ Ω+ | h · |v(x)| < |u(x)|} ⊂ Ω+ , for h > 0, and with (3.36)


we represent the first integral in (3.38) as

1
χΩ+ (x)(|u(x) + hv(x)| − |u(x)|) dx
h Rd

= χΩh (x)v(x) sgn(u(x)) dx
Rd

1
+ χΩ+ \Ωh (x) (|u(x) + hv(x)| − |u(x)|) dx, (3.39)
h Rd
where for Ω ⊂ Rd
1 for x ∈ Ω,
χΩ (x) =
0 for x ∈ Ω,
denotes the indicator function (i.e., the characteristic function) for Ω.
Now we estimate the integral in (3.39) from above by

χΩ+ \Ωh (x) (|u(x) + hv(x)| − |u(x)|) dx
Rd

≤ χΩ+ \Ωh (x) (|u(x)| + h|v(x)| − |u(x)|) dx
Rd

= h· χΩ+ \Ωh (x)|v(x)| dx. (3.40)
Rd

Since χΩh −→ χΩ+ , or, χΩ+ \Ωh −→ 0, for h  0, the statement in (3.37)
follows from the representations (3.38), (3.39) and (3.40). 
To compute the Gâteaux derivatives for the remaining Lp -norms  · p ,
 1/p
up = |u(x)| dx
p
for u ∈ C (Ω),
Ω

for 1 < p < ∞, we need the following lemma.


Lemma 3.66. For 1 < p < ∞, let ϕ(u) = up for u ∈ C (Ω), where Ω ⊂ Rd
is assumed to be compact. Then, we have


(ϕp )+ (u, v) = p |u(x)|p−1 v(x) sgn(u(x)) dx (3.41)
Ω

for all u, v ∈ C (Ω), u ≡ 0.


98 3 Best Approximations

Proof. For u, v ∈ C (Ω), we have


1 p
(ϕ (u + hv) − ϕp (u))
h   
1 1
= u + hvp − up =
p p
|u(x) + hv(x)| dx −
p
|u(x)| dx
p
h h Ω Ω
, - 
1
= (|u(x) + hv(x)| − |u(x)| ) dx + h
p p p−1
|v(x)|p dx, (3.42)
h Ω+ Ω0

where Ω0 = {x ∈ Ω | u(x) = 0} and Ω+ = Ω \ Ω0 .


For x ∈ Ωh = {x ∈ Ω | h · |v(x)| < |u(x)|} ⊂ Ω+ , where h > 0, we have
p
|u(x) + hv(x)|p = (|u(x)| + hv(x) sgn(u(x)))
= |u(x)|p + p · |u(x)|p−1 · hv(x) sgn(u(x)) + o(h) for h  0
by (3.36) and by Taylor16 expansion of F (u) = up at |u|.
Thereby, we can split the first integral in (3.42) into the sum
 
1
χΩ+ (x) (|u(x) + hv(x)| − |u(x)| ) dx
p p
h
R
d

=p χΩh (x)|u(x)|p−1 v(x) sgn(u(x)) dx + o(1) (3.43)


R d

1
+ χΩ+ \Ωh (x) (|u(x) + hv(x)|p − |u(x)|p ) dx. (3.44)
h Rd
Now we estimate the expression in (3.44) from above by

1
χΩ+ \Ωh (x) (|u(x) + hv(x)|p − |u(x)|p ) dx
h Rd

1 p
≤ χΩ+ \Ωh (x) ((|u(x)| + h|v(x)|) − |u(x)|p ) dx
h Rd

=p χΩ+ \Ωh (x)|u(x)|p−1 |v(x)| dx + o(1) for h  0. (3.45)
Rd

Since χΩh −→ χΩ+ , or, χΩ+ \Ωh −→ 0, for h  0, the stated representation
in (3.41) follows from (3.42), (3.43), (3.44), and (3.45). 
Now we can finally provide the Gâteaux derivatives for the remaining
Lp -norms  · p , for 1 < p < ∞.
Theorem 3.67. Let Ω ⊂ Rd be compact. Moreover, suppose 1 < p < ∞.
Then, for the Gâteaux derivative of the Lp -norm  ·  =  · p on C (Ω), we
have 
 1
+ (u, v) = |u(x)|p−1 v(x) sgn(u(x)) dx
up−1
p Ω

for all u, v ∈ C (Ω), u ≡ 0.


16
Brook Taylor (1685-1731), English mathematician
3.5 Exercises 99

Proof. The statement follows from the chain rule (3.34) in Theorem 3.53 with
F (x) = xp in combination with the representation of the Gâteaux derivative

(ϕp )+ in Lemma 3.66, whereby
 
(ϕp )+ (u, v) p
ϕ+ (u, v) = = |u(x)|p−1 v(x) sgn(u(x)) dx,
pϕp−1 (u) pup−1
p Ω

for ϕ(u) = up . 

3.5 Exercises
Exercise 3.68. Consider approximating the parabola f (x) = x2 on the unit
interval [0, 1] by linear functions of the form

gξ (x) = ξ · x for ξ ∈ R

with respect to the p-norms  · p , for p = 1, 2, ∞, respectively. To this end,


first compute the distance function

ηp (ξ) = gξ − f p .

Then, determine the best approximation gξ∗ to f satisfying

gξ∗ − f p = inf gξ − f p


ξ∈R

along with the minimal distance ηp (ξ ∗ ), for each of the three cases p = 1, 2, ∞.
Exercise 3.69. Suppose we wish to approximate the identity f (x) = x on
the unit interval [0, 1], by an exponential sum of the form

pξ (x) = ξ1 eξ2 x + ξ3 for ξ = (ξ1 , ξ2 , ξ3 )T ∈ R3 ,

and with respect to the maximum norm  · ∞ .


Show that there is no best approximation to f from S = {pξ | ξ ∈ R3 }.
Hint: Use the parameter sequence ξ (k) = (k, 1/k, −k)T , for k ∈ N.
Exercise 3.70. Regard the linear space C [−π, π], equipped with the norm

g := g1 + g∞ for g ∈ C [−π, π].

Moreover, let f (x) = x, for −π ≤ x ≤ π, and


1 2
S = α sin2 (·) | α ∈ R ⊂ C [−π, π].

Analyze the existence and uniqueness of the approximation problem

min s − f .
s∈S
100 3 Best Approximations

Exercise 3.71. Let ϕ : F −→ R be a convex functional on a linear space F.


Prove the following statements for ϕ.
(a) If ϕ has a (global) maximum on F, then ϕ is constant.
(b) A local minimum of ϕ is also a global minimum of ϕ.

Exercise 3.72. Let (F, ·) be a normed linear space, whose norm · is not
strictly convex. Show that there exists an element f ∈ F, a linear subspace
S ⊂ F, and distinct best approximations s∗1 , s∗2 ∈ S to f , s∗1 = s∗2 , satisfying

η(f, S) = s∗1 − f  = s∗2 − f .

Hint: Take for suitable f1 , f2 ∈ F, f1 = f2 , satisfying f1  = f2  = 1


and f1 + f2  = 2, the element f = 12 (f1 + f2 ) ∈ F and the linear subspace
S = {α(f1 − f2 ) | α ∈ R} ⊂ F.

Exercise 3.73. Transfer the result of Proposition 3.42 to the case of odd
functions f ∈ C [−1, 1]. To this end, formulate and prove a corresponding
result for subsets S ⊂ C [−1, 1] that are invariant under point reflections,
i.e., for any s(x) ∈ S, we have −s(−x) ∈ S.

Exercise 3.74. Let (F,  · ) be a normed linear space and T : F −→ F be


a linear operator which is isometric on F, i.e., T v = v for all v ∈ F.
Moreover, let S ⊂ F be a non-empty subset of F satisfying T (S) ⊂ S.
First prove statements (a) and (b), before you analyze question (c).
(a) If s∗ ∈ S is a best approximation to f ∈ F and T (S) = S, then T s∗ ∈ S
is a best approximation to T f ∈ F.
(b) If f ∈ F is a fixed point of T in F, i.e., T f = f , and s∗ ∈ S is a unique
best approximation to f , then s∗ is a fixed point of T in S.
(c) Suppose f ∈ F is a fixed point of T in F. Moreover, suppose there is no
fixed point of T in S, which is also a best approximation to f . Can you
draw conclusions concerning the uniqueness of best approximation to f ?
Use the results from this exercise to prove Proposition 3.42.

Exercise 3.75. In this exercise, we analyze the existence of discontinuous


linear functionals ϕ on (C [0, 1],  · 2 ) and on (C [0, 1],  · ∞ ). Give examples,
if possible.
(a) Are there discontinuous linear functionals on (C [0, 1],  · 2 )?
(b) Are there discontinuous linear functionals on (C [0, 1],  · ∞ )?
3.5 Exercises 101

Exercise 3.76. Let a ≤ x0 < . . . < xn ≤ b be a sequence of pairwise


distinct points in [a, b] ⊂ R and λ0 , . . . , λn ∈ R. Show that the mapping
ϕ : C [a, b] −→ R, defined as


n
ϕ(f ) = λk f (xk ) for f ∈ C [a, b],
k=0

is a continuous linear functional on (C [a, b],  · ∞ ) with operator norm


n
ϕ∞ = |λk |.
k=0

Exercise 3.77. Let (F,  · ) be a normed linear space and S ⊂ F be a


finite-dimensional linear subspace of F. Moreover, suppose f ∈ F.
Prove the following statements on linear functionals from the dual space F  .
(a) If ϕ ∈ F  satisfies ϕ ≤ 1 and ϕ(S) = 0, i.e., ϕ(s) = 0 for all s ∈ S,
then we have
η(f, S) = inf s − f  ≥ |ϕ(f )|
s∈S

for the minimal distance η(f, S) between f and S.


(b) There exists one ϕ ∈ F  satisfying ϕ ≤ 1 and ϕ(S) = 0, such that

|ϕ(f )| = η(f, S).

If η(f, S) > 0, then ϕ = 1.

Exercise 3.78. Consider the linear space F = C ([0, 1]2 ), equipped with the
maximum norm  · ∞ . Approximate the function

f (x, y) = x · y for (x, y)T ∈ [0, 1]2

by a function from the linear approximation space


1 2
S = s ∈ F | s(x, y) = s1 (x) + s2 (y) for (x, y)T ∈ [0, 1]2 with s1 , s2 ∈ C [0, 1] .

(a) Construct a linear functional of the form


4
ϕ(g) = λj g(xj , yj ) for g ∈ F
j=1

to estimate the minimal distance η(f, S) between f and S, where


1
η(f, S) ≥ .
4
102 3 Best Approximations

(b) Show that


x y 1
s∗ (x, y) = + − for (x, y)T ∈ [0, 1]2
2 2 4
is a best approximation to f from S with respect to  · ∞ .

Exercise 3.79. Show that the function ϕ : R2 −→ R, defined as


* xy2
2 4 for (x, y) = 0
ϕ(x, y) = x +y for (x, y)T ∈ R2 ,
0 for (x, y) = 0

has a Gâteaux derivative at zero, although ϕ is not continuous at zero.

Exercise 3.80. Let F be a linear space and ϕ : F −→ R a functional on F.


Prove the following statements (related to Remark 3.52).

(a) If ϕ is convex on F, then the Gâteaux derivative ϕ+ is monotone, i.e.,

ϕ+ (u1 , u1 − u2 ) − ϕ+ (u2 , u1 − u2 ) ≥ 0 for all u1 , u2 ∈ F.

(b) Assume that the Gâteaux derivative ϕ+ (u, v) exists for all u, v ∈ F.
Moreover, assume that ϕ+ (u, ·) : F −→ R is sublinear for all u ∈ F. If
the inequality

ϕ+ (u, v − u) ≤ ϕ(v) − ϕ(u) for all u, v ∈ F,

holds, then ϕ is convex on F.


4 Euclidean Approximation

In this chapter, we study approximation in Euclidean spaces. Therefore, F


denotes a linear space, equipped with a Euclidean norm  · , i.e.,  ·  is
defined by an inner product,

f  = (f, f )1/2 for f ∈ F.

From the preceding chapter, we understand that Euclidean approximation


has fundamental advantages, in particular for the existence and uniqueness
of best approximations. We briefly summarize our previous results as follows.

Existence of best approximations: For a Hilbert space F, i.e., F is


complete with respect to  · , and a closed and convex subset S ⊂ F,
there exists for any f ∈ F a best approximation s∗ ∈ S to f .
Uniqueness best approximations: For convex S ⊂ F, a best approxi-
mation s∗ ∈ S to f ∈ F is unique, due to the strict convexity of  · .
The above statements are based on Theorems 3.14, 3.28, and Corol-
lary 3.39 from Chapter 3. Recall that for the existence and uniqueness of
s∗ , the parallelogram identity (3.1) plays a central role. Moreover, accord-
ing to the Jordan-von Neumann theorem, Theorem 3.10, the parallelogram
identity holds only in Euclidean spaces. Therefore, the problem of Euclidean
approximation is fundamentally different from the problem of approximation
in non-Euclidean spaces.
In this chapter, we explain further advantages of Euclidean approxima-
tion. To this end, we rely on the characterizations for best approximations. In
particular, we make use of the Kolmogorov criterion, Corollary 3.55, in com-
bination with the representation of Gâteaux derivatives for Euclidean norms
from Theorem 3.62. This yields for finite-dimensional approximation spaces
S ⊂ F constructive methods to compute best approximations by orthogonal
projection Π : F −→ S of f ∈ F on S.
We treat two important special cases of Euclidean approximation: Firstly,
the approximation of 2π-periodic continuous functions by trigonometric poly-
nomials, where F = C2π and S = Tn . Secondly, the approximation of con-
tinuous functions by algebraic polynomials, in which case F = C [a, b], for a
compact interval [a, b] ⊂ R, and S = Pn .

© Springer Nature Switzerland AG 2018 103


A. Iske, Approximation Theory and Algorithms for Data Analysis, Texts
in Applied Mathematics 68, https://doi.org/10.1007/978-3-030-05228-7_4
104 4 Euclidean Approximation

4.1 Construction of Best Approximations


In this section, we apply the characterizations for best approximations from
the previous chapter to Euclidean spaces. To this end, we assume that the
approximation space S ⊂ F is a linear subspace of the Euclidean space F. The
application of the Kolmogorov criterion immediately provides the following
fundamental result.

Theorem 4.1. Let F be a Euclidean space with inner product (·, ·). More-
over, suppose S ⊂ F is a convex subset of F. Then the following statements
are equivalent.
(a) s∗ ∈ S is a best approximation to f ∈ F \ S.
(b) We have (s∗ − f, s − s∗ ) ≥ 0 for all s ∈ S.

Proof. Under the stated assumptions, the equivalence of the Kolmogorov


criterion, Corollary 3.55, holds. Thereby, a best approximation s∗ ∈ S to f
is characterized by the necessary and sufficient condition
 ∗ 
s −f
+ (s∗ − f, s − s∗ ) = , s − s ∗
≥0 for all s ∈ S,
s∗ − f 
using the representation of the Gâteaux derivative from Theorem 3.62. 

Remark 4.2. If S ⊂ F is a linear subspace of F, then the variational inequa-


lity in statement (b) of Theorem 4.1 immediately leads us to the necessary
and sufficient condition

(s∗ − f, s) = 0 for all s ∈ S, (4.1)

i.e., in this case, s∗ ∈ S is a best approximation to f ∈ F \ S, if and only if


the orthogonality relation s∗ − f ⊥ S holds. 

Note that the equivalence statement in Remark 4.2 identifies a best ap-
proximation s∗ ∈ S to f ∈ F as the unique orthogonal projection of f onto
S. In Section 4.2, we will study the projection operator Π : F −→ S, which
assigns every f ∈ F to its unique best approximation s∗ ∈ S in more detail.
Before doing so, we first use the orthogonality in (4.1) to characterize best
approximations s∗ ∈ S for convex subsets S ⊂ F. To this end, we work with
the dual characterization of Theorem 3.46.

Theorem 4.3. Let F be a Euclidean space with inner product (·, ·) and let
S ⊂ F be a convex subset of F. Moreover, suppose that s∗ ∈ S satisfies
s∗ − f ⊥ S. Then, s∗ is the unique best approximation to f .

Proof. The linear functional ϕ ∈ F  , defined as


 ∗ 
s −f
ϕ(u) = , u for u ∈ F,
s∗ − f 
4.1 Construction of Best Approximations 105

satisfies all three conditions from the dual characterization of Theorem 3.46:
Indeed, the first condition, ϕ = 1, follows from the Cauchy1 -Schwarz2
inequality,
 ∗ 
 s −f  s∗ − f 
|ϕ(u)| =  ∗
, u ≤
 s∗ − f  · u = u for all u ∈ F,
s − f 
where equality holds for u = s∗ − f ∈ F, since
 ∗ 
s −f s∗ − f 2
ϕ(s∗ − f ) = ∗
, s ∗
− f = = s∗ − f .
s − f  s∗ − f 
Therefore, ϕ also satisfies the second condition in Theorem 3.46. By s∗ −f ⊥ S
we have  ∗ 
s −f
ϕ(s) = ,s = 0 for all s ∈ S,
s∗ − f 
and so ϕ finally satisfies the third condition in Theorem 3.46.
In conclusion, s∗ is a best approximation to f . The uniqueness of s∗ follows
from the strict convexity of the Euclidean norm  ·  = (·, ·)1/2 . 

Now we consider the special case of Euclidean approximation by finite-


dimensional approximation spaces S. Therefore, suppose that S ⊂ F is a
linear subspace with dim(S) < ∞. According to Corollary 3.8, there exists
for any f ∈ F a best approximation s∗ ∈ S to f and, moreover, s∗ is unique,
due to Theorem 3.37.
Suppose that S is spanned by n ∈ N basis elements {s1 , . . . , sn } ⊂ F, i.e.,

S = span{s1 , . . . , sn } ⊂ F

so that dim(S) = n < ∞. To compute the unique best approximation s∗ ∈ S


to some f ∈ F we utilize the representation

n
s∗ = c∗j sj ∈ S. (4.2)
j=1

Now the (necessary and sufficient) orthogonality condition s∗ − f ⊥ S from


Remark 4.2, or, in (4.1) is equivalent to the requirement

(s∗ , sk ) = (f, sk ) for all 1 ≤ k ≤ n.

Therefore, the representation for s∗ in (4.2) leads us to the n linear conditions



n
c∗j (sj , sk ) = (f, sk ) for 1 ≤ k ≤ n
j=1

1
Augustin-Louis Cauchy (1789-1857), French mathematician
2
Hermann Amandus Schwarz (1843-1921), German mathematician
106 4 Euclidean Approximation

and so to the linear equation system


⎡ ⎤ ⎡ ∗⎤ ⎡ ⎤
(s1 , s1 ) (s2 , s1 ) · · · (sn , s1 ) c1 (f, s1 )
⎢ (s1 , s2 ) (s2 , s2 ) · · · (sn , s2 ) ⎥ ⎢ ∗⎥ ⎢ ⎥
⎢ ⎥ ⎢ c2 ⎥ ⎢ (f, s2 ) ⎥
⎢ .. .. .. .. ⎥ · ⎢ .. ⎥ = ⎢ .. ⎥ ,
⎣ . . . . ⎦ ⎣ . ⎦ ⎣ . ⎦
(s1 , sn ) (s2 , sn ) · · · (sn , sn ) c∗n (f, sn )
or, in short,
Gc∗ = b (4.3)
with the Gram matrix G = ((sj , sk ))1≤k,j≤n ∈ R
3 n×n
, the unknown coeffi-
cient vector c∗ = (c∗1 , . . . , c∗n )T ∈ Rn of s∗ in (4.2) and the right hand side
b = ((f, s1 ), . . . , (f, sn ))T ∈ Rn . Therefore, the solution c∗ ∈ Rn of the linear
system (4.3) yields the unknown coefficients of s∗ in (4.2).
Due to the existence and uniqueness of the best approximation s∗ , the
Gram matrix G must be regular. We specialize this statement on G as follows.
Theorem 4.4. The Gram matrix G in (4.3) is symmetric positive definite.
Proof. The symmetry of G follows from the symmetry of the inner product,
whereby (sj , sk ) = (sk , sj ) for all 1 ≤ j, k ≤ n. Moreover, G is positive defi-
nite, which also immediately follows from the properties of the inner product,
⎛ ⎞  2
    
n n n
 n

T
c Gc = cj ck (sj , sk ) = ⎝ cj sj , ⎠ 
ck sk =  cj sj 
 >0
j,k=1 j=1 k=1  j=1 

for all c = (c1 , . . . , cn )T ∈ Rn \ {0}. 


Given our investigations in this section, the problem of Euclidean approxi-
mation by finite-dimensional approximation spaces S seems to be solved. The
unique best approximation s∗ can be determined by the unique solution of
the linear system (4.3), i.e., computing s∗ is equivalent to solving (4.3).
But note that we have not posed any conditions on the basis of S, yet.
Next, we show that by suitable choices for bases of S, we can avoid the
linear system (4.3). Indeed, for an orthogonal basis {s1 , . . . , sn } of S, i.e.,
*
0 for j = k,
(sj , sk ) =
sj  > 0 for j = k,
2

the Gram matrix G is a diagonal matrix,


⎡ ⎤
s1 2
⎢ s2 2 ⎥
⎢ ⎥
G = diag(s1 2 , . . . , sn 2 ) = ⎢ .. ⎥,
⎣ . ⎦
sn 2
3
Jørgen Pedersen Gram (1850-1916), Danish mathematician
4.2 Orthogonal Bases and Orthogonal Projections 107

in which case the solution c∗ of (4.3) is given by


 T
(f, s1 ) (f, sn )
c∗ = ,..., ∈ Rn .
s1  2 sn 2

For an orthonormal basis {s1 , . . . , sn } of S, i.e., (sj , sk ) = δjk , the Gram


matrix G is the identity matrix, G = In ∈ Rn×n , in which case

c∗ = ((f, s1 ), . . . , (f, sn )) ∈ Rn .
T

In the following of this chapter, we develop suitable constructions and


characterizations for orthogonal bases in relevant applications. Before doing
so, we summarize the discussion of this section, and, moreover, we derive a
few elementary properties of orthogonal bases.

4.2 Orthogonal Bases and Orthogonal Projections


From our discussion in the previous section, we can explicitly represent, for
a fixed orthogonal basis (orthonormal basis), {s1 , . . . , sn } of S, the unique
best approximation s∗ ∈ S to f , for any f ∈ F.

Theorem 4.5. Let F be a Euclidean space with inner product (·, ·). More-
over, let S ⊂ F be a finite-dimensional linear subspace with orthogonal basis
{s1 , . . . , sn }. Then, for any f ∈ F,

n
(f, sj )
s∗ = sj ∈ S (4.4)
j=1
sj 2

is the unique best approximation to f . For the special case of an orthonormal


basis {s1 , . . . , sn } for S we have the representation

n
s∗ = (f, sj )sj ∈ S.
j=1

Now we study the linear and surjective operator Π : F −→ S, which maps


any element f ∈ F to its unique best approximation s∗ ∈ S. But first we
note that the optimality of the best approximation s∗ = Π(f ) immediately
implies the stability estimate

(I − Π)(f ) ≤ f − s for all f ∈ F, s ∈ S (4.5)

where I denotes the identity on F. Moreover, for f = s in (4.5) we get

Π(s) = s for all s ∈ S


108 4 Euclidean Approximation

and so Π is a projection operator, i.e., Π ◦Π = Π. By the characterization


of the best approximation s∗ = Π(f ) ∈ S in (4.1), the operator Π is an
orthogonal projection, since
f − Π(f ) = (I − Π)(f ) ⊥ S for all f ∈ F,
i.e., the linear operator I − Π : F −→ S ⊥ maps onto the orthogonal comple-
ment S ⊥ ⊂ F of S in F. Moreover, I − Π is also a projection operator, since
for any f ∈ F, we have
((I − Π) ◦ (I − Π))(f ) = (I − Π)(f − Π(f ))
= f − Π(f ) − Π(f ) + (Π ◦ Π)(f )
= f − Π(f ) = (I − Π)(f ).
The orthogonality of Π immediately implies another well-known result.
Theorem 4.6. The Pythagoras4 theorem
f − Π(f )2 + Π(f )2 = f 2 for all f ∈ F (4.6)
holds.
Proof. For f ∈ F we have
f 2 = f − Π(f ) + Π(f )2
= f − Π(f )2 + 2(f − Π(f ), Π(f )) + Π(f )2
= f − Π(f )2 + Π(f )2 .

The Pythagoras theorem implies two further stability results.
Corollary 4.7. For I = Π the stability estimates
(I − Π)(f ) ≤ f  and Π(f ) ≤ f  for all f ∈ F (4.7)
hold. In particular, we have
I − Π = 1 and Π = 1
for the operator norms of I − Π and Π.
Proof. The stability estimates in (4.7) follow directly from the Pythagoras
theorem, Theorem 4.6. In the first inequality in (4.7), we have equality for
every element f − Π(f ) ∈ S ⊥ , whereas in the second inequality, we have
equality for every s ∈ S. Thereby, the operator norms of I − Π and Π are
already determined by
(I − Π)(f ) Π(f )
I − Π = sup =1 and Π = sup = 1.
f =0 f  f =0 f 

4
Pythagoras of Samos (around 570-510 BC), ancient Greek philosopher
4.2 Orthogonal Bases and Orthogonal Projections 109

Next, we compute for f ∈ F the norm Π(f ) of Π(f ) = s∗ . To this end,


we utilize, for a fixed orthogonal basis {s1 , . . . , sn } of S the representation
in (4.4), whereby

n
(f, sj )
Π(f ) = sj ∈ S for f ∈ F. (4.8)
j=1
sj 2

In particular, for s ∈ S, we have the representation



n
(s, sj )
Π(s) = s = sj ∈ S for all s ∈ S. (4.9)
j=1
sj 2

Theorem 4.8. Let {s1 , . . . , sn } ⊂ S be an orthogonal basis of S. Then, the


Parseval5 identity

n
(f, sj )(g, sj )
(Π(f ), Π(g)) = for all f, g ∈ F (4.10)
j=1
sj 2

holds, where in particular



n
|(f, sj )|2
Π(f )2 = for all f ∈ F. (4.11)
j=1
sj 2

Proof. By the representation of Π in (4.8), we have


⎛ ⎞
n
(f, s ) 
n
(g, s )
(Π(f ), Π(g)) = ⎝ s ⎠
j k
2 j
s , 2 k
j=1
s j  s k 
k=1


n
(f, sj ) (g, sk )  (f, sj )(g, sj )
n
= (sj , sk ) =
sj  sk 
2 2
j=1
sj 2
j,k=1

for all f, g ∈ F. For f = g we obtain the stated representation in (4.11). 


We finally add another important result.
Theorem 4.9. Let {s1 , . . . , sn } ⊂ S be an orthogonal basis of S. Then the
Bessel6 inequality

n
|(f, sj )|2
Π(f )2 = ≤ f 2 for all f ∈ F (4.12)
j=1
sj 2

holds. Moreover, we have the identity



n
|(f, sj )|2
f − Π(f )2 = f 2 − ≤ f 2 for all f ∈ F.
j=1
sj 2
5
Marc-Antoine Parseval des Chênes (1755-1836), French mathematician
6
Friedrich Wilhelm Bessel (1784-1846), German astronomer, mathematician
110 4 Euclidean Approximation

Proof. The Bessel inequality follows from the second stability estimate in (4.7)
in combination with the representation in (4.11). The second statement fol-
lows from the Pythagoras theorem (4.6) and the representation (4.11). 

4.3 Fourier Partial Sums


In this section, we study one concrete example for Euclidean approximation.
In this particular case, we wish to approximate a continuous 2π-periodic
function by real-valued trigonometric polynomials. To this end, we equip the
linear space of all real-valued continuous 2π-periodic functions
R
C2π ≡ C2π = {f : R −→ R | f ∈ C (R) and f (x) = f (x + 2π) for all x ∈ R}
with the inner product
 2π
1
(f, g)R = f (x)g(x) dx for f, g ∈ C2π , (4.13)
π 0
1/2
which by  · R = (·, ·)R defines a Euclidean norm on C2π , so that

1 2π
f R =
2
|f (x)|2 dx for f ∈ C2π .
π 0
1/2
Therefore, C2π with  · R = (·, ·)R is a Euclidean space.
For the approximation space, we consider choosing the linear space of all
real-valued trigonometric polynomials of degree at most n ∈ N0 ,
 8
1 
Tn ≡ Tn = span √ , cos(j ·), sin(j ·)  1 ≤ j ≤ n
R
for n ∈ N0 .
2
Therefore, by using the notations introduced at the outset of this chapter,
we consider the special case of the Euclidean space F = C2π , equipped with
1/2
the norm  · R = (·, ·)R , and the linear approximation space S = Tn ⊂ C2π
of finite dimension dim(Tn ) = 2n + 1, for n ∈ N0 .
Remark 4.10. In the following chapters, we also use complex-valued tri-
gonometric polynomials from TnC for the approximation of complex-valued
continuous 2π-periodic functions from
C
C2π = {f : R −→ C | f ∈ C (R) and f (x) = f (x + 2π) for all x ∈ R} .
C
In that case, we equip C2π with the inner product
 2π
1 C
(f, g)C = f (x)g(x) dx for f, g ∈ C2π , (4.14)
2π 0
1/2 C
and thereby obtain the Euclidean norm  · C = (·, ·)C on C2π . The different
scalar factors, 1/π for (·, ·)R in (4.13) and 1/(2π) for (·, ·)C in (4.14), will be
useful later. To keep notations simple, we will from now use (·, ·) = (·, ·)R and
R
 ·  =  · R for the inner product (4.13) and the norm on C2π ≡ C2π . 
4.3 Fourier Partial Sums 111

For the approximation of f ∈ C2π , we use fundamental results, as de-


veloped in the previous section. In particular, we make use of orthonormal
systems to construct best approximations to f . To this end, we take note of
the following important result.

Theorem 4.11. For n ∈ N0 , the real-valued trigonometric polynomials


 8
1 

√ , cos(j ·), sin(j ·)  1 ≤ j ≤ n (4.15)
2
form an orthonormal system in C2π .

Proof. From the usual addition theorems for trigonometric polynomials we


get the identities

2 cos(jx) cos(kx) = cos((j − k)x) + cos((j + k)x) (4.16)


2 sin(jx) sin(kx) = cos((j − k)x) − cos((j + k)x) (4.17)
2 sin(jx) cos(kx) = sin((j − k)x) + sin((j + k)x). (4.18)

The 2π-periodicity of cos((j ± k)x) and sin((j ± k)x) implies


 2π
1
(cos(j ·), cos(k ·)) = [cos((j − k)x) + cos((j + k)x)] dx = 0
2π 0
 2π
1
(sin(j ·), sin(k ·)) = [cos((j − k)x) − cos((j + k)x)] dx = 0
2π 0

for j = k and
 2π
1
(sin(j ·), cos(k ·)) = [sin((j − k)x) + sin((j + k)x)] dx = 0
2π 0

for all j, k ∈ {1, . . . , n}. Moreover, we have


   2π
1 1
√ , cos(j·) = √ cos(jx) dx = 0
2 2π 0
   2π
1 1
√ , sin(j·) = √ sin(jx) dx = 0
2 2π 0

for j = 1, . . . , n, so that the functions in (4.15) form an orthogonal system.


The orthonormality of the functions in (4.15) finally follows from
   2π
1 1 1
√ ,√ = 1 dx = 1
2 2 2π 0

and
112 4 Euclidean Approximation
 2π
1
(cos(j ·), cos(j ·)) = [1 + cos(2jx)] dx = 1
2π 0
 2π
1
(sin(j ·), sin(j ·)) = [1 − cos(2jx)] dx = 1
2π 0

where we use the representations in (4.16) and (4.17) yet once more. 

We now connect to the results of Theorems 4.5 and 4.11, where we can,
for any function f ∈ C2π , represent its unique best approximation s∗ ∈ Tn
by
 
1 
n
1
s∗ (x) = f, √ √ + [(f, cos(j·)) cos(jx) + (f, sin(j·)) sin(jx)] . (4.19)
2 2 j=1

We reformulate the representation for s∗ in (4.19) and introduce on this


occasion the important notion of Fourier partial sums.

Corollary 4.12. For f ∈ C2π , the unique best approximation s∗ ∈ Tn to f


is given by the n-th Fourier partial sum of f ,

a0 
n
(Fn f )(x) = + [aj cos(jx) + bj sin(jx)] . (4.20)
2 j=1

The coefficients a0 = (f, 1) and


 2π
1
aj ≡ aj (f ) = (f, cos(j·)) = f (x) cos(jx) dx (4.21)
π 0
 2π
1
bj ≡ bj (f ) = (f, sin(j·)) = f (x) sin(jx) dx (4.22)
π 0

for 1 ≤ j ≤ n are called Fourier coefficients of f . 

The Fourier partial sum (4.20) is split into an even part, given by the
partial sum of the even trigonometric polynomials {cos(j·), 0 ≤ j ≤ n} with
”even” Fourier coefficients aj , and into an odd part, given by the partial
sum of the odd trigonometric polynomials {sin(j·), 1 ≤ j ≤ n} with ”odd”
Fourier coefficients bj . We can show that for any even function f ∈ C2π , all
odd Fourier coefficients bj vanish. Likewise, for an odd function f ∈ C2π , all
even Fourier coefficients aj vanish. On this occasion, we recall the result of
Proposition 3.42, from which these statements immediately follow. But we
wish to compute the Fourier coefficients explicitly.
4.3 Fourier Partial Sums 113

Corollary 4.13. For f ∈ C2π , the following statements are true.


(a) If f is even, then the Fourier partial sum Fn f in (4.20) is even and the
Fourier coefficients aj in (4.21) have the representation

2 π
aj = f (x) cos(jx) dx for 0 ≤ j ≤ n.
π 0
(b) If f is odd, then the Fourier partial sum Fn f in (4.20) is odd and the
Fourier coefficients bj in (4.22) have the representation

2 π
bj = f (x) sin(jx) dx for 1 ≤ j ≤ n.
π 0
Proof. For an even function f ∈ C2π we have bj = 0, for all 1 ≤ j ≤ n, since
 
1 2π 1 −2π
bj = f (x) sin(jx) dx = − f (−x) sin(−jx) dx
π 0 π 0
 0  2π
1 1
=− f (x) sin(jx) dx = − f (x) sin(jx) dx = −bj ,
π −2π π 0
and so the Fourier partial sum Fn f in (4.20) is even. Moreover, we have
 
1 2π 2 π
a0 = f (x) dx = f (x) dx
π 0 π 0
and, for 1 ≤ j ≤ n,
 π  0  π
πaj = f (x) cos(jx) dx = f (x) cos(jx) dx + f (x) cos(jx) dx
−π −π 0
 π  π  π
= f (−x) cos(−jx) dx + f (x) cos(jx) dx = 2 f (x) cos(jx) dx.
0 0 0

This completes our proof for (a). We can prove (b) analogously. 
Example 4.14. We consider approximating the periodic function f ∈ C2π ,
defined as f (x) = π −|x|, for x ∈ [−π, π]. To this end, we determine for n ∈ N
the Fourier coefficients aj , bj of the Fourier partial sum Fn f . Since f is an
even function, we can apply Corollary 4.13, statement (a). From this, we see
that bj = 0, for all 1 ≤ j ≤ n, and, moreover,

2 π
aj = f (x) cos(jx) dx for 0 ≤ j ≤ n.
π 0
Integration by parts gives
 π π 
1  1 π 

f (x) cos(jx) dx = f (x) sin(jx) − f (x) sin(jx) dx
0 j j 0

0

1 π 1 
= sin(jx) dx = − 2 cos(jx) for 1 ≤ j ≤ n,
j 0 j 0
114 4 Euclidean Approximation

and so we have aj = 0 for all even indices j ∈ {1, . . . , n}, while

4
aj = for all odd indices j ∈ {1, . . . , n}.
πj 2
We finally compute the Fourier coefficient a0 by
   π
1 2π 2 π 2 1
a0 = (f, 1) = f (x) dx = (π − x) dx = − (π − x)2 = π.
π 0 π 0 π 2 0

Altogether, we obtain the representation

π  4  1
n n
π
(Fn f )(x) = + aj cos(jx) = + cos(jx)
2 j=1 2 π j=1 j 2
j odd

 n−1
2 
π 4  cos((2k + 1)x)
= +
2 π (2k + 1)2
k=0

for the n-th Fourier partial sum of f .


For illustration the function graphs of the Fourier partial sums Fn f and
of the error functions Fn f − f , for n = 2, 4, 16, are shown in Figures 4.1-4.3.

As we have seen in Section in 2.6, the real-valued Fourier partial sum Fn f


in (4.20) can be represented as complex Fourier partial sum of the form


n
(Fn f )(x) = cj eijx . (4.23)
j=−n

For the conversion of the Fourier coefficients, we apply the linear mapping
in (2.69), whereby, with using the Eulerean formula (2.67), we obtain for the
complex Fourier coefficients in (4.23) the representation
 2π
1
cj = f (x)e−ijx dx for j = −n, . . . , n. (4.24)
2π 0

We remark that the complex Fourier coefficients cj in (4.24) can also, like the
real Fourier coefficients aj in (4.21) and bj in (4.22), be expressed via inner
products. In fact, by using the complex inner product (·, ·)C in (4.14), we can
rewrite the representation in (4.24) as

cj = (f, exp(ij·))C for j = −n, . . . , n.


4.3 Fourier Partial Sums 115

2.5

1.5

0.5

0
-3 -2 -1 0 1 2 3

Fourier partial sum F2 f

0.2

0.1

-0.1

-0.2

-0.3
-3 -2 -1 0 1 2 3

error function F2 f − f

Fig. 4.1. Approximation of the function f (x) = π − |x| on the interval [−π, π] by
the Fourier partial sum (F2 f )(x) (see Example 4.14).
116 4 Euclidean Approximation

2.5

1.5

0.5

0
-3 -2 -1 0 1 2 3

Fourier partial sum F4 f

0.2

0.1

-0.1

-0.2

-0.3
-3 -2 -1 0 1 2 3

error function F4 f − f

Fig. 4.2. Approximation of the function f (x) = π − |x| on the interval [−π, π] by
the Fourier partial sum (F4 f )(x) (see Example 4.14).
4.3 Fourier Partial Sums 117

2.5

1.5

0.5

0
-3 -2 -1 0 1 2 3

Fourier partial sum F16 f

0.2

0.1

-0.1

-0.2

-0.3
-3 -2 -1 0 1 2 3

error function F16 f − f

Fig. 4.3. Approximation of the function f (x) = π − |x| on the interval [−π, π] by
the Fourier partial sum (F16 f )(x) (see Example 4.14).
118 4 Euclidean Approximation

Now we wish to approximate the complex Fourier coefficients cj .


To this end, we apply the composite trapezoidal rule with N = 2n + 1
equidistant knots

xk = k ∈ [0, 2π) for k = 0, . . . , N − 1,
N
so that
N −1 N −1
1  1  −jk
cj ≈ f (xk )e−ijxk = f (xk )ωN (4.25)
N N
k=0 k=0

where ωN = e2πi/N denotes the N -th root of unity in (2.74). In this way, the
vector c = (c−n , . . . , cn )T ∈ CN of the complex Fourier coefficients (4.24) is
approximated by the discrete Fourier transform (2.79) from the data vector

f = (f0 , . . . , fN −1 )T ∈ RN ,

where fk = f (xk ) for k = 0, . . . , N − 1. In order to compute the Fourier


coefficients c ∈ CN efficiently, we apply the fast Fourier transform (FFT)
from Section 2.7. According to Theorem 2.46, the FFT can be performed in
O(N log(N )) steps.
We close this section with the following remark.
The Fourier operator Fn : C2π −→ Tn gives the orthogonal projection
of C2π onto Tn . In Chapter 6, we will analyze the asymptotic behaviour of the
operator Fn , for n → ∞, in more detail, where we will address the following
fundamental questions.
• Is the Fourier series

a0 
(F∞ f )(x) = + [aj cos(jx) + bj sin(jx)] for f ∈ C2π
2 j=1

of f convergent?
• If so, does the Fourier series F∞ f converge to f ?
• If so, how fast does the Fourier series F∞ f converge to f ?
In particular, we will investigate, if at all, in which sense (e.g. pointwise,
or uniformly, or with respect to the Euclidean norm  · ) the convergence of
the Fourier series F∞ f holds. In Chapter 6, we will give answers, especially
for the asymptotic behaviour of the approximation error

η(f, Tn ) = Fn f − f  and η∞ (f, Tn ) = Fn f − f ∞ for n → ∞.

This will lead us to specific conditions on the smoothness of f .


4.4 Orthogonal Polynomials 119

4.4 Orthogonal Polynomials


Now we study another important special case of Euclidean approximation.
In this particular case, we wish to approximate continuous functions from
C [a, b], where [a, b] ⊂ R denotes a compact interval.
For the approximation space, we take, for fixed n ∈ N0 , the linear space
Pn of algebraic polynomials of degree at most n ∈ N0 , where dim(Pn ) = n+1,
i.e., throughout this section, we regard the special case where S = Pn and
F = C [a, b].
We introduce an inner product for the function space C [a, b] as follows.
For a positive and integrable weight function w ∈ C (a, b), satisfying
 b
w(x) dx < ∞,
a

the function space C [a, b] is by


 b
(f, g)w = f (x)g(x)w(x) dx for f, g ∈ C [a, b]
a

1/2
equipped with an inner product, yielding the Euclidean norm ·w = (·, ·)w ,
so that  b
f 2w = |f (x)|2 w(x) dx for f ∈ C [a, b].
a
Later in this section, we make concrete examples for the weight function w.
To approximate functions from C [a, b], we apply Theorem 4.5, so that
we can, for f ∈ C [a, b], represent the unique best approximation s∗ ∈ Pn
to f explicitly. In order to do so, we need an orthogonal system for Pn . To
this end, we propose an algorithm, which constructs for any weighted inner
product (·, ·)w an orthogonal basis

{p0 , p1 , . . . , pn } ⊂ Pn

for the polynomial space Pn .


The following orthogonalization algorithm by Gram-Schmidt7 belongs to
the standard repertoire of linear algebra. In this iterative method, a given
basis B ⊂ S of a finite-dimensional Euclidean space S is, by successive or-
thogonal projections of the basis elements from B, transformed into an or-
thogonal basis of S. The formulation of this constructive method is detailed
in the Gram-Schmidt algorithm, Algorithm 4, which we here apply to the
monomial basis B = {1, x, x2 , . . . , xn } of S = Pn .
7
Erhard Schmidt (1876-1959), German mathematician
120 4 Euclidean Approximation

Algorithm 4 Gram-Schmidt algorithm


1: function Gram-Schmidt
2: let p0 := 1;
3: for k = 0, . . . , n − 1 do
4: let
k
(xk+1 , pj )w
pk+1 := xk+1 − pj ;
j=0
pj 2w
5: end for
6: end function

Proposition 4.15. The polynomials p0 , . . . , pn ∈ Pn , output by the Gram-


Schmidt algorithm, Algorithm 4, form an orthogonal basis for Pn .
Proof. Obviously, pk ∈ Pk ⊂ Pn , for all 0 ≤ k ≤ n. Moreover, the orthogo-
nality relation
pk+1 = xk+1 − Π(xk+1 ) ⊥ Pk for all k = 0, . . . , n − 1,
holds, where

k
(xk+1 , pj )w
Π(xk+1 ) = pj
j=0
pj 2w

is the orthogonal projection of the monomial xk+1 onto Pk w.r.t. (·, ·)w .
Therefore, the polynomials p0 , . . . , pn form an orthogonal basis for Pn . 
Note that the orthogonalization method of Gram-Schmidt guarantees,
for any weighted inner product (·, ·)w , the existence of an orthogonal basis
for Pn with respect to (·, ·)w . Moreover, the Gram-Schmidt construction of
orthogonal polynomials in Algorithm 4 is unique up to n + 1 (non-vanishing)
scaling factors, one for the initialization (in line 2) and one for each of the
n for loop cycles (line 4). The scaling factors could be used to normalize the
orthogonal system of polynomials, where the following options are commonly
used.
• Normalization of the leading coefficient
p0 ≡ 1 and pk (x) = xk + qk−1 (x) for some qk−1 ∈ Pk−1 for k = 1, . . . , n
• Normalization at one
pk (1) = 1 for all k = 0, . . . , n
• Normalization of norm (orthonormalization)
Let p0 := p0 /p0 w (line 2) and pk := pk /pk w (line 4), k = 1, . . . , n.
However, the Gram-Schmidt algorithm is problematic for numerical rea-
sons. In fact, on the one hand, it is unstable, especially for input bases B with
almost linearly dependent basis elements. On the other hand, the Gram-
Schmidt algorithm is very inefficient. In contrast, the following three-term
recursion is much more suitable for efficient and stable constructions of or-
thogonal polynomials.
4.4 Orthogonal Polynomials 121

Theorem 4.16. For any weighted inner product (·, ·)w , there are unique or-
thogonal polynomials pk ∈ Pk , for k ≥ 0, with leading coefficient one. The
orthogonal polynomials (pk )k∈N0 satisfy the three-term recursion

pk (x) = (x + ak )pk−1 (x) + bk pk−2 (x) for k ≥ 1 (4.26)

for initial values p−1 ≡ 0, p0 ≡ 1 and coefficients

(xpk−1 , pk−1 )w pk−1 2w


ak = − for k ≥ 1 and b1 = 1, bk = − for k ≥ 2.
pk−1 2w pk−2 2w

Proof. We prove the statement by induction on k.


Initial step: For k = 0, the constant p0 ≡ 1 is the unique polynomial in P0
with leading coefficient one.
Induction hypothesis: Assume that p0 , . . . , pk−1 , k ≥ 1, are unique orthogonal
polynomials with leading coefficient one, where pj ∈ Pj for j = 0, . . . , k − 1.
Induction step (k −1 −→ k): Let pk ∈ Pk \Pk−1 be a polynomial with leading
coefficient one. Then, the difference pk − xpk−1 lies in Pk−1 , so that we have
(with using the orthogonal basis p0 , . . . , pk−1 of Pk−1 ) the representation


k−1
(pk − xpk−1 , pj )w
pk (x) − xpk−1 (x) = cj pj (x) with cj = .
j=0
pj 2w

We now formulate necessary conditions for the coefficients cj , under which


the orthogonality pk ⊥ Pk−1 holds. From pk ⊥ Pk−1 we get

(xpk−1 , pj )w (pk−1 , xpj )w


cj = − =− for j = 0, . . . , k − 1.
pj 2w pj 2w

This in turn implies c0 = . . . = ck−3 = 0 and, moreover,

(xpk−1 , pk−1 )w (pk−1 , xpk−2 )w (pk−1 , pk−1 )w


ck−1 = − and ck−2 = − =− .
pk−1 2w pk−2 2w pk−2 2w

Therefore, all coefficients c0 , . . . , ck−1 are uniquely determined, whereby pk


is uniquely determined. Moreover, the stated three-term recursion in (4.26),

pk (x) = (x + ck−1 )pk−1 (x) + ck−2 pk−2 (x) = (x + ak )pk−1 (x) + bk pk−2 (x)

holds with ak = ck−1 , for k ≥ 1, bk = ck−2 , for k ≥ 2, and where b1 := 1. 

Remark 4.17. Due to the uniqueness of the coefficients ak , for k ≥ 1, and


bk , for k ≥ 2, the conditions of the three-term recursion (4.26) are also suffi-
cient, i.e., the three-term recursion in (4.26) generates the unique orthogonal
polynomials pk ∈ Pk , for k ≥ 0, w.r.t. the weighted inner product (·, ·)w . 
122 4 Euclidean Approximation

Next, we discuss important properties of orthogonal polynomials, where


their zeros are of particular interest. We can show that orthogonal polynomi-
als have only simple zeros. To this end, we first prove a more general result
for continuous functions.

Theorem 4.18. Let g ∈ C [a, b] satisfy (g, p)w = 0 for all p ∈ Pn , i.e.,
g ⊥ Pn , for n ∈ N0 . Then, either g vanishes identically on [a, b] or g has at
least n + 1 zeros with changing sign in (a, b).

Proof. Suppose g ∈ C [a, b] \ {0} has only k < n + 1 zeros

a < x 1 < . . . < xk < b

with changing sign. Then, the product g · p between g and the polynomial


k
p(x) = (x − xj ) ∈ Pk ⊂ Pn
j=1

has no sign change on (a, b). Therefore, the inner product


 b
(g, p)w = g(x)p(x)w(x) dx
a

cannot vanish. This is in contradiction to the assumed orthogonality g ⊥ Pn .


Therefore, g has at least n + 1 zeros with changing sign in (a, b). 

Corollary 4.19. Suppose pn ∈ Pn is a polynomial satisfying pn ⊥ Pn−1 , for


n ∈ N. Then, either pn ≡ 0 or pn has exactly n simple zeros in (a, b).

Proof. On the one hand, by Theorem 4.18, pn has at least n pairwise distinct
zeros in (a, b). Now suppose pn ≡ 0. Since pn is an algebraic polynomial in
Pn \ {0}, pn has, on the other hand, at most n zeros. Altogether, pn has
exactly n zeros in (a, b), where each zero must be simple. 

Corollary 4.20. Let p∗n ∈ Pn be a best approximation to f ∈ C [a, b] \ Pn .


Then, the error function p∗n − f has at least n + 1 zeros with changing sign
in (a, b).

Proof. According to Remark 4.2, we have the orthogonality p∗n − f ⊥ Pn .


Since f ≡ p∗n , the error function p∗n − f has, due to Theorem 4.18, at least
n + 1 zeros with changing sign in (a, b). 

We remark that Corollary 4.20 yields a necessary condition to character-


ize the best approximation p∗ ∈ Pn to f ∈ C [a, b]. This condition could a
posteriori be used for consistency check. If we knew the n + 1 simple zeros
X = {x1 , . . . , xn+1 } ⊂ (a, b) of the error function p∗ − f a priori, then we
4.4 Orthogonal Polynomials 123

would be able to compute the best approximation p∗ ∈ Pn via the interpo-


lation conditions p∗X = fX . However, in the general case, the zeros of p∗ − f
are usually unknown.
We finally introduce three relevant families of orthogonal polynomials.
For a more comprehensive discussion on orthogonal polynomials, we refer to
the classical textbook [67] by Gábor Szegő8 or, for numerical aspects, to the
textbook [57].

4.4.1 Chebyshev Polynomials

We have studied the Chebyshev polynomials

Tn (x) = cos(n arccos(x)) for n ∈ N0 (4.27)

already in Section 2.5. Let us first recall some of the basic properties of the
Chebyshev polynomials Tn ∈ Pn , in particular the three-term recursion from
Theorem 2.27,

Tn+1 (x) = 2xTn (x) − Tn−1 (x) for n ∈ N (4.28)

with initial values T0 ≡ 1 and T1 (x) = x.


Now we show that the Chebyshev polynomials {T0 , . . . , Tn } ⊂ Pn are
orthogonal polynomials w.r.t. the weight function w : (−1, 1) −→ (0, ∞),
defined as
1
w(x) = √ for x ∈ (−1, 1). (4.29)
1 − x2
Theorem 4.21. For n ∈ N0 the set of Chebyshev polynomials {T0 , . . . , Tn }
form an orthogonal basis for Pn w.r.t. the weight function w in (4.29), where

⎨ 0 for j = k
(Tj , Tk )w = π for j = k = 0 for 0 ≤ j, k ≤ n. (4.30)

π/2 for j = k > 0

Proof. By the substitution φ = arccos(x) we show the orthogonality


 1  1
Tj (x)Tk (x) cos(j arccos(x)) cos(k arccos(x))
(Tj , Tk )w = √ dx = √ dx
−1 1−x 2
−1 1 − x2
 0  π
cos(jφ) cos(kφ)
= 9 (− sin(φ)) dφ = cos(jφ) cos(kφ) dφ
π 1 − cos2 (φ) 0

= Tk 2w δjk

by using Theorem 4.11. Theorem 4.11 also yields the stated values for the
squared norms Tk 2w = (Tk , Tk )w . 
8
Gábor Szegő (1895-1985), Hungarian mathematician
124 4 Euclidean Approximation

We remark that the Chebyshev polynomials are normalized by

Tn (1) = 1 for all n ≥ 0.

Indeed, this follows directly by induction from the three-term recursion (4.28).
Due to Corollary 2.28, the n-th Chebyshev polynomial Tn has, for n ≥ 1, the
leading coefficient 2n−1 , and so the scaled polynomial

pn (x) = 21−n Tn (x) for n ≥ 1 (4.31)

has leading coefficient one. Thus, the orthogonal polynomials p0 , . . . , pn ∈ Pn


satisfy the three-term recursion (4.26) in Theorem 4.16. We now show that
the three-term recursion in (4.26) is consistent with the three-term recur-
sion (4.28) for the Chebyshev polynomials.
To this end, we compute the coefficients ak and bk from Theorem 4.16. We
first remark that the coefficients ak are invariant under scalings of the basis
elements pk . Thus we can show that for the case of the Chebyshev polynomials
all coefficients ak in the three-term recursion (4.26) must vanish, since by the
substitution φ = arccos(x) we have
 π
(xTk (x), Tk (x))w = cos(φ) cos2 (kφ) dφ = 0 for all k ≥ 0
0

for the nominator of ak+1 in (4.26). For the coefficients bk , we get b1 = 1,


b2 = −1/2, and

pk 2w 21−k Tk 2w 1 Tk 2w 1


bk+1 = − = − 2−k =− =− for k ≥ 2.
pk−1 w2 2 Tk−1 w 2 4 Tk−1 2w 4
From Theorem 4.16, we obtain p0 ≡ 1, p1 (x) = x, p2 (x) = x2 − 1/2 and
1
pk+1 (x) = xpk (x) − pk−1 (x) for k ≥ 2.
4
Rescaling with (4.31) finally yields the three-term recursion (4.28).
In our above computations, we rely on structural advantages of Chebyshev
polynomials: On the one hand, the degree-independent representation of the
squared norms Tk 2w in (4.30) simplifies the calculations of the coefficients
bk significantly. On the other hand, for the calculations of the coefficients ak
we can take advantage of the orthonormality of the even trigonometric poly-
nomials cos(k·). We wish to further discuss this important relation between
the orthonormal system (cos(k·))k∈N0 and the orthogonal system (Tk )k∈N0 .
By our previous (more general) investigations in Section 4.2 the unique
best approximation p∗n ∈ Pn to a function f ∈ C [−1, 1] is given by the
orthogonal projection Πn f ≡ ΠPn f of f onto Pn ,

n
(f, Tk )w 1 2
n
Πn f = Tk = (f, 1)w + (f, Tk )w Tk , (4.32)
Tk 2w π π
k=0 k=1
4.4 Orthogonal Polynomials 125

where the form of Chebyshev partial sum in (4.32) reminds us on the form
of the Fourier partial sum Fn f from Corollary 4.12. Indeed, the coefficients
in the series expansion for the best approximation Πn f in (4.32) can be
identified as Fourier coefficients.

Theorem 4.22. For f ∈ C [−1, 1] the coefficients of the Chebyshev partial


sum (4.32) coincide with the Fourier coefficients ak ≡ ak (g) of the even
function g(x) = f (cos(x)), so that

a0 
n
Πn f = + a k Tk . (4.33)
2
k=1

Proof. For f ∈ C [−1, 1], the coefficients (f, Tk )w in (4.32) can be computed
by using the substitution φ = arccos(x):
 1  π
f (x)Tk (x)
(f, Tk )w = √ dx = f (cos(φ)) cos(kφ) dφ
−1 1 − x2 0
 2π
π1 π
= f (cos(x)) cos(kx) dx = ak (g),
2π 0 2

where ak (g), k ≥ 1, denotes the k-th Fourier coefficient of g(x) = f (cos(x)).


For k = 0 we finally get the Fourier coefficient a0 of g by

(f, T0 )w π 1 a0 (g)
= a0 (g) = .
T0 2w 2 π 2

As we had remarked at the end of Section 4.3, the Fourier coefficients


ak ≡ ak (g) can efficiently be approximated by the fast Fourier transform
(FFT). Now we introduce the Clenshaw algorithm [15], Algorithm 5, which,
on input coefficients a = (a0 , . . . , an )T ∈ Rn+1 , yields an efficient and stable
evaluation of the Chebyshev partial sum (4.33) at x ∈ [−1, 1].

Algorithm 5 Clenshaw algorithm


1: function Clenshaw(a, x)
2: Input: coefficients a = (a0 , . . . , an )T ∈ Rn+1 and x ∈ [−1, 1].
3:
4: let zn+1 := 0; zn := an ;
5: for k = n − 1, . . . , 0 do
6: let zk := ak + 2xzk+1 − zk+2 ;
7: end for
8: return (Πn f )(x) = (z0 − z2 )/2.
9: end function
126 4 Euclidean Approximation

To verify the Clenshaw algorithm, we use the recursion for the Chebyshev
polynomials in (4.28). By the assignment in line 6 of the Clenshaw algorithm,
we get the representation

ak = zk − 2xzk+1 + zk+2 for k = n − 1, . . . , 0 (4.34)

for the coefficients of the Chebyshev partial sum, where for k = n with
zn+1 = 0 and zn = an we get zn+2 = 0. The sum over the last n terms of the
Chebyshev partial sum (4.33) can be rewritten by using the representation
in (4.34) in combination with the recursion (4.28):


n 
n
ak Tk (x) = (zk − 2xzk+1 + zk+2 )Tk (x)
k=1 k=1

n 
n+1 
n+2
= zk Tk (x) − 2xzk Tk−1 (x) + zk Tk−2 (x)
k=1 k=2 k=3
= z1 T1 (x) + z2 T2 (x) − 2xz2 T1 (x)
n
+ zk [Tk (x) − 2xTk−1 (x) + Tk−2 (x)]
k=3
= z1 x + z2 (2x2 − 1) − 2xz2 x
= z1 x − z2 .

With a0 = z0 − 2xz1 + z2 we get the representation

a0 
n
1 1
(Πn f )(x) = + ak Tk (x) = (z0 − 2xz1 + z2 + 2z1 x − 2z2 ) = (z0 −z2 ).
2 2 2
k=1

Finally, we provide a memory efficient implementation of the Clenshaw


algorithm, Algorithm 6.

Algorithm 6 Clenshaw algorithm (memory efficient)


1: function Clenshaw(a, x)
2: Input: coefficients a = (a0 , . . . , an )T ∈ Rn+1 and x ∈ [−1, 1].
3:
4: let z ≡ (z0 , z1 , z2 ) := (an , 0, 0);
5: for k = n − 1, . . . , 0 do
6: let z2 = z1 ; z1 = z0 ;
7: let z0 = ak + 2x · z1 − z2 ;
8: end for
9: return (Πn f )(x) = (z0 − z2 )/2.
10: end function
4.4 Orthogonal Polynomials 127

4.4.2 Legendre Polynomials

Now we discuss another example for orthogonal polynomials on [−1, 1].

Definition 4.23. For n ∈ N0 , the Rodrigues9 formula

dn n!
Ln (x) = n
(x2 − 1)n for n ≥ 0 (4.35)
dx (2n)!

defines the n-th Legendre10 polynomial.

We show that the Legendre polynomials are the (unique) orthogonal poly-
nomials with leading coefficient one, belonging to the weight function w ≡ 1.
Therefore, we regard the usual (unweighted) L2 inner product on C [−1, 1],
defined as
 1
(f, g)w := (f, g) = f (x)g(x) dx for f, g ∈ C [−1, 1].
−1

Theorem 4.24. For n ∈ N0 , the Legendre polynomials L0 , . . . , Ln form an


orthogonal basis for Pn with respect to the weight function w ≡ 1 on [−1, 1].

Proof. Obviously, Lk ∈ Pk ⊂ Pn for all 0 ≤ k ≤ n.


Now for 0 ≤ k ≤ n we consider the integral
 1 n
d dk
Ink = n
(x2 − 1)n k
(x2 − 1)k dx.
−1 dx dx

For 0 ≤ i ≤ n, we have the representation


 1 n−i
d dk+i
Ink = (−1)i n−i
(x2 − 1)n k+i
(x2 − 1)k dx, (4.36)
−1 dx dx

as can be shown by induction (using integration by parts).


For i = n in (4.36), we have
 1
dk+n
Ink = (−1) n
(x2 − 1)n (x2 − 1)k dx = 0 for n > k (4.37)
−1 dxk+n

which implies
n!k!
(Ln , Lk ) = Ink = 0 for n > k. (4.38)
(2n)!(2k)!


9
Benjamin Olinde Rodrigues (1795-1851), French mathematician and banker
10
Adrien-Marie Legendre (1752-1833), French mathematician
128 4 Euclidean Approximation

We note two more important properties of the Legendre polynomials.

Theorem 4.25. The Legendre polynomials Ln in (4.35) satisfy the following


properties.
(a) Ln has leading coefficient one.
(b) We have Ln (−x) = (−1)n Ln (x) for all x ∈ [−1, 1].

Proof. For n ≥ 0, by using (4.35), we get the representation


⎛ ⎞
n  
(2n)! dn d n
⎝ n
Ln (x) = (x2 − 1)n = x2j (−1)n−j ⎠
n! dxn dxn j=0 j
 n2j 
= n! x2j−n (−1)n−j . (4.39)
j n
n/2≤j≤n

(a) By (4.39) the Legendre polynomial Ln has leading coefficient one.


(b) For even n, we have that 2j − n is even, and so in this case all terms
in (4.39) are even, which implies that Ln is even. Likewise, we can show that
Ln is odd for odd n (by analogy). Altogether, we see that statement (b) holds.


In conclusion, the Legendre polynomials are, for the L2 inner product


(·, ·) on [−1, 1], the unique orthogonal polynomials with leading coefficient
one. Finally, we derive a three-term recursion for the Legendre polynomials
from (4.26).

Theorem 4.26. The Legendre polynomials satisfy the three-term recursion

n2
Ln+1 (x) = xLn (x) − Ln−1 (x) for n ≥ 1 (4.40)
4n2−1
with initial values L0 ≡ 1 and L1 (x) = x.

Proof. Obviously, L0 ≡ 1 and L1 (x) = x.


By Theorem 4.16, the sought three-term recursion has the form (4.26),
where
(xLn−1 , Ln−1 ) Ln−1 2
an = − for n ≥ 1 and b1 = 1, bn = − for n ≥ 2.
Ln−1 2 Ln−2 2

By statement (b) in Theorem 4.25, the Legendre polynomial L2n is, for
any n ∈ N0 , even, and therefore xL2n (x) is odd, so that an = 0 for all n ≥ 0.
4.4 Orthogonal Polynomials 129

Table 4.1. The Legendre polynomials Ln in monomial form, for n = 1, . . . , 10.

L1 (x) = x
1
L2 (x) = x2 −
3
3
L3 (x) = x3 − x
5
6 2 3
L4 (x) = x4 − x +
7 35
10 3 5
L5 (x) = x5 − x + x
9 21
15 4 5 2 5
L6 (x) = x6 − x + x −
11 11 231
21 5 105 3 35
L7 (x) = x7 − x + x − x
13 143 429
28 6 14 4 28 2 7
L8 (x) = x8 − x + x − x +
15 13 143 1287
36 7 126 5 84 3 63
L9 (x) = x9 − x + x − x + x
17 85 221 2431
45 8 630 6 210 4 315 2 63
L10 (x) = x10 − x + x − x + x −
19 323 323 4199 46189

Now we compute the coefficients bn for n ≥ 2.


By the representation (4.37) for the integral Ink we have, for k = n,
 1  1
Inn = (−1)n (2n)! (x2 − 1)n dx = (2n)! (1 − x2 )n dx
−1 −1
 1
= (2n)! (1 − x)n (1 + x)n dx
−1
 1
n!
= (2n)! · (1 + x)2n dx
(n + 1) · . . . · (2n) −1
 x=1
1
= (n!)2 · (1 + x)2n+1
2n + 1 x=−1
22n+1
= (n!)2 ·
2n + 1
after n-fold integration by parts. This gives the representation
130 4 Euclidean Approximation

(n!)2 (n!)4 22n+1


Ln 2 = · I nn = · for n ≥ 0
((2n)!)2 ((2n)!)2 2n + 1

and therefore
Ln 2 n4 22 (2n − 1)
bn+1 = − =−
Ln−1  2 (2n) (2n − 1)
2 2 2n + 1
2 2
n n
=− =− 2 for n ≥ 1,
(2n − 1)(2n + 1) 4n − 1

which proves the stated three-term recursion. 

The Legendre polynomials Ln , for n = 1, . . . , 10, are, in their monomial


form, shown in Table 4.1. To compute the entries for Table 4.1, we have used
the three-term recursion (4.40) in Theorem 4.26, with initial values L0 ≡ 1
and L1 (x) = x. In summary, we see from Theorem 4.25 that
• Ln has leading coefficient one;
• L2k is even for k ∈ N0 ;
• L2k+1 is odd for k ∈ N.
Note that the above properties of the Legendre polynomials are consistent
with the representations of Ln for n = 2, . . . , 10, in Table 4.1.

4.4.3 Hermite Polynomials

We finally discuss one example of orthogonal polynomials on R.

Definition 4.27. For n ∈ N0 , we let Hn : R −→ R, defined as


2 dn −x2
Hn (x) = (−1)n ex e for n ≥ 0, (4.41)
dxn
denote the n-th Hermite11 polynomial.

Next, we show that the Hermite polynomials Hn are orthogonal polyno-


mials on R with leading coefficient 2n with respect to the weight function

w(x) = e−x .
2

Therefore, in this case, we work with the weighted L2 inner product



f (x)g(x) e−x dx
2
(f, g)w = (f, g) = for f, g ∈ C (R). (4.42)
R

Theorem 4.28. For n ∈ N0 , the Hermite polynomials H0 , . . . , Hn form an


orthogonal basis for Pn with respect to the weighted inner product (·, ·)w .
11
Charles Hermite (1822-1901), French mathematician
4.4 Orthogonal Polynomials 131

Proof. We first show for n ∈ N0 the representation

w(n) (x) = Pn (x) · e−x


2
for some Pn ∈ Pn \ Pn−1 . (4.43)
We prove the representation in (4.43) by induction on n ≥ 0.
Initial step: For n = 0, we have (4.43) with P0 ≡ 1 ∈ P0 .
Induction hypothesis: Suppose the representation in (4.43) holds for n ∈ N0 .
Induction step (n −→ n + 1): We get the stated representation by
d (n) d  
Pn (x) · e−x
2
w(n+1) (x) = w (x) =
dx dx
= (Pn (x) − 2xPn (x)) · e−x = Pn+1 (x) · e−x
2 2

with Pn+1 (x) = Pn (x)−2xPn (x), where Pn+1 ∈ Pn+1 \Pn for Pn ∈ Pn \Pn−1 .

Due to (4.43), the Hermite polynomial Hn , n ≥ 0, has the representation

Hn (x) = (−1)n ex · Pn (x) · e−x = (−1)n Pn (x)


2 2
for x ∈ R,
so that Hn ∈ Pn \ Pn−1 . Moreover, by (4.43) we have

w(n) (x) = (−1)n e−x · Hn (x)


2
for x ∈ R.
Now we consider for fixed x ∈ R the function gx : R −→ R, defined as

gx (t) := w(x + t) = e−(x+t)


2
for t ∈ R.
By Taylor series expansion on the analytic function gx around zero, we get

 (k) ∞ k
 ∞ k

gx (0) t t
(−1)k e−x Hk (x).
2
k (k)
w(x + t) = gx (t) = t = w (x) =
k! k! k!
k=0 k=0 k=0

2xt−t2
This yields, for the function h(x, t) = e , the series expansion
∞ k

2 t
h(x, t) = w(x − t) · ex = Hk (x) for all x, t ∈ R. (4.44)
k!
k=0

Now on the one hand we have for s, t ∈ R the representation


 
−x2
e−x e2x(t+s) e−(t +s ) dx
2 2 2
e h(x, t)h(x, s) dx =
R
R
e−(x−(t+s)) e2ts dx
2
=
R


e−x dx = π · e2ts
2
2ts
=e
R


√ (2ts)k
= π· . (4.45)
k!
k=0
132 4 Euclidean Approximation

On the other hand, with using the uniform convergence of the series for
h(x, t) in (4.44), we have the representation
  ,∞ -⎛ ∞ ⎞
 t k  s j
e−x h(x, t)h(x, s) dx = e−x Hk (x) ⎝ Hj (x)⎠ dx
2 2

R R k! j=0
j!
k=0

 
tk s j
e−x Hk (x)Hj (x) dx.
2
= (4.46)
k!j! R
k,j=0

By comparing the coefficients in (4.45) and (4.46), we get




e−x Hk (x)Hj (x) dx = 2k πk! · δjk
2
for all j, k ∈ N0 , (4.47)
R

and so in particular

Hk 2w = 2k πk! for all k ∈ N0 .
This completes our proof. 
Now we proof a three-term recursion for the Hermite polynomials.
Theorem 4.29. The Hermite polynomials satisfy the three-term recursion
Hn+1 (x) = 2xHn (x) − 2nHn−1 (x) for n ≥ 0 (4.48)
with the initial values H−1 ≡ 0 and H0 (x) ≡ 1.
Proof. Obviously, we have H0 ≡ 1. By applying partial differentiation to the
series expansion for h(x, t) in (4.44) with respect to variable t we get
 tk−1 ∞

h(x, t) = 2(x − t)h(x, t) = Hk (x)
∂t (k − 1)!
k=1

and this implies


∞ k
 ∞
 ∞ k

t tk+1 t
2xHk (x) − 2 Hk (x) = Hk+1 (x). (4.49)
k! k! k!
k=0 k=0 k=0

Moreover, we have
∞ k+1
 ∞
  tk ∞
t tk+1
Hk (x) = (k + 1)Hk (x) = kHk−1 (x) (4.50)
k! (k + 1)! k!
k=0 k=0 k=0

with H−1 ≡ 0. Inserting (4.50) into (4.49) gives the identity


∞ k
 ∞ k

t t
(2xHk (x) − 2kHk−1 (x)) = Hk+1 (x). (4.51)
k! k!
k=0 k=0

By comparing the coefficients in (4.51), we finally get the stated three-term


recursion in (4.48) with the initial values H−1 ≡ 0 and H0 ≡ 1. 
4.4 Orthogonal Polynomials 133

Table 4.2. The Hermite polynomials Hn in monomial form, for n = 1, . . . , 8.

H1 (x) = 2x

H2 (x) = 4x2 − 2

H3 (x) = 8x3 − 12x

H4 (x) = 16x4 − 48x2 + 12

H5 (x) = 32x5 − 160x3 + 120x

H6 (x) = 64x6 − 480x4 + 720x2 − 120

H7 (x) = 128x7 − 1344x5 + 3360x3 − 1680x

H8 (x) = 256x8 − 3584x6 + 13440x4 − 13440x + 1680

By Theorem 4.29, we get another recursion for the Hermite polynomials.

Corollary 4.30. The Hermite polynomials Hn satisfy the recursion

Hn (x) = 2nHn−1 (x) for n ∈ N. (4.52)

Proof. Differentiation of Hn in (4.41) yields


 n

d 2 d
Hn (x) = (−1)n ex e −x2
= 2xHn (x) − Hn+1 (x),
dx dxn

whereby (4.52) follows from the three-term recursion for Hn+1 in (4.48). 

Further properties of the Hermite polynomials Hn follow immediately


from the recursions in Theorem 4.29 and in Corollary 4.30.

Corollary 4.31. The Hermite polynomials Hn in (4.41) satisfy the following


properties.
(a) Hn has leading coefficient 2n , for n ≥ 0.
(b) H2n is even and H2n+1 is odd, for n ≥ 0.

Proof. Statement (a) follows by induction from the three-term recursion (4.48),
whereas statement (b) follows from (4.52) with H0 ≡ 1 and H1 (x) = 2x. 

We can conclude that for the weighted L2 inner product (·, ·)w in (4.42)
the Hermite polynomials Hn are the unique orthogonal polynomials with
leading coefficient 2n . The Hermite polynomials Hn are, for n = 1, . . . , 8,
shown in their monomial form in Table 4.2.
134 4 Euclidean Approximation

4.5 Exercises
Exercise 4.32. Let F = C [−1, 1] be equipped with the Euclidean norm
 · 2 , defined by the inner product
 1
(f, g) = f (x)g(x) dx for f, g ∈ C [−1, 1],
−1

so that  · 2 = (·, ·)1/2 . Compute on given coefficients a, b, c, d ∈ R, a = 0, of


a cubic polynomial

f (x) = a x3 + b x2 + c x + d ∈ P3 \ P2 for x ∈ [−1, 1]

the unique best approximation p∗2 to f from P2 with respect to  · 2 .


Exercise 4.33. Compute for n ∈ N0 the Fourier coefficients a0 , . . . , an ∈ R
and b1 , . . . , bn ∈ R of the Fourier partial sum

a0 
n
Fn (x) = + [aj cos(jx) + bj sin(jx)] for x ∈ [0, 2π)
2 j=1

(a) for the rectangular wave



⎨ 0 for x ∈ {0, π, 2π}
R(x) = 1 for x ∈ (0, π)

−1 for x ∈ (π, 2π);

(b) for the saw-tooth function


0 for x ∈ {0, 2π}
S(x) = 1
2 (π − x) for x ∈ (0, 2π).

Plot the graphs of R and the best approximation F10 R to R in one figure.
Plot the graphs of S and the best approximation F10 S to S in one figure.
Exercise 4.34. Approximate the function f (x) = 2x−1 on the unit interval
[0, 1] by a trigonometric polynomial of the form

c0 
n
Tn (x) = + ck cos(kπx) for x ∈ [0, 1]. (4.53)
2
k=1

Compute (for arbitrary n ∈ N0 ) the unique best approximation Tn∗ of the


form (4.53) to f with respect to the Euclidean norm  · 2 on [0, 1]. Then
determine the smallest m ∈ N, satisfying
 1

|f (x) − Tm (x)|2 dx ≤ 10−4 ,
0

and give the best approximation Tm to f in explicit form.
4.5 Exercises 135

Exercise 4.35. For a continuous, positive and integrable weight function


w : (a, b) −→ (0, ∞), let C [a, b] be equipped with the weighted Euclidean
1/2
norm  · w = (·, ·)w , defined by
 b
(f, g)w = f (x) g(x) w(x) dx for f, g ∈ C [a, b].
a

Moreover, let (pk )k∈N0 , with pk ∈ Pk , be the unique sequence of orthogonal


polynomials with respect to (·, ·)w with leading coefficient one. According to
Theorem 4.16, the orthogonal polynomials pk satisfy the three-term recursion

pk (x) = (x + ak ) pk−1 (x) + bk pk−2 (x) for k ≥ 1

with initial values p−1 ≡ 0, p0 ≡ 1, and with the coefficients

(xpk−1 , pk−1 )w pk−1 2w


ak = − for k ≥ 1 and b1 = 1, bk = − for k ≥ 2.
pk−1 2w pk−2 2w

Prove the following statements for k ∈ N0 .

(a) Among all polynomials p ∈ Pk with leading coefficient one, the ortho-
gonal polynomial pk is norm-minimal with respect to  · w , i.e.,
1 2
pk w = min pw | p ∈ Pk with p(x) = xk + q(x) for q ∈ Pk−1 .

(b) For all x, y ∈ [a, b], where x = y, we have


k
pj (x) pj (y) 1 pk+1 (x) pk (y) − pk (x) pk+1 (y)
=
j=0
pj 2w pk 2w x−y

and, moreover,


k
(pj (x))2 pk+1 (x) pk (x) − pk (x) pk+1 (x)
= for all x ∈ [a, b].
j=0
pj 2w pk 2w

(c) Conclude from (b) that all zeros of pk are simple. Moreover, conclude
that pk+1 and pk have no common zeros.

Exercise 4.36. Show the following identities of the Chebyshev polynomials.


(a) Tk · T = 12 Tk+ + T|k−| for all k, ∈ N0 .
(b) Tk (−x) = (−1)k Tk (x) for all k ∈ N0 .
(c) Tk ◦ T = Tk for all k, ∈ N0 .
136 4 Euclidean Approximation

Exercise 4.37. In this problem, make use of the results in Exercise 4.36.
(a) Prove for g ∈ C [−1, 1] and h(x) = x · g(x), for x ∈ [−1, 1], the relation
1
c0 (h) = c1 (g) and ck (h) = (ck−1 (g) + ck+1 (g)) for all k ≥ 1
2
between the Chebyshev coefficients ck (g) of g and ck (h) of h.
(b) Conclude from the relation in Exercise 4.36 (c) the representation
T2k (x) = Tk (2x2 − 1) for all x ∈ [−1, 1] and k ∈ N0 . (4.54)
(c) Can the representation in (4.54) be used to simplify the evaluation of
a Chebyshev partial sum for an even function in the Clenshaw algo-
rithm, Algorithm 5? If so, how could this simplification be used for the
implementation of the Clenshaw algorithm?
Exercise 4.38. On given coefficient functions ak ∈ C [a, b], for k ≥ 1, and
bk ∈ C [a, b], for k ≥ 2, let pk ∈ C [a, b], for k ≥ 0, be a function sequence
satisfying the three-term recursion
pk+1 (x) = ak+1 (x) pk (x) + bk+1 (x) pk−1 (x) for k ≥ 1
with initial functions p0 ∈ C [a, b] and p1 = a1 p0 ∈ C [a, b]. Show that the
sum

n
fn (x) = cj pj (x) for x ∈ [a, b]
j=0

can, on given coefficients c = (c0 , . . . , cn )T ∈ Rn+1 , be evaluated by the


following generalization of the Clenshaw algorithm, Algorithm 7.

Algorithm 7 Clenshaw algorithm


1: function Clenshaw(c, x)
2: Input: coefficients c = (c0 , . . . , cn )T ∈ Rn+1 and x ∈ [a, b].
3:
4: let zn+1 = 0; zn = cn ;
5: for k = n − 1, . . . , 0 do
6: let zk = ck + ak+1 (x) zk+1 + bk+2 (x) zk+2 ;
7: end for
8: return fn (x) = p0 (x) z0 .
9: end function

Which algorithm results especially for the evaluation of a Legendre partial


sum
n
fn (x) = cj Lj (x) for x ∈ [−1, 1]
j=0

with the Legendre polynomials L0 , . . . , Ln (from Definition 4.23)?


4.5 Exercises 137

Exercise 4.39. In this problem, we consider approximating the exponential


function f (x) = e−x on the interval [−1, 1] by polynomials from Pn , for
1/2
n ∈ N0 , with respect to the weighted norm  · w = (·, ·)w , where
1
w(x) = √ for x ∈ (−1, 1).
1 − x2
To this end, we use the Chebyshev polynomials Tk (x) = cos(k arccos(x)).
Compute the coefficients c∗ = (c∗0 , . . . , c∗n )T ∈ Rn+1 of the best approximation

n
p∗n (x) = c∗k Tk (x) ∈ Pn for x ∈ [−1, 1] and n ∈ N0 .
k=0

Exercise 4.40. In this problem, we use the Legendre polynomials

dk k!
Lk (x) = (x2 − 1)k for 0 ≤ k ≤ n
dxk (2k)!
to determine the best approximation p∗n ∈ Pn , n ∈ N0 , to the exponential
function f (x) = e−x on [−1, 1] w.r.t. the (unweighted) Euclidean norm  · 2 .
Compute the first eight coefficients c∗ = (c∗0 , . . . , c∗7 )T ∈ R8 of the sought
best approximation

n
p∗n (x) = c∗k Lk (x) for x ∈ [−1, 1].
k=0

Exercise 4.41. In this programming problem, we compare the two approx-


imations to f (x) = e−x from the previous Exercises 4.39 and 4.40.
(a) Evaluate the two best approximations p∗n ∈ Pn (from Exercises 4.39
and 4.40, respectively) for n = 3, 4, 5, 6, 7 at N + 1 equidistant points
2j
xj = −1 + for j = 0, . . . , N
N
for a suitable N ≥ 1 by the modified Clenshaw algorithm, Algorithm 6.
Plot the graphs of the functions p∗n and f in one figure, for n = 3, 4, 5, 6, 7.
(b) Record for your computations in (a) the approximation errors
5
6N
6
ε2 = 7 |p∗n (xj ) − f (xj )|2 and ε∞ = max |p∗n (xj ) − f (xj )|,
0≤j≤N
j=0

for n = 3, 4, 5, 6, 7. Display your results in one table.


(c) Compare the approximation by Chebyshev polynomials (Exercise 4.39)
with the approximation by Legendre polynomials (Exercise 4.40). Take
notes of your numerical observations. Did the observed numerical results
match your perception?
138 4 Euclidean Approximation

Exercise 4.42. Consider for n ∈ N0 the Hermite function

hn (x) := Hn (x) · e−x


2
/2
for x ∈ R, (4.55)

where Hn denotes the n-th Hermite polynomial in (4.41).


Show that the Hermite functions hn satisfy the differential equation

hn (x) − x2 − 2n − 1 hn (x) = 0 for n ≥ 0.

Moreover, prove the recursion

hn+1 (x) = xhn (x) − hn (x) for n ≥ 0.

Hint: Use the recursions from Theorem 4.29 and Corollary 4.30.
5 Chebyshev Approximation

In this chapter, we study, for a compact domain Ω ⊂ Rd , d ≥ 1, the approxi-


mation of continuous functions from the linear space
C (Ω) = {u : Ω −→ R | u continuous}
with respect to the maximum norm
u∞ = max |u(x)| for u ∈ C (Ω).
x∈Ω

The maximum norm  · ∞ is also referred to as Chebyshev1 norm, and so in


this chapter, we are concerned with Chebyshev approximation, i.e., approxi-
mation with respect to  · ∞ .
To approximate functions from C (Ω), we work with finite-dimensional
linear subspaces S ⊂ C (Ω). Under this assumption, there is for any f ∈ C (Ω)
a best approximation s∗ ∈ S to f , see Corollary 3.8. Further in Chapter 3,
we analyzed the problem of Chebyshev approximation from a more general
viewpoint. Recall that we have made a negative observation: The Chebyshev
norm  · ∞ is not strictly convex, as shown in Example 3.34. According to
Theorem 3.37, however, strictly convex norms guarantee (for convex S ⊂ F)
the uniqueness of best approximations. Therefore, the problem of approxi-
mation with respect to the Chebyshev norm  · ∞ appears to be, at first
sight, rather critical.
But we should not be too pessimistic. In this chapter, we will derive
suitable conditions on the approximation space S ⊂ C (Ω), under which we
can even guarantee strong uniqueness of best approximations. However, accor-
ding to the Mairhuber-Curtis theorem, Theorem 5.25, strong uniqueness can
only be achieved for the univariate case. Therefore, the case d = 1, where
Ω = [a, b] ⊂ R is a compact interval, is of primary importance. In fact, we
will study Chebyshev approximation to continuous functions in C [a, b] by
algebraic polynomials from Pn , for n ∈ N0 , in more detail.
Furthermore, we derive suitable characterizations for best approximations
for the particular case · = ·∞ , where we can rely on our previous results in
Chapter 3. This finally leads us to the Remez algorithm, an iterative numerical
method to compute best approximations with respect to the Chebyshev norm
 · ∞ . We show linear convergence for the Remez iteration.
1
Pafnuty Lvovich Chebyshev (1821-1894), Russian mathematician

© Springer Nature Switzerland AG 2018 139


A. Iske, Approximation Theory and Algorithms for Data Analysis, Texts
in Applied Mathematics 68, https://doi.org/10.1007/978-3-030-05228-7_5
140 5 Chebyshev Approximation

5.1 Approaches to Construct Best Approximations


For a compact domain Ω ⊂ Rd , d ≥ 1, we denote by C (Ω) the linear space of
all continuous functions on Ω. Moreover, we assume throughout this chapter
that S ⊂ C (Ω) is a finite-dimensional linear subspace of C (Ω). Under these
assumptions, there exists, according to Corollary 3.8, for any f ∈ C (Ω) a
best approximation s∗ ∈ S to f with respect to the Chebyshev norm  · ∞ .
However, s∗ is not necessarily unique, since  · ∞ is not strictly convex.
Now we apply the characterizations for best approximations from Chap-
ter 3 to the special case of the Chebyshev norm ·∞ . We begin with the direct
characterizations from Section 3.4, where we had proven the Kolmogorov cri-
terion, Corollary 3.55. We can adapt the Kolmogorov criterion to the Cheby-
shev norm  · ∞ as follows.

Theorem 5.1. Let S ⊂ C (Ω) be a linear subspace of C (Ω) and suppose


f ∈ C (Ω) \ S. Then s∗ ∈ S is a best approximation to f with respect to
 · ∞ , if and only if

max s(x) sgn((s∗ − f )(x)) ≥ 0 for all s ∈ S, (5.1)


x∈Es∗ −f

where
Es∗ −f = {x ∈ Ω : |(s∗ − f )(x)| = s∗ − f ∞ } ⊂ Ω
denotes the set of extremal points of s∗ − f in Ω.

Proof. By the equivalence of the Kolmogorov criterion, Corollary 3.55, s∗ is


a best approximation to f with respect to  ·  =  · ∞ , if and only if

+ (s∗ − f, s − s∗ ) = max (s − s∗ )(x) sgn((s∗ − f )(x)) ≥ 0 for all s ∈ S,


x∈Es∗ −f

where we have used the Gâteaux derivative of the norm  · ∞ from Theo-
rem 3.64. By the linearity of S, this condition is equivalent to (5.1). 

Given the result of Theorem 5.1, we can immediately solve one simple
problem of Chebyshev approximation. To this end, we regard the univariate
case, d = 1, where Ω = [a, b] ⊂ R for a compact interval. In this case, we
wish to approximate continuous functions from C [a, b] by constants.

Corollary 5.2. Let [a, b] ⊂ R be compact and f ∈ C [a, b]. Then

fmin + fmax
c∗ = ∈ P0
2
is the unique best approximation to f from P0 with respect to  · ∞ , where

fmin = min f (x) and fmax = max f (x).


x∈[a,b] x∈[a,b]
5.1 Approaches to Construct Best Approximations 141

Proof. For f ∈ P0 , the statement is trivial. Now suppose f ∈ C [a, b] \ P0 .


The continuous function f ∈ C [a, b] attains its minimum and maximum on
the compact interval [a, b]. Therefore, there are xmin , xmax ∈ [a, b] satisfying
fmin = f (xmin ) and fmax = f (xmax ).
Obviously, xmin , xmax lie in the set of extremal points Ec∗ −f , where
c∗ − f (xmin ) = η
c∗ − f (xmax ) = −η
with η = c∗ − f ∞ = (fmax − fmin )/2 > 0. Moreover, in this case we have
max c sgn(c∗ − f (x)) = c sgn(c∗ − f (xmin )) = c ≥ 0
x∈Ec∗ −f

for c ≥ 0 on the one hand, and


max c sgn(c∗ − f (x)) = c sgn(c∗ − f (xmax )) = −c > 0
x∈Ec∗ −f

for c < 0 on the other hand. Altogether, the Kolmogorov criterion from
Theorem 5.1,
max c sgn(c∗ − f (x)) ≥ 0 for all c ∈ P0 ,
x∈Ec∗ −f

is satisfied. Therefore, c∗ is a best approximation to f from P0 with respect


to  · ∞ .
Finally, c∗ is the unique best approximation to f , since for c = c∗ we have
c − f ∞ ≥ |c − fmin | > |c∗ − fmin | = c∗ − f ∞ for c > c∗ ;
c − f ∞ ≥ |c − fmax | > |c∗ − fmax | = c∗ − f ∞ for c < c∗ .

Observe from the above construction of the unique best approximation
c∗ ∈ P0 to f ∈ C [a, b] that there are at least two different extremal points
x1 , x2 ∈ Ec∗ −f which satisfy the alternation condition
(c∗ − f )(xk ) = (−1)k σc∗ − f ∞ for k = 1, 2 (5.2)
for some σ ∈ {±1}. The alternation condition (5.2) is necessary and sufficient
for a best approximation from P0 . Moreover, there is no upper bound for the
number of alternation points. We can further explain this by the following
simple example.
Example 5.3. We approximate fm (x) = cos(mx), for m ∈ N, on the interval
[−π, π] by constants. According to Corollary 5.2, c∗ ≡ 0 is the unique best
approximation to fm from P0 , for all m ∈ N. We get c∗ − fm ∞ = 1 for
the minimal distance between fm and P0 , and the error function c∗ − fm has
2m + 1 alternation points xk = πk/m, for k = −m, . . . , m. ♦
142 5 Chebyshev Approximation

For the approximation of f ∈ C [a, b] by polynomials from Pn−1 , there are


at least n+1 alternation points. Moreover, the best approximation p∗ ∈ Pn−1
to f is unique. We can prove these two statements by another corollary from
the Kolmogorov criterion, Theorem 5.1.
Corollary 5.4. Let [a, b] ⊂ R be compact and f ∈ C [a, b] \ Pn−1 , where
n ∈ N. Then there is a unique best approximation p∗ ∈ Pn−1 to f from Pn−1
with respect to  · ∞ . Moreover, there are at least n + 1 extremal points
{x1 , . . . , xn+1 } ⊂ Ep∗ −f with a ≤ x1 < . . . < xn+1 ≤ b, that are satisfying
the alternation condition

(p∗ − f )(xk ) = (−1)k σp∗ − f ∞ for k = 1, . . . , n + 1 (5.3)

for some σ ∈ {±1}.


Proof. The existence of a best approximation is covered by Corollary 3.8.
Now let p∗ ∈ Pn−1 be a best approximation to f . We decompose the set
of extremal points Ep∗ −f into m pairwise disjoint, non-empty, and monoto-
nically increasing subsets, E1 , . . . , Em ⊂ Ep∗ −f , i.e.,

a ≤ x 1 < x 2 < . . . < xm ≤ b for all xk ∈ Ek and k = 1, . . . , m, (5.4)

so that the sign of the error p∗ − f is alternating on the sets Ek ⊂ Ep∗ −f ,


1 ≤ k ≤ m, i.e., we have, for some σ ∈ {±1},

sgn((p∗ − f )(xk )) = (−1)k σ for all xk ∈ Ek for k = 1, . . . , m. (5.5)

We denote the order relation in (5.4) in short by E1 < . . . < Em .


Note that there are at least two extremal points in Ep∗ −f , at which the
error function p∗ −f has different signs. Indeed, this is because the continuous
function p∗ − f attains its minimum and maximum on [a, b], so that

(p∗ − f )(xmin ) = −p∗ − f ∞ and (p∗ − f )(xmax ) = p∗ − f ∞

for xmin , xmax ∈ [a, b]. Otherwise, p∗ cannot be a best approximation to f .


Therefore, there are at least two subsets Ek in the above decomposition
of Ep∗ −f , i.e., m ≥ 2. We now show that we even have m ≥ n + 1 for the
number of subsets Ek .
Suppose m < n+1, or, m ≤ n. Then there is a set X ∗ = {x∗1 , . . . , x∗m−1 } of
size m − 1, whose points x∗k are located between the points from neighbouring
subsets Ek < Ek+1 , respectively, so that

xk < x∗k < xk+1 for all xk ∈ Ek , xk+1 ∈ Ek+1 and k = 1, . . . , m − 1.

In this case, the corresponding knot polynomial


m−1
ωX ∗ (x) = (x − x∗k ) ∈ Pm−1 ⊂ Pn−1
k=1
5.1 Approaches to Construct Best Approximations 143

has on the subsets Ek alternating signs, where

sgn(ωX ∗ (xk )) = (−1)m−k for all k = 1, . . . , m.

Now for the polynomial p = p∗ + σ̂ ωX ∗ ∈ Pn−1 , with σ̂ ∈ {±1}, we have

sgn((p − p∗ )(xk )(p∗ − f )(xk )) = σ̂(−1)m−k (−1)k σ = σ̂(−1)m σ

for all xk ∈ Ek and for k = 1, . . . , m. Letting σ̂ = −(−1)m σ, we have

max (p − p∗ )(xk )sgn((p∗ − f )(xk ) < 0.


xk ∈Ep∗ −f

This, however, is in contradiction to the Kolmogorov criterion, Theorem 5.1.


Therefore, there are at least m ≥ n + 1 monotonically increasing non-empty
subsets E1 < . . . < Em of Ep∗ −f , for which the sign of the error function
p∗ − f is alternating, i.e., we have (5.5) with m ≥ n + 1, which implies the
alternation condition (5.3).
Now we prove the uniqueness of p∗ by contradiction.
Suppose there is another best approximation q ∗ ∈ Pn−1 to f , p∗ = q ∗ .
Then, the convex combination p = (p∗ + q ∗ )/2 ∈ Pn−1 is according to The-
orem 3.16 yet another best approximation to f . In this case, there are for p
at least n + 1 alternation points x1 < . . . < xn+1 , so that

(p − f )(xk ) = (−1)k σp − f ∞ for k = 1, . . . , n + 1

for some σ ∈ {±1}, where {x1 , . . . , xn+1 } ⊂ Ep−f .


But the n + 1 alternation points x1 , . . . , xn+1 of p are also contained in
each of the extremal point sets Ep∗ −f and Eq∗ −f . Indeed, this is because in

1 ∗ 1
|(p − f )(xk )| + |(q ∗ − f )(xk )|
p − f ∞ = |(p − f )(xk )| ≤
2 2
1 1
≤ p∗ − f ∞ + q ∗ − f ∞ = p∗ − f ∞ = q ∗ − f ∞ ,
2 2
equality holds for k = 1, . . . , n + 1. In particular, we have

|(p∗ − f )(xk ) + (q ∗ − f )(xk )| = |(p∗ − f )(xk )| + |(q ∗ − f )(xk )|

for all 1 ≤ k ≤ n + 1.
Due to the strict convexity of the norm | · | (see Remark 3.27) and by the
equivalence statement (d) in Theorem 3.26, the signs of the error functions
p∗ − f and q ∗ − f must agree on {x1 , . . . , xn+1 }, i.e.,

sgn((p∗ − f )(xk )) = sgn((q ∗ − f )(xk )) for all k = 1, . . . , n + 1.

Altogether, the values of the polynomials p∗ , q ∗ ∈ Pn−1 coincide on the n + 1


points x1 , . . . , xn+1 , which implies p∗ ≡ q ∗ . 
144 5 Chebyshev Approximation

Now we note another important corollary, which directly follows from our
observation in Proposition 3.42 and from Exercise 3.73.
Corollary 5.5. For L > 0 let f ∈ C [−L, L]. Moreover, let p∗ ∈ Pn , for
n ∈ N0 , be the unique best approximation to f from Pn with respect to  · ∞ .
Then the following statements are true.
(a) If f is even, then its best approximation p∗ ∈ Pn is even.
(b) If f is odd, then its best approximation p∗ ∈ Pn is odd.
Proof. The linear space Pn of algebraic polynomials is reflection-invariant,
i.e., for p(x) ∈ Pn , we have p(−x) ∈ Pn . Moreover, by Corollary 5.4 there
exists for any f ∈ C [−L, L] a unique best approximation p∗ ∈ Pn to f from
Pn with respect to  · ∞ . Without loss of generality, we assume L = 1. By
Proposition 3.42 and Exercise 3.73, both statements (a) and (b) hold. 
For illustration, we apply Corollary 5.5 in the following two examples.
Example 5.6. We approximate fm (x) = sin(mx), for m ∈ N, on [−π, π] by
linear polynomials. The function fm is odd, for all m ∈ N, and so is the best
approximation p∗m ∈ P1 to fm odd. Therefore, p∗m has the form p∗m (x) = αm x
for a slope αm ≥ 0, which is yet to be determined.
Case 1: For m = 1, the constant c ≡ 0 cannot be a best approximation
to f1 (x) = sin(x), since c − f1 has only two alternation points ±π/2. By
symmetry, we can restrict our following investigations to the interval [0, π].
The function p∗1 (x) − f1 (x) = α1 x − sin(x), with α1 > 0, has two alternation
points {x∗ , π} on [0, π],
(p∗1 − f1 )(x∗ ) = α1 x∗ − sin(x∗ ) = −η and (p∗1 − f1 )(π) = α1 π = η,
where η = p∗1 − f1 ∞ is the minimal distance between f1 and P1 . Moreover,
the alternation point x∗ satisfies the condition
0 = (p∗1 − f1 ) (x∗ ) = α1 − cos(x∗ ) which implies α1 = cos(x∗ ).
Therefore, x∗ is a solution of the nonlinear equation
cos(x∗ )(x∗ + π) = sin(x∗ ),
which we can solve numerically, whereby we obtain the alternation point x∗ ≈
1.3518, the slope α1 = cos(x∗ ) ≈ 0.2172 and the minimal distance η ≈ 0.6825.
Altogether, the best approximation p∗1 (x) = α1 x with {−π, −x∗ , x∗ , π} gives
four alternation points for p∗1 − f1 on [−π, π], see Figure 5.1 (a).
Case 2: For m > 1, p∗m ≡ 0 is the unique best approximation to fm .
For the minimal distance, we get p∗m − fm ∞ = 1 and the error function

pm − fm has 2m alternation points
2k − 1
xk = ± π for k = 1, 2, . . . , m,
2m
see Figure 5.1 (b) for the case m = 2. ♦
5.1 Approaches to Construct Best Approximations 145
1

0.8

0.6

0.4

0.2

-0.2

-0.4

-0.6

-0.8

-1
-3 -2 -1 0 1 2 3

(a) approximation of the function f1 (x) = sin(x)

0.8

0.6

0.4

0.2

-0.2

-0.4

-0.6

-0.8

-1
-3 -2 -1 0 1 2 3

(b) approximation of the function f2 (x) = sin(2x)

Fig. 5.1. Approximation of the function fm (x) = sin(mx) on [−π, π] by linear


polynomials for (a) m = 1 and (b) m = 2. The best approximation p∗m ∈ P1 to fm ,
m = 1, 2, is odd. In Example 5.6, we determine the best approximation p∗m ∈ P1 to
fm and the corresponding alternation points for all m ∈ N.
146 5 Chebyshev Approximation

Regrettably, the characterization of best approximations in Corollary 5.4


is not constructive, since neither do we know the set of extremal points Ep∗ −f
nor do we know the minimal distance p∗ −f ∞ a priori. Otherwise, we could
immediately compute the best approximating polynomial p∗ ∈ Pn−1 from the
interpolation conditions

p∗ (xk ) = f (xk ) + (−1)k η where η = σp∗ − f ∞ ,

for k = 1, . . . , n+1. For further illustration, we discuss the following example,


where we can predetermine some of the extremal points.
Example 5.7. We approximate the absolute-value function f (x) = |x|
on [−1, 1] by quadratic polynomials. To construct the best approximation
p∗2 ∈ P2 to f ∈ C [−1, 1] we first note the following observations.
• The function f is even, and so p∗2 must be even, by Corollary 5.5.
• By Corollary 5.4 there are at least four extremal points, |Ep∗2 −f | ≥ 4.
• The error function e = p∗2 −f on [0, 1] is a quadratic polynomial. Therefore,
e has on (0, 1) at most one local extremum x∗ ∈ (0, 1). This local extremum
must lie in the set of extremal points Ep∗2 −f . By symmetry, −x∗ ∈ (−1, 0)
is also contained in the set of extremal points Ep∗2 −f .
• Further extrema of the error function e can only be at the origin or at
the boundary points ±1. Since |Ep∗2 −f | ≥ 4 and due to symmetry, both
boundary points ±1 must lie in Ep∗2 −f .
• To satisfy the alternation condition, the origin must necessarily lie in
Ep∗2 −f . Indeed, for the subset E = {−1, −x∗ , x∗ , 1} ⊂ Ep∗2 −f the four
signs of e on E are symmetric, in particular not alternating. Therefore, we
have Ep∗2 −f = {−1, −x∗ , 0, x∗ , 1} for some x∗ ∈ (0, 1).
• By symmetry we can restrict ourselves in the following investigations to the
unit interval [0, 1]: Since the error function e = p∗2 − f has three extrema
{0, x∗ , 1} in [0, 1], e has two zeros in (0, 1), i.e., the function graphs of f and
p∗2 intersect in (0, 1) at two points. Hence, p∗2 is convex, where p∗2 (0) > 0.
We can now sketch the function graphs of f and p∗2 (see Figure 5.2).
By our above observations the best approximation p∗2 has the form

p∗2 (x) = η + αx2

with the minimal distance η = p∗2 −f ∞ , for some positive slope α > 0. More-
over, e = p∗2 − f has on the set of extremal points Ep∗2 −f = {−1, −x∗ , 0, x∗ , 1}
alternating signs ε = (1, −1, 1, −1, 1). We compute α by the alternation con-
dition at x = 1,
(p∗2 − f )(1) = η + α − 1 = η,
and so we obtain α = 1, so that p∗2 (x) = η + x2 . The local minimum x∗ of
the error function e = p∗2 − f satisfies the necessary condition

e (x∗ ) = 2x∗ − 1 = 0,
5.1 Approaches to Construct Best Approximations 147

whereby x∗ = 1/2, so that Ep∗2 −f = {−1, −1/2, 0, 1/2, 1}. Finally, at x∗ = 1/2
we have the alternation condition

(p∗2 − f )(1/2) = η + 1/4 − 1/2 = −η,

holds, whereby η = 1/8. Hence, the quadratic polynomial p∗2 (x) = 1/8 + x2
is the unique best approximation to f from P2 with respect to  · ∞ . ♦

1.2

0.8

0.6

0.4

0.2

0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1

Fig. 5.2. Approximation of the function f (x) = |x| on [−1, 1] by quadratic polyno-
mials. The best approximation p∗2 ∈ P2 to f is even and convex. The set of extremal
points Ep∗2 −f = {−1, −x∗ , 0, x∗ , 1} has five alternation points.

A more constructive account for the computation of best approximations


relies on the dual characterizations from Section 3.3. To this end, we recall
the necessary and sufficient condition from Theorem 3.48. According to The-
orem 3.48, s∗ ∈ S ⊂ C (Ω) is a best approximation to f ∈ C (Ω) with respect
to  · ∞ , if and only if there is a dual functional ϕ ∈ (C (Ω)) satisfying
(a) ϕ∞ = 1.
(b) ϕ(s∗ − f ) = s∗ − f ∞ .
(c) ϕ(s − s∗ ) ≥ 0 for all s ∈ S.
148 5 Chebyshev Approximation

To construct such a characterizing dual functional, we use the assumption



m
ϕ(u) = λk εk u(xk ) for u ∈ C (Ω) (5.6)
k=1

with coefficient vector λ = (λ1 , . . . , λm )T ∈ Λm , lying at the boundary


*  +
 m
m
Λm = (λ1 , . . . , λm ) ∈ R  λk ∈ [0, 1], 1 ≤ k ≤ m,
T
λk = 1 (5.7)
k=1

of the standard simplex Δm ⊂ Rm from (2.38). Moreover,

ε = (ε1 , . . . , εm )T ∈ {±1}m

denotes a sign vector and X = {x1 , . . . , xm } ⊂ Ω is a point set.


Assuming (5.6) condition (a) is already satisfied, since
m 
 
 
|ϕ(u)| =  λk εk u(xk ) ≤ u∞ for all u ∈ C (Ω) (5.8)
 
k=1

and so ϕ∞ ≤ 1. Moreover, for any u ∈ C (Ω) satisfying u∞ = 1 and


u(xk ) = εk , for all 1 ≤ k ≤ m, we have equality in (5.8), so that ϕ has norm
length one by ϕ∞ = λ1 = 1.
To satisfy condition (b), we choose X = Es∗ −f , i.e., Es∗ −f = {x1 , . . . , xm }.
In this case, we get, in combination with εk = sgn((s∗ − f )(xk )), the identity


m 
m
ϕ(s∗ − f ) = λk εk (s∗ − f )(xk ) = λk |(s∗ − f )(xk )| = s∗ − f ∞ .
k=1 k=1

But the set of extremal points Es∗ −f is unknown a priori. Moreover, it


remains to satisfy condition (c).
From now on we study the construction of coefficients λ ∈ Λm , signs
ε ∈ {±1}m and points X = {x1 , . . . , xm } in more detail. In order to do so,
we need some technical preparations. We begin with the representation of
convex hulls.

Definition 5.8. Let F be a linear space and M ⊂ F. Then the convex hull
conv(M) of M is the smallest convex set in F containing M, i.e.,

conv(M) = K.
M⊂K⊂F
K convex
5.1 Approaches to Construct Best Approximations 149

The following representation for conv(M) is much more useful in practice.


Theorem 5.9. Let F be a linear space and M ⊂ F. Then we have
⎧ ⎫
⎨  ⎬
m

conv(M) = λj xj  xj ∈ M and λ = (λ1 , . . . , λm )T ∈ Λm for m ∈ N .
⎩ ⎭
j=1

Proof. Let us consider the set


⎧ ⎫
⎨  ⎬
m

K= λj xj  xj ∈ M and λ = (λ1 , . . . , λm )T ∈ Λm for m ∈ N . (5.9)
⎩ ⎭
j=1

We now show the following properties for K.


(a) K is convex.
(b) M ⊂ K.
(c) conv(M) ⊂ K.
(a): For x, y ∈ K we have the representations

m
x= λj x j with λ = (λ1 , . . . , λm )T ∈ Λm , {x1 , . . . , xm } ⊂ M, m ∈ N
j=1

n
y= μ k yk with μ = (μ1 , . . . , μn )T ∈ Λn , {y1 , . . . , yn } ⊂ M, n ∈ N.
k=1

Note that any convex combination αx + (1 − α)y, α ∈ [0, 1], can be written
as a convex combination of the points x1 , . . . , xm , y1 , . . . , yn ,

m 
n 
m 
n
αx + (1 − α)y = α λj xj + (1 − α) μ k yk = αλj xj + (1 − α)μk yk ,
j=1 k=1 j=1 k=1

so that αx + (1 − α)y ∈ K for all α ∈ [0, 1].


(b): Any point x ∈ M lies in K, by m = 1, λ1 = 1 and x1 = x in (5.9).
Therefore, the inclusion M ⊂ K holds.
(c): By (a) and (b) K is a convex set containing M. From the minimality
of conv(M) we can conclude conv(M) ⊂ K.
We now show the inclusion K ⊂ conv(M). To this end, we first note that
any convex L containing M, i.e., M ⊂ L, is necessarily a superset of K, i.e.,
K ⊂ L. Indeed, this is because L contains all finite convex combinations of
points from M. This immediately implies

K⊂ L = conv(M).
M⊂L
L convex

Altogether, we have K = conv(M). 


150 5 Chebyshev Approximation

By the characterization in Theorem 5.9, we can identify the convex hull


conv(M), for any set M ⊂ F, as the set of all finite convex combina-
tions of points from M. For finite-dimensional linear spaces F, the length of
the convex combinations can uniformly be bounded above according to the
Carathéodory2 theorem.

Theorem 5.10. (Carathéodory).


Let F be a linear space of finite dimension dim(F) = n < ∞. Moreover,
suppose M ⊂ F. Then we have the representation
⎧ ⎫
⎨  ⎬
m

conv(M) = λj xj  xj ∈ M, λ = (λ1 , . . . , λm )T ∈ Λm for m ≤ n + 1 .
⎩ ⎭
j=1

Proof. For x ∈ conv(M) we consider a representation of the form


m
x= λj xj with λ = (λ1 , . . . , λm )T ∈ Λm and x1 , . . . , xm ∈ M
j=1

but with minimal m ∈ N. Then λj > 0, i.e., λj ∈ (0, 1] for all 1 ≤ j ≤ m.


From the assumed representation we get

m
λj (x − xj ) = 0 ,
j=1

i.e., the elements x − xj ∈ F, 1 ≤ j ≤ m, are linearly dependent in F.


Now suppose m > n + 1, or, m − 1 > n. Then there are α2 , . . . , αm ∈ R,
that are not all vanishing, with

m
αj (x − xj ) = 0,
j=1

where we let α1 = 0. This gives the representation



m 
m
0= (λj + tαj )(x − xj ) = μj (t)(x − xj ) for all t ∈ R,
j=1 j=1

with μj (t) = (λj + tαj ) and therefore μj (0) = λj > 0.


Now we choose one t∗ ∈ R satisfying

μj (t∗ ) = λj + t∗ αj ≥ 0 for all j = 1, . . . , m,

and μk (t∗ ) = 0 for some k ∈ {1, . . . , m}. By


2
Constantin Carathéodory (1873-1950), Greek mathematician
5.1 Approaches to Construct Best Approximations 151

μj (t∗ )
ρj = : m ∗
≥0 for j = 1, . . . , m
k=1 μk (t )

we have

m
ρj = 1
j=1

and

m 
m
ρj (x − xj ) = 0 ⇐⇒ x= ρj x j .
j=1 j=1

Note that ρk = 0. But this is in contradiction to the minimality of m. 

The Carathéodory theorem implies the following important result.

Corollary 5.11. Let F be a normed linear space of finite dimension n < ∞.


Suppose M ⊂ F is a compact subset of F. Then conv(M) is compact.

Proof. We regard on the compact set Ln+1 = Λn+1 × Mn+1 the continuous
mapping ϕ : Ln+1 −→ F, defined as


n+1
ϕ(λ, X) = λj x j
j=1

for λ = (λ1 , . . . , λn+1 )T ∈ Λn+1 and X = (x1 , . . . , xn+1 ) ∈ Mn+1 . According


to the Carathéodory theorem, Theorem 5.10, we have ϕ(Ln+1 ) = conv(M).
Therefore, conv(M) is also compact, since conv(M) is the image of the com-
pact set Ln+1 under the continuous mapping ϕ : Ln+1 −→ F. 

From Corollary 5.11, we gain the following separation theorem.

Theorem 5.12. Let M ⊂ Rd be compact. Then the following two statements


are equivalent.
(a) There is no β ∈ Rd \ {0} satisfying β T x > 0 for all x ∈ M.
(b) 0 ∈ conv(M).

Proof. (b) ⇒ (a): Let 0 ∈ conv(M). Then we have the representation


m
0= λj xj with λ = (λ1 , . . . , λm )T ∈ Λm and x1 , . . . , xm ∈ M.
j=1

Suppose there is one β ∈ Rd \ {0} satisfying β T x > 0 for all x ∈ M. Then


we immediately get a contradiction by

m
βT 0 = 0 = λj β T xj > 0.
j=1
152 5 Chebyshev Approximation

(a) ⇒ (b): Suppose statement (a) holds. Further suppose that 0 ∈ / conv(M).
Since conv(M) is compact, by Corollary 5.11, there is one β∗ ∈ conv(M),
β∗ = 0, of minimal Euclidean norm in conv(M). This minimum β∗ , viewed
as a best approximation from conv(M) to the origin with respect to  · 2 , is
characterized by

(β∗ − 0, x − β∗ ) ≥ 0 for all x ∈ conv(M)

according to the Kolmogorov theorem, Corollary 3.55, in combination with


the Gâteaux derivative for Euclidean norms in Theorem 3.62. But this con-
dition is equivalent to

β∗T x = (β∗ , x) ≥ (β∗ , β∗ ) = β∗ 22 > 0 for all x ∈ conv(M).

But this is in contradiction to our assumption in (a). 

Remark 5.13. The equivalence statement (a) in Theorem 5.12 says that the
Euclidean space Rd cannot be split by a separating hyperplane through the
origin into two half-spaces, such that M is entirely contained in one of the
two half-spaces. 

5.2 Strongly Unique Best Approximations


Now we wish to further develop the characterizations for best approximations
from Sections 3.3 and 3.4 for the special case of the Chebyshev norm  · ∞ .
In the following discussion, {s1 , . . . , sn } ⊂ S, for n ∈ N, denotes a basis for
the finite-dimensional approximation space S ⊂ C (Ω). To characterize a best
approximation s∗ ∈ S to some f ∈ C (Ω)\S, we work with the compact point
set 1  2
Ms∗ −f = (s∗ − f )(x)(s1 (x), . . . , sn (x))T  x ∈ Es∗ −f ⊂ Rn ,
where we can immediately draw the following conclusion from Theorem 5.12.

Corollary 5.14. For s∗ ∈ S the following statements are equivalent.


(a) s∗ is a best approximation to f ∈ C (Ω) \ S.
(b) 0 ∈ conv(Ms∗ −f ).

Proof. In this proof, we use the notation



n
sβ = βj sj ∈ S for β = (β1 , . . . , βn )T ∈ Rn . (5.10)
j=1

(b) ⇒ (a): Let 0 ∈ conv(Ms∗ −f ). Suppose s∗ ∈ S is not a best approxi-


mation to f . Then there is one β = (β1 , . . . , βn )T ∈ Rn \ {0} satisfying

s∗ − f − sβ ∞ < s∗ − f ∞


5.2 Strongly Unique Best Approximations 153

In this case, we have

|(s∗ − f )(x) − sβ (x)|2 < |(s∗ − f )(x)|2 for all x ∈ Es∗ −f .

But this is equivalent to

|(s∗ − f )(x)|2 − 2(s∗ − f )(x)sβ (x) + s2β (x) < |(s∗ − f )(x)|2

for all x ∈ Es∗ −f , so that


1 2
(s∗ − f )(x)sβ (x) > s (x) ≥ 0 for all x ∈ Es∗ −f ,
2 β
i.e.,

β T (s∗ − f )(x)(s1 (x), . . . , sn (x))T > 0 for all x ∈ Es∗ −f .

By the equivalence statements in Theorem 5.12, we see that the origin is


in this case not contained in the convex hull conv(Ms∗ −f ). But this is in
contradiction to statement (b).
(a) ⇒ (b): Let s∗ be a best approximation to f . Suppose 0 ∈ / conv(Ms∗ −f ).
Due to Theorem 5.12, there is one β = (β1 , . . . , βn )T ∈ Rn \ {0} satisfying
β T u > 0, or, −β T u < 0, for all u ∈ Ms∗ −f . But this is equivalent to

(s∗ − f )(x) s−β (x) < 0 for all x ∈ Es∗ −f ,

i.e., s∗ − f and s−β have opposite signs on Es∗ −f , whereby

sgn((s∗ − f )(x)) s−β (x) < 0 for all x ∈ Es∗ −f .

In particular (by using the compactness of Es∗ −f ), we have

max sgn((s∗ − f )(x)) s−β (x) < 0.


x∈Es∗ −f

But this is, due to the Kolmogorov criterion, Theorem 5.1, in contradiction
to the optimality of s∗ in (a), 
Corollary 5.14 yields an important result concerning the characterization
of best approximations.
Corollary 5.15. For s∗ ∈ S the following statements are equivalent.
(a) s∗ is a best approximation to f ∈ C (Ω) \ S.
(b) There are m ≤ n + 1
• pairwise distinct extremal points x1 , . . . , xm ∈ Es∗ −f
• signs εj = sgn((s∗ − f )(xj )), for j = 1, . . . , m,
• coefficients λ = (λ1 , . . . , λm )T ∈ Λm with λj > 0 for all 1 ≤ j ≤ m,
satisfying
m
ϕ(s) := λj εj s(xj ) = 0 for all s ∈ S. (5.11)
j=1
154 5 Chebyshev Approximation

Proof. (a) ⇒ (b): Let s∗ be a best approximation to f .


Then we have 0 ∈ conv(Ms∗ −f ) by Corollary 5.14. According to the
Carathéodory theorem, Theorem 5.10, there are m ≤ n + 1 extremal points
x1 , . . . , xm ∈ Es∗ −f and coefficients λ = (λ1 , . . . , λm )T ∈ Λm satisfying


m 
m 
m
0= λj ((s∗ − f )(xj ))sk (xj ) = λj εj s∗ − f ∞ sk (xj ) = λj εj sk (xj )
j=1 j=1 j=1

for all basis elements sk ∈ S, k = 1, . . . , n.


(b) ⇒ (a): Under the assumption in (b), we have 0 ∈ conv(Ms∗ −f ),
whereby s∗ is a best approximation to f , due to Corollary 5.14. 

Remark 5.16. In statement (b) of Corollary 5.15 the alternation condition

εj · εj+1 = −1 for j = 1, . . . , m − 1

is not necessarily satisfied. In Corollary 5.4, we considered the special case


of polynomial approximation by S = Pn−1 ⊂ C [a, b]. In that case, the alter-
nation condition (5.3) is satisfied with at least n + 1 extremal points. But in
Corollary 5.15 only at most n + 1 extremal points are allowed. 

In the following discussion, we will see how the characterizations of Corol-


lary 5.4 and Corollary 5.15 can be combined. To this end, the following result
is of central importance, whereby we can even prove strong uniqueness for
best approximations.

Theorem 5.17. For f ∈ C (Ω) \ S, let s∗ ∈ S be a best approximation to f .


Moreover, suppose that ϕ : C (Ω) −→ R is a linear functional of the form


m
ϕ(u) = λk εk u(xk ) for u ∈ C (Ω) (5.12)
k=1

satisfying the dual characterization (5.11) of Corollary 5.15 for a point set
X = {x1 , . . . , xm } ⊂ Es∗ −f , where 2 ≤ m ≤ n + 1. Then, we have for any
s ∈ S the estimates
λmin
s − f ∞ ≥ s − f ∞,X ≥ s∗ − f ∞ + s∗ − s∞,X , (5.13)
1 − λmin
where λmin := min1≤j≤m λj > 0.

Proof. Suppose s ∈ S. Then, the first estimate in (5.13) is trivial. To show


the second estimate in (5.13), we use the ingredients ε ∈ {±1}m , λ ∈ Λm ,
and X ⊂ Es∗ −f for the functional ϕ in (5.12) from the dual characterization
of Corollary 5.15. Note that we have

s − f ∞,X ≥ εj (s − f )(xj ) = εj (s − s∗ )(xj ) + εj (s∗ − f )(xj )


5.2 Strongly Unique Best Approximations 155

and, moreover, εj (s∗ − f )(xj ) = s∗ − f ∞ , for all j = 1, . . . , m, so that

s − f ∞,X ≥ s∗ − f ∞ + εj (s − s∗ )(xj ) for all 1 ≤ j ≤ m. (5.14)

Since m ≥ 2, we have λmin ∈ (0, 1/2] and so λmin /(1 − λmin ) ∈ (0, 1].
Now let xj ∗ ∈ X be a point satisfying |(s − s∗ )(xj ∗ )| = s − s∗ ∞,X . If
εj (s − s∗ )(xj ∗ ) = s − s∗ ∞,X , then the second estimate in (5.13) is satisfied,
with λmin /(1−λmin ) ≤ 1. Otherwise, we have εj (s−s∗ )(xj ∗ ) = −s−s∗ ∞,X ,
whereby with ϕ(s − s∗ ) = 0 the estimate


m
λj ∗ s−s∗ ∞,X = λk εk (s−s∗ )(xk ) ≤ (1−λj ∗ ) max∗ εk (s−s∗ )(xk ) (5.15)
k=j
k=1
k=j ∗

follows. Now then, for k ∗ ∈ {1, . . . , m} \ {j ∗ } satisfying

εk∗ (s − s∗ )(xk∗ ) = max∗ εk (s − s∗ )(xk )


k=j

we find, due to (5.15), the estimate

λmin λj ∗
s∗ − s∞,X ≤ s∗ − s∞,X ≤ εk∗ (s − s∗ )(xk∗ ),
1 − λmin 1 − λj ∗

which implies, in combination with (5.14), the second estimate in (5.13). 

Note that for any best approximation s∗ ∈ S to f ∈ C (Ω) \ S, the


estimates in (5.13) yield the inequality

λmin
s − f ∞ − s∗ − f ∞ ≥ s − s∗ ∞,X for all s ∈ S. (5.16)
1 − λmin

Given this result, we can further analyze the question for the (strong) unique-
ness of best approximations to f ∈ C (Ω) \ S. To this end, we first take note
of the following simple observation.

Remark 5.18. Let s∗ ∈ S be a best approximation to f ∈ C (Ω) \ S. Then,


for any other best approximation s∗∗ ∈ S to f ∈ C (Ω) we have

λmin
0 = s∗∗ − f ∞ − s∗ − f ∞ ≥ s∗∗ − s∗ ∞,X
1 − λmin

by (5.16), and this implies, for λmin ∈ (0, 1), the identity

s∗∗ − s∗ ∞,X = 0.

In conclusion, all best approximations to f must coincide on X. Now if ·∞,X


is a norm on S, then s∗ will be the unique best approximation to f . 
156 5 Chebyshev Approximation

In the following Section 5.3, we develop suitable conditions on S ⊂ C (Ω),


under which we can guarantee the uniqueness of best approximations. In
our developments the definiteness of  · ∞,X plays an important role. As
we can show already now, the definiteness of  · ∞,X guarantees the strong
uniqueness of a best approximation s∗ ∈ S to f ∈ C (Ω) \ S.
Theorem 5.19. Under the assumptions of Theorem 5.17, let  · ∞,X be a
norm on S. Then there exists for any f ∈ C (Ω) \ S a strongly unique best
approximation s∗ ∈ S to f .
Proof. The approximation space S ⊂ C (Ω) is finite-dimensional. Therefore,
there exists for any f ∈ C (Ω) a best approximation s∗ ∈ S to f , according
to Corollary 3.8. Moreover, all norms on S are equivalent. In particular, the
two norms  · ∞ and  · ∞,X are on S equivalent, so that there is a constant
β > 0 satisfying
s∞,X ≥ βs∞ for all s ∈ S. (5.17)
By (5.13), the best approximation s∗ to f is strongly unique, since
λmin
s−f ∞ −s∗ −f ∞ ≥ s−s∗ ∞,X ≥ αs−s∗ ∞ for all s ∈ S,
1 − λmin
where α = βλmin /(1 − λmin ) > 0. 
Before we continue our analysis, we first discuss two examples.
Example 5.20. Let F = C [−1, 1], S = P1 ⊂ F and f (x) = x2 . Then,
c∗ ≡ 1/2 is according to Corollary 5.2 the unique best approximation to f
from P0 . Since f is even, the unique best approximation p∗1 ∈ P1 to f from
P1 is also even, due to Corollary 5.5. In this case, p∗1 is necessarily constant,
and so c∗ is also the unique best approximation to f from P1 . Moreover, the
error function c∗ − f has on the interval [−1, 1] exactly three extremal points
X = {x1 , x2 , x3 } = {−1, 0, 1}, where the alternation conditions are satisfied,
1
c∗ − f (xj ) = (−1)j c∗ − f ∞ = (−1)j · for j = 1, 2, 3.
2
For λ1 = 1/4, λ2 = 1/2, λ3 = 1/4 and εj = (−1)j , for j = 1, 2, 3, we have

3
λj εj p(xj ) = 0 for all p ∈ P1 .
j=1

Moreover,  · ∞,X is a norm on P1 . According to Theorem 5.19, the


constant c∗ is the strongly unique best approximation to f from P1 . By
λmin = min1≤j≤3 λj = 1/4 we get, like in the proof of Theorem 5.19 (with
β = 1), the estimate
λmin 1
p − f ∞ − c∗ − f ∞ ≥ p − c∗ ∞,X = p − c∗ ∞ for all p ∈ P1
1 − λmin 3
for the strong uniqueness of c∗ with the constant α = 1/3. ♦
5.2 Strongly Unique Best Approximations 157

For further illustration, we make the following link to Example 5.7.

Example 5.21. Let F = C [−1, 1], S = P2 ⊂ F and f (x) = |x|. From


Example 5.7 the function p∗2 (x) = 1/8 + x2 is the unique best approximation
to f from P2 with extremal point set Ep∗2 −f = {0, ±1/2, ±1}.
For the dual characterization of the best approximation p∗2 ∈ P2 to f , we
seek, according to Corollary 5.15, a set X ⊂ Ep∗2 −f of extremal points, where
2 ≤ m = |X| ≤ dim(P2 ) + 1 = 4, signs εj = sgn((p∗2 − f )(xj )), 1 ≤ j ≤ m,
and coefficients λ = (λ1 , . . . , λm ) ∈ Λm satisfying


m
λj εj p(xj ) = 0 for all p ∈ P2 . (5.18)
j=1

This results in dim(P2 ) = 3 linear equations. Together with

λ1 + . . . + λm = 1 (5.19)

we get a total number of four linear equation conditions for λ ∈ Λm . There-


fore, we let m = 4 and, moreover, we take X = {−1/2, 0, 1/2, 1} ⊂ Ep∗2 −f
with signs ε = (−1, 1, −1, 1). In this way, we reformulate (5.18) as follows.

−λ1 p(−1/2) + λ2 p(0) − λ3 p(1/2) + λ4 p(1) = 0 for all p ∈ P2 . (5.20)

We pose the conditions from (5.20) to the three elements of the monomial
basis {1, x, x2 } of P2 . For p ≡ 1 we get −λ1 + λ2 − λ3 + λ4 = 0, whereby
from (5.19) we get

λ2 + λ4 = 1/2 and λ1 + λ3 = 1/2. (5.21)

For p(x) = x and p(x) = x2 , we get by (5.20) the conditions


1 1
λ4 = (λ3 − λ1 ) and λ4 = (λ1 + λ3 ). (5.22)
2 4
Then, (5.21) implies λ4 = 1/8 and moreover λ2 = 3/8. From (5.21) and (5.22)
we finally compute λ3 = 3/8 and λ1 = 1/8. Therefore,

1 λmin 1
λmin = and = .
8 1 − λmin 7
The characterization (5.13) in Theorem 5.17 implies the estimate
1
p − f ∞ − p∗2 − f ∞ ≥ p − p∗2 ∞,X for all p ∈ P2 . (5.23)
7
Next, we show the strong uniqueness of p∗2 , where we use Theorem 5.19.
To this end, note that  · ∞,X is a norm on P2 . Therefore, it remains to
determine an equivalence constant β > 0, like in (5.17), satisfying
158 5 Chebyshev Approximation

p∞,X ≥ βp∞ for all p ∈ P2 . (5.24)

We choose for p ∈ P2 the monomial representation p(x) = a0 + a1 x + a2 x2 .


By evaluation of p on the point set X = {−1/2, 0, 1/2, 1}, we get

a0 = p(0), a1 = p(1/2) − p(−1/2), a2 = p(1) − p(0) − p(1/2) + p(−1/2),

and therefore the (rough) estimate

p∞ ≤ |a0 | + |a1 | + |a2 | ≤ p∞,X + 2p∞,X + 4p∞,X = 7p∞,X

for all p ∈ P2 , whereby (5.24) holds for β = 1/7. Together with (5.23), this
finally yields the sought estimate
1 1
p − f ∞ − p∗2 − f ∞ ≥ p − p∗2 ∞,X ≥ p − p∗2 ∞ for all p ∈ P2 .
7 49
Therefore, p∗2 is the strongly unique best approximation to f . ♦

5.3 Haar Spaces


In this section, we develop sufficient conditions for the approximation space
S ⊂ C (Ω) under which a best approximation s∗ ∈ S to f ∈ C (Ω) \ S is
strongly unique. To this end, we can rely on the result of Theorem 5.19,
according to which we need to ensure the definiteness of  · ∞,X on S for
any X ⊂ Es∗ −f .
We continue to use the assumptions and notations from the previous
section, where (s1 , . . . , sn ) ∈ S n , for n ∈ N, denotes an ordered basis of a
finite-dimensional linear approximation space S ⊂ C (Ω). By the introduction
of Haar3 spaces we specialize our assumptions on S and (s1 , . . . , sn ) as follows.

Definition 5.22. A linear space S ⊂ C (Ω) with dim(S) = n < ∞ is called


a Haar space of dimension n ∈ N on Ω, if any s ∈ S \ {0} has at most n − 1
zeros on Ω. A basis H = (s1 , . . . , sn ) ∈ S n for a Haar space S on Ω is called
a Haar system on Ω.

In Haar spaces S of dimension n ∈ N, we can solve interpolation problems


for a discrete set X ⊂ Ω containing |X| = n pairwise distinct points. In this
case,  · ∞,X is a norm on S, and so a solution of the interpolation problem
is unique. We can further characterize Haar spaces as follows.
3
Alfréd Haar (1885-1933), Hungarian mathematician
5.3 Haar Spaces 159

Theorem 5.23. Let S ⊂ C (Ω) be a linear space of dimension n ∈ N and


X = {x1 , . . . , xn } ⊂ Ω a set of n pairwise distinct points. Then the following
statements are equivalent.
(a) Any s ∈ S \ {0} has at most n − 1 zeros on X.
(b) For s ∈ S, we have the implication

sX = 0 =⇒ s ≡ 0 on Ω,

i.e.,  · ∞,X is a norm on S.


(c) For any fX ∈ Rn , there is one unique s ∈ S satisfying sX = fX .
(d) For any basis H = (s1 , . . . , sn ) ∈ S n of S, the Vandermonde matrix
⎡ ⎤
s1 (x1 ) · · · s1 (xn )
⎢ .. ⎥ ∈ Rn×n
VH,X = ⎣ ... . ⎦
sn (x1 ) · · · sn (xn )

is regular, where in particular det(VH,X ) = 0.


If one of the statements (a)-(d) holds for all sets X = {x1 , . . . , xn } ⊂ Ω of
n pairwise distinct points, then all of the remaining three statements (a)-(d)
are satisfied for all X. In this case, S is a Haar space of dimension n on Ω.

Proof. Let X = {x1 , . . . , xn } ⊂ Ω be a set of n pairwise distinct points in


Ω. Obviously, the statements (a) and (b) are equivalent. By statement (b),
the linear mapping LX : s −→ sX is injective. Since n = dim(S) = dim(Rn ),
this is equivalent to statement (c), i.e., LX is surjective, and, moreover, also
equivalent to statement (d), i.e., LX is bijective. This completes our proof
for the equivalence of statements (a)-(d).
If one of the statements in (a)-(d) holds for all sets X = {x1 , . . . , xn } ⊂ Ω
of n pairwise distinct points, then all of the remaining three statements in
(a)-(d) are satisfied, due to the equivalence of statements (a)-(d). In this
case, statement (a) holds in particular, for all sets X = {x1 , . . . , xn } ⊂ Ω,
i.e., any s ∈ S \ {0} has at most n − 1 zeros on Ω, whereby S is, according
Definition 5.22, a Haar space of dimension n on Ω. 

According to the Mairhuber4 -Curtis5 theorem [17, 48] there are no non-
trivial Haar systems on multivariate connected domains Ω ⊂ Rd , d > 1.
Before we prove the Mairhuber-Curtis theorem, we introduce a few notions.

Definition 5.24. A domain Ω ⊂ Rd is said to be connected, if for any


pair of two points x, y ∈ Ω there is a continuous mapping γ : [0, 1] −→ Ω
satisfying γ(0) = x and γ(1) = y, i.e., the points x and y can be connected
by a continuous path in Ω.
4
John C. Mairhuber (1922-2007), US American mathematician
5
Philip C. Curtis, Jr. (1928-2016), US-American mathematician
160 5 Chebyshev Approximation

Moreover, we call a domain Ω ⊂ Rd homeomorphic to a subset of the


sphere S1 := {x ∈ R2 | x2 = 1} ⊂ R2 , if for a non-empty and connected
subset U ⊂ S1 there is a bijective continuous mapping ϕ : Ω −→ U with
continuous inverse ϕ−1 : U −→ Ω.

Theorem 5.25. (Mairhuber-Curtis, 1956/1959).


Let H = (s1 , . . . , sn ) ∈ (C (Ω))n be a Haar system of dimension n ≥ 2 on
a connected set Ω ⊂ Rd , d > 1. Then, Ω contains no bifurcation, i.e., Ω is
homeomorphic to a subset of the sphere S1 ⊂ R2 .

Fig. 5.3. According to the Mairhuber-Curtis theorem, Theorem 5.25, there are no
non-trivial Haar systems H on domains Ω containing bifurcations.

Proof. Suppose Ω contains a bifurcation (see Figure 5.3 for illustration).


Moreover, X = {x1 , . . . , xn } ⊂ Ω be a subset of n ≥ 2 pairwise distinct
points in Ω. Now regard the determinant
⎡ ⎤
s1 (x1 ) s1 (x2 ) s1 (x3 ) · · · s1 (xn )
⎢ s2 (x1 ) s2 (x2 ) s2 (x3 ) · · · s2 (xn ) ⎥
⎢ ⎥
d{x1 ,x2 ,x3 ...,xn } = det(VH,X ) = det ⎢ . .. .. .. ⎥ .
⎣ .. . . . ⎦
sn (x1 ) sn (x2 ) sn (x3 ) · · · sn (xn )

If d{x1 ,x2 ,x3 ...,xn } = 0, then H, by Theorem 5.23, is not a Haar system.
Otherwise, we can shift the two points x1 und x2 by a continuous mapping
along the two branches of the bifurcation, without any coincidence between
points in X (see Figure 5.4).
Therefore, the determinant d{x2 ,x1 ,x3 ,...,xn } has, by swapping the first two
columns in matrix VH,X , opposite sign to d{x1 ,x2 ,x3 ,...,xn } , i.e.,

sgn d{x1 ,x2 ,x3 ,...,xn } = −sgn d{x2 ,x1 ,x3 ,...,xn } .
5.3 Haar Spaces 161

x2 x
xn 1
...

domain Ω X = (x1 , . . . , xn ) ∈ Ω n

x1 x1

x2
xn xn
x2
... ...

step 1: shift of x1 step 2: shift of x2

x1 x1 x
xn xn 2
x2
... ...

step 3: back-shift of x1 step 4: back-shift of x2

Fig. 5.4. Illustration of the Mairhuber-Curtis theorem, Theorem 5.25. The two
points x1 and x2 can be swapped by a continuous mapping, i.e., by shifts along the
branches of the bifurcation without coinciding with any other point from X.
162 5 Chebyshev Approximation

Due to the continuity of the determinant, there must be a sign change of the
determinant during the (continuous) swapping between x1 and x2 . In this
case, H = {s1 , . . . , sn } cannot be a Haar system, by Theorem 5.23. But this
is in contradiction to our assumption to H. 
Due to the result of the Mairhuber-Curtis theorem, Theorem 5.25, we
restrict ourselves from now to the univariate case, d = 1. Moreover, we assume
from now that the domain Ω is a compact interval, i.e.,
Ω = [a, b] ⊂ R for − ∞ < a < b < ∞.
Before we continue our analysis on strongly unique best approximations,
we first give a few elementary examples for Haar spaces.
Example 5.26. For n ∈ N0 and [a, b] ⊂ R the linear space of polynomials Pn
is a Haar space of dimension n+1 on [a, b], since according to the fundamental
theorem of algebra any non-trivial polynomial from Pn has at most n zeros.

Example 5.27. For N ∈ N0 the linear space TNC of all complex trigonometric
polynomials of degree at most N is a Haar space of dimension N + 1 on
[0, 2π), since TNC is, by Theorem 2.36, a linear space of dimension N + 1, and,
moreover, the linear mapping p −→ pX , for p ∈ TNC is, due to Theorem 2.39,
for all sets X ⊂ [0, 2π) of |X| = N + 1 pairwise distinct points bijective.
Likewise, we can show, by using Corollaries 2.38 and 2.40, that the linear
space TnR of all real trigonometric polynomials of degree at most n ∈ N0 is a
Haar space of dimension 2n + 1 on [0, 2π). ♦
Example 5.28. For [a, b] ⊂ R and λ0 < . . . < λn the functions
1 λ0 x 2
e , . . . , e λn x
are a Haar system on [a, b]. We can show this by induction on n.
Initial step: For n = 0 the statement is trivial.
Induction hypothesis: Suppose the statement is true for n − 1 ∈ N.
Induction step (n − 1 −→ n): If a function of the form
1 2
u(x) ∈ span eλ0 x , . . . , eλn x
has n + 1 zeros in [a, b], then the function
d −λ0 x
v(x) = e · u(x) for x ∈ [a, b]
dx
has, according to the Rolle6 theorem, at least n zeros in [a, b]. However,
3 4
v(x) ∈ span e(λ1 −λ0 )x , . . . , e(λn −λ0 )x ,

which implies v ≡ 0 by the induction hypothesis, and so u ≡ 0. ♦


6
Michel Rolle (1652-1719), French mathematician
5.3 Haar Spaces 163

Example 5.29. The functions f1 (x) = x and f2 (x) = ex are not a Haar
system on [0, 2]. This is because dim(S) = 2 for S = span{f1 , f2 }, but the
continuous function
f (x) = ex − 3x ≡ 0
has by f (0) = 1, f (1) = e − 3 < 0 and f (2) > 0 at least two zeros in [0, 2].
Therefore, S cannot be a Haar space on [0, 2]. ♦
Example 5.30. For [a, b] ⊂ R let g ∈ C n+1 [a, b] satisfy g (n+1) (x) > 0 for all
x ∈ [a, b]. Then, the functions {1, x, . . . , xn , g} are a Haar system on [a, b]:
First note that the functions 1, x, . . . , xn , g(x) are linearly independent, since
from
α0 1 + α1 x + . . . + αn xn + αn+1 g(x) ≡ 0 for x ∈ [a, b]
we can conclude αn+1 g (n+1) (x) ≡ 0 after (n + 1)-fold differentiation, whereby
αn+1 = 0. The remaining coefficients α0 , . . . , αn do also vanish, since the
monomials 1, x, . . . , xn are linearly independent. Moreover, we can show that
any function u ∈ span{1, x, . . . , xn , g} \ {0} has at most n + 1 zeros in [a, b]:
Suppose
n
u(x) = αj xj + αn+1 g(x) ≡ 0
j=0

has n + 2 zeros in [a, b]. Then, the (n + 1)-th derivative


u(n+1) (x) = αn+1 g (n+1) (x)
has, due to the Rolle theorem, at least one zero in [a, b]. But this implies
αn+1 = 0, since g (n+1) is positive on [a, b]. In this case, u ∈ Pn is a polynomial
of degree at most n, which, according to the fundamental theorem of algebra,
vanishes identically on [a, b]. But this is in contraction to our assumption. ♦
Now we return to the dual characterization of (strongly) unique best
approximations. According to Corollary 5.15, there is for any best approxi-
mation s∗ ∈ S to f ∈ C [a, b] a characterizing dual functional ϕ : C (Ω) −→ S
of the form
 m
ϕ(u) = λj εj u(xj ) for u ∈ C [a, b] (5.25)
j=1

satisfying ϕ(S) = 0, where m ≤ n+1. For the case of Haar spaces S ⊂ C [a, b]
the length of the dual functional in (5.25) is necessarily m = n + 1. Let us
take note of this important observation.
Proposition 5.31. Let ϕ : C [a, b] −→ R be a functional of the form (5.25),
where m ≤ n + 1. Moreover, let S ⊂ C [a, b] be a Haar space of dimension
dim(S) = n ∈ N on [a, b]. If ϕ(S) = {0}, then we have m = n + 1.
Proof. Suppose m ≤ n. Then, due to Theorem 5.23 (c), the Haar space S
contains one element s ∈ S satisfying s(xj ) = εj , for all 1 ≤ j ≤ m. But for
this s, we find ϕ(s) = λ1 = 1, in contradiction to ϕ(S) = {0}. 
164 5 Chebyshev Approximation

In the following discussion, we consider, for a fixed basis H = (s1 , . . . , sn )


of the Haar space S, points X = (x1 , . . . , xn+1 ) ∈ I n+1 , and sign vectors
ε = (ε1 , . . . , εn+1 ) ∈ {±1}n+1 , the non-singular Vandermonde matrices
⎡ ⎤
s1 (x1 ) · · · s1 (xk−1 ) s1 (xk+1 ) · · · s1 (xn+1 )
⎢ ⎥
VH,X\{xk } = ⎣ ... ..
.
..
.
..
. ⎦∈R
n×n
(5.26)
sn (x1 ) · · · sn (xk−1 ) sn (xk+1 ) · · · sn (xn+1 )

for 1 ≤ k ≤ n + 1, and the alternation matrix


⎡ ⎤
ε1 . . . εn+1
  ⎢ s (x ) · · · s (x ⎥
ε ⎢ 1 1 1 n+1 ) ⎥
Aε,H,X = =⎢ . . ⎥ ∈ R(n+1)×(n+1) . (5.27)
VH,X ⎣ .. .. ⎦
sn (x1 ) · · · sn (xn+1 )

We first take note of a few properties for VH,X\{xk } and Aε,H,X .

Proposition 5.32. Let H = (s1 , . . . , sn ) be a Haar system on an interval


I ⊂ R and X = (x1 , . . . , xn+1 ) ∈ I n+1 be a vector of n + 1 pairwise distinct
points. Then, the following statements are true.
(a) For the Vandermonde matrices VH,X\{xk } in (5.26), the signs of the n+1
determinants

dk = det(VH,X\{xk } ) = 0 for 1 ≤ k ≤ n + 1

are constant, i.e., sgn(dk ) = σ, for all 1 ≤ k ≤ n + 1, for some σ ∈ {±1}.


(b) If the signs in ε = (ε1 , . . . , εn+1 ) ∈ {±1}n+1 are alternating, i.e., if

εk = (−1)k−1 σ for 1 ≤ k ≤ n + 1

for some σ ∈ {±1}, then the matrix Aε,H,X in (5.27) is non-singular.

Proof. (a): Suppose we have sgn(dk ) = sgn(dk+1 ) for some 1 ≤ k ≤ n.


We consider a continuous mapping γ : [0, 1] −→ I satisfying γ(0) = xk
and γ(1) = xk+1 . In this case, the continuous determinant mapping

d(α) = det(VH,(x1 ,...,xk−1 ,γ(α),xk+2 ,...,xn+1 ) ) for α ∈ [0, 1]

satisfying d(0) = dk+1 and d(1) = dk must have a sign change on (0, 1). Due
to the continuity of d there is one α∗ ∈ (0, 1) satisfying d(α∗ ) = 0. However,
in this case, the Vandermonde matrix VH,(x1 ,...,xk−1 ,γ(α∗ ),xk+2 ,...,xn+1 ) ∈ Rn×n
is singular. Due to Theorem 5.23 (d), the elements in (s1 , . . . , sn ) are not a
Haar system on I ⊂ R. But this is in contradiction to our assumption.
5.3 Haar Spaces 165

(b): According to the Laplace7 expansion (here with respect to the first
row), the determinant of Aε,H,X has the representation


n+1 
n+1
det(Aε,H,X ) = (−1)k+1 (−1)k−1 σ · dk = σ dk .
k=1 k=1

Due to statement (a), the signs of the determinants dk , 1 ≤ k ≤ n + 1, are


constant, which implies det(Aε,H,X ) = 0. 

By using the results of Propositions 5.31 and 5.32, we can prove the
alternation theorem, being the central result of this chapter. According to the
alternation theorem, the signs ε = (ε1 , . . . , εn+1 ) of the dual characterization
in (5.25) are for the case of Haar spaces S alternating. Before we prove the
alternation theorem, we first give a formal definition for alternation sets.

Definition 5.33. Let S ⊂ C (I) be a Haar space of dimension n ∈ N on an


interval I ⊂ R. Moreover, suppose s∗ ∈ S and f ∈ C (I)\S. Then, an ordered
set X = (x1 , . . . , xn+1 ) ∈ Esn+1
∗ −f ⊂ I
n+1
of n + 1 monotonically increasing
extremal points x1 < . . . < xn+1 is called an alternation set for s∗ and f ,
if
εj = sgn((s∗ − f )(xj )) = (−1)j σ for all j = 1, . . . , n + 1
for some σ ∈ {±1}, i.e., if the signs of s∗ − f are alternating on X.

Theorem 5.34. (Alternation theorem).


Let S ⊂ C (I) be a Haar space of dimension n ∈ N on an interval I ⊂ R.
Moreover, let IK ⊂ I be a compact subset containing at least n + 1 elements.
Then, there is for any f ∈ C (IK ) \ S a strongly unique best approximation
s∗ ∈ S to f with respect to  · ∞,IK . The best approximation s∗ is charac-
∗ −f ⊂ IK
terized by the existence of an alternation set X ∈ Esn+1 n+1
for s∗ and f .

Proof. Due to Corollary 3.8, any f ∈ C (IK ) has a best approximation s∗ ∈ S.


Moreover, the strong uniqueness of s∗ follows from Theorem 5.19, where the
assumptions required therein for Theorem 5.17 are covered by Corollary 5.15.
Now we prove the stated characterization for s∗ .
To this end, let X = (x1 , . . . , xn+1 ) ∈ Esn+1
∗ −f ⊂ IK
n+1
be an alternation
∗ ∗
set for s and f with (alternating) signs εj = sgn((s − f )(xj )) = (−1)j σ, for
1 ≤ j ≤ n + 1, and some σ ∈ {±1}. Then, we consider the linear system
⎡ ⎤ ⎡ ⎤ ⎡ ⎤
ε1 · · · εn+1 ε1 λ 1 1
⎢ s1 (x1 ) · · · s1 (xn+1 ) ⎥ ⎢ ε2 λ2 ⎥ ⎢ 0 ⎥
⎢ ⎥ ⎢ ⎥ ⎢ ⎥
⎢ .. .. ⎥·⎢ .. ⎥ = ⎢ .. ⎥ (5.28)
⎣ . . ⎦ ⎣ . ⎦ ⎣.⎦
sn (x1 ) · · · sn (xn+1 ) εn+1 λn+1 0
7
Pierre-Simon Laplace (1749-1827), French mathematician and physicist
166 5 Chebyshev Approximation

with the alternation matrix Aε,H,X on the left hand side in (5.28). According
to Proposition 5.32 (a), the matrix Aε,H,X is non-singular. Therefore, the
products εk λk , for 1 ≤ k ≤ n + 1, uniquely solve the linear system (5.28).
Due to the Cramer8 rule we have the representation
(−1)k−1 dk
ε k λk = for all 1 ≤ k ≤ n + 1,
det(Aε,H,X )
where according to Proposition 5.32 (a) the signs of the n + 1 determinants
dk = det(VH,X\{xk } ), for 1 ≤ k ≤ n + 1, are constant. This implies εk λk = 0,
and, moreover, there is one unique vector λ = (λ1 , . . . , λn+1 )T ∈ Λn+1 with
positive coefficients
dk
λk = :n+1 >0 for all 1 ≤ k ≤ n + 1
j=1 dj
which solves the linear system (5.28). This solution λ ∈ Λn+1 of (5.28) finally
yields the characterizing functional (according to Corollary 5.15),

n+1
ϕ(u) = λj εj u(xj ) for u ∈ C (IK ), (5.29)
j=1

satisfying ϕ(S) = {0}. Due to Corollary 5.15, s∗ is the (strongly unique) best
approximation to f .
Now suppose that s∗ ∈ S is the strongly unique best approximation to
f ∈ C (IK ) \ S. Recall that the dual characterization in Corollary 5.15 proves
the existence of a functional ϕ : C (IK ) −→ R of the form (5.25) satisfying
ϕ(S) = {0}, where ϕ has, according to Proposition 5.31, length m = n + 1.
We show that the point set X = (x1 , . . . , xn+1 ) ∈ Esn+1 ∗ −f (from the dual

characterization in Corollary 5.15) is an alternation set for s∗ and f , where


our proof is by contradiction. To this end, let ε = (ε1 , . . . , εn+1 ) ∈ {±1}n+1
denote the sign vector of s∗ −f with εj = sgn((s∗ −f )(xj )), for 1 ≤ j ≤ n+1.
Now suppose there is one index k ∈ {1, . . . , n} satisfying εk = εk+1 . Then
there is one s ∈ S \ {0} satisfying s(xj ) = 0 for all j ∈ {k, k + 1} and
s(xk ) = εk . Since s cannot have more than n − 1 zeros in I, we necessarily
have
εk = sgn(s(xk )) = sgn(s(xk+1 )) = εk+1 .
This particularly implies
ϕ(s) = λk + λk+1 |s(xk+1 )| > 0,
which, however, is in contradiction to ϕ(s) = 0. 
We finally remark that the characterizing alternation set X ∈ Esn+1
for
∗ −f

s∗ and f in the alternation theorem, Theorem 5.34, is not necessarily unique.


This is because the set Es∗ −f of extremal points can be arbitrarily large (see
Example 5.3).
8
Gabriel Cramer (1704-1752), Swiss mathematician
5.4 The Remez Algorithm 167

5.4 The Remez Algorithm


In this section, we discuss the Remez9 algorithm [59, 60], an iterative method
to numerically compute the (strongly unique) best approximation s∗ ∈ S to
f ∈ C [a, b] \ S, where [a, b] ⊂ R is a compact interval. Moreover, S ⊂ C [a, b]
denotes a Haar space of dimension n ∈ N on [a, b].
In any of its iteration steps, the Remez algorithm computes, for an ordered
(i.e., monotonically increasing) reference set X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 of
length |X| = n + 1, the corresponding (strongly unique) best approximation
s∗X to f with respect to  · ∞,X , so that

s∗X − f ∞,X < s − f ∞,X for all s ∈ S \ {s∗X }.

To compute s∗X , we first fix an ordered basis H = (s1 , . . . , sn ) of the Haar


space S, so that s∗X can be represented as linear combination

n
s∗X = αj∗ sj ∈ S (5.30)
j=1

of the Haar system H with coefficients α∗ = (α1∗ , . . . , αn∗ )T ∈ Rn . According


to the alternation theorem, Theorem 5.34, the sought best approximation s∗X
necessarily satisfies the alternation condition

(s∗X − f )(xk ) = (−1)k−1 σs∗X − f ∞,X for 1 ≤ k ≤ n + 1 (5.31)

for some σ ∈ {±1}. Letting ηX = σs∗X − f ∞,X we rewrite (5.31) as

s∗X (xk ) + (−1)k ηX = f (xk ) for 1 ≤ k ≤ n + 1. (5.32)

Therefore, ηX and the unknown coefficients α∗ ∈ Rn of s∗X are the solution


of the linear equation system
 
η
ATε,H,X · X∗ = fX (5.33)
α

with the right hand side fX = (f (x1 ), . . . , f (xn+1 ))T ∈ Rn+1 and the alter-
nation matrix Aε,H,X ∈ R(n+1)×(n+1) in (5.27), containing the sign vector
ε = (−1, 1, . . . , (−1)n+1 ) ∈ {±1}n+1 , or,
⎡ ⎤⎡ ⎤ ⎡ ⎤
−1 s1 (x1 ) · · · sn (x1 ) ηX f (x1 )
⎢ 1 s1 (x2 ) · · · sn (x2 ) ⎥ ⎢ ∗⎥ ⎢ ⎥
⎢ ⎥ ⎢ α1 ⎥ ⎢ f (x2 ) ⎥
⎢ .. .. .. ⎥ ⎢ . ⎥ = ⎢ .. ⎥.
⎣ . . . ⎦ ⎣ .. ⎦ ⎣ . ⎦
(−1)n+1 s1 (xn+1 ) · · · sn (xn+1 ) αn∗ f (xn+1 )

By Proposition 5.32 (b), the matrix Aε,H,X is non-singular, and so the


solution of the linear system (5.33) is unique. By the solution of (5.33), we
9
Evgeny Yakovlevich Remez (1896-1975), mathematician
168 5 Chebyshev Approximation

do not only obtain the coefficients α∗ = (α1∗ , . . . , αn∗ )T ∈ Rn of the best


approximation s∗X in (5.30), but also by |ηX | = s∗X − f ∞,X we get the
minimal distance and the sign σ = sgn(ηX ) in (5.31).
The following observation concerning Chebyshev approximation to a func-
tion f ∈ C [a, b] by algebraic polynomials from Pn−1 shows that, in this special
case, the linear system (5.33) can be avoided. To this end, we use the New-
ton representation (2.34) for the interpolation polynomial in Theorem 2.13.
Recall that the Newton polynomials are given as


k
ωk (x) = (x − xj ) ∈ Pk for 0 ≤ k ≤ n − 1.
j=1

In particular, we apply the linear operator [x1 , . . . , xn+1 ] : C [a, b] −→ R of the


divided differences to f (see Definition 2.10). To evaluate [x1 , . . . , xn+1 ](f ), we
apply the recursion in Theorem 2.14. The recursion in Theorem 2.14 operates
only on the vector of function values fX = (f (x1 ), . . . , f (xn+1 ))T ∈ Rn+1 .
Therefore, the application of [x1 , . . . , xn+1 ] is also well-defined for any sign
vector ε ∈ {±1}n+1 of length n + 1. In particular, the divided difference
[x1 , . . . , xn+1 ](ε) can also be evaluated by the recursion in Theorem 2.14. In
the formulation of the following result, we apply divided differences to vectors
ε with alternating signs.

Proposition 5.35. For n ∈ N, let X = (x1 , . . . , xn+1 ) be an ordered set of


n + 1 points in [a, b] ⊂ R, ε = (−1, 1, . . . , (−1)n+1 ) ∈ {±1}n+1 a sign vector,
and f ∈ C [a, b] \ Pn−1 . Then,


n−1
s∗X = [x1 , . . . , xk+1 ](f − ηX ε)ωk ∈ Pn−1 (5.34)
k=0

is the strongly unique best approximation s∗X ∈ Pn−1 to f w.r.t.  · ∞,X ,


where
[x1 , . . . , xn+1 ](f )
ηX = . (5.35)
[x1 , . . . , xn+1 ](ε)
The minimal distance is given as

s∗X − f ∞,X = |ηX |.

Proof. Application of the linear operator [x1 , . . . , xn+1 ] : C [a, b] −→ R to the


alternation condition (5.32) immediately gives the representation

[x1 , . . . , xn+1 ](f )


ηX = .
[x1 , . . . , xn+1 ](ε)

Indeed, due to Corollary 2.18 (b), all polynomials from Pn−1 are contained
in the kernel of [x1 , . . . , xn+1 ]. In particular, we have [x1 , . . . , xn+1 ](s∗X ) = 0.
5.4 The Remez Algorithm 169

Under the alternation condition (5.32), s∗X ∈ Pn−1 is the unique solution
of the interpolation problem

s∗X (xk ) = f (xk ) − (−1)k ηX for 1 ≤ k ≤ n,

already for the first n alternation points (x1 , . . . , xn ) ∈ Esn∗ −f . This gives the
X
stated Newton representation of s∗X in (5.34). 

Remark 5.36. Note that all divided differences

[x1 , . . . , xk+1 ](f − ηX ε) = [x1 , . . . , xk+1 ](f ) − ηX [x1 , . . . , xk+1 ](ε)

in (5.34) are readily available from the computation of ηX in (5.35). Therefore,


we can compute the best approximation s∗X ∈ Pn−1 to f with respect to
 · ∞,X by divided differences in only O(n2 ) steps, where the computation
is efficient and stable. 

To show how the result of Proposition 5.35, in combination with Re-


mark 5.36, can be applied, we make the following concrete example.

Example 5.37. Let F = C [0, 2] and S = P1 ⊂ F. We approximate the


exponential function f (x) = ex on the reference set X = {0, 1, 2}. To com-
pute the minimal distance ηX and the best approximation s∗X ∈ P1 we use
Proposition 5.35 with n = 2. We apply divided differences to the sign vector
ε = (−1, 1, −1) and to the data vector fX = (1, e, e2 ), where e denotes the
Euler number. By the recursion in Theorem 2.14, we obtain the following
triangular scheme for divided differences (see Table 2.1).

X fX X εX
0 1 0 −1
1 e e−1 1 1 2
2 e 2
e(e − 1) (e − 1) /22
2 −1 −2 −2

Hereby we obtain
 2  2
[0, 1, 2](f ) e−1 e−1
ηX = =− and so s∗X − f ∞,X = .
[0, 1, 2](ε) 2 2

Moreover,
 2
e−1 e2 − 1
s∗X = [0](f − ηX ε) + [0, 1](f − ηX ε)x = 1 − + x
2 2

is the unique best approximation to f from P1 w.r.t. ·∞,X (see Fig. 5.5 (a)).

170 5 Chebyshev Approximation
8

0
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

(a) X0 = {0, 1, 2}, s∗0 − f ∞,X0 ≈ 0.7381


8

0
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

(b) X1 = {0, x∗ , 2}, s∗1 − f ∞,X1 ≈ 0.7579

Fig. 5.5. Approximation to f (x) = ex on [0, 2] by linear polynomials from P1 . (a)


Initial reference set X0 = {0, 1, 2} with minimal distance |η0 | = (e−1)2 /4 ≈ 0.7381.
(b) Reference set X1 = {0, x∗ , 2}, where x∗ = log((e2 − 1)/2) ≈ 1.1614, with
minimal distance |η1 | = 14 (e2 − 1)(x∗ − 1) + 2 ≈ 0.7579 (see Example 5.45).
5.4 The Remez Algorithm 171

We present another example, where we link with Example 5.7.

Example 5.38. We approximate the absolute-value function f (x) = |x|


on [−1, 1] by quadratic polynomials, i.e., S = P2 . By our previous investi-
gations in Example 5.7, Ep∗2 −f = {−1, −1/2, 0, 1/2, 1} is the set of extremal
points for the best approximation p∗2 ∈ P2 to f . To compute p∗2 , we ap-
ply Proposition 5.35 with n = 3. We let ε = (−1, 1, −1, 1) and we choose
X = {−1, −1/2, 0, 1/2} ⊂ Ep∗2 −f as the reference set. By the recursion in
Theorem 2.14, we obtain the following triangular scheme (see Table 2.1).

X fX X εX
−1 1 −1 −1
− 12 1
2 −1 − 12 1 4
0 0 −1 0 0 −1 −4 −8
1 1 4 1 32
2 2 1 2 3 2 1 4 8 3

Hereby we obtain ηX = 1/8, and so s∗X − f ∞,X = 1/8, along with


9 3
[x1 ](f − ηX ε) = , [x1 , x2 ](f − ηX ε) = − , [x1 , x2 , x3 ](f − ηX ε) = 1.
8 2
Therefore,
 
9 3 1 1
s∗X (x) = − (x + 1) + (x + 1) x + = + x2
8 2 2 8

is the unique best approximation to f from P2 with respect to  · ∞ . Note


that this is consistent with our observations in Example 5.7, since p∗2 ≡ s∗X .

Now we describe the iteration steps of the Remez algorithm. At any Remez
step the current reference set (in increasing order)

X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1

is modified. This is done by a Remez exchange of one point x̂ ∈ X for one


point x∗ ∈ [a, b] \ X, where

|(s∗X − f )(x∗ )| = s∗X − f ∞ ,

so that the next reference set is

X+ = (X \ {x̂}) ∪ {x∗ } = (x+


1 , . . . , xn+1 ) ∈ [a, b]
+ n+1
.

With the Remez exchange, the point x∗ is swapped for the point x̂ ∈ X, such
that the points of the new reference set X+ are in increasing order, i.e.,
172 5 Chebyshev Approximation

a ≤ x+
1 < x2 < . . . < xn < xn+1 ≤ b,
+ + +

and with maintaining the alternation condition, i.e.,

sgn((s∗X − f )(x+ j
j )) = (−1) σ for 1 ≤ j ≤ n + 1

for some σ ∈ {±1}. The exchange for the point pair (x̂, x∗ ) ∈ X × [a, b] \ X
is described by the Remez exchange, Algorithm 8.

Algorithm 8 Remez exchange


1: function Remez exchange(X,s∗X )
2: Input: reference set X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 ;
3: best approximation s∗X to f with respect to  · ∞,X ;
4:
5: find x∗ ∈ [a, b] satisfying |(s∗X − f )(x∗ )| = s∗X − f ∞ ;
6: let σ ∗ := sgn((s∗X − f )(x∗ );
7:
8: if x∗ ∈ X then return X; best approximation found
9: else if x∗ < x1 then
10: if sgn((s∗X − f )(x1 )) = σ ∗ then X+ = (x∗ , x2 , . . . , xn+1 );
11: else X+ = (x∗ , x1 , . . . , xn );
12: end if
13: else if x∗ > xn+1 then
14: if sgn((s∗X − f )(xn+1 )) = σ ∗ then X+ = (x1 , . . . , xn , x∗ );
15: else X+ = (x2 , . . . , xn+1 , x∗ );
16: end if
17: else
18: find j ∈ {1, . . . , n} satisfying xj < x∗ < xj+1 ;
19: if sgn((s∗X − f )(xj )) = σ ∗ then X+ = (x1 , . . . , xj−1 , x∗ , xj+1 , . . . , xn+1 );
20: else X+ = (x1 , . . . , xj , x∗ , xj+2 , . . . , xn+1 );
21: end if
22: end if
23: return X+ ;
24: end function

Remark 5.39. The reference set X+ = (x+ 1 , . . . , xn+1 ) ∈ [a, b]


+ n+1
, after the
application of one Remez exchange, Algorithm 8, to the previous reference
set X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 , satisfies the following three conditions.
• |(s∗X − f )(x∗ )| = s∗X − f ∞ for one x∗ ∈ X+ ;
• |(s∗X − f )(x)| ≥ s∗X − f ∞,X for all x ∈ X+ ;
• sgn((s∗X − f )(x+ j )) = (−1) σ for 1 ≤ j ≤ n + 1 and some σ ∈ {±1};
j

These conditions are required for the performance of the Remez algorithm.

5.4 The Remez Algorithm 173

Now we formulate the Remez algorithm, Algorithm 9, as an iterative


method to numerically compute the (strongly unique) best approximation
s∗ ∈ S to f ∈ C [a, b] \ S satisfying

η = s∗ − f ∞ < s − f ∞ for all s ∈ S \ {s∗ }.

The Remez algorithm generates a sequence (Xk )k∈N0 ⊂ [a, b]n+1 of reference
sets, so that for the transition from X = Xk to X+ = Xk+1 , for any k ∈ N0 ,
all three conditions in Remark 5.39 are satisfied. The corresponding sequence
of best approximations s∗k ∈ S to f with respect to  · ∞,Xk satisfying

ηk = s∗k − f ∞,Xk < s − f ∞,Xk for all s ∈ S \ {s∗k }

converges to s∗ , i.e., s∗k −→ s∗ and ηk −→ η, for k → ∞, as we will prove in


Theorem 5.43.

Algorithm 9 Remez algorithm


1: function Remez algorithm
2: Input: Haar space S of dimension n ∈ N; f ∈ C [a, b] \ S;
3:
(0) (0)
4: find initial reference set X0 = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 ;
5: for k = 0, 1, 2, . . . do
6: compute best approximation s∗k ∈ S to f with respect to  · ∞,Xk ;
7: let ηk := s∗k − f ∞,Xk ;
8: compute ρk = s∗k − f ∞ ;
9: if ρk ≤ ηk then return s∗k best approximation found
10: else
(k+1) (k+1)
11: find reference set Xk+1 = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 satisfying
∗ ∗ ∗
12: • |(sk − f )(x )| = ρk for some x ∈ Xk+1 ;
13: • |(s∗k − f )(x)| ≥ ηk for all x ∈ Xk+1 ;
(k+1)
14: • sgn((s∗k − f )(xj )) = (−1)j σk
15: for 1 ≤ j ≤ n + 1 and some σk ∈ {±1}. alternation condition
16: end if
17: end for
18: end function

Remark 5.40. We remark that the construction of the reference set Xk+1
in line 11 of Algorithm 9 can be accomplished by a Remez exchange step,
Algorithm 8. In this case, all three conditions in lines 12-15 of Algorithm 9
are satisfied, according to Remark 5.39. 
174 5 Chebyshev Approximation

Next, we analyze the convergence of the Remez algorithm. To this end,


we first remark that at any step k in the Remez algorithm, we have
(k) (k)
• a current reference set Xk = (x1 , . . . , xn+1 ) ⊂ [a, b]n+1 ,
(k) (k)
• alternating signs ε(k) = (ε1 , . . . , εn+1 ) ∈ {±1}n+1 ,
(k) (k)
• and positive coefficients λ(k) = (λ1 , . . . , λn+1 )T ∈ Λn+1 ,
so that the dual functional ϕ : [a, b] −→ R, defined as


n+1
(k) (k) (k)
ϕ(u) = λj εj u(xj ) for u ∈ C [a, b],
j=1

satisfies the characterization (5.29) in the alternation theorem, Theorem 5.34.


In particular, by the alternation theorem, the following properties hold.
• Xk ⊂ Es∗k −f ;
= sgn((s∗k − f )(xj )) = (−1)j σk for all 1 ≤ j ≤ n + 1 with σk ∈ {±1};
(k) (k)
• εj
• εj (s∗k − f )(xj ) = s∗k − f ∞,Xk = ηk for all 1 ≤ j ≤ n + 1;
(k) (k)

• ϕ(s) = 0 for all s ∈ S.


Now we prove the monotonicity of the minimal distances ηk .

Proposition 5.41. Let the assumptions from the Remez algorithm be satis-
fied. Then, for any step k ∈ N0 , where the Remez iteration does not terminate,
we have the monotonicity of the minimal distances,

ηk+1 > ηk .

Proof. The representation


n+1
(k+1) (k+1) ∗ (k+1)
ηk+1 = λj εj (sk+1 − f )(xj )
j=1


n+1
(k+1) (k+1) ∗ (k+1)
= λj εj (sk − f )(xj )
j=1


n+1
|(s∗k − f )(xj
(k+1) (k+1)
= λj )|
j=1

holds. Moreover, we have

= sgn(s∗k − f )(xj
(k+1) (k+1)
εj ) for all 1 ≤ j ≤ n + 1

after the Remez exchange (see Algorithm 9, line 14).


Moreover, we have |(s∗k −f )(xj
(k+1)
)| ≥ ηk , for all 1 ≤ j ≤ n+1 (cf. line 13),

and there is one index j ∈ {1, . . . , n + 1} (cf. line 12) satisfying
5.4 The Remez Algorithm 175

|(s∗k − f )(xj ∗ )| = ρk = s∗k − f ∞ .


(k+1)

But this implies


(k+1) (k+1) (k+1) (k+1)
ηk+1 ≥ λj ∗ ρk + (1 − λj ∗ )ηk > λj ∗ ηk + (1 − λj ∗ )ηk = ηk , (5.36)
which already completes our proof. 
(k)
Next, we show that the coefficients λj are uniformly bounded away from
zero.
Lemma 5.42. Let f ∈ C [a, b]\S. Then, under the assumptions of the Remez
algorithm, the uniform bound
(k)
λj ≥α>0 for all 1 ≤ j ≤ n + 1 and all k ∈ N0 ,
holds for some α > 0 which is independent of 1 ≤ j ≤ n and k ∈ N0 .
Proof. We have

n+1 
n+1
λj εj (s∗k − f )(xj ) ≥ η0 = s∗ − f ∞,X0 .
(k) (k) (k) (k) (k) (k)
ηk = − λj εj f (xj ) =
j=1 j=1

Suppose the statement is false. Then, there are sequences of reference sets
(Xk )k , signs (ε(k) )k , and coefficients (λ(k) )k satisfying

n+1
(k) (k) (k)
ηk = − λj εj f (xj ) ≥ η0 > 0 for all k ∈ N0 , (5.37)
j=1

where one index j ∗ ∈ {1, . . . , n + 1} satisfies λj ∗ −→ 0, for k → ∞.


(k)

But the elements of the sequences (Xk )k , (ε(k) )k , and (λ(k) )k lie in com-
pact sets, respectively. Therefore, there are convergent subsequences with
(k )
xj −→ xj ∈ [a, b] for → ∞,
(k )
εj  −→ εj ∈ {±1} for → ∞,
(k )
λj −→ λj ∈ [0, 1] for → ∞,
for all 1 ≤ j ≤ n + 1, where λj ∗ = 0 for one index j ∗ ∈ {1, . . . , n + 1}.
Now we regard an interpolant s ∈ S satisfying s(xj ) = f (xj ) for all
1 ≤ j ≤ n + 1, j = j ∗ . Then, we have

n+1
(k ) (k ) ∗ (k )

n+1
(k ) (k ) (k )
η k = λj εj (sk − f )(xj )= λj εj (s − f )(xj )
j=1 j=1


n+1
(k ) (k ) (k ) (k ) (k ) (k )
= λj εj (s − f )(xj ) + λj ∗ εj ∗ (s − f )(xj ∗ )
j=1
j=j ∗


n+1
−→ λj εj (s − f )(xj ) + λj ∗ εj ∗ (s − f )(xj ∗ ) = 0 for → ∞.
j=1
j=j ∗
176 5 Chebyshev Approximation

But this is in contradiction to (5.37). 

Now we can finally prove convergence for the Remez agorithm.

Theorem 5.43. Either the Remez algorithm, Algorithm 9, terminates after


k ∈ N steps with returning the best approximation s∗k = s∗ to f ∈ C [a, b]
or the Remez algorithm generates convergent sequences of minimal distances
(ηk )k and best approximations (s∗k )k with limit elements

lim ηk = η = s∗ − f ∞ and lim s∗k = s∗ ∈ S,


k→∞ k→∞

where s∗ ∈ S is the strongly unique best approximation to f ∈ C [a, b] with


minimal distance η. The sequence (ηk )k of minimal distances converges lin-
early to η by the contraction

η − ηk+1 < θ(η − ηk ) for some θ ∈ (0, 1). (5.38)

Proof. Let f ∈ C [a, b] \ S (for f ∈ S the statement is trivial).


If the Remez algorithm terminates after k ∈ N steps, in line 9 of Algo-
rithm 9, with s∗k ∈ S, then s∗k = s∗ is, according to the alternation theorem,
Theorem 5.34, the strongly unique best approximation to f .
Now suppose the Remez algorithm does not terminate after finitely many
steps. For this case, we first show the contraction property (5.38).
By the estimate in (5.36),
(k+1) (k+1)
ηk+1 ≥ λj ∗ ρk + (1 − λj ∗ )ηk (5.39)

and from ρk = s∗k − f ∞ > s∗ − f ∞ = η > 0 it follows that


(k+1) (k+1)
ηk+1 > λj ∗ η + (1 − λj ∗ )ηk

and so
(k+1)
η − ηk+1 < (1 − λj ∗ )(η − ηk ).
(k+1)
By Lemma 5.42, there is one α > 0 satisfying λj ≥ α, for all 1 ≤ j ≤ n+1
and all k ∈ N0 . Therefore, the stated contraction (5.38) holds for θ = 1 − α ∈
(0, 1). From this, we get the estimate

η − ηk < θk (η − η0 ) for all k ∈ N0

by induction on k. Therefore, the sequence (ηk )k of minimal distances is


convergent with limit element η, i.e., ηk −→ η, for k → ∞.
From estimate (5.39), we can conclude
ηk+1 − ηk ηk+1 − ηk
ρk ≤ + ηk < + ηk ,
(k+1)
λj ∗ 1−θ
5.4 The Remez Algorithm 177

and this gives the estimates


ηk+1 − ηk
ηk < ρk < + ηk .
1−θ
This implies the convergence of the distances ρk to η, i.e.,

lim ρk = lim s∗k − f ∞ = s∗ − f ∞ = η.


k→∞ k→∞

We can conclude that the sequence (s∗k )k ⊂ S of the strongly unique best
approximations to f on Xk converges to the strongly unique best approxi-
mation s∗ to f . 

Finally, we discuss one important observation. We note that for the ap-
proximation of strictly convex functions f ∈ C [a, b] by linear polynomials,
the Remez algorithm may return the best approximation s∗ ∈ P1 to f after
only one step.

Proposition 5.44. Let f ∈ C [a, b] be a strictly convex function on a compact


interval [a, b] and S = P1 . Moreover, let X0 = (a, x0 , b), for x0 ∈ (a, b), be
an initial reference set for the Remez algorithm. Then, the Remez algorithm
terminates after at most one Remez exchange.

Proof. Regard s ∈ P1 in its monomial representation s(x) = m · x + c for


m, c ∈ R. Then, we have for x, y ∈ [a, b], x = y, and λ ∈ (0, 1) the strict
inequality

(f − s)(λx + (1 − λ)y)
= f (λx + (1 − λ)y) − m · (λx + (1 − λ)y) − c
< λf (x) + (1 − λ)f (y) − m · (λx + (1 − λ)y) − c
= λf (x) − λmx − λc + (1 − λ)f (y) − (1 − λ)my − (1 − λ)c
= λ(f − s)(x) + (1 − λ)(f − s)(y)

by the strict convexity of f , i.e., f − s is also strictly convex.


Now let s∗ ∈ P1 be the strongly unique best approximation to f . Due to
the alternation theorem, Theorem 5.34, the error function f − s∗ has at least
three extremal points with alternating signs in [a, b]. Since f − s∗ is strictly
convex and continuous, f − s∗ has exactly one global minimum x∗ on (a, b).
Moreover, two global maxima of f − s∗ are at the boundary of [a, b], i.e., we
have {a, b} ⊂ Es∗ −f with

(f − s∗ )(a) = f − s∗ ∞ = (f − s∗ )(b).

From the representation s∗ (x) = m∗ · x + c∗ , we obtain the slope

f (b) − f (a)
m∗ = = [a, b](f ).
b−a
178 5 Chebyshev Approximation

Let s∗0 ∈ P1 be the best approximation to f with respect to X0 = (a, x0 , b).


Then, according to the alternation theorem, we have

(f − s∗0 )(a) = σf − s∗0 ∞,X0 = (f − s∗0 )(b) for some σ ∈ {±1},

whereby the representation s∗0 (x) = m0 · x + c0 implies m0 = [a, b](f ) = m∗ ,


i.e., s∗ and s∗0 differ by at most one constant.
If x0 ∈ Es∗ −f , then x0 = x∗ , and the best approximation s∗ to f is
already found by s∗0 , due to the alternation theorem. In this case, the Remez
algorithm terminates immediately with returning s∗ = s∗0 .
If x0 ∈ Es∗ −f , then the Remez algorithm selects the unique global mini-
mum x∗ = x0 of f − s∗ for the exchange with x0 : Since s∗ and s∗0 differ by
at most a constant, x∗ is also the unique global minimum of f − s∗0 on (a, b),
i.e., we have

(f − s∗0 )(x∗ ) < (f − s∗0 )(x) for all x ∈ [a, b], where x = x∗ . (5.40)

By the strict convexity of f −s∗0 , we can further conclude the strict inequality

(f − s∗0 )(x∗ ) < (f − s∗0 )(x0 ) < 0,

or,

ρ0 = f −s∗0 ∞ = |(f −s∗0 )(x∗ )| > |(f −s∗0 )(x0 )| = f −s∗0 ∞,X0 = η0 . (5.41)

By (5.40) and (5.41), the point x∗ is the unique global maximum of |f −s∗0 |
on [a, b]. Therefore, x∗ is the only candidate for the required Remez exchange
(in line 5 of Algorithm 8) for x0 . After the execution of the Remez exchange,
we have X1 = (a, x∗ , b), so that the Remez algorithm immediately terminates
with returning s∗1 = s∗ . 

For further illustration, we make an example linked with Example 5.37.

Example 5.45. We approximate the strictly convex exponential function


f (x) = exp(x) on the interval [0, 2] by linear polynomials, i.e., F = C [0, 2]
and S = P1 . We take X0 = (0, 1, 2) as the initial reference set in the Remez
algorithm, Algorithm 9. According to Example 5.37,
 2
e−1 e2 − 1
s∗0 (x) =1− + x
2 2

is the unique best approximation to f from P1 with respect to  · ∞,X0 , with


minimal distance |η0 | = (e − 1)2 /4 ≈ 0.7381, where e is the Euler number.
The error function |s∗0 (x) − exp(x)| attains on [0, 2] its unique maximum
ρ0 = s∗0 − exp ∞ ≈ 0.7776 at x∗ = log((e2 − 1)/2) > 1. We have ρ0 > η0 ,
and so one Remez exchange leads to the new reference set X1 = (0, x∗ , 2).
5.5 Exercises 179

According to Proposition 5.44, the Remez algorithm returns already after the
next iteration the best approximation s∗1 to f .
Finally, we compute s∗1 , the best approximation to f for the reference set
X1 = (0, x∗ , 2). To this end, we proceed as in Example 5.37, where we first
determine the required divided differences for f and ε = (−1, 1, −1) by using
the recursion in Theorem 2.14:
X fX X εX
0 1 0 −1
e2 −1 e2 −3
x∗ 2 2x∗ x∗ 1 2
x∗
e2 +1 (e2 −1)(x∗ −1)+2 2 −1 − 2−x
2
− (2−x2∗ )x∗
2 e2 2(2−x∗ ) 2(2−x∗ )x∗

From this we compute the minimal distance s∗1 − f ∞,X1 = −η1 ≈ 0.7579
by
1 
η1 = − (e2 − 1)(x∗ − 1) + 2
4
and the best approximation to f from P1 with respect to  · ∞,X1 by

e2 − 4η1 − 3
s∗1 = [0](f − ηX ε) + [0, x∗ ](f − ηX ε)x = 1 + η1 + x.
2x∗
By Proposition 5.44, the Remez algorithm terminates with the reference set
X1 = Es∗1 −f , so that by s∗1 ∈ P1 the unique best approximation to f with
respect to  · ∞ is found. Figure 5.5 shows the best approximations s∗j ∈ P1
to f for the reference sets Xj , for j = 0, 1. ♦

5.5 Exercises
Exercise 5.46. Let F = C [−1, 1] be equipped with the maximum norm
 · ∞ . Moreover, let f ∈ P3 \ P2 be a cubic polynomial, i.e., has the form

f (x) = a x3 + b x2 + c x + d for x ∈ [−1, 1]

with coefficients a, b, c, d ∈ R, where a = 0.


(a) Compute a best approximation p∗2 ∈ P2 to f from P2 w.r.t.  · ∞ .
(b) Is the best approximation p∗2 from (a) unique?

Exercise 5.47. Let P∞ : C [a, b] −→ Pn denote the operator, which assigns


every f ∈ C [a, b] to its best approximation p∗∞ (f ) ∈ Pn from Pn w.r.t.  · ∞ ,
i.e.,
P∞ (f ) = p∗∞ (f ) for f ∈ C [a, b].
(a) Show that P∞ is well-defined.
(b) Is P∞ linear or non-linear?
180 5 Chebyshev Approximation

Exercise 5.48. For a compact interval [a, b] ⊂ R, let F = C [a, b] be


equipped with the maximum norm  · ∞ . Moreover, let f ∈ C [a, b] \ Pn−1 ,
for n ∈ N. Then, there is a strongly unique best approximation p∗ ∈ Pn−1 to
f from Pn−1 w.r.t.  · ∞ and an alternation set X = (x1 , . . . , xn+1 ) ∈ Esn+1
∗ −f

for s∗ and f (see Corollary 5.4). For the dual characterization of the best
approximation p∗ ∈ Pn−1 we use, as in (5.6), a linear functional ϕ ∈ F  of
the form

n+1
ϕ(u) = λk εk u(xk ) for u ∈ C [a, b]
k=1

with coefficients λ = (λ1 , . . . , λn+1 )T ∈ Λn+1 and alternating signs

εk = sgn(p∗ − f )(xk ) = σ (−1)k for k = 1, . . . , n + 1.

By these assumptions on ϕ, two conditions of the dual characterization (ac-


cording to Theorem 3.48) are already satisfied, that is (a) ϕ∞ = 1 and
(b) ϕ(p∗ − f ) = p∗ − f ∞ .
Now this problem is concerning condition (c) of the dual characterization
in Theorem 3.48. To this end, consider using divided differences (cf. Defini-
tion 2.10) to construct, from given alternation points

a ≤ x 1 < . . . < xn ≤ b

and σ ∈ {±1}, a coefficient vector λ = (λ1 , . . . , λn+1 )T ∈ Λn+1 satisfying

ϕ(p) = 0 for all p ∈ Pn−1 .

Exercise 5.49. Let F = C [0, 2π] and S = P1 . Moreover, for n ∈ N, let

fn (x) = sin(nx) for x ∈ [0, 2π].

(a) Compute the unique best approximation s∗n ∈ P1 to fn w.r.t.  · ∞ .


(b) How many alternation points occur for the error function s∗n − fn in (a)?
But should not there only be three alternation points?

Exercise 5.50. Let F = C [−2, 1] be equipped with the maximum norm


 · ∞ . Compute the unique best approximation p∗ ∈ P2 from P2 to the
function f ∈ C [−2, 1], defined as

f (x) = |x + 1| for x ∈ [−2, 1]

with respect to the maximum norm  · ∞ .


Moreover, determine the set of extremal points X = Ep∗ −f , along with a
constant K > 0 satisfying

p − f ∞,X ≥ p∗ − f ∞ + K · p − p∗ ∞,X for all p ∈ P2 .

Plot the graphs of f and the best approximation p∗ to f in one figure.


5.5 Exercises 181

Exercise 5.51. Let F = C [0, 2] be equipped with the maximum norm ·∞ .
Determine the strongly unique best approximation p∗ ∈ P1 from P1 to the
function f ∈ C [0, 2], defined as

f (x) = exp −(x − 1)2 for x ∈ [0, 2]

with respect to the maximum norm  · ∞ .


Moreover, determine a constant K > 0 satisfying

p − f ∞ − p∗ − f ∞ ≥ K · p − p∗ ∞ for all p ∈ P1 .

Use this inequality to conclude the uniqueness of the best approximation


p∗ ∈ P1 yet once more.
Exercise 5.52. Let S ⊂ C [a, b] be a Haar space with dim(S) = n + 1 ∈ N.
Prove the Haar condition: If s ∈ S \ {0} has on the interval [a, b] exactly
m zeros from which k zeros are without sign change, then we have m + k ≤ n.
Exercise 5.53. In this problem, let I ⊂ R be a compact set containing
sufficiently many points, respectively. Analyze whether or not the following
function systems H = (s1 , . . . , sn ) ∈ (C (I))n are a Haar system on I.
(a) H = (x, 1/x) for I ⊂ (0, ∞).
(b) H = (1/(x − c0 ), 1/(x − c1 )) for I ⊂ R \ {c0 , c1 }, where c0 = c1 .
(c) H = (1, x2 , x4 , . . . , x2n ) for I = [−1, 1].
(d) H = (1, x, . . . , xn , g(x)) for a compact interval I = [a, b],
where g ∈ C n+1 [a, b] with g (n+1) ≥ 0 and g (n+1) ≡ 0 on [a, b].
Exercise 5.54. For n ∈ N0 , let Tne be the linear space of all even real-valued
trigonometric polynomials of degree at most n, and let Tno be the linear space
of all odd real-valued trigonometric polynomials of degree at most n.
(a) Show that Tne is a Haar space on the interval [0, π).
(b) Determine the dimension of Tne .
(c) Is Tno a Haar space on the interval [0, π)?
(d) Is Tno a Haar space on the open interval (0, π)?
(e) Determine the dimension of Tno .
Exercise 5.55. Prove the following results.
(a) The functions

s0 (x) = 1, s1 (x) = x cos(x), s2 (x) = x sin(x)

are a Haar system on [0, π].


(b) There is no two-dimensional subspace of

S = span{s0 , s1 , s2 } ⊂ C [0, π],

which is a Haar space on [0, π].


182 5 Chebyshev Approximation

Exercise 5.56. For n ∈ N0 , let S ⊂ C [a, b] be a (n + 1)-dimensional linear


subspace of C [a, b]. Moreover, let S satisfy the weak Haar condition on [a, b],
according to which any s ∈ S has at most n sign changes in [a, b].
Prove the following statements for f ∈ C [a, b].
(a) If there is an alternation set for s ∈ S and f of length n + 2, so that there
are n + 2 pairwise distinct alternation points a ≤ x0 < . . . < xn+1 ≤ b
and one sign σ ∈ {±1} satisfying

(s − f )(xk ) = σ (−1)k s − f ∞ for all k = 0, . . . , n + 1,

then s is a best approximation to f from S with respect to  · ∞ .


(b) The converse of statement (a) is false (for the general case).
Exercise 5.57. Let F = C [a, b] and S ⊂ F be a Haar space on [a, b] of
dimension n + 1 containing the constant functions. Moreover, let f ∈ F \ S,
such that 1 2
span S ∪ {f }
is a Haar space on [a, b]. Finally, s∗ ∈ S be the unique best approximation to
f from S with respect to  · ∞ .
Show that the error function f − s∗ has exactly n + 2 extremal points

a = x0 < . . . < xn+1 = b,

where f − s∗ is strictly monotone between neighbouring extremal points.


Exercise 5.58. In this programming exercise, we wish to compute for any
n ∈ N the strongly unique best approximation p∗ ∈ Pn−1 to f ∈ C [a, b]\Pn−1
from Pn−1 w.r.t.  · ∞,X on a point set X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 , so
that
p∗ − f ∞,X < p − f ∞,X for all p ∈ Pn−1 \ {p∗ }.
To this end, implement a function called mybestpoly with header
[alpha,eta] = mybestpoly(f,X),
which returns on input point set X (of length |X| = n + 1) the Newton
coefficients α = (α0 , . . . , αn−1 ) ∈ Rn of the best approximation


n−1 
k
p∗ (x) = αk ωk (x) where ωk (x) = (x − xj ) ∈ Pk for 0 ≤ k ≤ n − 1
k=0 j=1

to f w.r.t.  · ∞,X , along with the minimal distance ηX = f − s∗ ∞,X .


Exercise 5.59. To efficiently evaluate the best approximation p∗ ∈ Pn−1
from Exercise 5.58, we use the Horner10 scheme (a standard numerical
method, see e.g. [28, Section 5.3.3].
10
William George Horner (1786-1837), English mathematician
5.5 Exercises 183

To this end, implement a function called mynewtonhorner with header


[p] = mynewtonhorner(X,alpha,x),
which returns on an input point set X = {x1 , . . . , xn+1 } ⊂ [a, b], Newton
coefficients α = (α0 , . . . , αn−1 ) ∈ Rn and x ∈ R the value


n−1
p(x) = αk ωk (x) ∈ Pn−1 ,
k=0

where the evaluation of p at x should rely on the Horner scheme.

Exercise 5.60. Implement the Remez exchange, Algorithm 8. To this end,


write a function called myremezexchange with header
[X] = myremezexchange(X,epsilon,x),
which returns, on input reference set X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 , an
extremal point x = x∗ ∈ [a, b] \ X satisfying

|(p∗ − f )(x∗ )| = p∗ − f ∞

and a sign vector ε = (ε1 , ε2 ) ∈ {±1}2 satisfying

ε1 = sgn(p∗ − f )(x1 ) and ε2 = sgn(p∗ − f )(x∗ )

the updated reference set X+ (as output by Algorithm 8), i.e.,

X+ = (X \ {xj }) ∪ {x∗ } for one 1 ≤ j ≤ n + 1.

Exercise 5.61. Implement the Remez algorithm, Algorithm 9. To this end,


write a function myremez with header
[alpha,eta,X,its] = myremez(f,X),
which returns, on input function f ∈ C [a, b]\Pn−1 and an initial reference set
X = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 , the Newton coefficients α = (α0 , . . . , αn−1 )
of the (strongly unique) best approximation p∗ ∈ Pn−1 to f from Pn−1
w.r.t.  · ∞ , the minimal distance η = p∗ − f ∞ , a set of alternation points
X ⊂ Ep∗ −f , and the number its of the performed Remez iterations. For
your implementation, use the functions mybestpoly (from Exercise 5.58),
mynewtonhorner (Exercise 5.59) and myremezexchange (Exercise 5.60).
Verify your function myremez by using the following examples.

(a) f (x) = 3 x, [a, b] = [0, 1], X = 0, 21 , 34 , 1 ;
(b) f (x) = sin(5x) + cos(6x), [a, b] = [0, π], X = 0, 12 , 32 , 52 , π .
184 5 Chebyshev Approximation

Exercise 5.62. Analyze for the case S = Pn−1 the asymptotic computa-
tional complexity for only one iteration of the Remez algorithm, Algorithm 9.
(a) Determine the costs for the minimal distance ηk = s∗k − f ∞,Xk .
Hint: Use divided differences (according to Proposition 5.35).
(b) Determine the costs for computing the Newton coefficients of s∗k .
Hint: Reuse the divided differences from (a).
(c) Sum up the required asymptotic costs in (a) and (b).

How do you efficiently compute the update ηk+1 from information that is
required to compute ηk ?

Exercise 5.63. Assuming the notations of the Remez algorithm, Algorithm 9,


we consider the (global) distance

ρk = s∗k − f ∞ for k ∈ N0

between f ∈ C [a, b] and the current best approximation s∗k ∈ S to f , for the
(k) (k)
current reference set Xk = (x1 , . . . , xn+1 ) ∈ [a, b]n+1 and w.r.t.  · ∞,Xk .
Show that the sequence (ρk )k∈N0 is not necessarily strictly increasing. To
this end, construct a simple (but non-trivial) counterexample.
6 Asymptotic Results

In this chapter, we prove asymptotic statements to quantify the convergence


behaviour of both algebraic and trigonometric approximation by partial sums.
For the trigonometric case, the analysis of Fourier partial sums plays a
central role. Recall that we have studied Fourier partial sums,

(f, 1) 
n
(Fn f )(x) = + [(f, cos(j·)) cos(jx) + (f, sin(j·)) sin(jx)] ,
2 j=1

for f ∈ C2π , already in Chapter 4: According to Corollary 4.12, Fn f is the


unique best approximation to f from the linear space Tn of trigonometric
polynomials of degree at most n ∈ N0 with respect to the Euclidean norm ·.
As we proceed in this chapter, we will analyze the asymptotic behaviour
of the minimal distances with respect to both the Euclidean norm  · ,

η(f, Tn ) := inf T − f  = Fn f − f  for n → ∞,


T ∈Tn

and with respect to the maximum norm  · ∞ . To this end, we first show for
continuous functions f ∈ C2π convergence of Fn f to f with respect to  · ,
and then we prove convergence rates, for f ∈ C2π
k
, k ∈ N0 , of the form

η(f, Tn ) = o(n−k ) for n → ∞.

Finally, we analyze the uniform convergence of Fourier partial sums, i.e., we


study the asymptotic behaviour of the distances

Fn f − f ∞ for n → ∞.

In this chapter, we prove the following classical results of approximation:


• The Weierstrass theorem, according to which any function f ∈ C2π can,
w.r.t. ·∞ , be approximated arbitrarily well by trigonometric polynomials.
• The Jackson inequalities, which allow us to quantify the asymptotic
behaviour of the minimal distances

η∞ (f, Tn ) := inf T − f ∞ for n → ∞.


T ∈Tn

Likewise, we will also discuss the algebraic case for the approximation to
f ∈ C [a, b] by partial sums Pn f from Pn .

© Springer Nature Switzerland AG 2018 185


A. Iske, Approximation Theory and Algorithms for Data Analysis, Texts
in Applied Mathematics 68, https://doi.org/10.1007/978-3-030-05228-7_6
186 6 Asymptotic Results

6.1 The Weierstrass Theorem


We analyze the following two fundamental questions of approximation:
Question 1: Can we approximate any function f ∈ C [a, b] on a compact inter-
val [a, b] ⊂ R with respect to  · ∞ arbitrarily well by algebraic polynomials?
Question 2: Can we approximate any continuous 2π-periodic function f ∈ C2π
with respect to  · ∞ arbitrarily well by trigonometric polynomials?
Not too surprisingly, the two questions are related. In fact, a positive ans-
wer to both questions was given already in 1885 by Weierstrass1 , who also
discovered the intrinsic relation between the two problems of these questions.
As we show in this section, the answer to the trigonometric case (Question 2)
can be concluded from the solution for the algebraic case (Question 1). The
solutions given by Weierstrass were celebrated as the birth of approximation.
In the following discussion, we will be more precise about the above two
questions. To this end, we need only a few preparations.

Definition 6.1. Let F be a normed linear space with norm  · . Then a


subset S ⊂ F is said to lie dense in F with respect to  · , if there exists,
for any f ∈ F and any ε > 0, an element s ≡ s(f, ε) ∈ S satisfying

s − f  < ε.

Now we can give a more concise formulation for the above two questions.
• Are the algebraic polynomials P dense in C [a, b] with respect to  · ∞ ?
• Are the trigonometric polynomials T dense in C2π with respect to  · ∞ ?

Remark 6.2. If S ⊂ F is dense in F with respect to ·, then the topological


closure S of S (with respect to  · ) coincides with F , i.e.,

S = F,

or, in other words: For any f ∈ F, there is a convergent sequence (sn )n∈N in
S with limit f , so that sn − f  −→ 0 for n → ∞. 

Remark 6.3. For a linear subspace S ⊂ F, S = F, Definition 6.1 does only


make sense, if S is infinite-dimensional. Otherwise, if S = F is only finite-
dimensional, then there is, according to Corollary 3.8, for any f ∈ F \ S a
best approximation s∗ ∈ S to f at a positive minimal distance η(f, S) > 0,
i.e., f cannot be approximated arbitrarily well by elements from S, since the
closest distance between f and S is η(f, S). In this case, S is not dense in F.

1
Karl Weierstrass (1815-1897), German mathematician
6.1 The Weierstrass Theorem 187

Example 6.4. The set Q of rational numbers is dense in the set R of real
numbers with respect to the absolute-value function | · |. ♦

Now let us turn to the Weierstrass theorems, for which there exist many
different proofs (see, e.g. [33]). Our constructive proof for the algebraic case of
the Weierstrass theorem relies on a classical account via Korovkin sequences.

Definition 6.5. A sequence (Kn )n∈N of linear and monotone operators


Kn : C [a, b] −→ C [a, b] is called a Korovkin2 sequence on C [a, b], if

lim Kn p − p∞ = 0 for all p ∈ P2 .


n→∞

To further explain the utilized terminology, we recall a standard charac-


terization for monotone linear operators.

Remark 6.6. A linear operator K : C [a, b] −→ C [a, b] is monotone on


C [a, b], if and only if K is positive on C [a, b], i.e., the following two statements
are equivalent.
(a) For any f, g ∈ C [a, b] satisfying f ≤ g, we have Kf ≤ Kg;
(b) For any f ∈ C [a, b] satisfying f ≥ 0, we have Kf ≥ 0;
where all inequalities in (a) and (b) are taken pointwise on [a, b]. 

Next, we study an important special case for a Korovkin sequence. To


this end, we restrict ourselves to the continuous functions C [0, 1] on the unit
interval [0, 1]. This is without loss of generality, since otherwise, i.e., for any
other compact interval [a, b] ⊂ R, we may apply the affine-linear mapping
x −→ (x − a)/(b − a), for x ∈ [a, b].
Now we consider the Bernstein3 polynomials
 
(n) n j
βj (x) = x (1 − x)n−j ∈ Pn for 0 ≤ j ≤ n. (6.1)
j

Let us note a few elementary properties of Bernstein polynomials.


(n) (n)
Remark 6.7. The Bernstein polynomials β0 , . . . , βn ∈ Pn , for n ∈ N0 ,
(a) form a basis for the polynomial space Pn ,
(n)
(b) are positive on [0, 1], i.e., βj (x) ≥ 0 for all x ∈ [0, 1],
(c) are on [0, 1] a partition of unity, i.e.,


n
(n)
βj (x) = 1 for all x ∈ [0, 1].
j=0
2
Pavel Petrovich Korovkin (1913-1985), Russian mathematician
3
Sergei Natanovich Bernstein (1880-1968), Russian mathematician
188 6 Asymptotic Results

Note that property (c) holds by the binomial theorem, whereas properties (a)
and (b) can be verified by elementary calculations (cf. Exercise 6.83). 
By using the Bernstein polynomials in (6.1) we can make an important
example for monotone linear operators on C [0, 1].
Definition 6.8. For n ∈ N, the Bernstein operator Bn : C [0, 1] −→ Pn
is defined as

n
(n)
(Bn f )(x) = f (j/n)βj (x) for f ∈ C [0, 1], (6.2)
j=0

(n) (n)
where β0 , . . . , βn ∈ Pn are the Bernstein polynomials in (6.1).
The Bernstein operators Bn are obviously linear on C [0, 1]. By the posi-
(n)
tivity of the Bernstein polynomials βj , Remark 6.7 (b), the Bernstein ope-
rators Bn are, moreover, positive (and therefore monotone) on C [0, 1]. We
note yet another elementary property of the operators Bn .
Remark 6.9. The Bernstein operators Bn : C [0, 1] −→ Pn in (6.2) are
bounded on C [0, 1] with respect to  · ∞ , since for any f ∈ C [0, 1], we have
   
   n 
 n   (n) 

Bn f ∞ = 
(n) 
f (j/n)βj (x) ≤ f ∞   βj (x)
 = f ∞
 j=0   j=0 
∞ ∞

and so
Bn f ∞ ≤ f ∞ for all f ∈ C [0, 1].
In particular, by transferring the result of Theorem 3.45 from linear func-
tionals to linear operators, we can conclude that the Bernstein operators
Bn : C [0, 1] −→ Pn are continuous on C [0, 1]. 
Now we prove the Korovkin property for the Bernstein operators.
Theorem 6.10. The sequence of Bernstein operators Bn : C [0, 1] −→ Pn ,
for n ∈ N, is a Korovkin sequence on C [0, 1].
Proof. The Bernstein operators Bn , n ∈ N, reproduce linear polynomials.
Indeed, on the one hand, we have Bn 1 ≡ 1, for all n ∈ N, by the partition of
unity, according to Remark 6.7 (c). On the other hand, we find for p1 (x) = x
the identity Bn p1 = p1 , for any n ∈ N, since we get
n   n  
j n j n−1 j
(Bn p1 )(x) = x (1 − x) n−j
= x (1 − x)n−j
j=0
n j j=1
j − 1

n−1 
n−1 j
=x x (1 − x)n−j−1 = x.
j=0
j
6.1 The Weierstrass Theorem 189

According to Definition 6.5, it remains to show the uniform convergence


lim Bn p2 − p2 ∞ = 0
n→∞

for the quadratic monomial p2 (x) = x2 . To this end, we apply the Bernstein
operators Bn to the sequence of functions
n x
fn (x) = x2 − ∈ P2 for n ≥ 2,
n−1 n−1
where for n ≥ 2 we have
n   2
 
n j n j
(Bn fn )(x) = − xj (1 − x)n−j
j=0
j n n − 1 n(n − 1)
2


n
n! j(j − 1) j
= x (1 − x)n−j
j=0
(n − j)!j! n(n − 1)

n
(n − 2)!
= xj (1 − x)n−j
j=2
(n − j)!(j − 2)!

n−2 
n−2 j
= x2 x (1 − x)n−j−2 = p2 (x).
j=0
j

Together with the boundedness of the Bernstein operators Bn (according to


Remark 6.9), this finally implies
Bn p2 − p2 ∞ = Bn (p2 − fn )∞ ≤ p2 − fn ∞ ,
whereby through p2 − fn ∞ −→ 0, for n → ∞, the statement is proven. 
The following result of Korovkin is of fundamental importance.
Theorem 6.11. (Korovkin, 1953). For a compact interval [a, b] ⊂ R, let
(Kn )n∈N be a Korovkin sequence on C [a, b]. Then, we have
lim Kn f − f ∞ = 0 for all f ∈ C [a, b]. (6.3)
n→∞

Proof. Suppose f ∈ C [a, b]. Then, f is bounded on [a, b], i.e., there is some
M > 0 with f ∞ ≤ M . Moreover, f is uniformly continuous on the compact
interval [a, b], i.e., for any ε > 0 there is some δ > 0 satisfying
|x − y| < δ =⇒ |f (x) − f (y)| < ε/2 for all x, y ∈ [a, b].
Now let t ∈ [a, b] be fixed. Then, we have for x ∈ [a, b] the two estimates
 2
ε x−t ε 2M  
f (x) − f (t) ≤ + 2M = + 2 x2 − 2xt + t2
2 δ 2 δ
 2
ε x−t ε 2M  
f (x) − f (t) ≥ − − 2M = − − 2 x2 − 2xt + t2 ,
2 δ 2 δ
190 6 Asymptotic Results

where ε, δ and M are independent of x. If we apply the linear and monotone


operator Kn , for n ∈ N, to both sides of these inequalities (with respect to
variable x), then this implies

(Kn f )(x) − f (t)(Kn 1)(x) ≤


ε 2M  
(Kn 1)(x) + 2 (Kn x2 )(x) − 2t(Kn x)(x) + t2 (Kn 1)(x)
2 δ
(Kn f )(x) − f (t)(Kn 1)(x) ≥
ε 2M  
− (Kn 1)(x) − 2 (Kn x2 )(x) − 2t(Kn x)(x) + t2 (Kn 1)(x)
2 δ
for all x ∈ [a, b]. Therefore, we have the estimate

|(Kn f )(x) − f (t)(Kn 1)(x)| ≤


ε 2M
|(Kn 1)(x)| + 2 |(Kn x2 )(x) − 2t(Kn x)(x) + t2 (Kn 1)(x)|. (6.4)
2 δ
By assumption, there is for any ε̃ > 0 some N ≡ N (ε̃) ∈ N satisfying

(Kn xk ) − xk ∞ < ε̃ for k = 0, 1, 2,

for all n ≥ N . This in particular implies

|(Kn 1)(x)| ≤ Kn 1∞ = (Kn 1 − 1) + 1∞ ≤ ε̃ + 1 (6.5)

as well as

|(Kn x2 )(x) − 2t(Kn x)(x) + t2 (Kn 1)(x)| =


|((Kn x2 )(x) − x2 ) − 2t((Kn x)(x) − x) + t2 ((Kn 1)(x) − 1) + x2 − 2tx + t2 |
≤ ε̃(1 + 2|t| + t2 ) + (x − t)2 (6.6)

for all n ≥ N . From (6.4), (6.5) and (6.6), we obtain the estimate

|(Kn f )(x) − f (t)| ≤ |(Kn f )(x) − f (t)(Kn 1)(x)| + |f (t)(Kn 1)(x) − f (t)|
ε 2M  
≤ (ε̃ + 1) + 2 ε̃(1 + 2|t| + t2 ) + (x − t)2 + M ε̃,
2 δ
where for x = t, the inequality
ε 2M  
|(Kn f )(t) − f (t)| ≤ (ε̃ + 1) + 2 ε̃(1 + 2|t| + t2 ) + M ε̃ (6.7)
2 δ
follows for all n ≥ N .
Now the right hand side in (6.7) can uniformly be bounded from above
by an arbitrarily small ε̂ > 0, so that we have, for some N ≡ N (ε̂) ∈ N,

Kn f − f ∞ < ε̂ for all n ≥ N.

This proves the uniform convergence in (6.3), as stated. 


6.1 The Weierstrass Theorem 191

Now we can prove the density theorem of Weierstrass.


Corollary 6.12. (Weierstrass theorem for algebraic polynomials).
The algebraic polynomials P are, w.r.t. the maximum norm  · ∞ on a com-
pact interval [a, b] ⊂ R, dense in C [a, b]. In particular, any f ∈ C [a, b] can,
w.r.t.  · ∞ , be approximated arbitrarily well by algebraic polynomials, i.e.,
for any f ∈ C [a, b] and ε > 0, there is a polynomial p ∈ P satisfying
p − f ∞ < ε.
Proof. We use the Bernstein operators (Bn )n∈N , which are a Korovkin se-
quence on C [0, 1]. Suppose f ∈ C [0, 1] and ε > 0. Then, according to the
Korovkin theorem, there is one n ≡ n(ε) ∈ N, satisfying Bn f − f ∞ < ε. By
p = Bn f ∈ Pn ⊂ P, the statement follows immediately from Theorem 6.11.

Note that the Weierstrass theorem gives a positive answer to Question 1,
as posed at the outset of this section. Next, we specialize the density theorem
of Weierstrass, Corollary 6.12, to even (or odd) functions.
Corollary 6.13. Any even continuous function f ∈ C [−1, 1] can, w.r.t. the
norm ·∞ , be approximated arbitrarily well by an even algebraic polynomial.
Likewise, any odd continuous function f ∈ C [−1, 1] can, with respect to
 · ∞ , be approximated arbitrarily well by an odd algebraic polynomial.
Proof. Let f ∈ C [−1, 1] be even and ε > 0. Then, due to the Weierstrass
theorem, Corollary 6.12, and Proposition 3.42, there is an even algebraic
polynomial p ∈ P satisfying p − f ∞ < ε. Likewise, for odd f ∈ C [−1, 1],
the second statement follows by similar arguments (cf. Exercise 3.73). 
Now from our observations in Corollary 6.13, we wish to conclude a corres-
ponding density result for the case of trigonometric polynomials T ⊂ C2π . In
preparation, we first prove two lemmas.
Lemma 6.14. The linear space of real-valued trigonometric polynomials
 8
1 
T = spanR √ , cos(jx), sin(jx)  j ∈ N
2
is a unital commutative algebra over R. In particular, T is closed under the
multiplication, i.e., the product of two real-valued trigonometric polynomials
is a real-valued trigonometric polynomial.
Proof. The statement follows directly from the trigonometric addition for-
mulas, in particular from the representations (4.16)-(4.18), for j, k ∈ Z, i.e.,
2 cos(jx) cos(kx) = cos((j − k)x) + cos((j + k)x)
2 sin(jx) sin(kx) = cos((j − k)x) − cos((j + k)x)
2 sin(jx) cos(kx) = sin((j − k)x) + sin((j + k)x).
The remaining properties for a unital commutative algebra T are trivial. 
192 6 Asymptotic Results

Remark 6.15. Let p ∈ P be an algebraic polynomial. Then,

p(sin(jx) cos(kx)) ∈ T for j, k ∈ N0

is a trigonometric polynomial. Moreover, every trigonometric polynomial

p(cos(kx)) ∈ T for k ∈ N0

is an even function. 

We now show that the even trigonometric polynomials are, with respect
to the maximum norm  · ∞ , dense in C [0, π].

Lemma 6.16. For any f ∈ C [0, π] and ε > 0, there is one even trigono-
metric polynomial Tg ∈ T satisfying

Tg − f ∞ < ε.

Proof. Suppose f ∈ C [0, π]. Then, g(t) = f (arccos(t)) ∈ C [−1, 1]. Therefore,
according to the Weierstrass theorem, Corollary 6.12, there is one algebraic
polynomial p ∈ P satisfying p − g∞,[−1,1] < ε. This implies

p(cos(·)) − f ∞,[0,π] = p − g∞,[−1,1] < ε

with the (bijective) variable transformation x = arccos(t), or, t = cos(x).


Letting Tg (x) = p(cos(x)) ∈ T , our proof is complete. 

Now we transfer the Weierstrass theorem for algebraic polynomials, Corol-


lary 6.12, to the case of trigonometric polynomials. To this end, we consider
the linear space C2π ⊂ C (R) of all continuous 2π-periodic target functions.
Due to the periodicity of the elements in C2π , we can restrict ourselves to the
compact interval [0, 2π].
This results in the formulation of the Weierstrass density theorem.

Corollary 6.17. (Weierstrass theorem for the trigonometric case).


The trigonometric polynomials T are, w.r.t. the maximum norm ·∞ , dense
in C2π . In particular, any function f ∈ C2π can, w.r.t. ·∞ , be approximated
arbitrarily well by trigonometric polynomials, i.e., for any f ∈ C2π and ε > 0
there is a trigonometric polynomial Tf ∈ T satisfying Tf − f ∞ < ε.

Proof. Any f ∈ C2π can be decomposed as a sum


1 1
f (x) = (f (x) + f (−x)) + (f (x) − f (−x)) = fe (x) + fo (x)
2 2
of an even function fe ∈ C2π and an odd function fo ∈ C2π . Now the two
even functions

fe (x) and ge (x) = sin(x)fo (x)


6.1 The Weierstrass Theorem 193

can be approximated arbitrarily well on [0, π] by even trigonometric polyno-


mials Tfe , Tge ∈ T , so that we have

Tfe − fe ∞ = Tfe − fe ∞,[−π,π] = Tfe − fe ∞,[0,π] < ε/4


Tge − ge ∞ = Tge − ge ∞,[−π,π] = Tge − ge ∞,[0,π] < ε/4.

Therefore, we have, everywhere on R, the representations

fe = Tfe + ηfe and ge = Tge + ηge

with (even) error functions ηfe , ηge ∈ C2π , where ηfe ∞ , ηge ∞ < ε/4.
From these two representations, we obtain the identity

sin2 (x)f (x) = sin2 (x)(fe (x) + fo (x))


= sin2 (x)Tfe (x) + sin(x)Tge (x) + sin2 (x)ηfe (x) + sin(x)ηge (x)
= Tfs (x) + ηfs (x),

where

Tfs (x) = sin2 (x)Tfe (x) + sin(x)Tge (x) ∈ T


ηfs (x) = sin2 (x)ηfe (x) + sin(x)ηge (x) with ηfs ∞ < ε/2.

Using similar arguments we can derive, for the phase-shifted function

f˜(x) = f (x + π/2) ∈ C2π ,

a representation of the form

sin2 (x)f˜(x) = Tfs˜(x) + ηfs˜(x) with ηfs˜∞ < ε/2

with Tfs˜ ∈ T , so that after reversion of the translation x −→ x − π/2, we


have

cos2 (x)f (x) = Tfs˜(x − π/2) + ηfs˜(x − π/2) = Tfc (x) + ηfc (x) with ηfc ∞ < ε/2,

where Tfc (x) ∈ T . By summation of the two representations for f , we obtain


by

f (x) = Tfs (x) + Tfc (x) + ηfs (x) + ηfc (x) = Tf (x) + ηf (x) with ηf ∞ < ε

the stated estimate


Tf − f ∞ < ε
for the so constructed trigonometric polynomial Tf = Tfs + Tfc ∈ T . 
This gives a positive answer to Question 2 from the outset of this section.
Finally, we remark that the maximum norm ·∞ is in the following sense
stronger than any p-norm  · p , 1 ≤ p < ∞.
194 6 Asymptotic Results

Corollary 6.18. The algebraic polynomials P are, w.r.t. any p-norm  · p ,


1 ≤ p < ∞, and for compact [a, b] ⊂ R, dense in C [a, b]. Likewise, the
trigonometric polynomials T are, w.r.t. ·p , dense in C2π for all 1 ≤ p < ∞.

Proof. For f ∈ C [a, b] and ε > 0 there is one p ∈ P satisfying p − f ∞ < ε.


This immediately implies the estimate
 b
p−f pp = |p(x)−f (x)|p dx ≤ (b−a)p−f p∞ < (b−a)εp for 1 ≤ p < ∞,
a

i.e., any f ∈ C [a, b] can, w.r.t.  · p , be approximated arbitrarily well by


algebraic polynomials. The case of trigonometric polynomials T , i.e., the
second statement, can be covered by using similar arguments. 

Remark 6.19. Corollary 6.18 states that convergence in the maximum norm
 · ∞ implies convergence in any p-norm  · p , 1 ≤ p < ∞. The converse,
however, does not hold in general. In this sense, the maximum norm  · ∞
is the strongest among all p-norms, for 1 ≤ p ≤ ∞. 

A corresponding statement holds for weighted Euclidean norms.

Corollary 6.20. Let w : (a, b) −→ (0, ∞) be a continuous and integrable


weight function, so that w defines on C [a, b], for compact [a, b] ⊂ R, the
inner product
 b
(f, g)w = f (x)g(x)w(x) dx for f, g ∈ C [a, b] (6.8)
a

1/2
and the Euclidean norm  · w = (·, ·)w . Then, any function f ∈ C [a, b] can,
w.r.t.  · w , be approximated arbitrarily well by algebraic polynomials, i.e.,
the polynomial space P is, with respect to  · w , dense in C [a, b].

Proof. For f ∈ C [a, b], we have


 b  b
f 2w = |f (x)|2 w(x) dx ≤ f 2∞ w(x) dx = Cw f 2∞ ,
a a

where Cw = 1w < ∞. Now let ε > 0 and p ∈ P with p − f ∞ < ε/ Cw .
Then, 9
p − f w ≤ Cw p − f ∞ < ε,
i.e., f can, with respect to  · w , be approximated arbitrarily well by p ∈ P.

6.2 Complete Orthogonal Systems and Riesz Bases 195

6.2 Complete Orthogonal Systems and Riesz Bases


We recall the notion and properties of orthogonal (and orthonormal) systems
from Section 4.2. In the following discussion, we consider a Euclidean space
F with inner product (·, ·) and norm  ·  = (·, ·)1/2 . Moreover, let Sn ⊂ F be
a finite-dimensional linear subspace of dimension dim(Sn ) = n ∈ N with an
(ordered) orthogonal basis (sj )nj=1 in Sn , so that the orthogonality relation

(sj , sk ) = δjk · sj 2 for 1 ≤ j, k ≤ n

holds. According to Theorem 4.5, the unique best approximation to f ∈ F is


given by the orthogonal projection

n
(f, sj )
Πn f = s j ∈ Sn (6.9)
j=1
sj 2

of f onto Sn , obtained by the orthogonal projection operator Πn : F −→ Sn .


In the following discussion, we investigate approximation properties of the
partial sums Πn f in (6.9), where our particular interest is placed on their
asymptotic behaviour. To this end, we analyze convergence for the sequence
(Πn f )n∈N , for n → ∞, where we link to our discussion in Section 4.2. On this
occasion, we recall the Pythagoras theorem (4.6), the Bessel inequality (4.12),
and the Parseval identity (4.10), or, (4.11), according to which we have

n
|(f, sj )|2
Πn f 2 = for all f ∈ F. (6.10)
j=1
sj 2

6.2.1 Complete Orthogonal Systems

We wish to transfer our results from Section 4.2 to infinite (countable and
ordered) orthogonal systems (and orthonormal systems) (sj )j∈N in F. Our
first result on this is based on the following characterization.
Theorem 6.21. Let (sj )j∈N be an orthogonal system in a Euclidean space
F with inner product (·, ·) and norm  ·  = (·, ·)1/2 . Then, the following
statements are equivalent.
(a) The span of (sj )j∈N is dense in F, i.e., F = span{sj | j ∈ N}.
(b) For any f ∈ F the sequence (Πn f )n∈N of partial sums Πn f in (6.9)
converges to f with respect to the norm  · , i.e.,

Πn f −→ f for n → ∞. (6.11)

(c) For any f ∈ F we have the Parseval identity



 |(f, sj )|2
f 2 = . (6.12)
j=1
sj 2
196 6 Asymptotic Results

Proof. For any f ∈ F, the n-th partial sum Πn f is the unique best approxi-
mation to f from Sn = span{s1 , . . . , sn } with respect to  · .
(a) ⇒ (b): Suppose for f ∈ F and ε > 0, there is one N ∈ N and sN ∈ SN
satisfying sN − f  < ε. Then, we have for n ≥ N

Πn f − f  = inf s − f  ≤ inf s − f  ≤ sN − f  < ε,


s∈Sn s∈SN

and so the sequence (Πn f )n∈N converges, with respect to  · , to f , i.e.,

Πn f − f  −→ 0 for n → ∞,

or, in short, Πn (f ) −→ f for n → ∞.


(b) ⇒ (c): Suppose the sequence (Πn f )n∈N of the partial sums Πn f con-
verges to f ∈ F, so that Πn f −f  −→ 0 for n → ∞. Then, by the Pythagoras
theorem,
f 2 = Πn f − f 2 + Πn f 2 , (6.13)
in combination with the Parseval identity (6.10) we obtain, for n → ∞, the
representation
∞
|(f, sj )|2
f  = lim Πn f  =
2 2
.
n→∞
j=1
sj 2

(c) ⇒ (a): From the Pythagoras theorem (6.13) and by (6.10), we obtain


n
|(f, sj )|2
Πn f − f 2 = f 2 − −→ 0 for n → ∞
j=1
sj 2

and so there is, for any ε > 0, one N ≡ N (ε) satisfying ΠN f − f  < ε. 

Definition 6.22. An orthogonal system (sj )j∈N satisfying one of the proper-
ties (a), (b), or (c) in Theorem 6.21 (and so all three properties), is called a
complete orthogonal system in F. The notion of a complete orthonormal
system is defined accordingly.

Remark 6.23. For a complete orthogonal system (sj )j∈N in F we have,


according to property (b) in Theorem 6.21, the series representation

 (f, sj )
f= sj for f ∈ F (6.14)
j=1
sj 2

by convergence of (Πn f )n∈N to f with respect to  · . The series in (6.14)


is often referred to as (generalized) Fourier series of f with (generalized)
Fourier coefficients (f, sj )/sj 2 . 

From the equivalence in Theorem 6.21, we can conclude a useful result.



Corollary 6.24. Under the assumptions in Theorem 6.21, we have


    \|\Pi_n f - f\|^2 = \sum_{j=n+1}^{\infty} \frac{|(f, s_j)|^2}{\|s_j\|^2} \qquad \text{for all } f \in F   (6.15)

for the representation of the squared error norm of Πn (f ) − f .

Proof. The representation (6.15) follows from property (c) in Theorem 6.21
by the Pythagoras theorem (6.13) and the Parseval identity (6.10). 

By using the Weierstrass density theorems for algebraic and trigonometric


polynomials, in Corollaries 6.12 and 6.17, we can give examples for complete
orthogonal systems.
Our first example draws a link to Corollary 6.20.

Example 6.25. Let w : (a, b) −→ (0, ∞) be a continuous weight function,


so that w defines on C [a, b], for compact [a, b] ⊂ R, an inner product (·, ·)w ,
see (6.8). Moreover, suppose (pj )j∈N0 is a sequence of orthogonal polynomials
with respect to (·, ·)w (cf. our construction in Theorem 4.16). Then, (pj )j∈N0
is a complete orthogonal system in C [a, b] with respect to the Euclidean
norm ‖·‖_w = (·, ·)_w^{1/2}. Indeed, this is because the algebraic polynomials P
are, according to the Weierstrass theorem, Corollary 6.12, dense in C [a, b]
with respect to the maximum norm  · ∞ , and so, by Corollary 6.20, P is
also dense in C [a, b] with respect to  · w . ♦
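As a numerical illustration of Example 6.25 for the weight w ≡ 1 on [−1, 1], where the orthogonal polynomials are the Legendre polynomials, the following sketch shows the decay of the projection error in ‖·‖_w (the test function f(x) = |x| and the Gauss–Legendre quadrature are arbitrary choices):

```python
import numpy as np
from numpy.polynomial import legendre as L

# Sketch for Example 6.25 with w = 1 on [-1, 1]: Legendre polynomials as a
# complete orthogonal system; the projection error tends to zero, cf. (6.11).
x, wq = L.leggauss(200)                     # Gauss-Legendre nodes and weights
f = np.abs(x)                               # continuous test function

for n in (2, 8, 32):
    proj = np.zeros_like(x)
    for j in range(n + 1):
        pj = L.Legendre.basis(j)(x)
        proj += np.sum(wq * f * pj) / np.sum(wq * pj * pj) * pj  # (f,p_j)_w/||p_j||_w^2
    print(n, np.sqrt(np.sum(wq * (proj - f) ** 2)))              # ||Pi_n f - f||_w
```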

Next, we prove a useful criterion for the completeness of systems (sj )j∈N
in Hilbert spaces F, in particular for the completeness of orthogonal systems.

Theorem 6.26. (Completeness criterion). For a system (sj )j∈N of ele-


ments in a Hilbert space F, the following statements are equivalent.
(a) The system (sj )j∈N is complete in F, i.e., F = span{sj | j ∈ N}.
(b) If f ∈ F is orthogonal to all elements sj , then f = 0, i.e., we have the
implication
(f, sj ) = 0 for all j ∈ N =⇒ f = 0.

Proof. Without loss of generality, we suppose that (sj )j∈N is an orthonormal


system in F. Otherwise, we can choose a subsequence (sjk )k∈N of linearly in-
dependent elements, which we then orthonormalize (as in the Gram-Schmidt
algorithm, Algorithm 4). In the following, we use the notation

S := span{sj | j ∈ N} ⊂ F

for the closure of span{sj | j ∈ N} in F and, moreover,

S ⊥ := {u ∈ F | (u, s) = 0 for all s ∈ S} ⊂ F



for the orthogonal complement of S in F, so that F = S ⊕ S ⊥ .


(a) ⇒ (b): Let (sj )j∈N be complete in F. Then, the Parseval identity (6.12)
holds according to Theorem 6.21. From this, we see that (f, sj ) = 0, for all
j ∈ N, implies f  = 0, and so f = 0.
(b) ⇒ (a): Let f ∈ F satisfy (f, sj ) = 0 for all j ∈ N. In this case,
we have f ∈ S ⊥ by the linearity and the continuity of the inner product.
Conversely, for f ∈ S ⊥ the orthogonality relation (f, sj ) = 0 holds for all
j ∈ N. Therefore, f is contained in S ⊥ , if and only if (f, sj ) = 0 for all j ∈ N.
With the assumed implication in (b), we have S ⊥ = {0} and so S = F. 

6.2.2 Riesz Bases and Frames

Next, we extend the concept of complete orthonormal systems. To this end,


we fix a Hilbert space F with inner product (·, ·) and norm  ·  = (·, ·)1/2 . In
the following discussion, we regard systems (sn )n∈Z with the bi-infinite index
set Z. Recall that for a complete orthonormal system (sn )n∈Z in F, we have,
for any f ∈ F, the series representation

    f = \sum_{n\in\mathbb{Z}} (f, s_n)\, s_n

according to Remark 6.23. Moreover, the Parseval identity in (6.12) holds,


which we represent as

    \|f\| = \|((f, s_n))_{n\in\mathbb{Z}}\|_{\ell^2},   (6.16)

where ℓ² denotes the linear space of all square summable sequences with
indices in Z (cf. Remark 3.15).

Definition 6.27. A system B = (un )n∈Z of elements in a Hilbert space F is


called a Riesz4 basis of F, if the following properties are satisfied.
(a) The span of B is dense in F, i.e.,

F = span{un | n ∈ Z}. (6.17)

(b) There are constants 0 < A ≤ B < ∞ satisfying


    A\|c\|_{\ell^2}^2 \le \Big\| \sum_{n\in\mathbb{Z}} c_n u_n \Big\|^2 \le B\|c\|_{\ell^2}^2 \qquad \text{for all } c = (c_n)_{n\in\mathbb{Z}} \in \ell^2.   (6.18)

For a Riesz basis B, the “best possible” constants, i.e., the largest A and the
smallest B satisfying (6.18), are called Riesz constants of B.
4
Frigyes Riesz (1880-1956), Hungarian mathematician

Remark 6.28. Every complete orthonormal system in F is a Riesz basis


of F. Indeed, in this case, we have the Parseval identity in (6.16), whereby
equality holds in (6.18) for A = B = 1. Moreover, the completeness in (6.17)
holds by Theorem 6.21 (a).
We remark that the Riesz estimates in (6.18), often written in short as
 
 
 
    \Big\| \sum_{n\in\mathbb{Z}} c_n u_n \Big\| \sim \|c\|_{\ell^2} \qquad \text{for all } c = (c_n)_{n\in\mathbb{Z}} \in \ell^2,

describe the stability of the Riesz basis representation with respect to per-
turbations of the coefficients in c ∈ ℓ². Therefore, Riesz bases are also often
referred to as ℓ²-stable bases of F.

In the following analysis concerning Riesz bases B = (un )n∈Z of F, the


linear synthesis operator G : ℓ² → F, defined as

    G(c) = \sum_{n\in\mathbb{Z}} c_n u_n \in F \qquad \text{for } c = (c_n)_{n\in\mathbb{Z}} \in \ell^2,   (6.19)

plays an important role. We note the following properties of G.

Proposition 6.29. Let B = (un )n∈Z be a Riesz basis of F with Riesz con-
stants 0 < A ≤ B < ∞. Then, the synthesis operator G : ℓ² → F in (6.19)
has the following properties.

(a) The operator G is continuous, where G has operator norm ‖G‖ = √B.
(b) The operator G is bijective.
(c) The inverse G⁻¹ of G is continuous with operator norm ‖G⁻¹‖ = 1/√A.

Proof. Statement (a) follows directly from the upper Riesz estimate in (6.18).
As for the proof of (b), note that G is surjective, since span{un | n ∈ Z}
is by (6.17) dense in F. Moreover, G is injective, since by (6.18) the kernel of
G can only contain the zero element. Altogether, the operator G is bijective.
Finally, for the inverse G⁻¹ : F → ℓ² of G we find by (6.18) the estimate

    \|G^{-1}(f)\|_{\ell^2}^2 \le \frac{1}{A}\, \|f\|^2 \qquad \text{for all } f \in F

and this implies the continuity of G⁻¹ with operator norm ‖G⁻¹‖ = 1/√A.
This proves property (c). 

Now we consider the dual analysis operator G* : F → ℓ² of G in
(6.19), where G* is characterized by the duality relation

    (G^*(f), c)_{\ell^2} = (f, G(c)) \qquad \text{for all } c \in \ell^2 \text{ and all } f \in F.   (6.20)

We note the following properties of G∗ .



Proposition 6.30. The pair of dual operators G in (6.19) and G∗ in (6.20)


satisfies the following properties.
(a) The operator G∗ has the representation

    G^*(f) = ((f, u_n))_{n\in\mathbb{Z}} \in \ell^2 \qquad \text{for all } f \in F.

(b) The operator G* is bijective and has the inverse (G*)⁻¹ = (G⁻¹)*.
(c) The operators G* and (G*)⁻¹ are continuous via the isometries

    \|G\| = \|G^*\| \qquad \text{and} \qquad \|G^{-1}\| = \|(G^*)^{-1}\|.

Proof. By (6.20), we find for the dual operator G* : F → ℓ² the identity

    (G^*(f), c)_{\ell^2} = (f, G(c)) = \sum_{n\in\mathbb{Z}} c_n (f, u_n) = (((f, u_n))_{n\in\mathbb{Z}}, c)_{\ell^2}

for all c ∈ ℓ², and this already implies the stated representation in (a).
By the representation in (a) in combination with the Riesz basis property
of B, we see that G∗ is bijective. Moreover, for f, g ∈ F the representation

((G−1 )∗ G∗ (f ), g) = (G∗ (f ), G−1 (g))2 = (f, GG−1 (g)) = (f, g)

holds. Therefore, (G−1 )∗ G∗ is the identity on F. Likewise, we see that


G∗ (G−1 )∗ is the identity on 2 . This proves statement (b).
As regards statement (c), we find on the one hand

G∗ (f )22 = (G∗ (f ), G∗ (f ))2 = (f, GG∗ (f )) ≤ f  · G · G∗ (f )2

by letting c = G∗ (f ) in (6.20), and this implies G∗  ≤ G. On the other


hand, we have

G(c)2 = (G(c), G(c)) = (G∗ G(c), c)2 ≤ G∗  · G(c) · c2

by letting f = G(c) in (6.20), which implies G ≤ G∗ . Altogether, we have


G = G∗ . The other statement in (c) follows from similar arguments. 
Now we explain a fundamental duality property for Riesz bases.
Theorem 6.31. For any Riesz basis B = (un )n∈Z of F with Riesz constants
0 < A ≤ B < ∞, there is a unique Riesz basis B̃ = (ũn )n∈Z of F, such that
(a) the elements in B and B̃ are mutually orthonormal, i.e.,

(un , ũm ) = δnm for all n, m ∈ Z. (6.21)

(b) the Riesz basis B̃ has Riesz constants 0 < 1/B ≤ 1/A < ∞.
(c) any f ∈ F can uniquely be represented w.r.t. B or B̃, respectively, as
 
    f = \sum_{n\in\mathbb{Z}} (f, \tilde{u}_n)\, u_n = \sum_{n\in\mathbb{Z}} (f, u_n)\, \tilde{u}_n.   (6.22)

The Riesz basis B̃ is called the dual Riesz basis of B in F.


Proof. We consider the linear operator G : 2 −→ F in (6.19) associated with
the Riesz basis B = (un )n∈Z and its dual operator G∗ : F −→ 2 in (6.20).
According to Propositions 6.29 and 6.30 each of the linear operators G and
G∗ is continuous and has a continuous inverse. Therefore, their composition
GG∗ : F −→ F is continuous and has a continuous inverse.
Now we consider B̃ = (ũn )n∈Z , where

ũn := (GG∗ )−1 un for n ∈ Z.

The elements in B̃ satisfy the orthonormality relation (6.21) in (a), since

(un , ũm ) = (un , (GG∗ )−1 um ) = (G−1 un , G−1 um )2 = δmn (6.23)

holds for any m, n ∈ Z. Moreover, for c = (cn )n∈Z ∈ 2 , we have the identity
   , -
      
   
 ∗ −1
cn ũn  = (GG ) cn un  = (G∗ )−1 c .
   
n∈Z n∈Z

By ‖G*‖² = B and ‖(G*)⁻¹‖² = 1/A, we get the Riesz stability for B̃, i.e.,

    \frac{1}{B}\|c\|_{\ell^2}^2 \le \Big\| \sum_{n\in\mathbb{Z}} c_n \tilde{u}_n \Big\|^2 \le \frac{1}{A}\|c\|_{\ell^2}^2 \qquad \text{for all } c = (c_n)_{n\in\mathbb{Z}} \in \ell^2.   (6.24)

Now the continuity of (GG∗ )−1 and the completeness of B in (6.17) implies

F = span{ũn | n ∈ Z},

i.e., B̃ is a Riesz basis of F with Riesz constants 0 < 1/B ≤ 1/A < ∞. The
stated uniqueness of B̃ follows from the orthonormality relation (6.23).
Let us finally show property (c). Since G is surjective, any f ∈ F can be
represented as

    f = \sum_{n\in\mathbb{Z}} c_n u_n \qquad \text{for some } c = (c_n)_{n\in\mathbb{Z}} \in \ell^2.

But this implies

    (f, \tilde{u}_m) = \Big( \sum_{n\in\mathbb{Z}} c_n u_n, \tilde{u}_m \Big) = c_m,

whereby the stated (unique) representation in (6.22) holds, i.e.,



    f = \sum_{n\in\mathbb{Z}} (f, \tilde{u}_n)\, u_n \qquad \text{for all } f \in F.

Likewise, the stated representation in (6.22) with respect to the Riesz basis
B̃ can be shown by similar arguments. 

From the estimates in (6.24) and the representation in (6.22), we get the
stability of the coefficients (f, un ))n∈Z ∈ 2 under perturbations of f ∈ F.

Corollary 6.32. Let B = (un )n∈Z be a Riesz basis of F with Riesz constants
0 < A ≤ B < ∞. Then, the stability estimates

    A\|f\|^2 \le \|((f, u_n))_{n\in\mathbb{Z}}\|_{\ell^2}^2 \le B\|f\|^2 \qquad \text{for all } f \in F   (6.25)

hold. 

Remark 6.33. Every Riesz basis B = (u_n)_{n∈Z} of F yields a system of ℓ²-
linearly independent elements in F, i.e., for c = (c_n)_{n∈Z} ∈ ℓ² the implication

    \sum_{n\in\mathbb{Z}} c_n u_n = 0 \implies c = 0

holds. In other words, G(c) = 0 implies c = 0, as this is covered by Proposi-


tion 6.29 (b). Moreover, Corollary 6.32 gives the stability estimates in (6.25).


The required conditions for a Riesz basis B = (un )n∈Z (according to


Definition 6.27) often appear to be too restrictive. In fact, relevant applications
work with weaker conditions on B, where they merely require the stability
in (6.25), but not the ℓ²-linear independence of B.

Definition 6.34. A system B = (un )n∈Z of elements in a Hilbert space F is


called a frame of F, if for 0 < A ≤ B < ∞ the estimates

    A\|f\|^2 \le \|((f, u_n))_{n\in\mathbb{Z}}\|_{\ell^2}^2 \le B\|f\|^2 \qquad \text{for all } f \in F   (6.26)

hold, where the “best possible” constants, i.e., the largest A and the smallest
B satisfying (6.26), are called frame constants of B.

Remark 6.35. Any frame B = (un )n∈Z of F is complete in F, i.e., the span
of B is dense in F,
F = span{un | n ∈ Z}.
This immediately follows from the completeness criterion, Theorem 6.26, by
using the lower estimate in (6.26). 

Remark 6.36. Every Riesz basis B is a frame, but the converse is in general not
true. Indeed, a frame B = (u_n)_{n∈Z} allows ambiguities in the representation

    f = \sum_{n\in\mathbb{Z}} c_n u_n \qquad \text{for } f \in F,

due to a possible ℓ²-linear dependence of the elements in B. 

Remark 6.37. For any frame B = (un )n∈Z of F, there exists a dual frame
B̃ = (ũn )n∈Z of F satisfying
 
    f = \sum_{n\in\mathbb{Z}} (f, u_n)\, \tilde{u}_n = \sum_{n\in\mathbb{Z}} (f, \tilde{u}_n)\, u_n \qquad \text{for all } f \in F.

However, the duality relation (un , ũm ) = δnm in (6.21) does not in general
hold, since otherwise the elements of B and the elements of B̃ would be 2 -
linearly independent, respectively. 

For further illustration, we discuss the following examples.

Example 6.38. The three vectors


    u_1 = (0, 1)^T, \qquad u_2 = (-\sqrt{3}/2, -1/2)^T, \qquad u_3 = (\sqrt{3}/2, -1/2)^T

form a frame in F = R2 , since for f = (f1 , f2 )T ∈ R2 , we have


    \sum_{j=1}^{3} (f, u_j)^2 = f_2^2 + \Big( -\frac{\sqrt{3}}{2} f_1 - \frac{1}{2} f_2 \Big)^2 + \Big( \frac{\sqrt{3}}{2} f_1 - \frac{1}{2} f_2 \Big)^2 = \frac{3}{2}\,(f_1^2 + f_2^2) = \frac{3}{2}\, \|f\|_2^2,

and so the stability in (6.25) holds with A = B = 3/2. However, note that
the vectors u_1, u_2, u_3 are ℓ²-linearly dependent, since u_1 + u_2 + u_3 = 0. ♦
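A quick numerical check of Example 6.38 (a sketch; the random test vector is an arbitrary choice): the frame operator U^T U of the three vectors equals (3/2)·I, which yields the frame constants A = B = 3/2.

```python
import numpy as np

# Sketch for Example 6.38: frame operator and frame bounds of the three vectors.
U = np.array([[0.0, 1.0],
              [-np.sqrt(3) / 2, -0.5],
              [np.sqrt(3) / 2, -0.5]])          # rows are u_1, u_2, u_3

print(np.linalg.eigvalsh(U.T @ U))               # both eigenvalues equal 1.5

rng = np.random.default_rng(0)
f = rng.standard_normal(2)
print(np.sum((U @ f) ** 2), 1.5 * np.sum(f ** 2))  # sum_j (f, u_j)^2 = (3/2) ||f||^2
```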

In our next example we discuss Riesz bases in finite-dimensional Euclidean


spaces for the prototypical case of the Euclidean space F = Rd , where d ∈ N.

Example 6.39. For the Euclidean space Rd , where d ∈ N, equipped with the
Euclidean norm  · 2 , any basis B = {u1 , . . . , ud } of Rd is a Riesz basis of Rd .
Indeed, in this case, we have for the regular matrix U = (u1 , . . . , ud ) ∈ Rd×d
and for any vector c = (c1 , . . . , cd )T ∈ Rd the stability estimates
    \|U^{-1}\|_2^{-1} \|c\|_2 \le \Big\| \sum_{n=1}^{d} c_n u_n \Big\|_2 = \|Uc\|_2 \le \|U\|_2 \|c\|_2.

Therefore, the Riesz constants 0 < A ≤ B < ∞ of B are given by the spectral
norms of the matrices U and U⁻¹, so that A = ‖U⁻¹‖₂⁻² and B = ‖U‖₂². The
unique dual Riesz basis B̃ of B is given by the rows of the inverse U⁻¹. This
immediately follows by U U⁻¹ = I from Theorem 6.31 (a). ♦
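For Example 6.39, the Riesz constants can be computed directly from the spectral norms of U and U⁻¹. A minimal sketch (the random basis, dimension d = 3 and test vector are arbitrary choices):

```python
import numpy as np

# Sketch for Example 6.39: Riesz constants of a basis of R^d.
rng = np.random.default_rng(1)
U = rng.standard_normal((3, 3))                     # columns u_1, ..., u_d

B = np.linalg.norm(U, 2) ** 2                       # B = ||U||_2^2
A = np.linalg.norm(np.linalg.inv(U), 2) ** (-2)     # A = ||U^{-1}||_2^{-2}

c = rng.standard_normal(3)
val = np.sum((U @ c) ** 2)                          # || sum_n c_n u_n ||_2^2
print(A * np.sum(c ** 2) <= val <= B * np.sum(c ** 2))   # stability (6.18)
print(np.allclose(np.linalg.inv(U) @ U, np.eye(3)))       # rows of U^{-1}: dual basis
```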

We close this section by studying an example for a frame of Rd .



Example 6.40. We continue to work with the Euclidean space Rd , where


d ∈ N, whose inner product is denoted by (·, ·). For a frame B = (u_n)_{n=1}^N of
R^d, where N > d, we consider the dual operator G* : R^d → R^N in (6.20).
According to Proposition 6.30 (a), the representation

    G^*(f) = ((f, u_n))_{n=1}^{N} = (u_n^T f)_{n=1}^{N} \in \mathbb{R}^N \qquad \text{for } f \in \mathbb{R}^d

holds, or, in matrix notation,

    G^* f = U^T f = c_f \qquad \text{for } f \in \mathbb{R}^d,

where U = (u_1, ..., u_N) ∈ R^{d×N} and c_f = ((f, u_n))_{n=1}^N ∈ R^N. Due to
the completeness of B (cf. Remark 6.35), we see that the
columns u_1, ..., u_N of U must contain a basis of R^d. Hence, U has full
rank, d = rank(U), and U^T ∈ R^{N×d} is injective. But this is consistent with
the injectivity of the dual operator G*, as established in the lower estimate
in (6.26), for A > 0.
Now we consider the dual frame B̃ = (ũ_n)_{n=1}^N of B, as characterized by

    f = \sum_{n=1}^{N} (f, u_n)\, \tilde{u}_n \qquad \text{for all } f \in \mathbb{R}^d.

By U^T f = c_f, we have U U^T f = U c_f and so

    f = (U U^T)^{-1} U c_f \qquad \text{for all } f \in \mathbb{R}^d,

i.e., the dual frame B̃ = (ũ_n)_{n=1}^N is determined by the columns of (U U^T)^{-1} U.
However, the elements in B and B̃ do not satisfy the orthonormality relation
in Theorem 6.31 (a). ♦
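The dual frame of Example 6.40 can be computed explicitly in small dimensions. A minimal sketch, reusing the frame of Example 6.38 as columns of U (so d = 2 and N = 3; the random test vector is arbitrary):

```python
import numpy as np

# Sketch for Example 6.40: dual frame via the columns of (U U^T)^{-1} U.
U = np.array([[0.0, -np.sqrt(3) / 2, np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])                   # columns u_1, u_2, u_3

U_dual = np.linalg.inv(U @ U.T) @ U                 # columns are the dual frame

rng = np.random.default_rng(2)
f = rng.standard_normal(2)
c_f = U.T @ f                                       # analysis: c_f = ((f, u_n))_n
print(np.allclose(U_dual @ c_f, f))                 # synthesis with dual frame gives f back
```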

6.3 Convergence of Fourier Partial Sums


In this section, we analyze the approximation behaviour of Fourier partial
sums in more detail. To this end, we recall our discussion from Section 4.3,
where, in particular, we had proven the orthonormality of the real-valued
R
trigonometric polynomials in C2π ≡ C2π , see Theorem 4.11. By using the
Weierstrass theorem for trigonometric polynomials, Corollary 6.17, we can
prove the following result.
Corollary 6.41. The real-valued trigonometric polynomials
    \Big\{ \tfrac{1}{\sqrt{2}},\ \cos(j\,\cdot),\ \sin(j\,\cdot) \ \Big|\ j \in \mathbb{N} \Big\} \subset C^{\mathbb{R}}_{2\pi}   (6.27)

form a complete orthonormal system in C^R_{2π} with respect to the Euclidean
norm ‖·‖_R = (·, ·)_R^{1/2}, as defined by the (real) inner product

    (f, g)_{\mathbb{R}} = \frac{1}{\pi} \int_0^{2\pi} f(x)\, g(x)\, dx \qquad \text{for } f, g \in C^{\mathbb{R}}_{2\pi}.

Proof. The orthonormality of the trigonometric polynomials in (6.27) holds


by Theorem 4.11. Moreover, due to the trigonometric version of the Weier-
strass theorem, Corollary 6.17, the real-valued trigonometric polynomials
T ≡ T R are dense in C2π = C2π R
with respect to the maximum norm  · ∞ ,
and so T is a dense subset of C2π also with respect to the weaker Euclidean
norm  · R , cf. Corollary 6.18. 

Remark 6.42. The result of Corollary 6.41 can directly be transferred to


the complex case, whereby the complex-valued trigonometric polynomials
    \{e^{ij\cdot} \mid j \in \mathbb{Z}\} \subset C^{\mathbb{C}}_{2\pi}

form a complete orthonormal system in C^C_{2π} with respect to the Euclidean
norm ‖·‖_C = (·, ·)_C^{1/2}, defined by the (complex) inner product

    (f, g)_{\mathbb{C}} = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, \overline{g(x)}\, dx \qquad \text{for } f, g \in C^{\mathbb{C}}_{2\pi},   (6.28)

cf. Remark 4.10. 

Now we consider, for n ∈ N0 , real-valued Fourier partial sums of the form

    (F_n f)(x) = \frac{a_0}{2} + \sum_{j=1}^{n} (a_j \cos(jx) + b_j \sin(jx)) \qquad \text{for } f \in C^{\mathbb{R}}_{2\pi}   (6.29)

with Fourier coefficients a_0 = (f, 1)_R, a_j = (f, cos(j·))_R, and b_j = (f, sin(j·))_R,
for j ∈ N, see Corollary 4.12. As we noticed in Section 4.3, the Fourier ope-
rator F_n : C^R_{2π} → T^R_n gives the orthogonal projection of C^R_{2π} onto T^R_n. In
particular, F_n f ∈ T^R_n is the unique best approximation to f ∈ C^R_{2π} from T^R_n
with respect to the Euclidean norm ‖·‖_R.
As regards our notations concerning real-valued against complex-valued
R
functions, we recall Remark 4.10: For real-valued functions f ∈ C2π ≡ C2π ,
we apply the inner product (·, ·) = (·, ·)R and the norm  ·  =  · R . In
C
contrast, for complex-valued functions f ∈ C2π , we use (·, ·)C and  · C .

6.3.1 Convergence in Quadratic Mean

From our above discussion, we can conclude the following convergence result.

Corollary 6.43. For the approximation to f ∈ C2π by Fourier partial sums


Fn f we have convergence in quadratic mean, i.e.,

    \lim_{n\to\infty} \|F_n f - f\| = 0 \qquad \text{for all } f \in C_{2\pi}.

Proof. The statement follows immediately from property (b) in Theorem 6.21
in combination with Corollary 6.41. 
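The convergence in quadratic mean can be observed numerically. The following sketch computes the Fourier coefficients in (6.29) by the trapezoidal rule and prints the error ‖F_n f − f‖ for increasing n; the test function is an arbitrary choice:

```python
import numpy as np

# Sketch for Corollary 6.43: decay of the error of Fourier partial sums in quadratic mean.
x = np.linspace(0.0, 2.0 * np.pi, 8001)
f = np.abs(np.sin(x)) ** 3                          # 2*pi-periodic test function

def fourier_partial_sum(n):
    Fn = 0.5 * np.trapz(f, x) / np.pi * np.ones_like(x)        # a_0 / 2
    for j in range(1, n + 1):
        aj = np.trapz(f * np.cos(j * x), x) / np.pi
        bj = np.trapz(f * np.sin(j * x), x) / np.pi
        Fn += aj * np.cos(j * x) + bj * np.sin(j * x)
    return Fn

for n in (2, 4, 8, 16):
    err = np.sqrt(np.trapz((fourier_partial_sum(n) - f) ** 2, x) / np.pi)
    print(n, err)                                   # ||F_n f - f|| decreases with n
```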

Next, we quantify the speed of convergence for the Fourier partial sums
Fn f . To this end, the complex representation in (4.23),

    (F_n f)(x) = \sum_{j=-n}^{n} c_j e^{ijx},   (6.30)

with the complex Fourier coefficients c_j ≡ c_j(f) = (f, exp(ij·))_C, i.e.,

    c_j = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, e^{-ijx}\, dx \qquad \text{for } -n \le j \le n,
and the orthonormal system {exp(ij ·) | − n ≤ j ≤ n} ⊂ TnC with respect to
the complex inner product (·, ·)C in (6.28) turns out to be particularly useful.
Indeed, from the representation in (6.30) we can prove the following result.
Theorem 6.44. For f ∈ C^k_{2π} the Fourier partial sums F_n f converge to f at
convergence rate k ∈ N_0 according to

    \|F_n f - f\| \le \frac{1}{(n+1)^k} \big\| F_n f^{(k)} - f^{(k)} \big\| = o(n^{-k}) \qquad \text{for } n \to \infty.   (6.31)
Proof. For k = 0, we obtain the stated convergence result from Corollary 6.43.
For k = 1, we apply integration by parts to obtain, for j ≠ 0 and f ∈ C^1_{2π},
by the identity

    c_j(f) = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-ijx}\, dx = \frac{i}{j}\, \frac{1}{2\pi} \big[ f(x) e^{-ijx} \big]_0^{2\pi} - \frac{i}{j}\, \frac{1}{2\pi} \int_0^{2\pi} f'(x) e^{-ijx}\, dx
         = -\frac{i}{j}\, \frac{1}{2\pi} \int_0^{2\pi} f'(x) e^{-ijx}\, dx = -\frac{i}{j}\, (f', e^{-ij\cdot}) = -\frac{i}{j}\, c_j(f')

an alternative representation for the complex Fourier coefficients c_j in (6.30).
By induction on k, we obtain for f ∈ C^k_{2π} the representation

    c_j(f) = \frac{1}{j^k}\, (-i)^k c_j(f^{(k)}) \qquad \text{for all } j \in \mathbb{Z} \setminus \{0\},

and so in this case, we find the estimate

    |c_j(f)| \le \frac{1}{|j|^k}\, |c_j(f^{(k)})| \qquad \text{for all } j \in \mathbb{Z} \setminus \{0\} \text{ and } k \in \mathbb{N}_0.   (6.32)

By the representation of the error in Corollary 6.24, this in turn implies

    \|F_n f - f\|_C^2 = \sum_{|j| \ge n+1} |c_j(f)|^2 \le \sum_{|j| \ge n+1} \frac{1}{j^{2k}}\, |c_j(f^{(k)})|^2 \le \frac{1}{(n+1)^{2k}} \sum_{|j| \ge n+1} |c_j(f^{(k)})|^2 = \frac{1}{(n+1)^{2k}} \big\| F_n f^{(k)} - f^{(k)} \big\|_C^2

for f ∈ C^k_{2π} and therefore

    \|F_n f - f\| \le \frac{1}{(n+1)^k} \big\| F_n f^{(k)} - f^{(k)} \big\| = o(n^{-k}) \qquad \text{for } n \to \infty,

where we use the convergence

    \big\| F_n f^{(k)} - f^{(k)} \big\| \longrightarrow 0 \qquad \text{for } n \to \infty

for f (k) ∈ C2π according to Corollary 6.43. 


Remark 6.45. The convergence rate k ∈ N0 , as achieved in Theorem 6.44,
follows from the asymptotic decay of the Fourier coefficients cj (f ) of f
in (6.32), whereby

    |c_j(f)| = O(|j|^{-k}) \qquad \text{for } |j| \to \infty.

Further note that the decay of c_j(f) follows from the assumption f ∈ C^k_{2π}.
As for the converse, we can determine the smoothness of f from the asymp-
totic decay of the Fourier coefficients c_j(f). More precisely: If the Fourier
coefficients c_j(f) of f have the asymptotic decay

    |c_j(f)| = O\big(|j|^{-(k+1+\varepsilon)}\big) \qquad \text{for } |j| \to \infty

for some ε > 0, then this implies f ∈ C^k_{2π} (see Exercise 6.91).
Conclusion: The smoother f ∈ C2π , the faster the convergence of the Fourier
partial sums Fn f to f , and vice versa. 
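This connection between smoothness and coefficient decay can be observed numerically. In the following sketch the coefficients are approximated by the FFT; the two test functions (a piecewise linear "hat" and a smooth function) are arbitrary choices for this illustration.

```python
import numpy as np

# Sketch for Remark 6.45: decay of |c_j(f)| for functions of different smoothness.
N = 2 ** 12
x = 2.0 * np.pi * np.arange(N) / N

for name, f in [("hat (Lipschitz)", np.pi - np.abs(x - np.pi)),
                ("smooth", np.exp(np.cos(x)))]:
    c = np.fft.fft(f) / N                           # approximates c_j for small j >= 0
    for j in (3, 7, 15, 31):
        print(name, j, abs(c[j]))                   # ~ j^{-2} resp. much faster decay
```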

6.3.2 Uniform Convergence

Next, we analyze the uniform convergence of the Fourier partial sums Fn f .


Although we have proven convergence in quadratic mean, i.e., convergence
with respect to the Euclidean norm  · , we cannot expect convergence in the
stronger maximum norm ·∞ , due to Remark 6.19. In fact, to prove uniform
convergence we need to assume further conditions on f ∈ C2π , especially
concerning its smoothness. As we show now, it is sufficient to require that f
has a continuous derivative, i.e., f ∈ C2π
1
.
Corollary 6.46. For f ∈ C^1_{2π}, the Fourier partial sums F_n f converge uni-
formly to f, i.e., we have

    \lim_{n\to\infty} \|F_n f - f\|_\infty = 0.

Proof. For any n ∈ N, the orthogonality Fn f − f ⊥ 1 holds, i.e., we have


    \int_0^{2\pi} (F_n f - f)(x)\, dx = 0 \qquad \text{for all } n \in \mathbb{N}.

Therefore, the error function Fn f − f has at least one zero xn in the open
interval (0, 2π), whereby for x ∈ [0, 2π] we obtain the representation
    (F_n f - f)(x) = \int_{x_n}^{x} (F_n f - f)'(\xi)\, d\xi = \int_{x_n}^{x} (F_n f' - f')(\xi)\, d\xi,

where we used the identity (F_n f)' = F_n f' (see Exercise 6.92). By the Cauchy-
Schwarz inequality, we further obtain

    |(F_n f - f)(x)|^2 \le \Big| \int_{x_n}^{x} 1^2\, d\xi \Big| \cdot \Big| \int_{x_n}^{x} |(F_n f' - f')(\xi)|^2\, d\xi \Big| \le (2\pi)^2 \|F_n f' - f'\|^2 \longrightarrow 0 \quad \text{for } n \to \infty,   (6.33)
which already proves the stated uniform convergence. 
Now we conclude from Theorem 6.44 a corresponding result concerning
the convergence rate of (Fn f )n∈N0 with respect to the maximum norm  · ∞ .
Corollary 6.47. For f ∈ C^k_{2π}, where k ≥ 1, the Fourier partial sums F_n f
converge uniformly to f at convergence rate k − 1, according to

    \|F_n f - f\|_\infty = o(n^{-(k-1)}) \qquad \text{for } n \to \infty.

Proof. For f' ∈ C^{k-1}_{2π}, we have by (6.33) and (6.31) the estimate

    \|F_n f - f\|_\infty \le 2\pi \|F_n f' - f'\| \le \frac{2\pi}{(n+1)^{k-1}} \big\| F_n f^{(k)} - f^{(k)} \big\|,

whereby we obtain for f^{(k)} ∈ C_{2π} the asymptotic convergence behaviour

    \|F_n f - f\|_\infty = o(n^{-(k-1)}) \qquad \text{for } n \to \infty
according to Corollary 6.43. 

6.3.3 Pointwise Convergence


Next, we analyze pointwise convergence for the Fourier partial sums Fn f . To
this end, we first derive for x ∈ R a suitable representation for the pointwise
error (Fn f )(x) − f (x) at x. We utilize, for f ∈ C2π , the real representation
of Fn f , whereby we obtain
    (F_n f)(x) = \frac{a_0}{2} + \sum_{j=1}^{n} [a_j \cos(jx) + b_j \sin(jx)]
              = \frac{1}{\pi} \int_0^{2\pi} f(\tau) \Big[ \frac{1}{2} + \sum_{j=1}^{n} (\cos(j\tau)\cos(jx) + \sin(j\tau)\sin(jx)) \Big] d\tau
              = \frac{1}{\pi} \int_0^{2\pi} f(\tau) \Big[ \frac{1}{2} + \sum_{j=1}^{n} \cos(j(\tau - x)) \Big] d\tau.   (6.34)

Note that in the last line we applied the trigonometric addition formula

cos(u + v) = cos(u) cos(v) − sin(u) sin(v)

for u = jτ and v = −jx. Now we simplify the integrand in (6.34) by applying


the substitution z = τ − x along with the representation
    \Big[ \frac{1}{2} + \sum_{j=1}^{n} \cos(jz) \Big]\, 2\sin(z/2)
    = \sin(z/2) + \sum_{j=1}^{n} 2\cos(jz)\sin(z/2)
    = \sin(z/2) + \sum_{j=1}^{n} \Big[ \sin\big((j + \tfrac{1}{2})z\big) - \sin\big((j - \tfrac{1}{2})z\big) \Big]
    = \sin\big((n + \tfrac{1}{2})z\big),   (6.35)

where we used the trigonometric identity

    \sin(u) - \sin(v) = 2\cos\Big(\frac{u+v}{2}\Big)\sin\Big(\frac{u-v}{2}\Big)

for u = (j + 1/2)z and v = (j − 1/2)z. This implies the representation


    (F_n f)(x) = \frac{1}{\pi} \int_0^{2\pi} f(\tau)\, D_n(\tau - x)\, d\tau,   (6.36)

where the function

    D_n(z) = \frac{1}{2}\, \frac{\sin((n + 1/2)z)}{\sin(z/2)} \qquad \text{for } n \in \mathbb{N}_0   (6.37)

is called Dirichlet5 kernel. Note that the Dirichlet kernel is 2π-periodic and
even, so that we can further simplify the representation in (6.36) to obtain
    (F_n f)(x) = \frac{1}{\pi} \int_0^{2\pi} f(\tau)\, D_n(\tau - x)\, d\tau
              = \frac{1}{\pi} \int_{-x}^{2\pi - x} f(x+\sigma)\, D_n(\sigma)\, d\sigma
              = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x+\sigma)\, D_n(\sigma)\, d\sigma.   (6.38)

Since Fn 1 ≡ 1, for n ∈ N0 , we further obtain by (6.38) the representation


5
Peter Gustav Lejeune Dirichlet (1805-1859), German mathematician

    (F_n f)(x) - f(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} [f(x+\sigma) - f(x)]\, D_n(\sigma)\, d\sigma
                     = \frac{1}{\pi} \int_{-\pi}^{\pi} g_x(\sigma) \cdot \sin((n + 1/2)\sigma)\, d\sigma

for the pointwise error at x ∈ R, where

    g_x(\sigma) := \frac{f(x+\sigma) - f(x)}{2\sin(\sigma/2)}.   (6.39)
By using the trigonometric addition formula

sin(nσ + σ/2) = sin(nσ) cos(σ/2) + cos(nσ) sin(σ/2)

we can rewrite the representation for the pointwise error as a sum of the form

    (F_n f)(x) - f(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} g_x(\sigma)\cos(\sigma/2) \cdot \sin(n\sigma)\, d\sigma + \frac{1}{\pi} \int_{-\pi}^{\pi} g_x(\sigma)\sin(\sigma/2) \cdot \cos(n\sigma)\, d\sigma
                     = b_n(g_x(\cdot)\cos(\cdot/2)) + a_n(g_x(\cdot)\sin(\cdot/2))

with the Fourier coefficients bn (vx ) and an (wx ) of the 2π-periodic functions

vx (σ) = gx (σ) cos(σ/2)


wx (σ) = gx (σ) sin(σ/2).

Suppose gx (σ) is a continuous function. Then, vx , wx ∈ C2π . Moreover,


by the Parseval identity, we have in this case
 
    \|v_x\|_C^2 = \sum_{n\in\mathbb{Z}} |(v_x, \exp(in\cdot))|^2 < \infty \quad \text{and} \quad \|w_x\|_C^2 = \sum_{n\in\mathbb{Z}} |(w_x, \exp(in\cdot))|^2 < \infty,

so that the Fourier coefficients (bn (vx ))n∈Z and (an (wx ))n∈Z are a zero se-
quence, respectively, whereby the pointwise convergence of (Fn f )(x) to f (x)
at x would follow.
Now we are in a position where we can, from our above investigations,
formulate a sufficient condition for f ∈ C2π which guarantees pointwise con-
vergence of (Fn f )(x) to f (x) at x ∈ R.

Theorem 6.48. Let f ∈ C2π be differentiable at x ∈ R. Then, we have


pointwise convergence of (Fn f )(x) to f (x) at x, i.e.,

(Fn f )(x) −→ f (x) for n → ∞.

Proof. First note that the function gx in (6.39) can only have singularities at
σk = 2πk, for k ∈ Z. Now we analyze the behaviour of gx around zero, where
we find

    \lim_{\sigma\to 0} g_x(\sigma) = \lim_{\sigma\to 0} \frac{f(x+\sigma) - f(x)}{2\sin(\sigma/2)} = \lim_{\sigma\to 0} \frac{f(x+\sigma) - f(x)}{\sigma} \cdot \lim_{\sigma\to 0} \frac{\sigma}{2\sin(\sigma/2)} = f'(x),

by using L’Hôpital’s 6 rule. Therefore, the function gx is continuous at σ = 0.


By the periodicity of gx and f , we see that the function gx is also continuous
at σ = 2πk, for all k ∈ Z, whereby gx is continuous on R. 
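The kernel representation (6.38) can also be verified numerically. In the following sketch (the point x₀, the degree n and the test function are arbitrary choices), (F_n f)(x₀) is evaluated once via the Dirichlet kernel and once directly from the complex Fourier coefficients:

```python
import numpy as np

def dirichlet(z, n):
    # D_n(z) = sin((n + 1/2) z) / (2 sin(z/2)), with the limit n + 1/2 at z = 0
    out = np.full_like(z, n + 0.5)
    nz = np.abs(np.sin(z / 2)) > 1e-12
    out[nz] = 0.5 * np.sin((n + 0.5) * z[nz]) / np.sin(z[nz] / 2)
    return out

n, x0 = 6, 1.3
f = lambda t: np.exp(np.sin(t))

# (6.38): (F_n f)(x0) = (1/pi) int_{-pi}^{pi} f(x0 + sigma) D_n(sigma) d sigma
sigma = np.linspace(-np.pi, np.pi, 20001)
val_kernel = np.trapz(f(x0 + sigma) * dirichlet(sigma, n), sigma) / np.pi

# direct evaluation from the complex coefficients c_j, |j| <= n
t = np.linspace(0.0, 2.0 * np.pi, 20001)
val_coeff = sum((np.trapz(f(t) * np.exp(-1j * j * t), t) / (2.0 * np.pi))
                * np.exp(1j * j * x0) for j in range(-n, n + 1)).real
print(val_kernel, val_coeff)                        # approximately equal
```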

6.3.4 Asymptotic Behaviour of the Fourier Operator Norms

Now let us return to the uniform convergence of Fourier partial sums, where
the following question is of particular importance.
Question: Can we, under conditions on f ∈ C_{2π} \ C^1_{2π} that are as mild as possible, prove
statements concerning uniform convergence of the Fourier partial sums F_n f?
To answer this question, we need to analyze the norm Fn ∞ of the
Fourier operator Fn with respect to the maximum norm  · ∞ . To this end,
we first derive a suitable representation for the operator norm

    \|F_n\|_\infty := \sup_{f \in C_{2\pi}\setminus\{0\}} \frac{\|F_n f\|_\infty}{\|f\|_\infty} \qquad \text{for } n \in \mathbb{N}_0,   (6.40)

before we study the asymptotic behaviour of Fn ∞ for n → ∞.


From (6.38), we obtain the uniform estimate
    |(F_n f)(x)| \le \frac{1}{\pi} \|f\|_\infty \int_{-\pi}^{\pi} |D_n(\sigma)|\, d\sigma = \|f\|_\infty \cdot \frac{2}{\pi} \int_0^{\pi} |D_n(\sigma)|\, d\sigma.   (6.41)

This leads us to a suitable representation for the norm Fn ∞ of Fn in (6.40).

Theorem 6.49. The norm of the Fourier operator Fn has the representation

    \|F_n\|_\infty = \lambda_n \qquad \text{for all } n \in \mathbb{N}_0,

where

    \lambda_n := \frac{2}{\pi} \int_0^{\pi} |D_n(\sigma)|\, d\sigma = \frac{1}{\pi} \int_0^{\pi} \Big| \frac{\sin((n + 1/2)\sigma)}{\sin(\sigma/2)} \Big|\, d\sigma   (6.42)

is called the Lebesgue⁷ constant.

Proof. From (6.41), we immediately obtain

Fn f ∞ ≤ f ∞ · λn
6
Marquis de L’Hôpital (1661-1704), French mathematician
7
Henri Léon Lebesgue (1875-1941), French mathematician

and so we have, on the one hand, the upper bound Fn ∞ ≤ λn .


On the other hand, we can choose, for any ε > 0 an even 2π-periodic
continuous function f satisfying f ∞ = 1, such that f approximates the
even step function sgn(Dn (x)) arbitrarily well, i.e.,
 
    \|F_n\|_\infty \ge \|F_n f\|_\infty \ge |(F_n f)(0)| = \Big| \frac{1}{\pi} \int_{-\pi}^{\pi} f(\sigma) D_n(\sigma)\, d\sigma \Big|
                  \ge \Big| \frac{1}{\pi} \int_{-\pi}^{\pi} \operatorname{sgn}(D_n(\sigma)) D_n(\sigma)\, d\sigma \Big| - \varepsilon
                  = \frac{2}{\pi} \int_0^{\pi} |D_n(\sigma)|\, d\sigma - \varepsilon
                  = \lambda_n - \varepsilon,

whereby for ε → 0 we have the lower bound Fn ∞ ≥ λn .


Altogether, we find Fn ∞ = λn , as stated. 
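The growth of the Lebesgue constants can be observed numerically. The following sketch approximates λ_n by quadrature of (6.42) and prints it next to the bounds stated below in Theorem 6.51:

```python
import numpy as np

# Sketch: Lebesgue constants (6.42) versus the bounds of Theorem 6.51.
sigma = np.linspace(1e-9, np.pi, 200001)

for n in (1, 4, 16, 64, 256):
    integrand = np.abs(np.sin((n + 0.5) * sigma) / np.sin(sigma / 2))
    lam = np.trapz(integrand, sigma) / np.pi
    lower = 4.0 / np.pi ** 2 * np.log(n + 1)
    upper = 1.0 + np.log(2 * n + 1)
    print(n, lower, lam, upper)        # lambda_n grows like log(n), within the bounds
```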

Remark 6.50. To obtain uniform convergence,

Fn f − f ∞ −→ 0 for n → ∞,

for all f ∈ C2π , we require the Fourier operator norms Fn ∞ = λn to be


uniformly bounded from above. We can see this from the triangle inequality

Fn f ∞ ≤ Fn f − f ∞ + f ∞ .

Indeed, if the norms Fn ∞ are not uniformly bounded from above, then
there must be at least one f ∈ C2π yielding divergence Fn f ∞ −→ ∞ for
n → ∞, in which case the sequence of error norms Fn f − f ∞ must be
divergent, i.e., Fn f − f ∞ −→ ∞ for n → ∞. 

Unfortunately, the operator norms Fn ∞ are not uniformly bounded


from above. This is because we have the following estimates for λn = Fn ∞ .

Theorem 6.51. For the Lebesgue constants λn in (6.42), we have


    \frac{4}{\pi^2} \log(n+1) \le \lambda_n \le 1 + \log(2n+1) \qquad \text{for all } n \in \mathbb{N}_0.   (6.43)
Proof. For n = 0, the estimates in (6.43) are satisfied by λ0 = 1.
Now suppose n ≥ 1. For the zeros

    \sigma_k = \frac{k\pi}{n + 1/2} \qquad \text{for } k \in \mathbb{Z}

of Dn (σ) in (6.37) we obtain, on the one hand, the lower estimates


    \lambda_n \ge \frac{1}{\pi} \sum_{k=0}^{n-1} \int_{\sigma_k}^{\sigma_{k+1}} \Big| \frac{\sin((n + 1/2)\sigma)}{\sin(\sigma/2)} \Big|\, d\sigma
              \ge \frac{2}{\pi} \sum_{k=0}^{n-1} \frac{1}{\sigma_{k+1}} \int_{\sigma_k}^{\sigma_{k+1}} |\sin((n + 1/2)\sigma)|\, d\sigma   (6.44)
              = \frac{4}{\pi^2} \sum_{k=0}^{n-1} \frac{1}{k+1}
              \ge \frac{4}{\pi^2} \log(n+1),   (6.45)
where we have used the estimate

| sin(σ/2)| ≤ |σ/2| for all σ ∈ R

in (6.44) and, moreover, we have used the estimate


    \sum_{k=0}^{n-1} \frac{1}{k+1} \ge \log(n+1) \qquad \text{for all } n \in \mathbb{N}

in (6.45).
On the other hand, we have for the integrand in (6.42) the estimates
    \Big| \frac{\sin((n + 1/2)\sigma)}{\sin(\sigma/2)} \Big| = \Big| 2\Big[ \frac{1}{2} + \sum_{j=1}^{n} \cos(j\sigma) \Big] \Big| = \Big| 1 + 2\sum_{j=1}^{n} \cos(j\sigma) \Big| \le 1 + 2n,

see (6.35), and, moreover,

    \Big| \frac{\sin((n + 1/2)\sigma)}{\sin(\sigma/2)} \Big| \le \frac{1}{\sigma/\pi} = \frac{\pi}{\sigma} \qquad \text{for } \pi \ge \sigma \ge \frac{\pi}{2n+1} =: \mu_n,
where we have used the estimate

sin(σ/2) ≥ σ/π for all σ ∈ [0, π].

But this already implies the upper bound


    \lambda_n \le \frac{1}{\pi} \Big( \int_0^{\mu_n} (2n+1)\, d\sigma + \int_{\mu_n}^{\pi} \frac{\pi}{\sigma}\, d\sigma \Big) = \frac{\mu_n}{\pi}(2n+1) + \log(\pi/\mu_n) = 1 + \log(2n+1).

Since Fn ∞ is unbounded, we can conclude from Remark 6.50, that there
exists at least one function f ∈ C2π for which the sequence of Fourier partial
sums Fn f does not converge uniformly to f . This fundamental insight is
based on the important uniform boundedness principle of Banach-Steinhaus.

6.3.5 Uniform Boundedness Principle

Let us first quote the Banach8 -Steinhaus9 theorem, a well-known result from
functional analysis, before we draw relevant conclusions. We will not prove the
Banach-Steinhaus theorem, but rather refer the reader to the textbook [33].

Theorem 6.52. (Banach-Steinhaus, 1927).


Let (Ln )n∈N be a sequence of bounded linear operators

Ln : B1 −→ B2 for n ∈ N

between two Banach spaces B1 and B2 . Moreover, suppose the operators Ln


are pointwise bounded, i.e., for any f ∈ B1 we have

    \sup_{n\in\mathbb{N}} \|L_n f\| < \infty.

Then, the uniform boundedness principle holds for the operators L_n, i.e.,

    \sup_{n\in\mathbb{N}} \|L_n\| < \infty.

In conclusion, by the Banach-Steinhaus theorem, the pointwise bounded-


ness of the operators (Ln )n∈N implies their uniform boundedness. But this
has negative consequences for the approximation with Fourier partial sums.
We can further explain this by providing the following corollary.

Corollary 6.53. There is a function f ∈ C2π for which the sequence


(Fn f )n∈N of Fourier partial sums does not converge uniformly to f , i.e.,

Fn f − f ∞ −→ ∞ for n → ∞.

Moreover, for this f , we have the divergence

Fn f ∞ −→ ∞ for n → ∞.

Proof. The function space C2π , equipped with the maximum norm  · ∞ , is
a Banach space. By the divergence Fn ∞ = λn −→ ∞ for n → ∞, there is
one f ∈ C2π with Fn f ∞ −→ ∞ for n → ∞. Indeed, otherwise this would
contradict the Banach-Steinhaus theorem. Now the estimate

Fn f − f ∞ ≥ Fn f ∞ − f ∞

immediately implies, for this f , the stated divergence Fn f − f ∞ −→ ∞,


for n → ∞, of the Fourier partial sums’ maximum norms. 
8
Stefan Banach (1892-1945), Polish mathematician
9
Hugo Steinhaus (1887-1972), Polish mathematician

Next, we show the norm minimality of the Fourier operator Fn among all
surjective projection operators onto the linear space of trigonometric poly-
nomials Tn . The following result dates back to Charshiladse-Losinski.

Theorem 6.54. (Charshiladse-Losinski).


For n ∈ N0 , let L : C2π −→ Tn be a continuous linear projection operator,
i.e.,
L(Lf ) = L(f ) for all f ∈ C2π .
Moreover, suppose L is surjective, i.e., L(C2π ) = Tn . Then, we have

L∞ ≥ Fn ∞ .

Proof. We define for s ∈ R the translation operator Ts by

(Ts f )(x) := f (x + s) for f ∈ C2π and x ∈ R.

Note that Ts ∞ = 1. Moreover, we define a linear operator G by


    (Gf)(x) := \frac{1}{2\pi} \int_{-\pi}^{\pi} (T_{-s} L T_s f)(x)\, ds \qquad \text{for } f \in C_{2\pi} \text{ and } x \in \mathbb{R}.   (6.46)

Then, G : C2π −→ Tn is bounded (i.e., continuous) on C2π , since we have

|(Gf )(x)| ≤ T−s LTs f ∞ ≤ T−s ∞ L∞ Ts ∞ f ∞ = L∞ f ∞

and so Gf ∞ ≤ L∞ f ∞ for all f ∈ Tn , or,

G∞ ≤ L∞ .

Now the operator G coincides on C2π with the Fourier operator Fn , as we


will show by the following lemma. This then completes our proof. 

Lemma 6.55. Suppose the operator L : C2π −→ Tn satisfies the assumptions


in Theorem 6.54. Then, the operator G in (6.46) coincides on C2π with the
Fourier operator Fn : C2π −→ Tn , i.e., we have

Gf = Fn f for all f ∈ C2π .


Proof. We obtain the extension L : C^C_{2π} → T^C_n of the operator L by letting

    Lf := Lu + iLv \qquad \text{for } f = u + iv \in C^{\mathbb{C}}_{2\pi}, \text{ where } u, v \in C_{2\pi} = C^{\mathbb{R}}_{2\pi}.

In this way, the extension of G in (6.46) from C_{2π} to C^C_{2π} is well-defined.
Moreover, we work with the extension of F_n from C_{2π} to C^C_{2π}.
Since the orthonormal system {e^{ij·} | j ∈ Z} is complete in C^C_{2π} (cf. Re-
mark 6.42 and Exercise 6.89) and by the continuity of the linear operators
F_n : C^C_{2π} → T^C_n and G : C^C_{2π} → T^C_n, it is sufficient to show the identity

    G\big(e^{ij\cdot}\big) = F_n\big(e^{ij\cdot}\big) \qquad \text{for all } j \in \mathbb{Z}.   (6.47)


To this end, we take a closer look at the operator G. First we note
Ts eij· (x) = eij(x+s) = eijx eijs
and this implies
LTs eij· (x) = eijs Leij· (x)
and, moreover,
T−s LTs eij· (x) = eijs Leij· (x − s). (6.48)

Case 1: For |j| ≤ n, we have (since L is surjective)


    (Lf)(x) = e^{ijx} \in \mathcal{T}_n^{\mathbb{C}}

for one f ∈ C^C_{2π}. Together with the projection property L(Lf) = Lf, this
implies (for this particular f) the identity

    (L(Lf))(x) = \big(L e^{ij\cdot}\big)(x) = (Lf)(x) = e^{ijx},

i.e., (L e^{ij·})(x) = e^{ijx}. In combination with (6.48), we further obtain

    \big(T_{-s} L T_s e^{ij\cdot}\big)(x) = e^{ijs} e^{ij(x-s)} = e^{ijx}

and so

    \big(G e^{ij\cdot}\big)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ijx}\, ds = e^{ijx} = \big(F_n e^{ij\cdot}\big)(x).

Case 2: For |j| > n, we have (F_n e^{ij·})(x) = 0. Moreover, the function
e^{ijs} is orthogonal to the trigonometric polynomial (L e^{ij·})(x − s) ∈ T^C_n.
From this and by (6.48), we obtain

    \big(G e^{ij\cdot}\big)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ijs} \big(L e^{ij\cdot}\big)(x - s)\, ds = 0.
Altogether, the identity (6.47) holds, as stated. 
Obviously, the result of Theorem 6.54 makes our situation worse. Indeed,
we can formulate one more negative consequence from the Charshiladse-
Losinski theorem.
Corollary 6.56. Let (Ln )n∈N0 be a sequence of continuous and surjective
linear projection operators Ln : C2π −→ Tn . Then, there is a function f ∈ C2π
satisfying
Ln f ∞ −→ ∞ for n → ∞,
whereby
Ln f − f ∞ −→ ∞ for n → ∞.


Corollary 6.56 can be proven by similar arguments as for Corollary 6.53.


We finally draw another negative conclusion from the Banach-Steinhaus
theorem, which prohibits uniform convergence for sequences of interpolation
polynomials. The following important result is due to Faber10 [23].
Theorem 6.57. (Faber, 1914). For any sequence (In )n∈N0 of interpolation
operators In : C [a, b] −→ Pn , there is a continuous function f ∈ C [a, b],
for which the corresponding sequence (In f )n∈N0 of interpolation polynomials
In f ∈ Pn does not converge uniformly to f . 
For a proof of the Faber theorem, we refer to Exercise 6.93.

6.4 The Jackson Theorems


In this section, we analyze the asymptotic behaviour of the minimal distances
    \eta_\infty(f, \mathcal{T}_n) := \inf_{T \in \mathcal{T}_n} \|T - f\|_\infty \qquad \text{for } f \in C_{2\pi}
    \eta_\infty(f, \mathcal{P}_n) := \inf_{p \in \mathcal{P}_n} \|p - f\|_\infty \qquad \text{for } f \in C[a, b]

for n → ∞ with respect to the maximum norm ·∞ . According to the Weier-
strass theorems, Corollaries 6.12 and 6.17, we can rely on the convergence
η∞ (f, Tn ) −→ 0 and η∞ (f, Pn ) −→ 0 for n → ∞.
In this section, we quantify the asymptotic decay of the zero sequences
(η∞ (f, Tn ))n∈N0 and (η∞ (f, Pn ))n∈N0 for n → ∞.
We begin our analysis with the trigonometric case, i.e., with the asymp-
totic behaviour of (η∞ (f, Tn ))n∈N0 . On this occasion, we first recall the con-
vergence rates of the Fourier partial sums Fn f for f ∈ C2π . By the estimate
η∞ (f, Tn ) ≤ Fn f − f ∞ for n ∈ N0
we expect for f ∈ C2π
k
, k ≥ 1, at least the convergence rate k − 1, according
to Corollary 6.47. However, as it turns out, we gain even more. In fact, we
will obtain the convergence rate k, i.e.,
η∞ (f, Tn ) = O(n−k ) for n → ∞ for f ∈ C2π
k
.
Note that this complies with the convergence behaviour of Fourier partial
sums Fn f with respect to the Euclidean norm  · . Indeed, in that case, we
have, by Theorem 6.44, the asymptotic behaviour
η(f, Tn ) = o(n−k ) for n → ∞ for f ∈ C2π
k
.
For an intermediate conclusion, we note one important principle:
The smoother f ∈ C2πk
is, i.e., the larger k ∈ N, the faster the convergence
of the minimal distances η(f, Tn ) and η∞ (f, Tn ) to zero, for n → ∞.
10
Georg Faber (1877-1966), German mathematician

Remark 6.58. On this occasion, we recall Remark 6.45, where we had


drawn a similar conclusion for the approximation by Fourier partial sums
with respect to the Euclidean norm. As regards our above intermediate con-
clusion, we remark that the converse of that principle is covered by the clas-
sical Bernstein theorems (see e.g. [11]), albeit we decided to refrain from
discussing Bernstein theorems in more details. 
In this section, we develop suitable conditions on f ∈ C2π \ C2π
1
, under
which the sequence (Fn f )n∈N0 of Fourier partial sums converges uniformly
to f . In this way, we give an answer to the question which we formulated
at the outset of Section 6.3.4. But first we require some preparations. Let
Πn : C2π −→ Tn denote the nonlinear projection operator, which assigns
f ∈ C2π to its unique best approximation Πn f ∈ Tn with respect to the
maximum norm  · ∞ , so that
η∞ (f, Tn ) = Πn f − f ∞ for all f ∈ C2π .
Then, we have the estimate
    \|F_n f - f\|_\infty = \|F_n f - \Pi_n f + \Pi_n f - f\|_\infty
                        = \|F_n(f - \Pi_n f) + (\Pi_n f - f)\|_\infty
                        = \|(I - F_n)(\Pi_n f - f)\|_\infty
                        \le \|I - F_n\|_\infty \cdot \|\Pi_n f - f\|_\infty
                        = \|I - F_n\|_\infty \cdot \eta_\infty(f, \mathcal{T}_n),   (6.49)
where I denotes the identity on C2π . By Theorem 6.51 the sequence of ope-
rator norms Fn ∞ = λn diverges logarithmically, so that
I − Fn ∞ ≤ I∞ + Fn ∞ = O(log(n)) for n → ∞. (6.50)
On the ground of this observation, the asymptotic analysis of the mini-
mal distances η∞ (f, Tn ) is of primary interest: Namely, if we can show, for
f ∈ C2π , that the sequence (η∞ (f, Tn ))n∈N0 converges to zero at least alge-
braically, so that
log(n) · η∞ (f, Tn ) −→ 0 for n → ∞, (6.51)
then the sequence (Fn f )n∈N0 converges by (6.49) and (6.50) uniformly to f .
To this end, the following inequalities of Jackson11 are indeed very useful.
We begin our asymptotic analysis of the minimal distances η∞ (f, Tn ) for
continuously differentiable functions f ∈ C2π
1
. Recall that in this case the uni-
form convergence of the Fourier partial sums Fn f to f is already guaranteed
by Corollary 6.46 and quantified by Corollary 6.47. Nevertheless, the follo-
wing Jackson theorem is of fundamental importance for further investigations
concerning convergence rates of the minimal distance (η∞ (f, Tn ))n∈N0 .
11
Dunham Jackson (1888-1946), US-American mathematician

Theorem 6.59. (Jackson 1). For f ∈ C^1_{2π}, we have

    \eta_\infty(f, \mathcal{T}_n) \le \frac{\pi}{2(n+1)}\, \|f'\|_\infty = O(n^{-1}) \qquad \text{for } n \to \infty.   (6.52)
Remark 6.60. The estimate of Jackson 1, Theorem 6.59, is sharp, i.e., there
is a function f ∈ C2π
1
\ Tn for which equality holds in (6.52). For more details,
we refer to Exercise 6.95. 
Our proof for Theorem 6.59 is based on the following two lemmas.
Lemma 6.61. We have
 
  
    \min_{a_1,\dots,a_n \in \mathbb{R}} \int_0^{\pi} \Big| \xi - \sum_{j=1}^{n} a_j \sin(j\xi) \Big|\, d\xi = \frac{\pi^2}{2(n+1)}.   (6.53)

Lemma 6.62. For A1 , . . . , An ∈ R, let Ln : C2π −→ Tn be a linear operator


of the form
    (L_n f)(x) := \frac{a_0}{2} + \sum_{j=1}^{n} A_j [a_j \cos(jx) + b_j \sin(jx)] \qquad \text{for } f \in C_{2\pi},   (6.54)

where a_0 = (f, 1), a_j = (f, cos(j·)) and b_j = (f, sin(j·)), for 1 ≤ j ≤ n, are
the Fourier coefficients of f in (6.29). Then we have, for f ∈ C^1_{2π}, the error
representation

    (L_n f - f)(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} \Big[ \frac{\xi}{2} + \sum_{j=1}^{n} \frac{(-1)^j}{j} A_j \sin(j\xi) \Big] f'(x + \pi - \xi)\, d\xi.   (6.55)

Now we can prove the statement of Jackson 1, Theorem 6.59.


Proof. (Jackson 1). For f ∈ C^1_{2π}, we have for the minimal distance

    \eta_\infty(f, \mathcal{T}_n) = \inf_{T \in \mathcal{T}_n} \|T - f\|_\infty

the estimate

    \eta_\infty(f, \mathcal{T}_n) \le \|L_n f - f\|_\infty
    \le \|f'\|_\infty \cdot \frac{1}{\pi} \int_{-\pi}^{\pi} \Big| \frac{\xi}{2} + \sum_{j=1}^{n} \frac{(-1)^j}{j} A_j \sin(j\xi) \Big|\, d\xi
    = \|f'\|_\infty \cdot \frac{1}{\pi} \int_0^{\pi} \Big| \xi + \sum_{j=1}^{n} \frac{2(-1)^j}{j} A_j \sin(j\xi) \Big|\, d\xi
    = \|f'\|_\infty \cdot \frac{1}{\pi} \cdot \frac{\pi^2}{2(n+1)}
    = \|f'\|_\infty \cdot \frac{\pi}{2(n+1)},

where in the second line we use the error representation (6.55). Moreover,
in the penultimate line we choose optimal coefficients A1 , . . . , An according
to (6.53). 

Now let us prove the two lemmas.

Proof. (Lemma 6.62). Suppose f ∈ C^1_{2π}. We use the notation

    g(\xi) := \frac{\xi}{2} + \sum_{j=1}^{n} \frac{(-1)^j}{j} A_j \sin(j\xi)

for the first factor of the integrand in (6.55). This way we obtain

    \frac{1}{\pi} \int_{-\pi}^{\pi} g(\xi) f'(x + \pi - \xi)\, d\xi
    = \frac{1}{\pi} \Big[ -g(\xi) f(x + \pi - \xi) \Big]_{\xi=-\pi}^{\xi=\pi} + \frac{1}{\pi} \int_{-\pi}^{\pi} g'(\xi) f(x + \pi - \xi)\, d\xi
    = -\frac{1}{\pi}\frac{\pi}{2} f(x) - \frac{1}{\pi}\frac{\pi}{2} f(x + 2\pi) + \frac{1}{\pi} \int_{-\pi}^{\pi} g'(x + \pi - \sigma) f(\sigma)\, d\sigma
    = -f(x) + \frac{1}{\pi} \int_{-\pi}^{\pi} g'(x + \pi - \sigma) f(\sigma)\, d\sigma

after integration by parts from the error representation (6.55). Now we have

    g'(x + \pi - \sigma) = \frac{1}{2} + \sum_{j=1}^{n} \frac{(-1)^j}{j} A_j \cdot j \cdot \cos(j(x + \pi - \sigma))
    = \frac{1}{2} + \sum_{j=1}^{n} (-1)^j A_j [\cos(j(x+\pi))\cos(j\sigma) + \sin(j(x+\pi))\sin(j\sigma)]
    = \frac{1}{2} + \sum_{j=1}^{n} (-1)^j A_j (-1)^j (\cos(jx)\cos(j\sigma) + \sin(jx)\sin(j\sigma))
    = \frac{1}{2} + \sum_{j=1}^{n} A_j [\cos(jx)\cos(j\sigma) + \sin(jx)\sin(j\sigma)]

and so

    \frac{1}{\pi} \int_{-\pi}^{\pi} g'(x + \pi - \sigma) f(\sigma)\, d\sigma = \frac{a_0}{2} + \sum_{j=1}^{n} A_j [a_j \cos(jx) + b_j \sin(jx)] = (L_n f)(x),

which already shows that the stated error representation holds. 



Proof. (Lemma 6.61). For arbitrary a1 , . . . , an ∈ R, we have the estimate


 
    \int_0^{\pi} \Big| \xi - \sum_{j=1}^{n} a_j \sin(j\xi) \Big|\, d\xi
    \ge \Big| \int_0^{\pi} \Big[ \xi - \sum_{j=1}^{n} a_j \sin(j\xi) \Big] \operatorname{sgn}(\sin((n+1)\xi))\, d\xi \Big|   (6.56)
    = \Big| \int_0^{\pi} \xi \cdot \operatorname{sgn}(\sin((n+1)\xi))\, d\xi \Big|   (6.57)
    = \Big| \sum_{k=0}^{n} (-1)^k \int_{k\pi/(n+1)}^{(k+1)\pi/(n+1)} \xi\, d\xi \Big|
    = \Big| \frac{1}{2} \frac{\pi^2}{(n+1)^2} \sum_{k=0}^{n} (-1)^k \big[ (k+1)^2 - k^2 \big] \Big|
    = \Big| \frac{\pi^2}{2(n+1)^2} \sum_{k=0}^{n} (-1)^k (2k+1) \Big|
    = \frac{\pi^2}{2(n+1)^2} \cdot (n+1) = \frac{\pi^2}{2(n+1)},

where for the equality in (6.57) we use the identity

    \int_0^{\pi} \sin(j\xi) \cdot \operatorname{sgn}(\sin((n+1)\xi))\, d\xi = 0 \qquad \text{for } j < n+1.   (6.58)

We prove the statement (6.58) by Lemma 6.63.


For the solution of the optimization problem (6.53), we determine coef-
ficients a1 , . . . , an ∈ R, such that equality holds in (6.56). In this case, the
function

    g(\xi) = \xi - \sum_{j=1}^{n} a_j \sin(j\xi)

must necessarily change signs at the points ξk = kπ/(n + 1) ∈ (0, π), for
1 ≤ k ≤ n. Indeed, this is because the function sgn(sin((n + 1)ξ)) has sign
changes on (0, π) only at the points ξ1 , . . . , ξn .
Note that this requirement yields n conditions on the sought coefficients
a1 , . . . , an ∈ R, where these conditions are the interpolation conditions

    \xi_k = \sum_{j=1}^{n} a_j \sin(j\xi_k) \qquad \text{for } 1 \le k \le n.   (6.59)

But the interpolation problem (6.59) has a unique solution, since the trigono-
metric polynomials sin(j·), 1 ≤ j ≤ n, form a Haar system on (0, π) (see
Exercise 5.54). 

Finally, it remains to show the identity (6.58).


Lemma 6.63. For n ∈ N, we have the identity
    \int_0^{\pi} \sin(j\xi) \cdot \operatorname{sgn}(\sin((n+1)\xi))\, d\xi = 0 \qquad \text{for } 1 \le j < n+1.   (6.60)

Proof. The integrand in (6.60) is an even function. Now we regard the integral
in (6.60) on [−π, π] (rather than on [0, π]). By using the identity
    \sin(j\xi) = \frac{1}{2i} \big( e^{ij\xi} - e^{-ij\xi} \big)

it is sufficient to show

    I_j := \int_{-\pi}^{\pi} e^{ij\xi} \cdot \operatorname{sgn}(\sin((n+1)\xi))\, d\xi = 0 \qquad \text{for } 1 \le |j| < n+1.   (6.61)

After the substitution ξ = σ + π/(n + 1) in (6.61) the representation

    I_j = \int_{-\pi - \pi/(n+1)}^{\pi - \pi/(n+1)} e^{ij(\sigma + \pi/(n+1))} \cdot \operatorname{sgn}(\sin((n+1)\sigma + \pi))\, d\sigma
        = -e^{ij\pi/(n+1)} \int_{-\pi}^{\pi} e^{ij\sigma} \cdot \operatorname{sgn}(\sin((n+1)\sigma))\, d\sigma
        = -e^{ij\pi/(n+1)} \cdot I_j

holds. Since −e^{ijπ/(n+1)} ≠ 1, this implies I_j = 0 for 1 ≤ |j| < n + 1. 
We wish to work with weaker conditions on f (i.e., weaker than f ∈ C2π
1

as in Jackson 1). In the next Jackson theorem, we only need Lipschitz12


continuity for f .
Definition 6.64. A function f : [a, b] −→ R is said to be Lipschitz
continuous on [a, b] ⊂ R, if there is a constant L > 0 satisfying
|f(x) − f(y)| ≤ L|x − y| for all x, y ∈ [a, b].
In this case, L is called a Lipschitz constant of f on [a, b].
Remark 6.65. For a compact interval [a, b] ⊂ R, every function f ∈ C 1 [a, b]
is Lipschitz continuous on [a, b]. Indeed, this is because in this case, the mean
value theorem applies, so that for any x, y ∈ [a, b], we have the representation
f (x) − f (y) = f  (ξ) · (x − y) for some ξ ∈ (a, b)
and this implies the estimate
|f (x) − f (y)| ≤ f  ∞ · |x − y| for all x, y ∈ [a, b].
Therefore, L = f  ∞ is a Lipschitz constant of f on [a, b]. 
12
Rudolf Lipschitz (1832-1903), German mathematician

Theorem 6.66. (Jackson 2). Let f ∈ C2π be Lipschitz continuous on


[0, 2π] with Lipschitz constant L > 0. Then, we have
    \eta_\infty(f, \mathcal{T}_n) \le \frac{\pi L}{2(n+1)} = O(n^{-1}) \qquad \text{for } n \to \infty.
Remark 6.67. The estimate of Jackson 2, Theorem 6.66, is sharp. For more
details, we refer to Exercise 6.95. 
Proof. For δ > 0, we consider the local mean value function

    \varphi_\delta(x) = \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(\xi)\, d\xi \qquad \text{for } x \in \mathbb{R}   (6.62)

of f on (x − δ, x + δ). Then, we have

    \varphi_\delta'(x) = \frac{f(x+\delta) - f(x-\delta)}{2\delta} \qquad \text{for all } x \in \mathbb{R},

and so φ_δ is in C^1_{2π}. Moreover, φ_δ' satisfies the uniform bound

    |\varphi_\delta'(x)| \le L \qquad \text{for all } x \in \mathbb{R},

i.e., ‖φ_δ'‖_∞ ≤ L. By Jackson 1, Theorem 6.59, this implies

    \eta_\infty(\varphi_\delta, \mathcal{T}_n) \le \frac{\pi L}{2(n+1)}.
Moreover, we have
  
    |\varphi_\delta(x) - f(x)| = \Big| \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} (f(\xi) - f(x))\, d\xi \Big| \le \frac{L}{2\delta} \int_{x-\delta}^{x+\delta} |\xi - x|\, d\xi = \frac{L}{2\delta} \cdot \delta^2 = \frac{L}{2} \cdot \delta \longrightarrow 0 \quad \text{for } \delta \to 0.
Now let T ∗ (ϕδ ) ∈ Tn be the best approximation to ϕδ from Tn with
respect to  · ∞ , so that
η∞ (ϕδ , Tn ) = T ∗ (ϕδ ) − ϕδ ∞ .
Then, we have
    \eta_\infty(f, \mathcal{T}_n) \le \|T^*(\varphi_\delta) - f\|_\infty
                                \le \|T^*(\varphi_\delta) - \varphi_\delta\|_\infty + \|\varphi_\delta - f\|_\infty
                                \le \frac{\pi L}{2(n+1)} + \frac{L}{2}\,\delta,

whereby for δ ↘ 0, we obtain

    \eta_\infty(f, \mathcal{T}_n) \le \frac{\pi L}{2(n+1)}
as stated. 

To obtain even weaker conditions on the target function f ∈ C2π , we now


work with the modulus of continuity.

Definition 6.68. For [a, b] ⊂ R, let f ∈ C [a, b] and δ > 0. Then,

    \omega(f, \delta) = \sup_{\substack{x,\, x+\sigma \in [a,b] \\ |\sigma| \le \delta}} |f(x+\sigma) - f(x)|

is called modulus of continuity of f on [a, b] with respect to δ.

Remark 6.69. Note that the modulus of continuity ω(f, δ) quantifies the
local distance between the function values of f uniformly on [a, b]. In fact,
the smaller the modulus of continuity ω(f, δ), the smaller is the local variation
of f on [a, b]. For a compact interval [a, b] ⊂ R, the modulus of continuity
ω(f, δ) of f ∈ C [a, b] is finite by

ω(f, δ) ≤ 2f ∞,[a,b] for all b − a ≥ δ > 0,

and, moreover, we have the convergence

ω(f, δ) −→ 0 for δ  0.

For f ∈ C 1 [a, b] and x, x + σ ∈ [a, b], we have

f (x + σ) − f (x) = σ · f  (ξ) for some ξ ∈ (x, x + σ)

by the mean value theorem, and so

ω(f, δ) ≤ δ · f  ∞ .

For a Lipschitz continuous function f ∈ C [a, b] with Lipschitz constant L > 0


we finally have
ω(f, δ) ≤ δ · L.
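The modulus of continuity can be approximated on a grid. The following sketch (the grid size and the two test functions are arbitrary choices) illustrates ω(f, δ) ≈ δ for f(x) = |x| and ω(f, δ) ≈ √δ for f(x) = √|x| on [−1, 1]:

```python
import numpy as np

# Sketch for Definition 6.68: grid approximation of the modulus of continuity.
def modulus(f, a, b, delta, m=4000):
    x = np.linspace(a, b, m)
    fx = f(x)
    shift = max(1, int(round(delta / (x[1] - x[0]))))   # grid steps with |sigma| <= delta
    return max(np.max(np.abs(fx[k:] - fx[:m - k])) for k in range(1, shift + 1))

for delta in (0.1, 0.01, 0.001):
    print(delta, modulus(np.abs, -1.0, 1.0, delta))                       # ~ delta
    print(delta, modulus(lambda x: np.sqrt(np.abs(x)), -1.0, 1.0, delta)) # ~ sqrt(delta)
```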


The following Jackson theorem gives an upper bound for the minimal
distance η∞ (f, Tn ) by involving the modulus of continuity of f ∈ C2π .

Theorem 6.70. (Jackson 3). For f ∈ C2π , we have


 
    \eta_\infty(f, \mathcal{T}_n) \le \frac{3}{2} \cdot \omega\Big(f, \frac{\pi}{n+1}\Big).   (6.63)

Remark 6.71. The estimate of Jackson 3, Theorem 6.70, is not sharp. For
more details, we refer to Exercise 6.97. 

Proof. For the local mean value function φ_δ ∈ C^1_{2π} of f on (x − δ, x + δ)
in (6.62), we can give a uniform bound on the pointwise error by

    |\varphi_\delta(x) - f(x)| \le \Big| \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} (f(\xi) - f(x))\, d\xi \Big| \le \frac{1}{2\delta} \cdot 2\delta \cdot \omega(f, \delta) = \omega(f, \delta).

Moreover, φ_δ' is uniformly bounded above by

    \|\varphi_\delta'\|_\infty \le \frac{1}{2\delta} \cdot \omega(f, 2\delta).

Now let T*(φ_δ) ∈ T_n be the best approximation to φ_δ from T_n with
respect to ‖·‖_∞. Then, by Jackson 1, Theorem 6.59, this implies for δ > 0
the estimate

    \eta_\infty(f, \mathcal{T}_n) \le \|T^*(\varphi_\delta) - f\|_\infty
                                \le \|T^*(\varphi_\delta) - \varphi_\delta\|_\infty + \|\varphi_\delta - f\|_\infty
                                \le \frac{\pi}{2(n+1)} \cdot \frac{1}{2\delta} \cdot \omega(f, 2\delta) + \omega(f, \delta)
                                \le \omega(f, 2\delta) \Big( \frac{\pi}{4\delta(n+1)} + 1 \Big).

Letting δ = π/(2(n + 1)), this gives the stated estimate in (6.63). 

Next, we analyze the asymptotic decay rate of the minimal distances


η∞ (f, Tn ), for smoother target functions f . To be more precise, we prove
asymptotic convergence rates for f ∈ C^k_{2π}, k ∈ N. Given our previous results,
we can, for smoother f ∈ C^k_{2π}, i.e., for larger k, expect faster convergence of
the zero sequence (η_∞(f, T_n))_{n∈N_0}. This perception matches the result
of the following Jackson theorem.

Theorem 6.72. (Jackson 4). For f ∈ C^k_{2π}, k ≥ 1, we have

    \eta_\infty(f, \mathcal{T}_n) \le \Big( \frac{\pi}{2(n+1)} \Big)^k \cdot \|f^{(k)}\|_\infty = O(n^{-k}) \qquad \text{for } n \to \infty.
Our proof for Theorem 6.72 is based on two lemmas.

Lemma 6.73. For f ∈ C^1_{2π} and n ∈ N, the estimate

    \eta_\infty(f, \mathcal{T}_n) \le \frac{\pi}{2(n+1)} \cdot \eta_\infty(f', \mathcal{T}_n')

holds, where the linear space

    \mathcal{T}_n' := \operatorname{span}\{\cos(k\cdot), \sin(k\cdot) \mid 1 \le k \le n\} \qquad \text{for } n \in \mathbb{N}

consists of all trigonometric polynomials from T_n without the constants.



Remark 6.74. For n ∈ N, we have

    \mathcal{T}_n' = \{T' \in C_{2\pi} \mid T \in \mathcal{T}_n\} \subset \mathcal{T}_n

and this explains our notation T_n'. By T_n' ⊂ T_n, we find the estimate

    \eta_\infty(f, \mathcal{T}_n) \le \eta_\infty(f, \mathcal{T}_n')

for all f ∈ C_{2π}. 

Proof. (Lemma 6.73). Let T* ∈ T_n' be best approximation to f' from T_n'.
For

    T(x) := \int_0^x T^*(\xi)\, d\xi \in \mathcal{T}_n

we have T' = T* and so

    \|(T - f)'\|_\infty = \|T^* - f'\|_\infty = \eta_\infty(f', \mathcal{T}_n').

But this implies, by using Jackson 1, Theorem 6.59, the stated estimate:

    \eta_\infty(f, \mathcal{T}_n) = \eta_\infty(T - f, \mathcal{T}_n) \le \frac{\pi}{2(n+1)} \cdot \|(T - f)'\|_\infty = \frac{\pi}{2(n+1)} \cdot \eta_\infty(f', \mathcal{T}_n').


Lemma 6.75. Let f ∈ C^1_{2π} satisfy

    \int_{-\pi}^{\pi} f(x)\, dx = 0.   (6.64)

Then we have, for any n ∈ N, the two estimates

    \eta_\infty(f, \mathcal{T}_n') \le \frac{\pi}{2(n+1)} \cdot \|f'\|_\infty   (6.65)
    \eta_\infty(f, \mathcal{T}_n') \le \frac{\pi}{2(n+1)} \cdot \eta_\infty(f', \mathcal{T}_n').   (6.66)

Proof. For the modified Fourier partial sum L_n f in (6.54),

    (L_n f)(x) = \frac{a_0}{2} + \sum_{k=1}^{n} A_k (a_k \cos(kx) + b_k \sin(kx)),

we have a_0 ≡ a_0(f) = (f, 1) = 0 by (6.64) and so L_n f ∈ T_n'. Therefore, we
have (6.65), since

    \eta_\infty(f, \mathcal{T}_n') \le \|L_n f - f\|_\infty \le \|f'\|_\infty \cdot \frac{\pi}{2(n+1)}

holds for optimal coefficients A1 , . . . , An (like in the proof of Jackson 1).



To show (6.66), suppose that T* ∈ T_n' is best approximation to f' from T_n'.
For

    T(x) := \int_0^x T^*(\xi)\, d\xi \in \mathcal{T}_n

we have T' = T*. Moreover, for

    S(x) := T(x) - \frac{1}{2\pi} \int_{-\pi}^{\pi} T(\xi)\, d\xi = T(x) - \frac{a_0(T)}{2}

we have a_0(S) = (S, 1) = 0. Therefore, S ∈ T_n' and S' = T*. But this already
implies the stated estimate (6.66) by

    \eta_\infty(f, \mathcal{T}_n') = \eta_\infty(S - f, \mathcal{T}_n') \le \frac{\pi}{2(n+1)} \cdot \|S' - f'\|_\infty = \frac{\pi}{2(n+1)} \cdot \eta_\infty(f', \mathcal{T}_n').


Now we are in a position where we can prove Jackson 4, Theorem 6.72.

Proof. (Jackson 4). For f ∈ C^1_{2π} we have

    \int_{-\pi}^{\pi} f'(\xi)\, d\xi = f(\pi) - f(-\pi) = 0.

Now the estimate (6.66) in Lemma 6.75 implies

    \eta_\infty(f', \mathcal{T}_n') \le \Big( \frac{\pi}{2(n+1)} \Big)^{k-2} \cdot \eta_\infty(f^{(k-1)}, \mathcal{T}_n') \qquad \text{for } f \in C^{k-1}_{2\pi}

by induction on k ≥ 2. Moreover, by Lemma 6.73 and (6.65), we get

    \eta_\infty(f, \mathcal{T}_n) \le \frac{\pi}{2(n+1)} \cdot \eta_\infty(f', \mathcal{T}_n')
                                \le \Big( \frac{\pi}{2(n+1)} \Big)^{k-1} \cdot \eta_\infty(f^{(k-1)}, \mathcal{T}_n')
                                \le \Big( \frac{\pi}{2(n+1)} \Big)^{k-1} \frac{\pi}{2(n+1)}\, \|f^{(k)}\|_\infty
                                = \Big( \frac{\pi}{2(n+1)} \Big)^k \|f^{(k)}\|_\infty

for f ∈ C^k_{2π}, where k ≥ 1. 

Now we return to the discussion from the outset of this section concer-
ning the uniform convergence of Fourier partial sums. In that discussion, we
developed the error estimate (6.49),

Fn f − f ∞ ≤ I − Fn ∞ · η∞ (f, Tn ) for f ∈ C2π .



Moreover, we took further note on the application of the Jackson theorems.


We summarize the discussion of this section by the Dini13 -Lipschitz theorem,
each of whose results follows directly from (6.51),

log(n) · η∞ (f, Tn ) −→ 0 for n → ∞,

and one corresponding Jackson inequality in Theorems 6.66-6.72.

Theorem 6.76. (Dini-Lipschitz, 1872).


If f ∈ C2π satisfies one of the following conditions, then the sequence
(Fn f )n∈N0 of Fourier partial sums Fn f converges uniformly to f , i.e.,

Fn f − f ∞ −→ 0 for n → ∞,

at the following convergence rates.


(a) If
log(n) · ω(f, 1/n) = o(1) for n → ∞,
then we have (by Jackson 3)

Fn f − f ∞ = o(1) for n → ∞.

(b) If f is Lipschitz continuous, then we have (by Jackson 2)

Fn f − f ∞ = O(log(n)/n) for n → ∞.

(c) If f ∈ C2π
k
, for k ≥ 1, then we have (by Jackson 4)

Fn f − f ∞ = O(log(n)/nk ) for n → ∞.

Finally, we transfer the results of the theorems Jackson 1-4 concerning


the approximation of f ∈ C2π by trigonometric polynomials from Tn to the
case of approximation of f ∈ C [−1, 1] by algebraic polynomials from Pn .

Theorem 6.77. (Jackson 5). For the minimal distances

    \eta_\infty(f, \mathcal{P}_n) = \inf_{p \in \mathcal{P}_n} \|p - f\|_{\infty, [-1,1]} \qquad \text{for } f \in C[-1, 1]

the following estimates hold.


(a) For f ∈ C [−1, 1], we have
 
    \eta_\infty(f, \mathcal{P}_n) \le \frac{3}{2} \cdot \omega\Big(f, \frac{\pi}{n+1}\Big).
13
Ulisse Dini (1845-1918), Italian mathematician and politician

(b) If f is Lipschitz continuous with Lipschitz constant L > 0, then we have

    \eta_\infty(f, \mathcal{P}_n) \le \frac{3\pi L}{2(n+1)}.

(c) For f ∈ C^k[−1, 1], k ≥ 1, we have

    \eta_\infty(f, \mathcal{P}_n) \le \Big(\frac{\pi}{2}\Big)^k \frac{1}{(n+1)n(n-1)\cdots(n-(k-2))}\, \|f^{(k)}\|_\infty = O(n^{-k}) \qquad \text{for } n \to \infty.

We split the proof of Jackson 5, Theorem 6.77, into several lemmas. The
following lemma reveals the structural connection between the trigonometric
and the algebraic case.

Lemma 6.78. For f ∈ C [−1, 1] and g(ϕ) = f (cos(ϕ)) ∈ C2π , we have

η∞ (f, Pn ) = η∞ (g, Tn ).

Proof. For f ∈ C [−1, 1] the function g ∈ C2π is even. Therefore, the unique
best approximation T ∗ ∈ Tn to g is even, so that we have

T ∗ (ϕ) = p(cos(ϕ)) for ϕ ∈ [0, 2π]

for some p ∈ Pn . Moreover, we find the relation

η∞ (g, Tn ) = T ∗ − g∞ = p − f ∞ = p∗ − f ∞ = η∞ (f, Pn ),

where p∗ ∈ Pn is the unique best approximation to f from Pn . 

Lemma 6.79. For f ∈ C [−1, 1] and g(ϕ) = f (cos(ϕ)) ∈ C2π , we have

ω(g, δ) ≤ ω(f, δ) for all δ > 0.

Proof. By the mean value theorem, we have

| cos(ϕ + ε) − cos(ϕ)| ≤ ε for ε > 0

which in turn implies

    \omega(g, \delta) = \sup_{|\varepsilon| \le \delta} |g(\varphi + \varepsilon) - g(\varphi)| = \sup_{|\varepsilon| \le \delta} |f(\cos(\varphi + \varepsilon)) - f(\cos(\varphi))| \le \sup_{|\sigma| \le \delta} |f(x + \sigma) - f(x)| = \omega(f, \delta).

Now we prove statements (a) and (b) of Theorem 6.77.



Proof. (Jackson 5, parts (a),(b)).


(a): From Jackson 3, Theorem 6.70, in combination with Lemma 6.78 and
Lemma 6.79, we can conclude
   
    \eta_\infty(f, \mathcal{P}_n) = \eta_\infty(g, \mathcal{T}_n) \le \frac{3}{2} \cdot \omega\Big(g, \frac{\pi}{n+1}\Big) \le \frac{3}{2} \cdot \omega\Big(f, \frac{\pi}{n+1}\Big)

for f ∈ C[−1, 1].
(b): Statement (a) implies, for Lipschitz continuous f ∈ C[−1, 1] with
Lipschitz constant L, the estimate

    \eta_\infty(f, \mathcal{P}_n) \le \frac{3}{2} \cdot \omega\Big(f, \frac{\pi}{n+1}\Big) \le \frac{3}{2} \cdot \frac{\pi L}{n+1}.
2 n+1 2 n+1

To prove part (c) of Jackson 5, Theorem 6.77, we use the following lemma.
Lemma 6.80. For f ∈ C^1[−1, 1], we have

    \eta_\infty(f, \mathcal{P}_n) \le \frac{\pi}{2(n+1)}\, \eta_\infty(f', \mathcal{P}_{n-1}).

Proof. Let p* ∈ P_{n−1} be best approximation to f' from P_{n−1}. For

    p(x) = \int_0^x p^*(\xi)\, d\xi \in \mathcal{P}_n

we have p' = p* and so

    \eta_\infty(f, \mathcal{P}_n) = \eta_\infty(p - f, \mathcal{P}_n) \le \frac{\pi}{2(n+1)}\, \|p' - f'\|_\infty = \frac{\pi}{2(n+1)}\, \eta_\infty(f', \mathcal{P}_{n-1})
holds by Jackson 1, Theorem 6.59, and Lemma 6.78. 
Now we can prove statement (c) of Jackson 5, Theorem 6.77.
Proof. (Jackson 5, part (c)).
For f ∈ C k [−1, 1], we obtain from Lemma 6.80 the estimate
    \eta_\infty(f, \mathcal{P}_n) \le \Big(\frac{\pi}{2}\Big)^k \frac{1}{(n+1)n(n-1)\cdots(n+2-k)} \cdot \eta_\infty(f^{(k)}, \mathcal{P}_{n-k})

by induction on k ≥ 1. This already implies, by

    \eta_\infty(f^{(k)}, \mathcal{P}_{n-k}) \le \|f^{(k)} - 0\|_\infty = \|f^{(k)}\|_\infty,

the stated estimate

    \eta_\infty(f, \mathcal{P}_n) \le \Big(\frac{\pi}{2}\Big)^k \frac{1}{(n+1)n(n-1)\cdots(n-(k-2))}\, \|f^{(k)}\|_\infty.


We close this chapter by giving a reformulation of the Dini-Lipschitz the-


orem for the algebraic case.

Theorem 6.81. (Dini-Lipschitz).


If f ∈ C [−1, 1] satisfies one of the following conditions, then the sequence
(Πn f )n∈N0 of Chebyshev partial sums


    \Pi_n f = \sum_{k=0}^{n} \frac{(f, T_k)_w}{\|T_k\|_w^2}\, T_k

in (4.32) converges uniformly to f , i.e.,

Πn f − f ∞ −→ 0 for n → ∞,

at the following convergence rates.


(a) If
log(n) · ω(f, 1/n) = o(1) for n → ∞,
then we have by Jackson 5 (a)

Πn f − f ∞ = o(1) for n → ∞.

(b) If f is Lipschitz continuous, then we have by Jackson 5 (b)

Πn f − f ∞ = O(log(n)/n) for n → ∞.

(c) If f ∈ C k [−1, 1], for k ≥ 1, then we have by Jackson 5 (c)

Πn f − f ∞ = O(log(n)/nk ) for n → ∞.

For the proof of the Dini-Lipschitz theorem we refer to Exercise 6.99.
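The behaviour described in Theorem 6.81 can also be observed numerically. The following sketch approximates the Chebyshev partial sums by truncating a high-degree Chebyshev expansion (NumPy's chebinterpolate serves here only as a cheap stand-in for the exact coefficients (f, T_k)_w/‖T_k‖²_w) and prints the uniform error for the Lipschitz continuous function f(x) = |x|:

```python
import numpy as np
from numpy.polynomial import chebyshev

# Sketch for Theorem 6.81: uniform error of truncated Chebyshev expansions of |x|.
f = np.abs
coeffs = chebyshev.chebinterpolate(f, 600)       # high-degree Chebyshev expansion
x = np.linspace(-1.0, 1.0, 10001)

for n in (4, 16, 64, 256):
    pn = chebyshev.chebval(x, coeffs[:n + 1])    # truncation ~ Chebyshev partial sum
    print(n, np.max(np.abs(pn - f(x))))          # decays roughly like O(log(n)/n), cf. (b)
```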



6.5 Exercises
Exercise 6.82. Prove the following results.
(a) Show that for a set of n + 1 pairwise distinct interpolation points

a ≤ x 0 < . . . < xn ≤ b

where n ∈ N, the corresponding interpolation operator In : C [a, b] −→ Pn


is not necessarily monotone. To this end, construct one counterexample
for the case n = 2 with three interpolation points a = x0 < x1 < x2 = b.
(b) Develop for the case n = 1 a necessary and sufficient condition for two
interpolation points
a ≤ x0 < x1 ≤ b
under which the interpolation operator I1 is monotone.

Exercise 6.83. For n ∈ N0 , consider the Bernstein polynomials


 
    \beta_k^{(n)}(x) = \binom{n}{k} x^k (1-x)^{n-k} \in \mathcal{P}_n \qquad \text{for } x \in [0,1] \text{ and } 0 \le k \le n.

(a) Show that the Bernstein polynomials β_k^{(n)} are non-negative on the unit
interval [0, 1], where they are a partition of unity (cf. Remark 6.7 (b),(c)).
(b) Determine the zeros (including their multiplicities) and the maximum of
the Bernstein polynomial β_k^{(n)} on [0, 1], for 0 ≤ k ≤ n, n ∈ N_0.
(c) Prove the recursion formula

    \beta_k^{(n)}(x) = x\, \beta_{k-1}^{(n-1)}(x) + (1-x)\, \beta_k^{(n-1)}(x) \qquad \text{for } x \in [0,1],

for n ∈ N and k = 0, . . . , n, with initial and boundary values

    \beta_0^{(0)} \equiv 1, \qquad \beta_{-1}^{(n-1)} \equiv 0, \qquad \beta_n^{(n-1)} \equiv 0.

(d) Show that the Bernstein polynomials β_0^{(n)}, . . . , β_n^{(n)} of degree n ∈ N_0
form a basis for the polynomial space Pn (cf. Remark 6.7 (a)).

Exercise 6.84. Consider the Bernstein operator Bn : C [0, 1] −→ Pn ,



    (B_n f)(x) = \sum_{j=0}^{n} f(j/n)\, \beta_j^{(n)}(x) \qquad \text{for } f \in C[0,1] \text{ and } n \in \mathbb{N}_0,

with the Bernstein polynomials β_j^{(n)}(x) = \binom{n}{j} x^j (1−x)^{n−j}, for 0 ≤ j ≤ n.
Show that, for any f ∈ C^1[0, 1], the sequence ((B_n f)')_{n∈N_0} of derivatives
of B_n f converges uniformly on [0, 1] to f', i.e.,

    \lim_{n\to\infty} \|(B_n f)' - f'\|_\infty = 0.

Exercise 6.85. Prove the following results.
(a) For a compact interval [a, b] ⊂ R, let f ∈ C[a, b]. Show that f vanishes identically on [a, b], if and only if all moments of f on [a, b] vanish, i.e., if and only if

    m_n = ∫_a^b x^n f(x) dx = 0   for all n ∈ N0.

(b) Suppose f ∈ C2π. Show that f vanishes identically on R, if and only if all Fourier coefficients of f vanish, i.e., if and only if

    c_j = 1/(2π) ∫_0^{2π} f(x) e^{−ijx} dx = 0   for all j ∈ Z.
Exercise 6.86. Prove the following generalization of the Korovkin theorem.
Let Ω ⊂ Rd be a compact domain, where d ∈ N. Moreover, suppose for
s1 , . . . , sm ∈ C (Ω) that there are functions a1 , . . . , am ∈ C (Ω) satisfying
    p_t(x) = Σ_{j=1}^{m} a_j(t) s_j(x) ≥ 0   for all t, x ∈ Ω,

where p_t(x) = 0, if and only if t = x. Then, for any sequence (Ln)n∈N of linear positive operators Ln : C(Ω) −→ C(Ω) satisfying

    lim_{n→∞} ‖Ln s_j − s_j‖∞ = 0   for all 1 ≤ j ≤ m

we have the convergence

    lim_{n→∞} ‖Ln s − s‖∞ = 0   for all s ∈ C(Ω).
Conclude from this the statement of the Korovkin theorem, Theorem 6.11.

Exercise 6.87. Consider for n ∈ N0 the operator Πn∗ : C2π −→ Tn, which assigns f ∈ C2π to its (strongly unique) best approximation Πn∗ f from Tn with respect to ‖ · ‖∞, so that

    η∞(f, Tn) = inf_{T ∈ Tn} ‖T − f‖∞ = ‖Πn∗ f − f‖∞.

Investigate Πn∗ for the following (possible) properties.


(a) projection property;
(b) surjectivity;
(c) linearity;
(d) continuity;
(e) boundedness.
Exercise 6.88. Let (un )n∈Z be a system of elements in a Hilbert space H.


Prove the equivalence of the following two statements.
(a) The system (un )n∈Z is a Riesz basis of H.
(b) There is a linear, continuous and invertible operator T : H −→ H and a
complete orthonormal system (en )n∈Z in H satisfying

T e n = un for all n ∈ Z.

Exercise 6.89. Prove the completeness of the orthonormal system

    {e^{ij·} | j ∈ Z} ⊂ C^C_{2π}

in C^C_{2π} with respect to the Euclidean norm ‖ · ‖_C.
Hint: Corollary 6.41 and Remark 6.42.
Exercise 6.90. Consider the linear space C^C_{2L} of complex-valued 2L-periodic continuous functions, equipped with the inner product

    (f, g) = 1/(2L) ∫_0^{2L} f(x) · g(x) dx   for f, g ∈ C^C_{2L}.

(a) Determine a complete orthonormal system (ej)j∈Z in C^C_{2L}.
(b) Develop the Fourier coefficients cj = (f, ej) of f ∈ C^C_{2L}.
(c) Formulate the Parseval identity in C^C_{2L} with respect to (ej)j∈Z.
Exercise 6.91. Let cj (f ) be the complex Fourier coefficients of f ∈ C2π .
Show that the estimate
    |cj(f)| ≤ C (1 + |j|)^{−(k+1+ε)}   for all j ∈ Z,

for some C > 0 and ε > 0, implies f ∈ C^k_{2π} (cf. Remark 6.45).
Hint: Analyze the (uniform) convergence of the Fourier partial sums

    (Fn f)(x) = Σ_{j=−n}^{n} cj(f) e^{ijx}

and their derivatives.


Exercise 6.92. Show for f ∈ C¹_{2π} the identity

    Fn f′ = (Fn f)′   for all n ∈ N

for the Fourier partial sums Fn f′ of the derivative f′ ∈ C2π.


Exercise 6.93. Prove Faber’s theorem, Theorem 6.57: For any sequence
(In )n∈N0 of interpolation operators In : C [a, b] −→ Pn , there is a conti-
nuous function f ∈ C [a, b], for which the corresponding sequence (In f )n∈N0
of interpolation polynomials In f ∈ Pn does not converge uniformly to f .
Exercise 6.94. Let [a, b] ⊂ R be a compact interval. For the numerical integration of

    I_a^b(f) = ∫_a^b f(x) dx   for f ∈ C[a, b]

we apply the Newton-Cotes quadrature. For n ∈ N, the n-th Newton-Cotes^14 quadrature formula is defined as

    Qn(f) = (b − a) Σ_{j=0}^{n} αj,n f(xj,n)

at equidistant knots

    xj,n = a + j (b − a)/n   for j = 0, . . . , n

and weights

    αj,n = 1/(b − a) ∫_a^b Lj,n(x) dx   for j = 0, . . . , n,

where {L0,n, . . . , Ln,n} ⊂ Pn are the Lagrange basis functions for the knot set Xn = {x0,n, . . . , xn,n} (cf. the discussion on Lagrange bases in Section 2.3).
Show that there is a continuous function f ∈ C[a, b], for which the sequence of Newton-Cotes approximations ((Qn f))n∈N diverges.
Hint: Apply the Kuzmin^15 theorem, according to which the sum of the weights' moduli |αj,n| diverges, i.e.,

    Σ_{j=0}^{n} |αj,n| −→ ∞   for n → ∞.
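The growth of Σ_j |αj,n| can be observed directly. The following Python sketch (a minimal illustration, not part of the exercise; it works on [0, 1], and by affine invariance the weights do not depend on the interval) determines the weights from exactness on the monomials 1, x, . . . , x^n, which is equivalent to integrating the Lagrange basis functions, and prints the sum of the weights' moduli for increasing n.

```python
from fractions import Fraction

def newton_cotes_weights(n):
    """Closed Newton-Cotes weights alpha_{j,n} on [0, 1], in exact rational arithmetic.

    The weights are determined by exactness for the monomials 1, x, ..., x^n:
    sum_j alpha_{j,n} * x_j^k = int_0^1 x^k dx = 1/(k+1), with x_j = j/n.
    """
    nodes = [Fraction(j, n) for j in range(n + 1)]
    # augmented Vandermonde system A * alpha = b
    A = [[x ** k for x in nodes] + [Fraction(1, k + 1)] for k in range(n + 1)]
    # Gaussian elimination with partial pivoting (exact, no rounding errors)
    for col in range(n + 1):
        piv = max(range(col, n + 1), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n + 1):
            factor = A[r][col] / A[col][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    alpha = [Fraction(0)] * (n + 1)
    for r in range(n, -1, -1):
        s = sum(A[r][c] * alpha[c] for c in range(r + 1, n + 1))
        alpha[r] = (A[r][n + 1] - s) / A[r][r]
    return alpha

if __name__ == "__main__":
    for n in (2, 4, 8, 12, 16, 20):
        alpha = newton_cotes_weights(n)
        print(f"n = {n:2d}   sum_j |alpha_j,n| = {float(sum(abs(a) for a in alpha)):.4f}")
```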

Exercise 6.95. Show that the estimate of Jackson 1, Theorem 6.59,
    η∞(f, Tn) ≤ π/(2(n + 1)) · ‖f′‖∞   for f ∈ C¹_{2π},                    (6.67)

is sharp, i.e., there is a function f ∈ C¹_{2π} \ Tn for which equality holds in (6.67).
Conclude from this that the estimate of Jackson 2, Theorem 6.66,

    η∞(f, Tn) ≤ π·L/(2(n + 1))   for f Lipschitz continuous with constant L > 0

is also sharp.
14
Roger Cotes (1682-1716), English mathematician
15
Rodion Ossijewitsch Kuzmin (1891-1949), Russian mathematician
Exercise 6.96. Prove the theorem of de La Vallée Poussin^16: Let f ∈ C2π and Tn ∈ Tn. If there exist 2n + 2 pairwise distinct points

    0 ≤ x0 < . . . < x2n+1 < 2π,

such that Tn − f has alternating signs on xk, k = 0, . . . , 2n + 1, then we have

    η∞(f, Tn) ≥ min_{0≤k≤2n+1} |(Tn − f)(xk)|.

Exercise 6.97. The estimate of Jackson 3, Theorem 6.70, is not sharp. Show
that the estimate
 
    η∞(f, Tn) ≤ ω(f, π/(n + 1))   for f ∈ C2π
is sharp (under the assumptions and with the notations in Theorem 6.70).
Hint: Apply the theorem of de La Vallée Poussin from Exercise 6.96.

Exercise 6.98. Verify the following properties of the modulus of continuity

    ω(f, δ) = sup_{x, x+σ ∈ R, |σ| ≤ δ} |f(x + σ) − f(x)|

of f : R −→ R on R with respect to δ > 0 (cf. Definition 6.68).


(a) ω(f, (n + θ)δ) ≤ nω(f, δ) + ω(f, θδ) for all θ ∈ [0, 1) and n ∈ N.
(b) ω(f, δ) ≤ nω(f, δ/n) for all n ∈ N.

Exercise 6.99. Prove part (c) of the Dini-Lipschitz theorem, Theorem 6.81,
in two steps as follows. First show that, for any f ∈ C 1 [−1, 1], the sequence
(Πn f )n∈N0 of Chebyshev partial sums
    Πn f = Σ_{j=0}^{n} (f, Tj)_w / ‖Tj‖_w² · Tj   where Tj = cos(j arccos(·)) ∈ Pj

converges uniformly on [−1, 1] to f, i.e.,

    lim_{n→∞} ‖Πn f − f‖∞ = 0.

From this conclude, for f ∈ C^k[−1, 1], k ≥ 1, the convergence behaviour

    ‖Πn f − f‖∞ = o(n^{1−k}) for n → ∞.

16
Charles-Jean de La Vallée Poussin (1866-1962), Belgian mathematician
7 Basic Concepts of Signal Approximation

In this chapter, we study basic concepts of mathematical signal analysis. To this end, we first introduce the continuous Fourier transform F,

    (Ff)(ω) = ∫_R f(x) · e^{−ixω} dx   for f ∈ L1(R),                        (7.1)

as a linear integral transform on the Banach space L1(R) of absolutely Lebesgue-integrable functions. We motivate the transfer from Fourier series of periodic functions f ∈ C^C_{2π} to Fourier transforms of non-periodic functions f ∈ L1(R). In particular, we provide a heuristic account of the Fourier transform Ff in (7.1), where we start from Fourier partial sums Fn f, for f ∈ C2π.
(1) Is the Fourier transform F invertible?
(2) Can F be transferred to the Hilbert space L2 (R)?
(3) Can F be applied to multivariate functions f ∈ Lp (Rd ), for p = 1, 2?
We give positive answers to all questions (1)-(3). The answer to (1) leads
us, for f ∈ L1 (R), with Ff ∈ L1 (R), to the Fourier inversion formula

    f(x) = 1/(2π) ∫_R (Ff)(ω) e^{ixω} dω   for almost every x ∈ R.
To analyze (2), we study the spectral properties of F, where we identify the
Hermite functions hn in (4.55) as eigenfunctions of F. As we show, the Her-
mite functions (hn )n∈N0 form a complete orthogonal system in the Hilbert
space L2 (R). This result leads us to the Plancherel theorem, Theorem 7.30,
providing the continuous extension of F to an automorphism on L2 (R). The
basic properties of the Fourier operator F can be generalized from the uni-
variate case to the multivariate case, and this gives an answer to (3).
Finally, we formulate and prove the celebrated Shannon sampling theorem,
Theorem 7.34 (in Section 7.3), giving a fundamental result of mathematical
signal processing. According to the Shannon sampling theorem, a signal, i.e.,
a function f ∈ L2 (R), with bounded frequency density can be reconstructed
exactly from its samples (i.e., function values) on an infinite uniform grid at a
sufficiently small sampling rate. Our proof of the Shannon sampling theorem
serves to demonstrate the relevance and the significance of the introduced
Fourier methods.

The second half of this chapter is devoted to wavelets. Wavelets are popu-
lar and powerful tools of modern mathematical signal processing, in particular
for the approximation of functions f ∈ L2 (R). A wavelet approximation to f
is essentially based on a multiresolution of L2 (R), i.e., on a nested sequence
· · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ · · · ⊂ Vj−1 ⊂ Vj ⊂ · · · ⊂ L2 (R) (7.2)
of closed scale spaces Vj ⊂ L2 (R). The nested sequence in (7.2) leads us to
stable approximation methods, where f is represented on different frequency
bands by orthogonal projectors Πj : L2 (R) −→ Vj . More precisely, for a fixed
scaling function ϕ ∈ L2 (R), the scale spaces Vj ⊂ L2 (R) in (7.2) are generated
by dilations and translations of basis functions ϕjk (x) := 2j/2 ϕ(2j x − k), for
j, k ∈ Z, so that

Vj = span{ϕjk : k ∈ Z} ⊂ L2 (R) for j ∈ Z.


Likewise, for a corresponding wavelet function ψ ∈ L2 (R), the orthogonal
complement Wj ⊂ Vj+1 of Vj in Vj+1 ,
Vj+1 = Wj ⊕ Vj for j ∈ Z,
is generated by basis functions ψkj (x) := 2j/2 ψ(2j x − k), for j, k ∈ Z, so that

Wj = span{ψkj | k ∈ Z} for j ∈ Z.
The basic construction of wavelet approximations to f ∈ L2 (R) is based
on refinement equations of the form
    ϕ(x) = Σ_{k∈Z} h_k ϕ(2x − k)   and   ψ(x) = Σ_{k∈Z} g_k ϕ(2x − k),

for specific coefficient masks (h_k)k∈Z, (g_k)k∈Z ⊂ ℓ².
The development of wavelet methods, dating back to the early 1980s, has
since then gained enormous popularity in applications of information techno-
logy, especially in image and signal processing. Inspired by a wide range of
applications in science and engineering, this has led to rapid progress concer-
ning both computational methods and the mathematical theory of wavelets.
Therefore, it is by no means possible for us to give a complete overview
over the multiple facets of wavelet methods. Instead, we have decided to
present selected basic principles of wavelet approximation. To this end, we
restrict the discussion of this chapter to the rather simple Haar wavelet
    ψ(x) = χ[0,1/2)(x) − χ[1/2,1)(x) = { 1 for x ∈ [0, 1/2);  −1 for x ∈ [1/2, 1);  0 otherwise },
and its corresponding scaling function ϕ = χ[0,1) . For a more comprehensive
account to the mathematical theory of wavelets, we recommend the classical
textbooks [14, 18, 49] and, moreover, the more recent textbooks [29, 31, 69]
for a more pronounced connection to Fourier analysis.
7.1 The Continuous Fourier Transform


In Section 4.3, we introduced Fourier partial sums Fn f to approximate con-
C
tinuous periodic functions f ∈ C2π , or, f ∈ C2π . Recall that we restricted
ourselves (without loss of generality) to continuous functions f with period
T = 2π (according to Definition 2.32).
In the following discussion, we assume f ≡ fT : R −→ C to be a time-
continuous signal (i.e., a function) with period T , for some T > 0. In this
case, by following along the lines of our derivations in Section 4.3, we obtain
for the complex n-th Fourier partial sum Fn fT of fT the representation
    (Fn fT)(x) = Σ_{j=−n}^{n} cj e^{ijωx}                                      (7.3)

with the frequency ω = 2π/T and with the complex Fourier coefficients

    cj = 1/T ∫_0^T fT(ξ) e^{−ijωξ} dξ = 1/T ∫_{−T/2}^{T/2} fT(ξ) e^{−ijωξ} dξ   (7.4)

for j = −n, . . . , n, as in (4.23) and in (4.24). Technically speaking, the Fourier


coefficient cj gives the amplification factor for the Fourier mode e−ijω· of the
frequency ωj = j · ω = j · 2π/T , for j = −n, . . . , n, i.e., the Fourier coefficients
cj yield the amplitudes of the present fundamental Fourier modes e−ijω· .
Then, in Section 6.3 we analyzed the convergence of Fourier partial sums.
According to Theorem 6.48, the Fourier series representation
    fT(x) = Σ_{j=−∞}^{∞} cj e^{ijωx}                                          (7.5)

holds pointwise at all points x ∈ R, where fT is differentiable.


The bi-infinite sequence (cj )j∈Z of complex Fourier coefficients in (7.5) is
called the discrete Fourier spectrum of the signal fT . Therefore, the discrete
Fourier spectrum of fT is a sequence (cj )j∈Z of Fourier coefficients from
which we can reconstruct fT exactly via (7.5). If only finitely many Fourier
coefficients cj do not vanish, then only the frequencies of the (finitely many)
corresponding Fourier modes e−ijω· will appear in the representation (7.5).
In this case, the frequency spectrum of fT is bounded.
Now we derive an alternative representation for the Fourier series in (7.5).
To this end, we first introduce the mesh width
    Δω := ωj+1 − ωj ≡ 2π/T   for j ∈ Z
as the difference between two consecutive frequencies of the Fourier modes
e−ijω· . Then, the Fourier series in (7.5) can be written as
    fT(x) = Σ_{j=−∞}^{∞} ( 1/(2π) ∫_{−T/2}^{T/2} fT(ξ) e^{iωj(x−ξ)} dξ ) · Δω.   (7.6)

The Fourier series representation (7.6) of the T -periodic signal fT leads


us to the following questions.
• Is there a representation as in (7.6) for non-periodic signals f : R −→ C?
• If so, how would we represent the Fourier spectrum of f ?
In the following analysis on these questions, we consider a non-periodic
signal f : R −→ C as a signal with infinite period, i.e., we consider the limit

    f(x) = lim_{T→∞} fT(x)   for x ∈ R,                                       (7.7)
where the T -periodic signal fT is assumed to coincide on (−T /2, T /2) with f .
Moreover, we regard the function
    gT(ω) := ∫_{−T/2}^{T/2} fT(ξ) e^{−iωξ} dξ

of the frequency variable ω, whereby we obtain for fT the representation
    fT(x) = 1/(2π) Σ_{j=−∞}^{∞} gT(ωj) e^{iωj x} · Δω                          (7.8)

from (7.6). We remark that the infinite series in (7.8) is a Riemann sum on the knot sequence {ωj}j∈Z. Note that the mesh width Δω of the sequence {ωj}j∈Z is, for large enough T > 0, arbitrarily small. This observation leads us, via the above-mentioned limit in (7.7), to the function

    g(ω) := lim_{T→∞} gT(ω) = ∫_{−∞}^{∞} f(ξ) e^{−iωξ} dξ   for ω ∈ R.        (7.9)

To guarantee the well-definedness of the function g in (7.9), we mainly


require the existence of the Fourier integral on the right hand side in (7.9)
for all frequencies ω. To this end, we assume f ∈ L1 (R), i.e., we assume the
function f to be absolutely integrable. In this case, the Fourier integral in (7.9)
is, due to |e−iω· | ≡ 1, finite, for all frequencies ω. Recall that we work here
and throughout this work with Lebesgue integration.

Definition 7.1. For f ∈ L1 (R), the function
    (Ff)(ω) = f̂(ω) := ∫_R f(x) e^{−ixω} dx   for ω ∈ R                        (7.10)

is called the Fourier transform of f. The Fourier operator, which assigns f ∈ L1(R) to its Fourier transform Ff = f̂, is denoted as F.
Note that the Fourier transform F is a linear integral transform which


maps a function f ≡ f (x) of the spatial variable x (or, a signal f of the
time variable) to a function Ff = fˆ ≡ fˆ(ω) of the frequency variable ω.
The application of the Fourier transform is (especially for signals) referred
to as time-frequency analysis. Moreover, the function Ff = fˆ is called the
continuous Fourier spectrum of f . If we regard the Fourier integral in (7.10)
as a parameter integral of the frequency variable ω, then we see that the Fourier transform f̂ : R −→ C of f ∈ L1(R) is a function that is uniformly continuous on R (see Exercise 7.56). In particular, we have f̂ ∈ C(R). Moreover, due to the estimate

    |f̂(ω)| = | ∫_R f(x) e^{−ixω} dx | ≤ ∫_R |f(x)| dx = ‖f‖_{L1(R)},          (7.11)

the function f̂ is uniformly bounded on R by the L1-norm ‖f‖_{L1(R)} of f.


We note the following fundamental properties of F (see Exercise 7.58).

Proposition 7.2. The Fourier transform F : L1 (R) −→ C (R) has the fol-
lowing properties, where we assume f ∈ L1 (R) for all statements (a)-(e).
(a) For fx0 := f (· − x0 ), where x0 ∈ R, we have

(Ffx0 )(ω) = e−iωx0 (Ff )(ω) for all ω ∈ R.

(b) For fα := f(α ·), where α ∈ R \ {0}, we have

    (Ffα)(ω) = 1/|α| · (Ff)(ω/α)   for all ω ∈ R.                             (7.12)

(c) For the conjugate complex f¯ ∈ L1 (R), where f¯(x) = f (x), we have

(F f¯)(ω) = (Ff )(−ω) for all ω ∈ R. (7.13)

(d) For the Fourier transform of the derivative f  of f , we have

(Ff  )(ω) = iω(Ff )(ω) for all ω ∈ R

under the assumption f ∈ C 1 (R) ∩ L1 (R) with f  ∈ L1 (R).


(e) For the derivative of the Fourier transform Ff of f , we have

    d/dω (Ff)(ω) = −i (F(xf))(ω)   for all ω ∈ R

under the assumption xf ∈ L1 (R). 

All properties in Proposition 7.2 can be shown by elementary calculations.


In the following discussion, we work with functions of compact support.

Definition 7.3. For a continuous function f : R −→ C, we call the point set

    supp(f) := {x ∈ R | f(x) ≠ 0} ⊂ R

the support of f. Accordingly, f has compact support if supp(f) is compact.

We denote by Cc (R) the linear space of all continuous functions with


compact support. Recall that Cc (R) is dense in L1 (R), i.e., for any f ∈ L1 (R)
and ε > 0 there is a function g ∈ Cc (R) satisfying f − gL1 (R) < ε.
According to the Riemann1 -Lebesgue lemma, the Fourier transform fˆ of
f ∈ L1 (R) vanishes at infinity.

Lemma 7.4. (Riemann-Lebesgue).


The Fourier transform fˆ of f ∈ L1 (R) vanishes at infinity, i.e.,

fˆ(ω) −→ 0 for |ω| → ∞.

Proof. Let g be a continuous function with compact support, i.e., g ∈ Cc (R).


Due to statement (a) in Proposition 7.2, the function

g−π/ω = g(· + π/ω) ∈ Cc (R) ⊂ L1 (R) for ω = 0

has the Fourier transform

(Fg−π/ω )(ω) = eiπ (Fg)(ω) = −(Fg)(ω) for ω = 0.

This implies the representation
    2(Fg)(ω) = (Fg)(ω) − (Fg_{−π/ω})(ω) = ∫_R (g(x) − g(x + π/ω)) e^{−ixω} dx,

whereby, in combination with the dominated convergence theorem, we get
    |ĝ(ω)| = |(Fg)(ω)| ≤ 1/2 ∫_R |g(x) − g(x + π/ω)| dx −→ 0   for |ω| → ∞.   (7.14)

Now Cc (R) is dense in L1 (R), so that for any f ∈ L1 (R) and ε > 0 there is
one g ∈ Cc (R) satisfying f − gL1 (R) < ε. From this, the statement follows
from the estimate (7.11), whereby

|fˆ(ω) − ĝ(ω)| ≤ f − gL1 (R) < ε for all ω ∈ R,

in combination with the property (7.14). 
^1 Bernhard Riemann (1826-1866), German mathematician
Remark 7.5. By the Riemann-Lebesgue lemma, the Fourier transform F is


a linear mapping between the Banach space (L1 (R), ·L1 (R) ) of all absolutely
integrable functions and the Banach space (C0 (R),  · ∞ ) of all continuous
functions that are vanishing at infinity, i.e.,

F : L1 (R) −→ C0 (R).

In our following discussion, two questions are of fundamental importance:


• Is the Fourier transform F invertible?
• Can the Fourier transform F be transferred to the Hilbert space L2 (R)?
To give an answer to these questions, we require only a few preparations.
First, we prove the following result.

Proposition 7.6. For f, g ∈ L1 (R) both functions fˆg and f ĝ are integrable.
Moreover, we have
    ∫_R f̂(x) g(x) dx = ∫_R f(ω) ĝ(ω) dω.                                      (7.15)

Proof. Since the functions f̂ and ĝ are continuous and bounded, respectively, both functions f̂g and f ĝ are integrable. By using the Fubini^2 theorem, we can conclude

    ∫_R f(ω) ĝ(ω) dω = ∫_R f(ω) ( ∫_R g(x) e^{−ixω} dx ) dω
                     = ∫_R ( ∫_R f(ω) e^{−ixω} dω ) g(x) dx = ∫_R f̂(x) g(x) dx.  □
Now let us discuss two important examples for Fourier transforms.

Example 7.7. For α > 0, let 𝟙_α = χ[−α,α] be the indicator function of the compact interval [−α, α] ⊂ R. Then,

    (F𝟙₁)(ω) = ∫_{−1}^{1} e^{−ixω} dx = 2 · sinc(ω)   for ω ∈ R

is the Fourier transform of 𝟙₁, where the (continuous) function

    sinc(ω) := sin(ω)/ω for ω ≠ 0,   sinc(0) := 1

is called sinus cardinalis (or, sinc function) (see Figure 7.1).

^2 Guido Fubini (1879-1943), Italian mathematician
[Figure 7.1 omitted: (a) the indicator function 𝟙₁; (b) its Fourier transform F𝟙₁ = 2·sinc.]
Fig. 7.1. The sinc function yields the Fourier transform of the function 𝟙₁.
By the scaling property in (7.12), we find that (F𝟙_α)(ω) = 2α · sinc(αω), for ω ∈ R, is the Fourier transform of 𝟙_α. Note that the Fourier transform F𝟙_α of 𝟙_α ∈ L1(R) is not contained in L1(R), since the sinc function is not absolutely integrable. ♦
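As a quick numerical cross-check of Example 7.7 (a minimal sketch; the quadrature resolution and the test frequencies are arbitrary choices), the Fourier integral of 𝟙₁ can be approximated by a midpoint rule and compared with 2 · sinc(ω):

```python
import numpy as np

def fourier_transform_indicator(omega, n=200000):
    """Approximate (F 1_1)(omega) = int_{-1}^{1} exp(-i x omega) dx by a midpoint rule."""
    x = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)
    return np.sum(np.exp(-1j * x * omega)) * (2.0 / n)

def sinc(omega):
    """sin(omega)/omega with the continuous value 1 at omega = 0."""
    return np.sinc(omega / np.pi)   # numpy's sinc is sin(pi x)/(pi x)

if __name__ == "__main__":
    for omega in (0.0, 0.5, 1.0, np.pi, 10.0):
        approx = fourier_transform_indicator(omega)
        exact = 2.0 * sinc(omega)
        print(f"omega = {omega:6.3f}   quadrature = {approx.real:+.6f}   2*sinc = {exact:+.6f}")
```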

Example 7.8. We compute the Fourier transform of the Gauss function

    gα(x) = e^{−αx²}   for x ∈ R

for α > 0 by

    ĝα(ω) = ∫_R e^{−αx²} e^{−ixω} dx = ∫_R e^{−α(x² + ixω/α)} dx
          = ∫_R e^{−α( x² + ixω/α + (iω/(2α))² )} e^{α(iω/(2α))²} dx
          = e^{−ω²/(4α)} ∫_R e^{−α(x + iω/(2α))²} dx
          = √(π/α) · e^{−ω²/(4α)},

where in the last line we used the well-known identity

    ∫_R e^{−α(x+iy)²} dx = ∫_R e^{−αx²} dx = √(π/α)   for α > 0.

In conclusion, we note the following observation:

    The Fourier transform of a Gauss function is a Gauss function.

In particular, for α = 1/2, we have

    ĝ_{1/2} = √(2π) · g_{1/2},                                                 (7.16)

i.e., the Gauss function g_{1/2} is an eigenfunction of F for the eigenvalue √(2π).
The identity in (7.16) immediately implies the representation

    e^{−x²/2} = 1/√(2π) ∫_R e^{−y²/2} · e^{ixy} dy   for all x ∈ R,            (7.17)

where we used the symmetry g_{1/2}(x) = g_{1/2}(−x), for x ∈ R. ♦

Now we can determine the operator norm of F.

Proposition 7.9. The Fourier transform F : L1(R) −→ C0(R) has operator norm one, i.e.,

    ‖F‖_{L1(R)→C0(R)} = 1.
Proof. For f ∈ L1(R), the Fourier transform Ff = f̂ is bounded, due to (7.11), where ‖Ff‖∞ ≤ ‖f‖_{L1(R)}. This implies ‖F‖_{L1(R)→C0(R)} ≤ 1.
For the Gauss function g_{1/2}(x) = exp(−x²/2) from Example 7.8, we obtain ‖g_{1/2}‖_{L1(R)} = √(2π), on the one hand, whereas, on the other hand, we have ‖Fg_{1/2}‖∞ = √(2π), due to (7.16). This implies

    ‖F‖_{L1(R)→C0(R)} = sup_{f ∈ L1(R)\{0}} ‖Ff‖∞ / ‖f‖_{L1(R)} = 1.  □
From the result of Proposition 7.9, we can draw the following conclusion.
Corollary 7.10. Let (fn )n∈N be a convergent sequence in L1 (R) with limit
f ∈ L1 (R). Then, the corresponding sequence (fˆn )n∈N of Fourier transforms
Ffn = fˆn ∈ C0 (R) converges uniformly on R to fˆ.
Proof. The statement follows immediately from the estimate

    ‖f̂n − f̂‖∞ = ‖F(fn − f)‖∞ ≤ ‖F‖ · ‖fn − f‖_{L1(R)} = ‖fn − f‖_{L1(R)},

where ‖F‖ = ‖F‖_{L1(R)→C0(R)} = 1, due to Proposition 7.9.  □


In the following discussion, the convolution in L1 is of primary importance.
Definition 7.11. For f, g ∈ L1 (R) the function
    (f ∗ g)(x) := ∫_R f(x − y) g(y) dy   for x ∈ R                            (7.18)

is called convolution product, in short, convolution, between f and g.


We note that the convolution between L1 -functions is well-defined, i.e.,
for f, g ∈ L1 (R), the integral in (7.18) is finite, for all x ∈ R. Moreover, the
convolution f ∗ g is in L1 (R). We take note of this important result as follows.
Proposition 7.12. For f, g ∈ L1 (R), the convolution f ∗g is absolutely inte-
grable, and we have the estimate

f ∗ gL1 (R) ≤ f L1 (R) · gL1 (R) . (7.19)

Proof. For f, g ∈ L1 (R), we have the representation
    ∫_R (f ∗ g)(x) dx = ∫_R ∫_R f(x − y) g(y) dy dx
                      = ∫_R g(y) ( ∫_R f(x − y) dx ) dy = ∫_R f(x) dx · ∫_R g(y) dy,
by using the Fubini theorem. Therefore, f ∗ g is integrable. From a similar


representation for |f ∗ g|, we get the estimate in (7.19). 
Remark 7.13. Due to Proposition 7.12, the Banach space L1 (R) is closed
under the convolution product ∗, i.e., we have f ∗ g ∈ L1 (R) for f, g ∈ L1 (R).
Moreover, for f, g ∈ L1(R), we have the identity

    (f ∗ g)(x) = ∫_R f(x − y) g(y) dy = ∫_R f(y) g(x − y) dy = (g ∗ f)(x)

for all x ∈ R, i.e., the convolution ∗ is commutative on L1 (R).


Therefore, L1 (R) is a commutative Banach algebra. 

Due to Proposition 7.12 and Remark 7.13, we can apply the Fourier trans-
form F to the convolution of two L1 -functions. As we show now, the Fourier
transform F(f ∗ g) of the convolution f ∗ g, for f, g ∈ L1 (R), coincides with
the algebraic product of their Fourier transforms Ff and Fg.

Theorem 7.14. (Fourier convolution theorem).


For f, g ∈ L1 (R) we have

F(f ∗ g) = (Ff ) · (Fg).

Proof. With an application of the Fubini theorem we immediately obtain

    (F(f ∗ g))(ω) = ∫_R (f ∗ g)(x) e^{−ixω} dx = ∫_R ( ∫_R f(x − y) g(y) dy ) e^{−ixω} dx
                  = ∫_R ( ∫_R f(x − y) e^{−i(x−y)ω} dx ) g(y) e^{−iyω} dy
                  = ∫_R (Ff)(ω) g(y) e^{−iyω} dy = (Ff)(ω) · (Fg)(ω),

the stated representation, for all ω ∈ R.  □
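The convolution theorem can be illustrated numerically for two Gauss functions from Example 7.8 (a minimal sketch; the grid and the truncation of the integrals to [−20, 20] are arbitrary choices): the Fourier transform of a discretely computed convolution is compared with the product of the individual Fourier transforms.

```python
import numpy as np

# grid for the spatial variable (wide enough that the Gaussians vanish at the ends)
x = np.linspace(-20.0, 20.0, 8001)
dx = x[1] - x[0]

alpha, beta = 0.5, 2.0
f = np.exp(-alpha * x**2)
g = np.exp(-beta * x**2)

# discrete approximation of the convolution (f * g)(x) on the same grid
conv = np.convolve(f, g, mode="same") * dx

def ft(values, omega):
    """Riemann-sum approximation of the Fourier integral at frequency omega."""
    return np.sum(values * np.exp(-1j * x * omega)) * dx

for omega in (0.0, 0.5, 1.0, 2.0):
    lhs = ft(conv, omega)                      # F(f * g)(omega)
    rhs = ft(f, omega) * ft(g, omega)          # (F f)(omega) * (F g)(omega)
    print(f"omega = {omega:4.1f}   F(f*g) = {lhs.real:10.6f}   Ff*Fg = {rhs.real:10.6f}")
```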

Next, we specialize the Fourier convolution theorem to autocorrelations.

Definition 7.15. For f ∈ L1 (R) the convolution product
    (f ∗ f∗)(x) = ∫_R f(x − y) f∗(y) dy = ∫_R f(x + y) f(y) dy   for x ∈ R

is called autocorrelation of f , where the function f ∗ ∈ L1 (R) is defined as


f ∗ (x) := f (−x) for all x ∈ R.

From the Fourier convolution theorem, Theorem 7.14, in combination with


statement (c) in Proposition 7.2, we immediately get the following result.

Corollary 7.16. For real-valued f ∈ L1 (R), we have the representation

F(f ∗ f ∗ )(ω) = |(Ff )(ω)|2 for all ω ∈ R

for the Fourier transform of the autocorrelation of f . 


7.1.1 The Fourier Inversion Theorem

In this section, we prove the Fourier inversion formula on L1 (R), as we already


motivated in the previous section. To this end, we derive a continuous version
of the Fourier series representation in (7.5) by using the continuous Fourier
spectrum fˆ = Ff , for f ∈ L1 (R).
In order to do so, we need only a few preparations.

Definition 7.17. A sequence of functions (δk )k∈N is called a Dirac3 sequence


in L1 (R), if all of the following conditions are satisfied.
(a) For all k ∈ N, we have the positivity

δk (x) ≥ 0 for almost every x ∈ R.

(b) For all k ∈ N, we have the normalization
    ∫_R δk(x) dx = 1.

(c) For all r > 0, we have

    lim_{k→∞} ∫_{R\[−r,r]} δk(x) dx = 0.

If we interpret the functions δk ∈ L1 (R) of a Dirac sequence as (non-


negative) mass densities, then the total mass will be normalized to unity,
due to property (b). Moreover, the total mass will at increasing k ∈ N be
concentrated around zero. This observation motivates the following example.

Example 7.18. For the Gauss function g_{1/2}(x) = e^{−x²/2} in Example 7.8, we have

    ∫_R g_{1/2}(x) dx = ∫_R e^{−x²/2} dx = √(2π).

Now we let δ1(x) := 1/√(2π) · g_{1/2}(x) and, moreover, δk(x) := k δ1(kx), for k > 1, so that

    δk(x) = k/√(2π) · e^{−k²x²/2}   for k ∈ N.                                 (7.20)

By elementary calculations, we see that the Gauss sequence (δk)k∈N satisfies all conditions (a)-(c) in Definition 7.17, i.e., (δk)k∈N is a Dirac sequence. ♦
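The smoothing convolutions f ∗ δk with the Gauss kernels in (7.20) approximate f, as the Dirac approximation theorem below makes precise. A minimal numerical sketch (the grid, the truncation to [−5, 5] and the step-like test signal are arbitrary choices):

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 4001)
dx = x[1] - x[0]

# a discontinuous test signal f in L1(R)
f = np.where(np.abs(x) <= 1.0, 1.0, 0.0)

def dirac_gauss(k):
    """The Gauss kernel delta_k(x) = k / sqrt(2 pi) * exp(-k^2 x^2 / 2) from (7.20)."""
    return k / np.sqrt(2.0 * np.pi) * np.exp(-(k * x) ** 2 / 2.0)

for k in (1, 2, 4, 8, 16, 32):
    conv = np.convolve(f, dirac_gauss(k), mode="same") * dx   # (f * delta_k)(x) on the grid
    l1_error = np.sum(np.abs(f - conv)) * dx                  # approximates ||f - f*delta_k||_L1
    print(f"k = {k:2d}   ||f - f*delta_k||_1 = {l1_error:.4f}")
```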

Next, we prove an important approximation theorem for L1 -functions,


according to which any f ∈ L1 (R) can be approximated arbitrarily well by
its convolutions f ∗ δk with elements of a Dirac sequence (δk )k∈N in L1 (R).
^3 Paul Adrien Maurice Dirac (1902-1984), English physicist

Theorem 7.19. (Dirac approximation theorem).


Let f ∈ L1 (R) and (δk )k∈N be a Dirac sequence in L1 (R). Then, we have

f − f ∗ δk L1 (R) −→ 0 for k → ∞, (7.21)

i.e., the sequence (f ∗ δk )k∈N converges in L1 (R) to f .

Proof. Let g be a continuously differentiable function with compact support,


i.e., g ∈ Cc1 (R). Then, the functions g and g  are bounded on R, i.e., there is
a constant M > 0 with max(g∞ , g  ∞ ) ≤ M . We let K := |supp(g)| < ∞
for the (finite) length |supp(g)| of the support interval supp(g) ⊂ R of g.
We estimate the L1-error ‖g − g ∗ δk‖_{L1(R)} from above by

    ‖g − g ∗ δk‖_{L1(R)} = ∫_R | ∫_R δk(y) (g(x) − g(x − y)) dy | dx
                         ≤ ∫_R ∫_R δk(y) |g(x) − g(x − y)| dy dx
                         = ∫_R δk(y) ( ∫_R |g(x) − g(x − y)| dx ) dy,           (7.22)

where we used the properties (a) and (b) in Definition 7.17. Note that the
function hy := g − g(· − y) satisfies, for any y ∈ R, the estimate

    |hy(x)| ≤ ‖g′‖∞ · |y| ≤ M · |y|   for all x ∈ R.                           (7.23)

Now we split the outer integral in (7.22) into a sum of two terms, which
we estimate uniformly from above by (7.23), so that we have, for any ρ > 0,

    ‖g − g ∗ δk‖_{L1(R)}
        ≤ ∫_{−ρ}^{ρ} δk(y) ( ∫_R |hy(x)| dx ) dy + ∫_{R\(−ρ,ρ)} δk(y) ( ∫_R |hy(x)| dx ) dy
        ≤ 2 · |supp(g)| · ‖g′‖∞ · ρ + 2 · |supp(g)| · ‖g‖∞ · ∫_{R\(−ρ,ρ)} δk(y) dy
        ≤ 4 · K · M · ρ

for all k ≥ N ≡ N(ρ) ∈ N satisfying

    ∫_{R\(−ρ,ρ)} δk(y) dy ≤ ρ,

by using property (c) in Definition 7.17. For ε > 0 we have ‖g − g ∗ δk‖_{L1(R)} < ε, for all k ≥ N, by choosing ρ < ε/(4KM). Therefore, g ∈ Cc¹(R) can be approximated arbitrarily well in L1(R) by convolutions g ∗ δk. Finally, Cc¹(R) is dense in L1(R), which implies the stated L1-convergence in (7.21) for f ∈ L1(R).  □
Now we turn to the Fourier inversion formula. At the outset of Section 7.1,
we derived the representation (7.8) for periodic functions. We can transfer
the inversion formula (7.8) from the discrete case to the continuous case. This
motivates the following definition.

Definition 7.20. For g ∈ L1 (R) we call the function
    (F⁻¹g)(x) = ǧ(x) := 1/(2π) ∫_R g(ω) · e^{ixω} dω   for x ∈ R               (7.24)
inverse Fourier transform of g. The inverse Fourier operator, which
assigns g ∈ L1 (R) to its inverse Fourier transform ǧ, is denoted as F −1 .

Now we can prove the Fourier inversion formula

f = F −1 Ff

under suitable assumptions on f ∈ L1 (R).

Theorem 7.21. (Fourier inversion formula).


For f ∈ L1 (R) satisfying fˆ = Ff ∈ L1 (R) the Fourier inversion formula
    f(x) = 1/(2π) ∫_R f̂(ω) · e^{ixω} dω   for almost every x ∈ R              (7.25)
holds with equality at every point x ∈ R, where f is continuous.

Proof. In the following proof, we utilize the Dirac sequence (δk )k∈N of Gauss
functions from Example 7.18. For δk in (7.20), the identity (7.17) yields the
representation
    δk(x) = k/(2π) ∫_R e^{−y²/2} · e^{ikxy} dy = 1/(2π) ∫_R e^{−ω²/(2k²)} · e^{ixω} dω   (7.26)
for all k ∈ N. This in turn implies
    (f ∗ δk)(x) = ∫_R f(y) ( 1/(2π) ∫_R e^{−ω²/(2k²)} · e^{i(x−y)ω} dω ) dy
                = 1/(2π) ∫_R ( ∫_R f(y) · e^{−iyω} dy ) e^{−ω²/(2k²)} · e^{ixω} dω
                = 1/(2π) ∫_R f̂(ω) · e^{−ω²/(2k²)} · e^{ixω} dω,                 (7.27)
where, for changing the order of integration, we applied the dominated convergence theorem with the dominating function |f(y)| e^{−ω²}.

For k → ∞, the sequence of integrals in (7.27) converges to
    1/(2π) ∫_R f̂(ω) · e^{ixω} dω,
where we use the assumption fˆ ∈ L1 (R).


According to the Dirac approximation theorem, Theorem 7.19, the se-
quence of Dirac approximations f ∗ δk converges in L1 (R) to f , for k → ∞.
This already proves the stated Fourier inversion formula (7.25) in L1 (R).
Finally, the parameter integral in (7.25) is a continuous function at x.
Therefore, we have equality in (7.25), provided that f is continuous at x. 

Remark 7.22. According to Remark 7.5, the Fourier transform maps any
f ∈ L1 (R) to a continuous function fˆ ∈ C0 (R). Therefore, by the Fourier
inversion formula, Theorem 7.21, there exists for any f ∈ L1 (R) satisfying
fˆ ∈ L1 (R) a continuous representative f˜ ∈ L1 (R), which coincides with f
almost everywhere on R (i.e., f ≡ f˜ in the L1 -sense), and for which the
Fourier inversion formula holds on R. 

The Fourier inversion formula implies the injectivity of F on L1 (R).

Corollary 7.23. Suppose Ff = 0 for f ∈ L1 (R) . Then, f = 0 almost


everywhere, i.e., the Fourier transform F : L1 (R) −→ C0 (R) is injective. 

In the following discussion, we will often apply the Fourier inversion for-
mula to continuous functions f ∈ L1 (R) ∩ C (R). By the following result, we
can in this case drop the assumption fˆ ∈ L1 (R) (see Exercise 7.64).

Corollary 7.24. For f ∈ L1 (R) ∩ C (R) the Fourier inversion formula
    f(x) = lim_{ε↘0} 1/(2π) ∫_R f̂(ω) · e^{ixω} e^{−ε|ω|²} dω   for all x ∈ R   (7.28)

holds.  □

7.2 The Fourier Transform on L2 (R)


In this section, we transfer the Fourier transform F : L1 (R) −→ C0 (R) from
L1 (R) to L2 (R). We remark, however, that the Banach space L1 (R) is not a
subspace of the Hilbert space L2 (R) (see Exercise 7.57). For this reason, we
first consider the Schwartz4 space
    S(R) = { f ∈ C^∞(R) | x^k · (d^ℓ/dx^ℓ) f(x) is bounded for all k, ℓ ∈ N0 }

of all rapidly decaying C^∞ functions.


^4 Laurent Schwartz (1915-2002), French mathematician
Remark 7.25. Every function f ∈ S(R) and all of its derivatives f (k) , for
k ∈ N, are rapidly decaying to zero around infinity, i.e., for any (complex-
valued) polynomial p ∈ P C and for any k ∈ N0 , we have
p(x)f (k) (x) −→ 0 for |x| → ∞.
Therefore, all derivatives f (k) of f ∈ S(R), for k ∈ N, are also contained
in S(R). Obviously, we have the inclusion S(R) ⊂ L1 (R), and so f ∈ S(R)
and all its derivatives f (k) , for k ∈ N, are absolutely integrable, i.e., we have
f (k) ∈ L1 (R) for all k ∈ N0 . 
Typical examples of elements in the Schwartz space S(R) are C ∞ func-
tions with compact support. Another example is the Gauss function gα , for
α > 0, from Example 7.8. Before we give further examples of functions in the
Schwartz space S(R), we first note a few observations.
According to Remark 7.25, every function f ∈ S(R) and all its derivatives
f (k) , for k ∈ N, have a Fourier transform. Moreover, for f ∈ S(R) and
k, ∈ N0 , we have the representations
d
(Ff )(ω) = (−i) (F(x f ))(ω) for all ω ∈ R
dω 
(Ff (k) )(ω) = (iω)k (Ff )(ω) for all ω ∈ R,
as they directly follow (by induction) from Proposition 7.2 (d)-(e) (see Exer-
cise 7.59). This yields the uniform estimate
   k 
 k d   d 
ω   
 dω  (Ff )(ω) ≤  dxk (x f (x)) 1 for all ω ∈ R.

(7.29)
L (R)

i.e., all functions ω k (Ff )() (ω), for k, ∈ N0 , are bounded. Therefore, we see
that the Fourier transform Ff of any f ∈ S(R) is also contained in S(R).
By the Fourier inversion formula, Theorem 7.25, the Fourier transform F is
bijective on S(R). We reformulate this important result as follows.
Theorem 7.26. The Fourier transform F : S(R) −→ S(R) is an automor-
phism on the Schwartz space S(R), i.e., F is linear and bijective on S(R).

Now we make an important example for a family of functions that are
contained in the Schwartz space S(R). To this end, we recall the Hermite
polynomials Hn from Section 4.4.3 and their associated Hermite functions
hn from Exercise 4.42.
Example 7.27. The Hermite functions
    hn(x) = Hn(x) · e^{−x²/2}   for n ∈ N0                                     (7.30)
are contained in the Schwartz space S(R). Indeed, this follows from the rapid
decay of the Gauss function g1/2 (x) = exp(−x2 /2), cf. Example 7.8. ♦
The Schwartz space S(R) is obviously contained in any Banach space


Lp (R), for 1 ≤ p ≤ ∞. In particular, S(R) is a subspace of the Hilbert space
L2 (R), i.e., S(R) ⊂ L2 (R). In the following discussion, we work with the
L2 -inner product
    (f, g) = ∫_R f(x) g(x) dx   for f, g ∈ L2(R).                              (7.31)

Now we prove the completeness of the Hermite functions in L2 (R).

Proposition 7.28. The Hermite functions (hn )n∈N0 in (7.30) are a com-
plete orthogonal system in the Hilbert space L2 (R).

Proof. The orthogonality of (hn )n∈N0 follows from the orthogonality of the
Hermite polynomials in Theorem 4.28. According to (4.47), we have

(hm , hn ) = 2n n! π · δmn for all m, n ∈ N0 . (7.32)

Now we show the completeness of the system (hn )n∈N0 . To this end, we
use the completeness criterion in Theorem 6.26, as follows.
Suppose that f ∈ L2 (R) satisfies (f, hn ) = 0 for all n ∈ N0 . Then, we
consider the function g : C −→ C, defined as

    g(z) = ∫_R h0(x) f(x) e^{−ixz} dx   for z ∈ C.

Note that g is holomorphic on C, and, moreover, we have


    g^(m)(z) = (−i)^m ∫_R x^m h0(x) f(x) e^{−ixz} dx   for m ∈ N0.

Therefore, g (m) (0) can be written as a linear combination of the inner


products (f, hk ), for k = 0, . . . , m, so that g (m) (0) = 0 for all m ∈ N0 .
From this, we conclude g ≡ 0, since g is holomorphic, which in turn implies
F(h0 f ) = 0. By Corollary 7.23, we get h0 f = 0 almost everywhere. In par-
ticular, we have f = 0 almost everywhere. Due to the completeness criterion,
Theorem 6.26, the orthogonal system (hn )n∈N0 is complete in L2 (R). 

Theorem 7.29. For any n ∈ N0, the Hermite function hn in (7.30) is an eigenfunction of the Fourier transform for the eigenvalue √(2π)(−i)^n, i.e.,

    ĥn = √(2π)(−i)^n hn   for all n ∈ N0.

Proof. We prove the statement by induction on n ∈ N0.
Initial step: For n = 0 the statement holds for h0 = g_{1/2} by (7.16).
Induction hypothesis: Assume that the Hermite function hn is an eigenfunction of the Fourier transform for the eigenvalue √(2π)(−i)^n, for n ∈ N0.
Induction step (n −→ n + 1): By partial integration, we obtain

    ĥ_{n+1}(ω) = ∫_R e^{−ixω} h_{n+1}(x) dx
               = ∫_R e^{−ixω} (−1)^{n+1} e^{x²/2} (d/dx)(d^n/dx^n) e^{−x²} dx
               = lim_{R→∞} [ e^{−ixω} (−1)^{n+1} e^{x²/2} (d^n/dx^n) e^{−x²} ]_{x=−R}^{x=R}
                 − ∫_R (−iω + x) e^{−ixω} e^{x²/2} (−1)^{n+1} (d^n/dx^n) e^{−x²} dx
               = lim_{R→∞} [ −e^{−ixω} hn(x) ]_{x=−R}^{x=R} + ∫_R (−iω + x) e^{−ixω} hn(x) dx
               = −iω ĥn(ω) + (F(x hn))(ω).

From the induction hypothesis and Proposition 7.2 (e), we conclude

    ĥ_{n+1}(ω) = √(2π) (−i)^{n+1} ( ω hn(ω) − hn′(ω) ).                        (7.33)

Now the three-term recursion of the Hermite polynomials in (4.48) can be transferred to the Hermite functions, so that

    h_{n+1}(x) = 2x hn(x) − 2n h_{n−1}(x)   for n ≥ 0                          (7.34)

holds with the initial values h_{−1} ≡ 0 and h0(x) = exp(−x²/2). By using the recursion Hn′(x) = 2n H_{n−1}(x), for n ∈ N, from Corollary 4.30, we get

    hn′(x) = (d/dx) ( e^{−x²/2} · Hn(x) ) = −x · e^{−x²/2} · Hn(x) + e^{−x²/2} · Hn′(x)
           = −x hn(x) + e^{−x²/2} (2n H_{n−1}(x))
           = 2n h_{n−1}(x) − x hn(x).                                          (7.35)

Moreover, the representations in (7.34) and (7.35) imply the recursion

    h_{n+1}(x) = x hn(x) − hn′(x)   for n ≥ 0                                  (7.36)

(cf. Exercise 4.42). Therefore, ĥ_{n+1} = √(2π)(−i)^{n+1} h_{n+1} by (7.33) and (7.36).  □
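The eigenvalue relation of Theorem 7.29 can be checked numerically (a minimal sketch; the truncation of the Fourier integral to [−20, 20], the grid and the test frequency are arbitrary choices). It relies on numpy's physicists' Hermite polynomials Hn, which match the recursions (7.34) and (7.35) used here.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_function(n, x):
    """Hermite function h_n(x) = H_n(x) * exp(-x^2/2) with physicists' H_n."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermval(x, coeffs) * np.exp(-x**2 / 2.0)

def fourier_transform(values, x, omega):
    """Riemann-sum approximation of int values(x) exp(-i x omega) dx."""
    dx = x[1] - x[0]
    return np.sum(values * np.exp(-1j * x * omega)) * dx

if __name__ == "__main__":
    x = np.linspace(-20.0, 20.0, 40001)
    omega = 1.3   # arbitrary test frequency
    for n in range(5):
        hn = hermite_function(n, x)
        lhs = complex(fourier_transform(hn, x, omega))
        rhs = complex(np.sqrt(2.0 * np.pi) * (-1j) ** n * hermite_function(n, omega))
        print(f"n = {n}:  F h_n = {lhs:.6f}   sqrt(2 pi)(-i)^n h_n = {rhs:.6f}")
```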

Given the completeness of the Hermite functions (hn)n∈N0 in L2(R), according to Proposition 7.28, there is a unique extension of the Fourier transform F : S(R) −→ S(R) to the Hilbert space L2(R). Moreover, by the spectral property of the orthogonal system (hn)n∈N0 in L2(R), as shown in Theorem 7.29, the Parseval identity (6.12) can also be extended to L2(R). This important result is referred to as the Plancherel^5 theorem.
^5 Michel Plancherel (1885-1967), Swiss mathematician
Theorem 7.30. (Plancherel theorem).


The Fourier transform F : S(R) −→ S(R) can uniquely be extended to a
bounded and bijective linear mapping on the Hilbert space L2 (R). The ex-
tended Fourier transform F : L2 (R) −→ L2 (R) has the following properties.
(a) The Parseval identity

    (Ff, Fg) = 2π (f, g)   for all f, g ∈ L2(R),

holds, so that in particular

    ‖Ff‖_{L2(R)} = √(2π) ‖f‖_{L2(R)}   for all f ∈ L2(R).

(b) The Fourier inversion formula holds on L2 (R), i.e.,

F −1 (Ff ) = f for all f ∈ L2 (R).

(c) For the operator norms of F and F −1 on L2 (R), we have

    ‖F‖_{L2(R)→L2(R)} = (2π)^{1/2}
    ‖F⁻¹‖_{L2(R)→L2(R)} = (2π)^{−1/2}.


We close this section by the following remark.
Remark 7.31. The Fourier operator F : L2 (R) −→ L2 (R) is uniquely de-
termined by the properties in Theorem 7.30. Moreover, we remark that the
Fourier transform F : L1 (R) −→ C0 (R) maps any f ∈ L1 (R) to a unique
uniformly continuous function Ff ∈ C0(R). In contrast, the Fourier transform
F : L2 (R) −→ L2 (R) maps any f ∈ L2 (R) to a function Ff ∈ L2 (R) that is
merely almost everywhere unique. 

7.3 The Shannon Sampling Theorem


This section is devoted to the Shannon6 sampling theorem, which is a funda-
mental result in mathematical signal processing. According to the Shannon
sampling theorem, any signal f ∈ L2 (R) with bounded frequency density can
be reconstructed exactly from its samples (i.e., function values) on an infinite
uniform grid {jd | j ∈ Z} ⊂ R at a sufficiently small sampling rate d > 0.
We formulate the mathematical assumptions on f as follows.
Definition 7.32. A function f ∈ L2 (R) is said to be band-limited, if its
Fourier transform Ff has compact support supp(Ff ), i.e., if there is some
constant L > 0 satisfying supp(Ff ) ⊂ [−L, L], where the smallest constant
L with this property is called the bandwidth of f .
^6 Claude Elwood Shannon (1916-2001), US-American mathematician
Remark 7.33. Every band-limited function f is analytic. This important


result is due to the Paley7 -Wiener8 theorem. A detailed discussion concerning
the analyticity of Fourier transforms can be found in [58, Section IX.3]. 

Theorem 7.34. (Shannon sampling theorem).


Let f ∈ L2 (R) be a band-limited function with bandwidth L > 0. Then, we
have the reconstruction formula
    f(x) = Σ_{j∈Z} f(jπ/L) · sinc(Lx − jπ)   for all x ∈ R.                    (7.37)

Proof. Without loss of generality, we assume L = π for the bandwidth of f ,


since otherwise we can resort to the case ĝ(ω) = fˆ(ω · π/L).
For a fixed x ∈ R, we work with the function ex (ω) = exp(ixω). By
ex ∈ L2 [−π, π], the Fourier series representation
    ex(ω) = Σ_{j∈Z} cj(ex) · e^{ijω}

holds in the L2 -sense. The Fourier coefficients cj (ex ) of ex can be computed
as

    cj(ex) = 1/(2π) ∫_{−π}^{π} ex(ω) · e^{−ijω} dω = sinc(π(x − j))   for all j ∈ Z.

Now f has a continuous representative in L2 which satisfies the representation
    f(x) = 1/(2π) ∫_{−π}^{π} f̂(ω) · e^{ixω} dω                                 (7.38)
         = Σ_{j∈Z} sinc(π(x − j)) · 1/(2π) ∫_{−π}^{π} f̂(ω) · e^{ijω} dω        (7.39)
         = Σ_{j∈Z} f(j) · sinc(π(x − j))                                        (7.40)

pointwise for all x ∈ R. Note that we have applied the Fourier inversion
formula of the Plancherel theorem, Theorem 7.30, to obtain (7.38) and (7.40).
Finally, we remark that the interchange of integration and summation
in (7.39) is valid by the Parseval identity
    1/(2π) ∫_{−π}^{π} g(ω) h(ω) dω = Σ_{j∈Z} cj(g) · cj(h)   for all g, h ∈ L2[−π, π],

which completes our proof for the stated reconstruction formula in (7.37). 
^7 Raymond Paley (1907-1933), English mathematician
^8 Norbert Wiener (1894-1964), US-American mathematician
Remark 7.35. By the Shannon sampling theorem, Theorem 7.34, any band-
limited function f ∈ L1 (R) ∩ C (R), or, f ∈ L2 (R) with bandwidth L > 0
can uniquely be reconstructed from its values on the uniform sampling grid
{jd | j ∈ Z} ⊂ R for all sampling rates d ≤ π/L. Therefore, the optimal
sampling rate is d∗ = π/L, and this rate corresponds to half of the smallest
wave length 2π/L that is present in the signal f . The optimal sampling rate
d∗ = π/L is called the Nyquist rate (or, Nyquist distance). 
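A numerical illustration of the reconstruction formula (7.37) (a minimal sketch; the band-limited test signal f(x) = sinc(x)², whose Fourier transform is supported in [−2, 2], the truncation of the series to finitely many samples and the evaluation grid are our own choices):

```python
import numpy as np

def sinc(t):
    """sin(t)/t with value 1 at t = 0 (numpy's sinc is normalized by pi)."""
    return np.sinc(t / np.pi)

# band-limited test signal: f(x) = sinc(x)^2 has Fourier transform supported in [-2, 2],
# i.e., bandwidth L = 2
L = 2.0
f = lambda x: sinc(x) ** 2

# samples on the uniform grid {j * pi / L}, truncated to |j| <= J
J = 200
j = np.arange(-J, J + 1)
samples = f(j * np.pi / L)

def reconstruct(x):
    """Truncated Shannon series sum_j f(j pi / L) sinc(L x - j pi)."""
    return np.sum(samples * sinc(L * np.asarray(x)[..., None] - j * np.pi), axis=-1)

x = np.linspace(-5.0, 5.0, 1001)
err = np.max(np.abs(reconstruct(x) - f(x)))
print(f"max reconstruction error on [-5, 5] with {2*J+1} samples: {err:.2e}")
```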

Remark 7.36. In the commonly used literature, various formulations of the


Shannon sampling theorem are given for band-limited functions f ∈ L1 (R),
rather than for f ∈ L2 (R). We remark that the representation in (7.37) does
also hold for band-limited functions f ∈ L1 (R), or, to be more precise, the
representation in (7.37) holds pointwise for a continuous representative of
f ∈ L1 (R). In fact, this statement can be shown (for compact supp(fˆ) ⊂ R)
by following along the lines of our proof for Theorem 7.34. 

Remark 7.37. The Shannon sampling theorem is, in its different variants,
also connected with the names of Nyquist9 , Whittaker10 , and Kotelnikov11 . In
fact, Kotelnikov had formulated and published the sampling theorem already
in 1933, although his work was widely unknown for a long time. Shannon
formulated the sampling theorem in 1948, where he used this result as a
starting point for his theory on maximal channel capacities. 

7.4 The Multivariate Fourier Transform

In this section, we introduce the Fourier transform for complex-valued func-


tions f ≡ f (x1 , . . . , xd ) of d real variables. To this end, we can rely on basic
concepts for the univariate case, d = 1, from the previous sections. Again,
we first regard the Fourier transform on the Banach space of all absolutely
integrable functions,

    L1(R^d) = { f : R^d −→ C | ∫_{R^d} |f(x)| dx < ∞ },

equipped with the L1-norm

    ‖f‖_{L1(R^d)} = ∫_{R^d} |f(x)| dx   for f ∈ L1(R^d).

^9 Harry Nyquist (1889-1976), US-American electrical engineer
^10 Edmund Taylor Whittaker (1873-1956), British astronomer, mathematician
^11 Vladimir Kotelnikov (1908-2005), Russian pioneer of information theory
Definition 7.38. For f ∈ L1 (Rd ), the function
    (Fd f)(ω) = f̂(ω) := ∫_{R^d} f(x) e^{−i⟨x,ω⟩} dx   for ω ∈ R^d              (7.41)

is called the Fourier transform of f. The Fourier operator, which assigns f ∈ L1(R^d) to its d-variate Fourier transform Fd f = f̂, is denoted as Fd.
Likewise, for g ∈ L1(R^d) the function

    (Fd⁻¹ g)(x) = ǧ(x) := (2π)^{−d} ∫_{R^d} g(ω) · e^{i⟨x,ω⟩} dω   for x ∈ R^d   (7.42)

is called the inverse Fourier transform of g. The operator, which assigns g ∈ L1(R^d) to its inverse Fourier transform ǧ, is denoted as Fd⁻¹.

By separation of the variables in the R^d-inner product ⟨·, ·⟩,

    ⟨x, ω⟩ = x1 ω1 + . . . + xd ωd   for x = (x1, . . . , xd)^T, ω = (ω1, . . . , ωd)^T ∈ R^d,

appearing in the Fourier transform's formulas (7.41) and (7.42) we can, via

    e^{±i⟨x,ω⟩} = e^{±i x1 ω1} · . . . · e^{±i xd ωd},

generalize the results for the univariate case, d = 1, to the multivariate case, d ≥ 1. In the remainder of this section, we merely quote results that are
needed in Chapters 8 and 9. Of course, the Fourier inversion formula from
Theorem 7.21 is of central importance.

Theorem 7.39. (Fourier inversion formula).


For f ∈ L1 (Rd ) with fˆ = Fd f ∈ L1 (Rd ), the Fourier inversion formula
    f(x) = (2π)^{−d} ∫_{R^d} f̂(ω) · e^{i⟨x,ω⟩} dω   for almost every x ∈ R^d   (7.43)

holds with equality at every point x ∈ Rd , where f is continuous. 

As in Corollary 7.24, formula (7.43) holds also for f ∈ L1 (Rd ) ∩ C (Rd ).

Corollary 7.40. For f ∈ L1 (Rd ) ∩ C (Rd ), the Fourier inversion formula

    f(x) = lim_{ε↘0} (2π)^{−d} ∫_{R^d} f̂(ω) · e^{i⟨x,ω⟩} e^{−ε‖ω‖₂²} dω   for all x ∈ R^d   (7.44)

holds.  □

An important example is the Fourier transform of the Gauss function.
Example 7.41. The d-variate Fourier transform of the Gauss function

    gα(x) = e^{−α‖x‖₂²}   for x ∈ R^d and α > 0

is

    (Fd gα)(ω) = (π/α)^{d/2} · e^{−‖ω‖₂²/(4α)}   for ω ∈ R^d.
Moreover, we apply the Fourier transform Fd to convolutions.

Definition 7.42. For f, g ∈ L1 (Rd ), the function
    (f ∗ g)(x) := ∫_{R^d} f(x − y) g(y) dy   for x ∈ R^d                        (7.45)

is called the convolution product, in short, convolution, between f and g.
Moreover, for f ∈ L1(R^d) the convolution product

    (f ∗ f∗)(x) = ∫_{R^d} f(x − y) f∗(y) dy = ∫_{R^d} f(x + y) f(y) dy   for x ∈ R^d

is called the autocorrelation of f, where f∗(x) := f(−x) for all x ∈ R^d.

As for the univariate case, in Theorem 7.14 and Corollary 7.16, the Fourier
convolution theorem holds for the multivariate Fourier transform.

Theorem 7.43. (Fourier convolution theorem).


For f, g ∈ L1(R^d), the identity

    Fd(f ∗ g) = (Fd f) · (Fd g)

holds. In particular, for real-valued f ∈ L1(R^d), we have

    Fd(f ∗ f∗)(ω) = |(Fd f)(ω)|²   for all ω ∈ R^d

for the Fourier transform of the autocorrelation of f.  □

By following along the lines of Section 7.2, we can transfer the multivariate
Fourier transform Fd : L1 (Rd ) −→ C0 (Rd ) to the Hilbert space
    L2(R^d) = { f : R^d −→ C | ∫_{R^d} |f(x)|² dx < ∞ }

of all square-integrable functions, equipped with the L2-inner product

    (f, g) = ∫_{R^d} f(x) g(x) dx   for f, g ∈ L2(R^d)
and the Euclidean norm ‖ · ‖_{L2(R^d)} = (·, ·)^{1/2}. To this end, we first introduce the Fourier transform Fd on the Schwartz space

    S(R^d) = { f ∈ C^∞(R^d) | x^k · (d^ℓ/dx^ℓ) f(x) is bounded for all k, ℓ ∈ N0^d }

of all rapidly decaying C^∞ functions. As for the univariate case, Theorem 7.26, the Fourier transform Fd is bijective on S(R^d).
Theorem 7.44. The multivariate Fourier transform Fd : S(Rd ) −→ S(Rd )
is an automorphism on the Schwartz space S(Rd ). 
This implies the Plancherel theorem, as in Theorem 7.30 for d = 1.
Theorem 7.45. (Plancherel theorem).
The Fourier transform Fd : S(R^d) −→ S(R^d) can uniquely be extended to a bounded and bijective linear mapping on the Hilbert space L2(R^d). The extended Fourier transform Fd : L2(R^d) −→ L2(R^d) has the following properties.
(a) The Parseval identity

    (Fd f, Fd g) = (2π)^d (f, g)   for all f, g ∈ L2(R^d)

holds, so that in particular

    ‖Fd f‖_{L2(R^d)} = (2π)^{d/2} ‖f‖_{L2(R^d)}   for all f ∈ L2(R^d).

(b) The Fourier inversion formula

    Fd⁻¹(Fd f) = f   for all f ∈ L2(R^d)

holds on L2(R^d), i.e.,

    f(x) = (2π)^{−d} ∫_{R^d} f̂(ω) e^{i⟨x,ω⟩} dω   for almost every x ∈ R^d.

(c) For the operator norms of Fd and Fd⁻¹ on L2(R^d), we have

    ‖Fd‖_{L2(R^d)→L2(R^d)} = (2π)^{d/2}
    ‖Fd⁻¹‖_{L2(R^d)→L2(R^d)} = (2π)^{−d/2}.
7.5 The Haar Wavelet


In this section, we turn to the construction and analysis of wavelet methods.
Wavelets are important building blocks for multiresolution representations of
signals f ∈ L2 (R). To this end, suitable wavelet bases of L2 (R) are utilized. A
very simple-structured wavelet basis of L2 (R) is due to the work [32] of Alfréd
Haar in 1910. In the following discussion, we explain important principles of
wavelet methods by using the Haar12 wavelet.
^12 Alfréd Haar (1885-1933), Hungarian mathematician
Let us first introduce a basic ingredient. For an interval I ⊂ R, we denote
by χI : R −→ R,

    χI(x) := 1 for x ∈ I,   χI(x) := 0 otherwise,

the indicator function of I. Now we can give a definition for the Haar wavelet.

Definition 7.46. The function ψ : R −→ R, defined as

    ψ(x) = χ[0,1/2)(x) − χ[1/2,1)(x) = { 1 for x ∈ [0, 1/2);  −1 for x ∈ [1/2, 1);  0 otherwise },

is called the Haar wavelet.
In the following, we wish to construct a wavelet basis of L2 (R) by using the
Haar wavelet ψ. To this end, we apply dilations (i.e., scalings) and translations
(i.e., shifts) to the argument of ψ. To be more precise, we consider, for j, k ∈ Z,
the wavelet functions
ψkj (x) := 2j/2 ψ(2j x − k) for x ∈ R (7.46)
that are generated from the Haar wavelet ψ by multiplication of ψ with factor
2j/2 , along with the application of dilations with 2j and translations about k
on the argument of ψ. In particular, for j = k = 0, we get the Haar wavelet
ψ = ψ00 . Figure 7.2 shows the function graphs of ψkj , for j = −1, 0, 1.
Let us note only a few elementary properties of the wavelet functions ψkj .
Proposition 7.47. For ψkj in (7.46), the following statements hold.
(a) The wavelet functions ψkj have zero mean, i.e.,

    ∫_{−∞}^{∞} ψkj(x) dx = 0   for all j, k ∈ Z.

(b) The wavelet functions ψkj have unit L2-norm, i.e.,

    ‖ψkj‖_{L2(R)} = 1   for all j, k ∈ Z.

(c) For any j, k ∈ Z, the wavelet function ψkj has compact support, where

    supp(ψkj) = [2^{−j}k, 2^{−j}(k + 1)].

Proposition 7.47 (a)-(c) can be proven by elementary calculations.
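These properties, and the orthonormality established next, can also be confirmed numerically. A minimal Python sketch (the dyadic evaluation grid is an arbitrary choice):

```python
import numpy as np

def haar_psi(x):
    """The Haar wavelet psi = chi_[0,1/2) - chi_[1/2,1)."""
    x = np.asarray(x)
    return (np.where((0.0 <= x) & (x < 0.5), 1.0, 0.0)
            - np.where((0.5 <= x) & (x < 1.0), 1.0, 0.0))

def psi_jk(j, k, x):
    """psi_k^j(x) = 2^(j/2) * psi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar_psi(2.0 ** j * x - k)

# fine dyadic grid on [-4, 4), so the piecewise constant functions are resolved exactly
n = 1 << 16
x = np.linspace(-4.0, 4.0, n, endpoint=False)
dx = x[1] - x[0]

def inner(f, g):
    return np.sum(f * g) * dx   # L2 inner product on the grid

for (j, k), (m, l) in [((0, 0), (0, 0)), ((1, 0), (1, 1)), ((0, 0), (1, 0)), ((2, 1), (0, 0))]:
    val = inner(psi_jk(j, k, x), psi_jk(m, l, x))
    print(f"(psi_{k}^{j}, psi_{l}^{m}) = {val:.6f}")

print("mean of psi_0^0:", np.sum(psi_jk(0, 0, x)) * dx)
print("norm of psi_3^2:", np.sqrt(inner(psi_jk(2, 3, x), psi_jk(2, 3, x))))
```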
Another important property is the orthonormality of the function system
{ψkj }j,k∈Z with respect to the L2 -inner product (·, ·), defined as

(f, g) := f (x)g(x) dx for f, g ∈ L2 (R).
R
[Figure 7.2 omitted: the wavelet functions ψk−1 for k = −1, 0; ψk0 for k = −2, −1, 0, 1; and ψk1 for k = −4, . . . , 3.]
Fig. 7.2. The Haar wavelet ψ = ψ00 generates the functions ψkj = 2^{j/2} ψ(2^j · −k).
Proposition 7.48. The function system {ψkj}j,k∈Z is orthonormal in L2(R), i.e.,

    (ψkj, ψℓm) = δjm δkℓ   for all j, k, ℓ, m ∈ Z.

Proof. According to Proposition 7.47 (b), any ψkj has unit L2-norm.
Now suppose that ψkj and ψℓm are, for j, k, ℓ, m ∈ Z, distinct.
Case 1: If j = m, then k ≠ ℓ. In this case, the intersection of the support intervals of ψkj and ψℓm contains at most one point, according to Proposition 7.47 (c), so that (ψkj, ψℓm) = 0.
Case 2: If j ≠ m, then we assume m > j (without loss of generality). In this case, we either have, for ℓ ∉ {2^{m−j}k, . . . , 2^{m−j}(k + 1) − 1},

    supp(ψkj) ∩ supp(ψℓm) = ∅,

whereby (ψkj, ψℓm) = 0, or we have, for ℓ ∈ {2^{m−j}k, . . . , 2^{m−j}(k + 1) − 1},

    supp(ψℓm) = [2^{−m}ℓ, 2^{−m}(ℓ + 1)] ⊂ [2^{−j}k, 2^{−j}(k + 1)] = supp(ψkj),

so that

    (ψkj, ψℓm) = ±2^{j/2} ∫_{supp(ψℓm)} ψℓm(x) dx = 0.

This completes our proof.  □

Next, we wish to construct a sequence of approximations to f ∈ L2 (R)


on different scales, i.e., at different resolutions. To this end, we work with
a decomposition of L2 (R) into ”finer” and ”coarser” closed subspaces. In
the following construction of such closed subspaces, the relation between the
Haar-Wavelet ψ and its scaling function

ϕ = χ[0,1)

plays an important role. For the functions

ϕjk (x) := 2j/2 ϕ(2j x − k) for j, k ∈ Z, (7.47)

generated by ϕ, we note the following elementary properties.


Proposition 7.49. For ϕjk in (7.47) the following statements hold.
(a) We have the orthonormality relation

    (ϕjk, ϕjℓ) = δkℓ   for all j, k, ℓ ∈ Z.

(b) For any j, k ∈ Z, the function ϕjk has compact support, where

    supp(ϕjk) = [2^{−j}k, 2^{−j}(k + 1)].

Proposition 7.49 can be proven by elementary calculations.


Now we turn to a very important property of ϕ, which in particular ex-
plains the naming scaling function.

Proposition 7.50. The refinement equations
    ϕj−1k = 2^{−1/2} (ϕj2k + ϕj2k+1)   for all j, k ∈ Z                         (7.48)
    ψkj−1 = 2^{−1/2} (ϕj2k − ϕj2k+1)   for all j, k ∈ Z                         (7.49)

hold.

Proof. By the representation

    ϕ(x) = ϕ(2x) + ϕ(2x − 1)   for all x ∈ R

the refinement equation in (7.48) holds for j = k = 0. By the linear transformation of the argument x → 2^{j−1}x − k, this implies the representation in (7.48). Likewise, the representation in (7.49) can be verified, now from the identity

    ψ(x) = ϕ(2x) − ϕ(2x − 1)   for all x ∈ R.  □
By the refinement equations in (7.48) and (7.49), the coarser functions


ϕj−1
k and ψkj−1 are represented by a unique linear combination of two finer
functions, ϕj2k and ϕj2k+1 , respectively. We collect all functions of refinement
level j ∈ Z in the L2 -closure

Vj = span{ϕjk : k ∈ Z} ⊂ L2 (R) for j ∈ Z (7.50)

of all linear combinations of functions ϕjk , for k ∈ Z. For the properties of


the scale spaces Vj , we note the following observation.

Proposition 7.51. The spaces Vj in (7.50) have the following properties.


(a) Vj is 2−j Z-translation-invariant, i.e., f ∈ Vj implies f (· − 2−j k) ∈ Vj .
(b) The inclusion Vj−1 ⊂ Vj holds.
Proof. Property (a) follows from the scale-invariance of the wavelet basis,
    ϕjℓ(x − 2^{−j}k) = 2^{j/2} ϕ(2^j (x − 2^{−j}k) − ℓ) = 2^{j/2} ϕ(2^j x − (k + ℓ)) = ϕj,k+ℓ.

Property (b) follows directly from the refinement equation in (7.48).  □

Remark 7.52. According to property (b) in Proposition 7.51, the coarser


scale space Vj−1 (spanned by the coarser basis elements ϕj−1
k ) is contained in
the finer scale space Vj (spanned by the finer basis elements ϕjk ). Therefore,
the scale spaces (Vj )j∈Z in (7.50) are a nested sequence

· · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ · · · ⊂ Vj−1 ⊂ Vj ⊂ · · · ⊂ L2 (R) (7.51)

of subspaces in L2 (R). 

Now we study further properties of the nested sequence (Vj )j∈Z . To this
end, we work with the orthogonal projection operator Πj : L2 (R) −→ Vj ,
for j ∈ Z, which assigns every f ∈ L2 (R) to its unique best approximation
s∗j = Πj f in L2 (R). According to our discussion in Section 6.2, we have the
series representation
    Πj f = Σ_{k∈Z} (f, ϕjk) ϕjk ∈ Vj   for f ∈ L2(R)                            (7.52)

for the orthogonal projection of f on Vj , as in (6.9). The following result


describes the asymptotic behaviour of the approximations (Πj f )j∈Z to any
function f ∈ L2 (R) with respect to  ·  =  · L2 (R) .

Proposition 7.53. For the sequence (Πj f )j∈Z of orthogonal projections


Πj f of f ∈ L2 (R) in (7.52) the following statements hold.
(a) The sequence (Πj f )j∈Z converges for j → ∞ w.r.t.  ·  to f , i.e.,

Πj f − f  −→ 0 for j → ∞.

(b) The sequence (Πj f )j∈Z converges for j → −∞ to zero, i.e.,

Πj f  −→ 0 for j → −∞.

Proof. Let ε > 0 and f ∈ L2(R). Then there is, for a (sufficiently fine) dyadic decomposition of R, a step function T ∈ L2(R) with ‖T − f‖ < ε/2. Moreover, for the indicator functions χ_{I^j_k} of the dyadic intervals I^j_k := [2^{−j}k, 2^{−j}(k + 1)), we have the reproduction property Πj χ_{I^j_k} = χ_{I^j_k}, for all k ∈ Z. Therefore, there is a level index j0 ∈ Z with T = Πj T for all j ≥ j0. From this, we can conclude statement (a) by the estimate

    ‖Πj f − f‖ ≤ ‖Πj (f − T)‖ + ‖Πj T − T‖ + ‖T − f‖
              ≤ ‖Πj‖ · ‖f − T‖ + ‖T − f‖ < ε   for j ≥ j0,

where we use ‖Πj‖ = 1 from Proposition 4.7.


To prove statement (b), we take, for given ε > 0, a continuous function g with compact support supp(g) = [−R, R], for R > 0, such that ‖f − g‖ < ε/2. Now for 2^j ≤ R^{−1}, we have
    Πj g = 2^j ( ( ∫_{−R}^{0} g(x) dx ) χ_{I^j_{−1}} + ( ∫_{0}^{R} g(x) dx ) χ_{I^j_0} )
         = 2^j ( c_{−1} χ_{I^j_{−1}} + c_0 χ_{I^j_0} ),

where c_{−1} = (g, ϕj−1) and c_0 = (g, ϕj0). Then, we have ‖Πj g‖² = 2^j (c²_{−1} + c²_0) and, moreover, ‖Πj g‖ < ε/2 for j ≡ j(ε) ∈ Z small enough. For this j, we finally get

    ‖Πj f‖ ≤ ‖Πj (f − g)‖ + ‖Πj g‖ ≤ ‖f − g‖ + ‖Πj g‖ < ε

by the triangle inequality, and so (b) is also proven.  □

Proposition 7.53 implies a fundamental property of the scale spaces Vj .

Theorem 7.54. The system (Vj )j∈Z of scale spaces Vj in (7.50) forms a
multiresolution analysis of L2 (R) by satisfying the following conditions.
(a) The scale spaces in (Vj )j∈Z are nested, so that the inclusions (7.51) hold.
>
(b) The system (Vj )j∈Z is complete in L2 (R), i.e., 2
? L (R) = j∈Z Vj .
(c) The system (Vj )j∈Z satisfies the separation j∈Z Vj = {0}.

Proof. Property (a) holds according to Remark 7.52.


Property (b) follows from Proposition 7.53 (a) ? and Theorem 6.21.
To prove (c), let f ∈ L2 (R) be an element in j∈Z Vj . Then, f must have
the form
c for x ∈ (−∞, 0),
f (x) =
cr for x ∈ [0, ∞),
for some constants c , cr ∈ R. Since f ∈ L2 (R), we have c = cr = 0 and so
f ≡ 0. Hence, statement (c) is proven. 

In the following analysis, we consider the orthogonal complement

Wj−1 = {w ∈ Vj | (w, v) = 0 for all v ∈ Vj−1 } ⊂ Vj for j ∈ Z

of Vj−1 in Vj , where we use the notation

Vj = Wj−1 ⊕ Vj−1 (7.53)

for the orthogonality relation between Wj−1 and Vj−1 . In this way, the lin-
ear scale space Vj is by (7.53) decomposed into a smooth scale space Vj−1

containing the low frequency functions of Vj and a rough orthogonal comple-


ment space Wj−1 containing the high frequency functions from Vj . A recursive
decomposition of the scale spaces yields the representation

    V_j = W_{j−1} ⊕ W_{j−2} ⊕ · · · ⊕ W_{j−ℓ} ⊕ V_{j−ℓ}    for ℓ ∈ N,        (7.54)

whereby the scale space V_j is decomposed into a finite sequence of subspaces
with increasing smoothness. By Theorem 7.54, we get the decomposition

    L²(R) = ⊕_{j∈Z} W_j ,        (7.55)
i.e., L²(R) is decomposed into the orthogonal subspaces W_j. The linear func-
tion spaces Wj are called wavelet spaces. The following result establishes
a fundamental relation between the wavelet functions {ψkj }j,k∈Z of the Haar
wavelets ψ and the wavelet spaces Wj .

Theorem 7.55. The functions {ψkj }j,k∈Z form an orthonormal basis of


L2 (R), i.e., {ψkj }j,k∈Z is a complete orthonormal system in L2 (R).

Proof. The orthonormality of the functions {ψkj }j,k∈Z is covered by Proposi-


tion 7.48. Therefore, it remains to prove the completeness of the orthonormal
system {ψkj }j,k∈Z in L2 (R). Due to the decomposition in (7.55), it is suffi-
cient to show that the wavelet space Wj is, for any refinement level j ∈ Z,
generated by the functions ψkj , for k ∈ Z, i.e.,

Wj = span{ψkj | k ∈ Z} for j ∈ Z.

To this end, we first verify the orthogonality relation

    (ψ^{j−1}_k , ϕ^{j−1}_ℓ) = 0    for all k, ℓ ∈ Z.        (7.56)

We get (7.56) as follows. For k ≠ ℓ, we have supp(ψ^{j−1}_k) ∩ supp(ϕ^{j−1}_ℓ) = ∅,
whereby (ψ^{j−1}_k , ϕ^{j−1}_ℓ) = 0. For k = ℓ, the orthogonality in (7.56) follows from
Proposition 7.47 (a). Now by the orthogonality relation in (7.56), we have

ψkj−1 ∈ Wj−1 for all k ∈ Z.

The refinement equations (7.48) and (7.49) in Proposition 7.50 imply


 
    ϕ^j_{2k}   = 2^{−1/2} ( ϕ^{j−1}_k + ψ^{j−1}_k )
    ϕ^j_{2k+1} = 2^{−1/2} ( ϕ^{j−1}_k − ψ^{j−1}_k ).

Therefore, any basis element ϕ^j_k, k ∈ Z, of V_j can be represented as a unique
linear combination of basis elements in {ϕ^{j−1}_k}_{k∈Z} ⊂ V_{j−1} and elements in
{ψ^{j−1}_k}_{k∈Z}, and so the statement follows from the decomposition (7.53). □

According to our more general discussion concerning complete orthogonal


systems in Section 6.2, we obtain for all elements of the Hilbert space L2 (R)
the representation

    f = Σ_{j,k∈Z} (f, ψ^j_k) ψ^j_k    for all f ∈ L²(R).        (7.57)

This representation follows directly from Theorem 6.21 (b) and Theorem 7.55.
Now we organize the representation (7.57) for f ∈ L2 (R) on multiple
wavelet scales. Our starting point for doing so is the multiresolution analysis
of L2 (R) in Theorem 7.54. For simplification we suppose supp(f ) ⊂ [0, 1]. We
approximate f on the scale space Vj , for j ∈ N, by the orthogonal projectors
Πj : L2 (R) −→ Vj , given as


    Π_j f = Σ_{k=0}^{N−1} c^j_k ϕ^j_k ∈ V_j    for f ∈ L²(R),        (7.58)

where cjk := (f, ϕjk ), for k = 0, . . . , N − 1, and where we assume N = 2j . The


representation in (7.58) follows directly from (7.52), where the range of the
summation index k ∈ {0, . . . , N − 1} in (7.58) is due to

supp(f ) ⊂ [0, 1] and supp(ϕjk ) = [2−j k, 2−j (k + 1)].



By (7.53), Π^⊥_{j−1} := Π_j − Π_{j−1} is the orthogonal projector of L²(R) onto
W_{j−1}, so that the decomposition

    Π_j f = Π^⊥_{j−1} f + Π_{j−1} f    for all f ∈ L²(R)        (7.59)

holds. The orthogonal projector Π^⊥_{j−1} : L²(R) −→ W_{j−1} is described by

    Π^⊥_{j−1} f = Σ_{k=0}^{N/2−1} d^{j−1}_k ψ^{j−1}_k    for f ∈ L²(R),        (7.60)

where d^{j−1}_k := (f, ψ^{j−1}_k), for k = 0, . . . , N/2 − 1.
By (7.58) and (7.60), the identity (7.59) can be written in the basis form


    Σ_{k=0}^{N−1} c^j_k ϕ^j_k = Σ_{k=0}^{N/2−1} d^{j−1}_k ψ^{j−1}_k + Σ_{k=0}^{N/2−1} c^{j−1}_k ϕ^{j−1}_k .        (7.61)

With the recursive decomposition of the scale spaces in (7.54), for = j,

Vj = Wj−1 ⊕ Wj−2 ⊕ · · · ⊕ W0 ⊕ V0 for j ∈ N,

we can write the orthogonal projector Πj : L2 (R) −→ Vj as a telescoping sum




    Π_j f = Σ_{r=0}^{j−1} Π^⊥_r f + Π_0 f    for f ∈ L²(R),        (7.62)

whereby Πj f ∈ Vj is decomposed into a sum of functions Πr⊥ f ∈ Wr , for


r = j − 1, . . . , 0, and Π0 f ∈ V0 with increasing smoothness, i.e., from high
frequency to low frequency terms. By (7.58) and (7.60), we can rewrite (7.62)
in basis form as
    Σ_{k=0}^{N−1} c^j_k ϕ^j_k = Σ_{r=0}^{j−1} Σ_{k=0}^{2^r−1} d^r_k ψ^r_k + c^0_0 ϕ^0_0 .        (7.63)
In practice, however, we have only discrete samples of f ∈ L2 (R). Suppose
the function values f (2−j k) are known for all k = 0, . . . , N −1, where N = 2j .
Then, f is interpolated by the function

    s = Σ_{k=0}^{N−1} f(2^{−j}k) ϕ(2^j · − k) ∈ V_j

at the sample points. Indeed, by ϕ(ℓ) = δ_{0ℓ} we get

    s(2^{−j}ℓ) = f(2^{−j}ℓ)    for ℓ = 0, . . . , N − 1.

For the approximation of f, we use the function values of the finest level,

    c^j_k ≈ 2^{−j/2} f(2^{−j}k)    for k = 0, . . . , N − 1,

for the coefficients c^j = (c^j_k)_{k=0}^{N−1} ∈ R^N in (7.58).
Now we consider the representation of Π_j f in (7.63). Our aim is to compute,
from the input coefficients c^j = (c^j_k)_{k=0}^{N−1} ∈ R^N, all wavelet coefficients

    d = (c^0 , d^0 , (d^1)^T , . . . , (d^{j−1})^T)^T ∈ R^N        (7.64)

of the representation in (7.63), where

    c^0 = (c^0_0) ∈ R^1   and   d^r = (d^r_k)_{k=0}^{2^r−1} ∈ R^{2^r}   for r = 0, . . . , j − 1.
The linear mapping T : RN −→ RN , which maps any data vector cj ∈ RN
to its corresponding wavelet coefficients d ∈ RN in (7.64) is bijective, and
referred to as discrete wavelet analysis. In the following discussion, we
describe the discrete wavelet analysis in detail.
The computation of the wavelet coefficients d in (7.64) can be performed
by recursive decompositions: At the first decomposition level, we compute
c^{j−1} = (c^{j−1}_k)_{k=0}^{N/2−1} and d^{j−1} = (d^{j−1}_k)_{k=0}^{N/2−1} in (7.61). To this end, we apply
the refinement equation in (7.48) to the representation in (7.61), whereby

    Σ_{k=0}^{N/2−1} c^j_{2k} ϕ^j_{2k} + Σ_{k=0}^{N/2−1} c^j_{2k+1} ϕ^j_{2k+1}
        = 2^{−1/2} ( Σ_{k=0}^{N/2−1} (c^j_{2k} + c^j_{2k+1}) ϕ^{j−1}_k + Σ_{k=0}^{N/2−1} (c^j_{2k} − c^j_{2k+1}) ψ^{j−1}_k ) .

By comparison of coefficients, we obtain the decomposition equation

    ⎡ H_j ⎤          ⎛ c^{j−1} ⎞                        ⎛ c^{j−1} ⎞
    ⎢     ⎥ c^j  =   ⎜         ⎟      or      T_j·c^j = ⎜         ⎟        (7.65)
    ⎣ G_j ⎦          ⎝ d^{j−1} ⎠                        ⎝ d^{j−1} ⎠

with the orthogonal decomposition matrix T_j ∈ R^{N×N} containing the matrix
blocks

                   ⎡ 1  1            ⎤                     ⎡ 1  −1            ⎤
    H_j = 2^{−1/2} ⎢       ⋱         ⎥ ,   G_j = 2^{−1/2}  ⎢        ⋱         ⎥   ∈ R^{N/2×N}.
                   ⎣           1  1  ⎦                     ⎣            1  −1 ⎦

In the next level, the vector cj−1 ∈ RN/2 is decomposed into the vec-
tors cj−2 ∈ RN/4 and dj−2 ∈ RN/4 . The resulting recursion is called the
pyramid algorithm. The decomposition scheme of the pyramid algorithm is
represented as follows.

          d^{j−1}       d^{j−2}                 d^0
        ↗             ↗                       ↗
    c^j  −→  c^{j−1}  −→  c^{j−2}  −→  · · ·  −→  c^0

We can describe the decompositions of the pyramid algorithm as linear


mappings T : RN −→ RN , cj −→ T cj = d, whose matrix representation

T · cj = T1 · T2 · . . . · Tj−1 · Tj · cj = (c0 , d0 , (d1 )T , . . . , (dj−1 )T )T = d

contains the decomposition matrices Tj−r , r = 0, . . . , j − 1, of the recursion


levels. The orthogonal decomposition matrices are block diagonal of the form

                 ⎡ H_{j−r}               ⎤
    T_{j−r}  =   ⎢          G_{j−r}      ⎥   ∈ R^{N×N}    for r = 0, . . . , j − 1        (7.66)
                 ⎣                  I_r  ⎦

with H_{j−r}, G_{j−r} ∈ R^{N/2^{r+1} × N/2^r} and the identities I_r ∈ R^{N(1−2^{−r}) × N(1−2^{−r})}.
Therefore, the orthogonal matrix

T = T1 · T2 · . . . · Tj−1 · Tj ∈ RN ×N (7.67)

represents the discrete wavelet analysis.


For given wavelet coefficients d in (7.64), the coefficients c^j = (c^j_k)_{k=0}^{N−1} can
thereby be reconstructed from Πj f in (7.63). The linear mapping of this re-
construction is called discrete wavelet synthesis. The wavelet synthesis
is represented by the inverse matrix

    T^{−1} = T_j^{−1} · T_{j−1}^{−1} · . . . · T_2^{−1} · T_1^{−1} = T_j^T · T_{j−1}^T · . . . · T_2^T · T_1^T ∈ R^{N×N}

of T in (7.67), so that

    c^j = T_j^T · . . . · T_1^T · d.
The discrete wavelet analysis and the discrete wavelet synthesis are also
referred to as the discrete wavelet transform and the inverse discrete
wavelet transform, respectively.
Due to the orthogonality of the matrices Tj−r in (7.66), the wavelet trans-
form is numerically stable, since

    ‖d‖₂ = ‖T_1 · . . . · T_j · c^j‖₂ = ‖c^j‖₂ .

Moreover, the complexity of the wavelet transform is only linear, since the j
decomposition steps (for r = 0, 1, . . . , j − 1) require altogether

N + N/2 + . . . + 2 = 2N − 2 = O(N ) for N → ∞

operations.
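
The following Python sketch (not from the text; all function names and the test data are ad-hoc choices) implements the pyramid algorithm for the discrete Haar wavelet analysis and its inverse, following the decomposition rule behind (7.65) and the reconstruction via the transposed factors. The final checks illustrate the norm identity ‖d‖₂ = ‖c^j‖₂ and the invertibility of the transform.

```python
import numpy as np

def haar_analysis(c):
    """Discrete Haar wavelet analysis (pyramid algorithm).
    Maps c = c^j (length N = 2^j) to d = (c^0, d^0, d^1, ..., d^{j-1}), cf. (7.64)."""
    c = np.asarray(c, dtype=float)
    details = []
    while c.size > 1:
        even, odd = c[0::2], c[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # high-pass part d^{r-1}
        c = (even + odd) / np.sqrt(2.0)               # low-pass part  c^{r-1}
    return np.concatenate([c] + details[::-1])        # order (c^0, d^0, ..., d^{j-1})

def haar_synthesis(d):
    """Discrete Haar wavelet synthesis: inverse of haar_analysis."""
    d = np.asarray(d, dtype=float)
    c, pos = d[:1], 1
    while pos < d.size:
        detail = d[pos:pos + c.size]
        new = np.empty(2 * c.size)
        new[0::2] = (c + detail) / np.sqrt(2.0)       # c^j_{2k}
        new[1::2] = (c - detail) / np.sqrt(2.0)       # c^j_{2k+1}
        pos += detail.size
        c = new
    return c

# quick check: the transform is orthogonal (norm-preserving) and invertible
rng = np.random.default_rng(0)
c = rng.standard_normal(2**6)
d = haar_analysis(c)
assert np.allclose(np.linalg.norm(d), np.linalg.norm(c))
assert np.allclose(haar_synthesis(d), c)
```

Since each decomposition step halves the length of the low-pass vector, the runtime of `haar_analysis` reflects the O(N) operation count stated above.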

7.6 Exercises
Exercise 7.56. Show that the Fourier transform f̂ : R −→ C,

    f̂(ω) = ∫_R f(x) e^{−ixω} dx    for ω ∈ R,

of f ∈ L¹(R) is a uniformly continuous function on R.

Exercise 7.57. Consider the Banach space (L¹(R), ‖·‖_{L¹(R)}) and the Hilbert
space (L²(R), ‖·‖_{L²(R)}). Show that neither the inclusion L¹(R) ⊂ L²(R) nor
the inclusion L²(R) ⊂ L¹(R) holds. Give a (non-trivial) example of a linear
space S satisfying S ⊂ L¹(R) and S ⊂ L²(R).

Exercise 7.58. Consider Proposition 7.2.


(a) Prove the properties (a)-(e) in Proposition 7.2.
(b) Give a multivariate formulation for each of the statements (a)-(e).

Exercise 7.59. Prove the following statements for the Fourier transform F.
(a) For the Fourier transform of the k-th derivative f (k) of f , we have

(Ff (k) )(ω) = (iω)k (Ff )(ω) for all ω ∈ R

under the assumption f (k) ∈ C (R) ∩ L1 (R).


(b) For the k-th derivative of the Fourier transform Ff of f , we have

    d^k/dω^k (Ff)(ω) = (−i)^k (F(x^k f))(ω)    for all ω ∈ R
under the assumption xk f ∈ L1 (R).

Exercise 7.60. Conclude from the results in Exercise 7.59 the statement:
“f ∈ L¹(R) is smooth if and only if Ff decays rapidly at infinity”.
Be more precise on this and quantify the decay and the smoothness of f.

Exercise 7.61. Let f ∈ L1 (R) \ {0} be a function with compact support.


Prove the following statements for the Fourier transform Ff = fˆ of f .
(a) fˆ has arbitrarily many derivatives, i.e., fˆ ∈ C ∞ (R);
(b) fˆ does not have compact support.

Exercise 7.62. Prove the estimate

    ‖f ∗ g‖_∞ ≤ ‖f‖_{L¹(R)} · ‖g‖_∞    for all f ∈ L¹(R), g ∈ C_0(R).

Exercise 7.63. Prove the convolution formula

Fd f ∗ Fd g = (2π)d Fd (f · g) for all f, g ∈ L1 (Rd )

in the frequency domain of the multivariate Fourier transform Fd .

Exercise 7.64. Prove for f ∈ L¹(R) ∩ C(R) the Fourier inversion formula

    f(x) = lim_{ε↘0} (1/(2π)) ∫_R f̂(ω) · e^{ixω} e^{−ε|ω|²} dω    for all x ∈ R,

i.e., prove Corollary 7.24 as a conclusion from Theorem 7.21.


Hint: see [26, Chapter 7].

Exercise 7.65. Let f : R −→ R be a Lebesgue-measurable function with


f(x) ≠ 0 for almost every x ∈ R. Moreover, suppose that f satisfies the
decay condition
|f (x)| ≤ C · e−τ |x| for all x ∈ R
for some C, τ > 0. Show that the system (xn f (x))n∈N0 is complete in L2 (R),
i.e.,
span{xn f (x) | n ∈ N0 } = L2 (R).

Hint: Proposition 7.28.

Exercise 7.66. Prove the statements of Proposition 7.47.

Exercise 7.67. Let V_0 ⊂ V_1 be closed subspaces of L²(R). Moreover, let
Π_ℓ : L²(R) −→ V_ℓ be linear projectors of L²(R) onto V_ℓ, for ℓ = 0, 1.
(a) Show that the operator P = Π_1 − Π_0 : L²(R) −→ V_1 is a projector of
    L²(R) onto V_1, if and only if Π_0 ◦ Π_1 = Π_0.
(b) Give an example of two projectors Π_ℓ : L²(R) −→ V_ℓ, for ℓ = 0, 1, such
    that the condition Π_0 ◦ Π_1 = Π_0 is violated.

Exercise 7.68. For ψ ∈ L2 (R), let {ψ(· − k) | k ∈ Z} be a Riesz basis of

W0 = span{ψ(· − k) | k ∈ Z}

with Riesz constants 0 < A ≤ B < ∞. Moreover, let

ψkj := 2j/2 ψ(2j · −k) for j, k ∈ Z.

(a) Show that {ψkj | k ∈ Z} is a Riesz basis of

Wj = span{ψkj | k ∈ Z} for j ∈ Z

with Riesz constants 0 < A ≤ B < ∞.


(b) Show that {ψkj | j, k ∈ Z} is a Riesz basis of L2 (R) with Riesz constants
0 < A ≤ B < ∞, provided that

    L²(R) = ⊕_{j∈Z} W_j .
8 Kernel-based Approximation

This chapter is devoted to interpolation and approximation of multivariate


functions. Throughout this chapter, f : Ω −→ R denotes a continuous func-
tion on a domain Ω ⊂ Rd , for d > 1. Moreover, X = {x1 , . . . , xn } ⊂ Ω is a set
of pairwise distinct interpolation points where we assume that the function
values of f at X are known. We collect these function values in a data vector
fX = (f (x1 ), . . . , f (xn ))T = (f1 , . . . , fn )T ∈ Rn . (8.1)
Since we do not make any assumptions on the distribution of the points X in
the domain Ω, the point set X is considered as scattered. We formulate the
basic interpolation problem for scattered data sets (X, fX ) as follows.
Problem 8.1. On given interpolation points X = {x1 , . . . , xn } ⊂ Ω, where
Ω ⊂ Rd for d > 1, and function values fX ∈ Rn find an interpolant s ∈ C (Ω)
satisfying sX = fX , so that s satisfies the interpolation conditions
s(xj ) = f (xj ) for all 1 ≤ j ≤ n. (8.2)

According to the Mairhuber-Curtis theorem, Theorem 5.25, there are no
non-trivial Haar systems in the truly multivariate case, i.e., for multivariate
parameter domains Ω ⊂ Rd , d > 1, containing at least one interior point.
Therefore, the interpolation problem for the multivariate case, as formulated
by Problem 8.1, is much harder than that for the univariate case.
To solve the posed multivariate interpolation problem, we construct fami-
lies of basis functions that are generated by a reproducing kernel K of a
Hilbert space F. The construction of such kernels K requires suitable charac-
terizations for positive definite functions, as we explain this in detail later in
this chapter. To this end, we rely on fundamental results from functional ana-
lysis, which we develop here. For only a few standard results from functional
analysis, we omit their proofs and rather refer to the textbook [33].
In the following discussion of this chapter, we show how positive definite
kernels lead to optimal solutions of the interpolation problem, Problem 8.1.
Moreover, we discuss other features and advantages of the proposed interpola-
tion method, where aspects of numerical relevance, e.g. stability and update
strategies, are included in our discussion. Finally, we briefly address basic
aspects of kernel-based learning methods.


8.1 Multivariate Lagrange Interpolation


8.1.1 Discussion of the Interpolation Problem

Before we develop concrete solutions to Problem 8.1, we discuss the interpo-


lation problem in (8.2) from a more general viewpoint. To this end, suppose
that for a continuous function and for pairwise distinct interpolation points
X = {x1 , . . . , xn } ⊂ Ω ⊂ Rd , d > 1, a data vector fX containing function
values of the form (8.1) is given.
To solve the interpolation problem in (8.2), we fix a suitable (finite-
dimensional) subspace S ⊂ C (Ω), from which we wish to determine an in-
terpolant s ∈ S satisfying the interpolation conditions (8.2). To this end,
we choose a set B = {s1 , . . . , sn } ⊂ C (Ω) of n linearly independent con-
tinuous functions sj : Ω −→ R, 1 ≤ j ≤ n, so that the finite-dimensional
interpolation space

S = span{s1 , . . . , sn } ⊂ C (Ω)

consists of all linear combinations of functions in B. In this approach, the


sought interpolant s ∈ S is assumed to be of the form

    s = Σ_{j=1}^n c_j s_j .        (8.3)

Now the solution of Problem 8.1 leads us to the linear system

VB,X · c = fX

for (unknown) coefficients c = (c1 , . . . , cn )T ∈ Rn of s in (8.3), where

VB,X = (sj (xk ))1≤j,k≤n ∈ Rn×n

is the generalized Vandermonde-Matrix with respect to the basis B.


We wish to determine the basis B, and so the interpolation space S, such
that the interpolation problem (8.2) has for any set of interpolation points
X and function values fX a unique solution s from S, i.e., we require the
regularity of VB,X for any set of interpolation points X.
As we recall from our discussion in Chapter 5, especially in Section 5.3,
there is, according to the Mairhuber-Curtis theorem, Theorem 5.25, no non-
trivial Haar space S ⊂ C (Ω) on domains Ω ⊂ Rd containing bifurcations.
The negative result of Mairhuber-Curtis is in particular critical for the case of
multivariate domains. In other words, according to Mairhuber-Curtis, there
is for n ≥ 2 no Haar system {s1 , . . . , sn }, such that for any data vector fX the
interpolation problem fX = sX assuming an interpolant s ∈ span{s1 , . . . , sn }
is guaranteed to have a unique solution.
To further explain this dilemma, we refer to the characterization of Haar
spaces in Theorem 5.23. According to Theorem 5.23, for the unique solution

of the interpolation problems (8.2) we need to work with a basis B whose


elements do necessarily depend on the interpolation points X. To construct
such data-dependent bases B = {s1 , . . . , sn }, we choose the approach

sj ≡ K(·, xj ) for 1 ≤ j ≤ n, (8.4)

so that the j-th basis function sj ∈ B depends on the j-th interpolation


point xj ∈ X. In this approach, K : Ω × Ω −→ R in (8.4) denotes a suitable
continuous function, whose structural properties are discussed in the following
section.
Note that our assumption in (8.4) leads us, for a fixed set of interpolation
points X = {x1 , . . . , xn } ⊂ Ω to the finite-dimensional interpolation space

SX = span{K(·, xj ) | xj ∈ X} ⊂ C (Ω),

from which we wish to choose an interpolant of the form

    s = Σ_{j=1}^n c_j K(·, x_j) .        (8.5)

The solution of the interpolation problems fX = sX is in this case given by


the solution c = (c1 , . . . , cn )T ∈ Rn of the linear equation system

AK,X · c = f X

with the interpolation matrix AK,X = (K(xk , xj ))1≤j,k≤n ∈ Rn×n .

8.1.2 Lagrange Interpolation by Positive Definite Functions

For the sake of unique interpolation, in Problem 8.1, and with assuming (8.5),
the matrix AK,X must necessarily be regular. Indeed, this follows directly
from Theorem 5.23. In the following discussion, we wish to construct conti-
nuous functions K : Ω × Ω −→ R, such that AK,X is symmetric positive
definite for all finite sets X of interpolation points, in which case AK,X would
be regular. Obviously, the matrix AK,X is symmetric, if the function K is
symmetric, i.e., if K(x, y) = K(y, x) for all x, y ∈ Rd . The requirement
for AK,X to be positive definite leads us to the notion of positive definite
functions. Since we allow arbitrary parameter domains Ω ⊂ Rd , we will from
now restrict ourselves (without loss of generality) to the case Ω = Rd .

Definition 8.2. A continuous and symmetric function K : Rd × Rd −→ R


is said to be positive definite on Rd , K ∈ PDd , if for any set of pairwise
distinct interpolation points X = {x1 , . . . , xn } ⊂ Rd , n ∈ N, the matrix

AK,X = (K(xk , xj ))1≤j,k≤n ∈ Rn×n , (8.6)

is symmetric and positive definite.



We summarize our discussion as follows (cf. Theorem 5.23).

Theorem 8.3. For K ∈ PDd , let X = {x1 , . . . , xn } ⊂ Rd , for n ∈ N, be a


finite point set. Then, the following statements are true.
(a) The matrix AK,X in (8.6) is positive definite.
(b) If s ∈ SX vanishes on X, i.e., if sX = 0, then s ≡ 0.
(c) The interpolation problem sX = fX has a unique solution s ∈ SX of the
form (8.5), whose coefficient vector c = (c1 , . . . , cn )T ∈ Rn is determined
by the unique solution of the linear system AK,X · c = fX .


By Theorem 8.3, the posed interpolation problem, in Problem 8.1, has


for K ∈ PDd a unique solution s ∈ SX of the form (8.5). In this case, for
any fixed set of interpolation points X = {x1 , . . . , xn } ⊂ Rd there is a unique
Lagrange basis {ℓ_1, . . . , ℓ_n} ⊂ S_X, whose Lagrange basis functions ℓ_j,
1 ≤ j ≤ n, are uniquely determined by the solution of the cardinal interpola-
tion problem

    ℓ_j(x_k) = δ_{jk} = ⎧ 1   for j = k
                        ⎩ 0   for j ≠ k        for all 1 ≤ j, k ≤ n.        (8.7)

Therefore, the Lagrange basis functions are also often referred to as cardinal
interpolants. We can represent the elements of the Lagrange basis {ℓ_1, . . . , ℓ_n}
as follows.

Proposition 8.4. Let K ∈ PD_d and X = {x_1, . . . , x_n} ⊂ R^d. Then, the
Lagrange basis {ℓ_1, . . . , ℓ_n} ⊂ S_X for X is uniquely determined by the solution
of the linear system

    A_{K,X} · ℓ(x) = R(x)    for x ∈ R^d,        (8.8)

where

    ℓ(x) = (ℓ_1(x), . . . , ℓ_n(x))^T ∈ R^n   and   R(x) = (K(x, x_1), . . . , K(x, x_n))^T ∈ R^n.

The interpolant s ∈ S_X satisfying sX = fX has the Lagrange representation

    s(x) = ⟨fX , ℓ(x)⟩,        (8.9)

where ⟨·, ·⟩ denotes the usual inner product on the Euclidean space R^n.

Proof. For x = x_j, the right hand side R(x_j) in (8.8) coincides with the j-th
column of A_{K,X}, and so the j-th unit vector e_j ∈ R^n is the unique solution
of the linear equation system (8.8), i.e.,

    ℓ(x_j) = e_j ∈ R^n    for all 1 ≤ j ≤ n.

In particular, ℓ_j satisfies the conditions (8.7) of cardinal interpolation.
Moreover, any Lagrange basis function ℓ_j can, by using ℓ(x) = A_{K,X}^{−1} R(x),
uniquely be represented as a linear combination

    ℓ_j(x) = e_j^T A_{K,X}^{−1} R(x)    for 1 ≤ j ≤ n        (8.10)

of the basis functions K(x, x_j) in R(x), i.e., ℓ_j ∈ S_X for 1 ≤ j ≤ n.
From (8.10), we obtain, in particular, the stated representation in (8.8).
Finally, the interpolant s in (8.5) can, by using

    s(x) = ⟨c, R(x)⟩ = ⟨A_{K,X}^{−1} fX , R(x)⟩ = ⟨fX , A_{K,X}^{−1} R(x)⟩ = ⟨fX , ℓ(x)⟩,

be represented as a unique linear combination in the Lagrange basis, where

    s(x) = Σ_{j=1}^n f(x_j) ℓ_j(x)

and so we find the Lagrange representation, as stated in (8.9). □

8.1.3 Construction of Positive Definite Functions

In this section, we discuss the construction and characterization of positive


definite functions. To this end, we use the continuous multivariate Fourier
transform from Section 7.4.
But let us first note two simple observations. For K ∈ PDd and X = {x},
for x ∈ Rd , the matrix AK,X ∈ R1×1 is positive definite, i.e., K(x, x) > 0.
For X = {x, y}, with x, y ∈ R^d, x ≠ y, we have det(A_{K,X}) > 0, whereby
K(x, y)² < K(x, x)K(y, y).
In our subsequent construction of positive definite functions we assume

K(x, y) := Φ(x − y) for x, y ∈ Rd (8.11)

for an even continuous function Φ : Rd −→ R, i.e., Φ(x) = Φ(−x) for all


x ∈ Rd . Important special cases for such Φ are radially symmetric functions.

Definition 8.5. A continuous function Φ : Rd −→ R is radially symmetric


on Rd , with respect to the Euclidean norm  · 2 , in short, Φ is radially
symmetric, if there exists a continuous function φ : [0, ∞) −→ R satisfying
Φ(x) = φ(x2 ) for all x ∈ Rd .

Obviously, every radially symmetric function Φ = φ( · 2 ) is even. In the


following discussion, we call Φ or φ positive definite, respectively, in short,
Φ ∈ PDd or φ ∈ PDd , if and only if K ∈ PDd .
We summarize our observations for K ∈ PDd in (8.11) as follows.

Remark 8.6. Let Φ : Rd −→ R be even and positive definite, i.e., Φ ∈ PDd .


Then, the following statements hold.
(a) Φ(0) > 0;
(b) |Φ(x)| < Φ(0) for all x ∈ Rd \ {0}.
From now, we assume the normalization Φ(0) = 1. This is without loss of
generality, since for Φ ∈ PDd , we have α Φ ∈ PDd for any α > 0. 
Now let us discuss the construction of positive definite functions. This is
done by using the continuous Fourier transform

    f̂(ω) := ∫_{R^d} f(x) e^{−i⟨x,ω⟩} dx    for f ∈ L¹(R^d).

The following fundamental result is due to Bochner1 who studied in [8] posi-
tive (semi-)definite functions of one variable. We can make use of the Bochner
theorem in [8] to prove suitable characterizations for multivariate positive
definite functions.
Theorem 8.7. (Bochner, 1932).
Suppose that Φ ∈ C (Rd )∩L1 (Rd ) is an even function. If the Fourier transform
Φ̂ of Φ is positive on Rd , Φ̂ > 0, then Φ is positive definite on Rd , Φ ∈ PDd .
Proof. For Φ ∈ C(R^d) ∩ L¹(R^d), the Fourier inversion formula

    Φ(x) = (2π)^{−d} ∫_{R^d} Φ̂(ω) e^{i⟨x,ω⟩} dω

holds (see Corollary 7.24). Moreover, Φ̂ is continuous on Rd (cf. our discussion


in Section 7.1). If Φ̂ > 0 on R^d, then the quadratic form

    c^T A_{K,X} c = Σ_{j,k=1}^n c_j c_k Φ(x_j − x_k) = (2π)^{−d} ∫_{R^d} | Σ_{j=1}^n c_j e^{i⟨x_j,ω⟩} |² Φ̂(ω) dω

is non-negative for any pair (c, X) of a vector c = (c1 , . . . , cn )T ∈ Rn and a


point set X = {x1 , . . . , xn } ⊂ Rd , i.e., cT AK,X c ≥ 0. If cT AK,X c = 0, then
the symbol function

    S(ω) ≡ S_{c,X}(ω) = Σ_{j=1}^n c_j e^{i⟨x_j,ω⟩}    for ω ∈ R^d

must vanish identically on Rd , due to the positivity of Φ̂ on Rd . By the linear


independence of the functions eixj ,· , we can conclude c = 0 from S ≡ 0 (see
Exercise 8.61). Therefore, we have cT AK,X c > 0 for all c ∈ Rn \ {0} and
X ⊂ Rd with |X| = n ∈ N. 
1
Salomon Bochner (1899-1982), mathematician

Remark 8.8. We could also work with weaker assumptions on Φ̂ ∈ C (Rd )


in Theorem 8.7, if we merely require non-negativity Φ̂ ≥ 0, with Φ̂ ≡ 0, for Φ̂.
But the somewhat stronger requirements for Φ̂ in Theorem 8.7 are sufficient
for our purposes and, in fact, quite convenient in our following discussion. 

By using Bochner’s characterization of Theorem 8.7, we can make three


examples for positive definite radially symmetric functions Φ.

Example 8.9. The Gauss function

    Φ(x) = e^{−‖x‖₂²}    for x ∈ R^d

is for any d ≥ 1 positive definite on R^d, Φ ∈ PD_d, by Example 7.41, where

    Φ̂(ω) = π^{d/2} e^{−‖ω‖₂²/4} > 0,

and so K(x, y) = exp(−‖x − y‖₂²) ∈ PD_d, according to Theorem 8.7. ♦
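
To make the interpolation scheme of Theorem 8.3 concrete, the following Python sketch (the test function, the point set, and the unit shape of the Gaussian are arbitrary assumptions, not taken from the text) assembles the matrix A_{K,X} for the Gaussian kernel of Example 8.9, solves A_{K,X} c = fX, and evaluates the interpolant s = Σ_j c_j K(·, x_j).

```python
import numpy as np

def gauss_kernel(x, y):
    """K(x, y) = exp(-||x - y||_2^2) for all rows of x against all rows of y."""
    d2 = np.sum((x[:, None, :] - y[None, :, :])**2, axis=-1)
    return np.exp(-d2)

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(50, 2))             # scattered points in Omega = [0,1]^2
f = lambda x: np.sin(2*np.pi*x[:, 0]) * x[:, 1]     # some test function (an assumption)
fX = f(X)

A = gauss_kernel(X, X)              # interpolation matrix A_{K,X}, s.p.d. by Theorem 8.7
c = np.linalg.solve(A, fX)          # coefficients of s = sum_j c_j K(., x_j)

Y = rng.uniform(0.0, 1.0, size=(1000, 2))           # evaluation points
sY = gauss_kernel(Y, X) @ c         # s(y) = sum_j c_j K(y, x_j)
print("interpolation residual:", np.max(np.abs(A @ c - fX)))
print("max error on Y:        ", np.max(np.abs(sY - f(Y))))
```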

Example 8.10. The inverse multiquadric

    Φ(x) = (1 + ‖x‖₂²)^{−β/2}    for β > d/2

is positive definite on R^d for all d ∈ N. The Fourier transform of Φ is given
as

    Φ̂(ω) = (2π)^{−d/2} · (2^{1−β}/Γ(β)) · ‖ω‖₂^{β−d/2} K_{d/2−β}(‖ω‖₂),        (8.12)

where

    K_ν(z) = ∫_0^∞ e^{−z cosh(x)} cosh(νx) dx    for z ∈ C with |arg(z)| < π/2

is the modified Bessel function of the third kind of order ν ∈ C. We decided to


omit the rather technical details concerning the Fourier transform Φ̂ in (8.12)
and its positivity, but rather refer to [72, Theorem 6.13]. ♦

Example 8.11. The radial characteristic functions

    Φ(x) = (1 − ‖x‖₂)₊^β = ⎧ (1 − ‖x‖₂)^β   for ‖x‖₂ < 1
                            ⎩ 0               for ‖x‖₂ ≥ 1

of Askey [2] are for d ≥ 2 positive definite on R^d, provided that β ≥ (d+1)/2.
In this case, the Fourier transform Φ̂ of Φ can (up to some positive constant)
be represented as

    Φ̂(s) = s^{−(d/2+β+1)} ∫_0^s (s − t)^β t^{d/2} J_{(d−2)/2}(t) dt > 0        (8.13)

for s = ‖ω‖₂, where



    J_ν(z) = Σ_{j=0}^∞ (−1)^j (z/2)^{ν+2j} / (j! Γ(ν + j + 1))    for z ∈ C \ {0}

is the Bessel function of the first kind of order ν ∈ C. Again, we decided to


omit the technical details concerning the Fourier transform Φ̂ in (8.13). Fur-
ther details on the construction and characterization of these early examples
for compactly supported radial positive definite functions are in [37]. ♦

Now that we have provided three explicit examples for positive definite
(radial) functions, we remark that the characterization of Bochner’s theorem
allows us to construct even larger classes of positive definite functions. This
is done by using convolutions. Recall that for any pair f, g ∈ L1 (Rd ) of
functions, the Fourier transform maps the convolution product f ∗g ∈ L1 (Rd ),

(f ∗ g)(x) = f (x − y)g(y) dy for f, g ∈ L1 (Rd )
Rd

to the product of their Fourier transforms, i.e.,

f
∗ g = fˆ · ĝ for f, g ∈ L1 (Rd )

by the Fourier convolution theorem, Theorem 7.14.


For g(x) = f*(x) := f(−x), we get the non-negative autocorrelation

    (f ∗ f*)^∧ = f̂ · \overline{f̂} = |f̂|²    for f ∈ L¹(R^d).

This gives a simple method for constructing positive definite functions.

Corollary 8.12. For any function Ψ ∈ L1 (Rd ) \ {0}, its autocorrelation



    Φ(x) = (Ψ ∗ Ψ*)(x) = ∫_{R^d} Ψ(x − y) Ψ(−y) dy

is positive definite, Φ ∈ PDd .

Proof. For Ψ ∈ L1 (Rd )\{0}, we have Φ ∈ L1 (Rd )\{0}, and so Φ̂ ∈ C (Rd )\{0}.
Moreover, the Fourier transform Φ̂ = |Ψ̂ |2 of the autocorrelation Φ = Ψ ∗ Ψ ∗
is, due to the Fourier convolution theorem, Theorem 7.43, non-negative, so
that Φ ∈ PDd , due to Remark 8.8. 
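
As a small univariate illustration of Corollary 8.12 (an example of our own choosing, not from the text): for Ψ = χ_{[0,1]} the autocorrelation Φ = Ψ ∗ Ψ* is the tent function Φ(x) = max(1 − |x|, 0), which is therefore positive definite. The sketch below checks this numerically by inspecting the eigenvalues of a kernel matrix built from Φ.

```python
import numpy as np

# tent function Phi = chi_[0,1] * (chi_[0,1])^*  (closed form of the autocorrelation)
phi = lambda r: np.maximum(1.0 - np.abs(r), 0.0)

X = np.sort(np.random.default_rng(2).uniform(-3.0, 3.0, 40))   # pairwise distinct points
A = phi(X[:, None] - X[None, :])                               # A_{K,X} with K(x, y) = Phi(x - y)
print("smallest eigenvalue of A_{K,X}:", np.linalg.eigvalsh(A).min())   # positive
```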

The practical value of the construction resulting from Corollary 8.12 is,
however, rather limited. This is because the autocorrelations Ψ ∗Ψ ∗ are rather
awkward to evaluate. To avoid numerical integration, one would prefer to
work with explicit (preferably simple) analytic expressions for positive defi-
nite functions Φ = Ψ ∗ Ψ ∗ .
We remark that the basic idea of Corollary 8.12 has led to the construc-
tion of compactly supported positive definite (radial) functions, dating back to

earlier Göttingen works of Schaback & Wendland [62] (in 1993), Wu [74] (in
1994), and Wendland [71] (in 1995). In their constructions, explicit formulas
were given for autocorrelations Φ = Ψ ∗Ψ ∗ , whose generators Ψ (x) = ψ(x2 ),
x ∈ Rd , are specific radially symmetric and compactly supported functions
ψ : [0, ∞) −→ R. This has provided a large family of continuous, radially
symmetric, and compactly supported functions Φ = Ψ ∗ Ψ ∗ , as they were
later popularized by Wendland [71], who used the radial characteristic func-
tions of Example 8.11 for Ψ to obtain piecewise polynomial positive definite
compactly supported radial functions of minimal degree. For further details
concerning the construction of compactly supported positive definite radial
functions, we refer to the survey [61] of Schaback.

8.2 Native Reproducing Kernel Hilbert Spaces


The discussion of this section is devoted to reproducing kernel Hilbert spaces
F which are generated by positive definite functions K ∈ PDd . In particular,
for any fixed K ∈ PDd , the positive definite function K is shown to be the
reproducing kernel of its associated Hilbert space F ≡ FK , whose structure
is entirely determined by the properties of K. Therefore, F is also referred
to as the native reproducing kernel Hilbert space of K, in short, native space.
To introduce F, we first define, for a fixed positive definite K ∈ PDd , the
reconstruction space

S = {s ∈ SX | X ⊂ Rd , |X| < ∞} (8.14)

containing all (potential) interpolants of the form


    s(x) = Σ_{j=1}^n c_j K(x, x_j)        (8.15)

for some c = (c1 , . . . , cn )T ∈ Rn and X = {x1 , . . . , xn } ⊂ Rd .


Note that any s ∈ S in (8.15) can be rewritten as


    s(x) ≡ s_λ(x) := λ^y K(x, y)    for λ = Σ_{j=1}^n c_j δ_{x_j}        (8.16)

where δx is the Dirac2 point evaluation functional, defined by δx (f ) = f (x),


and λy in (8.16) denotes action of the linear functional λ on variable y.
2
Paul Adrien Maurice Dirac (1902-1984), English physicist

8.2.1 Topology of the Reconstruction Space and Duality


Now we consider the linear space

    L = { λ = Σ_{j=1}^n c_j δ_{x_j}  |  c = (c_1, . . . , c_n)^T ∈ R^n, X = {x_1, . . . , x_n} ⊂ R^d, n ∈ N }

containing all finite linear combinations of δ-functionals. We equip L with


the inner product

    (λ, μ)_K := λ^x μ^y K(x, y) = Σ_{j=1}^{n_λ} Σ_{k=1}^{n_μ} c_j d_k K(x_j, y_k)    for λ, μ ∈ L,        (8.17)

for K ∈ PD_d, where

    λ = Σ_{j=1}^{n_λ} c_j δ_{x_j} ∈ L   and   μ = Σ_{k=1}^{n_μ} d_k δ_{y_k} ∈ L.

By ‖·‖_K := (·, ·)_K^{1/2}, L is a Euclidean space. Likewise, via the duality relation
in (8.16), we can equip S with the inner product

    (s_λ, s_μ)_K := (λ, μ)_K    for s_λ, s_μ ∈ S        (8.18)

and the norm ‖·‖_K = (·, ·)_K^{1/2}. Note that the normed linear spaces S and L
are isometrically isomorphic, S ≅ L, via the linear bijection λ ↦ s_λ and by the
norm isometry

    ‖λ‖_K = ‖s_λ‖_K    for all λ ∈ L.        (8.19)
Before we study the topology of the spaces L and S in more detail, we first
discuss a few concrete examples for inner products and norms of elements in
L and S.
Example 8.13. For any pair of point evaluation functionals δ_{z_1}, δ_{z_2} ∈ L,
with z_1, z_2 ∈ R^d, their inner product is given by

    (δ_{z_1}, δ_{z_2})_K = δ_{z_1}^x δ_{z_2}^y K(x, y) = K(z_1, z_2) = Φ(z_1 − z_2).

Moreover, for the norm of any δ_z ∈ L, z ∈ R^d, we obtain

    ‖δ_z‖²_K = (δ_z, δ_z)_K = δ_z^x δ_z^y K(x, y) = K(z, z) = Φ(0) = 1,

using the normalization Φ(0) = 1, as introduced in Remark 8.6. Likewise, we have

    (K(·, z_1), K(·, z_2))_K = K(z_1, z_2) = Φ(z_1 − z_2)        (8.20)

for all z_1, z_2 ∈ R^d and

    ‖K(·, z)‖_K = ‖δ_z‖_K = 1    for all z ∈ R^d.


To extend this first elementary example, we regard, for a fixed point set
X = {x1 , . . . , xn } ⊂ Rd , the linear bijection operator G : Rn −→ SX , defined
as

    G(c) = Σ_{j=1}^n c_j K(·, x_j) = ⟨c, R(x)⟩    for c = (c_1, . . . , c_n)^T ∈ R^n.        (8.21)

Proposition 8.14. For any X = {x_1, . . . , x_n} ⊂ R^d, we have

    (G(c), G(d))_K = ⟨c, d⟩_{A_{K,X}}    for all c, d ∈ R^n,

where
    ⟨c, d⟩_{A_{K,X}} := c^T A_{K,X} d    for c, d ∈ R^n

denotes the inner product generated by the positive definite matrix A_{K,X}. In
particular, G is an isometry by

    ‖G(c)‖_K = ‖c‖_{A_{K,X}}    for all c ∈ R^n,

where ‖·‖_{A_{K,X}} := ⟨·, ·⟩_{A_{K,X}}^{1/2}.

Proof. By (8.20), we have

    (G(c), G(d))_K = Σ_{j,k=1}^n c_j d_k (K(·, x_j), K(·, x_k))_K = c^T A_{K,X} d = ⟨c, d⟩_{A_{K,X}}

for all c = (c_1, . . . , c_n)^T ∈ R^n and d = (d_1, . . . , d_n)^T ∈ R^n. □

The result of Proposition 8.14 leads us to the dual operator of G.

Proposition 8.15. For any finite point set X = {x1 , . . . , xn } ⊂ Rd , the dual
operator G∗ : SX −→ Rn of G in (8.21), characterized by the relation

    (G(c), s)_K = ⟨c, G*(s)⟩    for c ∈ R^n and s ∈ S_X,        (8.22)

is given as
G∗ (s) = sX for s ∈ SX .

Proof. Note that for any s ∈ SX , there is a unique d ∈ Rn satisfying G(d) = s,


so that we have

    (G(c), s)_K = (G(c), G(d))_K = ⟨c, d⟩_{A_{K,X}} = ⟨c, A_{K,X} d⟩ = ⟨c, sX⟩

for all c ∈ Rn , in which case the assertion follows directly from (8.22). 

Next, we compute inner products and norms for the Lagrange basis func-
tions 1 , . . . , n of SX . The following proposition yields an important result
concerning our subsequent stability analysis of the interpolation method.

Proposition 8.16. For X = {x_1, . . . , x_n} ⊂ R^d, the inner products between
the Lagrange basis functions ℓ_j ∈ S_X satisfying (8.7) are given as

    (ℓ_j, ℓ_k)_K = a_{jk}^{−1}    for all 1 ≤ j, k ≤ n,

where A_{K,X}^{−1} = (a_{jk}^{−1})_{1≤j,k≤n} ∈ R^{n×n}. In particular, the norm of ℓ_j ∈ S_X is

    ‖ℓ_j‖²_K = a_{jj}^{−1}    for all 1 ≤ j ≤ n.

Proof. The representation of the Lagrange basis functions ℓ_j in (8.10) yields

    (ℓ_j, ℓ_k)_K = e_j^T A_{K,X}^{−1} A_{K,X} A_{K,X}^{−1} e_k = e_j^T A_{K,X}^{−1} e_k = a_{jk}^{−1}

for all 1 ≤ j, k ≤ n. □

From Example 8.13 and Proposition 8.16, we see that the matrices

    A_{K,X} = ((δ_{x_j}, δ_{x_k})_K)_{1≤j,k≤n} ∈ R^{n×n}
    A_{K,X}^{−1} = ((ℓ_j, ℓ_k)_K)_{1≤j,k≤n} ∈ R^{n×n}

are Gramian, i.e., the entries of the symmetric positive definite matrices AK,X
and A−1
K,X are represented by inner products, respectively.

8.2.2 Construction of the Native Hilbert Space

In this section, we introduce the native Hilbert space F ≡ FK of K ∈ PDd .


To this end, we perform a completion of the Euclidean space S. On this
occasion, we recall the general concept of completion for normed linear spaces
from functional analysis. But we decided to omit the proofs, where we rather
refer to the more general discussion in [33, Appendix B]. The following result
can, for instance, be found in [33, Corollary 16.11].

Theorem 8.17. (Completion of normed linear spaces). Let S be a


normed linear space. Then, S is isometric isomorphic to a dense subspace
of a Banach space F, which is, up to norm isomorphy, unique. The Banach
space F is called completion of S, in short, F = S. 

The concept of completion can, obviously, be applied to Euclidean spaces:


For any Euclidean space S in (8.14), there is a unique (up to norm isomorphy)
Hilbert space F, which is the completion of S with respect to the Euclidean
norm  · K , i.e., F = S. Likewise, for the dual space L of S, there is a unique
(up to norm isomorphy) Hilbert space D satisfying D = L.
By the norm isomorphy in (8.19) and by the continuity of the norm  · K ,
we obtain another important result, where we extend the linear bijection
λ −→ sλ between L and S to D and F.

Proposition 8.18. The Hilbert spaces D and F are isometric isomorphic,

D∼
= F,

via the linear bijection λ −→ sλ and by the norm isometry

λK = sλ K for all λ ∈ D.

Remark 8.19. Any functional μ ∈ D is continuous on the Hilbert space F
by the Cauchy-Schwarz inequality

    |μ(s_λ)| = |μ^x λ^y K(x, y)| = |(μ, λ)_K| ≤ ‖μ‖_K · ‖λ‖_K = ‖μ‖_K · ‖s_λ‖_K.

In particular, any point evaluation functional δ_x ∈ L, x ∈ R^d, is continuous
on F, since we have

    |δ_x(f)| ≤ ‖δ_x‖_K · ‖f‖_K = ‖f‖_K    for all f ∈ F,

where ‖δ_x‖_K = 1 (see Example 8.13). 

Resorting to functional analysis, we see that F is a reproducing kernel


Hilbert space. But let us first recall some facts about reproducing kernels [1].

Definition 8.20. Let H denote a Hilbert space of functions f : Rd −→ R,


with inner product (·, ·)H . Then, a function K : Rd × Rd −→ R is said to be
a reproducing kernel for H, if K(·, x) ∈ H, for all x ∈ Rd , and

(K(·, x), f )H = f (x) for all f ∈ H and all x ∈ Rd .

Next, we prove an important result concerning the characterization of


reproducing kernel Hilbert spaces. To this end, we rely on the representation
theorem of Fréchet3 -Riesz4 , giving another standard result from functional
analysis, which can be found, for instance, in [33, Section 8.3].

Theorem 8.21. (Fréchet-Riesz representation theorem). Let H be a


Hilbert space. Then, there is for any bounded linear functional ϕ : H −→ R
a unique representer uϕ ∈ H, satisfying

ϕ(u) = (uϕ , u)H for all u ∈ H.

The mapping ϕ −→ uϕ is linear, bijective and isometric from H onto H. 


3
Maurice René Fréchet (1878-1973), French mathematician
4
Frigyes Riesz (1880-1956), Hungarian mathematician

Theorem 8.22. A Hilbert space H of functions f : Rd −→ R has a repro-


ducing kernel, if and only if all point evaluation functionals δx : H −→ R,
for x ∈ Rd , are continuous on H.

Proof. Suppose K is a reproducing kernel for H. Then, by the estimate

    |δ_x(f)| = |f(x)| = |(K(·, x), f)_H| ≤ ‖K(·, x)‖_H · ‖f‖_H    for x ∈ R^d

any point evaluation functional δx is bounded, and so continuous, on H.


As for the converse, suppose that all point evaluation functionals δx are
continuous on H. Then, due to the Fréchet-Riesz representation theorem,
Theorem 8.21, there is, for any x ∈ Rd , a unique function kx ∈ H satisfying

f (x) = δx (f ) = (kx , f )H for all f ∈ H,

and so the function K(·, x) := kx is a reproducing kernel for H. 

Remark 8.23. A reproducing kernel K for H is unique. Indeed, if K̃ is


another reproducing kernel for H, then we have

(K̃(·, x), f )H = f (x) for all f ∈ H and all x ∈ Rd .

For kx := K(·, x) and k̃x := K̃(·, x) this implies

(kx − k̃x , f )H = 0 for all f ∈ H and all x ∈ Rd ,

and so kx ≡ k̃x , i.e., K ≡ K̃. 

8.2.3 The Madych-Nelson Theorem

Now we are in a position where we can show that the positive definite function
K ∈ PDd is the (unique) reproducing kernel for the Hilbert space F ≡ FK .
To this end, we rely on the seminal works [45, 46, 47] of Madych and Nelson.

Theorem 8.24. (Madych-Nelson, 1983).


For any dual functional λ ∈ D, we have the representation

λ(f ) = (λy K(·, y), f )K for all f ∈ F. (8.23)

Proof. For λ ∈ L and sμ = μy K(·, y) ∈ S the representation

(λy K(·, y), sμ )K = (sλ , sμ )K = (λ, μ)K = λx μy K(x, y) = λ(sμ ) (8.24)

holds, with the inner products (·, ·)K : L × L −→ R and (·, ·)K : S × S −→ R
in (8.17) and in (8.18). By continuous extension of the representation (8.24)
from L to D and from S to F we already obtain the statement in (8.23). 

From the result of Theorem 8.24, we note the following observation.



Remark 8.25. Any dual functional λ ∈ D is, according to (8.23) and in the
sense of the Fréchet-Riesz representation theorem, Theorem 8.21, uniquely
represented by the element sλ = λy K(·, y) ∈ F. 

Now we can formulate the central result of this section.

Corollary 8.26. Every positive definite function K ∈ PDd is the unique


reproducing kernel of the Hilbert space F ≡ FK generated by K.

Proof. On the one hand, we have, for δx ∈ L , x ∈ Rd , the representation

δxy K(·, y) = K(·, x) ∈ F for all x ∈ Rd .

On the other hand, by letting λ = δx ∈ L in (8.23), we obtain

(K(·, x), f )K = f (x) for all f ∈ F and all x ∈ Rd .

Therefore, K is the reproducing kernel of F according to Definition 8.20. 

Another useful consequence of the Madych-Nelson theorem is as follows.

Corollary 8.27. Every function f ∈ F is continuous on Rd , F ⊂ C (Rd ).

Proof. Recall that we assume continuity for K ∈ PDd . Therefore, by

    |f(x) − f(y)| = |(K(·, x) − K(·, y), f)_K| ≤ ‖f‖_K · ‖K(·, x) − K(·, y)‖_K,

for f ∈ F, and by

    ‖K(·, x) − K(·, y)‖²_K


= (K(·, x), K(·, x))K − 2(K(·, x), K(·, y))K + (K(·, y), K(·, y))K
= K(x, x) − 2K(x, y) + K(y, y)

(cf. Example 8.13) we see that every f ∈ F is a continuous function. 

8.3 Optimality of the Interpolation Method

In this section, we prove further results that directly follow from the Madych-
Nelson theorem, Theorem 8.24. As we show, the proposed Lagrange interpo-
lation method is optimal in two different senses.

8.3.1 Orthogonality and Best Approximation

The first optimality property is based on the Pythagoras theorem.



Corollary 8.28. For X = {x_1, . . . , x_n} ⊂ R^d, F can be decomposed as

    F = S_X ⊕ {f ∈ F | fX = 0},        (8.25)

where S_X^⊥ = {f ∈ F | fX = 0} is the orthogonal complement of S_X in F.
For f ∈ F and the unique interpolant s ∈ S_X to f on X satisfying
sX = fX, the Pythagoras theorem holds, i.e.,

    ‖f‖²_K = ‖s‖²_K + ‖f − s‖²_K.        (8.26)

Proof. For f ∈ F, let s ∈ SX be the unique interpolant to f from SX


satisfying sX = fX. Then, s can, according to (8.16), be represented as
s = λ^y K(·, y), for some dual functional λ ∈ L of the form λ = Σ_{j=1}^n c_j δ_{x_j}.
By the Madych-Nelson theorem, Theorem 8.24, we have

(s, g)K = 0 for all g ∈ F with λ(g) = 0,

i.e., s is perpendicular to the (algebraic) kernel of λ, and so the implication

gX = 0 =⇒ g ⊥ SX

holds. But this implies f − s ⊥ SX , or f − s ∈ SX , since (f − s)X = 0.
Therefore, the stated decomposition with the direct sum in (8.25) holds by

    f = s + (f − s) ∈ S_X ⊕ S_X^⊥

and, moreover,

    ‖f‖²_K = ‖f − s + s‖²_K = ‖f − s‖²_K + 2(f − s, s)_K + ‖s‖²_K = ‖f − s‖²_K + ‖s‖²_K,

whereby the Pythagoras theorem (8.26) is also proven. 

By the result of Corollary 8.28, we can identify the unique interpolant


s∗ ∈ SX to f on X as the orthogonal projection of f onto SX . Therefore, the
interpolant s∗ is, according to Remark 4.2, the unique best approximation
to f from S_X with respect to (F, ‖·‖_K). Note that the projection operator
ΠSX : F −→ SX , f −→ s∗ , satisfies the property

f − ΠSX f = (I − ΠSX )(f ) ⊥ SX for all f ∈ F,



so that the operator I − Π_{S_X} : F −→ S_X^⊥ maps, according to the decom-
position in (8.25), onto the orthogonal complement S_X^⊥ ⊂ F of S_X in F.
The linear operator I − ΠSX is also a projection operator (cf. our general
discussion on orthogonal projections in Section 4.2).
We summarize our observations as follows.

Corollary 8.29. For f ∈ F and X = {x1 , . . . , xn } ⊂ Rd the unique inter-


polant s∗ ∈ SX to f on X, s∗X = fX , satisfies the following properties.
(a) s∗ is the unique orthogonal projection of f ∈ F onto SX .
(b) s* is the unique best approximation to f ∈ F from S_X w.r.t. ‖·‖_K.


Moreover, Corollary 8.28 implies that the interpolant has minimal variation.

Corollary 8.30. For X = {x1 , . . . , xn } ⊂ Rd and fX ∈ Rn , the interpolant


s ∈ S_X satisfying sX = fX is the unique minimizer of the energy functional
‖·‖_K among all interpolants from F to the data fX, i.e.,

    ‖s‖_K ≤ ‖g‖_K    for all g ∈ F with gX = fX.

The interpolant s is uniquely determined by this variational property. 

Now we analyze the stability of the proposed interpolation method. To


this end, we compute the norm of the interpolation operator IX : F −→ SX ,
which maps every f ∈ F to its unique interpolant s ∈ SX satisfying fX = sX .
On this occasion, we recall the definition for the norm of linear operators,
which is in particular for I_X : F −→ S with respect to ‖·‖_K given as

    ‖I_X‖_K = sup_{f∈F\{0}} ‖I_X f‖_K / ‖f‖_K .

Theorem 8.31. For X = {x_1, . . . , x_n} ⊂ R^d, the native space norm ‖I_X‖_K
of the interpolation operator I_X : F −→ S_X is one, i.e.,

    ‖I_X‖_K = 1.

Proof. The variational property in Corollary 8.30 implies

    ‖I_X f‖_K ≤ ‖f‖_K    for all f ∈ F,        (8.27)

and so ‖I_X‖_K ≤ 1. Due to the projection property I_X s = s, for all s ∈ S_X,
equality in (8.27) is attained at any s ∈ S_X, i.e.,

    ‖I_X s‖_K = ‖s‖_K    for all s ∈ S_X,

and therefore we have IX K = 1. 

The above result allows us to draw the following conclusion.

Remark 8.32. By the stability property (8.27) in Theorem 8.31, the pro-
posed interpolation method has minimal condition number w.r.t. ‖·‖_K. 

8.3.2 Norm Minimality of the Pointwise Error Functional

The second optimality property of the proposed interpolation method is con-


cerning the pointwise error

εx (f ) = f (x) − s(x) for x ∈ Rd (8.28)

between f ∈ F and the interpolant s ∈ SX to f on X satisfying sX = fX .


By the Lagrange representation of s in (8.9), the pointwise error functional
ε_x : F −→ R can be written as a linear combination of δ-functionals,

    ε_x = δ_x − Σ_{j=1}^n ℓ_j(x) δ_{x_j} = δ_x − ℓ(x)^T δ_X ∈ L,        (8.29)
j=1

where δ_X := (δ_{x_1}, . . . , δ_{x_n})^T. Moreover, we use the notation

    ℓ(x)^T R(x) = Σ_{j=1}^n ℓ_j(x) K(x, x_j) = Σ_{j=1}^n ℓ_j(x) δ_{x_j}^y K(x, y) = (δ_x, ℓ(x)^T δ_X)_K .

The pointwise error εx (f ) in (8.28) is bounded above as follows.

Corollary 8.33. For f ∈ F and X = {x_1, . . . , x_n} ⊂ R^d, let s ∈ S_X be the
unique interpolant to f on X satisfying sX = fX. Then, for the pointwise
error ε_x(f) in (8.28) the estimate

    |ε_x(f)| ≤ ‖ε_x‖_K · ‖f‖_K        (8.30)

holds, where the norm ‖ε_x‖_K of the error functional can be written as

    ‖ε_x‖²_K = 1 − ℓ(x)^T A_{K,X} ℓ(x) = 1 − ‖ℓ(x)‖²_{A_{K,X}},        (8.31)

by using the positive definite matrix A_{K,X} in (8.6), so that

    0 ≤ ‖ε_x‖_K ≤ 1    for all x ∈ R^d.        (8.32)

The error estimate in (8.30) is sharp, where equality holds for the function

    f_x = ε_x^y K(·, y) ∈ F.        (8.33)

Proof. By the Madych-Nelson theorem, Theorem 8.24, we have

    ε_x(f) = (ε_x^y K(·, y), f)_K    for all f ∈ F,        (8.34)

so that (8.30) follows directly from (8.34) and the Cauchy-Schwarz inequality.
We compute the norm of the error functional ε_x in (8.29) by

    ‖ε_x‖²_K = (ε_x, ε_x)_K = (δ_x − ℓ(x)^T δ_X , δ_x − ℓ(x)^T δ_X)_K
             = 1 − 2 ℓ(x)^T R(x) + ℓ(x)^T A_{K,X} ℓ(x) = 1 − ℓ(x)^T A_{K,X} ℓ(x)

(cf. Example 8.13), where we use the representation in (8.8). The upper bound
for ‖ε_x‖_K in (8.32) follows from the positive definiteness of A_{K,X}.
Finally, for the function f_x in (8.33) equality holds in (8.30), since we get

    |ε_x(f_x)| = |(ε_x^y K(·, y), f_x)_K| = (f_x, f_x)_K = (ε_x, ε_x)_K = ‖ε_x‖_K · ‖f_x‖_K

from the Madych-Nelson theorem, and so the estimate in (8.30) is sharp. □
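
Numerically, the bound (8.30) is easy to evaluate: the following sketch (Gaussian kernel and random points are arbitrary assumptions, not from the text) computes ‖ε_x‖_K from (8.31), using that ℓ(x)^T A_{K,X} ℓ(x) = ℓ(x)^T R(x) by (8.8).

```python
import numpy as np

def kernel(x, y):
    return np.exp(-np.sum((x[:, None, :] - y[None, :, :])**2, axis=-1))

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(30, 2))
A = kernel(X, X)

x = rng.uniform(0.0, 1.0, size=(200, 2))    # evaluation points
R = kernel(x, X)                            # rows R(x)^T = (K(x, x_1), ..., K(x, x_n))
L = np.linalg.solve(A, R.T).T               # Lagrange values l(x), from A_{K,X} l(x) = R(x)
P2 = 1.0 - np.einsum('ij,ij->i', L, R)      # ||eps_x||_K^2 = 1 - l(x)^T R(x), cf. (8.31)
print("range of ||eps_x||_K^2:", P2.min(), P2.max())   # in [0, 1], up to round-off
```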
Finally, we show the pointwise optimality of the interpolation method.
To this end, we regard quasi-interpolants of the form

    s_ℓ = ℓ^T fX = Σ_{j=1}^n ℓ_j f(x_j)    for ℓ = (ℓ_1, . . . , ℓ_n)^T ∈ R^n

along with their associated pointwise error functionals

    ε_x^{(ℓ)} = δ_x − Σ_{j=1}^n ℓ_j δ_{x_j} = δ_x − ℓ^T δ_X ∈ L    for x ∈ R^d.        (8.35)

For the norm ‖ε_x^{(ℓ)}‖_K we have, like in (8.31), the representation

    ‖ε_x^{(ℓ)}‖²_K = 1 − 2 ℓ^T R(x) + ℓ^T A_{K,X} ℓ .

Now let us minimize the norm ‖ε_x^{(ℓ)}‖_K under variation of the coefficients
ℓ ∈ R^n. This leads us directly to the unconstrained optimization problem

    ‖ε_x^{(ℓ)}‖²_K = 1 − 2 ℓ^T R(x) + ℓ^T A_{K,X} ℓ −→ min_{ℓ∈R^n} !        (8.36)

whose unique solution is the solution to the linear system A_{K,X} ℓ = R(x).
But this already implies the pointwise optimality, that we state as follows.

Corollary 8.34. Let X = {x_1, . . . , x_n} ⊂ R^d and x ∈ R^d. Then, the point-
wise error functional ε_x in (8.29) is norm-minimal among all error func-
tionals of the form (8.35), where

    ‖ε_x‖_K < ‖ε_x^{(ℓ)}‖_K    for all ℓ ∈ R^n with A_{K,X} ℓ ≠ R(x),

i.e., ε_x is the unique solution to the optimization problem (8.36). □

8.4 Orthonormal Systems, Convergence, and Updates


In this section, we discuss important numerical aspects of the interpolation
method. First, we construct countable systems {uj }j∈N ⊂ S of orthonormal
bases in S ⊂ F, in short, orthonormal systems. On this occasion, we recall our
discussion in Sections 4.2 and 6.2, where we have already explained important
advantages of orthonormal systems. In particular, orthonormal systems and
their associated orthogonal projection operators Π : F −→ S lead us to
efficient and numerically stable approximation methods.

8.4.1 Construction of Orthonormal Systems


Our following construction of orthonormal systems in S ⊂ F relies on a fa-
miliar result from linear algebra, the spectral theorem for symmetric matrices.
Proposition 8.35. For X = {x1 , . . . , xn } ⊂ Rd , let
AK,X = QT DQ
be the eigendecomposition of the symmetric positive definite kernel matrix
AK,X ∈ Rn×n in (8.6), with an orthogonal factor Q ∈ Rn×n and a diagonal
matrix D = diag(σ1 , . . . , σn ) ∈ Rn×n , whose elements σ1 ≥ . . . ≥ σn > 0 are
the positive eigenvalues of AK,X . Then, the functions
    u_j(x) = e_j^T D^{−1/2} Q · R(x)    for 1 ≤ j ≤ n        (8.37)

form an orthonormal basis of S_X, where D^{−1/2} = diag(σ_1^{−1/2}, . . . , σ_n^{−1/2}).
Proof. By the representation in (8.37), for R(x) = (K(x, x_j))_{1≤j≤n} ∈ R^n,
any u_j is expressed as a linear combination of the basis functions
{K(·, x_j)}_{j=1}^n ⊂ S_X, and so u_j ∈ S_X. By Proposition 8.14, we obtain the
orthonormality relation

    (u_j, u_k)_K = e_j^T D^{−1/2} Q A_{K,X} Q^T D^{−1/2} e_k = ⟨e_j, e_k⟩ = δ_{jk}

for all 1 ≤ j, k ≤ n. □
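
The construction of Proposition 8.35 is easy to reproduce numerically; the following sketch (Gaussian kernel and random nodes are arbitrary assumptions) builds the functions u_j from an eigendecomposition and checks their orthonormality via the matrix identity D^{−1/2} Q A_{K,X} Q^T D^{−1/2} = I used in the proof.

```python
import numpy as np

kernel = lambda x, y: np.exp(-np.sum((x[:, None, :] - y[None, :, :])**2, axis=-1))

rng = np.random.default_rng(4)
X = rng.uniform(0.0, 3.0, size=(20, 2))
A = kernel(X, X)

w, V = np.linalg.eigh(A)                 # A = V diag(w) V^T, i.e. Q = V^T, D = diag(w)
B = V @ np.diag(w**-0.5)                 # u_j(x) = (B^T R(x))_j, cf. (8.37)

G = B.T @ A @ B                          # Gram matrix ((u_j, u_k)_K) by Proposition 8.14
print(np.abs(G - np.eye(len(X))).max())  # ~ 0 up to round-off

x = rng.uniform(0.0, 3.0, size=(5, 2))   # evaluating the orthonormal basis at new points
U = kernel(x, X) @ B                     # U[i, j] = u_j(x_i)
```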
Now we develop for s, s̃ ∈ S_X useful representations of their inner products
(s, s̃)_K and norms ‖s‖_K. To this end, we work with the inner product

    ⟨c, d⟩_{A_{K,X}^{−1}} = c^T A_{K,X}^{−1} d    for c, d ∈ R^n,

which is generated by the positive definite inverse A_{K,X}^{−1} of A_{K,X}.

Proposition 8.36. For X = {x_1, . . . , x_n} ⊂ R^d, we have the representations

    (s, s̃)_K = ⟨sX , s̃X⟩_{A_{K,X}^{−1}}        (8.38)

    ‖s‖_K = ‖sX‖_{A_{K,X}^{−1}}        (8.39)

for all s, s̃ ∈ SX .
Proof. For s, s̃ ∈ SX we have the Lagrange representations

n 
n
s(x) = sX , (x) = s(xj ) j (x) and s̃(x) = s̃X , (x) = s̃(xk ) k (x)
j=1 k=1

according to (8.9) in Proposition 8.4. From this and Proposition 8.16, we get

n 
n
(s, s̃)K = s(xj )s̃(xk )( j , k )K = s(xj )s̃(xk )a−1
jk = sX , s̃X A−1 ,
K,X
j,k=1 j,k=1

and so (8.38) holds. For s = s̃ in (8.38), we have (8.39). 



8.4.2 On the Convergence of the Interpolation Method

In this section, we develop rather elementary convergence results for the


proposed kernel-based interpolation method. We use the following notations.
By X we denote a finite set of pairwise distinct interpolation points, where we
further assume that X is contained in a compact domain Ω ⊂ Rd , i.e., X ⊂ Ω.
Moreover, we denote by sf,X ∈ SX the unique interpolant to f : Ω −→ R on
X, where we assume that the function f is contained in the linear subspace

FΩ := span {K(·, y) | y ∈ Ω} ⊂ F

i.e., f ∈ FΩ . Finally,
    h_{X,Ω} := sup_{y∈Ω} min_{x∈X} ‖y − x‖₂        (8.40)

is the fill distance of the interpolation points X in the compact set Ω.
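
In computations, the sup over Ω is typically replaced by a maximum over a fine discretization of Ω; the following sketch (domain, point set, and grid are arbitrary choices, not from the text) estimates h_{X,Ω} in this way for Ω = [0,1]².

```python
import numpy as np

def fill_distance(X, Y):
    """Max over y in Y of the Euclidean distance from y to its nearest point in X."""
    d = np.sqrt(np.sum((Y[:, None, :] - X[None, :, :])**2, axis=-1))
    return d.min(axis=1).max()

rng = np.random.default_rng(5)
X = rng.uniform(0.0, 1.0, size=(100, 2))                 # interpolation points
g = np.linspace(0.0, 1.0, 101)
Y = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)  # grid discretizing Omega
print("approximate fill distance h_{X,Omega}:", fill_distance(X, Y))
```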


In the following discussion, we analyze for a nested sequence

X1 ⊂ X2 ⊂ X3 ⊂ . . . ⊂ Xn ⊂ . . . ⊂ Ω (8.41)

of (finite) point sets Xn ⊂ Ω, for n ∈ N, the asymptotic behaviour of the


minimal distances

    η_K(f, S_{X_n}) := ‖s_{f,X_n} − f‖_K = inf_{s∈S_{X_n}} ‖s − f‖_K    for f ∈ F_Ω        (8.42)

for n → ∞. Moreover, we work with the (reasonable) assumption

    h_{X_n,Ω} ↘ 0    for n → ∞        (8.43)

concerning the asymptotic geometric distribution of the interpolation points


Xn . Under this assumption, we already obtain our first convergence result.

Theorem 8.37. Let (Xn )n∈N ⊂ Ω be a nested sequence of interpolation


points, as in (8.41). Moreover, suppose that the associated fill distances hXn ,Ω
have asymptotic decay hXn ,Ω  0, as in (8.43). Then, we have, for any
f ∈ FΩ , the convergence

    η_K(f, S_{X_n}) = ‖s_{f,X_n} − f‖_K −→ 0    for n → ∞.

Proof. Suppose y ∈ Ω. Then, according to our assumption in (8.43) there is


a sequence (xn )n∈N ⊂ Ω of interpolation points xn ∈ Xn satisfying

    ‖y − x_n‖₂ ≤ h_{X_n,Ω} −→ 0    for n → ∞.

Moreover, we have
    η_K²(K(·, y), S_{X_n}) ≤ ‖K(·, x_n) − K(·, y)‖²_K = 2 − 2K(y, x_n) −→ 0

for n → ∞, due to the continuity of K and the normalization K(w, w) = 1.



Now for Y = {y1 , . . . , yN } ⊂ Ω and c = (c1 , . . . , cN )T ∈ RN we consider


the function
N
fc,Y = cj K(·, yj ) ∈ SY ⊂ FΩ .
j=1

(j)
For any yj ∈ Y , 1 ≤ j ≤ N , we take a sequence (xn )n∈N ⊂ Ω of interpolation
(j) (j)
points xn ∈ Xn satisfying yj − xn 2 ≤ hXn ,Ω . Moreover, we consider the
functions

N
sc,n = n ) ∈ SXn
cj K(·, x(j) for n ∈ N.
j=1

Then, we have
 
N 
  
ηK (fc,Y , SXn ) ≤ sc,n − fc,Y K =


n ) − K(·, yj ) 
cj K(·, x(j)
 j=1 
K

N
≤ |cj | · K(·, x(j)
n ) − K(·, yj )K −→ 0 for n → ∞.
j=1

This proves the convergence for the dense subset

SΩ := {fc,Y ∈ SY | |Y | < ∞} ⊂ FΩ .

By continuous extension, we finally obtain the stated convergence on FΩ . 

We remark that the proven convergence in Theorem 8.37 can be arbitrarily


slow. Indeed, for any monotonically decreasing zero sequence (ηn )n∈N of non-
negative real numbers, i.e., ηn  0 for n → ∞, there is a nested sequence of
point sets (Xn )n∈N ⊂ Ω, as in (8.41), and a function f ∈ FΩ satisfying

ηK (f, SXn ) ≥ ηn for all n ∈ N.

For the proof of this statement, we refer to Exercise 8.64.


Nevertheless, we can prove convergence rates for norms that are weaker
than the native space norm  · K . To make a prototypical case, we restrict
ourselves to the maximum norm  · ∞ (cf. Exercise 8.62). On this occasion,
recall that any function f ∈ F is continuous, according to Corollary 8.27. In
particular, we have FΩ ⊂ C (Ω), and so  · ∞ is well-defined on FΩ .
For our next convergence result we require the following lemma.

Lemma 8.38. Let K(x, y) = Φ(x − y) be positive definite, K ∈ PDd , where


Φ : Rd −→ R is even and Lipschitz continuous with Lipschitz constant L > 0.
Then, we have, for any f ∈ FΩ , the estimate

    |f(x) − f(y)|² ≤ 2L‖x − y‖₂ · ‖f‖²_K    for all x, y ∈ Ω.



Proof. Suppose f ∈ F_Ω satisfies ‖f‖_K = 1 (without loss of generality). Then,

    |f(x) − f(y)|² = |(f, Φ(· − x) − Φ(· − y))_K|² ≤ ‖Φ(· − x) − Φ(· − y)‖²_K
                   = 2Φ(0) − 2Φ(x − y) ≤ 2L‖x − y‖₂ ,

where we use the reproduction property of K in FΩ . 

Lemma 8.38 immediately implies the following error estimate.

Theorem 8.39. Let K(x, y) = Φ(x−y) be positive definite, K ∈ PDd , where


Φ : Rd −→ R is even and Lipschitz continuous with Lipschitz constant L > 0.
Moreover, let X ⊂ Ω be a finite subset of Ω ⊂ Rd . Then, we have, for any
f ∈ FΩ , the error estimate
    ‖s_{f,X} − f‖_∞ ≤ √(2L h_{X,Ω}) · ‖f‖_K .

Proof. Suppose y ∈ Ω. Then, there is some x ∈ X satisfying ‖y − x‖₂ ≤ h_{X,Ω}.
Then, from Lemma 8.38, and by (s_{f,X} − f)(x) = 0, we can conclude

    |(s_{f,X} − f)(y)|² ≤ 2L h_{X,Ω} · ‖f‖²_K    for all y ∈ Ω,

where we use the estimate ‖s_{f,X} − f‖_K ≤ ‖f‖_K. □

From Theorem 8.39 we finally obtain our next convergence result.

Corollary 8.40. Let K(x, y) = Φ(x − y) be positive definite, K ∈ PDd ,


where Φ : Rd −→ R is even and Lipschitz continuous with Lipschitz constant
L > 0. Moreover, let (Xn )n∈N ⊂ Ω be a nested point set of interpolation
points, as in (8.41). Finally, assume for the associated fill distances hXn ,Ω the
asymptotic decay hXn ,Ω  0, as in (8.43). Then, we have, for any f ∈ FΩ ,
the uniform convergence

    ‖s_{f,X_n} − f‖_∞ = O( h_{X_n,Ω}^{1/2} )    for n → ∞

at convergence rate 1/2. 
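
The following small experiment (kernel, test function, and point sets are illustrative assumptions, not from the text) mirrors the setting of Theorem 8.39 and Corollary 8.40 for the Lipschitz continuous kernel Φ(x) = e^{−|x|} on Ω = [0,1]: the maximum error of the interpolant decreases as the fill distance h_{X_n,Ω} of the nested uniform point sets decreases.

```python
import numpy as np

kernel = lambda x, y: np.exp(-np.abs(x[:, None] - y[None, :]))   # Phi(x) = exp(-|x|)
f = lambda x: np.cos(3.0 * x) + x**2        # test function (assumed sufficiently regular)
Y = np.linspace(0.0, 1.0, 2001)             # fine grid approximating the sup norm

for n in [5, 9, 17, 33, 65]:
    X = np.linspace(0.0, 1.0, n)            # nested sets: X_5 in X_9 in X_17 in ...
    c = np.linalg.solve(kernel(X, X), f(X)) # interpolation coefficients
    err = np.max(np.abs(kernel(Y, X) @ c - f(Y)))
    h = 0.5 / (n - 1)                       # fill distance of the uniform grid
    print(f"n = {n:3d},  h_X = {h:.4f},  max error = {err:.3e}")
```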

We remark that we can, under more restrictive assumptions on Φ ∈ PDd ,


prove convergence rates that are even higher than those in Corollary 8.40.
For a prototypical case, we refer to Exercise 8.66.

8.4.3 Update Strategies

Now we develop update strategies for the proposed interpolation method. To


further explain this task, let us regard a set Xn = {x1 , . . . , xn } ⊂ Rd of
n ∈ N pairwise distinct points. In this case, one update step is initiated by
adding a new point xn+1 ∈ Rd \ Xn to Xn . Typically, the insertion of xn+1
is motivated by the purpose to improve the quality of the approximation to

f ∈ F (according to our discussion in Section 8.4.2). When adding a new


point xn+1 , this leads by
Xn+1 := Xn ∪ {xn+1 } for n ∈ N (8.44)
to an updated set of interpolation points Xn+1 . Note that the update of Xn
in (8.44) yields one additional interpolation condition
s(xn+1 ) = f (xn+1 )
in Problem 8.1, and so this requires an update for the interpolation method.
But we essentially wish to use the data of the interpolant sn ∈ SXn of f
on Xn , sXn = fXn , to efficiently compute the new data for the interpolant
sn+1 ∈ SXn+1 of f on Xn+1 , sXn+1 = fXn+1 . Any method which performs
such an efficient update for the relevant data is called an update strategy.
By iteration on the update step, we obtain, starting with an initial point
set X1 = {x1 }, for some x1 ∈ Rd , a nested sequence
X1 ⊂ X2 ⊂ X3 ⊂ . . . ⊂ Xn ⊂ Rd (8.45)
of subsets Xk , containing |Xk | = k interpolation points each. Moreover, two
subsequent point sets Xk ⊂ Xk+1 differ only about the point xk+1 , so that
{xk+1 } = Xk+1 \ Xk for 1 ≤ k ≤ n − 1.
Now we discuss the performance of selected update strategies. We begin
with updates on the Lagrange bases. On this occasion, we introduce another
orthonormal system for S ⊂ F.
Theorem 8.41. Let (X_m)_{m=1}^n be a nested sequence of point sets of the
form (8.45). Moreover, let ℓ^{(m)} = {ℓ_1^{(m)}, . . . , ℓ_m^{(m)}} ⊂ S_{X_m} be their corres-
ponding Lagrange bases, for 1 ≤ m ≤ n, satisfying

    ℓ_j^{(m)}(x_k) = δ_{jk}    for 1 ≤ j, k ≤ m.

Then, the sequence
    ℓ_1^{(1)}, . . . , ℓ_n^{(n)}

of the leading Lagrange basis functions forms an orthogonal system in S_{X_n},
where
    (ℓ_j^{(j)}, ℓ_k^{(k)})_K = δ_{jk} · a_{kk}^{−1}    for 1 ≤ j, k ≤ n

with the diagonal entries a_{kk}^{−1} of the inverse A_{K,X_k}^{−1} of A_{K,X_k}, 1 ≤ k ≤ n.

Proof. We distinguish two cases.
Case 1: For j = k the statement follows from Proposition 8.16.
Case 2: Suppose j ≠ k.
Assuming j < k (without loss of generality), we have

    ℓ_k^{(k)}(x_j) = 0    for all x_j ∈ X_j ⊂ X_k,

i.e., ℓ_k^{(k)} ⊥ S_{X_j}. In particular, (ℓ_k^{(k)}, ℓ_j^{(j)})_K = 0. □
8.4 Orthonormal Systems, Convergence, and Updates 299

Next, we develop update strategies for the Cholesky5 decomposition of


the symmetric positive definite interpolation matrix AK,X in (8.6). To this
end, we describe one update step, starting with Xn = {x1 , . . . , xn }. In the
following discussion, it is convenient to use the abbreviation An := AK,Xn .
Now our aim is to efficiently compute the coefficients

    c^(n+1) = (c_1^(n+1), . . . , c_{n+1}^(n+1))^T ∈ R^{n+1}

of the interpolant

    s_{n+1} = Σ_{j=1}^{n+1} c_j^(n+1) K(·, x_j) ∈ S_{X_{n+1}}

to f on Xn+1 via the solution to the linear system

An+1 c(n+1) = fXn+1

from the coefficients c(n) ∈ Rn of the previous interpolant sn ∈ SXn to f


on Xn . On this occasion, we recall the Cholesky decomposition for sym-
metric positive definite matrices, which should be familiar from numerical
mathematics (see e.g. [57, Theorem 3.6]). But let us first introduce lower
unitriangular matrices.

Definition 8.42. A lower unitriangular matrix L ∈ Rn×n has the form


        ⎡ 1                              ⎤
        ⎢ l_21    1                      ⎥
    L = ⎢ l_31    l_32    1              ⎥
        ⎢  ⋮               ⋱    ⋱        ⎥
        ⎣ l_n1    ⋯      ⋯   l_n,n−1   1 ⎦

i.e., we have ljj = 1 for the diagonal entries of L, 1 ≤ j ≤ n, and vanishing


entries above the diagonal, i.e., ljk = 0 for all 1 ≤ j < k ≤ n.

Theorem 8.43. Every symmetric positive definite matrix A has a unique


factorization of the form
A = LDLT (8.46)
with a lower unitriangular matrix L and a diagonal D = diag(d1 , . . . , dn )
with positive diagonal entries d1 , . . . , dn > 0. 

For a diagonal matrix


D = diag(d_1 , . . . , d_n ) with positive diagonal entries,
we let D^{1/2} := diag(√d_1 , . . . , √d_n ), so that D^{1/2} · D^{1/2} = D. Now we can
introduce the Cholesky decomposition.
5
André-Louis Cholesky (1875-1918), French mathematician
300 8 Kernel-based Approximation

Definition 8.44. For a symmetric positive definite matrix A in (8.46), the


unique factorization
A = L̄L̄T
with factor L̄ := L · D1/2 , is called the Cholesky decomposition of A.
Now we can describe the Cholesky update. We start with the Cholesky
decomposition
An = L̄n L̄Tn (8.47)
of An = AK,Xn . When adding one interpolation point xn+1 ∈ Rd \ Xn to Xn ,
we wish to determine the Cholesky decomposition of An+1 := AK,Xn+1 for
the interpolation points Xn+1 = Xn ∪ {xn+1 }. To this end, we can use the
Cholesky decomposition of An in (8.47). In the following discussion, we let
L̄_n := L_n · D_n^{1/2} for n ∈ N.
Theorem 8.45. For Xn = {x1 , . . . , xn } ⊂ Rd , let An = AK,Xn be the inter-
polation matrix in (8.6), whose Cholesky decomposition is as in (8.47). Then,
for An+1 = AK,Xn+1 , Xn+1 = Xn ∪ {xn+1 }, the Cholesky decomposition
An+1 = L̄n+1 L̄Tn+1 (8.48)
is given by the Cholesky factor
    L̄_{n+1} = ⎡ L̄_n                   0                            ⎤  ∈ R^{(n+1)×(n+1)} ,    (8.49)
              ⎣ S_n^T D_n^{−1/2}    (1 − S_n^T D_n^{−1} S_n)^{1/2}  ⎦
where Sn ∈ Rn is the unique solution of the triangular system Ln Sn = Rn ,
for Rn := R(xn+1 ) = (K(x1 , xn+1 ), . . . , K(xn , xn+1 ))T ∈ Rn .
Proof. The matrix A_{n+1} has the form

    A_{n+1} = ⎡ A_n      R_n ⎤
              ⎣ R_n^T     1  ⎦

and, moreover, the decomposition

    A_{n+1} = ⎡ L_n               0 ⎤   ⎡ D_n    0                        ⎤   ⎡ L_n^T    D_n^{−1} S_n ⎤
              ⎣ S_n^T D_n^{−1}    1 ⎦ · ⎣ 0      1 − S_n^T D_n^{−1} S_n   ⎦ · ⎣ 0        1            ⎦    (8.50)
holds, as we can verify directly by multiplying the factors.
Now note that the three matrix factors on the right hand side in (8.50)
have the required form of the unique decomposition An+1 = Ln+1 Dn+1 LTn+1
for An+1 , according to Theorem 8.43. Therefore, we have in particular
 
    L_{n+1} = ⎡ L_n               0 ⎤  ∈ R^{(n+1)×(n+1)}
              ⎣ S_n^T D_n^{−1}    1 ⎦

and D_{n+1} = diag(d_1 , . . . , d_n , 1 − S_n^T D_n^{−1} S_n ) ∈ R^{(n+1)×(n+1)} .
But this immediately yields the Cholesky decomposition in (8.48) with
the Cholesky factor L̄_{n+1} = L_{n+1} · D_{n+1}^{1/2}, for which we can verify the stated
form in (8.49) by multiplying the factors. □
8.4 Orthonormal Systems, Convergence, and Updates 301

Now let us discuss the computational complexity of the Cholesky update.


Essentially, we need to determine the vector Sn in (8.49), which can be com-
puted by forward substitution as the solution of the triangular system in
O(n2 ) steps. This allows us to compute the required entries in the last row
of the Cholesky factor L̄n+1 in (8.49) in O(n) steps. Altogether, we only re-
quire at most O(n2 ) steps for the Cholesky update. In contrast, a complete
Cholesky decomposition of An+1 without using the Cholesky factor L̄n of An
costs O(n3 ) floating point operations (flops).
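
For illustration, one such update step for the Cholesky factor can be sketched in Python along the lines of Theorem 8.45. This is a sketch under the assumption of a normalized kernel with Φ(0) = 1 (so that K(x, x) = 1), as in the matrix form above; the function name and interface are ours, not from the text.

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_update(L_bar, K, X, x_new):
    """Extend the Cholesky factor L_bar of A_n = (K(x_k, x_j)) to the factor of
    A_{n+1} after adding the point x_new, following the block form (8.49).
    Assumes K(x, x) = 1, i.e. Phi(0) = 1."""
    R = np.array([K(x, x_new) for x in X])       # R_n = (K(x_j, x_new))_j
    s = solve_triangular(L_bar, R, lower=True)   # forward substitution, O(n^2);
                                                 # s equals S_n^T D_n^{-1/2} as a row
    gamma = np.sqrt(1.0 - s @ s)                 # = (1 - S_n^T D_n^{-1} S_n)^{1/2}
    n = L_bar.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L_bar
    L_new[n, :n] = s
    L_new[n, n] = gamma
    return L_new
```

Since only one triangular solve and one row assembly are performed, the cost of this sketch is O(n²), in agreement with the operation count above.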
We compute the coefficients c^(n+1) = (c_1^(n+1), . . . , c_{n+1}^(n+1))^T ∈ R^{n+1} of the
interpolant

    s_{n+1} = Σ_{j=1}^{n+1} c_j^(n+1) K(·, x_j) ∈ S_{X_{n+1}}

to f on Xn+1 via the solution of the linear equation system

An+1 c(n+1) = fXn+1 (8.51)

efficiently as follows. To this end, we assume the coefficients c(n) ∈ Rn of


the previous interpolant sn ∈ SXn to f on Xn to be known. Moreover, we
employ the Cholesky decomposition An+1 = L̄n+1 L̄Tn+1 of An+1 to compute
the solution c(n+1) ∈ Rn+1 of (8.51) in two steps. This is done as follows.
(a) Solve the system L̄n+1 d(n+1) = fXn+1 by forward substitution.
(b) Solve the system L̄Tn+1 c(n+1) = d(n+1) by backward substitution.
Computational methods for solving (a) and (b) should be familiar from
numerical mathematics. The numerical solution of the triangular systems
in (a) and (b) require O(n2 ) flops each. But we can entirely avoid the com-
putational costs in (a). To this end, we take a closer look at the two systems
in (a) and (b).
The system in (a), L̄n+1 d(n+1) = fXn+1 , has the form
    ⎡ L̄_n                   0                            ⎤   ⎡ d^(n)           ⎤   ⎡ f_{X_n}     ⎤
    ⎣ S_n^T D_n^{−1/2}    (1 − S_n^T D_n^{−1} S_n)^{1/2}  ⎦ · ⎣ d_{n+1}^(n+1)  ⎦ = ⎣ f(x_{n+1})  ⎦ .

Note that we have already determined the solution d^(n) ∈ R^n of the
triangular system L̄_n d^(n) = f_{X_n} with the computation of the interpolant s_n,
whereby we obtain the last coefficient in d^(n+1) by

    d_{n+1}^(n+1) = ( f(x_{n+1}) − S_n^T D_n^{−1/2} d^(n) ) / (1 − S_n^T D_n^{−1} S_n)^{1/2} .    (8.52)
But we can avoid the computation of the entry d_{n+1}^(n+1) in (8.52). To see
this, we consider the system in (b), L̄_{n+1}^T c^(n+1) = d^(n+1), which has the form

    ⎡ L̄_n^T    D_n^{−1/2} S_n                    ⎤                ⎡ d^(n)           ⎤
    ⎣ 0         (1 − S_n^T D_n^{−1} S_n)^{1/2}   ⎦ · c^(n+1)  =   ⎣ d_{n+1}^(n+1)   ⎦ .

For the last coefficient in c^(n+1), we have the representation

    c_{n+1}^(n+1) = d_{n+1}^(n+1) / (1 − S_n^T D_n^{−1} S_n)^{1/2}
                  = ( f(x_{n+1}) − S_n^T D_n^{−1/2} d^(n) ) / (1 − S_n^T D_n^{−1} S_n) .

For the computation of the remaining n coefficients in c^(n+1), we apply back-
ward substitution. But in this case, the entry d_{n+1}^(n+1) in (8.52) is not needed.
Therefore, we require for the substitution in (a) no computational costs at
all, while the backward substitution in (b) costs altogether O(n2 ) flops.
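
The coefficient update just described can be sketched as follows, continuing the previous code sketch; again, names and interface are ours and the code is only an illustration of the scheme, not code from the text. It reuses the already computed forward-substitution result d^(n) and performs only the backward substitution in full.

```python
import numpy as np
from scipy.linalg import solve_triangular

def update_coefficients(L_new, d_old, f_new_value):
    """Given the extended Cholesky factor L_new = L_bar_{n+1}, the known
    forward-substitution result d_old (with L_bar_n d_old = f_{X_n}) and the
    new data value f(x_{n+1}), return the updated d and the coefficients c^{(n+1)}."""
    n = L_new.shape[0] - 1
    s, gamma = L_new[n, :n], L_new[n, n]
    d_last = (f_new_value - s @ d_old) / gamma         # last entry of d^{(n+1)}, cf. (8.52), O(n)
    d_new = np.append(d_old, d_last)
    c_new = solve_triangular(L_new.T, d_new, lower=False)   # backward substitution, O(n^2)
    return d_new, c_new
```

Here the last entry of d^(n+1) is computed anyway, since it is needed as input for the next update step; as discussed above, it can be skipped if only c^(n+1) is required.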

8.5 Stability of the Reconstruction Scheme


In this section, we analyze the numerical stability of the kernel-based inter-
polation method. To this end, we first prove basic stability results, before we
discuss the conditioning of the interpolation problem. The investigations of
this section are motivated by the wavelet theory on time-frequency analysis,
where the concept of Riesz stability plays an important role.

8.5.1 Riesz Bases and Riesz Stability

For the special case of kernel-based interpolation from finite data, we can
characterize Riesz bases in a rather straightforward manner: For a finite set
X = {x1 , . . . , xn } ⊂ Rd of pairwise distinct interpolation points, the basis
functions BX = {K(·, xj )}nj=1 ⊂ SX are (obviously) a Riesz basis of SX ,
where we have the Riesz stability estimate

    σ_min(A_{K,X}) ‖c‖₂² ≤ ‖ Σ_{j=1}^n c_j K(·, x_j) ‖²_K ≤ σ_max(A_{K,X}) ‖c‖₂²    (8.53)

for all c = (c1 , . . . , cn )T ∈ Rn , whose Riesz constants are determined by the


smallest eigenvalue σmin (AK,X ) and the largest eigenvalue σmax (AK,X ) of
AK,X . Indeed, according to Proposition 8.14, we have for G : Rn −→ SX
in (8.21),

    G(c) = Σ_{j=1}^n c_j K(·, x_j),

the representation

    ‖G(c)‖²_K = ‖c‖²_{A_{K,X}} = c^T A_{K,X} c    for all c ∈ R^n.


8.5 Stability of the Reconstruction Scheme 303

Therefore, the stated Riesz stability estimate in (8.53) holds by the Courant6 -
Fischer7 theorem, which should be familiar from linear algebra. In fact, ac-
cording to the Courant-Fischer theorem, the minimal eigenvalue σmin (A) and
the maximal eigenvalue σmax (A) of a symmetric matrix A can be represented
by the minimal and the maximal Rayleigh8 quotient, respectively, i.e.,
    σ_min(A) = min_{c∈R^n\{0}} ⟨c, Ac⟩ / ⟨c, c⟩    and    σ_max(A) = max_{c∈R^n\{0}} ⟨c, Ac⟩ / ⟨c, c⟩.

By Theorem 6.31, any Riesz basis B has a unique dual Riesz basis B̃.
Now let us determine the dual Riesz basis of BX = {K(·, xj )}nj=1 ⊂ SX . To
this end, we rely on the results from Section 6.2.2. By Theorem 6.31, we can
identify the Lagrange basis of SX as dual to BX , i.e., B̃X = { 1 , . . . , n } ⊂ SX .

Theorem 8.46. For any point set X = {x1 , . . . , xn } ⊂ Rd , the Lagrange


basis B̃_X = {ℓ_j}_{j=1}^n is the unique dual Riesz basis of B_X = {K(·, x_j)}_{j=1}^n. In
particular, the orthonormality relation

    (K(·, x_j), ℓ_k)_K = δ_jk ,    (8.54)

holds, for all 1 ≤ j, k ≤ n. Moreover, the stability estimates

    σ_max^{−1}(A_{K,X}) ‖f_X‖₂² ≤ ‖ Σ_{j=1}^n f(x_j) ℓ_j ‖²_K ≤ σ_min^{−1}(A_{K,X}) ‖f_X‖₂² ,    (8.55)

hold, for all f_X ∈ R^n, and we have

    σ_min(A_{K,X}) ‖s‖²_K ≤ ‖s_X‖₂² ≤ σ_max(A_{K,X}) ‖s‖²_K    (8.56)

for all s ∈ SX .

Proof. The orthonormality relation in (8.54) follows from the reproduction


property of the kernel K, whereby

    (K(·, x_j), ℓ_k)_K = ℓ_k(x_j) = δ_jk    for all 1 ≤ j, k ≤ n.

Due to Theorem 6.31, the Lagrange basis B̃_X = {ℓ_j}_{j=1}^n ⊂ S_X is the uniquely
determined dual Riesz basis of B_X = {K(·, x_j)}_{j=1}^n ⊂ S_X.
Moreover, by Proposition 8.36, the representation
 2
 
 n 
 f (xj ) j  T −1
 = fX A−1 for all fX ∈ Rn
2
 = fX AK,X fX
 j=1  K,X
K
6
Richard Courant (1888-1972), German-US American mathematician
7
Ernst Sigismund Fischer (1875-1954), Austrian mathematician
8
John William Strutt, 3. Baron Rayleigh (1842-1919), English physicist
304 8 Kernel-based Approximation

holds. According to the Courant-Fischer theorem, the Rayleigh estimates

    σ_min(A_{K,X}^{−1}) ‖f_X‖₂² ≤ f_X^T A_{K,X}^{−1} f_X ≤ σ_max(A_{K,X}^{−1}) ‖f_X‖₂²

hold for all f_X ∈ R^n. This implies the stability estimate in (8.55), where

    σ_max^{−1}(A_{K,X}) = σ_min(A_{K,X}^{−1})    and    σ_min^{−1}(A_{K,X}) = σ_max(A_{K,X}^{−1}).

Letting f = s ∈ S_X in (8.55), we finally get

    σ_max^{−1}(A_{K,X}) ‖s_X‖₂² ≤ ‖s‖²_K = ‖ Σ_{j=1}^n s(x_j) ℓ_j ‖²_K ≤ σ_min^{−1}(A_{K,X}) ‖s_X‖₂²

for all s ∈ SX , so that the stated estimates in (8.56) hold. 

From the Riesz duality relation between the bases BX = {K(·, xj )}nj=1 and
B̃X = { j }nj=1 , in combination with Theorem 6.31, in particular with (6.22),
we can conclude another important result.

Corollary 8.47. For f ∈ S_X, the representations

    f = Σ_{j=1}^n (f, K(·, x_j))_K ℓ_j = Σ_{j=1}^n (f, ℓ_j)_K K(·, x_j)    (8.57)

hold. 

Remark 8.48. We can also verify the representations in (8.57) for

    f = Σ_{j=1}^n c_j K(·, x_j) = Σ_{j=1}^n f(x_j) ℓ_j ∈ S_X

directly: On the one hand, we get

    c_j = ⟨e_j, c⟩ = e_j^T A_{K,X}^{−1} f_X = f_X^T A_{K,X}^{−1} e_j = (f, ℓ_j)_K

from Proposition 8.16. On the other hand, we have (f, K(·, x_j))_K = f(x_j) by
the reproduction property of the kernel K, for all 1 ≤ j ≤ n. □
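
For illustration, the Riesz bounds (8.53) and the Courant-Fischer argument can be checked numerically: the extreme eigenvalues of A_{K,X} are the Riesz constants, and c^T A_{K,X} c is the squared native norm of the corresponding element of S_X. The following sketch does this for an illustrative Gaussian kernel and random points; kernel and point set are assumptions, not part of the text.

```python
import numpy as np

def gauss_kernel(x, y, c=1.0):
    # K(x, y) = exp(-c * ||x - y||_2^2), a positive definite kernel on R^d
    return np.exp(-c * np.sum((x - y) ** 2))

rng = np.random.default_rng(1)
X = rng.random((30, 2))                                   # interpolation points
A = np.array([[gauss_kernel(xk, xj) for xj in X] for xk in X])

sigma = np.linalg.eigvalsh(A)                             # eigenvalues of A_{K,X}
sigma_min, sigma_max = sigma[0], sigma[-1]

# check (8.53) for a random coefficient vector c:
# sigma_min * ||c||_2^2 <= c^T A c = ||sum_j c_j K(., x_j)||_K^2 <= sigma_max * ||c||_2^2
c = rng.standard_normal(len(X))
energy = c @ A @ c
print(sigma_min * (c @ c) <= energy <= sigma_max * (c @ c))   # True
```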

8.5.2 Conditioning of the Interpolation Problem

In this section, we analyze the conditioning of the interpolation problem,


Problem 8.1. Thereby, we quantify the sensitivity of the interpolation problem
with respect to perturbations of the input data. We restrict ourselves to the
interpolation problem for continuous functions f ∈ C (Ω) on a fixed compact
domain Ω ⊂ Rd , i.e., we only allow interpolation point sets X in Ω, X ⊂ Ω.
In practice, this requirement does usually not lead to severe restrictions.
8.5 Stability of the Reconstruction Scheme 305

For our subsequent analysis, we equip C (Ω) with the maximum norm
 · ∞ . Moreover, for any set of interpolation points X = {x1 , . . . , xn } ⊂ Ω,
we denote by IX : C (Ω) −→ SX the interpolation operator for X, which
assigns every function f ∈ C (Ω) to its unique interpolant s ∈ SX satisfying
s X = fX .

Definition 8.49. For X = {x1 , . . . , xn } ⊂ Ω, the condition number of


the interpolation problem, Problem 8.1, is the smallest constant κ∞ ≡ κ∞,X
satisfying
    ‖I_X f‖_∞ ≤ κ_∞ · ‖f‖_∞    for all f ∈ C(Ω),

i.e., κ_∞ is the operator norm ‖I_X‖_∞ of I_X on C(Ω) w.r.t. ‖ · ‖_∞ .

The operator norm ‖I_X‖_∞ = κ_∞ can be computed as follows.

Theorem 8.50. For X = {x1 , . . . , xn } ⊂ Ω, the norm IX ∞ of the inter-


polation operator IX : C (Ω) −→ SX is given by the Lebesgue constant


n
Λ∞ := max | j (x)| = max  (x)1 , (8.58)
x∈Ω x∈Ω
j=1

i.e., IX ∞ = Λ∞ .

Proof. For any f ∈ C (Ω), let s = IX (f ) ∈ SX ⊂ C (Ω) denote the unique


interpolant to f on X satisfying fX = sX . Using the Lagrange representation
of s in (8.9), we obtain the estimate

    ‖I_X f‖_∞ = ‖s‖_∞ ≤ max_{x∈Ω} Σ_{j=1}^n |ℓ_j(x)| · |f(x_j)| ≤ Λ_∞ · ‖f‖_∞ ,

and therefore ‖I_X‖_∞ ≤ Λ_∞ .


In order to see that ‖I_X‖_∞ ≥ Λ_∞ holds, suppose that the maximum of
Λ_∞ in (8.58) is attained at x* ∈ Ω. Moreover, let g ∈ C(Ω) be a function
with unit norm ‖g‖_{L∞(Ω)} = 1 satisfying the interpolation conditions
g(x_j) = sgn(ℓ_j(x*)), for all 1 ≤ j ≤ n. Then, we have

    ‖I_X g‖_∞ ≥ (I_X g)(x*) = Σ_{j=1}^n ℓ_j(x*) g(x_j) = Σ_{j=1}^n |ℓ_j(x*)| = Λ_∞ ,

and so ‖I_X g‖_∞ ≥ Λ_∞ , which implies ‖I_X‖_∞ ≥ Λ_∞ .


Altogether, the stated identity IX ∞ = Λ∞ holds. 
306 8 Kernel-based Approximation

We can compute bounds on the Lebesgue constant Λ∞ as follows.


Proposition 8.51. For X = {x1 , . . . , xn } ⊂ Ω, we have the estimates

    1 ≤ Λ_∞ ≤ Σ_{j=1}^n √(a_jj^{−1}) ≤ n · √(σ_max(A_{K,X}^{−1}))    (8.59)

for the Lebesgue constant Λ_∞ , where a_jj^{−1} > 0 is, for 1 ≤ j ≤ n, the j-th
diagonal entry in the inverse A_{K,X}^{−1} of A_{K,X}.

Proof. We first prove the upper bound in (8.59). To this end, we assume that
the maximum in (8.58) is attained at x* ∈ Ω. Then, from Example 8.13 and
Proposition 8.16, we get the first upper bound in (8.59) by

    Λ_∞ = Σ_{j=1}^n |ℓ_j(x*)| = Σ_{j=1}^n |δ_{x*}(ℓ_j)|
        ≤ ‖δ_{x*}‖_K · Σ_{j=1}^n ‖ℓ_j‖_K = Σ_{j=1}^n ‖ℓ_j‖_K = Σ_{j=1}^n √(a_jj^{−1}) .

But this immediately implies the second upper bound in (8.59) by

    a_jj^{−1} ≤ σ_max(A_{K,X}^{−1})    for all 1 ≤ j ≤ n.

The lower bound in (8.59) holds by ‖ℓ(x_j)‖₁ = 1, for 1 ≤ j ≤ n. □


We remark that the estimates for Λ∞ in (8.59) are only rather rough.
Optimal bounds for the spectral condition number of AK,X can be found in
the recent work [22] of Diederichs.
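
In practice, the Lebesgue constant Λ∞ in (8.58) can be approximated by maximizing ‖ℓ(x)‖₁ over a fine sample of Ω, using that the Lagrange basis values at x solve A_{K,X} ℓ(x) = (K(x, x_j))_j. The following Python sketch does this for an illustrative Gaussian kernel on Ω = [0, 1]²; kernel, point set and sampling grid are assumptions, not part of the text.

```python
import numpy as np

def lebesgue_constant(K, X, Omega_samples):
    """Approximate the Lebesgue constant (8.58) by maximizing ||l(x)||_1
    over a finite sample of Omega."""
    A = np.array([[K(xk, xj) for xj in X] for xk in X])
    lam = 0.0
    for x in Omega_samples:
        r = np.array([K(x, xj) for xj in X])
        l = np.linalg.solve(A, r)       # Lagrange basis values l_j(x)
        lam = max(lam, np.abs(l).sum())
    return lam

K = lambda x, y: np.exp(-4.0 * np.sum((x - y) ** 2))   # illustrative Gaussian kernel
X = np.random.default_rng(2).random((15, 2))
g = np.linspace(0.0, 1.0, 41)
Omega_samples = np.array([(u, v) for u in g for v in g])
print(lebesgue_constant(K, X, Omega_samples))
```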

8.6 Kernel-based Learning Methods


This section is devoted to one particular variant of linear regression. For the
description of the basic method, we can directly link with our previous inves-
tigations in Sections 2.1 and 2.2. By the introduction of kernel-based learning
methods, we provide an alternative to data fitting by Lagrange interpolation.
Kernel-based learning is particularly relevant if the input data
(X, fX) are very large and contaminated with noise. In such application
scenarios, we wish to reduce, for a suitably chosen linear subspace R ⊂ S,
the empirical ℓ₂ data error

    η_X(f, s) = (1/N) ‖s_X − f_X‖₂²    (8.60)
under variation of s ∈ R. To this end, we construct an approximation s∗ to
f , s∗ ≈ f , which, in addition, satisfies specific smoothness requirements. We
measure the smoothness of s∗ by the native energy functional J : S −→ R,
8.6 Kernel-based Learning Methods 307

    J(s) := ‖s‖²_K    for s ∈ S.    (8.61)

To make a compromise between the data error in (8.60) and the smooth-
ness in (8.61), we consider in the following of this section the minimization
of the cost functional Jα : S −→ R, defined as

Jα (s) = ηX (f, s) + αJ(s) for α > 0. (8.62)

The term αJ(s) in (8.62) is called the regularization term, which penalizes
non-smooth elements s ∈ R, that are admissible for the optimization problem.
Moreover, the regularization parameter α > 0 is used to balance between the
data error ηX (f, s) and the smoothness J(s) of s.
Therefore, we can view the approximation method of this section as a
regularization method (see Section 2.2). According to the jargon of approxi-
mation theory, the proposed method of this section is also referred to as
penalized least squares approximation (see, e.g. [30]).

8.6.1 Problem Formulation and Characterization of Solutions

To explain the basic approximation problem, let X = {x1 , . . . , xN } ⊂ Rd be


a finite point set. Moreover, suppose that Y = {y1 , . . . , yn } is a subset of X,
Y ⊂ X, whose size |Y | = n is much smaller than the size |X| = N of X, i.e.,
n ≪ N. Then, our aim is to reconstruct an unknown function f ∈ F from its
values fX ∈ RN by solving the following unconstrained optimization problem.

Problem 8.52. Let α ≥ 0. Determine from given data fX and Y ⊂ X, an


approximation sα ∈ SY to f satisfying

    (1/N) ‖(f − s_α)_X‖₂² + α ‖s_α‖²_K = min_{s∈S_Y} { (1/N) ‖(f − s)_X‖₂² + α ‖s‖²_K } .    (8.63)

We denote the optimization problem in (8.63) as (Pα ). 

Before we discuss the well-posedness of problem (Pα ), let us first make a


few comments. For α = 0, the optimization problem (P0 ) obviously coincides
with the basic problem of linear least squares approximation [7, 41]. For very
large values of α > 0, the smoothness term αs2K in (8.63) dominates the
data error. In fact, we expect that any sequence {sα }α of solutions sα to (Pα )
converges for α → ∞ to zero, which is the unique minimum of J(s) on SY .
Now we show that the optimization problem (Pα ) has, for any α > 0, a
unique solution. To this end, we choose for the data error the representation
    η_X(f, s) = (1/N) ‖f_X − A_{X,Y} c‖₂²    for s ∈ S_Y ,    (8.64)
with
AX,Y = (K(xk , yj ))1≤k≤N ;1≤j≤n ∈ RN ×n ,
308 8 Kernel-based Approximation

and the coefficient vector c = (c1 , . . . , cn )T ∈ Rn of



    s = Σ_{j=1}^n c_j K(·, y_j) ∈ S_Y .    (8.65)

Therefore, by using the representation

J(s) = s2K = cT AK,Y c for s ∈ S,

we can express the cost functional Jα : S −→ R in (8.63) for (Pα ) as


    J_α(s) := η_X(f, s) + α J(s) = (1/N) ‖f_X − A_{X,Y} c‖₂² + α c^T A_{K,Y} c.    (8.66)
Now we prove the existence and uniqueness for the solution of (Pα ).
Theorem 8.53. Let α ≥ 0. Then, the penalized least squares problem (Pα )
has a unique solution sα ∈ SY of the form (8.65), where the coefficients
cα ∈ Rn of sα are uniquely determined by the solution of the normal equation
 
    ( (1/N) A_{X,Y}^T A_{X,Y} + α A_{K,Y} ) c_α = (1/N) A_{X,Y}^T f_X .    (8.67)
Proof. For any solution sα of (Pα ), the corresponding coefficient vector
cα ∈ Rn minimizes the cost functional Jα in (8.66). Now the gradient of
Jα does necessarily vanish at cα , whereby the representation through the
stated normal equation in (8.67) follows. Note that the coefficient matrix of
the normal equation in (8.67) is, for any α ≥ 0, symmetric positive definite.
Therefore, (Pα ) has a unique solution. 
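
For illustration, the normal equation (8.67) can be assembled and solved directly, as in the following sketch; the kernel, the data, and the parameter choices are illustrative assumptions, not part of the text.

```python
import numpy as np

def penalized_least_squares(K, X, Y, fX, alpha):
    """Solve the normal equation (8.67) for the coefficients c_alpha of
    s_alpha = sum_j c_j K(., y_j) in S_Y."""
    N = len(X)
    A_XY = np.array([[K(xk, yj) for yj in Y] for xk in X])   # N x n
    A_KY = np.array([[K(yk, yj) for yj in Y] for yk in Y])   # n x n
    M = A_XY.T @ A_XY / N + alpha * A_KY
    rhs = A_XY.T @ fX / N
    return np.linalg.solve(M, rhs)

# illustrative data: many noisy samples of a smooth function on [0, 1]
K = lambda x, y: np.exp(-9.0 * (x - y) ** 2)
rng = np.random.default_rng(3)
X = np.sort(rng.random(200))                 # large sample set
Y = X[::20]                                   # much smaller center set, Y subset of X
f = lambda x: np.sin(2 * np.pi * x)
fX = f(X) + 0.1 * rng.standard_normal(len(X))
c_alpha = penalized_least_squares(K, X, Y, fX, alpha=1e-3)
s_alpha = lambda x: sum(c * K(x, y) for c, y in zip(c_alpha, Y))
```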
An alternative characterization for the unique solution sα of (Pα ) follows
from our previous results on Euclidean approximation (see Section 4.1).
Theorem 8.54. For α ≥ 0, the solution sα ≡ sα (f ) ∈ SY of (Pα ) satisfies
the condition
    (1/N) ⟨(f − s_α)_X, s_X⟩ = α (s_α, s)_K    for all s ∈ S_Y .    (8.68)
Proof. We equip F × F with a positive semi-definite symmetric bilinear form,

    [(f, g), (f̃, g̃)]_α := (1/N) ⟨f_X, f̃_X⟩ + α (g, g̃)_K    for f, g, f̃, g̃ ∈ F,

yielding the semi-norm

    |(f, g)|²_α = (1/N) ‖f_X‖₂² + α ‖g‖²_K    for f, g ∈ F.
Now the solution sα ∈ SY of (Pα ) corresponds to the best approximation
(s∗α , s∗α ) ∈ SY × SY to (f, 0) with respect to (F × F, | · |α ), i.e.,
    |(f, 0) − (s*_α, s*_α)|²_α = inf_{s∈S_Y} |(f, 0) − (s, s)|²_α .

According to Remark 4.2, the best approximation s∗α is unique and, moreover,
characterized by the orthogonality condition

[(f, 0) − (s∗α , s∗α ), (s, s)]α = 0 for all s ∈ SY ,

which is for s∗α = sα equivalent to the condition in (8.68). 


The characterizations in the Theorems 8.53 and 8.54 are obviously equi-
valent. Indeed, if we replace s ∈ SY in (8.68) by the standard basis functions
K(·, yk ) ∈ SY , for 1 ≤ k ≤ n, then, the condition in (8.68) can be expressed
as

    (1/N) ⟨(f − s_α)_X, R(y_k)⟩ = α (s_α, K(·, y_k))_K    for all 1 ≤ k ≤ n,    (8.69)

where

    R^T(y_k) = (K(x_1, y_k), . . . , K(x_N, y_k)) = e_k^T A_{X,Y}^T .
For sα in (8.65) with corresponding coefficients cα ∈ Rn , we get

(sα )X = AX,Y cα ∈ RN ,

and so we obtain the normal equation (8.67) from (8.69): On the one hand, the left
hand side in (8.69) can be written as

    (1/N) ⟨(f − s_α)_X, R(y_k)⟩ = (1/N) ( R^T(y_k) f_X − R^T(y_k) A_{X,Y} c_α )
                                = (1/N) ( e_k^T A_{X,Y}^T f_X − e_k^T A_{X,Y}^T A_{X,Y} c_α ) .

On the other hand, the right hand side in (8.69) can be written as

    α (s_α, K(·, y_k))_K = α s_α(y_k) = α e_k^T A_{K,Y} c_α ,

where we used the reproduction property

(sα , K(·, yk ))K = sα (yk )

of the kernel K.

8.6.2 Stability, Sensitivity, Error Bounds, and Convergence

Next, we analyze the stability of the proposed regression method. To this


end, we first bound the minimum of the cost functional in (8.63) as follows.
Theorem 8.55. For any α ≥ 0, the solution sα ≡ sα (f ) ∈ SY of (Pα )
satisfies the stability estimate
    (1/N) ‖(s_α − f)_X‖₂² + α ‖s_α‖²_K ≤ (1 + α) ‖f‖²_K .
310 8 Kernel-based Approximation

Proof. Let s_f ∈ S_Y denote the (unique) interpolant to f at Y satisfying
(s_f − f)_Y = 0. Recall ‖s_f‖_K ≤ ‖f‖_K from Corollary 8.30. Since s_f ∈ S_Y is
admissible for (P_α) and s_α minimizes J_α over S_Y, we have

    (1/N) ‖(s_α − f)_X‖₂² + α ‖s_α‖²_K = (1/N) Σ_{k=1}^N |s_α(x_k) − f(x_k)|² + α ‖s_α‖²_K
        ≤ (1/N) Σ_{k=1}^N |s_f(x_k) − f(x_k)|² + α ‖s_f‖²_K
        ≤ (1/N) Σ_{x∈X\Y} ‖ε_x‖²_K · ‖f‖²_K + α ‖f‖²_K
        = ( (1/N) Σ_{x∈X\Y} ‖ε_x‖²_K + α ) ‖f‖²_K
        ≤ ( (N − n)/N + α ) ‖f‖²_K ≤ (1 + α) ‖f‖²_K ,

where we use the pointwise error estimate in (8.30) along with the uniform
estimate ‖ε_x‖_K ≤ 1 in (8.32). □
Next, we analyze the sensitivity of problem (Pα ) under variation of the
smoothing parameter α ≥ 0. To this end, we first observe that the solution
sα ≡ sα (f ) of problem (Pα ) coincides with that of the target function s0 , i.e.,
sα (s0 ) = sα (f ).
Lemma 8.56. For any α ≥ 0, the solution sα ≡ sα (f ) of (Pα ) satisfies the
following properties.
(a) The Pythagoras theorem, i.e.,

    ‖(f − s_α)_X‖₂² = ‖(f − s_0)_X‖₂² + ‖(s_0 − s_α)_X‖₂² .

(b) The best approximation property s_α(s_0) = s_α(f), i.e.,

    (1/N) ‖(s_0 − s_α)_X‖₂² + α ‖s_α‖²_K = min_{s∈S_Y} { (1/N) ‖(s_0 − s)_X‖₂² + α ‖s‖²_K } .
Proof. Recall that the solution s_α(f) to (P_α) is characterized by the condi-
tions (8.68) in Theorem 8.54, where for α = 0, we obtain the characterization

    (1/N) ⟨(f − s_0)_X, s_X⟩ = 0    for all s ∈ S_Y .    (8.70)

For s ∈ S_Y, this implies the relation

    ‖(f − s)_X‖₂² = ⟨(f − s_0 + s_0 − s)_X, (f − s_0 + s_0 − s)_X⟩
                  = ‖(f − s_0)_X‖₂² + 2 ⟨(f − s_0)_X, (s_0 − s)_X⟩ + ‖(s_0 − s)_X‖₂²
                  = ‖(f − s_0)_X‖₂² + ‖(s_0 − s)_X‖₂² ,
8.6 Kernel-based Learning Methods 311

and so, for s = sα , we get property (a).


To verify property (b), we subtract the representation in (8.70) from that
in (8.68), whereby with
    (1/N) ⟨(s_0 − s_α)_X, s_X⟩ = α (s_α, s)_K    for all s ∈ S_Y    (8.71)
we get the characterization (8.68) for the unique solution sα (s0 ) of (Pα ), so
that statement (b) follows from Theorem 8.54. 

Next, we analyze the convergence of {sα }α for α  0. To this end, we first


prove one stability estimate for sα , along with one error bound for sα − s0 .

Theorem 8.57. Let f ∈ F and α ≥ 0. Then, the solution sα ≡ sα (f ) to


problem (Pα ) has the following properties.
(a) sα satisfies the stability estimate

    ‖s_α‖_K ≤ ‖s_0‖_K .

(b) sα satisfies the error estimate

    (1/N) ‖(s_α − s_0)_X‖₂² ≤ α ‖s_0‖²_K .
Proof. Letting s = s_0 − s_α in (8.71) we get

    (1/N) ‖(s_0 − s_α)_X‖₂² + α ‖s_α‖²_K = α (s_α, s_0)_K .    (8.72)

By using the Cauchy-Schwarz inequality, this implies

    (1/N) ‖(s_0 − s_α)_X‖₂² + α ‖s_α‖²_K ≤ α ‖s_α‖_K · ‖s_0‖_K ,

and so the statements in (a) and (b) hold. □

Finally, we prove the convergence of sα to s0 , for α  0.

Theorem 8.58. The solution s_α of (P_α) converges to the solution s_0 of (P_0),
for α ↘ 0, at the following convergence rates.
(a) With respect to the norm ‖ · ‖_K , we have the convergence

    ‖s_α − s_0‖²_K = O(α)    for α ↘ 0.

(b) With respect to the data error, we have the convergence

    (1/N) ‖(s_α − s_0)_X‖₂² = o(α)    for α ↘ 0.
312 8 Kernel-based Approximation

Proof. To prove (a), first note that

    ‖s‖_X := ‖s_X‖₂    for s ∈ S_Y

is a norm on S_Y. To see the definiteness of ‖ · ‖_X on S_Y, note that ‖s‖_X = 0
implies s_X = 0, in particular s_Y = 0, since Y ⊂ X, in which case s = 0.
Moreover, since S_Y has finite dimension, the norms ‖ · ‖_X and ‖ · ‖_K are
equivalent on S_Y, so that there exists some constant C > 0 satisfying

    ‖s‖_K ≤ C ‖s‖_X    for all s ∈ S_Y .

This, in combination with property (b) in Theorem 8.57, implies (a) by

    ‖s_α − s_0‖²_K ≤ C² ‖(s_α − s_0)_X‖₂² ≤ C² N α ‖s_0‖²_K .

To prove (b), we recall the relation (8.72) to obtain

    (s_α, s_0)_K = (1/α) [ (1/N) ‖(s_0 − s_α)_X‖₂² + α ‖s_α‖²_K ]    for α > 0.

This in turn implies the identity

    ‖s_α − s_0‖²_K = ‖s_0‖²_K − ‖s_α‖²_K − (2/(αN)) ‖(s_0 − s_α)_X‖₂²    (8.73)

by

    ‖s_α − s_0‖²_K = ‖s_α‖²_K − 2 (s_α, s_0)_K + ‖s_0‖²_K
                   = ‖s_α‖²_K + ‖s_0‖²_K − (2/α) [ (1/N) ‖(s_0 − s_α)_X‖₂² + α ‖s_α‖²_K ]
                   = ‖s_0‖²_K − ‖s_α‖²_K − (2/(αN)) ‖(s_0 − s_α)_X‖₂² .
To complete our proof for (b), note that, by statement (a), the left hand
side in (8.73) tends to zero, for α ↘ 0, and so does the right hand side
in (8.73). By the stability estimate in Theorem 8.57 (a), we get

    0 ≤ ‖s_0‖_K − ‖s_α‖_K ≤ ‖s_α − s_0‖_K −→ 0    for α ↘ 0,

so that ‖s_α‖_K −→ ‖s_0‖_K for α ↘ 0. Therefore,

    (2/(αN)) ‖(s_0 − s_α)_X‖₂² −→ 0    for α ↘ 0,

which completes our proof for (b). □
8.7 Exercises 313

8.7 Exercises
Exercise 8.59. Let K : Rd × Rd −→ R be a continuous symmetric function,
for d > 1. Moreover, suppose that for some n ∈ N all symmetric matrices of
the form
AK,X = (K(xk , xj ))1≤j,k≤n ∈ Rn×n ,
for sets X = {x1 , . . . , xn } ⊂ Rd of n pairwise distinct points, are regular.
Show that all symmetric matrices AK,X ∈ Rn×n are positive definite, as
soon as there is one point set Y = {y1 , . . . , yn } ⊂ Rd for which the matrix
AK,Y ∈ Rn×n is symmetric positive definite.
Hint: Proof of the Mairhuber-Curtis theorem, Theorem 5.25.

Exercise 8.60. Let F be a Hilbert space of functions f : Rd −→ R with


reproducing kernel K : Rd × Rd −→ R, K ∈ PDd . Moreover, for a set of
interpolation points X = {x1 , . . . , xn } ⊂ Rd , let IX : F −→ SX denote
the interpolation operator which assigns every function f ∈ F to its unique
interpolant s ∈ SX from SX = span{K(·, xj ) | 1 ≤ j ≤ n} satisfying sX = fX .
Prove the following statements.
(a) If the interpolation method is translation-invariant, i.e., if we have for
any finite set of interpolation points X the translation invariance

(IX f )(x) = (IX+x0 fx0 )(x + x0 ) for all f ∈ F and all x0 ∈ Rd ,

where X + x0 := {x1 + x0 , . . . , xn + x0 } ⊂ Rd and fx0 := f (· − x0 ), then


K has necessarily the form K(x, y) = Φ(x − y), where Φ ∈ PDd .
(b) If the interpolation method is translation-invariant and rotation-invariant,
i.e., if for any finite set of interpolation points X = {x1 , . . . , xn } ⊂ Rd
and for any rotation matrix Q ∈ Rd×d the identity

(IX f )(x) = (IQX fQ )(Qx) for all f ∈ F

holds, where QX := {Qx1 , . . . , Qxn } ⊂ Rd and fQ := f (QT ·), then K


has necessarily the form K(x, y) = φ(x − y2 ), where φ ∈ PDd .

Exercise 8.61. Let X = {x1 , . . . , xn } ⊂ Rd , d ∈ N, be a finite set of n ∈ N


points. Show that the functions eixj ,· , for 1 ≤ j ≤ n, are linearly independent
on Rd , if and only if the points in X are pairwise distinct.
Hint: First prove the assertion for the univariate case, d = 1. To this
end, consider, for pairwise distinct points X = {x1 , . . . , xn } ⊂ R the linear
combination

Sc,X (ω) = c1 eix1 ,ω + . . . + cn eixn ,ω for c = (c1 , . . . , cn )T ∈ Rn .


(k)
Then, evaluate the function Sc,X and its derivatives Sc,X , for 1 ≤ k < n, at
ω = 0, to show the implication
314 8 Kernel-based Approximation

Sc,X ≡ 0 =⇒ c=0
(k)
by using the n linear conditions Sc,X (0) = 0, for 0 ≤ k < n. Finally, to
prove the assertion for the multivariate case, d > 1, use the separation of the
components in eixj ,ω , for ω = (ω1 , . . . , ωd )T ∈ Rd and 1 ≤ j ≤ n.
Exercise 8.62. Let K ∈ PDd . Show that the native space norm  · K of
the Hilbert space F ≡ FK is stronger than the maximum norm  · ∞ , i.e.,
if a sequence (fn )n∈N of functions in F converges w.r.t.  · K to f ∈ F, so
that fn − f K −→ 0 for n → ∞, then (fn )n∈N does also converge w.r.t. the
maximum norm  · ∞ to f , so that fn − f ∞ −→ 0 for n → ∞.
Exercise 8.63. Let H be a Hilbert space of functions with reproducing ker-
nel K ∈ PDd . Show that H is the native Hilbert space of K, i.e., FK = H.
Hint: First, show the inclusion FK ⊂ H. Then, consider the direct sum

H = FK ⊕ G

to show G = {0}, by contradiction.


Exercise 8.64. Let (ηn )n∈N be a monotonically decreasing zero sequence
of non-negative real numbers, i.e., ηn  0 for n → ∞. Show that there is
a nested sequence of point sets (Xn )n∈N ⊂ Ω, as in (8.41), and a function
f ∈ FΩ satisfying

ηK (f, SXn ) ≥ ηn for all n ∈ N.

Exercise 8.65. Let K(x, y) = Φ(x−y) be positive definite, K ∈ PDd , where


Φ : Rd −→ R is even and satisfies, for α > 0, the growth condition

    |Φ(0) − Φ(x)| ≤ C ‖x‖₂^α    for all x ∈ B_r(0),    (8.74)

around zero, for some r > 0 and some C > 0. Show that in this case, every
f ∈ F ≡ FK is globally Hölder continuous with Hölder exponent α/2, i.e.,

    |f(x) − f(y)| ≤ C ‖x − y‖₂^{α/2}    for all x, y ∈ Rd .

Conclude that no positive definite kernel function K ∈ PDd satisfies the


growth condition in (8.74) for α > 2.
Exercise 8.66. Let K(x, y) = Φ(x−y) be positive definite, K ∈ PDd , where
Φ : Rd −→ R is even and satisfies, for α > 0, the growth condition

    |Φ(0) − Φ(x)| ≤ C ‖x‖₂^α    for all x ∈ B_r(0),

around zero, for some r > 0 and C > 0. Moreover, for compact Ω ⊂ Rd , let
(Xn )n∈N be a nested sequence of subsets Xn ⊂ Ω, as in (8.41), whose mono-
tonically decreasing fill distances hXn,Ω are a zero sequence, i.e., hXn,Ω ↘ 0
for n → ∞. Show for f ∈ FΩ the uniform convergence

    ‖s_{f,Xn} − f‖_∞ = O( h_{Xn,Ω}^{α/2} )    for n → ∞.

Determine from this result the convergence rate for the special case of the
Gauss kernel in Example 8.9.

Exercise 8.67. Show that the diagonal entry

    (1 − S_n^T D_n^{−1} S_n)^{1/2}

of the Cholesky factor L̄_{n+1} in (8.49) is positive. To this end, first show the
representation

    1 − S_n^T D_n^{−1} S_n = ‖ε_{x_{n+1}, X_n}‖²_K ,

where ε_{x_{n+1}, X_n} is the error functional in (8.29) at x_{n+1} ∈ X_{n+1} \ X_n with
respect to the set of interpolation points X_n.
9 Computerized Tomography

Computerized tomography (CT) refers to a popular medical imaging method


in diagnostic radiology, where large data samples are taken from a human
body to generate slices of images to visualize the interior structure, e.g. of
organs, muscles, brain tissue, or bones. But computerized tomography is also
used in other relevant application areas, e.g. in non-destructive evaluation
of materials.
In the data acquisition of computerized tomography, a CT scan is being
generated by a large set of X-ray beams with known intensity, where the
X-ray beams pass through a medium (e.g. a human body) whose interior
structure is to be recovered. Each CT datum is generated by one X-ray beam
which is travelling along a straight line segment from an emitter to a detector.
If we identify the image domain with a convex set in the plane Ω ⊂ R2 ,
then (for each X-ray beam) the emitter is located at some position xE ∈ Ω,
whereas the detector is located at another position xD ∈ Ω. Therefore, the
X-ray beam passes through Ω along the straight line segment [xE , xD ] ⊂ Ω,
from xE to xD (see Figure 9.1).


Fig. 9.1. X-ray beam travelling from emitter xE to detector xD along [xE , xD ] ⊂ Ω.

© Springer Nature Switzerland AG 2018 317


A. Iske, Approximation Theory and Algorithms for Data Analysis, Texts
in Applied Mathematics 68, https://doi.org/10.1007/978-3-030-05228-7_9
318 9 Computerized Tomography

In the acquisition of one CT datum, the initial intensity IE = I(xE ) of


the X-ray beam is being controlled at the emitter, whereas the final intensity
ID = I(xD ) is being measured at the detector. Therefore, the difference
ΔI = IE − ID gives the loss of intensity. Now the datum ΔI depends on
the interior structure (i.e., on the material properties) of the medium along
the straight line segment [xE , xD ]. To be more precise, ΔI quantifies the
medium’s absorption of energy on [xE , xD ].
Now we explain, how the CT data ΔI are interpreted mathematically. By
the law of Lambert1 -Beer2 [6]

    dI(x)/dx = −f(x) I(x)    (9.1)
the rate of change for the X-ray intensity I(x) at x is quantified by the factor
f (x), where f (x) is referred to as the attenuation-coefficient function. There-
fore, the attenuation-coefficient function f (x) yields the energy absorption on
the computational domain Ω, and so f (x) represents an important material
property of the scanned medium.
In the following of this chapter, we are interested in the reconstruction
of f (x). To this end, we further study the differential equation (9.1). By
integrating (9.1) along the straight line segment [xE , xD ], we determine the
loss of intensity (or, the loss of energy) of the X-ray beam on [xE , xD ] by

    ∫_{xE}^{xD} dI(x)/I(x) = − ∫_{xE}^{xD} f(x) dx.    (9.2)

Now we can rewrite (9.2) as

    log( I(xE) / I(xD) ) = ∫_{xE}^{xD} f(x) dx.    (9.3)

The intensity IE = I(xE ) at the emitter and the intensity ID = I(xD ) at


the detector can be controlled or be measured. This measurement yields the
line integral of the attenuation-coefficient function f(x) along [xE , xD ],

    ∫_{xE}^{xD} f(x) dx.    (9.4)

In this chapter, we first explain how the attenuation-coefficient function


f (x) can be reconstructed exactly from the set of line integrals in (9.4). This
leads us to a rather comprehensive mathematical discussion, from the prob-
lem formulation to the analytical solution. Then, we develop and analyze nu-
merical algorithms to solve the reconstruction problem for f (x) in relevant
application scenarios. Numerical examples are presented for illustration.
1
Johann Heinrich Lambert (1728-1777), mathematician, physicist, philosopher
2
August Beer (1825-1863), German mathematician, chemist and physicist
9.1 The Radon Transform 319

9.1 The Radon Transform


9.1.1 Representation of Lines in the Plane
We represent any straight line ℓ ⊂ R² in the Euclidean plane by using polar
coordinates. To this end, we consider the orthogonal projection x_ℓ ∈ ℓ of the
origin 0 ∈ R² onto ℓ. Therefore, we can characterize x_ℓ ∈ ℓ as the unique
best approximation to 0 from ℓ with respect to the Euclidean norm ‖ · ‖₂ .
Moreover, we consider the (unique) angle θ ∈ [0, π), for which the unit vector
n_θ = (cos(θ), sin(θ)) is perpendicular to ℓ. Then, x_ℓ ∈ ℓ can be represented
by

    x_ℓ = (t cos(θ), t sin(θ)) ∈ ℓ ⊂ R²

for some pair (t, θ) ∈ R × [0, π) of polar coordinates. For any straight line
ℓ ⊂ R², the so constructed polar coordinates (t, θ) ∈ R × [0, π) are unique.
As for the converse, for any pair of polar coordinates (t, θ) ∈ R × [0, π)
there is a unique straight line ℓ ≡ ℓ_{t,θ} ⊂ R², which is represented in this way
by (t, θ). We introduce this representation as follows (see Figure 9.2).
Definition 9.1. For any coordinate pair (t, θ) ∈ R × [0, π), we denote by
ℓ_{t,θ} ⊂ R² the unique straight line which passes through x_ℓ = (t cos(θ), t sin(θ))
and is perpendicular to the unit vector n_θ = (cos(θ), sin(θ)).

 
Fig. 9.2. Representation of the straight line ℓ_{t,θ} ⊂ R² by coordinates (t, θ) ∈ R × [0, π).

For the parameterization of a straight line ℓ_{t,θ}, for (t, θ) ∈ R × [0, π), we
use the standard point-vector representation, whereby any point (x, y) ∈ ℓ_{t,θ}
in ℓ_{t,θ} is uniquely represented as a linear combination of the form

    (x, y) = t · n_θ + s · n_θ^⊥    (9.5)

with the curve parameter s ∈ R and the spanning unit vector

    n_θ^⊥ = (− sin(θ), cos(θ)),

which is perpendicular to n_θ, i.e., n_θ^⊥ ⊥ n_θ (see Figure 9.2). We can describe
the relation between (t, s) and (x, y) in (9.5) via the linear system

    x ≡ x(t, s) = cos(θ) t − sin(θ) s
    y ≡ y(t, s) = sin(θ) t + cos(θ) s

or

    ⎡ x ⎤   ⎡ cos(θ)   − sin(θ) ⎤   ⎡ t ⎤         ⎡ t ⎤
    ⎣ y ⎦ = ⎣ sin(θ)     cos(θ) ⎦ · ⎣ s ⎦ = Q_θ · ⎣ s ⎦    (9.6)

with the rotation matrix Q_θ ∈ R^{2×2}. The inverse of the orthogonal matrix
Q_θ is given by the rotation matrix Q_{−θ} = Q_θ^T, whereby the representation

    ⎡ cos(θ)     sin(θ) ⎤   ⎡ x ⎤   ⎡ t ⎤           ⎡ x ⎤
    ⎣ − sin(θ)   cos(θ) ⎦ · ⎣ y ⎦ = ⎣ s ⎦ = Q_θ^T · ⎣ y ⎦    (9.7)

follows immediately from (9.6). Moreover, (9.6), or (9.7), yields the relation

t2 + s2 = x2 + y 2 , (9.8)

which will be useful later in this chapter.

9.1.2 Formulation of the Reconstruction Problem

The basic reconstruction problem of computerized tomography, as sketched


at the outset of this chapter, can be stated mathematically as follows.

Problem 9.2. Reconstruct a function f ≡ f(x, y) from line integrals

    ∫_{ℓ_{t,θ}} f(x, y) dx dy,    (9.9)

that are assumed to be known for all straight lines ℓ_{t,θ}, (t, θ) ∈ R × [0, π). □

We remark that the CT reconstruction problem, Problem 9.2, cannot be


solved for all bivariate functions f . But under suitable conditions on f we
can reconstruct the function f exactly from its Radon³ data

    { ∫_{ℓ_{t,θ}} f(x, y) dx dy  |  (t, θ) ∈ R × [0, π) } .    (9.10)

3
Johann Radon (1887-1956), Austrian mathematician
9.1 The Radon Transform 321

For any function f ∈ L1(R²), the line integral in (9.9) is, for any coordi-
nate pair (t, θ) ∈ R × [0, π), defined as

    ∫_{ℓ_{t,θ}} f(x, y) dx dy = ∫_R f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) ds,    (9.11)

where we use the coordinate transform in (9.6) with the arc length element

    ‖(ẋ(s), ẏ(s))‖₂ ds = √( (− sin(θ))² + (cos(θ))² ) ds = ds
on t,θ . This finally leads us to the Radon transform.

Definition 9.3. For f ≡ f(x, y) ∈ L1(R²), the function

    Rf(t, θ) = ∫_R f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) ds    for t ∈ R, θ ∈ [0, π)

is called the Radon transform of f.

Remark 9.4. The Radon transform R is well-defined on L1 (R2 ), where we


have Rf ∈ L1 (R × [0, π)) for f ∈ L1 (R2 ) (see Exercise 9.33). However, there
are functions f ∈ L1 (R2 ), whose Radon transform Rf ∈ L1 (R × [0, π)) is not
finite in (t, θ) ∈ R × [0, π) (see Exercise 9.34). 

Note that the Radon transform R is a linear integral transform which


maps a bivariate function f ≡ f (x, y) in Cartesian coordinates (x, y) to a bi-
variate function Rf (t, θ) in polar coordinates (t, θ). This observation allows us
to reformulate the reconstruction of f , Problem 9.2, more concise as follows.
On this occasion, we implicitly accommodate the requirement f ∈ L1 (R2 ) to
the list of our conditions on f .

Problem 9.5. Determine the inversion of the Radon transform R. 

Before we turn to the solution of Problem 9.5, we first give some elemen-
tary examples of Radon transforms. We begin with the indicator function
(i.e., the characteristic function) of the disk B_r = {x ∈ R² | ‖x‖₂ ≤ r}, r > 0.

Example 9.6. For the indicator function χ_{B_r} of the disk B_r,

    f(x, y) = χ_{B_r}(x, y) := { 1  for x² + y² ≤ r²,
                               { 0  for x² + y² > r²,

we compute the Radon transform Rf as follows. We first apply the variable
transformation (9.6), which, in combination with the relation (9.8), gives

    f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) = { 1  for t² + s² ≤ r²,
                                                  { 0  for t² + s² > r².

Note that Rf(t, θ) = 0, if and only if the straight line ℓ_{t,θ} does not intersect
the interior of the disk B_r, i.e., if and only if |t| ≥ r. Otherwise, i.e., for
|t| < r, we obtain by

    Rf(t, θ) = ∫_{ℓ_{t,θ}} f(x, y) d(x, y) = ∫_{−√(r²−t²)}^{√(r²−t²)} 1 ds = 2 √(r² − t²)

the length of the straight line segment ℓ_{t,θ} ∩ supp(f) = ℓ_{t,θ} ∩ B_r . ♦
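
This closed-form Radon transform can also be checked numerically: the following Python sketch approximates the line integral (9.11) by a trapezoidal rule along the parameterization (9.5) and compares the result with 2√(r² − t²); the discretization parameters are illustrative assumptions.

```python
import numpy as np

def radon_point(f, t, theta, s_max=2.0, n_s=4001):
    """Approximate Rf(t, theta) by a trapezoidal rule applied to (9.11)."""
    s = np.linspace(-s_max, s_max, n_s)
    x = t * np.cos(theta) - s * np.sin(theta)
    y = t * np.sin(theta) + s * np.cos(theta)
    v = f(x, y)
    ds = s[1] - s[0]
    return ds * (v.sum() - 0.5 * (v[0] + v[-1]))

r = 0.75
chi_Br = lambda x, y: (x**2 + y**2 <= r**2).astype(float)

for t in (0.0, 0.3, 0.6):
    approx = radon_point(chi_Br, t, theta=0.7)
    exact = 2.0 * np.sqrt(r**2 - t**2)
    print(t, approx, exact)     # the two values agree up to the quadrature error
```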

Example 9.7. We compute the Radon transform of the cone function

    f(x, y) = { 1 − √(x² + y²)   for x² + y² ≤ 1,
              { 0                for x² + y² > 1,

or, by the transformation (9.6) and the relation (9.8),

    f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) = { 1 − √(t² + s²)   for t² + s² ≤ 1,
                                                  { 0                for t² + s² > 1.

In this case, we get Rf(t, θ) = 0 for |t| ≥ 1 and

    Rf(t, θ) = ∫_{ℓ_{t,θ}} f(x, y) d(x, y) = ∫_{−√(1−t²)}^{√(1−t²)} ( 1 − √(t² + s²) ) ds
             = √(1 − t²) − (t²/2) log( (1 + √(1 − t²)) / (1 − √(1 − t²)) )

for |t| < 1. ♦

Remark 9.8. For any radially symmetric function f(·) = f(‖ · ‖₂), the Radon
transform Rf(t, θ) only depends on t ∈ R, but not on the angle θ ∈ [0, π).
Indeed, in this case, we have the identity

    Rf(t, θ) = ∫_{ℓ_{t,θ}} f(‖x‖₂) dx = ∫_{ℓ_{t,0}} f(‖Q_θ x‖₂) dx = ∫_{ℓ_{t,0}} f(‖x‖₂) dx = Rf(t, 0)

by application of the variable transform with the rotation matrix Q_θ in (9.6).


This observation is consistent with our Examples 9.6 and 9.7. 

Now we construct another simple example from elementary functions. In


medical image reconstruction, the term phantom is often used to denote test
images whose Radon transforms can be computed analytically. The phantom
bull’s eye is only one such example for a popular test case.
9.1 The Radon Transform 323

(a) The phantom bull’s eye



(b) The Radon transform of bull’s eye

Fig. 9.3. Bull’s eye and its Radon transform (see Example 9.9).
324 9 Computerized Tomography

Example 9.9. The phantom bull's eye is given by the linear combination

    f(x, y) = (3/4) χ_{B_{3/4}}(x, y) − (1/4) χ_{B_{1/2}}(x, y) + χ_{B_{1/4}}(x, y)    (9.12)

of three indicator functions χ_{B_r} of the disks B_r, for r = 3/4, 1/2, 1/4. To
compute Rf, we apply the linearity of the operator R, whereby

    Rf(t, θ) = (3/4) (Rχ_{B_{3/4}})(t, θ) − (1/4) (Rχ_{B_{1/2}})(t, θ) + (Rχ_{B_{1/4}})(t, θ).    (9.13)
Due to the radial symmetry of f (or, of χBr ), the Radon transform Rf (t, θ)
does depend on t, but not on θ (cf. Remark 9.8). Now we can use the result
of Example 9.6 to represent the Radon transform Rf in (9.13) by linear com-
bination of the Radon transforms RχBr , for r = 3/4, 1/2, 1/4. The phantom
f and its Radon transform Rf are shown in Figure 9.3. ♦

(a) Shepp-Logan-Phantom f (b) Radon transform Rf

Fig. 9.4. The Shepp-Logan phantom and its sinogram.

For further illustration, we finally consider the Shepp-Logan phantom [66],


a popular test case from medical imaging. The Shepp-Logan phantom f is a
superposition of ten different ellipses to sketch a cross section of the human
brain, see Figure 9.4 (a). In fact, the Shepp-Logan phantom is a very popular
test case for numerical simulations, where the Radon transform Rf of f can
be computed analytically. Figure 9.4 (b) shows the Radon transform Rf
displayed in the rectangular coordinate system R × [0, π). Such a represen-
tation of Rf is called sinogram. In computerized tomography, the Shepp-
Logan phantom (along with other popular test cases) is often used to evaluate
the performance of numerical algorithms to reconstruct f from Rf .
9.2 The Filtered Back Projection 325

9.2 The Filtered Back Projection


Now we turn to the inversion of the Radon transform, i.e., we wish to solve
Problem 9.5. To this end, we first note some preliminary observations. Sup-
pose we wish to reconstruct f ≡ f (x, y) from given Radon data (9.10) only
at one point (x, y). In this case, only those values of the line integrals (9.9)
are relevant whose Radon lines ℓ_{t,θ} contain the point (x, y). Indeed, for all
other straight lines ℓ_{t,θ}, which do not contain (x, y), the value f(x, y) does
not influence the line integral Rf(t, θ).
For this reason, we first wish to find out which Radon lines ℓ_{t,θ} contain
the point (x, y). For any fixed angle θ ∈ [0, π), we can immediately work this
out by using the relation (9.5). In fact, in this case, we necessarily require

    t = x cos(θ) + y sin(θ),

see (9.7), and so this condition on t is also sufficient. Therefore, only the
straight lines

    ℓ_{x cos(θ)+y sin(θ), θ}    for θ ∈ [0, π)
contain the point (x, y). This observation leads us to the following definition
for the back projection operator.

Definition 9.10. For h ∈ L1(R × [0, π)), the function

    Bh(x, y) = (1/π) ∫_0^π h(x cos(θ) + y sin(θ), θ) dθ    for (x, y) ∈ R²

is called the back projection of h.

Remark 9.11. The back projection is a linear integral transform which


maps a bivariate function h ≡ h(t, θ) in polar coordinates (t, θ) to a bivariate
function Bh(x, y) in Cartesian coordinates (x, y).
Moreover, the back projection B is (up to a positive factor) the adjoint
operator of the Radon transform Rf . For more details on this, we refer to
Exercise 9.39. 
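
For sinogram data sampled on a finite (t, θ) grid, the back projection of Definition 9.10 can be discretized as in the following sketch; the Riemann sum in θ and the linear interpolation in t are our choices for illustration and not prescribed by the text.

```python
import numpy as np

def back_projection(h, t_grid, theta_grid, x, y):
    """Discrete back projection (Definition 9.10): approximate
    Bh(x, y) = (1/pi) * int_0^pi h(x cos(theta) + y sin(theta), theta) dtheta
    by a Riemann sum over theta_grid, with linear interpolation of h in t."""
    d_theta = theta_grid[1] - theta_grid[0]
    total = 0.0
    for k, theta in enumerate(theta_grid):
        t = x * np.cos(theta) + y * np.sin(theta)
        total += np.interp(t, t_grid, h[:, k])   # h sampled as h[t_index, theta_index]
        # np.interp extends constantly outside t_grid; for sinograms that vanish
        # near the boundary of the t range this has no effect
    return total * d_theta / np.pi
```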

Remark 9.12. The back projection B is not the inverse of the Radon trans-
form R. To see this, we give a simple counterexample. We consider the indi-
cator function f := χ_{B_1} ∈ L1(R²) of the unit ball B_1 = {x ∈ R² | ‖x‖₂ ≤ 1},
whose (non-negative) Radon transform

    Rf(t, θ) = { 2 √(1 − t²)   for |t| ≤ 1,
               { 0             for |t| > 1

is computed in Example 9.6. Now we evaluate the back projection B(Rf) of
Rf at (1 + ε, 0). For ε ∈ (0, √2 − 1), we have 1 + ε ∈ (1, √2) and, moreover,
|(1 + ε) cos(θ)| < 1 for θ ∈ [π/4, 3π/4]. Therefore, we obtain
    (B(Rf))(1 + ε, 0) = (1/π) ∫_0^π Rf((1 + ε) cos(θ), θ) dθ
                      ≥ (1/π) ∫_{π/4}^{3π/4} Rf((1 + ε) cos(θ), θ) dθ
                      = (2/π) ∫_{π/4}^{3π/4} √(1 − (1 + ε)² cos²(θ)) dθ > 0,

i.e., we have (B(Rf))(1 + ε, 0) > 0 for all ε ∈ (0, √2 − 1).
Likewise, by the radial symmetry of f, we get for ϕ ∈ (0, 2π)

    (B(Rf))((1 + ε) cos(ϕ), (1 + ε) sin(ϕ)) > 0    for all ε ∈ (0, √2 − 1),

see Exercise 9.37, i.e., B(Rf) is positive on the open annulus

    R_{1,√2} = { x ∈ R²  |  1 < ‖x‖₂ < √2 } ⊂ R².

However, we have f ≡ 0 on R_{1,√2}, so that f is not reconstructed by the back
projection B(Rf) of its Radon transform Rf, i.e., f ≠ B(Rf). □
projection B(Rf ) of its Radon transform Rf , i.e., f = B(Rf ). 

Figure 9.5 shows another counterexample by graphical illustration: In


this case, the back projection B is applied to the Radon transform Rf of the
Shepp-Logan phantom f (cf. Figure 9.4). Observe that the sharp edges of
phantom f are blurred by the back projection B. In relevant applications, in
particular for clinical diagnostics, such smoothing effects are clearly undesired.
In the following discussion, we show how we can avoid such undesired effects
by the application of filters.

(a) Shepp-Logan phantom f (b) back projection B(Rf ).

Fig. 9.5. The Shepp-Logan phantom f and its back projection B(Rf ).
9.2 The Filtered Back Projection 327

Now we turn to the inversion of the Radon transform. To this end, we


work with the continuous Fourier transform F, which we apply to bivariate
functions f ≡ f (x, y) in Cartesian coordinates as usual, i.e., by using the
bivariate Fourier transform F ≡ F2 . But for functions h ≡ h(t, θ) in polar
coordinates we apply the univariate Fourier transform F ≡ F1 to variable
t ∈ R, i.e., we keep the angle θ ∈ [0, π) fixed.
Definition 9.13. For a function f ≡ f(x, y) ∈ L1(R²) in Cartesian coordi-
nates the Fourier transform F₂f of f is defined as

    (F₂f)(X, Y) = ∫_{R²} f(x, y) e^{−i(xX+yY)} d(x, y).

For a function h ≡ h(t, θ) in polar coordinates satisfying h(·, θ) ∈ L1(R), for
all θ ∈ [0, π), the univariate Fourier transform F₁h of h is defined as

    (F₁h)(S, θ) = ∫_R h(t, θ) e^{−iSt} dt    for θ ∈ [0, π).

The following result will lead us directly to the inversion of the Radon
transform. In fact, the Fourier slice theorem (also often referred to as central
slice theorem) is an important result in Fourier analysis.
Theorem 9.14. (Fourier slice theorem). For f ∈ L1(R²), we have

    F₂f(S cos(θ), S sin(θ)) = F₁(Rf)(S, θ)    for all S ∈ R, θ ∈ [0, π).    (9.14)

Proof. For f ≡ f(x, y) ∈ L1(R²), we consider the Fourier transform

    F₂f(S cos(θ), S sin(θ)) = ∫_R ∫_R f(x, y) e^{−iS(x cos(θ)+y sin(θ))} dx dy    (9.15)

at (S, θ) ∈ R × [0, π). By the variable transformation (9.6), the right hand
side in (9.15) can be represented as

    ∫_R ∫_R f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) e^{−iSt} ds dt,

or, as

    ∫_R ( ∫_R f(t cos(θ) − s sin(θ), t sin(θ) + s cos(θ)) ds ) e^{−iSt} dt.

Note that the inner integral coincides with the Radon transform Rf(t, θ).
But this already implies the stated identity

    F₂f(S cos(θ), S sin(θ)) = ∫_R Rf(t, θ) e^{−iSt} dt = F₁(Rf)(S, θ).


328 9 Computerized Tomography

Theorem 9.15. (Filtered back projection formula).
For f ∈ L1(R²) ∩ C(R²), the filtered back projection formula

    f(x, y) = (1/2) B( F₁⁻¹[ |S| F₁(Rf)(S, θ) ] )(x, y)    for all (x, y) ∈ R²    (9.16)

holds.

Proof. For f ∈ L1(R²) ∩ C(R²), we have the Fourier inversion formula

    f(x, y) = F₂⁻¹(F₂f)(x, y) = (1/(4π²)) ∫_R ∫_R (F₂f)(X, Y) e^{i(xX+yY)} dX dY.

Changing variables from Cartesian coordinates (X, Y) to polar coordinates,

    (X, Y) = (S cos(θ), S sin(θ))    for S ∈ R and θ ∈ [0, π),

and by dX dY = |S| dS dθ we get the representation

    f(x, y) = (1/(4π²)) ∫_0^π ∫_R F₂f(S cos(θ), S sin(θ)) e^{iS(x cos(θ)+y sin(θ))} |S| dS dθ.

By the representation (9.14) in the Fourier slice theorem, this yields

    f(x, y) = (1/(4π²)) ∫_0^π ∫_R F₁(Rf)(S, θ) e^{iS(x cos(θ)+y sin(θ))} |S| dS dθ
            = (1/(2π)) ∫_0^π F₁⁻¹[ |S| F₁(Rf)(S, θ) ](x cos(θ) + y sin(θ)) dθ
            = (1/2) B( F₁⁻¹[ |S| F₁(Rf)(S, θ) ] )(x, y).

Therefore, the reconstruction problem, Problem 9.5, is solved analytically.
But the application of the formula (9.16) leads to critical numerical problems.
Remark 9.16. The filtered back projection (FBP) formula (9.16) is numer-
ically unstable. We can explain this as follows. In the FBP formula (9.16) the
Fourier transform F1 (Rf ) of the Radon transform Rf is being multiplied
by the factor |S|. According to the jargon of signal processing, we say that
F1 (Rf ) is being filtered by |S|, which, on this occasion, explains the naming
filtered back projection. Now the multiplication by the filter |S| in (9.16) is
very critical for high frequencies S, i.e., for S with large magnitude |S|. In
fact, by the FBP formula (9.16) the high-frequency components in Rf are
amplified by the factor |S|. This is particularly critical for noisy Radon data,
since the high-frequency noise level of the recorded signals Rf is in this case
exaggerated by application of the filter |S|.
Conclusion: The filtered back projection formula (9.16) is highly sensitive
with respect to perturbations of the Radon data Rf by noise. For this reason,
the FBP formula (9.16) is entirely useless for practical purposes. 
9.3 Construction of Low-Pass Filters 329

9.3 Construction of Low-Pass Filters


To stabilize the filtered back projection, we replace the filter |S| in the FBP
formula (9.16) by a specific low-pass filter. In the general context of Fourier
analysis, a low-pass filter is a function F ≡ F (S) of the frequency variable
S, which maps high-frequency parts of a signal to zero. To this end, we
usually require compact support for F , so that supp(F ) ⊆ [−L, L] for a fixed
bandwidth L > 0, i.e., so that F (S) = 0 for all frequencies S with |S| > L.
In the particular context of the FBP formula (9.16), we require sufficient
approximation quality for the low-pass filter F within the frequency band
[−L, L], i.e.,
F (S) ≈ |S| on [−L, L].
To be more concrete on this, we explain our requirements for F as follows.
Definition 9.17. Let L > 0. Moreover, suppose that W ∈ L∞ (R) is an even
function with compact support supp(W ) ⊆ [−1, 1] satisfying W (0) = 1. A
low-pass filter for the stabilization of (9.16) is a function F : R −→ R of
the form
F (S) = |S| · W (S/L) for S ∈ R,
where L denotes the bandwidth and W is the window of F ≡ FL,W .
Now let us give a few examples of commonly used low-pass filters. In
the following discussion,

    Π_L(S) ≡ χ_{[−L,L]}(S) = { 1  for |S| ≤ L,
                              { 0  for |S| > L,

is, for L > 0, the indicator function of the interval [−L, L], and we let Π := Π_1 .
Example 9.18. The Ram-Lak filter F_RL is given by the window

    W_RL(S) = Π(S),

so that

    F_RL(S) = |S| · Π_L(S) = { |S|   for |S| ≤ L,
                              { 0     for |S| > L.
The Ram-Lak filter is shown in Figure 9.6 (a). ♦
Example 9.19. The Shepp-Logan filter F_SL is given by the window

    W_SL(S) = sinc(πS/2) · Π(S),

so that

    F_SL(S) = |S| · sin(πS/(2L)) / (πS/(2L)) · Π_L(S) = { (2L/π) · |sin(πS/(2L))|   for |S| ≤ L,
                                                        { 0                          for |S| > L.

The Shepp-Logan filter is shown in Figure 9.6 (b). ♦


330 9 Computerized Tomography

(a) Ram-Lak filter



(b) Shepp-Logan filter



(c) cosine filter

Fig. 9.6. Three commonly used low-pass filters (see Examples 9.18-9.20).
9.3 Construction of Low-Pass Filters 331

(a) β = 0.5

(b) β = 0.6

(c) β = 0.7

Fig. 9.7. The Hamming filter Fβ for β ∈ {0.5, 0.6, 0.7} (see Example 9.21).
332 9 Computerized Tomography

(a) α = 2.5

(b) α = 5.0

(c) α = 10.0

Fig. 9.8. The Gauss filter Fα for α ∈ {2.5, 5.0, 10.0} (see Example 9.22).
9.3 Construction of Low-Pass Filters 333

Example 9.20. The cosine filter F_CF is given by the window

    W_CF(S) = cos(πS/2) · Π(S),

so that

    F_CF(S) = |S| · cos(πS/(2L)) · Π_L(S) = { |S| · cos(πS/(2L))   for |S| ≤ L,
                                             { 0                    for |S| > L.

The cosine filter is shown in Figure 9.6 (c). ♦

Example 9.21. The Hamming filter Fβ is given by the window

    W_β(S) = (β + (1 − β) cos(πS)) · Π(S)    for β ∈ [1/2, 1].

Note that the Hamming filter Fβ is a combination of the Ram-Lak filter FRL
and the cosine filter FCF . The Hamming filter Fβ is shown in Figure 9.7, for
β ∈ {0.5, 0.6, 0.7}. ♦

Example 9.22. The Gauss filter Fα is given by the window

    W_α(S) = exp( −(πS/α)² ) · Π(S)    for α > 1.

The Gauss filter Fα is shown in Figure 9.8, for α ∈ {2.5, 5.0, 10.0}. ♦
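
The windows from Examples 9.18-9.22 and the resulting low-pass filters F(S) = |S| · W(S/L) of Definition 9.17 can be written down directly, as in the following Python sketch; the parameter defaults (β, α, L) are illustrative assumptions.

```python
import numpy as np

def rect(S):                        # indicator of [-1, 1]
    return (np.abs(S) <= 1.0).astype(float)

windows = {
    "Ram-Lak":     lambda S: rect(S),
    "Shepp-Logan": lambda S: np.sinc(S / 2.0) * rect(S),   # np.sinc(x) = sin(pi x)/(pi x)
    "Cosine":      lambda S: np.cos(np.pi * S / 2.0) * rect(S),
    "Hamming":     lambda S, beta=0.6: (beta + (1.0 - beta) * np.cos(np.pi * S)) * rect(S),
    "Gauss":       lambda S, alpha=5.0: np.exp(-(np.pi * S / alpha) ** 2) * rect(S),
}

def low_pass_filter(W, L):
    """Return F(S) = |S| * W(S / L), cf. Definition 9.17."""
    return lambda S: np.abs(S) * W(np.asarray(S) / L)

F_SL = low_pass_filter(windows["Shepp-Logan"], L=64.0)
print(F_SL(np.array([0.0, 10.0, 64.0, 100.0])))    # vanishes for |S| > L
```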

If we replace the filter |S| in (9.16) by a low-pass filter F ≡ F (S), then


the resulting reconstruction of f ,
    f_F(x, y) := (1/2) B( F₁⁻¹[ F(S) · F₁(Rf)(S, θ) ] )(x, y),    (9.17)
will no longer be exact, i.e., the function fF in (9.17) yields an approximate
reconstruction of f , f ≈ fF . We analyze the approximation behaviour of fF
later in this chapter. But we develop a suitable representation for fF in (9.17)
already now.
Note that any low-pass filter F is absolutely integrable, i.e., F ∈ L1 (R).
In particular, any low-pass filter F has, in contrast to the filter |S|, an inverse
Fourier transform F1−1 F . We use this observation to simplify the representa-
tion of fF in (9.17) by
    f_F(x, y) = (1/2) B( (F₁⁻¹F ∗ Rf)(S, θ) )(x, y).    (9.18)
To further simplify the representation in (9.18), we first prove a useful
relation between R and B. This relation involves the convolution product ∗
that we apply to bivariate functions in Cartesian coordinates and to bivariate
functions in polar coordinates, according to the following definition.
334 9 Computerized Tomography

Definition 9.23. For f ≡ f(x, y) ∈ L1(R²) and g ≡ g(x, y) ∈ L1(R²), the
convolution f ∗ g between the functions f and g is defined as

    (f ∗ g)(X, Y) = ∫_R ∫_R f(X − x, Y − y) g(x, y) dx dy    for X, Y ∈ R.

For θ ∈ [0, π) and functions g(·, θ), h(·, θ) ∈ L1(R), the convolution g ∗ h
between g and h is defined as

    (g ∗ h)(T, θ) = ∫_R g(T − t, θ) h(t, θ) dt    for T ∈ R.

Theorem 9.24. For h ∈ L1(R × [0, π)) and f ∈ L1(R²), we have the relation

    B(h ∗ Rf)(X, Y) = (Bh ∗ f)(X, Y)    for all (X, Y) ∈ R².    (9.19)

Proof. For the right hand side in (9.19), we obtain the representation

    (Bh ∗ f)(X, Y) = ∫_R ∫_R (Bh)(X − x, Y − y) f(x, y) dx dy
                   = ∫_R ∫_R ( (1/π) ∫_0^π h((X − x) cos(θ) + (Y − y) sin(θ), θ) dθ ) f(x, y) dx dy.

By variable transformation on (x, y) by (9.6) and dx dy = ds dt, we obtain

    (Bh ∗ f)(X, Y) = (1/π) ∫_0^π ∫_R h(X cos(θ) + Y sin(θ) − t, θ) (Rf)(t, θ) dt dθ
                   = (1/π) ∫_0^π (h ∗ Rf)(X cos(θ) + Y sin(θ), θ) dθ
                   = B(h ∗ Rf)(X, Y)

for all (X, Y) ∈ R². □
Theorem 9.24 and (9.18) provide a very useful representation for f_F,
where we use the inverse Fourier transform F₁⁻¹F of the filter F by

    (F₁⁻¹F)(t, θ) := (F⁻¹F)(t)    for t ∈ R and θ ∈ [0, π)

as a bivariate function.

Corollary 9.25. Let f ∈ L1(R²). Moreover, let F be a filter satisfying
F₁⁻¹F ∈ L1(R × [0, π)). Then, the representation

    f_F(x, y) = (1/2) ( B(F₁⁻¹F) ∗ f )(x, y) = (K_F ∗ f)(x, y)    (9.20)

holds, where

    K_F(x, y) := (1/2) B( F₁⁻¹F )(x, y)

denotes the convolution kernel of the low-pass filter F. □
9.4 Error Estimates and Convergence Rates 335

Remark 9.26. The statement of Corollary 9.25 does also hold without the
assumption F1−1 F ∈ L1 (R×[0, π)), see [5]. Therefore, in Section 9.4, we apply
Corollary 9.25 without any assumptions on the low-pass filter F . 

9.4 Error Estimates and Convergence Rates


To evaluate the quality of low-pass filters F, we analyze the intrinsic L2-error

    ‖f − f_F‖_{L2(R²)}    (9.21)

that is incurred by the utilization of F. To this end, we consider for α > 0
the Sobolev⁴ space

    H^α(R²) = { g ∈ L2(R²) | ‖g‖_α < ∞ } ⊂ L2(R²),

equipped with the norm ‖ · ‖_α, where

    ‖g‖²_α = (1/(4π²)) ∫_R ∫_R (1 + x² + y²)^α |Fg(x, y)|² dx dy    for g ∈ H^α(R²),

where we apply the bivariate Fourier transform, i.e., F = F₂ .

where we apply the bivariate Fourier transform, i.e., F = F2 .


We remark that estimates for the L2 -error in (9.21) and for the Lp -error
were proven by Madych [44] in 1990. Moreover, pointwise error estimates
and L∞ -estimates were studied by Munshi et al. [52, 53, 54] in 1991-1993.
However, we do not further pursue their techniques here. Instead of this, we
work with a more recent account by Beckmann [4, 5]. The following result
of Beckmann [5] leads us, without any detour, to useful L2 -error estimates
under rather weak assumptions on f and F in (9.21).

Theorem 9.27. For α > 0, let f ∈ L1(R2) ∩ Hα(R2) and W ∈ L∞(R).
Then, we have for the L2-error (9.21) of the reconstruction f_F = f ∗ K_F
in (9.20) the estimate
\[
\| f - f_F \|_{L^2(\mathbb{R}^2)} \;\leq\; \Bigl( \Phi_{\alpha,W}^{1/2}(L) + L^{-\alpha} \Bigr) \, \|f\|_{\alpha}, \tag{9.22}
\]
where
\[
\Phi_{\alpha,W}(L) := \sup_{S \in [-1,1]} \frac{(1 - W(S))^2}{(1 + L^2 S^2)^{\alpha}} \quad \text{for } L > 0. \tag{9.23}
\]

Proof. By f ∈ L1 (R2 ) ∩ Hα (R2 ), for α > 0, we have f ∈ L2 (R2 ). Moreover,


we have fF ∈ L2 (R2 ), as shown in [5]. By application of the Fourier convolu-
tion theorem on L2 (R2 ), Theorem 7.43, in combination with the Plancherel
theorem, Theorem 7.45, we get the representation

⁴ Sergei Lvovich Sobolev (1908-1989), Russian mathematician

\[
\| f - f \ast K_F \|_{L^2(\mathbb{R}^2)}^2
= \frac{1}{4\pi^2} \, \| \mathcal{F}f - \mathcal{F}f \cdot \mathcal{F}K_F \|_{L^2(\mathbb{R}^2;\mathbb{C})}^2
= \frac{1}{4\pi^2} \, \| \mathcal{F}f - W_L \cdot \mathcal{F}f \|_{L^2(\mathbb{R}^2;\mathbb{C})}^2, \tag{9.24}
\]
where for the scaled window W_L(S) := W(S/L), S ∈ R, we used the identity
\[
W_L(\|(x,y)\|_2) = \mathcal{F}K_F(x,y) \quad \text{for almost every } (x,y) \in \mathbb{R}^2 \tag{9.25}
\]
(see Exercise 9.44). Since supp(W_L) ⊂ [−L, L], we can split the square error
in (9.24) into a sum of two integrals,
\begin{align*}
\frac{1}{4\pi^2} \, \| \mathcal{F}f - W_L \cdot \mathcal{F}f \|_{L^2(\mathbb{R}^2;\mathbb{C})}^2
&= \frac{1}{4\pi^2} \int_{\|(x,y)\|_2 \leq L} |(\mathcal{F}f - W_L \cdot \mathcal{F}f)(x,y)|^2 \, \mathrm{d}(x,y) \tag{9.26} \\
&\quad + \frac{1}{4\pi^2} \int_{\|(x,y)\|_2 > L} |\mathcal{F}f(x,y)|^2 \, \mathrm{d}(x,y). \tag{9.27}
\end{align*}
By f ∈ Hα(R2), we estimate the integral in (9.27) from above by
\begin{align*}
\frac{1}{4\pi^2} \int_{\|(x,y)\|_2 > L} |\mathcal{F}f(x,y)|^2 \, \mathrm{d}(x,y)
&\leq \frac{1}{4\pi^2} \int_{\|(x,y)\|_2 > L} \bigl(1 + x^2 + y^2\bigr)^{\alpha} L^{-2\alpha} \, |\mathcal{F}f(x,y)|^2 \, \mathrm{d}(x,y) \\
&\leq L^{-2\alpha} \, \|f\|_{\alpha}^2, \tag{9.28}
\end{align*}
whereas for the integral in (9.26), we obtain the estimate
\begin{align*}
\frac{1}{4\pi^2} \int_{\|(x,y)\|_2 \leq L} & |(\mathcal{F}f - W_L \cdot \mathcal{F}f)(x,y)|^2 \, \mathrm{d}(x,y) \\
&= \frac{1}{4\pi^2} \int_{\|(x,y)\|_2 \leq L} \frac{|1 - W_L(\|(x,y)\|_2)|^2}{(1 + x^2 + y^2)^{\alpha}} \, \bigl(1 + x^2 + y^2\bigr)^{\alpha} \, |\mathcal{F}f(x,y)|^2 \, \mathrm{d}(x,y) \\
&\leq \left[ \sup_{S \in [-L,L]} \frac{(1 - W_L(S))^2}{(1 + S^2)^{\alpha}} \right] \frac{1}{4\pi^2} \int_{\mathbb{R}} \int_{\mathbb{R}} \bigl(1 + x^2 + y^2\bigr)^{\alpha} \, |\mathcal{F}f(x,y)|^2 \, \mathrm{d}x \, \mathrm{d}y \\
&= \left[ \sup_{S \in [-1,1]} \frac{(1 - W(S))^2}{(1 + L^2 S^2)^{\alpha}} \right] \|f\|_{\alpha}^2 \\
&= \Phi_{\alpha,W}(L) \cdot \|f\|_{\alpha}^2. \tag{9.29}
\end{align*}

Finally, adding the two upper bounds in (9.29) and in (9.28), and using
√(a + b) ≤ √a + √b for a, b ≥ 0, yields the stated error estimate in (9.22). □

Remark 9.28. For the Ram-Lak filter from Example 9.18, we have W ≡ 1
on [−1, 1], and so Φ_{α,W} ≡ 0. In this case, Theorem 9.27 yields the error
estimate
\[
\| f - f_F \|_{L^2(\mathbb{R}^2)} \leq L^{-\alpha} \, \|f\|_{\alpha} = \mathcal{O}\bigl(L^{-\alpha}\bigr) \quad \text{for } L \to \infty.
\]
This further implies L2-convergence, i.e., f_F → f for L → ∞, for the
reconstruction method f_F at convergence rate α. □

In our subsequent analysis concerning arbitrary low-pass filters F, we use
the following result from [4] to prove L2-convergence f_F → f, for L → ∞, i.e.,
\[
\| f - f_F \|_{L^2(\mathbb{R}^2)} \longrightarrow 0 \quad \text{for } L \to \infty.
\]

Theorem 9.29. Let W ∈ C([−1, 1]) satisfy W(0) = 1. Then, we have, for
any α > 0, the convergence
\[
\Phi_{\alpha,W}(L) = \max_{S \in [0,1]} \frac{(1 - W(S))^2}{(1 + L^2 S^2)^{\alpha}} \longrightarrow 0 \quad \text{for } L \to \infty. \tag{9.30}
\]

Proof. Let S^*_{α,W,L} ∈ [0, 1] be the smallest maximizer on [0, 1] of the function
\[
\Phi_{\alpha,W,L}(S) := \frac{(1 - W(S))^2}{(1 + L^2 S^2)^{\alpha}} \quad \text{for } S \in [0,1].
\]

Case 1: Suppose S^*_{α,W,L} is uniformly bounded away from zero, i.e., we
have S^*_{α,W,L} ≥ c > 0 for all L > 0, for some c ≡ c_{α,W} > 0. Then,
\[
0 \leq \Phi_{\alpha,W,L}\bigl(S^*_{\alpha,W,L}\bigr)
= \frac{\bigl(1 - W(S^*_{\alpha,W,L})\bigr)^2}{\bigl(1 + L^2 (S^*_{\alpha,W,L})^2\bigr)^{\alpha}}
\leq \frac{\|1 - W\|^2_{\infty,[-1,1]}}{(1 + L^2 c^2)^{\alpha}} \longrightarrow 0
\]
holds for L → ∞.

Case 2: Suppose S^*_{α,W,L} → 0 for L → ∞. Then, we have
\[
0 \leq \Phi_{\alpha,W,L}\bigl(S^*_{\alpha,W,L}\bigr)
= \frac{\bigl(1 - W(S^*_{\alpha,W,L})\bigr)^2}{\bigl(1 + L^2 (S^*_{\alpha,W,L})^2\bigr)^{\alpha}}
\leq \bigl(1 - W(S^*_{\alpha,W,L})\bigr)^2 \longrightarrow 0,
\]
for L → ∞, by the continuity of W and with W(0) = 1. □

Now the convergence of the reconstruction method fF follows directly


from Theorems 9.27 and 9.29.

Corollary 9.30. For α > 0, let f ∈ L1(R2) ∩ Hα(R2). Moreover, let W be a
continuous window on [0, 1] satisfying W(0) = 1. Then, the convergence
\[
\| f - f_F \|_{L^2(\mathbb{R}^2)} \longrightarrow 0 \quad \text{for } L \to \infty \tag{9.31}
\]
holds. □
We remark that the assumptions in Corollary 9.30,
W ∈ C ([0, 1]) and W (0) = 1
are satisfied by all windows W of the low-pass filters F in Examples 9.18-9.22.
A more detailed discussion on error estimates and convergence rates for FBP
reconstruction methods fF can be found in the work [4] of Beckmann.
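To get a quantitative feel for the filter-dependent term Φ_{α,W}(L) in (9.23), it can simply be evaluated numerically. The following Python/NumPy sketch does this for a cosine-type window W(S) = cos(πS/2); the window choice, the function name and the sampling of [0, 1] are illustrative assumptions of ours, not prescribed by the text.

    import numpy as np

    def Phi(alpha, W, L, num=10001):
        # numerical approximation of Phi_{alpha,W}(L) = max_{S in [0,1]} (1 - W(S))^2 / (1 + L^2 S^2)^alpha,
        # cf. (9.23) and (9.30)
        S = np.linspace(0.0, 1.0, num)
        return np.max((1.0 - W(S))**2 / (1.0 + (L * S)**2)**alpha)

    W_cos = lambda S: np.cos(np.pi * S / 2.0)   # a continuous window with W(0) = 1

    for L in [10.0, 100.0, 1000.0]:
        print(L, Phi(2.0, W_cos, L))            # decays towards 0 as L grows, cf. Theorem 9.29

In accordance with Theorem 9.29, the printed values decrease as the bandwidth L increases.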

9.5 Implementation of the Reconstruction Method


In this section, we explain how to implement the FBP reconstruction method
efficiently. The starting point for our discussion is the representation (9.18),
whereby, for a fixed low-pass filter F, the corresponding reconstruction f_F is given as
\[
f_F(x,y) = \frac{1}{2}\, B\bigl[(\mathcal{F}_1^{-1}F \ast Rf)(S,\theta)\bigr](x,y). \tag{9.32}
\]
Obviously, we can only acquire and process finitely many Radon data
Rf(t, θ) in practice. In the acquisition of Radon data, the X-ray beams are
usually generated such that the resulting Radon lines ℓ_{t,θ} ⊂ Ω are distributed
in the image domain Ω ⊂ R2 on regular geometries.

9.5.1 Parallel Beam Geometry


A commonly used method of data acquisition is referred to as parallel beam
geometry. In this method, the Radon lines ℓ_{t,θ} are collected in subsets of
parallel lines. We can explain this more precisely as follows. For a uniform
discretization of the angular variable θ ∈ [0, π) with N distinct angles
\[
\theta_k := k\pi/N \quad \text{for } k = 0, \ldots, N-1
\]
and for a fixed sampling rate d > 0 with
\[
t_j := j \cdot d \quad \text{for } j = -M, \ldots, M,
\]
a constant number of 2M + 1 Radon data is recorded per angle θ_k ∈ [0, π),
along the parallel Radon lines {ℓ_{t_j,θ_k} | j = −M, ..., M}. Therefore, the resulting discretization of the Radon transform consists of N × (2M + 1) Radon data
\[
\{ Rf(t_j, \theta_k) \;|\; j = -M, \ldots, M \text{ and } k = 0, \ldots, N-1 \}. \tag{9.33}
\]
Figure 9.9 shows 110 Radon lines ℓ_{t_j,θ_k} ∩ [−1, 1]² on parallel beam geometry for N = 10 angles θ_k and 2M + 1 = 11 Radon lines per angle.

Fig. 9.9. Parallel beam geometry. Regular distribution of 110 Radon lines ℓ_{t_j,θ_k},
for N = 10 angles θ_k, 2M + 1 = 11 Radon lines per angle, at sampling rate d = 0.2.
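For illustration, the sampling pattern of the parallel beam geometry is easy to generate explicitly. The short Python/NumPy sketch below only reproduces the grid (9.33) of offsets t_j and angles θ_k; the function name and parameter names are our own choices.

    import numpy as np

    def parallel_beam_grid(M, N, d):
        # offsets t_j = j*d for j = -M,...,M and angles theta_k = k*pi/N for k = 0,...,N-1, cf. (9.33)
        t = np.arange(-M, M + 1) * d
        theta = np.arange(N) * np.pi / N
        return t, theta

    # parameters of Figure 9.9: N = 10 angles, 2M+1 = 11 lines per angle, sampling rate d = 0.2
    t, theta = parallel_beam_grid(M=5, N=10, d=0.2)
    print(len(t) * len(theta))   # 110 Radon lines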

9.5.2 Inverse Fourier transform of the low-pass filters

For our implementation of the FBP reconstruction method f_F in (9.32), we
need the inverse Fourier transform of the chosen low-pass filter F. Note that
any low-pass filter F is, according to Definition 9.17, an even function. Therefore, the inverse Fourier transform $\mathcal{F}^{-1}F$ of F is an inverse cosine transform.
This observation will simplify the following computations for $\mathcal{F}^{-1}F$.
We start with the Ram-Lak filter from Example 9.18.

Proposition 9.31. The inverse Fourier transform of the Ram-Lak filter
\[
F_{RL}(S) = |S| \cdot \chi_{[-L,L]}(S) \quad \text{for } S \in \mathbb{R}
\]
is given by
\[
\mathcal{F}^{-1}F_{RL}(t) = \frac{1}{\pi} \left( \frac{(Lt) \cdot \sin(Lt)}{t^2} - \frac{2 \cdot \sin^2(Lt/2)}{t^2} \right) \quad \text{for } t \in \mathbb{R}. \tag{9.34}
\]
The evaluation of $\mathcal{F}^{-1}F_{RL}$ at t_j = jd, with sampling rate d = π/L > 0, yields


\[
\mathcal{F}^{-1}F_{RL}(\pi j / L) =
\begin{cases}
L^2/(2\pi) & \text{for } j = 0, \\
0 & \text{for } j \neq 0 \text{ even}, \\
-2L^2/(\pi^3 \cdot j^2) & \text{for } j \text{ odd}.
\end{cases} \tag{9.35}
\]

Proof. The inverse Fourier transform $\mathcal{F}^{-1}F_{RL}$ of the even function F_RL is
given by the inverse cosine transform
\[
\mathcal{F}^{-1}F_{RL}(t) = \frac{1}{\pi} \int_0^L S \cdot \cos(tS) \, \mathrm{d}S.
\]
Now we can compute the representation in (9.34) by elementary calculations,
\begin{align*}
\mathcal{F}^{-1}F_{RL}(t)
&= \frac{1}{\pi} \left[ \frac{\cos(tS) + (tS) \cdot \sin(tS)}{t^2} \right]_{S=0}^{S=L} \\
&= \frac{1}{\pi} \cdot \frac{\cos(Lt) + (Lt) \cdot \sin(Lt) - 1}{t^2} \\
&= \frac{1}{\pi} \left( \frac{(Lt) \cdot \sin(Lt)}{t^2} - \frac{2 \cdot \sin^2(Lt/2)}{t^2} \right),
\end{align*}
where we use the trigonometric identity cos(θ) = 1 − 2 · sin²(θ/2).

For the evaluation of $\mathcal{F}^{-1}F_{RL}$ at t = πj/L, we obtain
\begin{align*}
\mathcal{F}^{-1}F_{RL}(\pi j/L)
&= \frac{1}{\pi} \left( \frac{(\pi j) \cdot \sin(\pi j)}{(\pi j / L)^2} - \frac{2 \cdot \sin^2(\pi j / 2)}{(\pi j / L)^2} \right) \\
&= \frac{L^2}{2\pi} \left( \frac{2 \cdot \sin(\pi j)}{\pi j} - \left( \frac{\sin(\pi j / 2)}{\pi j / 2} \right)^2 \right)
\end{align*}
and this already yields the stated representation in (9.35). □
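For later use in the discrete convolution, the Ram-Lak kernel samples (9.35) can be tabulated directly. The following NumPy sketch (the function name is ours) implements exactly the case distinction in (9.35).

    import numpy as np

    def ramlak_kernel(j, L):
        # samples F^{-1}F_RL(pi*j/L) according to (9.35)
        j = np.asarray(j)
        vals = np.zeros(j.shape, dtype=float)
        vals[j == 0] = L**2 / (2.0 * np.pi)
        odd = (j % 2 != 0)
        vals[odd] = -2.0 * L**2 / (np.pi**3 * j[odd].astype(float)**2)
        return vals

    print(ramlak_kernel(np.arange(-4, 5), L=np.pi))   # kernel taps for j = -4,...,4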

Next, we consider the Shepp-Logan filter from Example 9.19.

Proposition 9.32. The Shepp-Logan filter
\[
F_{SL}(S) =
\begin{cases}
\frac{2L}{\pi} \cdot |\sin(\pi S/(2L))| & \text{for } |S| \leq L, \\
0 & \text{for } |S| > L,
\end{cases}
\]
has the inverse Fourier transform
\[
\mathcal{F}^{-1}F_{SL}(t) = \frac{L}{\pi^2} \left( \frac{\cos(Lt - \pi/2) - 1}{t - \pi/(2L)} - \frac{\cos(Lt + \pi/2) - 1}{t + \pi/(2L)} \right) \quad \text{for } t \in \mathbb{R}.
\]
The evaluation of $\mathcal{F}^{-1}F_{SL}$ at t_j = jd, with sampling rate d = π/L > 0, yields
\[
\mathcal{F}^{-1}F_{SL}(\pi j / L) = \frac{4L^2}{\pi^3 (1 - 4j^2)}. \tag{9.36}
\]

Proof. We compute the inverse Fourier transform $\mathcal{F}^{-1}F_{SL}$ by
\begin{align*}
\mathcal{F}^{-1}F_{SL}(t)
&= \frac{1}{\pi} \int_0^L \frac{2L}{\pi} \cdot \sin(\pi S/(2L)) \cdot \cos(tS) \, \mathrm{d}S \\
&= \frac{L}{\pi^2} \left[ \frac{\cos((t - \pi/(2L))S)}{t - \pi/(2L)} - \frac{\cos((t + \pi/(2L))S)}{t + \pi/(2L)} \right]_{S=0}^{S=L},
\end{align*}
where we used the trigonometric addition formula
\[
2 \sin\left( \frac{x-y}{2} \right) \cos\left( \frac{x+y}{2} \right) = \sin(x) - \sin(y)
\]
for x = (t + π/(2L))S and y = (t − π/(2L))S.

For the evaluation of $\mathcal{F}^{-1}F_{SL}$ at t = πj/L, we obtain
\begin{align*}
\mathcal{F}^{-1}F_{SL}(\pi j/L)
&= \frac{L}{\pi^2} \left( \frac{\cos(\pi j - \pi/2)}{\pi j/L - \pi/(2L)} - \frac{\cos(\pi j + \pi/2)}{\pi j/L + \pi/(2L)}
- \left( \frac{1}{\pi j/L - \pi/(2L)} - \frac{1}{\pi j/L + \pi/(2L)} \right) \right) \\
&= \frac{L}{\pi^2} \left( \frac{1}{\pi j/L + \pi/(2L)} - \frac{1}{\pi j/L - \pi/(2L)} \right) \\
&= \frac{4L^2}{\pi^3 (1 - 4j^2)}.
\end{align*}
This completes our proof for the stated representation in (9.36). □
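The Shepp-Logan kernel samples (9.36) can be tabulated in the same manner; again, the short NumPy sketch below is only an illustration, and the function name is our own.

    import numpy as np

    def shepp_logan_kernel(j, L):
        # samples F^{-1}F_SL(pi*j/L) according to (9.36)
        j = np.asarray(j, dtype=float)
        return 4.0 * L**2 / (np.pi**3 * (1.0 - 4.0 * j**2))

    print(shepp_logan_kernel(np.arange(-4, 5), L=np.pi))   # kernel taps for j = -4,...,4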

For the inverse Fourier transforms of the remaining filters F from Exam-
ples 9.20-9.22 we refer to Exercise 9.43.

9.5.3 Discretization of the Convolution

Next, we discretize the convolution operator ∗ in (9.32). Our purpose for
doing so is to approximate, for any angle θ ≡ θ_k ∈ [0, π), k = 0, ..., N − 1,
the convolution product
\[
(\mathcal{F}_1^{-1}F \ast Rf)(S,\theta) = \int_{\mathbb{R}} (\mathcal{F}_1^{-1}F)(S - t) \cdot Rf(t,\theta) \, \mathrm{d}t \tag{9.37}
\]
between the functions
\[
u(t) = \mathcal{F}_1^{-1}F(t) \quad \text{and} \quad v(t) = Rf(t,\theta) \quad \text{for } t \in \mathbb{R}
\]
from discrete data
\[
u_j = \mathcal{F}_1^{-1}F(t_j) \quad \text{and} \quad v_j = Rf(t_j, \theta_k) \quad \text{for } j \in \mathbb{Z},
\]



which we acquire at t_j = j · d with sampling rate d = π/L. To this end, we
replace the integral in (9.37), by application of the composite rectangle rule,
with the (infinite) sum
\[
(\mathcal{F}_1^{-1}F \ast Rf)(t_m, \theta_k) \approx \frac{\pi}{L} \sum_{j \in \mathbb{Z}} u_{m-j} \cdot v_j \quad \text{for } m \in \mathbb{Z}, \tag{9.38}
\]
i.e., we evaluate the convolution u ∗ v at S = t_m = πm/L, m ∈ Z, numerically.


For the convergence of the sum in (9.38), we require (u_j)_{j∈Z}, (v_j)_{j∈Z} ∈ ℓ¹.
But in relevant application scenarios the situation is much easier. Indeed, we
can assume compact support for the target attenuation-coefficient function
f . In this case, the Radon transform v = Rf (·, θ) also has compact support,
for all θ ∈ [0, π), and so only finitely many Radon data in (vj )j∈Z do not
vanish, so that the sum in (9.38) is finite.
According to our discussion concerning parallel beam geometry, we assume
for the Radon data {Rf (tj , θk )}j∈Z , for any angle θk = kπ/N ∈ [0, π), the
form
vj = Rf (tj , θk ) for j = −M, . . . , M.
But we choose M ∈ N large enough, so that vj = 0 for all |j| > M . In this
way, we can represent the series in (9.38) as a finite sum. This finally yields, by
\[
(\mathcal{F}_1^{-1}F \ast Rf)(t_m, \theta_k) \approx \frac{\pi}{L} \sum_{j=-M}^{M} u_{m-j} \cdot v_j \quad \text{for } m \in \mathbb{Z}, \tag{9.39}
\]
a suitable discretization for the convolution product $\mathcal{F}_1^{-1}F \ast Rf$.
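The discrete convolution (9.39) can be carried out angle by angle with a standard convolution routine. The following NumPy sketch assumes that the Radon data are stored as a sinogram array with one row per angle and that the kernel samples u_j = F_1^{-1}F(jπ/L) are available for j = −2M, ..., 2M; the array layout, the function name and the use of np.convolve are our own choices, not prescribed by the text.

    import numpy as np

    def filter_projections(sinogram, kernel_vals, L):
        # sinogram: shape (N, 2M+1), row k holds Rf(t_j, theta_k) for j = -M,...,M
        # kernel_vals: samples u_j = F_1^{-1}F(j*pi/L) for j = -2M,...,2M (length 4M+1)
        N, n_t = sinogram.shape
        filtered = np.empty((N, n_t))
        for k in range(N):
            full = np.convolve(sinogram[k], kernel_vals)      # full discrete convolution
            mid = (len(full) - n_t) // 2
            filtered[k] = (np.pi / L) * full[mid:mid + n_t]   # central 2M+1 values, i.e. (9.39) at t_m for m = -M,...,M
        return filtered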

9.5.4 Discretization of the Back Projection

Finally, we turn to the discretization of the back projection. Recall that,
according to Definition 9.10, the back projection of h ∈ L1(R × [0, π)) at
(x, y) is defined as
\[
Bh(x,y) = \frac{1}{\pi} \int_0^{\pi} h\bigl(x\cos(\theta) + y\sin(\theta), \theta\bigr) \, \mathrm{d}\theta \quad \text{for } (x,y) \in \mathbb{R}^2. \tag{9.40}
\]
Further recall that for the FBP reconstruction method f_F in (9.32), the back
projection B is applied to the function
\[
h(S,\theta) = (\mathcal{F}_1^{-1}F \ast Rf)(S,\theta). \tag{9.41}
\]
To numerically compute the integral in (9.40) at (x, y), we apply the
composite rectangle rule, whereby
\[
Bh(x,y) \approx \frac{1}{N} \sum_{k=0}^{N-1} h\bigl(x\cos(\theta_k) + y\sin(\theta_k), \theta_k\bigr). \tag{9.42}
\]

This, however, leads us to the following problem.


To approximate Bh(x, y) in (9.42) over the Cartesian grid of pixel points
(x, y) we need, for any angle θk , k = 0, . . . , N − 1, the values h(t, θk ) at
t = x cos(θk ) + y sin(θk ). (9.43)
In the previous section, we have shown how to evaluate h in (9.41) at
polar coordinates (tm , θk ) numerically from input data of the form
h(t_m, θ_k) = $(\mathcal{F}_1^{-1}F \ast Rf)(t_m, \theta_k)$ for m ∈ Z. (9.44)
Now note that t in (9.43) is not necessarily contained in the set {tm }m∈Z .
But we can determine the value h(t, θk ) at t = x cos(θk ) + y sin(θk ) from the
data in (9.44) by interpolation, where we recommend the following methods.
Piecewise constant interpolation: In this method, the value h(t, θ_k) at
t ∈ [t_m, t_{m+1}) is approximated by
\[
h(t, \theta_k) \approx I_0 h(t, \theta_k) :=
\begin{cases}
h(t_m, \theta_k) & \text{for } t - t_m \leq t_{m+1} - t, \\
h(t_{m+1}, \theta_k) & \text{for } t - t_m > t_{m+1} - t.
\end{cases}
\]
Note that the resulting interpolant I_0 h(·, θ_k) is piecewise constant.

Interpolation by linear splines: In this method, the value h(t, θ_k) at
t ∈ [t_m, t_{m+1}) is approximated by
\[
h(t, \theta_k) \approx I_1 h(t, \theta_k) := \frac{L}{\pi} \bigl[ (t - t_m)\, h(t_{m+1}, \theta_k) + (t_{m+1} - t)\, h(t_m, \theta_k) \bigr].
\]
The spline interpolant I_1 h(·, θ_k) is globally continuous and piecewise linear.
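A compact way to realize the discrete back projection (9.42) together with linear spline interpolation is to interpolate the filtered data along t for every angle, for instance with np.interp. The grid layout and the function name in the following sketch are illustrative assumptions.

    import numpy as np

    def back_project(filtered, t, theta, x, y):
        # filtered: shape (N, 2M+1), row k holds h(t_m, theta_k); t: offsets t_m; theta: angles theta_k
        # x, y: 1D pixel coordinates of the reconstruction grid
        X, Y = np.meshgrid(x, y, indexing='ij')
        recon = np.zeros_like(X)
        for k, th in enumerate(theta):
            s = X * np.cos(th) + Y * np.sin(th)          # t = x cos(theta_k) + y sin(theta_k), cf. (9.43)
            recon += np.interp(s, t, filtered[k], left=0.0, right=0.0)   # linear interpolation I_1
        return recon / (2.0 * len(theta))                # factor 1/(2N), combining (9.42) with the 1/2 in (9.32)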

We summarize the proposed FBP reconstruction method in Algorithm 10.

9.5.5 Numerical Reconstruction of the Shepp-Logan Phantom


We have implemented the FBP reconstruction method, Algorithm 10. For
the purpose of illustration, we apply the FBP reconstruction method to the
Shepp-Logan phantom [66] (see Figure 9.4 (a)). To this end, we use the Shepp-
Logan filter F_SL from Example 9.19 (see Figure 9.6 (b)). The inverse Fourier
transform $\mathcal{F}^{-1}F_{SL}$ is given in Proposition 9.32, along with the function values
\[
(\mathcal{F}^{-1}F_{SL})(\pi j / L) = \frac{4L^2}{\pi^3 (1 - 4j^2)} \quad \text{for } j \in \mathbb{Z},
\]
which we use to compute the required convolutions in line 9 of Algorithm 10.
To compute the back projection (line 16), we apply linear spline interpolation,
i.e., we choose I = I_1 in line 13.
For our numerical experiments we used the parameter values in Table 9.1.
Figure 9.10 shows the resulting FBP reconstructions of the Shepp-Logan
phantom on a grid of 512 × 512 pixels.

Algorithm 10 Reconstruction by filtered back projection

 1: function Filtered Back Projection(Rf)
 2:   Input: Radon data Rf ≡ Rf(t_j, θ_k), k = 0, ..., N − 1, j = −M, ..., M;
 3:          evaluation points {(x_n, y_m) ∈ R² | (n, m) ∈ I_x × I_y} for (finite)
 4:          index sets I_x × I_y ⊂ N × N.
 5:
 6:   choose low-pass filter F with window W_F and bandwidth L > 0;
 7:   for k = 0, ..., N − 1 do
 8:     for i ∈ I do                                      ▷ for (finite) index set I ⊂ Z
 9:       let h_{ik} := (π/L) · Σ_{j=−M}^{M} F⁻¹F((i − j)π/L) · Rf(t_j, θ_k)
                                                          ▷ compute convolution product (9.39)
10:     end for
11:   end for
12:
13:   choose interpolation method I                       ▷ e.g. linear splines I_1
14:   for n ∈ I_x do
15:     for m ∈ I_y do
16:       let f_{nm} := (1/(2N)) · Σ_{k=0}^{N−1} Ih(x_n cos(θ_k) + y_m sin(θ_k), θ_k)
                                                          ▷ compute back projection (9.42)
17:     end for
18:   end for
19:
20:   Output: reconstruction {f_{nm}}_{(n,m) ∈ I_x × I_y} with values f_{nm} ≈ f_F(x_n, y_m).
21: end function
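As a self-contained end-to-end illustration of Algorithm 10, the following Python sketch reconstructs the indicator function of the unit ball, whose Radon data Rf(t, θ) = 2(1 − t²)^{1/2} for |t| ≤ 1 are known in closed form (cf. Example 9.6), using the Shepp-Logan filter and linear spline interpolation. All parameter choices and variable names are ours; the sketch is meant as a reading aid, not as a reference implementation.

    import numpy as np

    M, N = 50, 150                          # 2M+1 offsets per angle, N = 3M angles (cf. Table 9.1)
    L = np.pi * M                           # bandwidth; sampling rate d = pi/L
    d = np.pi / L
    t = np.arange(-M, M + 1) * d
    theta = np.arange(N) * np.pi / N

    # Radon data of f = indicator of the unit ball: Rf(t, theta) = 2*sqrt(1 - t^2) for |t| <= 1
    sinogram = np.tile(2.0 * np.sqrt(np.maximum(0.0, 1.0 - t**2)), (N, 1))

    # Shepp-Logan kernel samples (9.36) for j = -2M,...,2M
    j = np.arange(-2 * M, 2 * M + 1).astype(float)
    kernel = 4.0 * L**2 / (np.pi**3 * (1.0 - 4.0 * j**2))

    # convolution step (9.39), keeping the central 2M+1 samples per angle
    filtered = np.array([(np.pi / L) * np.convolve(row, kernel)[2 * M:4 * M + 1] for row in sinogram])

    # back projection (9.42) with linear interpolation on a 256 x 256 pixel grid
    x = np.linspace(-1.2, 1.2, 256)
    X, Y = np.meshgrid(x, x, indexing='ij')
    recon = np.zeros_like(X)
    for k, th in enumerate(theta):
        s = X * np.cos(th) + Y * np.sin(th)
        recon += np.interp(s, t, filtered[k], left=0.0, right=0.0)
    recon /= 2.0 * N

    print(recon[128, 128])                  # approximately 1 inside the unit ball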

(a) 2460 Radon lines (b) 15150 Radon lines (c) 60300 Radon lines

Fig. 9.10. Reconstruction of the Shepp-Logan phantom by filtered back projection,


Algorithm 10, using the parameters in Table 9.1.

Table 9.1. Reconstruction of the Shepp-Logan phantom by filtered back projection,
Algorithm 10. The following values were used for the bandwidth L, the sampling
rate d, and the number N of angles θ_k, with 2M+1 parallel Radon lines ℓ_{t_j,θ_k} per angle θ_k.
The resulting reconstructions on a grid of 512 × 512 pixels are shown in Figure 9.10.

  parameter M   bandwidth L = π·M   sampling rate d = π/L   # angles N = 3·M   # Radon lines N × (2M+1)
          20            20π                 0.05                     60                  2460
          50            50π                 0.02                    150                 15150
         100           100π                 0.01                    300                 60300

9.6 Exercises
Exercise 9.33. Prove for f ∈ L1(R2) the estimate
\[
\| Rf(\cdot, \theta) \|_{L^1(\mathbb{R})} \leq \| f \|_{L^1(\mathbb{R}^2)} \quad \text{for all } \theta \in [0, \pi)
\]
to conclude
\[
Rf \in L^1(\mathbb{R} \times [0,\pi)) \quad \text{for all } f \in L^1(\mathbb{R}^2),
\]
i.e., for f ∈ L1(R2), we have
\[
(Rf)(t, \theta) < \infty \quad \text{for almost every } (t, \theta) \in \mathbb{R} \times [0, \pi).
\]

Exercise 9.34. Consider the function f : R2 → R, defined as
\[
f(x) =
\begin{cases}
\|x\|_2^{-3/2} & \text{for } \|x\|_2 \leq 1, \\
0 & \text{for } \|x\|_2 > 1,
\end{cases}
\qquad x = (x, y) \in \mathbb{R}^2.
\]
Show that (Rf)(0, 0) is not finite, although f ∈ L1(R2).

Exercise 9.35. Show that the Radon transform Rf of f ∈ L1 (R2 ) has com-
pact support, if f has compact support.
Does the converse of this statement hold? I.e., does f ∈ L1 (R2 ) necessarily
have compact support, if supp(Rf ) is compact?

Exercise 9.36. Recall the rotation matrix Qθ ∈ R2×2 in (9.6) and the unit
vector nθ = (cos(θ), sin(θ))T ∈ R2 , for θ ∈ [0, π), respectively.
Prove the following properties for the Radon transform Rf of f ∈ L1 (R2 ).

(a) For fθ (x) = f (Qθ x), the identity

(Rf )(t, θ + ϕ) = (Rfθ )(t, ϕ)

holds for all t ∈ R and all θ, ϕ ∈ [0, π).



(b) For fx0 (x) = f (x + x0 ), where x0 ∈ R2 , the identity

(Rfx0 )(t, θ) = (Rf )(t + nTθ x0 , θ)

holds for all t ∈ R and all θ ∈ [0, π).

Exercise 9.37. Show that for a radially symmetric function f ∈ L1 (R2 ), the
backward projection B(Rf ) of Rf is radially symmetric.
Now consider the indicator function f = χB1 of the unit ball B1 and its
Radon transform Rf from Example 9.6. Show that the backward projection
B(Rf ) of Rf is positive on the open annulus
\[
R_{1,\sqrt{2}} = \bigl\{ x \in \mathbb{R}^2 \;\big|\; 1 < \|x\|_2 < \sqrt{2} \bigr\} \subset \mathbb{R}^2.
\]

Hint: Remark 9.12.

Exercise 9.38. Prove the Radon convolution theorem,

R(f ∗ g) = (Rf ) ∗ (Rg) for f, g ∈ L1 (R2 ) ∩ C (R2 )

Hint: Use the Fourier slice theorem, Theorem 9.14.

Exercise 9.39. Show that the backward projection B is (up to factor π) the
adjoint operator of the Radon transform R. To this end, prove the relation

(Rf, g)L2 (R×[0,π)) = π(f, Bg)L2 (R2 )

for g ∈ L2 (R × [0, π)) and f ∈ L1 (R2 ) ∩ L2 (R2 ) satisfying Rf ∈ L2 (R × [0, π)).

Exercise 9.40. In this exercise, we consider a spline filter of first order,


which is a low-pass filter F : R −→ R of the form

F(S) = |S| · ∧(S) · χ_{[−1,1]}(S)

(cf. Definition 9.17) with the linear B-spline ∧ : R −→ R, defined as

\[
\wedge(S) = (1 - |S|)_+ =
\begin{cases}
1 - |S| & \text{for } |S| \leq 1, \\
0 & \text{for } |S| > 1.
\end{cases}
\]

(a) Show the representation
\[
(\mathcal{F}_1^{-1}F)(x) = \frac{2}{\pi} \cdot \frac{\sin^2(x/2) + \mathrm{sinc}(x) - 1}{x^2} \quad \text{for } x \in \mathbb{R}
\]
for the inverse Fourier transform $\mathcal{F}_1^{-1}F$ of F.
(b) Use the result in (a) to compute $(\mathcal{F}_1^{-1}F)(\pi n)$ for n ∈ Z.

Exercise 9.41. A spline filter Fk of order k ∈ N0 has the form

Fk (S) = |S| · ∧k (S) · (S) for k ∈ N0 , (9.45)

where the B-spline ∧k is defined by the recursion

∧k (S) := (∧k−1 ∗ )(S/αk ) for k ∈ N (9.46)

for the initial value ∧0 :=  and where, moreover, the positive scaling factor
αk > 0 in (9.46) is chosen, such that supp(∧k ) = [−1, 1].
In this exercise, we construct a spline filter of second order.
(a) Show that the initial value ∧0 yields the Ram-Lak filter, i.e., F0 ≡ FRL .
(b) Show that the scaling factor αk > 0 in (9.46) is, for any k ∈ N, uniquely
determined by the requirement supp(∧k ) = [−1, 1].
(c) Show that ∧1 generates by F1 the spline filter from Exercise 9.40.
Determine the scaling factor α1 of F1 .
(d) Compute the second order spline filter F2 . To this end, determine the
B-spline ∧2 in (9.46), along with its scaling factor α2 .

Exercise 9.42. Develop a construction scheme for higher order spline filters
Fk of the form (9.45), where k ≥ 3. To this end, apply the recursion in (9.46)
and determine the scaling factors αk , for k ≥ 3.

Exercise 9.43. Compute the inverse Fourier transform F −1 F of the


(a) cosine filter F = FCF from Example 9.20;
(b) Hamming filter F = Fβ from Example 9.21;
(c) Gauss filter F = Fα from Example 9.22.
Compute for each of the filters F in (a)-(c) the values

(F −1 F )(πj/L) for j ∈ Z.

Hint: Proposition 9.31 and Proposition 9.32.

Exercise 9.44. Let F ≡ FL,W be a low-pass filter with bandwidth L > 0


and window function W : R −→ R, according to Definition 9.17. Moreover,
let
\[
K_F(x,y) = \frac{1}{2}\, B\bigl(\mathcal{F}_1^{-1}F\bigr)(x,y) \quad \text{for } (x,y) \in \mathbb{R}^2
\]
be the convolution kernel of F.
Prove for the scaled window W_L(S) = W(S/L), S ∈ R, the identity
\[
W_L(\|(x,y)\|_2) = \mathcal{F}K_F(x,y). \tag{9.47}
\]

In which sense does the identity in (9.47) hold?


Hint: Elaborate the details in the proof of [5, Proposition 4.1].

Exercise 9.45. Implement the reconstruction method of the filtered back


projection (FBP), Algorithm 10. Apply the FBP method to the phantom
bull’s eye (see Example 9.9 and Figure 9.3). To this end, use the bandwidth
L = π · M , the sampling rate d = π/L, and N = 3 · M angles θk , with 2M + 1
parallel Radon lines ℓ_{t_j,θ_k} per angle θ_k, for M = 10, 20, 50.
For verification, the reconstructions with 512 × 512 pixels are displayed
in Figure 9.11, where we used the Shepp-Logan filter FSL from Example 9.19
(see Figure 9.6 (b)) in combination with linear spline interpolation.

(a) 630 Radon lines (b) 2460 Radon lines (c) 15150 Radon lines

Fig. 9.11. Reconstruction of bull’s eye from Example 9.9 (see Figure 9.3).
References

1. N. Aronszajn: Theory of reproducing kernels. Transactions of the AMS 68,


1950, 337–404.
2. R. Askey: Radial characteristic functions. TSR # 1262, Univ. Wisconsin, 1973.
3. S. Banach, S. Mazur: Zur Theorie der linearen Dimension.
Studia Mathematica 4, 1933, 100–112.
4. M. Beckmann, A. Iske: Error estimates and convergence rates for filtered back
projection. Mathematics of Computation, published electronically on April 30,
2018, https://doi.org/10.1090/mcom/3343.
5. M. Beckmann, A. Iske: Approximation of bivariate functions from fractional
Sobolev spaces by filtered back projection. HBAM 2017-05, U. Hamburg, 2017.
6. A. Beer: Bestimmung der Absorption des rothen Lichts in farbigen Flüssig-
keiten. Annalen der Physik und Chemie 86, 1852, 78–88.
7. Å. Bjørck: Numerical Methods for Least Squares Problems. SIAM, 1996.
8. S. Bochner: Vorlesungen über Fouriersche Integrale.
Akademische Verlagsgesellschaft, Leipzig, 1932.
9. D. Braess: Nonlinear Approximation Theory. Springer, Berlin, 1986.
10. M.D. Buhmann: Radial Basis Functions.
Cambridge University Press, Cambridge, UK, 2003.
11. E.W. Cheney: Introduction to Approximation Theory.
Second edition, McGraw Hill, New York, NY, U.S.A., 1982.
12. W. Cheney, W. Light: A Course in Approximation Theory.
Graduate Studies in Mathematics, vol. 101, AMS, Providence, RI, U.S.A., 2000.
13. O. Christensen: An Introduction to Frames and Riesz Bases.
Second expanded edition, Birkhäuser, 2016.
14. C.K. Chui: Wavelets: A Mathematical Tool for Signal Analysis.
Monographs on Mathematical Modeling and Computation. SIAM, 1997.
15. C.W. Clenshaw: A note on the summation of Chebyshev series.
Mathematics of Computation 9(51), 1955, 118–120.
16. J.W. Cooley, J.W. Tukey. An algorithm for the machine calculation of complex
Fourier series. Mathematics of Computation 19, 1965, 297–301.
17. P.C. Curtis Jr.: N-parameter families and best approximation.
Pacific Journal of Mathematics 9, 1959, 1013–1027.
18. I. Daubechies: Ten Lectures on Wavelets. SIAM, Philadelphia, 1992.
19. P.J. Davis: Interpolation and Approximation. 2nd edition, Dover, NY, 1975.
20. C. de Boor: A Practical Guide to Splines. Revised edition,
Applied Mathematical Sciences, vol. 27, Springer, New York, 2001.
21. R.A. DeVore: Nonlinear approximation. Acta Numerica, 1998, 51–150.


22. B. Diederichs, A. Iske: Improved estimates for condition numbers of radial basis
function interpolation matrices. J. Approximation Theory, published electroni-
cally on October 16, 2017, https://doi.org/10.1016/j.jat.2017.10.004.
23. G. Faber: Über die interpolatorische Darstellung stetiger Funktionen.
Jahresbericht der Deutschen Mathematiker-Vereinigung 23, 1914, 192–210.
24. G.E. Fasshauer: Meshfree Approximation Methods with Matlab.
World Scientific, Singapore, 2007.
25. G.E. Fasshauer, M. McCourt: Kernel-based Approximation Methods using
Matlab. World Scientific, Singapore, 2015.
26. G.B. Folland: Fourier Analysis and its Applications.
Brooks/Cole, Pacific Grove, CA, U.S.A., 1992.
27. B. Fornberg, N. Flyer: A Primer on Radial Basis Functions with Applications
to the Geosciences. SIAM, Philadelphia, 2015.
28. W. Gander, M.J. Gander, F. Kwok: Scientific Computing – An Introduction
using Maple and MATLAB. Texts in CSE, volume 11, Springer, 2014.
29. C. Gasquet, P. Witomski: Fourier Analysis and Applications. Springer Sci-
ence+Business Media, New York, 1999.
30. M. v. Golitschek: Penalized least squares approximation problems.
Jaen Journal on Approximation Theory 1(1), 2009, 83–96.
31. J. Gomes, L. Velho: From Fourier Analysis to Wavelets. Springer, 2015.
32. A. Haar: Zur Theorie der orthogonalen Funktionensysteme.
Mathematische Annalen 69, 1910, 331–371.
33. M. Haase: Functional Analysis: An Elementary Introduction.
American Mathematical Society, Providence, RI, U.S.A., 2014.
34. P.C. Hansen, J.G. Nagy, D.P. O’Leary: Deblurring Images: Matrices, Spectra,
and Filtering. Fundamentals of Algorithms. SIAM, Philadelphia, 2006.
35. E. Hewitt, K.A. Ross: Abstract Harmonic Analysis I. Springer, Berlin, 1963.
36. K. Höllig, J. Hörner: Approximation and Modeling with B-Splines.
SIAM, Philadelphia, 2013.
37. A. Iske: Charakterisierung bedingt positiv definiter Funktionen für multivari-
ate Interpolationsmethoden mit radialen Basisfunktionen. Dissertation, Uni-
versität Göttingen, 1994.
38. A. Iske: Multiresolution Methods in Scattered Data Modelling. Lecture Notes
in Computational Science and Engineering, vol. 37, Springer, Berlin, 2004.
39. J.L.W.V. Jensen: Sur les fonctions convexes et les inégalités entre les valeurs
moyennes. Acta Mathematica 30, 1906, 175–193.
40. P. Jordan, J. von Neumann: On inner products in linear, metric spaces.
Annals of Mathematics 36(3), 1935, 719–723.
41. C.L. Lawson, R.J. Hanson: Solving Least Squares Problems.
Prentice-Hall, Englewood Cliffs, NJ, U.S.A., 1974.
42. P.D. Lax: Functional Analysis. Wiley-Interscience, New York, U.S.A., 2002.
43. G.G. Lorentz, M. v. Golitschek, Y. Makovoz: Constructive Approximation.
Grundlehren der mathematischen Wissenschaften, Band 304, Springer, 2011.
44. W.R. Madych: Summability and approximate reconstruction from Radon
transform data. In: Integral Geometry and Tomography, E. Grinberg and
T. Quinto (eds.), AMS, Providence, RI, U.S.A., 1990, 189–219.
45. W.R. Madych, S.A. Nelson: Multivariate Interpolation: A Variational Theory.
Technical Report, Iowa State University, 1983.
46. W.R. Madych, S.A. Nelson: Multivariate interpolation and conditionally posi-
tive definite functions. Approx. Theory Appl. 4, 1988, 77–89.

47. W.R. Madych, S.A. Nelson: Multivariate interpolation and conditionally posi-
tive definite functions II. Mathematics of Computation 54, 1990, 211–230.
48. J. Mairhuber: On Haar’s theorem concerning Chebysheff problems having
unique solutions. Proc. Am. Math. Soc. 7, 1956, 609–615.
49. S. Mallat: A Wavelet Tour of Signal Processing. Academic Press, 1998.
50. G. Meinardus: Approximation of Functions: Theory and Numerical Methods.
Springer, Berlin, 1967.
51. V. Michel: Lectures on Constructive Approximation. Birkhäuser, NY, 2013.
52. P. Munshi: Error analysis of tomographic filters I: theory.
NDT & E Int. 25, 1992, 191–194.
53. P. Munshi, R.K.S. Rathore, K.S. Ram, M.S. Kalra: Error estimates for tomo-
graphic inversion. Inverse Problems 7, 1991, 399–408.
54. P. Munshi, R.K.S. Rathore, K.S. Ram, M.S. Kalra: Error analysis of tomo-
graphic filters II: results. NDT & E Int. 26, 1993, 235–240.
55. J.J. O’Connor, E.F. Robertson: MacTutor History of Mathematics archive.
http://www-history.mcs.st-andrews.ac.uk.
56. M.J.D. Powell: Approximation Theory and Methods.
Cambridge University Press, Cambridge, UK, 1981.
57. A. Quarteroni, R. Sacco, F. Saleri: Numerical Mathematics.
Springer, New York, 2000.
58. M. Reed, B. Simon: Fourier Analysis, Self-Adjointness. In: Methods of Modern
Mathematical Physics II, Academic Press, New York, 1975.
59. E.Y. Remez: Sur le calcul effectiv des polynômes d’approximation des
Tschebyscheff. Compt. Rend. Acad. Sc. 199, 1934, 337.
60. E.Y. Remez: Sur un procédé convergent d’approximations successives pour
déterminer les polynômes d’approximation. Compt. Rend. Acad. Sc. 198, 1934,
2063.
61. R. Schaback: Creating surfaces from scattered data using radial basis functions.
In: Mathematical Methods for Curves and Surfaces, M. Dæhlen, T. Lyche, and
L.L. Schumaker (eds.), Vanderbilt University Press, Nashville, 1995, 477–496.
62. R. Schaback, H. Wendland: Special Cases of Compactly Supported Radial
Basis Functions. Technical Report, Universität Göttingen, 1993.
63. R. Schaback, H. Wendland: Numerische Mathematik. Springer, Berlin, 2005.
64. L.L. Schumaker: Spline Functions: Basic Theory. Third Edition,
Cambridge University Press, Cambridge, UK, 2007.
65. L.L. Schumaker: Spline Functions: Computational Methods. SIAM, 2015.
66. L.A. Shepp, B.F. Logan: The Fourier reconstruction of a head section.
IEEE Trans. Nucl. Sci. 21, 1974, 21–43.
67. G. Szegő: Orthogonal Polynomials. AMS, Providence, RI, U.S.A., 1939.
68. L.N. Trefethen: Approximation Theory and Approximation Practice.
SIAM, Philadelphia, 2013.
69. D.F. Walnut: An Introduction to Wavelet Analysis. Birkhäuser Basel, 2004.
70. G.A. Watson: Approximation Theory and Numerical Methods.
John Wiley & Sons, Chichester, 1980.
71. H. Wendland: Piecewise polynomial, positive definite and compactly supported
radial functions of minimal degree. Advances in Comp. Math. 4, 1995, 389–396.
72. H. Wendland: Scattered Data Approximation.
Cambridge University Press, Cambridge, UK, 2005.
73. Wikipedia. The free encyclopedia. https://en.wikipedia.org/wiki/
74. Z. Wu: Multivariate compactly supported positive definite radial functions.
Advances in Comp. Math. 4, 1995, 283–292.
Subject Index

Algorithm completeness criterion, 197


– Clenshaw, 125, 126, 136 computerized tomography, 317
– divided differences, 33 condition number, 305
– filtered back projection, 344 connected, 159
– Gram-Schmidt, 120 convergence rate, 206
– Neville-Aitken, 27 convex
– pyramid, 270 – function, 73
– Remez, 167, 173 – functional, 74
alternation – hull, 148
– condition, 142, 167 – set, 69
– matrix, 164 convolution, 246, 259, 334
– set, 165 – kernel, 334
– theorem, 165 – theorem
autocorrelation, 247, 259, 282 – – Fourier transform, 259
– – Radon transform, 346
back projection, 325
Banach space, 2 dense subset, 186
band-limited function, 255 Dirac
bandwidth, 255, 329 – approximation theorem, 249
Bernstein – evaluation functional, 283
– operator, 188 – sequence, 248
– polynomial, 187 Dirichlet kernel, 209
Bessel inequality, 109 discrete Fourier transform, 53
best approximation, 61 divided difference, 29, 168
– direct characterization, 87 dual
– dual characterization, 86 – functional, 84
– strongly unique, 92 – space, 84

Chebyshev Euclidean space, 3


– approximation, 139 extremal points, 140
– knots, 43
– norm, 139 fill distance, 295
– partial sum, 125, 231 filtered back projection, 328, 344
– polynomials, 43, 123 formula
Cholesky decomposition, 300 – Euler, 49
complete – Hermite-Genocchi, 36
– orthogonal system, 196 – Leibniz, 37
– orthonormal system, 196 – Rodrigues, 127


Fourier Lagrange
– coefficient, 48, 112 – basis, 278
– convolution theorem, 247 – polynomial, 21
– inversion formula, 250, 251, 255, 258 – representation, 21, 278
– matrix, 53 Lebesgue
– operator, 118, 240, 250, 258 – constant, 211, 305
– partial sum, 112 – integrable, 79
– series, 118 Legendre polynomial, 127
– slice theorem, 327 Leibniz formula, 37
– spectrum, 239, 241 Lemma
– transform, 240, 258, 327 – Aitken, 26
frame, 202 – Riemann-Lebesgue, 242
frequency spectrum, 239 Lipschitz
functional – constant, 222
– bounded, 84 – continuity, 222
– continuous, 64 low-pass filter, 329
– convex, 74
– dual, 84 matrix
– linear, 84 – alternation, 164
– design, 10
Gâteaux derivative, 87 – Gram, 106, 286
Gauss – Toeplitz, 57
– filter, 333 – unitriangular, 299
– function, 245, 259, 281 – Vandermonde, 20, 276
– normal equation, 11 minimal
– distance, 61
Hölder inequality, 77, 79 – sequence, 69
Haar Minkowski inequality, 78, 79
– space, 158 modulus of continuity, 224
– system, 158 multiresolution analysis, 266
– wavelet, 261
Hermite Newton
– function, 138, 252 – Cotes quadrature, 235
– Genocchi formula, 36 – polynomial, 28, 168
– polynomials, 130
Hilbert space, 69 operator
– analysis, 199
indicator function, 261 – Bernstein, 188
inequality – difference, 29
– Bessel, 109 – projection, 108
– Hölder, 77, 79 – synthesis, 199
– Jensen, 73 orthogonal
– Minkowski, 78, 79 – basis, 106
– Young, 77 – complement, 108, 266
– projection, 104, 108, 265
Jackson theorems, 217 – system, 196
Jensen inequality, 73 orthonormal
– basis, 107, 267
Kolmogorov criterion, 92 – system, 196, 293

parallel beam geometry, 338 strictly convex


parallelogram identity, 66 – function, 73
Parseval identity, 109, 195, 234, 255 – norm, 74
periodic function, 47 – set, 69
polarization identity, 66 support, 242
polynomial
– Bernstein, 187 Theorem
– Chebyshev, 43, 123 – alternation, 165
– Hermite, 130 – Banach-Mazur, 86
– Lagrange, 21 – Banach-Steinhaus, 214
– Legendre, 127 – Bochner, 280
– Newton, 28 – Carathéodory, 150
positive definite function, 277 – Charshiladse-Losinski, 215
projection – de La Vallée Poussin, 236
– operator, 108 – Dini-Lipschitz, 228, 231
– orthogonal, 108 – Faber, 217
pseudoinverse, 18 – Freud, 93
pyramid algorithm, 270 – Jackson, 219, 223–225, 228
– Jordan-von Neumann, 66
radially symmetric, 279 – Korovkin, 189
Radon transform, 321 – Kuzmin, 235
refinement equations, 264 – Madych-Nelson, 288
regularization method, 14 – Mairhuber-Curtis, 160
Remez – Paley-Wiener, 256
– algorithm, 167, 173 – Plancherel, 255, 260
– exchange, 172 – Pythagoras, 108, 290
reproducing kernel, 287 – Shannon, 256
Riemann-Lebesgue lemma, 242 – Weierstrass, 191, 192
Riesz three-term recursion, 121
– basis, 198, 302 Toeplitz matrix, 57
– constant, 198, 302 topological
– stability, 302 – closure, 186
– dual, 84
scale space, 264 translation-invariant, 264, 313
scaling function, 263 trigonometric polynomials, 47, 48
Schwartz space, 251, 260
sequence uniform boundedness principle, 214
– Cauchy, 69 unitriangular matrix, 299
– Dirac, 248
– Korovkin, 187 Vandermonde matrix, 20, 159, 276
sinc function, 40, 243 wavelet, 260
sinogram, 324 – analysis, 269
Sobolev space, 335 – coefficient, 269
space – Haar, 261
– Banach, 2 – space, 267
– Haar, 158 – synthesis, 270
– Hilbert, 69 – transform, 271
– Schwartz, 251 window function, 329
– Sobolev, 335
spline filter, 346 Young inequality, 77
Name Index

Aitken, A.C. (1895-1967), 26 Haar, A. (1885-1933), 158, 260


Hahn, H. (1879-1934), 86
Banach, S. (1892-1945), 86, 214 Hermite, C. (1822-1901), 34, 130
Beer, A. (1825-1863), 318 Hesse, L.O. (1811-1874), 11
Bernstein, S.N. (1880-1968), 187 Hilbert, D. (1862-1943), 69
Bessel, F.W. (1784-1846), 109 Horner, W.G. (1786-1837), 182
Bochner, S. (1899-1982), 280
Jackson, D. (1888-1946), 218
Carathéodory, C. (1873-1950), 150 Jensen, J.L. (1859-1925), 73
Cauchy, A.-L. (1789-1857), 69, 105 Jordan, P. (1902-1980), 66
Chebyshev, P.L. (1821-1894), 139
Cholesky, A.-L. (1875-1918), 299 Kolmogoroff, A.N. (1903-1987), 91
Cooley, J.W. (1926-2016), 56 Korovkin, P.P. (1913-1985), 187
Cotes, R. (1682-1716), 235 Kotelnikov, V. (1908-2005), 257
Courant, R. (1888-1972), 303 Kuzmin, R.O. (1891-1949), 235
Cramer, G. (1704-1752), 166
Curtis, P.C. Jr. (1928-2016), 160 Lagrange, J.-L. (1736-1813), 21
Lambert, J.H. (1728-1777), 318
de L’Hôpital, M. (1661-1704), 211
Laplace, P.-S. (1749-1827), 165
de La Vallée Poussin (1866-1962), 236
Lebesgue, H.L. (1875-1941), 79, 211
Dini, U. (1845-1918), 228
Legendre, A.-M. (1752-1833), 127
Dirac, P.A.M. (1902-1984), 248, 283
Leibniz, G.W. (1646-1716), 37
Dirichlet, P.G.L. (1805-1859), 209
Lipschitz, R. (1832-1903), 222
Euler, L. (1707-1783), 49
Machiavelli, N.B. (1469-1527), 56
Faber, G. (1877-1966), 217 Mairhuber, J.C. (1922-2007), 160
Fischer, E.S. (1875-1954), 303 Mazur, S. (1905-1981), 86
Fourier, J.B.J. (1768-1830), 48 Minkowski, H. (1864-1909), 78
Fréchet, M.R. (1878-1973), 287
Freud, G. (1922-1979), 93 Neumann, J. von (1903-1957), 66
Fubini, G. (1879-1943), 243 Neville, E.H. (1889-1961), 27
Newton, I. (1643-1727), 28
Gâteaux, R. (1889-1914), 87 Nyquist, H. (1889-1976), 257
Gauß, C.F. (1777-1855), 11
Genocchi, A. (1817-1889), 34 Paley, R. (1907-1933), 256
Gram, J.P. (1850-1916), 106 Parseval, M.-A. (1755-1836), 109
Plancherel, M. (1885-1967), 254
Hölder, O. (1859-1937), 77 Pythagoras (around 570-510 BC), 108


Radon, J. (1887-1956), 320 Szegő, G. (1895-1985), 123


Rayleigh, J.W.S. (1842-1919), 303
Remez, E.Y. (1896-1975), 167 Taylor, B. (1685-1731), 98
Riemann, B. (1826-1866), 242 Tikhonov, A.N. (1906-1993), 15
Riesz, F. (1880-1956), 198, 287 Toeplitz, O. (1881-1940), 57
Rodrigues, B.O. (1795-1851), 127 Tukey, J.W. (1915-2000), 56
Rolle, M. (1652-1719, 162
Vandermonde, A.-T. (1735-1796), 20
Schmidt, E. (1876-1959), 119
Schwartz, L. (1915-2002), 251 Weierstraß, K. (1815-1897), 186
Schwarz, H.A. (1843-1921), 105 Whittaker, E.T. (1873-1956), 257
Shannon, C.E. (1916-2001), 255 Wiener, N. (1894-1964), 256
Sobolev, S.L. (1908-1989), 335
Steinhaus, H. (1887-1972), 214 Young, W.H. (1863-1942), 77
