Optimal Transportation and Action Minimizing Measures
Scuola Normale Superiore of Pisa
and
École Normale Supérieure of Lyon
PhD thesis
24th October 2007
Alessio Figalli
a.figalli@sns.it
Advisors
Prof. Luigi Ambrosio, Scuola Normale Superiore of Pisa
Prof. Cédric Villani, École Normale Supérieure of Lyon
Contents
Introduction
6 Appendix
6.1 Semi-concave functions
6.2 Tonelli Lagrangians
6.2.1 Definition and background
6.2.2 Lagrangian costs and semi-concavity
6.2.3 The twist condition for costs obtained from Lagrangians
Bibliography
Introduction
The Monge transportation problem is more than 200 years old [110], and in recent years
it has generated a huge amount of work.
Originally Monge wanted to move, in 3-space, a pile of rubble (déblais) to build up a mound
or fortification (remblais) while minimizing the cost. Now, if the rubble consists of masses, say
$m_1, \dots, m_n$ at locations $\{x_1, \dots, x_n\}$, one should move them into another set of positions
$\{y_1, \dots, y_n\}$ by minimizing the weighted traveled distance. Therefore one should try to
minimize
\[ \sum_{i=1}^{n} m_i\, |x_i - T(x_i)| \]
over all bijections $T : \{x_1, \dots, x_n\} \to \{y_1, \dots, y_n\}$, where $|\cdot|$ denotes the usual Euclidean
distance on 3-space.
Nowadays, one would be more interested in minimizing the energy cost rather than
the traveled distance. Therefore one would try rather to minimize
\[ \sum_{i=1}^{n} m_i\, |x_i - T(x_i)|^2 . \]
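The difference between the two discrete costs above can be seen on a toy instance. The following is only a minimal sketch: the three points, the unit masses and the function names are hypothetical choices made for illustration, and the brute-force search over bijections is feasible only for very small n.

```python
from itertools import permutations
import math

def monge_cost(xs, ys, masses, perm, power=1):
    """Cost sum_i m_i |x_i - y_perm[i]|**power of the bijection x_i -> y_perm[i]."""
    return sum(m * math.dist(x, ys[j]) ** power
               for x, m, j in zip(xs, masses, perm))

def best_bijection(xs, ys, masses, power=1):
    """Brute-force minimizer over all bijections (illustration only)."""
    return min(permutations(range(len(xs))),
               key=lambda perm: monge_cost(xs, ys, masses, perm, power))

# Hypothetical data: three unit masses on a line in 3-space, shifted by one unit.
xs = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ys = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
masses = [1.0, 1.0, 1.0]

shift = (0, 1, 2)  # every mass moves one step to the right
leap = (2, 0, 1)   # x_0 jumps to y_2, the other two masses do not move

for power in (1, 2):
    print(power,
          monge_cost(xs, ys, masses, shift, power),
          monge_cost(xs, ys, masses, leap, power))
```

In this example the two bijections have the same cost for the distance, while the squared distance strictly prefers the shift, which hints at why the strictly convex cost is easier to handle.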
\[ T_\# \mu = \nu, \]
i.e.
\[ \nu(A) = \mu\big(T^{-1}(A)\big) \qquad \forall A \subset Y \text{ measurable}, \]
and in such a way that T minimizes the transportation cost. This last condition means
\[ \int_X c(x, T(x))\, d\mu(x) = \min_{S_\# \mu = \nu} \left\{ \int_X c(x, S(x))\, d\mu(x) \right\}, \]
where $c : X \times Y \to \mathbb{R}$ is some given cost function, and the minimum is taken over all
measurable maps $S : X \to Y$ with $S_\# \mu = \nu$. When the transport condition $T_\# \mu = \nu$ is
satisfied, we say that T is a transport map, and if T also minimizes the cost we call it an
optimal transport map.
In the development of the theory of optimal transportation, as well as in the devel-
opment of other theories, it is important on the one hand to explore new variants of the
original problem, on the other hand to figure out, in this emerging variety of problems,
some common (and sometimes unexpected) features. This kind of analysis is the main
scope of our thesis.
The problems we will consider are:
3. The Brenier variational theory of incompressible flows: starting from the geomet-
rical interpretation of the Euler equations for incompressible fluids as a geodesic
equation in the space of the measure-preserving diffeomorphisms, one can look for
solutions of the Euler equations by minimizing the action functional. This leads
to the introduction of relaxed models and their study from a calculus of variations
point of view.
aim will be to develop such a theory in the case of an ordinary differential equation
perturbed by an irregular noise.
We remark that the first three topics are all variants of the optimal transportation
problem. Moreover, even though the last topic is only loosely related to optimal
transportation, at the technical level many connections arise, and the study of all of them
reveals some new connections. For instance, Bernard and Buffoni [22] have recently shown how
one can fit Mather’s theory, as well as optimal transportation problems on manifolds
with a geometric cost, in the framework of measures in the space of action-minimizing
curves. We continue this search for a general unified framework, proving that
variational solutions of the Euler equations [8] can also be seen in this perspective, with
a possibly non-smooth action induced by the pressure field. Also the last two topics
present some links with the first three. For instance, the proof in [23] on the existence
and uniqueness of an optimal transport plan strongly relies on the regularity properties
of solutions of Hamilton-Jacobi equations, while the natural framework which allows one to
develop a theory à la DiPerna-Lions for martingale solutions of stochastic differential
equations turns out to be the one of the measures in the space of paths, which, as we
said, is natural also in the optimal transportation problem and in the variational study
of the Euler equations.
Let us give a quick overview of all these subjects (each chapter contains a more
detailed mathematical and bibliographical description of the single problems), providing
also an outline of the content of the thesis. All the results in this thesis have been presented
in a series of papers (accepted, submitted, or in preparation), originating from several
collaborations developed during the PhD studies.
1. As we explained above, one is interested in finding a transport map $T : X \to Y$ from
µ to ν which minimizes the transportation cost, that is
\[ \int_X c(x, T(x))\, d\mu(x) = \min_{S_\# \mu = \nu} \left\{ \int_X c(x, S(x))\, d\mu(x) \right\}. \]
Even in Euclidean spaces, with the cost c equal to the Euclidean distance or its
square, the problem of the existence of an optimal transport map is far from being
trivial. Moreover, it is easy to build examples where the Monge problem is ill-posed
simply because there is no transport map: this happens for instance when µ is a Dirac
mass while ν is not. This means that one needs some restrictions on the measures µ and
ν.
The major advance on this problem is due to Kantorovitch, who proposed in [85],
[86] a notion of weak solution of the optimal transport problem. He suggested looking
for plans instead of transport maps, that is probability measures γ in $X \times Y$ whose
marginals are µ and ν, i.e.
\[ (\pi_X)_\# \gamma = \mu \quad \text{and} \quad (\pi_Y)_\# \gamma = \nu, \]
the general fact that, with the cost c(x, y) = |x − y|, one can find an optimal transport
map imposing also that the common mass between µ and ν stays fixed. In Chapter 1,
following a joint work with Albert Fathi [66], we show existence and uniqueness of an
optimal transport map in a very general setting, which includes the case of “geometric”
costs on manifolds, that is costs given by
\[ c(x, y) := \inf_{\gamma(0)=x,\ \gamma(1)=y} \int_0^1 L(\gamma(t), \dot\gamma(t))\, dt, \]
where L : T M → R is a Tonelli Lagrangian. This is the most general known result, since
it is valid for a wide class of cost functions and it does not require any global assumption
on the manifold (say, a bound on the sectional curvature). To this aim, we will need to
understand the regularity of the cost along extremals, a problem which is closely linked
to weak KAM theory and the regularity of solutions of the Hamilton-Jacobi equation,
see also Section 6.2.2.
Moreover, in Section 1.5 we will study the so-called “displacement interpolation”,
which is a way to connect measures using the optimal transportation. For instance,
suppose that $\mu_0$ and $\mu_1$ are two absolutely continuous measures in $\mathbb{R}^d$, and let
$T : \mathbb{R}^d \to \mathbb{R}^d$ be the optimal transport map from $\mu_0$ to $\mu_1$ (as we said above, existence and
uniqueness of the optimal transport map in this special case is due to Brenier [30]).
Then, instead of “connecting” $\mu_0$ to $\mu_1$ in a linear way (that is $\mu_t = (1 - t)\mu_0 + t\mu_1$), one
can consider the interpolation $\mu_t := ((1 - t)\,\mathrm{Id} + tT)_\# \mu_0$, which is called “displacement
interpolation”. An interesting feature is that, from the convexity of certain functionals
along such curves, one can deduce existence, uniqueness and stability of the gradient
flows of such functionals obtaining many interesting properties for Fokker-Planck-type
evolution equations such as the porous medium equation (see [42], and see also [11] for
an introduction and a wide bibliography on this subject).
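The contrast between linear and displacement interpolation can be illustrated on the real line. The sketch below replaces the absolutely continuous measures of the text by two uniform discrete measures with the same number of atoms (an assumption made only to keep the example finite), for which the optimal map for the quadratic cost is the monotone rearrangement; all function names are hypothetical.

```python
def monotone_map(xs, ys):
    """Monotone rearrangement: send the k-th smallest atom of xs to the k-th smallest of ys."""
    order_x = sorted(range(len(xs)), key=lambda i: xs[i])
    sorted_y = sorted(ys)
    image = [0.0] * len(xs)
    for rank, i in enumerate(order_x):
        image[i] = sorted_y[rank]
    return image

def displacement_interpolation(xs, image, t):
    """Atoms of ((1 - t) Id + t T)_# mu_0: every atom travels along a straight line."""
    return [(1 - t) * x + t * y for x, y in zip(xs, image)]

def linear_interpolation(xs, ys, t):
    """Atoms and weights of (1 - t) mu_0 + t mu_1: mass fades out and in, nothing moves."""
    n = len(xs)
    return [(x, (1 - t) / n) for x in xs] + [(y, t / n) for y in ys]

xs = [0.0, 1.0]     # atoms of mu_0, each of mass 1/2
ys = [11.0, 10.0]   # atoms of mu_1, each of mass 1/2

image = monotone_map(xs, ys)
midpoint = displacement_interpolation(xs, image, 0.5)
mixture = linear_interpolation(xs, ys, 0.5)
print(image, midpoint, mixture)
```

At time t = 1/2 the displacement interpolation is supported halfway between the two measures, while the linear interpolation is still supported on the union of the two original supports.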
The convexity of certain suitable functionals on Riemannian manifolds allows one to
express Ricci curvature bounds on the manifold. In Section 1.7, following a joint work
with Cédric Villani [78], we use the general results on optimal transport maps mentioned
above to study the link between several possible notions of “displacement convexity” (i.e.
convexity along displacement interpolations) and to prove their equivalence.
Finally, in Section 1.8 we will generalize the existence and uniqueness of the optimal
transport map without assumptions on the finiteness of the transportation cost, and we
will also prove that the optimal transport map on a general manifold is approximately
differentiable a.e. whenever the cost is given by $c(x, y) = d^2(x, y)$ [75].
2. Another kind of transport problem is the so-called irrigation problem. Starting
from the observation of the frequent occurrence of branched networks in nature
(plants and trees, river basins, bronchial and cardiovascular systems) and in man-designed
structures (communication networks, electric power supply, water distribution
or drainage networks), and observing that the common function of such networks is to
transport some goods from an initial distribution (the supply) to another one (the
demand), we are interested in finding models which describe such phenomena. This was
done in [82, 98, 135, 25, 24, 29] by considering cost functions that encode the efficiency
of a transport induced by some structure. Branched structures, like the ones observed in
nature, then arise as the optimal structures along which the transport takes place.
The first model is due to Gilbert [82]: given two atomic probability measures
$\mu = \sum_{i=1}^{N_1} a_i \delta_{x_i}$ and $\nu = \sum_{i=1}^{N_2} b_i \delta_{y_i}$, find a finite, oriented and weighted graph
$\Gamma = (v_h, p_h)$ ($v_h$ are the vectors that orient the graph, $p_h$ the weights), which satisfies
Kirchhoff’s law at the junctures (the mass which enters is equal to the mass which exits,
except at the points $x_i$ where a mass $a_i$ exits, and at the points $y_j$ where a mass $b_j$ enters).
One then looks for a graph which minimizes the transportation cost
\[ C^\alpha(\Gamma) = \sum_h |v_h|\, p_h^\alpha \]
with $0 \le \alpha \le 1$. The motivation for introducing such a parameter α is that, since the
function $t \mapsto t^\alpha$ is sub-additive for $0 \le \alpha \le 1$, the inequality
$(p_{h_1} + p_{h_2})^\alpha \le p_{h_1}^\alpha + p_{h_2}^\alpha$ holds,
and thus it is advantageous for the mass to concentrate and to move together. This problem has
been recently generalized (by Xia, Morel, Bernot, Solimini, etc.) to the case of arbitrary
probability measures, and one arrives at problems where the optimal objects have a
branched structure, and the optimal transportation costs C α (µ, ν) give rise to a distance
which metrizes the weak convergence. In order to extend the above problem to arbitrary
target and source measures, a “probabilistic” formalism that has been considered in
[98, 25, 24] is the one of traffic plans, which are suitable probability measures in the space
of continuous paths which “connect” two fixed measures µ and ν. In this framework, all
particles are indexed by the set $\Omega := [0, 1]$, and to each $\omega \in \Omega$ is associated a 1-Lipschitz
path $\chi(\omega, \cdot)$ in $\mathbb{R}^N$. This is a Lagrangian description of the dynamics of particles that
can be encoded by the image measure $P_\chi$ of the map $\omega \mapsto \chi(\omega, \cdot)$ (which is therefore
a measure on the set of 1-Lipschitz paths). To each traffic plan one can associate a
suitable cost function which has to incorporate the principle that it is more efficient to
transport mass in a grouped way rather than in a separate way. As in the discrete
case considered by Gilbert, to embed this principle the costs incorporate a parameter
$\alpha \in [0, 1]$ and make use of the concavity of $x \mapsto x^\alpha$. Once the cost and the measures µ
and ν are given, one can consider what is called the irrigation problem by some authors
[25, 24, 26], i.e. the problem of minimizing the cost among structures transporting µ to
ν.
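The effect of the exponent α on Gilbert's cost can be checked on a very small graph. The following sketch uses hypothetical data (two sources of mass 1/2, one sink of mass 1, and a junction chosen by hand, not optimized) and compares a V-shaped graph with a Y-shaped one; both satisfy Kirchhoff's law.

```python
import math

def gilbert_cost(edges, alpha):
    """C^alpha(Gamma) = sum_h |v_h| * p_h**alpha, with edges given as (start, end, weight)."""
    return sum(math.dist(a, b) * p ** alpha for a, b, p in edges)

# Hypothetical configuration in the plane.
x1, x2 = (-1.0, 2.0), (1.0, 2.0)   # sources, each of mass 1/2
y = (0.0, 0.0)                     # sink of mass 1
junction = (0.0, 1.0)              # hand-picked branching point

# V-shaped graph: each source is connected directly to the sink.
v_shaped = [(x1, y, 0.5), (x2, y, 0.5)]
# Y-shaped graph: the two masses merge at the junction and travel together.
y_shaped = [(x1, junction, 0.5), (x2, junction, 0.5), (junction, y, 1.0)]

for alpha in (1.0, 0.5, 0.0):
    print(alpha, gilbert_cost(v_shaped, alpha), gilbert_cost(y_shaped, alpha))
```

For α = 1 the cost is the usual weighted length and the straight connections win; for α = 1/2 the branched graph is cheaper, in line with the sub-additivity argument above.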
In Chapter 2 we study different kinds of possible costs, and in some case we show
their equivalence. Moreover, we study the properties of the costs when seen as functions
of the parameter α, and we use this analysis to show a stability property of minimizers.
Let us assume that u is smooth, so that it produces a unique flow g. Writing the Euler
equations in terms of g, we get
\[
\begin{cases}
\ddot g(t, a) = -\nabla p\,(t, g(t, a)) & (t, a) \in [0, T] \times D, \\
g(0, a) = a & a \in D, \\
g(t, \cdot) \in \mathrm{SDiff}(D) & t \in [0, T],
\end{cases}
\tag{0.0.3}
\]
and proved that a minimizer η of the above variational problem solves in a “weak” sense
the Euler equations: for all $h \in C_c^\infty(0, 1)$ and every smooth compactly supported vector
field $w$ on $D$, we have
\[ \int_{\Omega(D)} \int_0^T \dot\omega(t) \cdot \frac{d}{dt}\big[h(t)\, w(\omega(t))\big]\, dt\, d\eta(\omega) = \langle D_x p(t, x),\, h(t) w(x) \rangle \]
in the distributional sense.
In particular, this condition identifies uniquely the pressure field p (as a distribution)
up to trivial modifications, i.e. additive perturbations depending on time only. We also
remark that, if the measure η is given by
\[ \int_{\Omega(D)} f(\omega)\, d\eta(\omega) = \int_D f\big(t \mapsto g(t, x)\big)\, dx \]
with $g : [0, T] \to \mathrm{SDiff}(D)$ smooth, then $u(t, x) := \partial_t g(t, g^{-1}(t, x))$ is a solution of the
Euler equations. Now an important problem is to study the structure of minimizers,
finding necessary and sufficient conditions for optimality, a question which will be ad-
dressed and solved in Chapter 3 following a joint work with Luigi Ambrosio [8]. As we
already said, the results we prove show a somewhat unexpected connection between the
variational theory of incompressible flows and the theory developed by Bernard-Buffoni
[22] of measures in the space of action-minimizing curves.
Indeed, first we slightly refine the deep analysis made in [35] of the regularity of
the gradient of the pressure field: Brenier proved that the distributions $\partial_{x_i} p$ are locally
finite measures in $(0, T) \times D$, but this information is not sufficient (due to a lack of time
regularity) to imply that $p$ is a function. In Section 3.7 we improve this result showing
that $p \in L^2_{loc}\big((0, T); BV_{loc}(D)\big)$ (this has been done in another joint work with Luigi
Ambrosio [9]). In particular $p$ is a function at least in some $L^1_{loc}(L^r_{loc})$ space, for some
$r > 1$. We can therefore develop a refined analysis of the necessary and sufficient
optimality conditions for action-minimizing curves in $\Gamma(D)$ (see Section 3.6) which involve
the Lagrangian
\[ L_p(\gamma) := \int \frac{1}{2} |\dot\gamma(t)|^2 - p(t, \gamma(t))\, dt, \]
the (locally) minimizing curves for $L_p$ and the value function induced by $L_p$.
We also remark that the possibility of deducing such a regularity result for the pressure
is based on the equivalence, proved in Section 3.4, of the above mentioned Brenier model
and the Eulerian-Lagrangian model introduced by the same author in [35] (see Section
3.3.3). Indeed, the regularity of p is easier to study within the latter model.
4. As we said, there is an important connection between Mather’s theory and optimal
transportation problems on manifolds [22, 23]. The key point is that cost functions
induced by Tonelli Lagrangians solve a Hamilton-Jacobi equation.
To study the dynamics of a Lagrangian system and to have uniqueness
of solutions of the Hamilton-Jacobi equation
\[ H(x, d_x u) = c, \]
it is important to understand the structure of some subsets of the tangent space which capture the
properties of the dynamics. Mather [105] proposed as an important problem to show
that the quotient Aubry set is totally disconnected if the Lagrangian (or, equivalently,
the Hamiltonian) is smooth. In Chapter 4 this problem will be completely solved up to
dimension 3, and in many particular cases in higher dimension, following a joint work
with Albert Fathi and Ludovic Rifford [67].
To understand the key idea of the proof, let us consider the particular case
\[ H(x, p) = \frac{1}{2} |p|^2 + V(x), \]
and without loss of generality let us assume $\max_x V(x) = 0$. Then in this case the
Hamilton-Jacobi equation one is interested in becomes
\[ \frac{1}{2} |d_x u|^2 + V(x) = 0 \]
(the value $c = 0$ is the Mañé critical value for the above Hamiltonian), and the
projected Aubry set is the set $\{V = 0\}$. As shown in Section 4.2, the key point to show
that the quotient Aubry set is totally disconnected (or small in the sense of the
Hausdorff dimension) is to prove a sort of Sard-type theorem for critical subsolutions of the
Hamilton-Jacobi equation (that is, functions $u$ which satisfy $\frac{1}{2} |d_x u|^2 + V(x) \le 0$), showing
that the image of the set $\{V = 0\}$ under the map $u : M \to \mathbb{R}$ has zero Lebesgue measure.
Although the function $u$ is only $C^1$, and so the classical Sard theorem cannot be applied,
in this case one has the extra information
\[ |d_x u|^2 \le -2 V(x), \]
and the function $V(x)$ is smooth by assumption. One can therefore use the regularity of
$V$ to deduce that $u$ is really “flat” near $\{V = 0\}$, and so to deduce the Sard-type result.
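The flatness of $u$ near $\{V = 0\}$ can be made quantitative by a short Taylor expansion. What follows is only a heuristic sketch in a local chart, and not the argument of Chapter 4: the point $x_0$, the local constant $C$ and the neighbourhood on which the estimates hold are notation introduced here for illustration.

```latex
% Let V(x_0) = 0. Since max V = 0, x_0 is a maximum point of the smooth function V,
% so dV(x_0) = 0 and Taylor's formula gives, for y close to x_0,
\[ 0 \le -V(y) \le C\, |y - x_0|^2 . \]
% The subsolution inequality (1/2)|d_y u|^2 + V(y) <= 0 then yields
\[ |d_y u| \le \sqrt{-2 V(y)} \le \sqrt{2C}\, |y - x_0| , \]
% and integrating along the segment from x_0 to y,
\[ |u(y) - u(x_0)| \le \int_0^1 \big| d_{x_0 + s(y - x_0)} u \big|\, |y - x_0|\, ds
   \le \sqrt{2C}\, |y - x_0|^2 \int_0^1 s\, ds = \sqrt{C/2}\; |y - x_0|^2 . \]
```

So $u$ varies at most quadratically around each point of $\{V = 0\}$, which is the kind of estimate a Sard-type argument relies on.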
5. In Chapter 5, we will develop a theory à la DiPerna-Lions for martingale solutions,
in the sense of Stroock-Varadhan, of stochastic differential equations.
In [56, 4], the authors developed a theory which, roughly speaking, allows one to prove
existence and uniqueness in a weak sense for solutions of ordinary differential equations
with nonsmooth coefficients. This theory is based on the classical links between the
transport (or the continuity) equation
\[ \partial_t \mu + \operatorname{div}(b \mu) = 0 \]
What one proves is that, in a suitable sense (see [56, 4, 5] for a precise statement),
existence and uniqueness for the ordinary differential equation hold for almost every
initial condition if, and only if, the partial differential equation is well-posed in $L^\infty$. It
was pointed out in [4] that this theory has a probabilistic flavour, and therefore it is very
natural to look for a more general theory concerning stochastic differential equations
whose limit, as the diffusion coefficient tends to 0, should be the DiPerna-Lions theory.
In Section 5.2 we obtain this type of extension [77]: first we study the links between
the Fokker-Planck equation
\[ \partial_t \mu_t + \sum_i \partial_i (b_i \mu_t) - \frac{1}{2} \sum_{ij} \partial_{ij} (a_{ij} \mu_t) = 0 \quad \text{in } [0, T] \times \mathbb{R}^d, \tag{0.0.4} \]
1.1 Introduction
The optimal transportation problem we consider in this chapter is the following: given
two probability measures µ and ν, defined on the measurable spaces X and Y, find a
measurable map $T : X \to Y$ with
\[ T_\# \mu = \nu, \tag{1.1.1} \]
and in such a way that T minimizes the transportation cost, that is
\[ \int_X c(x, T(x))\, d\mu(x) = \min_{S_\# \mu = \nu} \left\{ \int_X c(x, S(x))\, d\mu(x) \right\}. \]
Here $c : X \times Y \to \mathbb{R}$ is some given cost function, and the minimum is taken over all
measurable maps $S : X \to Y$ with $S_\# \mu = \nu$. When condition (1.1.1) is satisfied, we say
that T is a transport map, and if T also minimizes the cost we call it an optimal transport
map.
Even in Euclidean spaces, with the cost c equal to the Euclidean distance or its
square, the problem of the existence of an optimal transport map is far from being
trivial. Due to the strict convexity of the square of the Euclidean distance, the case
$c(x, y) = |x - y|^2$ is simpler to deal with than the case $c(x, y) = |x - y|$. The reader
should consult the books and surveys given above to have a better view of the history
of the subject, in particular Villani’s second book on the subject [133]. However for the
case where the cost is a distance, one should cite at least the work of Sudakov [129],
1 This chapter is based on joint works with Albert Fathi [66] and Cédric Villani [78], and on the work in
[75].
the transport map starting from an intermediate time (Theorem 1.5.2). All the technical
results about semi-concave functions and Tonelli Lagrangians used in our proofs are
collected in the appendix at the end of the thesis.
(here Π(µ, ν) denotes the set of plans). If γ is a minimizer for the Kantorovich formu-
lation, we say that it is an optimal plan. Using weak topologies, it is simple to prove
existence of optimal plans whenever X and Y are Polish spaces and c is lower semicon-
tinuous (see [118], [132, Proposition 2.1] or [133]).
It is well-known that a linear minimization problem with convex constraints, like
(1.2.1), admits a dual formulation. Before stating the duality formula, we make some
definitions similar to those of weak KAM theory (see [65]):
Definition 1.2.1 (c-subsolution). We say that a pair of functions ϕ : X → R∪{+∞},
ψ : Y → R ∪ {−∞} is a c-subsolution if
∀(x, y) ∈ X × Y, ψ(y) − ϕ(x) ≤ c(x, y).
Observe that when c is measurable and bounded below, and (ϕ, ψ) is a c-subsolution
with ϕ ∈ L1 (µ), ψ ∈ L1 (ν), then
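\[
\forall \gamma \in \Pi(\mu, \nu), \qquad
\int_Y \psi\, d\nu - \int_X \varphi\, d\mu
= \int_{X \times Y} (\psi(y) - \varphi(x))\, d\gamma(x, y)
\le \int_{X \times Y} c(x, y)\, d\gamma(x, y).
\]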
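The inequality above (the dual value of a c-subsolution never exceeds the cost of a plan) can be verified by hand on a finite example. This is only a sketch under stated assumptions: the two-point spaces, the quadratic cost, the choice ϕ = 0 and ψ(y) = min_x c(x, y), and all function names are hypothetical and serve only as an illustration.

```python
def is_c_subsolution(phi, psi, c, tol=1e-12):
    """Check psi(y) - phi(x) <= c(x, y) for all pairs of atoms."""
    return all(psi[j] - phi[i] <= c[i][j] + tol
               for i in range(len(phi)) for j in range(len(psi)))

def dual_value(phi, psi, mu, nu):
    """Integral of psi against nu minus integral of phi against mu."""
    return (sum(p * w for p, w in zip(psi, nu))
            - sum(f * w for f, w in zip(phi, mu)))

def plan_cost(gamma, c):
    """Integral of c against the plan gamma, given as a matrix gamma[i][j]."""
    return sum(gamma[i][j] * c[i][j]
               for i in range(len(c)) for j in range(len(c[0])))

# Hypothetical data: two atoms on each side, uniform weights, quadratic cost.
xs, ys = [0.0, 1.0], [0.0, 2.0]
mu, nu = [0.5, 0.5], [0.5, 0.5]
c = [[(x - y) ** 2 for y in ys] for x in xs]

phi = [0.0, 0.0]
psi = [min(c[i][j] + phi[i] for i in range(len(xs))) for j in range(len(ys))]

gamma_product = [[0.25, 0.25], [0.25, 0.25]]   # independent coupling
gamma_diagonal = [[0.5, 0.0], [0.0, 0.5]]      # x_0 -> y_0, x_1 -> y_1

dual = dual_value(phi, psi, mu, nu)
print(dual, plan_cost(gamma_product, c), plan_cost(gamma_diagonal, c))
```

Here the dual value coincides with the cost of the diagonal plan, so that plan is optimal and the pair (ϕ, ψ) attains equality on its support; the product plan is strictly more expensive.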
If moreover $\int_{X \times Y} c(x, y)\, d\gamma < +\infty$, and
\[ \int_{X \times Y} (\psi(y) - \varphi(x))\, d\gamma(x, y) = \int_{X \times Y} c(x, y)\, d\gamma(x, y), \]
Theorem 1.2.3 (Duality formula). Let X and Y be Polish spaces equipped with
probability measures µ and ν respectively, c : X × Y → R a lower semicontinuous cost
function bounded from below such that the infimum in the Kantorovitch problem (1.2.1)
is finite. Then a transport plan γ ∈ Π(µ, ν) is optimal if and only if there exists a
(c, γ)-calibrated subsolution (ϕ, ψ).
For a proof of this theorem see [120] and [133, Theorem 5.9 (ii)].
Here we study Monge’s problem on manifolds for a large class of cost functions
induced by Lagrangians like in [22], where the authors consider the case of compact
manifolds. We generalize their result to arbitrary non-compact manifolds.
Following the general scheme of proof, we will first prove a result on more general
costs, see Theorem 1.3.2. In this general result, the fact that the target space for the
Monge transport is a manifold is not necessary. So we will assume that only the source
space (for the Monge transport map) is a manifold.
Let M be an n-dimensional manifold (Hausdorff and with a countable basis), N a
Polish space, c : M × N → R a cost function, µ and ν two probability measures on M
and N respectively. We want to prove existence and uniqueness of an optimal transport
map T : M → N , under some reasonable hypotheses on c and µ.
One of the conditions on the cost c is given in the following definition:
Definition 1.2.4 (Twist Condition). For a given cost function $c(x, y)$, we define the
skew left Legendre transform as the partial map
\[ \Lambda^l_c : M \times N \to T^*M, \qquad \Lambda^l_c(x, y) = \Big(x, \frac{\partial c}{\partial x}(x, y)\Big), \]
whose domain of definition is
\[ D(\Lambda^l_c) = \Big\{ (x, y) \in M \times N \;\Big|\; \frac{\partial c}{\partial x}(x, y) \text{ exists} \Big\}. \]
Moreover, we say that c satisfies the left twist condition if $\Lambda^l_c$ is injective on $D(\Lambda^l_c)$.
One can define similarly the skew right Legendre transform $\Lambda^r_c : M \times N \to T^*N$ by
$\Lambda^r_c(x, y) = \big(y, \frac{\partial c}{\partial y}(x, y)\big)$. The domain of definition of $\Lambda^r_c$ is
$D(\Lambda^r_c) = \{(x, y) \in M \times N \mid \frac{\partial c}{\partial y}(x, y) \text{ exists}\}$.
We say that c satisfies the right twist condition if $\Lambda^r_c$ is injective on
$D(\Lambda^r_c)$.
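Two Euclidean examples may help to see what the left twist condition asks for. They are included only as an illustration, under the assumption $M = N = \mathbb{R}^n$, and are not taken from the text.

```latex
% Quadratic cost: the x-derivative determines y, so the left twist condition holds.
\[ c(x, y) = \tfrac{1}{2} |x - y|^2
   \;\Longrightarrow\;
   \frac{\partial c}{\partial x}(x, y) = x - y,
   \qquad (\Lambda^l_c)^{-1}(x, p) = (x,\, x - p). \]
% Distance cost: for y \neq x the derivative only records the direction of x - y,
\[ c(x, y) = |x - y|
   \;\Longrightarrow\;
   \frac{\partial c}{\partial x}(x, y) = \frac{x - y}{|x - y|} \quad (x \neq y), \]
% so y and y' = x + 2 (y - x) give the same value and \Lambda^l_c is not injective.
```

This is consistent with the earlier remark that the strictly convex cost $|x - y|^2$ is simpler to deal with than $|x - y|$.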
The usefulness of these definitions will be clear in Section 1.4, in which we will
treat the case where M = N and the cost is induced by a Lagrangian. This condition has
appeared already in the subject. It has been known (explicitly or not) by several people,
among them Gangbo (oral communication) and Villani (see [132, page 90]). It is used in
[22], since it is always satisfied for a cost coming from a Lagrangian, as we will see below.
We borrow the terminology “twist condition” from the theory of Dynamical Systems:
if $h : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $(x, y) \mapsto h(x, y)$ is $C^2$, one says that $h$ satisfies the twist condition
if there exists a constant $\alpha > 0$ such that $\frac{\partial^2 h}{\partial x \partial y} \ge \alpha$ everywhere. In that case both
maps $\Lambda^l_h : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times \mathbb{R}$, $(x, y) \mapsto (x, \partial h/\partial x(x, y))$ and
$\Lambda^r_h : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times \mathbb{R}$, $(x, y) \mapsto (y, \partial h/\partial y(x, y))$ are $C^1$ diffeomorphisms.
The twist map $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times \mathbb{R}$ associated to
$h$ is determined by $f(x_1, v_1) = (x_2, v_2)$, where $v_1 = -\partial h/\partial x(x_1, x_2)$, $v_2 = \partial h/\partial y(x_1, x_2)$,
which means $f(x_1, v_1) = \Lambda^r_h \circ [\Lambda^l_h]^{-1}(x_1, -v_1)$, see [102] or [79].
We now recall some useful measure-theoretical facts that we will need in the sequel.
$\{(x, y) \mid \frac{\partial c}{\partial x}(x, y) \text{ exists}\}$ is Borel measurable.
Moreover $(x, y) \mapsto \frac{\partial c}{\partial x}(x, y)$ is a Borel function on that set.
Proof. This is a standard result in measure theory; we give here just a sketch of the proof.
By the locality of the statement, using charts we can assume $M = \mathbb{R}^n$. Let
$T_k : \mathbb{R}^n \to \mathbb{R}$ be a dense countable family of linear maps. For any $j, k \in \mathbb{N}$, we consider the Borel
function
\[ L_{j,k}(x, y) := \sup_{|h| \in (0, \frac{1}{j})} \frac{|c(x + h, y) - c(x, y) - T_k(h)|}{|h|} \]
where in the second equality we used the continuity of $x \mapsto c(x, y)$. Then it is not difficult
to show that the set of points where $\frac{\partial c}{\partial x}(x, y)$ exists can be written as
To show that $x \mapsto \frac{\partial c}{\partial x}(x, y)$ is Borel, it suffices to note that the partial derivatives
\[ \frac{\partial c}{\partial x_i}(x, y) = \lim_{\ell \to \infty} \frac{c(x_1, \dots, x_i + \frac{1}{\ell}, \dots, x_n, y) - c(x_1, \dots, x_i, \dots, x_n, y)}{1/\ell} \]
are countable limits of continuous functions, and hence are Borel measurable.
Therefore, by the above lemma, $D(\Lambda^l_c)$ is a Borel set. If we moreover assume that c
satisfies the left twist condition (that is, $\Lambda^l_c$ is injective on $D(\Lambda^l_c)$), then one can define
Then, by the injectivity assumption, one has that $\Lambda^l_c(D(\Lambda^l_c))$ is still a Borel set, and
$(\Lambda^l_c)^{-1}$ is a Borel map (see [51, Proposition 8.3.5 and Theorem 8.3.7], [70]). We can thus
extend $(\Lambda^l_c)^{-1}$ as a Borel map on the whole $T^*M$ as
\[
\Lambda^{l,inv}_c(x, p) =
\begin{cases}
(\Lambda^l_c)^{-1}(x, p) & \text{if } p \in T_x^*M \cap \Lambda^l_c(D(\Lambda^l_c)), \\
(x, \bar y) & \text{if } p \in T_x^*M \setminus \Lambda^l_c(D(\Lambda^l_c)),
\end{cases}
\]
(i) the family of maps $x \mapsto c(x, y) = c_y(x)$ is locally semi-concave in x locally uniformly
in y,
We can find a Borel countably (n − 1)-Lipschitz set $E \subset M$ and a Borel measurable map
$T : M \to N$ such that
\[ G_{(\varphi,\psi)} \subset \mathrm{Graph}(T) \cup \pi_M^{-1}(E), \]
where $\pi_M : M \times N \to M$ is the canonical projection, and $\mathrm{Graph}(T) = \{(x, T(x)) \mid x \in M\}$
is the graph of T.
In other words, if we define $P = \pi_M\big(G_{(\varphi,\psi)}\big) \subset M$, the part of $G_{(\varphi,\psi)}$ which is above
$P \setminus E$ is contained in a Borel graph.
More precisely, we will prove that there exist an increasing sequence of locally semi-
convex functions ϕn : M → R, with ϕ ≥ ϕn+1 ≥ ϕn on M , and an increasing sequence
of Borel subsets Cn such that
where $\Lambda^{l,inv}_c$ is the extension of the inverse of $\Lambda^l_c$ defined at the end of Section 1.2.
• If $x \in P \cap C_n \setminus E$, then the partial derivative $\frac{\partial c}{\partial x}(x, T(x))$ exists (i.e.
$(x, T(x)) \in D(\Lambda^l_c)$), and
\[ \frac{\partial c}{\partial x}(x, T(x)) = -d_x \varphi_n. \]
In particular, if x ∈ P ∩ Cn \ E, we have
The existence and uniqueness of a transport map is then a simple consequence of the
above theorem.
(i) the family of maps $x \mapsto c(x, y) = c_y(x)$ is locally semi-concave in x locally uniformly
in y,
Then there exists a Borel map $T : M \to N$, which is an optimal transport map from
µ to ν for the cost c. Moreover, the map T is unique µ-a.e., and any plan $\gamma_c \in \Pi(\mu, \nu)$
optimal for the cost c is concentrated on the graph of T.
More precisely, if $(\varphi, \psi)$ is a $(c, \gamma_c)$-calibrating pair, with the notation of Theorem
1.3.1, there exists an increasing sequence of Borel subsets $B_n$, with $\mu(\cup_n B_n) = 1$, such
that the map T is uniquely defined on $B = \cup_n B_n$ via
\[ \frac{\partial c}{\partial x}(x, T(x)) = -d_x \varphi_n \quad \text{on } B_n, \]
and any optimal plan $\gamma \in \Pi(\mu, \nu)$ is concentrated on the graph of that map T.
We remark that condition (iv) is trivially satisfied if
\[ \int_{M \times N} c(x, y)\, d\mu(x)\, d\nu(y) < +\infty. \]
However we needed to state the above theorem in this more general form in order to
apply it in Section 1.5 (see Remark 1.5.3).
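The relation between $\frac{\partial c}{\partial x}(x, T(x))$ and the differential of the potential can be motivated by a first-order computation. The following is only a heuristic sketch: it assumes that $c$ is differentiable in $x$ and that $\varphi$ is differentiable at the point $x_0$ considered, which the theorem itself does not require (it works with the approximating functions $\varphi_n$ instead).

```latex
% Let (x_0, y_0) satisfy \psi(y_0) - \varphi(x_0) = c(x_0, y_0), with (\varphi, \psi) a c-subsolution.
% Then the function
\[ x \longmapsto \varphi(x) + c(x, y_0) - \psi(y_0) \;\ge\; 0 \]
% vanishes at x = x_0, hence attains its minimum there, and the first-order condition reads
\[ d_{x_0} \varphi + \frac{\partial c}{\partial x}(x_0, y_0) = 0 . \]
% If c satisfies the left twist condition, this equation determines y_0 uniquely from x_0.
% Example (assuming M = N = \mathbb{R}^n and c(x, y) = \tfrac{1}{2} |x - y|^2):
\[ x_0 - y_0 = -\nabla \varphi(x_0)
   \quad \Longrightarrow \quad
   y_0 = x_0 + \nabla \varphi(x_0) . \]
```

In the quadratic example the calibrated points therefore lie on the graph of $x \mapsto x + \nabla\varphi(x)$.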
Proof of Theorem 1.3.2. Let $\gamma_c \in \Pi(\mu, \nu)$ be an optimal plan. By Theorem 1.2.3 there
exists a $(c, \gamma_c)$-calibrated pair $(\varphi, \psi)$. Consider the set
Since both M and N are Polish and both maps ϕ and ψ are Borel, the subset G is a
Borel subset of M × N . Observe that, by the definition of (c, γc )-calibrated pair, we have
γc (G) = 1.
By Theorem 1.3.1 there exists a Borel countably (n − 1)-Lipschitz set E such that
$G \setminus (\pi_M)^{-1}(E)$ is contained in the graph of a Borel map T. This implies that
\[ B = \pi_M\big(G \setminus (\pi_M)^{-1}(E)\big) = \pi_M(G) \setminus E \subset M \]
is a Borel set, since it coincides with $(\mathrm{Id}_M \tilde\times T)^{-1}(G \setminus (\pi_M)^{-1}(E))$ and the map
$x \mapsto \mathrm{Id}_M \tilde\times T(x) = (x, T(x))$ is Borel measurable.
Thus, recalling that the first marginal of $\gamma_c$ is µ, by assumption (iii) we get
$\gamma_c((\pi_M)^{-1}(E)) = \mu(E) = 0$. Therefore $\gamma_c(G \setminus (\pi_M)^{-1}(E)) = 1$, so that $\gamma_c$ is
concentrated on the graph of T, which gives the existence of an optimal transport map. Note now that
$\mu(B) = \gamma_c((\pi_M)^{-1}(B)) \ge \gamma_c(G \setminus (\pi_M)^{-1}(E)) = 1$. Therefore $\mu(B) = 1$. Since $B = P \setminus E$,
where $P = \pi_M(G)$, using the Borel sets $C_n$ provided by Theorem 1.3.1, it follows that
$B_n = P \cap C_n \setminus E = B \cap C_n$ is a Borel set with $B = \cup_n B_n$. The end of Theorem 1.3.1
shows that T is indeed uniquely defined on B as said in the statement.
Let us now prove the uniqueness of the transport map µ-a.e. If S is another optimal
transport map, consider the measures $\gamma_T = (\mathrm{Id}_M \times T)_\# \mu$ and $\gamma_S = (\mathrm{Id}_M \times S)_\# \mu$. The
measure $\bar\gamma = \frac{1}{2}(\gamma_T + \gamma_S) \in \Pi(\mu, \nu)$ is still an optimal plan, and therefore must be
concentrated on a graph. This implies that S = T µ-a.e., and thus T is the
unique optimal transport map. Finally, since any optimal γ ∈ Π(µ, ν) is concentrated
on a graph, we also deduce that any optimal plan is concentrated on the graph of T .
\[ V_n = W_n \cap \Big( \bigcup_{1 \le k \le n} V_{y_k} \Big). \]
ϕn ≤ ϕn+1 ≤ ϕ everywhere on M .
ϕ|Pn = ϕn |Pn ,
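\[ \frac{\partial c}{\partial x}(x, T(x)) = -d_x \varphi_n \]
(observe that, since $\varphi_n \le \varphi_k$ for $k \ge n$ with equality on $P_n$, we have $d_x \varphi_n|_{P_n} = d_x \varphi_k|_{P_n}$
for $k \ge n$). Since $P_{n+1} \supset P_n$, and $V_n \subset V_{n+1} \nearrow N$, we can conclude that $G_{(\varphi,\psi)}$ is a
graph over $\cup_n P_n \cap F = P \cap F$ (where $P = \pi_M(G_{(\varphi,\psi)}) = \cup_n P_n$).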
Observe that, at the moment, we do not know that T is a Borel map, since $P_n$ is not
a priori Borel. Note first that by definition of $B_n \subset B_{n+1}$, we have $\varphi_n = \varphi_{n+1}$ on $B_n$, and
they are both differentiable at every point of $B_n$. Since $\varphi_n \le \varphi_{n+1}$ everywhere, by the
same argument as above we get $d_x \varphi_n = d_x \varphi_{n+1}$ for $x \in B_n$. Thus, setting $B = \cup_n B_n$,
we can extend T to M by
\[
T(x) =
\begin{cases}
\pi_N \Lambda^{l,inv}_c(x, -d_x \varphi_n) & \text{on } B_n, \\
\bar y & \text{on } M \setminus B,
\end{cases}
\]
have density 0 at x for each ε > 0 with respect to Lebesgue measure. This last definition
is the one systematically used in [70]. On the other hand, for our purpose, Definition
1.3.3 is more convenient.
The set of points $x \in M$ where the approximate derivative $\tilde d_x f$ exists is measurable;
moreover, the map $x \mapsto \tilde d_x f$ is also measurable, see [70, Theorem 3.1.4, page 214].
\[ \frac{\partial c}{\partial x}(x, T(x)) = -\tilde d_x \varphi, \]
We will consider first the case of a Tonelli Lagrangian L on a connected manifold (see
Definition 6.2.4 of the Appendix for the definition of a Tonelli Lagrangian). For t > 0,
the cost $c_{t,L} : M \times M \to \mathbb{R}$ associated to $L$ is given by
\[ c_{t,L}(x, y) = \inf_\gamma \int_0^t L(\gamma(s), \dot\gamma(s))\, ds, \]
where the infimum is taken over all the continuous piecewise C1 curves γ : [0, t] → M ,
with γ(0) = x, and γ(t) = y (see Definition 6.2.18 of the Appendix).
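As a sanity check on this definition, consider the simplest case $M = \mathbb{R}$ and $L(x, v) = \frac{1}{2}|v|^2$ (an assumption made here for illustration), for which the infimum is attained by the straight line and equals $|y - x|^2 / (2t)$. The sketch below evaluates the action on piecewise linear curves; the function names and the sinusoidal perturbation are hypothetical.

```python
import math

def action(path, t):
    """Action of L(x, v) = v**2 / 2 along the piecewise linear curve through `path`,
    sampled at equally spaced times in [0, t] (exact for such curves)."""
    ds = t / (len(path) - 1)
    return sum(0.5 * ((b - a) / ds) ** 2 * ds for a, b in zip(path, path[1:]))

def straight_line(x, y, n):
    """Samples of the constant-speed segment from x to y."""
    return [x + (y - x) * k / n for k in range(n + 1)]

def perturbed_line(x, y, n, amplitude):
    """Same endpoints (up to rounding), with a sinusoidal bump added in between."""
    return [p + amplitude * math.sin(math.pi * k / n)
            for k, p in enumerate(straight_line(x, y, n))]

t, x, y, n = 2.0, 0.0, 3.0, 100
closed_form = (y - x) ** 2 / (2 * t)   # c_{t,L}(x, y) for this particular Lagrangian

print(closed_form,
      action(straight_line(x, y, n), t),
      action(perturbed_line(x, y, n, 0.5), t))
```

The straight line reproduces the closed-form value, and any perturbation with the same endpoints has a larger action.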
Proposition 1.4.1. If L : T M → R is a Tonelli Lagrangian on the connected manifold
M , then, for t > 0, the cost ct,L : M × M → R associated to the Lagrangian L is
continuous, bounded from below, and satisfies conditions (i) and (ii) of Theorem 1.3.2.
Proof. Since L is a Tonelli Lagrangian, observe that L is bounded below by C, where C is
the constant given in condition (c) of Definition 6.2.4. Hence the cost ct,L is bounded be-
low by tC. By Theorem 6.2.19 of the Appendix, the cost ct,L is locally semi-concave, and
therefore continuous. Moreover, we can now apply Proposition 6.1.17 of the Appendix
to conclude that ct,L satisfies condition (i) of Theorem 1.3.2.
The twist condition (ii) of Theorem 1.3.2 for ct,L follows from Lemma 6.2.22 and
Proposition 6.2.23.
For costs coming from Tonelli Lagrangians, we summarize the application of the main
Theorem 1.3.2 and of its Complement 1.3.4.
Theorem 1.4.2. Let L be a Tonelli Lagrangian on the connected manifold M . Fix t > 0,
µ, ν a pair of probability measure on M , with µ giving measure zero to countably (n−1)-
Lipschitz sets, and assume that the infimum in the Kantorovitch problem (1.2.1) with
cost ct,L is finite. Then there exists a uniquely µ-almost everywhere defined transport
map T : M → M from µ to ν which is optimal for the cost ct,L . Moreover, any plan
γ ∈ Π(µ, ν), which is optimal for the cost ct,L , verifies γ(Graph(T )) = 1.
If µ is absolutely continuous with respect to Lebesgue measure, and (ϕ, ψ) is a ct,L -
calibrated subsolution for (µ, ν), then we can find a Borel set B of full µ measure, such
that the approximate differential d˜x ϕ of ϕ at x is defined for x ∈ B, the map x 7→ d˜x ϕ
is Borel measurable on B, and the transport map T is defined on B (hence µ-almost
everywhere) by
\[ T(x) = \pi^* \phi^H_t(x, \tilde d_x\varphi), \]
Proof. The first part is a consequence of Proposition 1.4.1 and Theorem 1.3.2. When µ
is absolutely continuous with respect to Lebesgue measure, we can apply Complement
1.3.4 to obtain a Borel subset A ⊂ M of full µ measure such that, for every x ∈ A, we
have (x, T (x)) ∈ D(Λlct,L ) and
\[ \frac{\partial c_{t,L}}{\partial x}(x, T(x)) = -\tilde d_x\varphi. \]
By Lemma 6.2.22 and Proposition 6.2.23, if (x, y) ∈ D(Λlct,L ), then there is a unique
L-minimizer γ : [0, t] → M , with γ(0) = x, γ(t) = y, and this minimizer is of the form
γ(s) = πφLs (x, v), where π : T M → M is the canonical projection, and v ∈ Tx M is
uniquely determined by the equation
\[ \frac{\partial c_{t,L}}{\partial x}(x, y) = -\frac{\partial L}{\partial v}(x, v). \]
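Therefore $T(x) = \pi\phi^L_t(x, \widetilde{\operatorname{grad}}^L_x(\varphi))$, where $\widetilde{\operatorname{grad}}^L_x(\varphi)$ is uniquely determined by
\[ \frac{\partial L}{\partial v}(x, \widetilde{\operatorname{grad}}^L_x(\varphi)) = \tilde d_x\varphi. \]
\[ \pi^* \circ \mathscr{L} = \pi, \]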
Theorem 1.4.3. Suppose that the connected manifold M is endowed with a Riemannian
metric g which is complete. Denote by d the Riemannian distance. If r > 1, and µ and ν
are probability (Borel) measures on M , where µ gives measure zero to countably (n − 1)-
Lipschitz sets, and
\[ \int_M d^r(x, x_0)\, d\mu(x) < \infty \quad\text{and}\quad \int_M d^r(x, x_0)\, d\nu(x) < \infty \]
where the approximate Riemannian gradient $\tilde\nabla\varphi$ of ϕ is defined by
\[ g_x(\tilde\nabla_x\varphi, \cdot) = \tilde d_x\varphi, \]
Therefore
\[
\begin{aligned}
\int_{M\times M} d^r(x, y)\, d\mu(x)\,d\nu(y) &\le 2^r \int_{M\times M} [d(x, x_0)^r + d(y, x_0)^r]\, d\mu(x)\,d\nu(y) \\
&= 2^r \int_M d^r(x, x_0)\, d\mu(x) + 2^r \int_M d^r(y, x_0)\, d\nu(y) < \infty,
\end{aligned}
\]
and thus the infimum in the Kantorovitch problem (1.2.1) with cost dr is finite.
By Example 6.2.5, the Lagrangian $L_{r,g}(x, v) = \|v\|_x^r = g_x(v, v)^{r/2}$ is a weak Tonelli
Lagrangian. By Proposition 6.2.24, the non-negative and continuous cost dr (x, y) is
precisely the cost c1,Lr,g . Therefore this cost is locally semi-concave by Theorem 6.2.19.
By Proposition 6.1.17, this implies that dr (x, y) satisfies condition (i) of Theorem 1.3.2.
The fact that the cost dr (x, y) satisfies the left twist condition (ii) of Theorem 1.3.2
follows from Proposition 6.2.24. Therefore there is an optimal transport map T .
If the measure µ is absolutely continuous with respect to Lebesgue measure, and
(ϕ, ψ) is a calibrated subsolution for the cost dr (x, y) and the pair of measures (µ, ν),
then by Complement 1.3.4, for µ-almost every x, we have (x, T (x)) ∈ D(Λlc1,Lr,g ), and
\[ \frac{\partial c_{1,L_{r,g}}}{\partial x}(x, T(x)) = -\tilde d_x\varphi. \]
Since (x, T (x)) is in D(Λlc1,Lr,g ), it follows from Proposition 6.2.24 that T (x) = πφg1 (x, vx ),
where π : T M → M is the canonical projection, the flow φgt is the geodesic flow of g on
T M , and vx ∈ Tx M is determined by
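\[ \frac{\partial c_{1,L_{r,g}}}{\partial x}(x, T(x)) = -\frac{\partial L_{r,g}}{\partial v}(x, v_x), \]
or, given the equality above, by
\[ \frac{\partial L_{r,g}}{\partial v}(x, v_x) = \tilde d_x\varphi. \]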
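Now the vertical derivative of $L_{r,g}$ is computed in Example 6.2.5:
\[ \frac{\partial L_{r,g}}{\partial v}(x, v) = r\|v\|_x^{r-2}\, g_x(v, \cdot). \]
Hence $v_x \in T_xM$ is determined by
\[ r\|v_x\|_x^{r-2}\, g_x(v_x, \cdot) = \tilde d_x\varphi = g_x(\tilde\nabla_x\varphi, \cdot). \]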
Therefore
\[ T(x) = \pi\phi^g_1\Bigl(x, \frac{\tilde\nabla_x\varphi}{r^{1/(r-1)}\, \|\tilde\nabla_x\varphi\|_x^{(r-2)/(r-1)}}\Bigr). \]
By definition of the exponential map exp : T M → M , we have $\exp_x(v) = \pi\phi^g_1(x, v)$, and
the formula for T (x) follows.
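The algebra leading to this formula can be checked numerically in the flat case. The sketch below assumes M is Euclidean space, so that g_x is the dot product and exp_x(v) = x + v; the vector `p` stands in for the approximate gradient of ϕ at x, and the function names are hypothetical.

```python
import math

def initial_velocity(p, r):
    """Solve r * |v|**(r - 2) * v = p for v in Euclidean space (r > 1).

    Assumption: flat metric, so g_x is the dot product and the vector p
    plays the role of the approximate gradient of phi at x.
    """
    norm_p = math.sqrt(sum(c * c for c in p))
    if norm_p == 0.0:
        return [0.0 for _ in p]
    scale = r ** (1.0 / (r - 1.0)) * norm_p ** ((r - 2.0) / (r - 1.0))
    return [c / scale for c in p]


def transport(x, p, r):
    """T(x) = exp_x(v) = x + v in the flat case."""
    v = initial_velocity(p, r)
    return [xi + vi for xi, vi in zip(x, v)]


r = 3.0
p = [3.0, 4.0]
v = initial_velocity(p, r)
norm_v = math.sqrt(sum(c * c for c in v))
# Left-hand side of the determining equation r * |v|^(r-2) * v = p.
lhs = [r * norm_v ** (r - 2.0) * c for c in v]
print(v, lhs)
```

For r = 2 the scaling factor reduces to 2, so the velocity is half of the gradient, in agreement with the cost d^2 (as opposed to d^2/2).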
1.5. The interpolation and its absolute continuity 33
We call Tt the optimal transport map given by Theorem 1.4.2 for (ct,L , µ0 , µt ). We denote
by (ϕ, ψ) a fixed (ct,L , γt )-calibrated pair. Therefore $\gamma_t = (\mathrm{Id}_M \tilde\times T_t)_\# \mu_0$ is the unique
optimal plan from µ0 to µt . By Theorem 1.4.2, we can find a Borel subset B ⊂ M such
that:
• the approximate differential d˜x ϕ exists for every x ∈ B, and is Borel measurable
on B;
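\[ T_t(x) = \pi\phi^L_t(x, \widetilde{\operatorname{grad}}^L_x(\varphi)), \]
\[ \frac{\partial L}{\partial v}(x, \widetilde{\operatorname{grad}}^L_x(\varphi)) = \tilde d_x\varphi; \]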
• for every x ∈ B, the partial derivative $\frac{\partial c}{\partial x}(x, T_t(x))$ exists, and is uniquely defined
by
\[ \frac{\partial c}{\partial x}(x, T_t(x)) = -\tilde d_x\varphi; \]
We now make the following important remark, that we will need also in the sequel:
Remark 1.5.1. We observe that, for µ0 -a.e. x, there exists a unique curve from x to
Tt (x) that minimizes the action. In fact, since $\frac{\partial c}{\partial x}(x, y)$ exists at y = Tt (x) for µ0 -a.e. x,
the twist conditions proved in Section 1.4 tell us that its velocity at time 0 is µ0 -a.e.
uniquely determined.
\[ \forall x \in B, \quad T_s(x) = \gamma_x(s) = \pi\phi^L_s(x, \widetilde{\operatorname{grad}}^L_x(\varphi)). \]
Each map Ts is Borel measurable. In fact, since the global Legendre transform is a
homeomorphism and the approximate differential is Borel measurable, the Lagrangian
approximate gradient $\widetilde{\operatorname{grad}}^L(\varphi)$ is itself Borel measurable. Moreover the map $\pi\phi^L_s : TM \to M$
is continuous, and thus Ts is Borel measurable. We can therefore define the
probability measure µs = Ts# µ0 on M , i.e. the measure µs is the image of µ0 under the
Borel measurable map Ts .
Theorem 1.5.2. Under the hypotheses above, the maps Ts satisfy the following
properties:
(i) For every s ∈ (0, t), the map Ts is the (unique) optimal transport map for the cost
cs,L and the pair of measures (µ0 , µs ).
(ii) For every s ∈ (0, t), the map Ts : B → M is injective. Moreover, if we define
c̄s,L (x, y) = cs,L (y, x), the inverse map Ts−1 : Ts (B) → B is the (unique) optimal
transport map for the cost c̄s,L and the pair of measures (µs , µ0 ), and it is countably
Lipschitz (i.e. there exists a countable Borel partition of M such that Ts−1 is
Lipschitz on each set).
(iii) For every s ∈ (0, t), the measure µs = Ts# µ0 is absolutely continuous with respect
to Lebesgue measure.
(iv) For every s ∈ (0, t), the composition T̂s = Tt Ts−1 is the (unique) optimal trans-
port map for the cost ct−s,L and the pair of measures (µs , µt ), and it is countably
Lipschitz.
Proof. Fix s ∈ (0, t). It is not difficult to see, from the definition of ct,L , that
\[ \forall s \in\, ]a, b[, \quad c_{b-a,L}(\gamma(a), \gamma(b)) = c_{s-a,L}(\gamma(a), \gamma(s)) + c_{b-s,L}(\gamma(s), \gamma(b)). \]
\[ \forall x \in B, \quad c_{t,L}(x, T_t(x)) = c_{s,L}(x, T_s(x)) + c_{t-s,L}(T_s(x), T_t(x)). \qquad (1.5.2) \]
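The additivity of the cost along a minimizer can be seen explicitly in a model case. The sketch below assumes the Lagrangian L(x, v) = v^2/2 on the real line, for which c_τ(x, y) = |x − y|^2/(2τ) and minimizers are straight segments travelled at constant speed; the function names and sample values are illustrative assumptions.

```python
def cost(tau, x, y):
    """Model cost c_tau(x, y) = |x - y|^2 / (2 tau) on the real line.

    Assumption: this is the cost associated with L(x, v) = v**2 / 2.
    """
    return (x - y) ** 2 / (2.0 * tau)


def minimizer(s, t, x, y):
    """Point at time s of the straight-line minimizer from x to y on [0, t]."""
    return x + (s / t) * (y - x)


t, s, x, y = 1.0, 0.25, -1.0, 2.0
z = minimizer(s, t, x, y)
# Splitting the cost at the point of the minimizer reached at time s.
split_on_curve = cost(s, x, z) + cost(t - s, z, y)
# Splitting at a point that is not on the minimizer.
split_off_curve = cost(s, x, z + 0.5) + cost(t - s, z + 0.5, y)
print(cost(t, x, y), split_on_curve, split_off_curve)
```

Splitting at the point of the minimizer reproduces the total cost, while splitting at any other intermediate point gives a strictly larger sum in this model.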
Note that the marginals of γs are (µ0 , µs ), and those of γ̂s are (µs , µt ). We claim that
cs,L (x, y) is integrable for γs and γ̂t−s . In fact, we have C = inf T M L > −∞, hence
cr,L ≥ Cr. Therefore, the equality (1.5.2) gives
∀x ∈ B, [ct,L (x, Tt (x)) − Ct] = [cs,L (x, Ts (x)) − Cs] + [ct−s,L (Ts (x), Tt (x)) − C(t − s)].
Since the functions between brackets are all non-negative, we can integrate this equality
with respect to µ0 to obtain
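\[
\begin{aligned}
\int_{M\times M} [c_{t,L}(x, y) - Ct]\, d\gamma_t &= \int_{M\times M} [c_{s,L}(x, y) - Cs]\, d\gamma_s \\
&\quad + \int_{M\times M} [c_{t-s,L}(x, y) - C(t - s)]\, d\hat\gamma_s.
\end{aligned}
\]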
But all numbers involved in the equality above are non-negative, all measures are proba-
bility measures, and the cost ct,L is γt integrable since γt is an optimal plan for (ct,L , µ0 , µt ),
and the optimal cost of (ct,L , µ0 , µt ) is finite. Therefore we obtain that cs,L is γs -integrable,
and ct−s,L is γ̂s -integrable.
Since by definition of a calibrating pair we have ϕ > −∞ and ψ < +∞ everywhere
on M , we can find an increasing sequence of compact subsets Kn ⊂ M with ∪n Kn = M ,
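and we consider $V_n = K_n \cap \{\varphi \ge -n\}$, $V'_n = K_n \cap \{\psi \le n\}$, so that $\cup_n V_n = \cup_n V'_n = M$.
We define the functions $\varphi^n_s, \psi^n_s : M \to \mathbb{R}$ by
\[ \psi^n_s(z) = \inf_{\tilde z \in V_n} \varphi(\tilde z) + c_{s,L}(\tilde z, z), \]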
where (ϕ, ψ) is the fixed ct,L -calibrated pair. Note that ψsn is bounded from below by
−n + t inf T M L > −∞. Moreover, the family of functions (ϕ(z̃) + cs,L (z̃, ·))z̃∈Vn0 is locally
uniformly semi-concave with a linear modulus, since this is the case for the family of
functions (cs,L (z̃, ·))z̃∈Vn0 , by Theorem 6.2.19 and Proposition 6.1.17. It follows from
Proposition 6.1.16 that ψsn is semi-concave with a linear modulus. A similar argument
proves that $-\varphi^n_s$ is semi-concave with a linear modulus. Note also that, since $V_n$ and $V'_n$
are both increasing sequences, we have $\psi^n_s \ge \psi^{n+1}_s$ and $\varphi^n_s \le \varphi^{n+1}_s$ for every n. Therefore
we can define $\varphi_s$ (resp. $\psi_s$) as the pointwise limit of the sequence $\varphi^n_s$ (resp. $\psi^n_s$).
Using the fact that (ϕ, ψ) is a ct,L -subsolution, and inequality (1.5.1) above, we obtain
∀x, y, z ∈ M, ψ(y) − ct−s,L (z, y) ≤ ϕ(x) + cs,L (x, z).
Therefore we obtain for x ∈ Vn , y ∈ Vn0 , z ∈ M
ψ(y) − ct−s,L (z, y) ≤ ϕns (z) ≤ ϕs (z) ≤ ψs (z) ≤ ψsn (z) ≤ ϕ(x) + cs,L (x, z). (1.5.3)
Inequality (1.5.3) above yields
∀x, y, z ∈ M, ψ(y) − ct−s,L (z, y) ≤ ϕs (z) ≤ ψs (z) ≤ ϕ(x) + cs,L (x, z). (1.5.4)
In particular, the pair (ϕ, ψs ) is a cs,L -subsolution, and the pair (ϕs , ψ) is a ct−s,L -
subsolution. Moreover, ϕ, ψs , ϕs and ψ are all Borel measurable.
We now define
Bn = B ∩ Vn ∩ Tt−1 (Vn0 ),
so that ∪n Bn = B has full µ0 -measure.
If x ∈ Bn , it satisfies x ∈ Vn and Tt (x) ∈ Vn0 . From Inequality (1.5.3) above, we
obtain
ψ(Tt (x)) − ct−s,L (Ts (x), Tt (x)) ≤ ϕns (Ts (x)) ≤ ϕs (Ts (x))
≤ ψs (Ts (x)) ≤ ψsn (Ts (x)) ≤ ϕ(x) + cs,L (x, Ts (x))
Since Bn ⊂ B, for x ∈ Bn , we have ψ(Tt (x)) − ϕ(x) = ct,L (x, Tt (x)). Combining this
with Equality (1.5.2), we conclude that the two extreme terms in the inequality above
are equal. Hence, for every x ∈ Bn , we have
ψ(Tt (x)) − ct−s,L (Ts (x), Tt (x)) = ϕns (Ts (x)) = ϕs (Ts (x))
= ψs (Ts (x)) = ψsn (Ts (x)) = ϕ(x) + cs,L (x, Ts (x)). (1.5.5)
In particular, we get
or equivalently
ψs (y) − ϕ(x) = cs,L (x, y) for γs -a.e. (x, y).
Since (ϕ, ψs ) is a (Borel) cs,L -subsolution, it follows that the pair (ϕ, ψs ) is (cs,L , γs )-
calibrated. Therefore, by Theorem 1.2.3 we get that $\gamma_s = (\mathrm{Id}_M \tilde\times T_s)_\# \mu_0$ is an optimal
plan for (cs,L , µ0 , µs ). Moreover, since cs,L is γs -integrable, the infimum in the Kan-
torovitch problem (1.2.1) in Theorem 1.3.2 with cost cs,L is finite, and therefore there
exists a unique optimal transport plan. This proves (i).
Note for further reference that a similar argument, using the equality
which follows from Equation (1.5.5) above, shows that the measure $\hat\gamma_s = (T_s \tilde\times T_t)_\# \mu_0$ is
an optimal plan for the cost ct−s,L and the pair of measures (µs , µt ).
We now want to prove (ii). Since B is the increasing union of $B_n = B \cap V_n \cap T_t^{-1}(V'_n)$,
it suffices to prove that $T_s$ is injective on $B_n$ and that the restriction $T_s^{-1}|_{T_s(B_n)}$ is locally
Lipschitz on $T_s(B_n)$.
Since Bn ⊂ Vn , by Inequality (1.5.3) above we have
∀x ∈ Bn , ϕns (Ts (x)) = ψsn (Ts (x)) = ϕ(x) + cs,L (x, Ts (x)). (1.5.7)
In particular, we have ϕns ≤ ψsn everywhere with equality at every point of Ts (Bn ). As
we have said above, both functions ψsn and −ϕns are locally semi-concave with a linear
modulus. It follows, from Theorem 6.1.19, that both derivatives dz ϕns , dz ψsn exist and
are equal for z ∈ Ts (Bn ). Moreover, the map z 7→ dz ϕns = dz ψsn is locally Lipschitz on
Ts (Bn ). Note that we also get from (1.5.6) and (1.5.7) above that for a fixed x ∈ Bn , we
have $\varphi^n_s \le \varphi(x) + c_{s,L}(x, \cdot)$ everywhere with equality at $T_s(x)$. Since $\varphi^n_s$ is semi-convex,
using that $c_{s,L}(x, \cdot)$ is semi-concave, again by Theorem 6.1.19, we obtain that the partial
derivative $\frac{\partial c_{s,L}}{\partial y}(x, T_s(x))$ of $c_{s,L}$ with respect to the second variable exists and is equal
to $d_{T_s(x)}\varphi^n_s = d_{T_s(x)}\psi^n_s$. Since $\gamma_x : [0, t] \to M$ is an L-minimizer with $\gamma_x(0) = x$ and
$\gamma_x(s) = T_s(x)$, it follows from Corollary 6.2.20 that
\[ d_{T_s(x)}\psi^n_s = \frac{\partial c_{s,L}}{\partial y}(x, T_s(x)) = \frac{\partial L}{\partial v}(\gamma_x(s), \dot\gamma_x(s)). \]
Since γx is an L-minimizer, its speed curve is an orbit of the Euler-Lagrange flow, and
therefore
\[ (T_s(x), d_{T_s(x)}\psi^n_s) = \mathscr{L}(\gamma_x(s), \dot\gamma_x(s)) = \mathscr{L}\phi^L_s(\gamma_x(0), \dot\gamma_x(0)), \]
and
\[ x = \pi\phi^L_{-s}\mathscr{L}^{-1}(T_s(x), d_{T_s(x)}\psi^n_s). \]
It follows that Ts is injective on Bn with inverse given by the map θn : Ts (Bn ) → Bn
defined, for z ∈ Ts (Bn ), by
\[ \theta_n(z) = \pi\phi^L_{-s}\mathscr{L}^{-1}(z, d_z\psi^n_s). \]
Note that the map θn is locally Lipschitz on Ts (Bn ), since this is the case for z 7→
dz ψsn , and both maps φL−s , L −1 are C1 , since L is a Tonelli Lagrangian. An analogous
argument proves the countably Lipschitz regularity of T̂s = Tt Ts−1 in part (iv). Finally
the optimality of Ts−1 simply follows from
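\[
\begin{aligned}
\min_{\gamma\in\Pi(\mu_s,\mu_0)} \Bigl\{ \int_{M\times M} \bar c_{s,L}(x, y)\, d\gamma(x, y) \Bigr\}
&= \min_{\gamma\in\Pi(\mu_0,\mu_s)} \Bigl\{ \int_{M\times M} c_{s,L}(x, y)\, d\gamma(x, y) \Bigr\} \\
&= \int_M c_{s,L}(x, T_s(x))\, d\mu_0(x) \\
&= \int_M \bar c_{s,L}(y, T_s^{-1}(y))\, d\mu_s(y).
\end{aligned}
\]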
Part (iii) of the Theorem follows from part (ii). In fact, if A ⊂ M is Lebesgue neg-
ligible, the image Ts−1 (Ts (B) ∩ A) is also Lebesgue negligible, since Ts−1 is countably
Lipschitz on Ts (B), and therefore sends Lebesgue negligible subsets to Lebesgue neg-
ligible subsets. It remains to note, using that B is of full µ0 -measure, that µs (A) =
Ts# µ0 (A) = µ0 (Ts−1 (Ts (B) ∩ A)) = 0.
To prove part (iv), we already know that $\hat\gamma_s = (T_s \tilde\times T_t)_\# \mu_0$ is an optimal plan for the
cost ct−s,L and the measures (µs , µt ). Since the Borel set B is of full µ0 -measure, and
Ts : B → Ts (B) is a bijective Borel measurable map, we obtain that $T_s^{-1}$ is a Borel map,
and $\mu_0 = (T_s^{-1})_\# \mu_s$. It follows that
\[ \hat\gamma_s = (\mathrm{Id}_M \tilde\times T_t T_s^{-1})_\# \mu_s. \]
1.6. The Wasserstein space W2 39
Therefore the composition Tt Ts−1 is an optimal transport map for the cost ct−s,L and the
pair of measures (µs , µt ), and it is the unique one since ct−s,L is γ̂s -integrable and µs is
absolutely continuous with respect to the Lebesgue measure.
Remark 1.5.3. We observe that, in proving the uniqueness statement in parts (i) and
(iv) of the above theorem, we needed the full generality of Theorem 1.4.2, in which we
only assume that the infimum in the Kantorovitch problem is finite. Indeed, assuming
\[ \int_{M\times M} c_{t,L}(x, y)\, d\mu_0(x)\,d\mu_t(y) < +\infty, \]
would have to be finite. So the existence and uniqueness of a transport map in Theorem
1.3.2 under the integrability assumption on c with respect to µ ⊗ ν instead of assumption
(iv) would not have been enough to obtain Theorem 1.5.2.
Remark 1.5.4. We remark that, if both µ0 and µt are not assumed to be absolutely
continuous, and therefore no optimal transport map necessarily exists, one can still define
an “optimal” interpolation (µs )0≤s≤t between µ0 and µt using some measurable selection
theorem (see [133, Chapter 7]). Then, adapting our proof, one still obtains that, for any
s ∈ (0, t), there exists a unique optimal transport map Ss for (c̄s,L , µs , µ0 ) (resp. a unique
optimal transport map Ŝs for (ct−s,L , µs , µt )), and this map is countably Lipschitz.
We also observe that, if the manifold is compact, our proof shows that the above
maps are globally Lipschitz (see [22]).
We remark that, by the triangle inequality for d, the definition does not depend on the
point x0 . The space P2 (M ) can be endowed with the so-called Wasserstein distance W2 :
\[ W_2^2(\mu_0, \mu_1) := \min_{\gamma\in\Pi(\mu_0,\mu_1)} \Bigl\{ \int_{M\times M} d^2(x, y)\, d\gamma(x, y) \Bigr\}. \]
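For very small discrete measures the minimum in this definition can be computed by brute force. The sketch below assumes two uniform measures with the same number of atoms on the real line, in which case an optimal plan can be sought among permutations of the atoms; the function name `w2_empirical` and the sample points are hypothetical, and the enumeration is only practical for a handful of points.

```python
import itertools
import math

def w2_empirical(xs, ys):
    """W2 between uniform measures on the points xs and ys (real line).

    Assumption: both measures have the same number of atoms with equal
    weights, so that an optimal plan can be sought among permutations.
    Brute force; only meant for a handful of points.
    """
    n = len(xs)
    best = min(
        sum((x - ys[j]) ** 2 for x, j in zip(xs, perm))
        for perm in itertools.permutations(range(n))
    )
    return math.sqrt(best / n)


mu0 = [0.0, 1.0, 3.0]
mu1 = [2.0, 5.0, 4.0]
print(w2_empirical(mu0, mu1))
```

In this example the optimal permutation matches the atoms in increasing order, which is the expected behaviour for a quadratic cost on the line.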
The quantity W2 will be called the Wasserstein distance of order 2 between µ0 and µ1 .
It is well known that it defines a finite metric on P2 (M ), and so one can speak about
geodesics in the metric space (P2 , W2 ). This space turns out, indeed, to be a length space
(see for example [132], [133]). We denote by P2ac (M ) the subset of P2 (M ) that consists
of the Borel probability measures on M that are absolutely continuous with respect to
vol.
Using the results proved above, it is simple to prove the following:
Proposition 1.6.1. P2ac (M ) is a geodesically convex subset of P2 (M ). Moreover, if
µ0 , µ1 ∈ P2ac (M ), then there is a unique Wasserstein geodesic {µt }t∈[0,1] joining µ0 to µ1 ,
which is given by
\[ \mu_t = (T_t)_\# \mu_0 := (\exp[t\tilde\nabla\varphi])_\# \mu_0, \]
where $T(x) = \exp_x[\tilde\nabla_x\varphi]$ is the unique transport map from µ0 to µ1 which is optimal for
the cost $\tfrac12 d^2(x, y)$ (and so also optimal for the cost $d^2(x, y)$). Moreover:
1. Tt is the unique optimal transport map from µ0 to µt for all t ∈ [0, 1];
2. Tt−1 is the unique optimal transport map from µt to µ0 for all t ∈ [0, 1] (and, if
t ∈ [0, 1), it is locally Lipschitz);
3. T ◦ Tt−1 is the unique optimal transport map from µt to µ1 for all t ∈ [0, 1] (and, if
t ∈ (0, 1], it is locally Lipschitz).
Since we know that the transport is unique, the proof is quite standard. However,
for completeness, we give all the details.
Proof. Let {µt }t∈[0,1] be a Wasserstein geodesic joining µ0 to µ1 . Fix t ∈ (0, 1), and
let γt (resp. γ̂t ) be an optimal transport plan between µ0 and µt (resp. µt and µ1 ) (in
effect, we know that γt is a graph and it is unique, but we will not use this fact). We
now define the probability measure on M × M × M
\[ \lambda_t(dx, dy, dz) := \int_M \gamma_t(dx|y) \times \hat\gamma_t(dz|y)\, d\mu_t(y), \]
where $\gamma_t(dx, dy) = \int_M \gamma_t(dx|y)\, d\mu_t(y)$ and $\hat\gamma_t(dy, dz) = \int_M \hat\gamma_t(dz|y)\, d\mu_t(y)$ are the
disintegrations of γt and γ̂t with respect to µt . Then, if we define
\[ \tilde\gamma_t := \pi^{1,3}_\# \lambda_t, \]
it is simple to check that γ̃t is a transport plan from µ0 to µ1 . Now, since {µt }t∈[0,1] is a
geodesic, we have that
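\[
\begin{aligned}
W_2(\mu_0, \mu_1) &\le \|d(x, z)\|_{L^2(\tilde\gamma_t, M\times M)} = \|d(x, z)\|_{L^2(\lambda_t, M\times M\times M)} \\
&\le \|d(x, y)\|_{L^2(\lambda_t, M\times M\times M)} + \|d(y, z)\|_{L^2(\lambda_t, M\times M\times M)} \\
&= \|d(x, y)\|_{L^2(\gamma_t, M\times M)} + \|d(y, z)\|_{L^2(\hat\gamma_t, M\times M)} \\
&= W_2(\mu_0, \mu_t) + W_2(\mu_t, \mu_1) = W_2(\mu_0, \mu_1).
\end{aligned}
\qquad (1.6.1)
\]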
This proves that γ̃t is an optimal transport plan between µ0 and µ1 , which implies that
γ̃t is supported on the graph of T . Moreover, since in (1.6.1) all the inequalities are
indeed equalities, we get that
d(x, z) = d(x, y) + d(y, z) for λt -a.e. (x, y, z) ∈ M × M × M
that is, y is on a geodesic from x to z. Moreover, since W2 (µ0 , µt ) = tW2 (µ0 , µ1 ), we also
have
d(x, y) = td(x, z), d(y, z) = (1 − t)d(x, z) for λt -a.e. (x, y, z) ∈ M × M × M.
Since, by Remark 1.5.1, the geodesic from x to T (x) is unique for µ0 -a.e. x, we conclude
that λt is concentrated on the subset {(x, Tt (x), T (x))}x∈supp(µ0 ) , which implies that µt =
(Tt )] µ0 . Moreover we see that µt := (Tt )] µ0 ∈ P2ac (M ). In fact,
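\[
\begin{aligned}
\int_M d^2(x, x_0)\, d\mu_t(x) &= \int_M d^2(T_t(x), x_0)\, d\mu_0(x) \\
&\le 2 \int_M \bigl[ d^2(x, x_0) + d^2(x, T_t(x)) \bigr]\, d\mu_0(x) \\
&\le 2 \int_M \bigl[ d^2(x, x_0) + d^2(x, T(x)) \bigr]\, d\mu_0(x) \\
&\le 4 \int_M \bigl[ d^2(x, x_0) + d^2(x_0, T(x)) \bigr]\, d\mu_0(x) \\
&= 4 \int_M d^2(x, x_0)\, d\mu_0(x) + 4 \int_M d^2(x_0, y)\, d\mu_1(y) < +\infty,
\end{aligned}
\]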
and the result in Section 1.5 tells us that µt is absolutely continuous. Using the notation
of Section 1.4, we have
\[ c_t(x, y) = \inf_{\gamma(0)=x,\ \gamma(t)=y} \int_0^t \tfrac12 \|\dot\gamma(s)\|^2_{\gamma(s)}\, ds = \frac{1}{2t}\, d^2(x, y). \]
Since $T_t$ and $T_t^{-1}$ are optimal for the cost function $\frac{1}{2t} d^2(x, y)$, and $T \circ T_t^{-1}$ is optimal for
the cost function $\frac{1}{2(1-t)} d^2(x, y)$, we get that $T_t$, $T_t^{-1}$ and $T \circ T_t^{-1}$ are optimal also for the
cost $d^2(x, y)$. □
The above result tells us that also (P2ac (M ), W2 ) is a length space.
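The constant-speed property of the interpolation, W2(µ0, µt) = t W2(µ0, µ1), can be observed on a toy example. The sketch below works on the real line, where exp_x[tv] = x + tv, and uses uniform discrete measures; these are not absolutely continuous, so the example sits outside the hypotheses of Proposition 1.6.1 and is only meant as an illustration. It also assumes that, on the line with quadratic cost, matching sorted atoms in increasing order is optimal. All names are hypothetical.

```python
import math

def w2_line(xs, ys):
    """W2 between uniform measures on equally many points of the real line.

    Assumption: on the line, with quadratic cost, matching the sorted
    atoms in increasing order is optimal.
    """
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))


def interpolate(xs, images, t):
    """Atoms of mu_t = (T_t)_# mu_0 with T_t(x) = (1 - t) x + t T(x)."""
    return [(1.0 - t) * x + t * y for x, y in zip(xs, images)]


mu0 = [0.0, 1.0, 3.0]
mu1 = [2.0, 4.0, 5.0]   # T sends the k-th atom of mu0 to the k-th atom of mu1
total = w2_line(mu0, mu1)
for t in (0.25, 0.5, 0.75):
    mu_t = interpolate(mu0, mu1, t)
    print(t, w2_line(mu0, mu_t), w2_line(mu_t, mu1), total)
```

Each printed row shows the two partial distances, which should add up to the total distance in proportions t and 1 − t.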
As in the case of the approximate differential, it is not difficult to show that this
definition makes sense.
Observing that d2 (x, y) is locally semi-concave with linear modulus (see [66, Ap-
pendix]), we get that ϕn is locally semi-convex with linear modulus for each n. Thus we
can define µ-a.e. an approximate hessian for ϕ (see Definition 1.6.2):
\[ \tilde\nabla^2_x\varphi := \nabla^2_x\varphi_n \qquad \text{for } x \in A_n \cap E_n, \]
where An was defined in the proof of Complement 1.3.4, En denotes the full µ-measure
set of points where ϕn admits a second order Taylor expansion, and ∇2x ϕn denotes the
self-adjoint operator on Tx M that appears in the Taylor expansion on ϕn at x. Let
us now consider, for each set $F_n := A_n \cap E_n$, an increasing sequence of compact sets
$K^n_m \subset F_n$ such that $\mu(F_n \setminus \cup_m K^n_m) = 0$. We now define the measures $\mu^n_m := \mu \llcorner K^n_m$
and $\nu^n_m := T_\# \mu^n_m = (\exp[\nabla\varphi_n])_\# \mu^n_m$, and we renormalize them in order to obtain two
probability measures:
\[ \hat\mu^n_m := \frac{\mu^n_m}{\mu^n_m(M)} \in P_2^{ac}(M), \qquad \hat\nu^n_m := \frac{\nu^n_m}{\nu^n_m(M)} = \frac{\nu^n_m}{\mu^n_m(M)} \in P_2^{ac}(M). \]
We now observe that T is still optimal. In fact, if this were not the case, we would have
\[ \int_M c(x, S(x))\, d\hat\mu^n_m(x) < \int_M c(x, T(x))\, d\hat\mu^n_m(x) \]
for a certain transport map S from $\hat\mu^n_m$ to $\hat\nu^n_m$. This would imply that
\[ \int_M c(x, S(x))\, d\mu^n_m(x) < \int_M c(x, T(x))\, d\mu^n_m(x), \]
would have a cost strictly less than the cost of T , which would contradict the optimality
of T .
We will now apply the results of [50] to the compactly supported measures $\hat\mu^n_m$ and
$\hat\nu^n_m$ in order to get information on the transport problem from µ to ν. In what follows we
will denote by $\nabla_x d^2_y$ and by $\nabla^2_x d^2_y$, respectively, the gradient and the hessian with respect
to x of $d^2(x, y)$, and by $d_x\exp$ and $d(\exp_x)_v$ the two components of the differential of the
map $TM \ni (x, v) \mapsto \exp_x[v] \in M$ (whenever they exist). By [50, Theorem 4.2], we get
the following.
Theorem 1.6.3 (Jacobian identity a.e.). There exists a subset E ⊂ M such that
µ(E) = 1 and, for each x ∈ E, $Y(x) := d(\exp_x)_{\tilde\nabla_x\varphi}$ and $H(x) := \tfrac12 \nabla^2_x d^2_{T(x)}$ both exist
and we have
\[ f(x) = g(T(x)) \det[Y(x)(H(x) + \tilde\nabla^2_x\varphi)] \ne 0. \]
Proof. It suffices to observe that [50, Theorem 4.2] applied to $\hat\mu^n_m$ and $\hat\nu^n_m$ gives that, for
µ-a.e. $x \in K^n_m$,
\[ \frac{f(x)}{\mu^n_m(M)} = \frac{g(T(x))}{\mu^n_m(M)} \det[Y(x)(H(x) + \nabla^2_x\varphi_n)] \ne 0, \]
which implies
We can thus define µ-a.e. the (weak) differential of the transport map at x as
\[ d_xT := Y(x)\bigl(H(x) + \tilde\nabla^2_x\varphi\bigr). \]
Let us prove now that, indeed, T (x) is approximately differentiable µ-a.e., and that the
above differential coincides with the approximate differential of T . In order to prove
this fact, let us first make a formal computation. Observe that since the map
$x \mapsto \exp_x[-\tfrac12 \nabla_x d^2_y] = y$ is constant, we have
\[ 0 = d_x\bigl(\exp_x[-\tfrac12 \nabla_x d^2_y]\bigr) = d_x\exp[-\tfrac12 \nabla_x d^2_y] - d(\exp_x)_{-\frac12 \nabla_x d^2_y}\bigl(\tfrac12 \nabla^2_x d^2_y\bigr) \qquad \forall y \in M. \]
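By differentiating (in the approximate sense) the equality $T(x) = \exp[\tilde\nabla_x\varphi]$ and recalling
the equality $\tilde\nabla_x\varphi = -\tfrac12 \nabla_x d^2_{T(x)}$, we obtain
\[
\begin{aligned}
\tilde d_xT &= d(\exp_x)_{\tilde\nabla_x\varphi}\bigl(\tilde\nabla^2_x\varphi\bigr) + d_x\exp[\tilde\nabla_x\varphi] \\
&= d(\exp_x)_{\tilde\nabla_x\varphi}\bigl(\tilde\nabla^2_x\varphi\bigr) + d(\exp_x)_{-\frac12 \nabla_x d^2_{T(x)}}\bigl(\tfrac12 \nabla^2_x d^2_{T(x)}\bigr) \\
&= d(\exp_x)_{\tilde\nabla_x\varphi}\bigl(H(x) + \tilde\nabla^2_x\varphi\bigr),
\end{aligned}
\]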
as wanted. In order to make the above proof rigorous, it suffices to observe that for
µ-a.e. x, T (x) ∉ cut(x), where cut(x) is defined as the set of points z ∈ M which cannot
be linked to x by an extendable minimizing geodesic. Indeed we recall that the square
of the distance fails to be semi-convex at the cut locus, that is, if x ∈ cut(y), then
(see [50, Proposition 2.5]). Now fix x ∈ Fn . Since we know that $\tfrac12 d^2(z, T(x)) \ge \psi(T(x)) - \varphi_n(z)$
with equality for z = x, we obtain a bound from below of the hessian of $d^2_{T(x)}$ at x
in terms of the hessian of ϕn at x (see the proof of [50, Proposition 4.1(a)]). Thus, since
each ϕn admits vol-a.e. a second order Taylor expansion, we obtain that, for µ-a.e. x,
This implies that all the computations we made above in order to prove the formula
for $\tilde d_xT$ are correct. Indeed the exponential map $(x, v) \mapsto \exp_x[v]$ is smooth if $\exp_x[v] \notin$
cut(x), the function $d^2_y$ is smooth around any x ∉ cut(y) (see [50, Paragraph 2]), and $\tilde\nabla_x\varphi$
is approximately differentiable µ-a.e. Thus, recalling that, once we consider the right
composition of an approximately differentiable map with a smooth map, the standard
chain rule holds (see the remarks after Definition 1.3.3), we have proved the following
regularity result for the transport map.
To prove our displacement convexity result, the following change of variables formula
will be useful.
where $J(x) := \det[Y(x)(H(x) + \tilde\nabla^2_x\varphi)] = \det[\tilde d_xT]$ (either both integrals are undefined or
both take the same value in R).
The proof follows by the Jacobian identity proved in Theorem 1.6.3, exactly as in
[50, Corollary 4.7].
Let us now define for t ∈ [0, 1] the measure µt := (Tt )] µ, where
\[ T_t(x) = \exp_x[t\tilde\nabla_x\varphi]. \]
By the results in [66] and Proposition 1.6.1, we know that Tt coincides with the unique
optimal map pushing µ forward to µt , and that µt is absolutely continuous with respect
to vol for any t ∈ [0, 1].
Given x, y ∈ M , following [50], we define for t ∈ [0, 1]
Letting Br (y) ⊂ M denote the open ball of radius r > 0 centered at y ∈ M , for t ∈ (0, 1]
we define
\[ v_t(x, y) := \lim_{r\to 0} \frac{\mathrm{vol}(Z_t(x, B_r(y)))}{\mathrm{vol}(B_{tr}(y))} > 0 \]
(the above limit always exists, though it will be infinite when x and y are conjugate
points; see [50]). Arguing as in the proof of Theorem 1.6.3, by [50, Lemma 6.1] we get
the following.
Theorem 1.6.6 (Jacobian inequality). Let E be the set of full µ-measure given by
Theorem 1.6.3. Then for each x ∈ E, $Y_t(x) := d(\exp_x)_{t\tilde\nabla_x\varphi}$ and $H_t(x) := \tfrac12 \nabla^2_x d^2_{T_t(x)}$ both
exist for all t ∈ [0, 1] and the Jacobian determinant
\[ J_t(x) := \det[Y_t(x)(H_t(x) + t\tilde\nabla^2_x\varphi)] \qquad (1.6.2) \]
satisfies
\[ J_t^{1/n}(x) \ge (1 - t)\, [v_{1-t}(T(x), x)]^{1/n} + t\, [v_t(x, T(x))]^{1/n} J_1^{1/n}(x). \]
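In flat space the volume distortion coefficients are identically 1, and the Jacobian of the interpolation takes the form J_t = det((1 − t) Id + t A) with A = Id + ∇²ϕ. Assuming A is symmetric and positive semidefinite, the Jacobian inequality reduces to the concavity of A ↦ det(A)^{1/n} along the segment from Id to A. The sketch below checks this for one 2 × 2 matrix; the matrix and the function names are assumptions chosen for illustration.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def interpolate(a, t):
    """(1 - t) * Id + t * a for a 2x2 matrix a."""
    ident = [[1.0, 0.0], [0.0, 1.0]]
    return [
        [(1.0 - t) * ident[i][j] + t * a[i][j] for j in range(2)]
        for i in range(2)
    ]


def jacobian_gap(a, t, n=2):
    """J_t^(1/n) - [(1 - t) + t * J_1^(1/n)] in the flat model case."""
    lhs = det2(interpolate(a, t)) ** (1.0 / n)
    rhs = (1.0 - t) + t * det2(a) ** (1.0 / n)
    return lhs - rhs


a = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite (assumed example)
for k in range(11):
    t = k / 10
    print(t, jacobian_gap(a, t))
```

The gap vanishes at t = 0 and t = 1 and is non-negative in between, which is the flat counterpart of the inequality in Theorem 1.6.6.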
We now consider as source measure µ0 = ρ0 d vol(x) ∈ P ac (M ) and as target measure
µ1 = ρ1 d vol(x) ∈ P ac (M ), and we assume as before that W2 (µ0 , µ1 ) < +∞. By
Proposition 1.6.1 we have
µt = (Tt )] [ρ0 d vol] = ρt d vol ∈ P2ac (M )
for a certain ρt ∈ L1 (M, d vol).
We now want to consider the behavior of the functional
\[ U(\rho) := \int_M A(\rho(x))\, d\mathrm{vol}(x) \]
along the path t 7→ ρt . In Euclidean spaces, this path is called displacement interpolation
and the functional U is said to be displacement convex if
\[ [0, 1] \ni t \mapsto U(\rho_t) \text{ is convex for every } \rho_0, \rho_1. \]
A sufficient condition for the displacement convexity of U in Rn is that A : [0, +∞) →
R ∪ {+∞} satisfy
\[ (0, +\infty) \ni s \mapsto s^n A(s^{-n}) \text{ is convex and nonincreasing, with } A(0) = 0 \qquad (1.6.3) \]
(see [106], [108]). Typical examples include the entropy $A(\rho) = \rho \log \rho$ and the $L^q$-norm
$A(\rho) = \frac{1}{q-1} \rho^q$ for $q \ge \frac{n-1}{n}$.
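Condition (1.6.3) can be probed numerically for the two examples just mentioned. The sketch below is a finite-difference check on a grid, not a proof; the dimension n = 3, the exponents, the grid and the function names are assumptions chosen for illustration, and the normalization A(0) = 0 is not tested.

```python
import math

def rescaled(A, n, s):
    """The function s -> s**n * A(s**(-n)) appearing in condition (1.6.3)."""
    return s ** n * A(s ** (-n))


def is_convex_nonincreasing(f, grid, tol=1e-9):
    """Finite-difference check on an equally spaced grid (not a proof)."""
    vals = [f(s) for s in grid]
    nonincreasing = all(b <= a + tol for a, b in zip(vals, vals[1:]))
    convex = all(
        vals[i - 1] - 2.0 * vals[i] + vals[i + 1] >= -tol
        for i in range(1, len(vals) - 1)
    )
    return convex and nonincreasing


def entropy(rho):
    return rho * math.log(rho)


def power(q):
    """A(rho) = rho**q / (q - 1) for a fixed exponent q != 1."""
    return lambda rho: rho ** q / (q - 1.0)


n = 3
grid = [0.1 + 0.05 * k for k in range(100)]
print(is_convex_nonincreasing(lambda s: rescaled(entropy, n, s), grid))
print(is_convex_nonincreasing(lambda s: rescaled(power(0.8), n, s), grid))
print(is_convex_nonincreasing(lambda s: rescaled(power(0.5), n, s), grid))
```

For the entropy the rescaled function is −n log s, and for the power law it is s^{n(1−q)}/(q − 1); the exponent q = 0.5 lies below the threshold (n − 1)/n = 2/3, and the check is expected to fail there.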
By all the results collected above, arguing as in the proof of [50, Theorem 6.2], we can
prove that the displacement convexity of U is still true on Ricci nonnegative manifolds
under the assumption (1.6.3).
Theorem 1.6.7 (displacement convexity on Ricci nonnegative manifolds). If
Ric ≥ 0 and A satisfies (1.6.3), then U is displacement convex.
Proof. As we remarked above, Tt is the optimal transport map from µ0 to µt . So, by
Theorem 1.6.3 and Proposition 1.6.5, we get
\[ U(\rho_t) = \int_M A(\rho_t(x))\, d\mathrm{vol}(x) = \int_{E_t} A\Bigl( \frac{\rho_0(x)}{\bigl(J_t^{1/n}(x)\bigr)^n} \Bigr) \bigl(J_t^{1/n}(x)\bigr)^n\, d\mathrm{vol}(x), \qquad (1.6.4) \]
where Et is the set of full µ0 -measure given by Theorem 1.6.3 and Jt (x) ≠ 0 is defined in
(1.6.2). Since Ric ≥ 0, we know that vt (x, y) ≥ 1 for every x, y ∈ M (see [50, Corollary
2.2]). Thus, for fixed x ∈ E1 , Theorem 1.6.6 yields the concavity of the map
\[ [0, 1] \ni t \mapsto J_t^{1/n}(x). \]
1.7. Displacement convexity on Riemannian manifolds 47
Composing this function with the convex nonincreasing function s 7→ sn A(s−n ) we get
the convexity of the integrand in (1.6.4). The only problem we run into in trying to
conclude the displacement convexity of U is that the domain of integration appears to
depend on t. But, since by Theorem 1.6.3 Et is a set of full µ0 -measure for any t ∈ [0, 1],
we obtain that, for fixed t, t0 , s ∈ [0, 1],
simply by computing each of the three integrals above on the full measure set Et ∩ Et0 ∩
E(1−s)t+st0 .
where π(dy|x) is the disintegration of π(dx dy) with respect to the x variable.
Remark 1.7.2. Sufficient conditions for $U_\nu$ and $U^\beta_{\pi,\nu}$ to be well-defined are discussed
in [133, Theorems 17.8 and 17.28, Application 17.29] and will not be addressed here.
Remark 1.7.3. If U′(∞) = ∞, then finiteness of Uν (µ) implies that µ is absolutely
continuous with respect to ν. This is not true if U′(∞) < ∞.
The various notions of convexity that are considered in [97, 126, 127, 128] are among
the following:
(ii) We say that Uν is weakly λ-displacement convex (resp. weakly displacement convex
with distortion β) if for all µ0 , µ1 in the domain of Uν , there is some Wasserstein geodesic
from µ0 to µ1 along which (1.7.1) (resp. (1.7.2)) is satisfied.
(iii) We say that Uν is weakly λ-a.c.c.s. displacement convex (resp. weakly a.c.c.s.
displacement convex with distortion β) if condition (1.7.1) (resp. (1.7.2)) is satisfied along
some Wasserstein geodesic when we further assume that µ0 , µ1 are absolutely continuous
and compactly supported.
Remark 1.7.5. The Wasserstein geodesic in (ii) and (iii) above is implicitly assumed
to have its image entirely contained in the domain of the functional Uν .
Remark 1.7.6. If Uν is a λ-displacement convex functional, then the function t 7→ Uν (µt )
is λ-convex on [0, 1], i.e. for all 0 ≤ s ≤ s′ ≤ 1 and t ∈ [0, 1],
\[ U_\nu(\mu_{(1-t)s+ts'}) \le (1 - t)\, U_\nu(\mu_s) + t\, U_\nu(\mu_{s'}) - \tfrac12 \lambda\, t(1 - t)(s' - s)^2\, W_2(\mu_0, \mu_1)^2. \qquad (1.7.3) \]
This is not a priori the case if we only assume that Uν is weakly λ-displacement convex.
In short, weakly means that we require a condition to hold only for some geodesic
between two measures, as opposed to all geodesics, and a.c.c.s. means that we only
require the condition to hold when the two measures are absolutely continuous and
compactly supported.
There are obvious implications (with or without distortion)
λ-displacement convex
⇓
weakly λ-displacement convex
⇓
weakly λ-a.c.c.s. displacement convex.
Although the natural convexity condition is arguably the one appearing in (i), that
is, holding true along all Wasserstein geodesics, this condition is considerably more delicate to
study than the weaker conditions appearing in (ii) and (iii), in particular for stability
issues: see [97, 126, 127]. In the same references the equivalence between (ii) and (iii)
was established, at least for compact spaces [97, Proposition 3.21]. But the implication
(ii) ⇒ (i) remained open (and was listed as an open problem in a preliminary version
of [133]). Here we shall fill this gap (at least for the functionals defined above), thus
answering a natural question about the notion of displacement convexity. Here is our
main result:
More generally, Theorem 1.7.7 makes it possible to drop the “weakly” in all displace-
ment convexity characterizations of Ricci curvature bounds.
Before turning to the proof of Theorem 1.7.7, let us explain a bit more about the dif-
ficulties and the strategy of proof. Obviously, there are two problems to tackle: first, the
possibility that µ0 and/or µ1 do not have compact support; and secondly, the possibility
that µ0 and/or µ1 are singular with respect to the volume measure.
It was shown in [97, 126, 127] that inequalities such as (1.7.1) or (1.7.2) are stable
under (weak) convergence. Then it is natural to approximate µ0 , µ1 by compactly
supported, absolutely continuous measures, and pass to the limit. This scheme of proof
is enough to show the implication (iii) ⇒ (ii) in Definition 1.7.4, but does not guarantee
that we can attain all Wasserstein geodesics in this way — unless of course we know that
there is a unique Wasserstein geodesic between µ0 and µ1 .
To treat the difficulty arising from the possible non-compactness, we use the results of
the previous sections, showing that the Wasserstein geodesic between any two absolutely
continuous probability measures on a Riemannian manifold M is unique, even if they
are not compactly supported.
The difficulty arising from the possible singularity of µ0 , µ1 is less simple. If µ0 and µ1
are both singular, then there are in general several Wasserstein geodesics joining them.
The simplest example is constructed by taking µ0 = δx0 and µ1 = δx1 , where δx stands
for the Dirac mass at x, and x0 , x1 are joined by multiple geodesics. So it is part of
the problem to regularize µ0 , µ1 into absolutely continuous measures µ0,k , µ1,k so that,
as k → ∞, the optimal transport between µ0,k and µ1,k converges to a given optimal
transport between µ0 and µ1 .
We handle this by a rather nonstandard regularization procedure, which roughly goes
as follows. We start from a given dynamical optimal transference plan Π between µ0
and µ1 , leave intact that part Π(a) of Π which corresponds to the absolutely continuous
part of µ0 . Then we let displacement occur for a very short time at the level of that
part Π(s) of Π corresponding to the singular part of µ0 . Next we regularize the resulting
contribution of Π(s) .
Let us illustrate this in the most basic case when µ0 = δx0 and µ1 = δx1 . Let
γ = (γt )0≤t≤1 be a given geodesic between x0 and x1 ; we wish to approximate the
Wasserstein geodesic (δγt )0≤t≤1 . Instead of directly regularizing µ0 and µ1 , we shall first
replace µ0 by µτ = δγτ , where τ is positive but very small; and then regularize δγτ and
δx1 into probability measures µτ,ϕ and µ1,ϕ . What we have gained is that the geodesic
joining γτ to x1 = γ1 is unique, so we may let τ → 0 and ϕ → 0 in such a way that the
Wasserstein geodesic joining µτ,ϕ to µ1,ϕ does converge to (δγt )0≤t≤1 .
In a more general context, the procedure will be more tricky, and what will make it
work is the following important property [133, Theorem 7.29]: Geodesics in dynamical
optimal transport plans do not cross at intermediate times. In fact, if Π is a given
dynamical optimal transport plan, then for each t ∈ (0, 1) one can define a measurable
map Ft : M → Γ by the requirement that Ft ◦ et = Id, Π-almost surely. In plain
words, if γ is a geodesic along which there is optimal transport, then the position of γ
at time t determines the whole geodesic γ. This property will ensure that Π(a) and Π(s)
“do not overlap at intermediate times”.
Finally, we note that the results in this section can be extended to more general
situations outside the category of Riemannian manifolds: It is sufficient that the optimal
transport between any two absolutely continuous probability measures be unique. In fact,
there is a more general framework where these results still hold true, namely the case of
nonbranching locally compact, complete length spaces. This extension is established, by
a slightly different approach, in [133, Chapter 30].
1.7.1 Proofs
In the sequel, we shall use the notation Ua,ν for (Ua )ν . An important ingredient in the
proof of Theorem 1.7.7 will be the following lemma, which has interest on its own (and
52 1.0. The optimal transportation problem
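\[
U^{\beta}_{Z_1,\pi_1,\nu}(\mu_1) = \int U\!\left(\frac{Z_1\rho_1(x)}{\beta(x,y)}\right) \frac{\beta(x,y)}{Z_1\rho_1(x)}\,\pi_1(dx\,dy);
\]
\[
U^{\beta}_{Z_2,\pi_2,\nu}(\mu_2) = \int U\!\left(\frac{Z_2\rho_2(x)}{\beta(x,y)}\right) \frac{\beta(x,y)}{Z_2\rho_2(x)}\,\pi_2(dx\,dy).
\]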
So the proof of the lemma will be complete if we can show that
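\[
\begin{aligned}
& U\!\left(\frac{Z_1\rho_1+Z_2\rho_2}{\beta}\right)\frac{\beta}{Z_1\rho_1+Z_2\rho_2}\,(Z_1\pi_1+Z_2\pi_2) \\
&\qquad \ge\; U\!\left(\frac{Z_1\rho_1}{\beta}\right)\frac{\beta}{Z_1\rho_1}\,(Z_1\pi_1) + U\!\left(\frac{Z_2\rho_2}{\beta}\right)\frac{\beta}{Z_2\rho_2}\,(Z_2\pi_2).
\end{aligned}
\tag{1.7.5}
\]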
\[
X_1 = \frac{Z_1\rho_1(x)}{\beta(x,y)}, \qquad X_2 = \frac{Z_2\rho_2(x)}{\beta(x,y)},
\]
\[
p_1 = \frac{d(Z_1\pi_1)}{d(Z_1\pi_1+Z_2\pi_2)}(x,y), \qquad p_2 = \frac{d(Z_2\pi_2)}{d(Z_1\pi_1+Z_2\pi_2)}(x,y)
\]
and to integrate against (Z1 π1 + Z2 π2 )(dx dy).
µ0 = ρ0 ν + µ0,s
be the Lebesgue decomposition of µ0 with respect to ν. Let E (a) and E (s) be two disjoint
Borel subsets of M such that ρ0 ν is concentrated on E (a) and µ0,s is concentrated on
E (s) . We decompose Π as
Π = Π(a) + Π(s) , (1.7.6)
where
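\[
\Pi^{(a)} := \Pi \llcorner \{\gamma \in \Gamma \mid \gamma(0) \in E^{(a)}\}, \qquad \Pi^{(s)} := \Pi \llcorner \{\gamma \in \Gamma \mid \gamma(0) \in E^{(s)}\}.
\]
and
\[
\hat\Pi^{(a)} := \frac{\Pi^{(a)}}{Z^{(a)}}, \quad \hat\mu_t^{(a)} := \frac{\mu_t^{(a)}}{Z^{(a)}}; \qquad \hat\Pi^{(s)} := \frac{\Pi^{(s)}}{Z^{(s)}}, \quad \hat\mu_t^{(s)} := \frac{\mu_t^{(s)}}{Z^{(s)}}.
\]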
So
\[
\mu_t = Z^{(a)}\hat\mu_t^{(a)} + Z^{(s)}\hat\mu_t^{(s)}. \tag{1.7.7}
\]
We remark that, by what we proved in Section 1.5, $\mu_t^{(a)}$ is absolutely continuous for any
t ∈ [0, 1), but $\mu_t^{(s)}$ is not necessarily completely singular.
It follows from [133, Theorem 7.29 (v)] that for any t ∈ (0, 1) there is a Borel map
$F_t$ such that $F_t(\gamma_t) = \gamma_0$, Π(dγ)-almost surely. Then $\mu_t^{(s)}$ is concentrated on $F_t^{-1}(E^{(s)})$,
while $\mu_t^{(a)}$ is concentrated on $F_t^{-1}(E^{(a)})$; so these measures are singular to each other.
Then by Lemma 1.7.9 and (1.7.7), for any t ∈ (0, 1),
\[
U_\nu(\mu_t) = Z^{(a)}\,U_{Z^{(a)},\nu}(\hat\mu_t^{(a)}) + Z^{(s)}\,U_{Z^{(s)},\nu}(\hat\mu_t^{(s)}). \tag{1.7.8}
\]
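The additivity used in (1.7.8) can be checked by hand in a discrete toy model. The sketch below is only an illustration under assumptions of ours: ν is the counting measure on a five-point space, U(r) = r log r, and the rescaled nonlinearity follows the convention U_a(r) = U(ar)/a (which is the form consistent with the integrands appearing in the proof of the lemma). The two pieces have disjoint supports, so they are mutually singular.

```python
import math

def U(r):
    # Entropy nonlinearity U(r) = r log r, with U(0) = 0.
    return r * math.log(r) if r > 0 else 0.0

def U_nu(density, nonlinearity=U):
    # U_nu(mu) = sum_x U(rho(x)) when nu is the counting measure.
    return sum(nonlinearity(r) for r in density)

def rescaled(a):
    # Assumed convention: U_a(r) = U(a r) / a.
    return lambda r: U(a * r) / a

# Two probability densities with disjoint supports.
rho_a = [0.5, 0.5, 0.0, 0.0, 0.0]
rho_s = [0.0, 0.0, 0.2, 0.3, 0.5]
Z_a, Z_s = 0.4, 0.6
rho = [Z_a * p + Z_s * q for p, q in zip(rho_a, rho_s)]

lhs = U_nu(rho)
rhs = Z_a * U_nu(rho_a, rescaled(Z_a)) + Z_s * U_nu(rho_s, rescaled(Z_s))
# lhs and rhs agree up to rounding because the supports do not overlap.
```

If the supports overlapped, the identity would in general become an inequality, which is why the non-crossing property of geodesics is needed before applying the lemma.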
In the sequel, we focus on part (i) of Theorem 1.7.7, since the reasoning is quite the
same for part (ii). By construction and the restriction property of optimal transport [133,
Theorem 7.29], $\hat\Pi^{(a)}$ is an optimal dynamical transference plan between $\hat\mu_0^{(a)}$ and $\hat\mu_1^{(a)}$, and
the associated Wasserstein geodesic is $(\hat\mu_t^{(a)})_{0\le t\le 1}$. Since by construction $\hat\mu_0^{(a)}$ is absolutely
continuous, by what we already proved $(\hat\mu_t^{(a)})$ is the unique Wasserstein geodesic joining
$\hat\mu_0^{(a)}$ to $\hat\mu_1^{(a)}$. Then we can apply the displacement convexity inequality of the functional
$U_{Z^{(a)},\nu}$ along that geodesic:
Plugging this back into (1.7.12) and using Lemma 1.7.9, we conclude that
\[
U_\nu(\mu_t) \le (1-t)\,U_\nu(\mu_0) + t\,U_\nu(\mu_1) - \frac{\lambda}{2}\,t(1-t)\,W_2^2(\mu_0,\mu_1).
\]
This finishes the proof of Theorem 1.7.7.
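As a sanity check of the shape of this inequality, one may look at the simplest possible situation. The sketch below is an illustration under assumptions of ours, not the setting of the theorem: M = R with ν the Lebesgue measure, U(r) = r log r and λ = 0. It uses the classical facts that the Wasserstein geodesic between two Gaussian measures on R is Gaussian with linearly interpolated standard deviation, and that ∫ ρ log ρ = −log σ − ½ log(2πe) for a Gaussian density of standard deviation σ.

```python
import math

def entropy_functional(sigma):
    # U_nu(mu) = integral of rho log rho for mu = N(m, sigma^2), nu = Lebesgue.
    return -math.log(sigma) - 0.5 * math.log(2 * math.pi * math.e)

def geodesic_sigma(sigma0, sigma1, t):
    # Standard deviation of the displacement interpolant at time t.
    return (1 - t) * sigma0 + t * sigma1

sigma0, sigma1 = 0.5, 3.0
gaps = []
for k in range(11):
    t = k / 10
    lhs = entropy_functional(geodesic_sigma(sigma0, sigma1, t))
    rhs = (1 - t) * entropy_functional(sigma0) + t * entropy_functional(sigma1)
    gaps.append(rhs - lhs)   # nonnegative when the inequality holds with lambda = 0
```

Here the inequality reduces to the convexity of σ ↦ −log σ; the means play no role because the entropy is translation invariant.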
is ill-posed, as it may happen that C(µ, ν) = +∞. However, it is known that the optimality
of a transport plan γ is equivalent to the c-cyclical monotonicity of the measure-theoretic
support of γ whenever C(µ, ν) < +∞ (see [13], [120], [133]), and so one may
ask whether the fact that the support of γ is c-cyclically monotone implies that γ is
supported on a graph. Moreover, one can also ask whether this graph is unique, that is, whether it
does not depend on γ, which is the case when the cost is µ ⊗ ν integrable, as Theorem
1.3.2 tells us. The uniqueness in that case follows from the fact that the functions ϕn are
constructed using a pair of functions (ϕ, ψ) which is optimal for the dual problem, and
so they are independent of γ. The result we now want to prove is the following:
(i) the family of maps x ↦ c(x, y) = cy (x) is locally semi-concave in x locally uniformly
in y;
(ii) ∂c/∂x (x, ·) is injective on its domain of definition;
(iii) and the measure µ gives zero mass to sets with σ-finite (n − 1)-dimensional Hausdorff
measure,
Let now (ϕ̃, ψ̃) be a pair associated to γ̃, and let ϕ̃n , B̃n and T̃ be constructed as above.
We need to prove that T = T̃ µ-a.e.
Let us define Cn := Bn ∩ B̃n . Then µ(Cn ) ↗ 1. We want to prove that, if x is a µ-density
point of Cn for a certain n, then T (x) = T̃ (x) (we recall that, since µ(∪n Cn ) = 1, also
the union of the µ-density points of Cn is of full µ-measure, see for example [61, Chapter
1.7]).
Let us assume by contradiction that T (x) ≠ T̃ (x), that is
dx ϕn ≠ dx ϕ̃n .
Since x ∈ supp(µ), each ball around x must have positive measure under µ. Moreover,
the fact that the sets {ϕn = ϕ} and {ϕ̃n = ϕ̃} have µ-density 1 in x implies that the set
{ϕ = ϕ̃}
\[
\lim_{r\to 0} \frac{\mu\big((A\setminus E_n)\cap B_r(x)\big)}{\mu(B_r(x))} = 0,
\]
1.8. A generalization of the existence and uniqueness result 59
\[
\mu(E_n\cap B_r(x)) \ge \frac{1}{5}\,\mu(B_r(x)) \quad \text{for } r > 0 \text{ sufficiently small.} \tag{1.8.4}
\]
Now, arguing as in the proof of Aleksandrov’s lemma (see [107, Lemma 13]), we can
prove that
X := T̃ −1 (T (A)) ⊂ A
and X ∩ En lies a positive distance from x. In fact let us assume, without loss of
generality, that
To obtain the inclusion X ⊂ A, let z ∈ X and y := T̃ (z). Then y = T (m) for a certain
m ∈ A. For any w ∈ M , recalling (1.8.2), we have
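\[
\begin{aligned}
\varphi_n(w) &< -\frac{\partial c}{\partial x}(z_k,\tilde T(z_k))(w-z_k) + \omega(|w-z_k|)\,|w-z_k| + \tilde\varphi_n(z_k) \\
&= d_{z_k}\tilde\varphi_n(w-z_k) + \omega(|w-z_k|)\,|w-z_k| + \tilde\varphi_n(z_k).
\end{aligned}
\]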
Letting k → ∞ and recalling that $d_{z_k}\tilde\varphi_n \to 0$ and ϕ̃n (x) = ϕn (x) = 0, we obtain
which is absurd.
Thus there exists r > 0 such that Br (x) ∩ En and X ∩ En are disjoint, and (1.8.4) holds.
Defining now Y := T (A), by (1.8.4) we obtain
1. c : M × M → R is defined by
\[
c(x,y) := \inf_{\gamma(0)=x,\ \gamma(1)=y} \int_0^1 L(\gamma(t),\dot\gamma(t))\,dt,
\]
where the infimum is taken over all the continuous piecewise C^1 curves, and the
Lagrangian L(x, v) ∈ C^2 (T M, R) is C^2 -strictly convex and uniformly superlinear in
v, and satisfies a uniform boundedness condition in the fibers;
2. c(x, y) = d^p (x, y) for any p ∈ (1, +∞), where d(x, y) denotes a complete Riemannian
distance on M .
Chapter 2
2.1 Introduction
1
The variety of structures arising in nature is extraordinary. By exploring the relation-
ship between form and function, D’Arcy Thompson, in his pioneering work [53], tries
to find common principles behind the varied phenomena (physical, chemical, biological,
short or long time scale, etc.) that interact to give birth to these structures. Indeed,
despite the complexity of nature, the approach of retaining only a small but decisive set
of parameters and principles to model the phenomenon at the origin of a given structure
can be successful. See for example [113] or consider the work of Turing on morphogene-
sis that led him to explain the appearance of heterogeneous spatial patterns in terms of
reaction-diffusion mechanisms [131].
Recently, such an approach was taken to model branched networks that achieve a
transport from a source to a target. Such networks are everywhere in nature (plants and
trees, river basins, bronchial and cardiovascular systems) and in man designed struc-
tures (communication networks, electric power supply, water distribution or drainage
networks). The common function of such networks is to transport some goods from an
initial distribution (the supply) to another (the demand). Following D’Arcy Thomp-
son, it is desirable to tie a link between this unity of form (branched networks) and
this unity of function (transporting goods from a supply to a demand). This was done
in [82, 98, 135, 25, 24, 29] by considering cost functions that encode the efficiency of
a transport induced by some structure. Branched structures, such as those observed in
nature, then arise as the optimal structures along which the transport takes place.
A simple but crucial principle was incorporated in the design of all the cost functions
1
This chapter is based on a joint work with Marc Bernot [27].
62 2.0. The irrigation problem
used by these authors. This principle states that it is more efficient to transport mass
in a grouped way rather than in a separate way. To embed this principle, the previously
mentioned costs incorporate a parameter α ∈ [0, 1] and make use of the concavity of
x ↦ x^α . The idea is that for positive masses m1 and m2 , we have (m1 + m2 )^α ≤ m1^α + m2^α ,
so that the particles are interested in moving together in order to lower the cost (see for
example the role of α in (2.1.1)). This effect gets stronger as α decreases, while the limit
case α = 1 gives no importance to the grouping of particles.
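The grouping principle can be seen on two masses. The following Python sketch is only an illustration; the helper names `grouped_cost` and `separate_cost` and the numerical values are ours. It computes the ratio between the cost per unit length of moving the two masses together and that of moving them separately, for several values of α.

```python
def grouped_cost(m1, m2, alpha):
    # Cost per unit length when the two masses travel together.
    return (m1 + m2) ** alpha

def separate_cost(m1, m2, alpha):
    # Cost per unit length when the two masses travel separately.
    return m1 ** alpha + m2 ** alpha

m1, m2 = 0.3, 0.7
alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
# Ratio grouped / separate: at most 1, and closer to 1 as alpha increases.
ratios = [grouped_cost(m1, m2, a) / separate_cost(m1, m2, a) for a in alphas]
```

The ratio equals 1/2 at α = 0 (only the number of paths counts) and 1 at α = 1 (no gain from grouping), in agreement with the discussion above.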
We now briefly review the different costs and descriptions of branched structures that
have been introduced so far. We then introduce a new dynamical cost functional, and
highlight the advantages it has over other models.
The model described by Gilbert in [82] consists of a finite directed weighted graph G
with straight edges E(G) and a weight function w : E(G) → (0, ∞). The graph G
connects sources $\mu^+ = \sum_{i=1}^{k} a_i\delta_{x_i}$ and targets $\mu^- = \sum_{j=1}^{l} b_j\delta_{y_j}$ with $\sum_i a_i = \sum_j b_j$,
$a_i, b_j \ge 0$, and is required to satisfy Kirchhoff’s law at each vertex. The cost of G is
defined to be:
\[
M^\alpha(G) = \sum_{e\in E(G)} w(e)^\alpha\,\mathcal{H}^1(e). \tag{2.1.1}
\]
In [135], Xia extends this model to a continuous framework using Radon vector measures.
In both these models, the objects and their costs are static in the sense that no “particle”
is actually transported along the structure, and the cost depends only on the geometry
of the network.
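To see how the cost (2.1.1) favours branching, one can compare two graphs irrigating two targets of mass 1/2 from a single source of mass 1. The sketch below is only an illustration: the function name `gilbert_cost`, the coordinates and the position of the branching point are assumptions of ours, and the branching point is not optimized.

```python
import math

def gilbert_cost(edges, alpha):
    """M^alpha(G) = sum over edges of w(e)^alpha * length(e).
    Each edge is a triple (start_point, end_point, weight)."""
    return sum(w ** alpha * math.dist(p, q) for p, q, w in edges)

source, t1, t2 = (0.0, 0.0), (1.0, 0.5), (1.0, -0.5)

# V-shaped graph: two straight edges from the source, each carrying mass 1/2.
v_graph = [(source, t1, 0.5), (source, t2, 0.5)]

# Y-shaped graph: a common trunk of mass 1, then two branches of mass 1/2.
# Kirchhoff's law holds at the branching point (1 = 1/2 + 1/2).
branch = (0.5, 0.0)
y_graph = [(source, branch, 1.0), (branch, t1, 0.5), (branch, t2, 0.5)]

cost_v_half, cost_y_half = gilbert_cost(v_graph, 0.5), gilbert_cost(y_graph, 0.5)
cost_v_one, cost_y_one = gilbert_cost(v_graph, 1.0), gilbert_cost(y_graph, 1.0)
```

For α = 1/2 the Y-shaped graph is cheaper than the V-shaped one, while for α = 1 the straight edges win, as expected from the Monge-Kantorovich case.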
In [98, 25, 24], a different kind of object, called traffic plan and denoted by χ, is
considered. In this framework, all particles are indexed by the set Ω := [0, 1], and
to each ω ∈ Ω is associated a 1-Lipschitz path χ(ω, ·) in R^N . This is a Lagrangian
description of the dynamics of particles that can be encoded by the image measure Pχ of
the map ω ↦ χ(ω, ·) (which is therefore a measure on the set of 1-Lipschitz paths). This
measure induces a network structure similar to the one considered by Xia. To each traffic
plan is associated a cost E α which depends only on its network structure (see Definition
2.2.4) and, whenever it is finite, is the same as the one considered by Xia. Thus, though
a traffic plan is a dynamical object, its cost is static.
In [29], Brancolini, Buttazzo and Santambrogio consider an Eulerian formulation of
the problem, describing a transport from µ+ to µ− as a path in the space of measures.
The cost of such a path is defined as the length induced by a degenerate Riemannian
metric in the space of probability measures. More precisely, the cost of a path µ(t) is
given by
\[
\int_0^1 J(\mu(t))\,|\mu'|(t)\,dt,
\]
where J is a functional in the space of probability measures and |µ′| denotes the metric
derivative (for the Wasserstein distance) of the path. Both the object and the cost are
The advantage of the Lagrangian formulation with respect to the Eulerian one is to
allow to define costs of the above form in which one can take care of the speed of each
single particle, so that only moving particles contribute to the total cost.
What we propose is to assign a cost to the actual “dynamical” transport of mass from
µ+ to µ− that is induced by χ. To obtain such a cost, it is natural to require c(χ, ω, t) to
be local in space-time. By this property, we mean that c(χ, ω, t) only takes into account
the particles that are located at the point χ(ω, t) at time t. In [25], the authors consider a cost
c(χ, ω, t) depending on the total mass of particles passing through the point χ(ω, t) at
some time (see Definition 2.2.4). Since it takes into account only the global trajectories
of particles but not their local dynamics, this cost is local in space but not in time. The
associated functional E α thus quantifies the cost of the structure achieving the transport,
rather than the cost of the transport itself. In other words, we could also say that E α
evaluates the cost of a permanent regime connecting µ+ to µ− , rather than the cost of a
dynamical transport from µ+ to µ− . The elementary cost c we introduce in Definition
2.3.3 has the desired locality property, and we denote by C α the induced cost via formula
(2.1.2). It is possible to extend the time domain by replacing R+ with R in (2.1.2), and
we denote by ERα and CRα the costs corresponding to E α and C α .
We illustrate the advantage of such a “dynamical” cost with respect to the static one
in [25] on two examples:
• It gives a more realistic cost to an overlapping path. Indeed, in the case of the
static cost in [25], a path that follows the same circuit twice contributes to the cost
once, while the locality in time of the model we propose gives the expected cost
(see figure 2.1).
• It is more appropriate for the “who goes where problem”. Let us consider the
problem of two equal masses m located at points A and B, which represent both
the source and the target distribution, and where the transference plan constraint
consists in switching the two masses. In this case, the solution to this “who goes
where” problem is to transport the mass in A to position B and the mass in B
to position A along the segment joining them. For such a structure, the E α cost
does not distinguish between trajectories going from A to B and from B to A.
Indeed, the E α cost of this structure is |A − B|(2m)α , while the natural one would
be 2|A − B|mα . This is exactly the cost given by C α (see figure 2.2).
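The two values quoted in the second example can be compared numerically. The following Python sketch is only an illustration; the function names `static_cost` and `dynamical_cost` and the numerical data are ours, and the formulas are simply the two expressions stated above for the segment joining A and B.

```python
import math

def static_cost(a, b, m, alpha):
    # E^alpha sees a single segment carrying the total mass 2m.
    return math.dist(a, b) * (2 * m) ** alpha

def dynamical_cost(a, b, m, alpha):
    # C^alpha charges the two opposite flows of mass m separately.
    return 2 * math.dist(a, b) * m ** alpha

A, B, m, alpha = (0.0, 0.0), (3.0, 4.0), 0.5, 0.5
e_cost = static_cost(A, B, m, alpha)
c_cost = dynamical_cost(A, B, m, alpha)
ratio = c_cost / e_cost   # equals 2 ** (1 - alpha), hence at least 1
```

The ratio 2^(1−α) does not depend on A, B or m; it equals 1 only for α = 1, where grouping plays no role.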
We will consider the irrigation problems for all the costs just mentioned. As will be
proved in Section 2.5, the two irrigation problems with costs E α and C α are equivalent if
µ+ is finite atomic, while the equivalence for ERα and CRα always holds. More precisely, in
these cases, we will prove that any minimizer for the dynamical cost is an E α -minimizer,
and that moreover, up to reparameterization, the converse is true (see the remarks after
Theorem 2.5.2). Since the cost E α (χ) is invariant by reparameterization of the traffic
2.2. Traffic plans 65
plan χ, while C α (χ) in general is not, this fact will tell us in particular that the cost C α
selects, among all the possible reparameterizations of an optimal traffic
plan χ, some particular ones, in which particles actually move in a grouped way.
Given two measures µ+ and µ− , let us define
where the infimum is taken over all traffic plans transporting µ+ onto µ− (the same can
be done with C α , ERα and CRα ). By the above formula, one obtains a one-parameter
family of distances between measures, each of them inducing the weak-∗ topology. It
turns out that the continuity of the function α ↦ E α (µ+ , µ− ) is related to the following
stability property: given a converging sequence of traffic plans χn , respectively optimal
for the value αn , its limit is optimal for the limit value of αn . In particular, considering a
sequence αn → 1, one would obtain the convergence of optimal structures to an optimal
structure for the 1-Wasserstein distance. It is therefore of interest to study the α dependence
of E α (µ+ , µ− ). This α dependence will be shown in Section 2.6 to be continuous
if α ∈ [1 − 1/N , 1] (N being the dimension of the ambient space).
The plan is as follows. In Section 2.2, we recall the main definitions and results
concerning traffic plans. In Section 2.3, we consider the energy functional of [25] in a
more general framework for which we obtain a general lower semicontinuity result. Then
we define a new dynamical (in the sense previously discussed) cost functional and obtain a
partial result of existence of a “dynamical” optimal traffic plan for the irrigation problem.
We can however obtain a more complete existence result by studying the properties of
E α -minimizers. Indeed, in Section 2.4, we prove that any E α -optimal traffic plan can
be suitably reparameterized. From this fact, we deduce in Section 2.5 that the cost of
optimal traffic plans and dynamical optimal traffic plans are the same, and that any
E α -optimal traffic plan can be reparameterized so that it becomes optimal also for the
dynamical cost C α (this is always true for ERα and CRα , while for E α and C α we need µ+
to be finite atomic). Finally, in Section 2.6, we prove continuity results of E α (µ+ , µ− )
with respect to α, for fixed µ+ and µ− . As we already said above, this implies that limits
of optimal (for different values of α) traffic plans are still optimal for the limit value.
Convergence
Definition 2.2.5. We say that a sequence of traffic plans χn converges to a traffic plan
χ if Pχn weakly-∗ converges to Pχ , or equivalently if the random variables χn converge
in law to χ.
Definition 2.2.6. We say that a sequence of traffic plans χn fiber converges to a traffic
plan χ if χn (ω) converges to χ(ω) uniformly on compact subsets of R+ for every ω ∈ Ω
(this is stronger than the usual almost sure convergence of random variables).
Remark 2.2.7. By Skorokhod’s theorem (see [57, Theorem 11.7.2]) χn converges to χ if
and only if there exist χ̃n and χ̃ equivalent to χn and χ respectively and such that χ̃n (ω)
fiber converges to χ̃(ω).
Proposition 2.2.8. Up to a subsequence, any sequence of traffic plans χn in TPC converges
to a traffic plan χ. In addition, µ+ (χn ) ⇀ µ+ (χ) and µ− (χn ) ⇀ µ− (χ).
Existence of minimizers
The optimization problem we are interested in is the irrigation problem, i.e. the problem
of minimizing E α (χ) in TP(µ+ , µ− ). The following results are proved in [24, 98, 25].
Theorem 2.2.9. If C > 0 and χn : Ω × R+ → X is a sequence in TPC converging to
the traffic plan χ, then
\[
E^\alpha(\chi) \le \liminf_{n} E^\alpha(\chi_n).
\]
Let
\[
E^\alpha(\mu^+,\mu^-) := \min_{\chi\in \mathrm{TP}(\mu^+,\mu^-)} E^\alpha(\chi).
\]
As proved in [25], there is an optimal traffic plan in TP(µ+ , µ− ) which is loop-free, i.e.
for almost any ω ∈ Ω, the map χ(ω, ·) is one to one in [0, Tχ (ω)]. Moreover, using
Propositions 6.4 and 6.6 in [25], given any optimal traffic plan with finite energy there is
an equivalent loop-free traffic plan with the same energy, hence optimal. Thus, without
loss of generality, we may assume that optimal traffic plans are loop-free.
The triangle inequality for the cost E α holds (just think of concatenating traffic plans
[26]):
By this result and Theorem 2.2.9, it is not difficult to prove that the cost E α metrizes
the weak-∗ convergence for α ∈ (1 − 1/N , 1].
Regularity
The following regularity results were proved in [26].
Proposition 2.2.17. Let µ+ and µ− be atomic probability measures and α ∈ [0, 1]. An
optimum for the irrigation problem is a finite tree made of segments (in the sense that
the fibers χ(ω, ·), once parameterized by arc lengths, describe a finite set of piecewise
linear curves).
Any traffic plan χ ∈ TPR can be shifted in time so that it can be seen as a traffic plan
in TP and the corresponding ERα and E α costs are the same. Thus, from the point
of view of the irrigation problem, the two formalisms yield the same optimal objects.
However, the introduction of this extended model is made necessary for the study of
the dynamical framework we propose, since the dynamical cost we will consider is not
invariant by time-reparameterization.
The choice c(χ, ω, t) = θχ (χ(ω, t))^{α−1} yields the energy of a traffic plan given by
Definition 2.2.4. In this section, we first prove that for a large class of elementary costs
c(χ, ω, t), the cost of a traffic plan C(χ) is lower semicontinuous. Then, we introduce
a dynamical elementary cost (see the introduction for the meaning of dynamical) for
which the corresponding cost C is lower semicontinuous. This yields the existence of a
minimizer for the dynamical irrigation problem.
Proof. Let us set $c^\lambda(\chi,\omega,t) := \inf_{s\ge 0}\{c(\chi,\omega,s) + \lambda|t-s|\}$. Since c(χ, ω, ·) is lower
semicontinuous, it is classical (see [10]) that $c^\lambda(\chi,\omega,\cdot)$ is λ-Lipschitz and that
Let us prove that $c^\lambda(\cdot,\omega,t)$ is lower semicontinuous for all ω and t. Let χn → χ, and,
for fixed ω and t, assume that up to a subsequence the liminf of $c^\lambda(\chi_n,\omega,t)$ is indeed a
limit. Now, for each n, take tn such that
\[
c^\lambda(\chi_n,\omega,t) \ge c(\chi_n,\omega,t_n) + \lambda|t-t_n| - \frac{1}{n}.
\]
If tn → +∞, since c is non-negative,
\[
\lim_{n} c^\lambda(\chi_n,\omega,t) \ge \liminf_{n}\big( c(\chi_n,\omega,t_n) + \lambda|t-t_n| \big) \ge c(\chi,\omega,t_\infty) + \lambda|t-t_\infty| \ge c^\lambda(\chi,\omega,t).
\]
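The regularization c^λ used in this proof can be visualized on a grid. The sketch below is only a discrete illustration: the function name `lipschitz_regularization`, the grid and the sample cost are assumptions of ours, and the infimum is restricted to grid points. It checks the three features used in the argument: c^λ lies below c, it is λ-Lipschitz in time, and it increases with λ.

```python
def lipschitz_regularization(values, times, lam):
    """Discrete analogue of c^lambda(t) = inf_s { c(s) + lambda |t - s| },
    with the infimum restricted to the grid `times`."""
    return [min(c_s + lam * abs(t - s) for c_s, s in zip(values, times))
            for t in times]

times = [i / 100 for i in range(101)]
# A lower semicontinuous cost: equal to 1 except at t = 0.5, where it drops to 0.
cost = [0.0 if t == 0.5 else 1.0 for t in times]

reg4 = lipschitz_regularization(cost, times, 4.0)
reg40 = lipschitz_regularization(cost, times, 40.0)
```

On this example c^λ(t) = min(1, λ|t − 1/2|), so the regularized costs approach c from below as λ grows.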
2.3. Dynamic cost of a traffic plan 71
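\[
\begin{aligned}
\liminf_{n} \int_{[0,T]} c^\lambda(\chi_n,\omega,t)\,|\dot\chi_n(\omega,t)|\,dt
&\ge \liminf_{n} \sum_i \left[ c^\lambda(\chi_n,\omega,t_i) \int_{t_i}^{t_{i+1}} |\dot\chi_n(\omega,t)|\,dt - \lambda\varepsilon(t_{i+1}-t_i) \right] \\
&\ge \sum_i \left[ c^\lambda(\chi,\omega,t_i) \int_{t_i}^{t_{i+1}} |\dot\chi(\omega,t)|\,dt - \lambda\varepsilon(t_{i+1}-t_i) \right] \\
&\ge \int_{[0,T]} c^\lambda(\chi,\omega,t)\,|\dot\chi(\omega,t)|\,dt - 2\lambda\varepsilon T.
\end{aligned}
\]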
This being true for all ε, we get for a.e. ω and all T > 0,
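\[
\begin{aligned}
\liminf_{n} \int_{\mathbb{R}^+} c(\chi_n,\omega,t)\,|\dot\chi_n(\omega,t)|\,dt
&\ge \liminf_{n} \int_{[0,T]} c(\chi_n,\omega,t)\,|\dot\chi_n(\omega,t)|\,dt \\
&\ge \liminf_{n} \int_{[0,T]} c^\lambda(\chi_n,\omega,t)\,|\dot\chi_n(\omega,t)|\,dt
\ge \int_{[0,T]} c^\lambda(\chi,\omega,t)\,|\dot\chi(\omega,t)|\,dt.
\end{aligned}
\]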
where α ∈ [0, 1], we observe that $c(\chi,\omega,t) = \tilde\theta_\chi(\omega,t)^{\alpha-1}$, so that C α (χ) = C(χ) as
defined by (2.3.1). Let us consider a sequence of traffic plans χn fiber converging to χ,
and tn → t. We remark that the function
\[
\mathbb{R}^N\times\mathbb{R}^N \ni (x,y) \mapsto \delta(x,y) \in \mathbb{R}
\]
and since α ≤ 1,
\[
\liminf_{n} c(\chi_n,\omega,t_n) \ge c(\chi,\omega,t). \tag{2.3.3}
\]
Remark 2.3.5. It is not difficult to prove the upper semicontinuity of the multiplicity
θχ (χ(ω, t)), so that the elementary cost c(χ, ω, t) = θχ (χ(ω, t))^{α−1} satisfies the hypothesis
of Proposition 2.3.1. This yields a new simple proof of Theorem 2.2.9.
Like in the last paragraph of Section 2.2, it is possible to consider a dynamical cost
CRα (χ) for χ ∈ TPR (µ+ , µ− ). Proposition 2.3.1 and Theorem 2.3.4 hold with TPR and
CRα in place of TP and C α . The compactness of TPC stated in Proposition 2.2.8 yields:
Proposition 2.3.6. Let µ+ and µ− be probability measures on X, and let C > 0 be such
that TPC (µ+ , µ− ) is not empty (for example, take C ≥ diam(X)). Then, there exist
C α -minimizers (resp. CRα -minimizers) in TPC (resp. TPR,C ).
2.4. Synchronizable traffic plans 73
The argument used to prove Corollary 2.2.10 (that states the existence of E α -minimizers
in TP) is not adaptable to the case of C α , since neither C α (χ) nor CRα (χ) is invariant by
time-reparameterization of χ. In particular, the situation where C α -minimizers in TPC
change as C increases to +∞ is not excluded (this is not the case for E α , since by the
reparameterization argument used to prove Corollary 2.2.10 we know that all minimizers
are in TPC for C = E α (µ+ , µ− )). However, we shall see in Section 2.5, that by using
synchronization techniques developed in Section 2.4 we are still able to prove existence
of C α -minimizers in TP(µ+ , µ− ) provided that µ+ is finite atomic, and of CRα -minimizers
in TPR (µ+ , µ− ).
Given two traffic plans χ and χ̃, we say that χ̃ is a reparameterization of χ if, for
almost every ω ∈ Ω, the curve χ̃(ω, ·) is a reparameterization of χ(ω, ·). We will say that
χ̃ is an arc length parameterization of χ if, for almost every ω, χ̃(ω, ·) is an arc length
parameterization of χ(ω, ·).
Since $\theta_\chi^{\alpha-1} \le \tilde\theta_\chi^{\alpha-1}$ with equality if χ is (positive) synchronized, one can easily deduce
that if a traffic plan is synchronized (resp. positive synchronized), then ERα (χ) = CRα (χ)
(resp. E α (χ) = C α (χ)).
The aim of this section is to prove that E α -optimal traffic plans are synchronizable.
Indeed, optimal traffic plans are such that there is a finite or countable set of points (xi )
and sets Ωi ⊂ Ωxi that form an (almost-)partition of Ω. This fact makes it possible to
synchronize independently each tree going through some xi , and then harmonize globally
these synchronizations thanks to the so-called strict single oriented path property that
we now discuss.
The strict single path definition was introduced in [26]. Following these authors, a
traffic plan is said to be strict single path if all fibers going through x and y have to
coincide between x and y. In other terms there is a single path (or none) between any
two points of the irrigation network. All optimal traffic plans can then be proven to
be strict single path up to the removal of a set of fibers with null measure. For our
synchronization purposes, we need to use a slight refinement of this notion, namely what
we call the strict single oriented path property. To state this property in precise terms,
we first need to introduce some definitions.
Definition 2.4.3. Let χ be a loop-free traffic plan, and define $t_x(\omega) := \inf\{t : \chi(\omega,t) = x\}$.
Let x, y in Sχ , and define
the set of fibers passing through x and then through y. We denote by $\chi_{xy}$ the restriction
of χ to $\bigcup_{\omega\in\Omega_{\overrightarrow{xy}}} \{\omega\}\times[t_x(\omega),t_y(\omega)]$. It is the traffic plan made of all pieces of fibers of χ
Definition 2.4.4. A traffic plan χ has the strict single oriented path property (and we
say that χ is strict single oriented) if, for every pair x, y such that $|\Omega_{\overrightarrow{xy}}| > 0$, all fibers
in $\Omega_{\overrightarrow{xy}}$ coincide between x and y with an arc $\Gamma_{xy}$ joining x to y, and $\Omega_{\overrightarrow{yx}} = \emptyset$.
By an immediate adaptation of the strict single path property of optimal traffic plans
proven in [26], we have the following result.
Proposition 2.4.5. (Strict single oriented path property) Let α ∈ [0, 1) and χ be
an optimal traffic plan such that E α (χ) < ∞. Then, up to removing a zero measure set
of fibers, χ has the strict single oriented path property.
We can now detail the lemmas used to prove the synchronizability of E α -optima.
Lemma 2.4.6. If χ is strict single oriented and Ω̃x ⊂ Ωx , then $\chi_x := \chi\llcorner\tilde\Omega_x$ is synchronizable.
Proof. Let χ̃x (ω, t) be an arc length parameterization of χx (ω, t) such that χ̃x (ω, 0) = x.
Since χx (ω, ·) is injective, there is only one such parameterization. Let us now prove that
χ̃x is synchronized. Indeed, let us consider a point y in the image of χ. Since χ is strict
single oriented, there is only one path that connects x to y on the support of the traffic
plan χx . This allows us to define $l_{\chi_x}(y)$ as the distance from x to y (through the support of
χ). Since χ̃x (ω, ·) is parameterized by its arc length, we notice that for all ω ∈ Ωy ∩ Ω̃x
$\tilde\chi_x(\omega, l_{\chi_x}(y)) = y$, i.e. χ̃x is synchronized. □
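The idea of this proof can be mimicked on a finite tree. The sketch below is only a discrete illustration under assumptions of ours: fibers are lists of nodes starting at the common point x, edges carry lengths, and the helper names `arrival_times` and `is_synchronized` are hypothetical. With arc length parameterization from x, all fibers through a node y reach it at the same time as long as there is a single path from x to y.

```python
def arrival_times(fiber, length):
    """Arc length at which a fiber (a list of nodes starting at the root)
    reaches each of its nodes; length[(u, v)] is the length of edge u -> v."""
    times, elapsed = {fiber[0]: 0.0}, 0.0
    for u, v in zip(fiber, fiber[1:]):
        elapsed += length[(u, v)]
        times[v] = elapsed
    return times

def is_synchronized(fibers, length):
    """True if all fibers passing through a node reach it at the same time."""
    seen = {}
    for fiber in fibers:
        for node, t in arrival_times(fiber, length).items():
            # setdefault stores the first arrival time and returns it afterwards.
            if abs(seen.setdefault(node, t) - t) > 1e-12:
                return False
    return True

length = {("x", "a"): 1.0, ("a", "b"): 2.0, ("a", "c"): 0.5, ("x", "d"): 3.0}
tree_fibers = [["x", "a", "b"], ["x", "a", "c"], ["x", "d"]]

# A second route from x to b breaks the single path property.
length_two_routes = {**length, ("d", "b"): 0.5}
fibers_two_routes = tree_fibers + [["x", "d", "b"]]

tree_ok = is_synchronized(tree_fibers, length)                        # True
two_routes_ok = is_synchronized(fibers_two_routes, length_two_routes)  # False
```

In the second configuration the node b is reached at arc length 3 along one route and 3.5 along the other, so arc length parameterization alone no longer synchronizes the fibers.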
Lemma 2.4.7. Let χ1 and χ2 be synchronized, connected, arc length parameterized, and
such that χ1 ∪ χ2 is strict single oriented. Then χ1 ∪ χ2 is synchronizable.
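\[
C^\alpha(\mu^+,\mu^-) := \inf_{\chi\in\mathrm{TP}(\mu^+,\mu^-)} C^\alpha(\chi), \qquad C^\alpha_{\mathbb{R}}(\mu^+,\mu^-) := \inf_{\chi\in\mathrm{TP}_{\mathbb{R}}(\mu^+,\mu^-)} C^\alpha_{\mathbb{R}}(\chi).
\]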
E α (µ+ , µ− ) = C α (µ+ , µ− ),
and
ERα (µ+ , µ− ) = CRα (µ+ , µ− ).
Proof. We remark that, by the definition of E α and C α , we immediately have the
inequality
E α (χ) ≤ C α (χ) for every traffic plan χ, (2.5.1)
so that,
E α (µ+ , µ− ) ≤ C α (µ+ , µ− ) ∀α ∈ [0, 1].
Let χ be a minimizer of E α . Proposition 2.4.10 ensures that there is a reparameteri-
zation χ̃ of χ such that χ̃ is positive synchronized, so that
Thus, E α (µ+ , µ− ) = C α (µ+ , µ− ) for all α ∈ [0, 1]. Finally, Proposition 2.4.11 yields
ERα (µ+ , µ− ) = CRα (µ+ , µ− ) for all α ∈ [0, 1]. □
By Proposition 2.4.11, we also have:
Theorem 2.5.2 states the equivalence of the cost given by the dynamical and the clas-
sical irrigation problem. Concerning minimizers, we can observe as a direct consequence
of Theorem 2.5.2 and (2.5.1) that every CRα -minimizer is an ERα -minimizer. Conversely,
by Proposition 2.4.11, any ERα -minimizer can be reparameterized so that it gives a CRα -
minimizer. The same considerations are true for E α and C α if µ+ is finite atomic thanks
to Proposition 2.4.10. Thus, in both these cases, the extended dynamical and classi-
cal irrigation problems yield exactly the same minimizers (up to reparameterization).
2.6. Stability with respect to the cost 77
Proposition 2.5.3. Let α ∈ [0, 1], µ+ and µ− be finite atomic measures, and χ ∈
TP(µ+ , µ− ) be a C α -minimizer. Then χ is a finite tree made of segments.
(i) If E α (χ) < +∞, then β 7→ E β (χ) is finite and continuous on [α, 1].
(ii) If E α (χ) = +∞, then E αn (χ) → +∞ for any decreasing sequence αn ↘ α.
Proof. The monotonicity of α 7→ E α (χ) is trivial.
Let χ be such that E α (χ) < +∞ and let βn ∈ [α, 1] such that βn → β. For all
(ω, t) ∈ Ω × R+ , we have
Thus, since
\[
E^\alpha(\chi) = \int_\Omega\int_{\mathbb{R}^+} \theta_\chi(\chi(\omega,t))^{\alpha-1}\,|\dot\chi(\omega,t)|\,dt\,d\omega < \infty,
\]
Proposition 2.6.2. Let µ+ and µ− be two probability measures. The function [0, 1] ∋
α ↦ E α (µ+ , µ− ) ∈ R+ ∪ {+∞} is non-increasing, right continuous and left lower semicontinuous.
Proof. For simplicity of notation set f (α) := E α (µ+ , µ− ). Observe that, since α ↦
E α (χ) is non-increasing for all χ, f is non-increasing being an infimum of non-increasing
functions. Thus, f is left lower semicontinuous, i.e.
In what follows, χβ will always denote an optimal traffic plan for the exponent β, i.e.
such that E β (χβ ) = f (β). Let us consider a decreasing sequence αn such that αn ↘ α
and a sequence of optimal traffic plans χαn .
By Lemma 2.6.1 and the optimality of χαn for E αn we get
\[
f(\alpha) = E^\alpha(\chi_\alpha) = \lim_{n} E^{\alpha_n}(\chi_\alpha) \ge \limsup_{n} E^{\alpha_n}(\chi_{\alpha_n}) \ge \liminf_{n} E^{\alpha_n}(\chi_{\alpha_n}). \tag{2.6.1}
\]
If $\liminf_n E^{\alpha_n}(\chi_{\alpha_n}) = +\infty$, there is nothing to prove. Otherwise, applying the
reparameterization argument used to prove Corollary 2.2.10, we can assume that χαn ∈
TPC for some C > 0. Thus, by Proposition 2.2.8, there is a subsequence $\chi_{\alpha_{n_k}}$ such that
\[
\chi_{\alpha_{n_k}} \to \chi \quad\text{and}\quad \liminf_{k} E^{\alpha_{n_k}}(\chi_{\alpha_{n_k}}) = \liminf_{n} E^{\alpha_n}(\chi_{\alpha_n}). \tag{2.6.2}
\]
\[
\liminf_{k} E^{\alpha_{n_k}}(\chi_{\alpha_{n_k}}) \ge \liminf_{k} E^{\alpha_m}(\chi_{\alpha_{n_k}}) \ge E^{\alpha_m}(\chi) \quad\text{for all } m. \tag{2.6.3}
\]
By Lemma 2.6.1, $\lim_m E^{\alpha_m}(\chi) = E^\alpha(\chi)$ so that (2.6.1), (2.6.2) and (2.6.3) yield
□
Corollary 2.6.3. Let αn ∈ [0, 1] be a decreasing sequence converging to α, and let µ+ and
µ− be two probability measures. If χαn are optimal traffic plans for E αn and χαn → χ,
then χ is optimal for E α .
Proof. By Proposition 2.6.2, and since α ↦ E α (χ) is non-increasing and E αm is lower
semicontinuous for fixed m, we have
where the li and mi are respectively the lengths and weights of the edges of G. Then,
since
\[
\beta \mapsto E^\beta(\chi_\alpha) = \sum_{i=1}^{n_\alpha} l_i\, m_i^\beta
\]
is continuous and finite on [0, 1], we see that E α (µ+ , µ− ) is finite on [0, 1]. Moreover
we already know by Proposition 2.6.2 that E α (µ+ , µ− ) is left lower semicontinuous and
right continuous. So, in order to conclude it is sufficient to prove that E α (µ+ , µ− ) is left
upper semicontinuous. Let (αn ) be a sequence such that αn ↗ α. The continuity of
β ↦ E β (χα ) ensures that
\[
\limsup_{n} E^{\alpha_n}(\mu^+,\mu^-) = \limsup_{n} E^{\alpha_n}(\chi_{\alpha_n}) \le \limsup_{n} E^{\alpha_n}(\chi_\alpha) = E^\alpha(\chi_\alpha) = E^\alpha(\mu^+,\mu^-).
\]
□
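The behaviour of β ↦ E^β on a fixed finite graph, which drives this proof, is easy to tabulate. The sketch below is only an illustration; the function name `energy`, the edge lengths and the masses are assumptions of ours (masses are taken in (0, 1], as for a graph carrying a probability measure).

```python
def energy(lengths, masses, beta):
    # E^beta of a fixed finite graph: sum_i l_i * m_i ** beta.
    return sum(l * m ** beta for l, m in zip(lengths, masses))

lengths = [0.5, 0.7, 0.7]
masses = [1.0, 0.5, 0.5]
betas = [k / 20 for k in range(21)]
# Finite for every beta in [0, 1] and non-increasing, since each m_i <= 1.
values = [energy(lengths, masses, b) for b in betas]
```

At β = 0 the energy is the total length of the graph, and at β = 1 it is the mass-weighted length; in between it varies continuously, being a finite sum of exponentials in β.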
Remark 2.6.6. In the case α = 1, the irrigation problem for the cost E α is equivalent to
the classical Monge-Kantorovich problem (see [110, 85, 132]). For that particular case,
Theorem 2.6.5 ensures that the transference plan associated to a sequence of optimal traf-
fic plans χαn , where αn → 1, converges, up to a subsequence, to an optimal transference
plan for the Monge-Kantorovich problem.
Chapter 3
3.1 Introduction
1
The velocity of an incompressible fluid moving inside a region D is mathematically
described by a time-dependent and divergence-free vector field u(t, x) which is parallel to
the boundary ∂D. The Euler equations for incompressible fluids describe the evolution
of such a velocity field u in terms of the pressure field p:
\[
\begin{cases}
\partial_t u + (u\cdot\nabla)u = -\nabla p & \text{in } [0,T]\times D,\\
\operatorname{div} u = 0 & \text{in } [0,T]\times D,\\
u\cdot n = 0 & \text{on } [0,T]\times\partial D.
\end{cases}
\tag{3.1.1}
\]
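A simple explicit solution helps to fix ideas. The sketch below is only an illustration under assumptions of ours: D is the unit disc in the plane, u is the stationary rigid rotation u(x, y) = (−y, x) and p(x, y) = (x² + y²)/2. It evaluates by central finite differences the residual of the momentum equation and the divergence, which should both vanish; the helper names are hypothetical.

```python
def u(x, y):
    # Rigid rotation of the unit disc: u = (-y, x), tangent to the boundary.
    return (-y, x)

def p(x, y):
    # Candidate pressure p = (x^2 + y^2) / 2.
    return 0.5 * (x * x + y * y)

def partial(f, x, y, axis, h=1e-5):
    # Central finite difference of a scalar function f along the given axis.
    if axis == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def residuals(x, y):
    u1 = lambda a, b: u(a, b)[0]
    u2 = lambda a, b: u(a, b)[1]
    ux, uy = u(x, y)
    # (u . grad) u + grad p, componentwise; the time derivative vanishes.
    r1 = ux * partial(u1, x, y, 0) + uy * partial(u1, x, y, 1) + partial(p, x, y, 0)
    r2 = ux * partial(u2, x, y, 0) + uy * partial(u2, x, y, 1) + partial(p, x, y, 1)
    div = partial(u1, x, y, 0) + partial(u2, x, y, 1)
    return r1, r2, div

checks = [residuals(0.3, -0.4), residuals(0.0, 0.9)]
```

The boundary condition holds as well, since on the unit circle u · n = (−y, x) · (x, y) = 0.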
By the incompressibility condition, we get that at each time t the map g(t, ·) : D → D
is a measure-preserving diffeomorphism of D, that is
g(t, ·)# µD = µD ,
(here and in the sequel f# µ is the push-forward of a measure µ through a map f , and
µD is the volume measure of the manifold D). Writing Euler equations in terms of g, we
1
This chapter is based on two joint works with Luigi Ambrosio [8, 9].
82 3.0. Variational models for the incompressible Euler equations
get
\[
\begin{cases}
\ddot g(t,a) = -\nabla p\,(t, g(t,a)) & (t,a)\in[0,T]\times D,\\
g(0,a) = a & a\in D,\\
g(t,\cdot)\in \mathrm{SDiff}(D) & t\in[0,T].
\end{cases}
\tag{3.1.2}
\]
Viewing the space SDiff(D) of measure-preserving diffeomorphisms of D as an infinite-
dimensional manifold with the metric inherited from the embedding in L2 , and with
tangent space made by the divergence-free vector fields, Arnold interpreted the equation
above, and therefore (3.1.1), as a geodesic equation on SDiff(D) [15]. According to this
interpretation, one can look for solutions of (3.1.2) by minimizing
\[
T\int_0^T\!\!\int_D \frac12\,|\dot g(t,x)|^2\,d\mu_D(x)\,dt \tag{3.1.3}
\]
among all paths g(t, ·) : [0, T ] → SDiff(D) with g(0, ·) = f and g(T, ·) = h prescribed
(typically, by right invariance, f is taken as the identity map i), and the pressure field
arises as a Lagrange multiplier from the incompressibility constraint (the factor T in front
of the integral is just to make the functional scale invariant in time). We shall denote by
δ(f, h) the Arnold distance in SDiff(D), whose square is defined by the above-mentioned
variational problem in the time interval [0, 1].
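To make the functional (3.1.3) concrete, here is a minimal numerical sketch that approximates it by finite differences in time and an average over sample points (assuming µD is a probability measure). The helper names `translation_path` and `arnold_action`, the sample points, and the choice of a rigid translation of the flat torus (lifted to R^2 so that finite differences do not see wrap-around) are assumptions made for illustration only.

```python
def translation_path(velocity):
    """Lift to R^d of the torus translation g(t, x) = x + t * velocity."""
    def path(t, x):
        return tuple(xi + t * vi for xi, vi in zip(x, velocity))
    return path


def arnold_action(path, T, n_steps, sample_points):
    """Approximate T * int_0^T int_D 1/2 |dg/dt|^2 dmu_D dt.

    Time derivatives are replaced by forward differences, and the integral
    over D by a plain average over sample_points.
    """
    dt = T / n_steps
    total = 0.0
    for k in range(n_steps):
        t0, t1 = k * dt, (k + 1) * dt
        for x in sample_points:
            g0, g1 = path(t0, x), path(t1, x)
            speed_sq = sum((b - a) ** 2 for a, b in zip(g0, g1)) / dt ** 2
            total += 0.5 * speed_sq * dt
    return T * total / len(sample_points)


points = [(0.1, 0.2), (0.7, 0.4), (0.5, 0.9)]
action = arnold_action(translation_path((0.3, 0.4)), T=2.0, n_steps=50, sample_points=points)
print(action)  # for a translation with velocity v this should be close to T**2 * |v|**2 / 2
```

For a translation the speed is constant in time and space, so the value depends on T only through the factor T^2 |v|^2 / 2; rescaling the time interval while keeping the endpoints fixed (v replaced by v/T) leaves it unchanged, which is the scale invariance mentioned above.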
Although in the traditional approach to (3.1.1) the initial velocity is prescribed, while
in the minimization of (3.1.3) it is not, this variational problem is of independent interest
and leads to deep mathematical questions, namely existence of relaxed solutions, gap
phenomena and necessary and sufficient optimality conditions, that are investigated in
this chapter. We also remark that no existence result of distributional solutions of (3.1.1)
is known when d > 2 (the case d = 2 is different, thanks to the vorticity formulation of
(3.1.1)), see [94], [36] for a discussion on this topic and other concepts of weak solutions
to (3.1.1).
On the positive side, Ebin and Marsden proved in [58] that, when D is a smooth
compact manifold with no boundary, the minimization of (3.1.3) leads to a unique solution,
corresponding also to a solution to Euler equations, if f and h are sufficiently close
in a suitable Sobolev norm.
On the negative side, Shnirelman proved in [121], [122] that when d ≥ 3 the infimum
is not attained in general, and that when d = 2 there exists h ∈ SDiff(D) which cannot
be connected to i by a path with finite action. These “negative” results motivate the
study of relaxed versions of Arnold’s problem.
The first relaxed version of Arnold’s minimization problem was introduced by Brenier
in [31]: he considered probability measures η in Ω(D), the space of continuous paths
ω : [0, T ] → D, and minimized the energy
\[
\mathcal{A}_T(\eta) := T\int_{\Omega(D)}\int_0^T \frac12\,|\dot\omega(\tau)|^2\,d\tau\,d\eta(\omega),
\]
(here and in the sequel et (ω) := ω(t) are the evaluation maps at time t). According to
Brenier, we shall call these η generalized incompressible flows in [0, T ] between i and
h. Obviously any sufficiently regular path g(t, ·) : [0, 1] → S(D) induces a generalized
incompressible flow η = (Φg )# µD , where Φg : D → Ω(D) is given by Φg (x) = g(·, x), but
the converse is far from being true: the main difference between classical and generalized
flows consists in the fact that fluid paths starting from different points are allowed to
cross at a later time, and fluid paths starting from the same point are allowed to split at
a later time. This approach is by now quite common, see for instance [4] (DiPerna-Lions
theory), [25] (branched optimal transportation), [97], [133].
Brenier’s formulation makes sense not only if h ∈ SDiff(D), but also when h ∈
S(D), where S(D) is the space of measure-preserving maps h : D → D, not necessarily
invertible or smooth. In the case D = [0, 1]d , existence of admissible paths with finite
action connecting i to any h ∈ S(D) was proved in [31], together with the existence
of paths with minimal action. Furthermore, a consistency result was proved: smooth
solutions to (3.1.1) are optimal even in the larger class of the generalized incompressible
flows, provided the pressure field p satisfies
and are the unique ones if the inequality is strict. When η = (Φg )# µD we can recover
g(t, ·) from η using the identity
Brenier found in [31] examples of action-minimizing paths η (for instance in the unit
ball of R2 , between i and −i) where no such representation is possible. The same
examples show that the upper bound (3.1.5) is sharp. Notice however that (e0 , et )# η is
a measure-preserving plan, i.e. a probability measure in D × D having both marginals
equal to µD . Denoting by Γ(D) the space of measure-preserving plans, it is therefore
natural to consider t ↦ (e0 , et )# η as a “minimizing geodesic” between i and h in the
larger space of measure-preserving plans. Then, to be consistent, one has to extend
Brenier’s minimization problem considering paths connecting γ, η ∈ Γ(D). We define
this extension, which also turns out to be useful to connect this model to the Eulerian-
Lagrangian one in [35], and to obtain necessary and sufficient optimality conditions even
when only “deterministic” data i and h are considered (because, as we said, the path
might be non-deterministic in between). In this presentation of our results, however, to
simplify the matter as much as possible, we shall consider the case of paths η between
i and h ∈ S(D) only.
In Section 3.5 we study the relation between the relaxation δ∗ of the Arnold distance,
defined by
\[
\delta_*(h) := \inf\left\{ \liminf_{n\to\infty} \delta(i,h_n) \;:\; h_n\in \mathrm{SDiff}(D),\ \int_D |h_n-h|^2\,d\mu_D \to 0 \right\},
\]
and the distance δ(i, h) arising from the minimization of the Lagrangian model. It is not
hard to show that δ(i, h) ≤ δ∗ (h), and a natural question is whether equality holds, or a
gap phenomenon occurs. In the case D = [0, 1]d with d > 2, an important step forward
was obtained by Shnirelman in [122], who proved that equality holds when h ∈ SDiff(D);
Shnirelman’s construction provides an approximation (with convergence of the action)
of generalized flows connecting i to h by smooth flows still connecting i to h. The main
result of this section is the proof that no gap phenomenon occurs, still in the case D =
[0, 1]d with d > 2, even when non-deterministic final data (i.e. measure-preserving plans)
are considered. The proof of this fact is based on an auxiliary approximation result,
Theorem 3.5.3, valid in any number of dimensions, which we believe to be of independent
interest: it allows one to approximate, with convergence of the action, any generalized flow
η in [0, 1]d by W 1,2 flows (in time) induced by measure-preserving maps g(t, ·). This
fact shows that the “negative” result of Shnirelman on the existence in dimension 2 of
non-attainable diffeomorphisms is due to the regularity assumption on the path, and it
is false if one allows for paths in the larger space S(D). The proof of Theorem 3.5.3 uses
some key ideas from [122] (in particular the combination of law of large numbers and
smoothing of discrete families of trajectories), and some ideas coming from the theory
of optimal transportation.
Minimizing generalized paths η are not unique in general, as shown in [31]; how-
ever, Brenier proved in [33] that the gradient of the pressure field p, identified by the
distributional relation
is indeed unique. Here v t (x) is the “effective velocity”, defined by (et )# (ω̇(t)η) = v t µD ,
and v ⊗ v t is the quadratic effective velocity, defined by (et )# (ω̇(t) ⊗ ω̇(t)η) = v ⊗ v t µD .
The proof of this fact is based on the so-called dual least action principle: if η is optimal,
we have
\[
\mathcal{A}_T(\nu) \ge \mathcal{A}_T(\eta) + \langle p, \rho^{\nu} - 1\rangle \tag{3.1.7}
\]
for any measure ν in Ω(D) such that (e0 , eT )# ν = (i, h)# µD and $\|\rho^{\nu} - 1\|_{C^1} \le 1/2$. Here
ρν is the (absolutely continuous) density produced by the flow ν, defined by ρν (t, ·)µD =
(et )# ν. In this way, the incompressibility constraint can be slightly relaxed and one can
work with the augmented functional (still minimized by η)
for any generalized incompressible flow ν. Taking also the constraint (e0 , eT )# ν =
(i, h)# µ into account, we get
\[
\mathcal{A}_T(\nu) = T\int_{\Omega(\mathbb{T}^d)} \left( \int_0^T \frac12\,|\dot\omega(t)|^2 - q(t,\omega)\,dt \right) d\nu(\omega) \ge \int_{\mathbb{T}^d} c_q^T(x,h(x))\,d\mu_{\mathbb{T}}(x),
\]
where $c_q^T(x,y)$ is the minimal cost associated with the Lagrangian $T\int_0^T \frac12|\dot\omega(t)|^2 - q(t,\omega)\,dt$.
Since this lower bound depends only on h, we obtain that any η satisfying (3.1.4) and
concentrated on $c_q$-minimal paths, for some q ∈ L1 , is optimal, and $\delta^2(i,h) = \int c_q^T(i,h)\,d\mu_{\mathbb{T}}$.
This is basically the argument used by Brenier in [31] to show the minimality of smooth
solutions to (3.1.1), under assumption (3.1.5): indeed, this condition guarantees that
solutions of ω̈(t) = −∇p(t, ω) (i.e. stationary paths for the Lagrangian, with q = p) are
also minimal.
We are able to show that basically this condition is necessary and sufficient for
optimality if the pressure field is globally integrable (see Theorem 3.6.12). However,
since no global in time regularity result for the pressure field is presently known, we have
also been looking for necessary and sufficient optimality conditions that don’t require the
global integrability of the pressure field. Using the regularity p ∈ L1loc ((0, T ); Lr (D)) for
some r > 1, guaranteed in the case D = Td with r = d/(d − 1) by the results contained
in the last section, we show in Theorem 3.6.8 that any optimal η is concentrated on
locally minimizing paths for the Lagrangian
\[
L_p(\omega) := \int \frac12\,|\dot\omega(t)|^2 - p(t,\omega)\,dt. \tag{3.1.8}
\]
Since we are going to integrate p along curves, this statement is not invariant un-
der modifications of p in negligible sets, and the choice of a specific representative
p̄(t, x) := lim inf ε↓0 p(t, ·) ∗ φε (x) in the Lebesgue equivalence class is needed. Moreover,
the necessity of pointwise uniform estimates on pε requires the integrability of M p(t, x),
the maximal function of p(t, ·) at x (see (3.6.11)).
In addition, we identify a second necessary (and more hidden) optimality condition.
In order to state it, let us consider an interval [s, t] ⊂ (0, T ) and the cost function
\[
c_p^{s,t}(x,y) := \inf\left\{ \int_s^t \frac12\,|\dot\omega(\tau)|^2 - p(\tau,\omega)\,d\tau \;:\; \omega(s)=x,\ \omega(t)=y,\ Mp(\tau,\omega)\in L^1(s,t) \right\} \tag{3.1.9}
\]
(the assumption M p(τ, ω) ∈ L1 (s, t) is forced by technical reasons). Recall that, accord-
ing to the theory of optimal transportation, a probability measure λ in Td × Td is said
to be c-optimal if
\[
\int_{\mathbb{T}^d\times\mathbb{T}^d} c(x,y)\,d\lambda' \ge \int_{\mathbb{T}^d\times\mathbb{T}^d} c(x,y)\,d\lambda
\]
3.2. Notation and preliminary results 87
for any probability measure λ′ having the same marginals µ1 , µ2 of λ. We shall also
denote by $W_c(\mu_1,\mu_2)$ the minimal value, i.e. $\int_{\mathbb{T}^d\times\mathbb{T}^d} c\,d\lambda$, with λ c-optimal. Now, let η be an
optimal generalized incompressible flow between i and h; according to the disintegration
theorem, we can represent $\eta = \int \eta_a\,d\mu_D(a)$, with η a concentrated on curves starting
at a (and ending, since our final condition is deterministic, at h(a)), and consider the
plans $\lambda_a^{s,t} = (e_s,e_t)_\#\eta_a$. We show that
Roughly speaking, this condition tells us that one has not only to move mass from x to y
achieving $c_p^{s,t}$, but also to optimize the distribution of mass between time s and time t. In
the “deterministic” case when either (e0 , es )# η or (e0 , et )# η is induced by a transport
map g, the plan $\lambda_a^{s,t}$ has $\delta_{g(a)}$ either as first or as second marginal, and therefore it is
uniquely determined by its marginals (it is indeed the product of them). This is the
reason why condition (3.1.10) does not show up in the deterministic case.
Finally, we show in Theorem 3.6.12 that the two conditions are also sufficient, even
on general manifolds D: if, for some r > 1 and q ∈ L1loc ((0, T ); Lr (D)), a generalized
incompressible flow η concentrated on locally minimizing curves for the Lagrangian Lq
satisfies
for all [s, t] ⊂ (0, T ), $\lambda_a^{s,t}$ is $c_q^{s,t}$-optimal for µD -a.e. a ∈ D,
a complete and separable distance. We endow a Polish space X with the correspond-
ing Borel σ-algebra and denote by P(X) (resp. M+ (X), M (X)) the family of Borel
probability (resp. nonnegative and finite, real and with finite total variation) mea-
sures in X. For A ⊂ X and µ ∈ M (X) the restriction µ⌞A of µ to A is defined by
µ⌞A(B) := µ(A ∩ B). We will denote by i : X → X the identity map.
It is easy to check that f# µ has finite total variation as well, and that |f# µ| ≤ f# |µ|.
An elementary approximation by simple functions shows the change of variable formula
\[
\int_Y g\,d f_\#\mu = \int_X g\circ f\,d\mu \tag{3.2.1}
\]
for any bounded Borel function (or even either nonnegative or nonpositive, and R-valued,
in the case µ ∈ M+ (X)) g : Y → R.
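On a finite space the change of variable formula (3.2.1) and the bound |f# µ| ≤ f# |µ| can be checked by hand. The sketch below does this with exact rational arithmetic; the measures `mu` and `signed`, the maps `f` and `g`, and the helper names are hypothetical examples, not objects from the text.

```python
from collections import defaultdict
from fractions import Fraction


def push_forward(mu, f):
    """Push-forward of a finite (possibly signed) measure, stored as {point: mass}."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[f(x)] += mass
    return dict(nu)


def integrate(g, mu):
    """Integral of g against the finite measure mu."""
    return sum(g(x) * mass for x, mass in mu.items())


def total_variation(mu):
    """Total variation of a finite signed measure."""
    return sum(abs(mass) for mass in mu.values())


# Uniform probability measure on {0, ..., 5}, pushed through x -> x mod 3.
mu = {x: Fraction(1, 6) for x in range(6)}
f = lambda x: x % 3
g = lambda y: y * y + 1

lhs = integrate(g, push_forward(mu, f))      # integral of g against f_# mu
rhs = integrate(lambda x: g(f(x)), mu)       # integral of g o f against mu
print(lhs, rhs)

# A signed measure whose masses at 0 and 3 cancel after the push-forward.
signed = {0: Fraction(1, 2), 3: Fraction(-1, 2), 1: Fraction(1, 4)}
abs_signed = {x: abs(m) for x, m in signed.items()}
print(total_variation(push_forward(signed, f)), total_variation(push_forward(abs_signed, f)))
```

The second example shows that the inequality |f# µ| ≤ f# |µ| can be strict, because positive and negative masses may cancel when two points are mapped to the same image.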
In this chapter we use only the “easy” implication in Prokhorov theorem, namely
that any tight family is sequentially relatively compact. It is immediate to check that a
sufficient condition for tightness of a family F of probability measures is the existence
of a coercive functional Ψ : X → [0, +∞] (i.e. a functional such that its sublevel sets
{Ψ ≤ t}, t ∈ R+ , are relatively compact in X) such that
\[
\int_X \Psi(x)\,d\mu(x) \le 1 \qquad \forall \mu\in\mathcal{F}.
\]
Lemma 3.2.3 ([14], Lemma 2.4). Let µ ∈ P(X) and u ∈ L2 (X; Rm ). Then, for any
Borel map f : X → Y , f# (uµ) ≪ f# µ and its density v with respect to f# µ satisfies
\[
\int_Y |v|^2\,d f_\#\mu \le \int_X |u|^2\,d\mu.
\]
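In a discrete setting the density v of Lemma 3.2.3 is simply a weighted average of u over each fibre of f, and the inequality is Jensen's inequality. A minimal sketch, assuming a finite space and a scalar u (the data `mu`, `u`, `f` and the helper `pushed_density` are hypothetical):

```python
from collections import defaultdict
from fractions import Fraction


def pushed_density(mu, u, f):
    """Return (nu, v) with nu = f_# mu and v the density of f_#(u mu) w.r.t. nu.

    mu is a finite probability measure {point: mass} with positive masses.
    """
    nu = defaultdict(Fraction)
    weighted = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[f(x)] += mass
        weighted[f(x)] += u(x) * mass
    v = {y: weighted[y] / nu[y] for y in nu}
    return dict(nu), v


mu = {x: Fraction(1, 4) for x in range(4)}   # uniform measure on {0, 1, 2, 3}
u = lambda x: x                              # scalar field (m = 1)
f = lambda x: x // 2                         # merges {0, 1} and {2, 3}

nu, v = pushed_density(mu, u, f)
lhs = sum(v[y] ** 2 * nu[y] for y in nu)           # integral of |v|^2 against f_# mu
rhs = sum(u(x) ** 2 * m for x, m in mu.items())    # integral of |u|^2 against mu
print(v, lhs, rhs)
```

Equality would require u to be constant on each fibre of f, which mirrors the equality case stated later for (3.4.2).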
for every nonnegative Borel map f . Conversely, any λ and any Borel map x ↦ µx ∈ P(Y )
induce a probability measure µ in X × Y via (3.2.2).
Function spaces. We shall denote by Ω(D) the space C([0, T ]; D), and by ω : [0, T ] →
D its typical element. The evaluation maps at time t, ω 7→ ω(t), will be denoted by et .
If D is a smooth, compact Riemannian manifold without boundary (typically the
d-dimensional flat torus Td ), we shall denote by µD its volume measure, and by dD its
Riemannian distance, normalizing the Riemannian metric so that µD is a probability
measure. Although it does not fit exactly in this framework, we occasionally consider
also the case D = [0, 1]d , because many results have already been obtained in this
particular case.
We shall often consider measures η ∈ M+ (Ω(D)) such that (et )# η ≪ µD ; in this
case we shall denote by ρη : [0, T ] × D → [0, +∞] the density, characterized by
S(D) := {g : D → D : g# µD = µD } . (3.2.3)
We also set
S i (D) := {g ∈ S(D) : g is µD -essentially injective} . (3.2.4)
For any g ∈ S i (D) the inverse g −1 is well defined up to µD -negligible sets, µD -measurable,
and g −1 ◦ g = i = g ◦ g −1 µD -a.e. in D. In particular, if g ∈ S i (D), g −1 ∈ S i (D).
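A standard example of a map in S(D) that is not in S i (D) is the doubling map x ↦ 2x mod 1 on D = [0, 1). The sketch below checks both properties on a uniform grid, using integer arithmetic so that the bin counts are exact; the function names and grid sizes are illustrative assumptions.

```python
from fractions import Fraction


def doubling(x):
    """The doubling map x -> 2x mod 1 on [0, 1)."""
    return (2 * x) % 1


def doubling_histogram(n_points, n_bins):
    """Histogram of the images of the grid points x_k = (k + 1/2) / n_points.

    The image of x_k is ((2k + 1) mod n_points) / n_points, so only the integer
    numerator is tracked. Equal counts in every bin reflect measure preservation.
    """
    counts = [0] * n_bins
    for k in range(n_points):
        numerator = (2 * k + 1) % n_points
        counts[numerator * n_bins // n_points] += 1
    return counts


print(doubling_histogram(1000, 10))
# Two distinct points with the same image: the map is not essentially injective.
print(doubling(Fraction(1, 4)), doubling(Fraction(3, 4)))
```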
We shall also denote by Γ(D) the family of measure-preserving plans, i.e. the prob-
ability measures in D × D whose first and second marginal are µD :
γg := (i × g)# µD .
for any $\phi \in C^1((0,T)\times D)$ with bounded first derivatives and support contained in
J × D, with J ⋐ (0, T ). In the case when $D\subset\mathbb{R}^d$ is compact, we shall consider
functions $\phi\in C^1((0,T)\times\mathbb{R}^d)$, again with support contained in $J\times\mathbb{R}^d$, with J ⋐ (0, T ).
The following general principle allows one to lift solutions of the continuity equation to
measures in the space of continuous paths.
Theorem 3.2.4 (Superposition principle). Assume that either D is a compact subset
of Rd , or D is a smooth compact Riemannian manifold without boundary, and let µt :
[0, T ] → P(D) be a narrowly continuous solution of the continuity equation (3.2.8) for
a suitable velocity field v(t, x) = v t (x) satisfying $\|v_t\|^2_{L^2(\mu_t)} \in L^1(0,T)$. Then there exists
η ∈ P(Ω(D)) such that
(i) µt = (et )# η for all t ∈ [0, T ];
(ii) the following energy inequality holds:
\[
\int_{\Omega(D)}\int_0^T |\dot\omega(t)|^2\,dt\,d\eta(\omega) \le \int_0^T\!\!\int_D |v_t|^2\,d\mu_t\,dt.
\]
Proof. In the case when $D = \mathbb{R}^d$ (and therefore also when $D\subset\mathbb{R}^d$ is closed) this result
is proved in Theorem 8.2.1 of [11] (see also [16], [123], [21] for related results). In the case
when D is a smooth, compact Riemannian manifold we recover the same result thanks
to an isometric embedding in $\mathbb{R}^m$, for m large enough. □
3.3. Variational models for generalized geodesics 91
Let γ ∈ Γ(D) be given; the class of admissible paths, called by Brenier generalized
incompressible flows, is made by the probability measures η on Ω(D) such that
(et )# η = µD ∀t ∈ [0, T ].
where
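\[
\mathcal{A}_T(\omega) :=
\begin{cases}
T\displaystyle\int_0^T \frac12\,|\dot\omega(t)|^2\,dt & \text{if $\omega$ is absolutely continuous in $[0,T]$,}\\
+\infty & \text{otherwise,}
\end{cases}
\tag{3.3.2}
\]
and $\delta^2(\gamma_i,\gamma)$ is defined by minimizing AT (η) among all generalized incompressible flows
η connecting γi to γ, i.e. those satisfying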
(e0 , eT )# η = γ. (3.3.3)
Notice that it is not clear, in this purely Lagrangian formulation, how the relaxed
distance δ(η, γ) between two measure preserving plans might be defined, not even when η
and γ are induced by maps g, h. Only when g ∈ S i (D) we might use the right invariance
and define δ(γg , γh ) := δ(γi , γh◦g−1 ).
These remarks led us to the following more general problem: let us denote
Ω̃(D) := Ω(D) × D,
whose typical element will be denoted by (ω, a), and let us denote by πD : Ω̃(D) → D
the canonical projection. We consider probability measures η in Ω̃(D) having µD as
second marginal, i.e. (πD )# η = µD ; they can be canonically represented as η a ⊗ µD ,
where η a ∈ P(Ω(D)). The incompressibility constraint now becomes
\[
\int_D (e_t)_\#\eta_a\,d\mu_D(a) = \mu_D \qquad \forall t\in[0,T], \tag{3.3.4}
\]
Then, we define $\delta^2(\eta,\gamma)$ by minimizing the action
\[
\int_{\tilde\Omega(D)} \mathcal{A}_T(\omega)\,d\eta(\omega,a)
\]
with $g = \chi^{-1}$. Therefore, choosing g(s) = s + εφ(s), with $\phi\in C_c^1(0,T)$, the first variation
gives
\[
\int_0^T \left( \int_{\tilde\Omega(D)} |\dot\omega|^2(s)\,d\eta(\omega,a) \right) \dot\phi(s)\,ds = 0.
\]
This proves that $s\mapsto \int_{\tilde\Omega(D)} |\dot\omega|^2(s)\,d\eta(\omega,a)$ is equivalent to a constant. We shall call the
square root of this quantity speed of η.
Remark 3.3.2 (Restriction and concatenation). Let [s, t] ⊂ [0, T ] and let rs,t :
C([0, T ]; D) → C([s, t]; D) be the restriction map. It is immediate to check that, for
any generalized incompressible flow η = η a ⊗ µD in [0, T ] between η and γ, the measure
with η a,x , ν a,x concentrated on the curves ω with ω(l) = x. We can then consider
the image λx,a , via the concatenation of paths (from the product of C([s, l]; D) and
C([l, t]; D) to C([s, t]; D)), of the product measure η a,x × ν a,x to obtain a probability
measure in C([s, t]; D) concentrated on paths passing through x at time l. Finally,
setting
\[
\lambda = \int_{D\times D} \lambda_{x,a}\,d(\gamma_a\otimes\mu_D)(x,a),
\]
we obtain a generalized incompressible flow in [s, t] joining η to θ with action given by
\[
\frac{t-s}{l-s}\,\mathcal{A}_{[s,l]}(\eta) + \frac{t-s}{t-l}\,\mathcal{A}_{[l,t]}(\nu),
\]
where A[s,l] (η) is the action of η in [s, l] and A[l,t] (ν) is the action of ν in [l, t] (strictly
speaking, the action of their restrictions).
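The rescaling factors in the action of the concatenated flow can be verified directly. The computation below is a sketch under the assumption that the action on a sub-interval [a, b] is normalized as $\mathcal{A}_{[a,b]}(\eta) := (b-a)\int\!\int_a^b \frac12|\dot\omega(\tau)|^2\,d\tau\,d\eta(\omega)$, which matches the definition of AT on [0, T ] but is not restated in this passage. It also uses that the restrictions of λ to [s, l] and [l, t] have the same laws as the restrictions of η and ν.

```latex
\[
\begin{aligned}
\mathcal{A}_{[s,t]}(\lambda)
 &= (t-s)\int\!\!\int_s^t \tfrac12\,|\dot\omega(\tau)|^2\,d\tau\,d\lambda(\omega)\\
 &= (t-s)\int\!\!\int_s^l \tfrac12\,|\dot\omega(\tau)|^2\,d\tau\,d\eta(\omega)
  + (t-s)\int\!\!\int_l^t \tfrac12\,|\dot\omega(\tau)|^2\,d\tau\,d\nu(\omega)\\
 &= \frac{t-s}{l-s}\,\mathcal{A}_{[s,l]}(\eta) + \frac{t-s}{t-l}\,\mathcal{A}_{[l,t]}(\nu).
\end{aligned}
\]
```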
A simple consequence of the previous remarks is that δ is a distance in Γ(D) (it
suffices to concatenate flows with unit speed); in addition, the restriction of an optimal
incompressible flow η = η a ⊗ µD between ηa ⊗ µD and γa ⊗ µD to an interval [s, t] is still
an optimal incompressible flow in [s, t] between the plans (es )# η a ⊗µD and (et )# η a ⊗µD .
This property will be useful in Section 3.6.
Another important property of δ that will be useful in Section 3.6 is its lower semi-
continuity with respect to the narrow convergence, that we are going to prove in the
next theorem. Another non-trivial fact is the existence of at least one generalized in-
compressible flow with finite action. In [31, Section 4] Brenier proved the existence of
such a flow in the case D = Td . Then in [122, Section 2], using a (non-injective) Lips-
chitz measure-preserving map from Td to [0, 1]d , Shnirelman produced a flow with finite
action also in this case (see also [35, Section 3]). In the next theorem we will show how
to construct a flow with finite action in a compact subset D whenever flows with finite
action can be built in D′ and a possibly non-injective, Lipschitz and measure-preserving
map f : D′ → D exists.
Theorem 3.3.3. Assume that D ⊂ Rd is a compact set. Then the infimum in the
definition of δ(η, γ) is achieved,
and
δ(γi , γh ) ≤ δ(i, h) ∀h ∈ SDiff(D). (3.3.8)
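Furthermore, $\sup_{\eta,\gamma\in\Gamma(D)} \delta(\eta,\gamma) \le \sqrt{d}$ when either $D = [0,1]^d$ or $D = \mathbb{T}^d$ and, more generally,
\[
\sup_{\gamma\in\Gamma(D)} \delta_D(\gamma_i,\gamma) \le \mathrm{Lip}(f) \sup_{\gamma'\in\Gamma(D')} \delta_{D'}(\gamma_i,\gamma')
\]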
Clearly the first marginal of γ′ is $\mu_{D'}$; since h ∈ S(D), changing variables in (3.3.9) one
has $\mu_{D'} = \int_D \mu_{h(y)}\,d\mu_D(y)$, and so also the second marginal of γ′ is $\mu_{D'}$. Let us now prove
that $(f\times f)_\#\gamma' = (i\times h)_\#\mu_D$: for any φ ∈ Cb (D × D) we have
\[
\begin{aligned}
\int_{D\times D} \phi(y,y')\,d(f\times f)_\#\gamma'(y,y')
 &= \int_{D'\times D'} \phi(f(x),f(x'))\,d\gamma'(x,x')\\
 &= \int_D\int_{D'\times D'} \phi(f(x),f(x'))\,d\mu_y(x)\,d\mu_{h(y)}(x')\,d\mu_D(y)\\
 &= \int_D \phi(y,h(y))\,d\mu_D(y),
\end{aligned}
\]
where in the last equality we used that $\mu_y$ is concentrated on $f^{-1}(y)$ and $\mu_{h(y)}$ is concentrated
on $f^{-1}(h(y))$ for µD -a.e. y. □
By (3.3.1), (3.3.8) and the narrow lower semicontinuity of δ(i, ·) we get
We conclude this section by pointing out some additional properties of the metric
space (Γ(D), δ).
Proposition 3.3.4. (Γ(D), δ) is a complete metric space, whose convergence implies
narrow convergence. Furthermore, the distance δ is right invariant under the action of
S i (D) on Γ(D). Finally, δ-convergence is strictly stronger than narrow convergence and,
as a consequence, (Γ(D), δ) is not compact.
Proof. We will prove that δ(η, γ) ≥ W2 (η, γ), where W2 is the quadratic Wasserstein
distance in P(D × D) (with the quadratic cost $c((x_1,x_2),(y_1,y_2)) = d_D^2(x_1,y_1)/2 + d_D^2(x_2,y_2)/2$);
as this distance metrizes the narrow convergence, this will give the implication
between δ-convergence and narrow convergence. In order to show the inequality
δ(η, γ) ≥ W2 (η, γ) we consider an optimal flow η a ⊗ µD defined in [0, 1]; then, denoting
by ωa ∈ Ω(D) the constant path identically equal to a, and by ν a ∈ P(C([0, 1]; D × D))
the measure η a × δωa , the measure $\nu := \int_D \nu_a\,d\mu_D(a)$ ∈ P(C([0, 1]; D × D)) provides a
“dynamical transference plan” connecting η to γ (i.e. (e0 )# ν = η, (e1 )# ν = γ, see [133,
Chapter 7]) whose action is $\delta^2(\eta,\gamma)$; since the action of any dynamical transference plan
bounds from above $W_2^2(\eta,\gamma)$, the inequality is achieved.
and we need only to integrate this inequality with respect to a. From (3.3.11) we obtain
that S(D) is a closed subset of Γ(D), relative to the distance δ. In particular, considering
for instance a sequence (gn ) ⊂ S(D) narrowly converging to γ ∈ Γ(D) \ S(D), whose
existence is ensured by (3.2.6), one proves that the two topologies are not equivalent and
the space is not compact. ¤
Combining right invariance with (3.3.10), we obtain
if D = [0, 1]d with d ≥ 3. By the density of S i (D) in S(D) in the L2 norm and the lower
semicontinuity of δ, this inequality still holds when g ∈ S(D).
the case η = γi , simply labels the position of the particle at time 0) and to consider the
family of distributional solutions, indexed by a ∈ D, of the continuity equation
∂t ct,a + div(v t,a ct,a ) = 0 in $\mathcal{D}'((0,T)\times D)$, for µD -a.e. a, (3.3.13)
with the initial and final conditions
c0,a = ηa , cT,a = γa , for µD -a.e. a. (3.3.14)
Notice that minimization of the kinetic energy $\int_0^T\!\int_D |v_{t,a}|^2\,dc_{t,a}\,dt$ among all possible
solutions of the continuity equation would give, according to [19], the optimal transport
problem between ηa and γa (for instance, a path of Dirac masses on a geodesic connecting
g(a) to h(a) if ηa = δg(a) , γa = δh(a) ). Here, instead, by averaging with respect to a we
minimize the mean kinetic energy
\[
\int_D\int_0^T\!\!\int_D |v_{t,a}|^2\,dc_{t,a}\,dt\,d\mu_D(a)
\]
with the only global constraint between the family {ct,a } given by the incompressibility
of the flow:
\[
\int_D c_{t,a}\,d\mu_D(a) = \mu_D \qquad \forall t\in[0,T]. \tag{3.3.15}
\]
It is useful to rewrite this minimization problem in terms of the global measure c in
[0, T ] × D × D and the measures ct in D × D
\[
c := c_{t,a}\otimes(\mathcal{L}^1\times\mu_D), \qquad c_t := c_{t,a}\otimes\mu_D
\]
(from which ct,a can obviously be recovered by disintegration), and the velocity field
v(t, x, a) := v t,a (x): the action becomes
\[
\mathcal{A}_T(c,v) := T\int_0^T\!\!\int_{D\times D} \frac12\,|v(t,x,a)|^2\,dc(t,x,a),
\]
while (3.3.13) is easily seen to be equivalent to
\[
\frac{d}{dt}\int_{D\times D} \phi(x,a)\,dc_t(x,a) = \int_{D\times D} \langle \nabla_x\phi(x,a), v(t,x,a)\rangle\,dc_t(x,a) \tag{3.3.16}
\]
for any η, γ ∈ Γ(D). More precisely, any minimizer η of the Lagrangian model connect-
ing η to γ induces in a canonical way a minimizer (c, v) of the Eulerian-Lagrangian one,
and satisfies for L 1 -a.e. t ∈ [0, T ] the condition
Notice that $m^\eta_{t,a}$ is well defined for $\mathcal{L}^1$-a.e. t, and absolutely continuous with respect to
$c^\eta_{t,a}$, thanks to Lemma 3.2.3; moreover, denoting by $v^\eta_{t,a}$ the density of $m^\eta_{t,a}$ with respect
to $c^\eta_{t,a}$, by the same lemma we have
\[
\int_D |v^\eta_{t,a}|^2\,dc^\eta_{t,a} \le \int_{\Omega(D)} |\dot\omega(t)|^2\,d\eta_a(\omega), \tag{3.4.2}
\]
with equality only if $\dot\omega(t) = v^\eta_{t,a}(e_t(\omega))$ for η a -a.e. ω. Then, we define the global measure
and velocity by
More generally the relaxed distance δ(η, γ) arising from the Lagrangian model can be
compared, at least when η = γi and the final condition γ is induced by a map h ∈ S(D),
with the relaxation δ∗ of the Arnold distance:
\[
\delta_*(h) := \inf\left\{ \liminf_{n\to\infty} \delta(i,h_n) \;:\; h_n\in \mathrm{SDiff}(D),\ \int_D |h_n-h|^2\,d\mu_D \to 0 \right\}. \tag{3.5.2}
\]
By (3.3.7) and (3.3.8), we have δ∗ (h) ≥ δ(γi , γh ), and a gap phenomenon is said to occur
if the inequality is strict.
In the case d = 2, while examples of h ∈ SDiff(D) such that δ(i, h) = +∞ are known
[121], the nature of δ∗ (h) and the possible occurrence of the gap phenomenon are not
clear.
In this section we prove the non-occurrence of the gap phenomenon when the fi-
nal condition belongs to S(D), and even when it is a transport plan, still under the
assumption d ≥ 3. To this aim, we first extend the definition of δ∗ by setting
\[
\delta_*(\gamma) := \inf\left\{ \liminf_{n\to\infty} \delta(i,h_n) \;:\; h_n\in \mathrm{SDiff}(D),\ \gamma_{h_n}\to\gamma \text{ narrowly} \right\}. \tag{3.5.3}
\]
This extends the previous definition (3.5.2), taking into account that γhn narrowly
converges to γh if and only if hn → h in L2 (µD ) (for instance, this is a simple consequence of
[14, Lemma 2.3]).
The proof of the theorem, given at the end of this section, is a direct consequence of
Theorem 3.5.1 and of the following approximation result of generalized incompressible
flows by measure-preserving maps (possibly not smooth, or not injective), valid in any
number of dimensions.
Theorem 3.5.3. Let γ ∈ Γ(D). Then, for any probability measure η on Ω(D) such that
and AT (η) < ∞, there exists a sequence of flows (gk (t, ·))k∈N ⊂ W 1,2 ([0, T ]; L2 (D)) such
that:
(i) gk (t, ·) ∈ S(D) for all t ∈ [0, T ], hence η k := (Φgk )# µD , with Φgk (x) = gk (·, x), are
generalized incompressible flows;
Proof. The first three steps of the proof are more or less the same as in the proof of
Shnirelman’s approximation theorem (Theorem 3.5.1 in [122]).
Step 1. Given ε > 0 small, consider the affine transformation of D into the concentric
cube Dε of size 1 − 4ε:
This transformation induces a map T̃ε from Ω(D) into C([0, T ]; Dε ) (which is indeed a
bijection) given by
T̃ε (ω)(t) := Tε (ω(t)) ∀ω ∈ Ω(D).
Then we define η̃ ε := (T̃ε )# η, and
η ε := (1 − 4ε)d η̃ ε + η 0,ε ,
where η 0,ε is the “steady” flow in D \ Dε : it consists of all the curves in D \ Dε that
do not move for 0 ≤ t ≤ T . It is then not difficult to prove that η ε → η narrowly and
AT (η ε ) → AT (η), as ε → 0.
Therefore, by a diagonal argument, it suffices to prove our theorem for a measure
η which is steady near ∂D. More precisely we can assume that, if ω(0) is in the 2ε-
neighborhood of ∂D, then ω(t) ≡ ω(0) for η-a.e. ω. Moreover, arguing as in Step 1 of
the proof of the above mentioned approximation theorem in [122], we can assume that
the flow does not move for 0 ≤ t ≤ ε, that is, for η-a.e. ω, ω(t) ≡ ω(0) for 0 ≤ t ≤ ε.
Step 2. Let us now consider a family of independent random variables ω1 , ω2 , . . .
defined in a common probability space (Z, Z, P ), with values in C([0, T ], D) and having
the same law η. Recall that η is steady near ∂D and for 0 ≤ t ≤ ε, so we can see ωi
as random variables with values in the subset of Ω(D) given by the curves which do not
move for 0 ≤ t ≤ ε and in the 2ε-neighbourhood of ∂D. By the law of large numbers,
the random probability measures in Ω(D)
\[
\nu^N(z) := \frac1N \sum_{i=1}^N \delta_{\omega_i(z)}, \qquad z\in Z,
\]
narrowly converge to η with probability 1. Moreover, again by the law of large numbers,
also
AT (ν N (z)) → AT (η)
with probability 1. Thus, choosing properly z, we have approximated η with measures
ν N concentrated on a finite number of trajectories ωi (z)(·) which are steady in [0, ε] and
close to ∂D. From now on (as is typical in probability theory) the parameter z will be
tacitly understood.
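The convergence AT (ν N (z)) → AT (η) in Step 2 is a plain law of large numbers for the action of independent random paths, since the action of the empirical measure is the average of the individual actions. The sketch below illustrates this with a toy law on straight paths in one dimension with T = 1. The law, the names `sample_action` and `empirical_action`, and the seed are assumptions for illustration; in particular this toy law is not claimed to be incompressible.

```python
import random


def sample_action(rng):
    """Action of one random straight path w(t) = x + t * v on [0, 1], with T = 1.

    The velocity v is uniform on [-1, 1], and the action of the path is
    T * int_0^1 1/2 |v|^2 dt = v**2 / 2 (it does not depend on the start x).
    """
    v = rng.uniform(-1.0, 1.0)
    return 0.5 * v * v


def empirical_action(n_paths, seed=0):
    """Action of the empirical measure (1/N) * sum of Dirac masses on N sampled paths."""
    rng = random.Random(seed)
    return sum(sample_action(rng) for _ in range(n_paths)) / n_paths


# The action of the law itself is E[v^2] / 2 = 1/6; the empirical values approach it.
for n in (10, 1000, 20000):
    print(n, empirical_action(n))
```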
3.5. Comparison of metrics and gap phenomena 103
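Step 3. Let $\varphi\in C_c^\infty(\mathbb{R}^d)$ be a smooth radial convolution kernel with ϕ(x) = 0 for
|x| ≥ 1 and ϕ(x) > 0 for |x| < 1. Given a finite number of trajectories ω1 , . . . , ωN as
described in Step 2, we define
\[
a_i(x) := \frac{1}{\varepsilon^d}\,\varphi\!\left(\frac{x-\omega_i(0)}{\varepsilon}\right) \qquad \text{if } \operatorname{dist}(\omega_i(0),\partial D) \ge \varepsilon,
\]
\[
a_i(x) := \frac{1}{\varepsilon^d}\sum_{\gamma\in\Gamma} \varphi\!\left(\frac{x-\gamma(\omega_i(0))}{\varepsilon}\right) \qquad \text{if } \operatorname{dist}(\omega_i(0),\partial D) \le \varepsilon,
\]
where Γ is the discrete group of motions in $\mathbb{R}^n$ generated by the reflections in the faces
of D. It is easy to check that $\int a_i = 1$ and that supp(ai ) is the intersection of D with
the closed ball $\overline{B}_\varepsilon(\omega_i(0))$. Define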
Let MN := (a1 , . . . , aN , g1,t (x), . . . , gN,t (x)) and let us consider the generalized flow η N
associated to MN , given by
\[
\int_{\Omega(D)} f(\omega)\,d\eta^N := \frac1N\sum_{i=1}^N \int_D a_i(x)\, f\big(t\mapsto g_{i,t}(x)\big)\,dx \tag{3.5.4}
\]
(that is, η N is the measure in the space of paths given by $\frac1N\sum_i\int_D a_i(x)\,\delta_{g_{i,\cdot}(x)}\,dx$). The
measure η N is well defined for the following reason: if dist(ωi (0), ∂D) ≤ ε we have
gi,t (x) = x, and if dist(ωi (0), ∂D) > ε and ai (x) > 0 we still have that the curve t ↦
gi,t (x) is contained in D because ai (x) > 0 implies |x − ωi (0)| ≤ ε and, by construction,
dist(ωi (t), ∂D) ≥ ε for all times. Since the density ρηN induced by η N is given by
\[
\rho^N(t,x) := \frac1N\sum_{i=1}^N a_i\big(x+\omega_i(0)-\omega_i(t)\big),
\]
the flow η N is not measure preserving. However we are more or less in the same situation
as in Step 3 in the proof of the approximation theorem in [122] (the only difference being
that we do not impose any final data). Thus, by [122, Lemma 1.2], with probability 1
as N → ∞. By the first two equations in (3.5.5), we can left compose gi,t with a smooth
correcting flow ζtN (x) as in Step 3 in the proof of the approximation theorem in [122],
in such a way that the flow η̃ N associated to M̃N := (a1 , . . . , aN , ζtN ◦ g1,t (x), . . . , ζtN ◦
gN,t (x)) via the formula analogous to (3.5.4) is incompressible. Moreover, thanks to the
third equation in (3.5.5) and the convergence of AT (ν N ) to AT (η), one can prove that
AT (η̃ N ) → AT (η) with probability 1.
We observe that, since η is steady for 0 ≤ t ≤ ε, the same holds by construction for
η̃ N . Without loss of generality, we can therefore assume that ζtN does not depend on t
for t ∈ [0, ε].
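The normalization of the weights ai defined in Step 3, including the role of the reflected copies of the kernel near ∂D, can be checked numerically in a one-dimensional analogue with D = [0, 1]. This is a sketch under stated assumptions: the bump function, the midpoint quadrature, the restriction to the three images that can meet D when ε ≤ 1/2, and all function names are illustrative choices, not part of the construction in [122].

```python
import math


def bump(x):
    """Smooth even kernel, positive on (-1, 1) and zero elsewhere (not yet normalized)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (j + 0.5) * h) for j in range(n))


BUMP_MASS = midpoint(bump, -1.0, 1.0)  # normalizing constant so that the kernel has integral 1


def weight(x, start, eps):
    """1D analogue of a_i on D = [0, 1] for a trajectory starting at `start`.

    Away from the boundary a single rescaled kernel is used. Near the boundary
    the kernel is summed over the reflected images of `start`; for eps <= 1/2
    only the images -start and 2 - start can contribute inside [0, 1].
    """
    if min(start, 1.0 - start) >= eps:
        images = [start]
    else:
        images = [start, -start, 2.0 - start]
    return sum(bump((x - y) / eps) for y in images) / (eps * BUMP_MASS)


def total_mass(start, eps):
    """Integral of the weight over D = [0, 1]."""
    return midpoint(lambda x: weight(x, start, eps), 0.0, 1.0)


for start in (0.5, 0.03, 0.98):
    print(start, total_mass(start, 0.1))
```

The reflected copies put back exactly the mass that the kernel centred at the starting point would otherwise leave outside D, which is why the integral stays equal to 1 up to quadrature error.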
Step 4. In order to conclude, we see that the only problem now is that the flow η̃ N
associated to M̃N is still non-deterministic, since if x ∈ supp(ai ) ∩ supp(aj ) for i ≠ j,
then more than one curve starts from x. Let us partition D in the following way:
D = D1 ∪ D2 ∪ . . . ∪ DL ∪ E,
where E is $\mathcal{L}^d$-negligible, any set Dj is open, and all x ∈ Dj belong to the interior of
the supports of exactly M = M (j) ≤ N sets ai , indexed by 1 ≤ i1 < · · · < iM ≤ N
(therefore $L \le 2^N$). This decomposition is possible, as E is contained in the union of
the boundaries of supp ai , which is $\mathcal{L}^d$-negligible.
Fix one of the sets Dj and assume just for notational simplicity that ik = k for
1 ≤ k ≤ M . We are going to modify the flow η̃ N in Dj , increasing a little bit its action
(say, by an amount α > 0), in such a way that for each point in Dj only one curve
starts from it. Given x ∈ Dj , we know that M curves start from it, weighted with mass
$a_k(x) > 0$, and $\sum_{k=1}^M a_k(x) = 1$. These curves coincide for 0 ≤ t ≤ ε (since nothing
moves), and then separate. We want to partition Dj in M sets Ek , with
\[
\mathcal{L}^d(E_k) = \int_{D_j} a_k(x)\,dx, \qquad 1\le k\le M
\]
in such a way that, for any x ∈ Ek , only one curve $\omega_x^k$ starts from it at time 0, $\omega_x^k(t)\in D_j$
for 0 ≤ t ≤ ε, and the map $E_k \ni x \mapsto \omega_x^k(\varepsilon)\in D_j$ pushes forward $\mathcal{L}^d\llcorner E_k$ into $a_k\mathcal{L}^d\llcorner D_j$.
Moreover, we want the incompressibility condition to be preserved for all t ∈ [0, ε]. If
this is possible, the proof will be concluded by gluing $\omega_x^k$ with the only curve starting
from $\omega_x^k(\varepsilon)$ with weight $a_k(\omega_x^k(\varepsilon))$.
The above construction can be achieved in the following way. First we write the
interior of $D_j$, up to null measure sets, as a countable union of disjoint open cubes $(C_i)$
with size $\delta_i$ satisfying
\[ \frac{M^2}{\varepsilon} \sum_i \frac{\delta_i^2}{\bar b_i^2}\, \mathcal{L}^d(C_i) \le \alpha, \tag{3.5.6} \]
3.5. Comparison of metrics and gap phenomena 105
with $\bar b_i := \min_{1 \le k \le M} \min_{C_i} a_k$. This is done just considering the union of the grids in $\mathbb{R}^d$ given
by $\mathbb{Z}^d/2^n$ for $n \in \mathbb{N}$, and taking initially our cubes in this family; if (3.5.6) does not hold,
we keep splitting the cubes until it is satisfied ($\bar b_i$ can only increase under this additional
splitting, therefore a factor 4 is gained in each splitting). Once this partition is given,
the idea is to move the mass within each $C_i$ for $0 \le t \le \varepsilon$. At least heuristically, one can
imagine that in $C_i$ the functions $a_k$ are almost constant and that the velocity of a generic
path in $C_i$ is at most of order $\delta_i/\varepsilon$. Thus, the total energy of the new incompressible
fluid in the interval $[0,\varepsilon]$ will be of order
\[ \sum_i \int_{C_i} \int_0^\varepsilon |\dot\omega_x(t)|^2\,dt\,dx \le \frac{C}{\varepsilon} \sum_i \delta_i^2\, \mathcal{L}^d(C_i) \]
whose sum is $\delta_i^d$, then the points which belong to $C_i^k := x_i + (0,\delta_i)^{d-1} \times J_k$ have to move
along curves in order to push forward $\mathcal{L}^d \llcorner C_i^k$ into $a_k \mathcal{L}^d \llcorner C_i$, where $J_k$ are M consecutive
open intervals in $(0,\delta_i)$ with length $\delta_i^{1-d} m_k$. Moreover, this has to be done preserving
the incompressibility condition.
If we write $x = (x', x_d) \in \mathbb{R}^d$ with $x' = (x_1, \dots, x_{d-1})$, we can transport the M
uniform densities
\[ \mathcal{H}^1 \llcorner \big(x_i + \{x'\} \times J_k\big) \qquad \text{with } x' \in [0,\delta_i]^{d-1}, \]
into the M densities
\[ a_k(x',\cdot)\, \mathcal{H}^1 \llcorner \big(x_i + \{x'\} \times [0,\delta_i]\big) \]
moving the curves only in the d-th direction, i.e. keeping $x'$ fixed. Thanks to Lemma 3.5.4
below and a scaling argument, we can do this construction paying at most $M^2 \bar b_i^{-2} \delta_i^3/\varepsilon$ in
each slice of $C_i$, and therefore with a total cost less than
\[ \frac{M^2}{\varepsilon} \sum_i \frac{\delta_i^{d+2}}{\bar b_i^2} \le \alpha. \]
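The gain of a factor 4 under dyadic splitting, which is what makes (3.5.6) reachable, can be checked on a single cube. The following is a minimal Python sketch and not part of the original argument: it assumes one cube of side `delta` in dimension `d` on which the lower bound `b_bar` is held fixed (the worst case, since $\bar b_i$ can only increase under splitting). The function name `refined_cost` and the numerical values are hypothetical.

```python
import math

def refined_cost(delta, d, b_bar, M, eps, levels):
    """Cost (M^2/eps) * sum_i delta_i^(d+2) / b_bar^2 after `levels` dyadic
    splittings of one cube of side `delta` in dimension `d`.

    Assumption (illustrative): b_bar is kept fixed on every sub-cube, which is
    the worst case because the minimum of a_k can only increase on sub-cubes.
    """
    n_cubes = (2 ** d) ** levels      # each splitting yields 2^d sub-cubes
    side = delta / 2 ** levels        # side length of each sub-cube
    return M ** 2 / eps * n_cubes * side ** (d + 2) / b_bar ** 2

# One cube of side 1 in dimension 3, with M = 3, eps = 0.1, b_bar = 0.5.
costs = [refined_cost(1.0, 3, 0.5, 3, 0.1, k) for k in range(4)]
ratios = [costs[k] / costs[k + 1] for k in range(3)]
```

Each entry of `ratios` equals 4 up to rounding, so finitely many splittings bring the sum below any prescribed $\alpha > 0$.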
and
\[ A_1(h) \le \frac{M^2}{\bar b^2}, \qquad \text{with } \bar b := \min_{1 \le k \le M} \min_{[0,1]} b_k > 0. \tag{3.5.7} \]
Proof. We start with a preliminary remark: let $J \subset (0,1)$ be an interval with length
l and assume that $t \mapsto \rho_t$ is a nonnegative Lipschitz map between [0, 1] and $L^1(0,1)$,
with $\rho_t \le 1$ and $\int_0^1 \rho_t\,dx = l$ for all $t \in [0,1]$, and let $f(t,\cdot)$ be the unique (on J, up to
countable sets) nondecreasing map pushing $\chi_J \mathcal{L}^1$ to $\rho_t$. Assume also that $\operatorname{supp} \rho_t$ is an
interval and $\rho_t \ge r$ $\mathcal{L}^1$-a.e. on $\operatorname{supp} \rho_t$, with $r > 0$. Under this extra assumption, $f(t,x)$
is uniquely determined for all $x \in J$, and implicitly characterized by the conditions
\[ \int_0^{f(t,x)} \rho_t(y)\,dy = \mathcal{L}^1((0,x) \cap J), \qquad f(t,x) \in \operatorname{supp} \rho_t. \]
This implies, in particular, that $f(\cdot,x)$ is continuous for all $x \in J$. We are going to prove
that this map is even Lipschitz continuous in [0, 1] and
\[ \Big| \frac{d}{dt} f(t,x) \Big| \le \frac{\operatorname{Lip}(\rho_\cdot)}{r} \qquad \text{for } \mathcal{L}^1\text{-a.e. } t \in [0,1] \tag{3.5.8} \]
for all x ∈ J. To prove this fact, we first notice that the endpoints of the interval
supp ρt (whose length is at least l) move at most with velocity Lip(ρ· )/r; then, we fix
x ∈ J = [a, b] and consider separately the cases
and by assumption f (t, x) ∈ supp ρt for any x ∈ J, we get supp ρt = [f (t, a), f (t, b)] for
all t ∈ [0, 1]. This, together with the fact that the endpoints of the interval supp ρt move
at most with velocity Lip(ρ· )/r, implies (3.5.8) if x ∈ ∂J. In the second case we have
\[ \int_0^{f(t,x)} \rho_t(y)\,dy \in (0, \mathcal{L}^1(J)), \]
therefore f (t, x) ∈ Int(supp ρt ) for all t ∈ [0, 1]. It suffices now to find a Lipschitz estimate
of $|f(s,x) - f(t,x)|$ when s, t are sufficiently close. Assume that $f(s,x) \le f(t,x)$: adding
and subtracting $\int_0^{f(s,x)} \rho_t(y)\,dy$ in the identity
\[ \int_0^{f(t,x)} \rho_t(y)\,dy = \int_0^{f(s,x)} \rho_s(y)\,dy \]
we obtain
\[ \int_{f(s,x)}^{f(t,x)} \rho_t(y)\,dy = \int_0^{f(s,x)} \rho_s(y) - \rho_t(y)\,dy. \]
(ii) $\operatorname{Lip}(\rho^k_\cdot) \le \frac{M-1}{2}$ on $[0,\frac12]$, and $\operatorname{Lip}(\rho^k_\cdot) \le 2$ on $[\frac12, 1]$;
(iii) $\sum_{k=1}^M \rho^k_t = 1$ for all $t \in [0,1]$.
Indeed, this would produce maps with time derivative bounded by $(M-1)/(2\bar b)$ on $[0,\frac12]$
and bounded by $2/\bar b$ on $[\frac12,1]$, and this easily gives (3.5.7).
The construction can be achieved in two steps. First, we connect $\chi_{J_k} \mathcal{L}^1$ to $l_k \mathcal{L}^1$ in
the time interval $[0,\frac12]$; then, we connect $l_k \mathcal{L}^1$ to $b_k \mathcal{L}^1$ in $[\frac12,1]$ by a linear interpolation.
The Lipschitz constants of the second step are easily seen to be less than 2, so let us
focus on the first interpolation.
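Let us first consider the case of two densities $\rho^1 = \chi_{J_1}$ and $\rho^2 = \chi_{J_2}$, with $J_1 = (0,l_1)$
and $J_2 = (l_1, l)$. In the time interval $[0,\tau]$, we define the expanding intervals
\[ J_{1,t} = \big(0,\ l_1 + \tfrac{t}{\tau} l_2\big), \qquad J_{2,t} = \big(l_1 - \tfrac{t}{\tau} l_1,\ l\big), \]
so that $J_{k,\tau} = (0,l)$ for k = 1, 2, and then define
\[ \rho^1_t := \begin{cases} 1 & \text{on } (0,\ l_1 - \frac{t}{\tau} l_1), \\ l_1/l & \text{on } (l_1 - \frac{t}{\tau} l_1,\ l_1 + \frac{t}{\tau} l_2), \\ 0 & \text{otherwise,} \end{cases} \qquad \rho^2_t := \begin{cases} 1 & \text{on } (l_1 + \frac{t}{\tau} l_2,\ l), \\ l_2/l & \text{on } (l_1 - \frac{t}{\tau} l_1,\ l_1 + \frac{t}{\tau} l_2), \\ 0 & \text{otherwise.} \end{cases} \]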
108 3.0. Variational models for the incompressible Euler equations
By construction $\rho^k_t \ge l_k$ on $J_{k,t}$ for k = 1, 2, $\rho^1_t + \rho^2_t = 1$, and it is easy to see that
\[ \operatorname{Lip}(\rho^k_\cdot) \le \frac{l_1 l_2}{\tau l} \le \frac{l}{4\tau}. \tag{3.5.9} \]
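The two-density interpolation can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $l = l_1 + l_2$ as the right endpoint of the mixing region, represents each density by its constant value on three sub-intervals of $(0, l)$, and uses hypothetical names (`two_density_pieces`, `masses`) and sample values. It verifies that the two densities add up to 1 and that their masses stay equal to $l_1$ and $l_2$ along the interpolation.

```python
import math

def two_density_pieces(l1, l2, tau, t):
    """Piecewise-constant description of (rho^1_t, rho^2_t) on (0, l1 + l2).

    Returns a list of tuples (a, b, r1, r2): on the interval (a, b) the first
    density equals r1 and the second equals r2.
    """
    l = l1 + l2
    s = t / tau
    left = l1 - s * l1     # left end of the mixing zone
    right = l1 + s * l2    # right end of the mixing zone
    return [
        (0.0, left, 1.0, 0.0),          # only the first density is present
        (left, right, l1 / l, l2 / l),  # mixing zone with constant ratios
        (right, l, 0.0, 1.0),           # only the second density is present
    ]

def masses(pieces):
    """Total mass of each of the two densities."""
    m1 = sum((b - a) * r1 for a, b, r1, _ in pieces)
    m2 = sum((b - a) * r2 for a, b, _, r2 in pieces)
    return m1, m2

l1, l2, tau = 0.3, 0.5, 0.5
snapshots = [two_density_pieces(l1, l2, tau, t) for t in (0.0, 0.1, 0.25, 0.5)]
```

At $t = \tau$ the mixing zone fills the whole interval $(0, l)$, so both densities are constant there, with values $l_1/l$ and $l_2/l$. The second inequality in (3.5.9) is the elementary bound $l_1 l_2 \le (l_1 + l_2)^2/4$.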
We can now define the desired interpolation on $[0,\frac12]$ for general $M \ge 2$. Let us define
\[ t_i := \frac{i}{2(M-1)} \qquad \text{for } i = 1, \dots, M-1, \]
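In the time interval $[t_1,t_2]$, we leave fixed $\rho^k_0 := \chi_{J_k} \mathcal{L}^1$ for $k \ge 4$ (if such k exist), and
we apply again the above construction in $J_1 \cup J_2 \cup J_3$ to $\rho^{12}_{t_1}$ and $\rho^3_{t_1} = \chi_{J_3} \mathcal{L}^1$. In this
way, on $[t_1,t_2]$, $\rho^{12}_{t_1}$ is connected to $\rho^{12}_{t_2} := \frac{l_1+l_2}{l_1+l_2+l_3} \chi_{J_1 \cup J_2 \cup J_3} \mathcal{L}^1$, and $\rho^3_{t_1}$ is connected to
$\rho^3_{t_2} := \frac{l_3}{l_1+l_2+l_3} \chi_{J_1 \cup J_2 \cup J_3} \mathcal{L}^1$. Finally, it suffices to define $\rho^1_t := \frac{l_1}{l_1+l_2} \rho^{12}_t$ and $\rho^2_t := \frac{l_2}{l_1+l_2} \rho^{12}_t$.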
In the third step we leave fixed the densities $\rho^k_{t_2}$ for $k \ge 5$, and we do the same
construction as before adding the first three densities (that is, in this case one defines
$\rho^{123}_{t_2} := \rho^1_{t_2} + \rho^2_{t_2} + \rho^3_{t_2} = \chi_{J_1 \cup J_2 \cup J_3} \mathcal{L}^1$). In this way, we connect $\rho^{123}_{t_2}$ to
$\rho^{123}_{t_3} := \frac{l_1+l_2+l_3}{l_1+l_2+l_3+l_4} \chi_{J_1 \cup J_2 \cup J_3 \cup J_4} \mathcal{L}^1$ and $\rho^4_{t_2}$ to $\rho^4_{t_3} := \frac{l_4}{l_1+l_2+l_3+l_4} \chi_{J_1 \cup J_2 \cup J_3 \cup J_4} \mathcal{L}^1$, and then we define
$\rho^k_t := \frac{l_k}{l_1+l_2+l_3} \rho^{123}_t$ for k = 1, 2, 3.
Iterating this construction on $[t_i, t_{i+1}]$ for $i \ge 4$, one obtains the desired maps $t \mapsto \rho^k_t$.
Indeed, by construction $\rho^k_t \ge l_k$ on $J_{k,t}$, and $\sum_{k=1}^M \rho^k_t = 1$. Moreover, by (3.5.9), it is
simple to see that in each time interval $[t_i, t_{i+1}]$ one has the bound
\[ \operatorname{Lip}(\rho^k_\cdot) \le \frac{M-1}{2}. \]
So the energy can be easily bounded by $\frac{1}{\bar b^2}\big(\frac{(M-1)^2}{16} + 1\big) \le M^2/\bar b^2$. □
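The arithmetic behind the final bound can be made explicit. The following short Python sketch is an illustration only: it assumes that the action is $\frac12 \int_0^1 |\dot f|^2\,dt$ and that the speed equals its upper bound on each half of the time interval, which is the worst case. The name `action_bound` and the sample value of `b_bar` are hypothetical.

```python
import math

def action_bound(M, b_bar):
    """Worst-case action when the speed is (M-1)/(2*b_bar) on [0, 1/2]
    and 2/b_bar on [1/2, 1].

    Assumption (illustrative): action = (1/2) * integral of speed^2 over [0, 1].
    """
    speed_first = (M - 1) / (2 * b_bar)   # speed bound on [0, 1/2]
    speed_second = 2 / b_bar              # speed bound on [1/2, 1]
    # each half-interval has length 1/2
    return 0.5 * (0.5 * speed_first ** 2 + 0.5 * speed_second ** 2)

# Compare with the closed form and with M^2 / b_bar^2 for several M.
checks = [(M, action_bound(M, 0.25)) for M in range(1, 11)]
```

For every M the computed value coincides with $\frac{1}{\bar b^2}\big(\frac{(M-1)^2}{16} + 1\big)$ and stays below $M^2/\bar b^2$, with equality only for M = 1.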
Proof. (of Theorem 3.5.2) By applying Theorem 3.5.3 to the optimal η connecting i to
γ, we can find maps $g_k \in S(D)$ such that $\gamma_{g_k} \to \gamma$ narrowly and
Now, if $d \ge 3$ we can use (3.3.12), the triangle inequality, and the density of SDiff(D) in
S(D) in the $L^2$ norm, to find maps $h_k \in \mathrm{SDiff}(D)$ such that
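where $e^{\varepsilon w_t} x$ is the flow, in the $(\varepsilon, x)$ variables, generated by the autonomous field $w_t(x) = w(t,x)$
(i.e. $e^{0 w_t} = i$ and $\frac{d}{d\varepsilon} e^{\varepsilon w_t} x = w(t, e^{\varepsilon w_t} x)$), and the perturbed generalized flows
$\eta^\varepsilon := (S^\varepsilon)_\# \eta$. Notice that $\eta^\varepsilon$ is incompressible if $\operatorname{div} w_t = 0$, and more generally the
density $\rho^{\eta^\varepsilon}$ satisfies for all times $t \in (0,1)$ the continuity equation
\[ \frac{d}{d\varepsilon} \rho^{\eta^\varepsilon}(t,x) + \operatorname{div}\big(w_t(x)\, \rho^{\eta^\varepsilon}(t,x)\big) = 0. \tag{3.6.2} \]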
This motivates the following definition.
\[ \|\rho^\nu - 1\|_{C^1([0,1] \times D)} \le \frac12. \]
Now we provide a slightly simpler proof of the characterization given in [33] of the
pressure field (the original proof therein involved a time discretization argument).
Theorem 3.6.2. For all $\eta, \gamma \in \Gamma(D)$ there exists $p \in [C^1([0,1] \times D)]^*$ such that
\[ \langle p, \rho^\nu - 1 \rangle_{(C^1)^*, C^1} \le A_1(\nu) - \delta^2(\eta,\gamma) \tag{3.6.3} \]
for all almost incompressible flows $\nu$ satisfying (3.3.5).
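Proof. Let us define the closed convex set $C := \{\rho \in C^1([0,1] \times D) : \|\rho - 1\|_{C^1} \le \frac12\}$,
and the function $\phi : C^1([0,1] \times D) \to \mathbb{R}^+ \cup \{+\infty\}$ given by
\[ \phi(\rho) := \begin{cases} \inf\{A_1(\nu) : \rho^\nu = \rho \text{ and (3.3.5) holds}\} & \text{if } \rho \in C; \\ +\infty & \text{otherwise.} \end{cases} \]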
We observe that $\phi(1) = \delta^2(\eta,\gamma)$. Moreover, it is a simple exercise to prove that $\phi$
is convex and lower semicontinuous in $C^1([0,1] \times D)$. Let us now prove that $\phi$ has
bounded (descending) slope at 1, i.e.
\[ \limsup_{\rho \to 1} \frac{[\phi(1) - \phi(\rho)]^+}{\|1 - \rho\|_{C^1}} < +\infty. \]
By [33, Proposition 2.1] we know that there exist $0 < \varepsilon < \frac12$ and $c > 0$ such that, for any
$\rho \in C$ with $\|\rho - 1\|_{C^1} \le \varepsilon$, there is a Lipschitz family of diffeomorphisms $g_\rho(t,\cdot) : D \to D$
such that
\[ g_\rho(t,\cdot)_\# \mu_D = \rho(t,\cdot)\, \mu_D, \]
$g_\rho(t,\cdot) = i$ for t = 0, 1, and the Lipschitz constant of $(t,x) \mapsto g_\rho(t,x) - x$ is bounded
by c. Thus, adapting the construction in [33, Proposition 2.1] (made for probability
measures in $\Omega(D)$, and not in $\tilde\Omega(D)$), for any incompressible flow $\eta$ connecting $\eta$ to $\gamma$,
and any $\rho \in C$, we can define an almost incompressible flow $\nu$ still connecting $\eta$ to $\gamma$
such that $\rho^\nu = \rho$, and
\[ A_1(\nu) \le A_1(\eta) + c' \|\rho - 1\|_{C^1} (1 + A_1(\eta)), \]
where $c'$ depends only on c (for instance, we define $\nu := G_\# \eta$, where $G : \tilde\Omega(D) \to \tilde\Omega(D)$
is the map induced by $g_\rho$ via the formula $(\omega(t), a) \mapsto (g_\rho(t,\omega(t)), a)$). In particular,
considering an optimal $\eta$, we get
\[ \phi(\rho) \le \phi(1) + c' \|\rho - 1\|_{C^1} (1 + \delta^2(\eta,\gamma)) \tag{3.6.4} \]
for any $\rho \in C$ with $\|\rho - 1\|_{C^1} \le \varepsilon$. This fact implies that $\phi$ is bounded on a neighbourhood
of 1 in C. Now, it is a standard fact of convex analysis that a convex function bounded on
a convex set is locally Lipschitz on that set. This provides the bounded slope property.
By a simple application of the Hahn-Banach theorem (see for instance Proposition 1.4.4
in [11]), it follows that the subdifferential of $\phi$ at 1 is not empty, that is, there exists p
in the dual of $C^1$ such that
\[ \langle p, \rho - 1 \rangle_{(C^1)^*, C^1} \le \phi(\rho) - \phi(1). \]
This is indeed equivalent to (3.6.3). □
3.6. Necessary and sufficient optimality conditions 111
then η minimizes the new action among all almost incompressible flows ν between η and
γ.
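Then, using the identities
\[ \frac{d}{d\varepsilon} \frac{d}{dt} S^\varepsilon(\omega)(t) \Big|_{\varepsilon=0} = \frac{d}{dt} w(t,\omega(t)) = \partial_t w(t,\omega(t)) + \nabla_x w(t,\omega(t)) \cdot \dot\omega(t) \]
and the convergence in the sense of distributions (ensured by (3.6.2)) of $(\rho^{\eta^\varepsilon} - 1)/\varepsilon$ to
$-\operatorname{div} w$ as $\varepsilon \downarrow 0$, we obtain
\[ 0 = \frac{d}{d\varepsilon} A_1^p(\eta^\varepsilon) \Big|_{\varepsilon=0} = \int_{\tilde\Omega(D)} \int_0^1 \dot\omega(t) \cdot \frac{d}{dt} w(t,\omega(t))\,dt\,d\eta(\omega,a) + \langle p, \operatorname{div} w \rangle. \tag{3.6.6} \]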
As noticed in [33], this equation identifies uniquely the pressure field p (as a distribution)
up to trivial modifications, i.e. additive perturbations depending on time only.
In the Eulerian-Lagrangian model, instead, the pressure field is defined (see (2.20) in
[35]) and uniquely determined, still up to trivial modifications, by
\[ \nabla p(t,x) = -\partial_t \Big( \int_D v(t,x,a)\,dc_{t,x}(a) \Big) - \operatorname{div} \Big( \int_D v(t,x,a) \otimes v(t,x,a)\,dc_{t,x}(a) \Big), \tag{3.6.7} \]
all derivatives being understood in the sense of distributions in $(0,1) \times D$ (here (c, v)
is any optimal pair for the Eulerian-Lagrangian model). We used the same letter p
to denote the pressure field in the two models: indeed, we have seen in the proof of
Theorem 3.4.1 that, writing $\eta = \eta_a \otimes \mu_D$, the correspondence
maps optimal solutions for the first problem into optimal solutions for the second one.
Since under this correspondence (3.6.7) reduces to (3.6.6), the two pressure fields coincide.
The following crucial regularity result for the pressure field is proved in the last
section, and it improves in the time variable the regularity $\partial_{x_i} p \in \mathcal{M}_{loc}\big((0,1) \times D\big)$
obtained by Brenier in [35].
Theorem 3.6.3 (Regularity of pressure). Let (c, v) be an optimal pair for the
Eulerian-Lagrangian model, and let p be the pressure field identified by (3.6.7). Then
$\partial_{x_i} p \in L^2_{loc}\big((0,1); \mathcal{M}(D)\big)$ and
\[ p \in L^2_{loc}\big((0,1); BV_{loc}(D)\big) \subset L^2_{loc}\big((0,1); L^{d/(d-1)}_{loc}(D)\big). \]
In the case $D = \mathbb{T}^d$ the same properties hold globally in space, i.e. replacing $BV_{loc}(D)$
with $BV(\mathbb{T}^d)$ and $L^{d/(d-1)}_{loc}(D)$ with $L^{d/(d-1)}(\mathbb{T}^d)$.
The $L^1_{loc}$ integrability of p allows much stronger variations in the Lagrangian model,
that give rise to possibly nonsmooth densities, which may even vanish.
From now on we shall confine our discussion to the case of the flat torus $\mathbb{T}^d$, as
our arguments involve some global smoothing that becomes more technical, and needs
to be carefully checked in more general situations. We also set $\mu_T = \mu_{\mathbb{T}^d}$ and denote
by $d_T$ the Riemannian distance in $\mathbb{T}^d$ (i.e. the distance modulo 1 in $\mathbb{R}^d/\mathbb{Z}^d$). In the
next theorem we consider generalized flows $\nu$ with bounded compression, defined by the
property $\rho^\nu \in L^\infty((0,1) \times D)$.
for any generalized flow with bounded compression ν between η and γ such that
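If $p \in L^1([0,1] \times \mathbb{T}^d)$, the condition (3.6.9) is not required for the validity of (3.6.8).
Proof. Let $J := \{\rho^\nu(t,\cdot) \neq 1\} \Subset (0,1)$ and let us first assume that $\rho^\nu$ is smooth. If
$\|\rho^\nu - 1\|_{C^1} \le 1/2$, then the result follows by Theorem 3.6.2. If not, for $\varepsilon > 0$ small enough
$(1-\varepsilon)\eta + \varepsilon\nu$ is a slightly compressible generalized flow in the sense of Definition 3.6.1.
Thus, we have
\[ \varepsilon \langle p, \rho^\nu - 1 \rangle = \langle p, \rho^{(1-\varepsilon)\eta + \varepsilon\nu} - 1 \rangle \le A_1((1-\varepsilon)\eta + \varepsilon\nu) - A_1(\eta) = \varepsilon\,(A_1(\nu) - A_1(\eta)), \]
Then, we set $\nu^\varepsilon := \int_{\mathbb{R}^d} (T_{\varepsilon,y})_\# \nu\, \phi(y)\,dy$, where $\phi : \mathbb{R}^d \to [0,+\infty)$ is a standard convolution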
kernel. It is easy to check that ν ε still connects η to γ, and that
we can pass to the limit in (3.6.8) with ν ε in place of ν, which are smooth.
In the general case we fix a convolution kernel with compact support $\varphi(t)$ and, with
the same choice of $\chi$ done before, we define the maps
\[ T_\varepsilon(\omega,a)(t) := \Big( \int_0^1 \omega(t - s\varepsilon\chi(t))\,\varphi(s)\,ds,\ a \Big). \]
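Remark 3.6.5 (Smoothing of flows and plans). Notice that the same smoothing
argument can be used to prove this statement: given a flow $\eta$ between $\eta = \eta_a \otimes \mu_T$
and $\gamma = \gamma_a \otimes \mu_T$ (not necessarily with bounded compression), we can find flows with
bounded compression $\eta^\varepsilon$ connecting $\eta^\varepsilon := (\eta_a) * \phi_\varepsilon \otimes \mu_T$ to $\gamma^\varepsilon := (\gamma_a) * \phi_\varepsilon \otimes \mu_T$, with
$A_T(\eta^\varepsilon) = A_T(\eta)$ and
\[ \int_{\tilde\Omega(\mathbb{T}^d)} \int_0^1 r_\varepsilon(\tau,\omega)\,d\tau\,d\eta(\omega,a) = \int_{\tilde\Omega(\mathbb{T}^d)} \int_0^1 r(\tau,\omega)\,d\tau\,d\eta^\varepsilon(\omega,a) \qquad \forall r \in L^1\big([0,1] \times \mathbb{T}^d\big) \]
(where, as usual, $r_\varepsilon(t,x) = r(t,\cdot) * \phi_\varepsilon(x)$). In order to have these properties, it suffices
to define
\[ \eta^\varepsilon := \int_{\mathbb{R}^d} (\sigma_{\varepsilon y})_\# \eta\, \phi(y)\,dy, \]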
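where $\sigma_z(\omega,a) = (\omega + z, a)$. Notice also that the “mollified plans” $\eta^\varepsilon$, $\gamma^\varepsilon$ converge to $\eta$,
$\gamma$ in $(\Gamma(\mathbb{T}^d), \delta)$: if we consider the map $S^\varepsilon_y : \mathbb{T}^d \to \Omega(\mathbb{T}^d)$ given by $x \mapsto \omega_x(t) := x + \varepsilon t y$,
the generalized incompressible flow $\nu^\varepsilon = \nu^\varepsilon_a \otimes \mu_T$, with
\[ \nu^\varepsilon_a := \int_{\mathbb{R}^d} (S^\varepsilon_y)_\# \lambda_a\, \phi(y)\,dy, \]
connects in [0, 1] the plan $\lambda = \lambda_a \otimes \mu_T$ to $\lambda^\varepsilon = (\lambda_a * \phi_\varepsilon) \otimes \mu_T$, with an action equal to
$\varepsilon^2 \int_{\mathbb{R}^d} |y|^2 \phi(y)\,dy$.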
In order to state necessary and sufficient optimality conditions at the level of single
fluid paths, we have to take into account that the pressure field is not pointwise defined,
and to choose a particular representative in its equivalence class, modulo negligible sets
in spacetime. Henceforth, we define
Notice that $p_\varepsilon$ is smooth and still 1-periodic. The choice of the heat kernel here is convenient,
because of the semigroup property $p_{\varepsilon+\varepsilon'} = (p_\varepsilon)_{\varepsilon'}$. Recall that $\bar p$ is a representative,
because at any Lebesgue point x of $p(t,\cdot)$ the limit of $p_\varepsilon(t,x)$ exists, and coincides with
$p(t,x)$.
In order to handle passages to limits, we need also uniform pointwise bounds on $p_\varepsilon$;
therefore we define
\[ M f(x) := \sup_{\varepsilon > 0}\ (2\pi)^{-d/2} \int_{\mathbb{R}^d} |f|(x + \varepsilon y)\, e^{-|y|^2/2}\,dy, \qquad f \in L^1(\mathbb{T}^d). \tag{3.6.11} \]
because of the semigroup property; second, standard maximal inequalities imply $\|M f\|_{L^p(\mathbb{T}^d)} \le c_p \|f\|_{L^p(\mathbb{T}^d)}$
for all $p > 1$. Setting $M p(t,x) := M p(t,\cdot)(x)$, by Theorem 3.6.3 we infer
that $M p \in L^2_{loc}\big((0,1); L^{d/(d-1)}(\mathbb{T}^d)\big)$, so that in particular $M p \in L^1_{loc}\big((0,1) \times \mathbb{T}^d\big)$. This is
the integrability assumption on p that will play a role in the rest of this section.
Definition 3.6.6 (q-minimizing path). Let $\omega \in H^1((0,1); D)$ with $M q(\tau,\omega) \in L^1(0,1)$.
We say that $\omega$ is a q-minimizing path if
\[ \int_0^1 \frac12 |\dot\omega(\tau)|^2 - q(\tau,\omega)\,d\tau \le \int_0^1 \frac12 |\dot\omega(\tau) + \dot\delta(\tau)|^2 - q(\tau,\omega+\delta)\,d\tau \]
for all $[s,t] \subset (0,1)$ and all $\delta \in H^1_0((s,t); D)$ with $M q(\tau,\omega+\delta) \in L^1(s,t)$.
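The minimality condition in this definition can be illustrated on a discretized path. The sketch below is a hedged illustration, not the construction used in the text: it assumes a one-dimensional path sampled at equally spaced times, a midpoint rule for the action $\int \frac12 |\dot\omega|^2 - q(\tau,\omega)\,d\tau$, and the trivial pressure $q \equiv 0$, for which the straight line is the minimizer. The names `discrete_action`, `zero_pressure` and the chosen perturbation are hypothetical.

```python
import math

def discrete_action(path, q, t0=0.0, t1=1.0):
    """Midpoint-rule approximation of the action
    integral of (1/2)|omega'|^2 - q(tau, omega) over [t0, t1]
    for a scalar path sampled at equally spaced times."""
    n = len(path) - 1
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        velocity = (path[i + 1] - path[i]) / dt
        tau_mid = t0 + (i + 0.5) * dt
        x_mid = 0.5 * (path[i] + path[i + 1])
        total += (0.5 * velocity ** 2 - q(tau_mid, x_mid)) * dt
    return total

n = 100
straight = [i / n for i in range(n + 1)]                      # omega(t) = t
bump = [0.1 * math.sin(math.pi * i / n) for i in range(n + 1)]  # delta, ~0 at both ends
perturbed = [w + d for w, d in zip(straight, bump)]

zero_pressure = lambda tau, x: 0.0
a_straight = discrete_action(straight, zero_pressure)
a_perturbed = discrete_action(perturbed, zero_pressure)
```

With $q \equiv 0$ the straight path has action 1/2 and the perturbed path has strictly larger action, in agreement with the inequality above. A nontrivial `q` can be passed in the same way, at the price of losing the explicit minimizer.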
Remark 3.6.7. We notice that, for incompressible flows $\eta$, the $L^1$ (resp. $L^1_{loc}$) integrability
of $M q(\tau,\omega)$ imposed on the curves $\omega$ (and on their perturbations $\omega + \delta$) is satisfied
$\eta$-a.e. if $M q \in L^1\big((0,1) \times \mathbb{T}^d\big)$ (resp. $M q \in L^1_{loc}\big((0,1) \times \mathbb{T}^d\big)$); this can simply be
obtained first noticing that the incompressibility of $\eta$ and Fubini’s theorem give
\[ \int_{\tilde\Omega(\mathbb{T}^d)} \int_J f(\tau,\omega)\,d\tau\,d\eta(\omega,a) = \int_J \int_{\mathbb{T}^d} f(\tau,x)\,d\mu_{\mathbb{T}^d}(x)\,d\tau \]
for all nonnegative Borel functions f and all intervals $J \subset (0,1)$, and then applying this
identity to $f = M q$.
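so that the density produced by $\nu^{\varepsilon,y}$ is at most 2, and equal to 1 outside the interval
[s, t].
Therefore, by Theorem 3.6.4 we get
\[ \int_{\mathbb{T}^d} \int_s^t \bar p\,(\rho^{\nu^{\varepsilon,y}} - 1)\,d\tau\,d\mu_T \le \int_E A_1(\omega + \delta + \varepsilon y \chi) - A_1(\omega)\,d\eta(\omega,a). \]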
with the convention $c^{s,t}_q(x,y) = +\infty$ if no admissible curve $\omega$ exists. Using this cost
function $c^{s,t}_q$, we can consider the induced optimal transport problem, namely
\[ W_{c^{s,t}_q}(\mu_1,\mu_2) := \inf \Big\{ \int_{D \times D} c^{s,t}_q(x,y)\,d\lambda(x,y) : \lambda \in \Gamma(\mu_1,\mu_2),\ (c^{s,t}_q)^+ \in L^1(\lambda) \Big\}, \tag{3.6.14} \]
where $\Gamma(\mu_1,\mu_2)$ is the family of all probability measures $\lambda$ in $D \times D$ whose first and second
marginals are respectively $\mu_1$ and $\mu_2$. Again, we set by convention $W_{c^{s,t}_q}(\mu_1,\mu_2) = +\infty$
if no admissible $\lambda$ exists.
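A finite-dimensional analogue may help to fix ideas about (3.6.14). The following Python sketch is only an illustration under explicit assumptions: both marginals are uniform on the same number of atoms on the real line, in which case a minimizing plan can be chosen among permutation couplings (Birkhoff-von Neumann theorem), and the cost is the quadratic cost $\frac12 |x-y|^2$, which is what $c^{s,t}_q$ reduces to on the line for $q \equiv 0$ and $t - s = 1$. The name `discrete_transport_cost` is hypothetical, and the brute-force search is only meant for very small examples.

```python
from itertools import permutations

def discrete_transport_cost(xs, ys, cost):
    """Minimal average cost over all permutation couplings between the
    uniform measures on the atoms xs and ys (same number of atoms)."""
    n = len(xs)
    assert len(ys) == n, "both marginals must have the same number of atoms"
    best = min(
        sum(cost(xs[i], ys[p[i]]) for i in range(n))
        for p in permutations(range(n))
    )
    return best / n   # each atom carries mass 1/n

# Quadratic cost, standing in for c_q^{s,t} with q = 0 and t - s = 1.
quadratic = lambda x, y: 0.5 * (x - y) ** 2
value = discrete_transport_cost([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], quadratic)
```

In this example the monotone assignment (each atom shifted by 1) is optimal, so `value` is 1/2. The convention $c = +\infty$ for inadmissible pairs could be mimicked by letting `cost` return `float("inf")`.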
Unlike most classical situations (see [132]), existence of an optimal $\lambda$ is not guaranteed
because $c^{s,t}_q$ are not lower semicontinuous in $D \times D$, and also it seems difficult to get
lower bounds on $c^{s,t}_q$. However, the following upper bound on $W_{c^{s,t}_q}$ will be useful:
In particular, $W_{c^{s,t}_q}(\mu_1,\mu_2)$ as defined in (3.6.14) is not equal to $+\infty$.
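fulfils (3.6.15). It is easy to check, using Fubini’s theorem, that $K^{0,t}_q$ is $\mu_T$-integrable in
$\mathbb{T}^d$. Indeed,
\begin{align*}
\int_{\mathbb{T}^d} K^{0,t}_q(w)\,d\mu_T(w) &= \frac{d}{4l} + \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} \int_0^l M q\big(\tau, w + \tfrac{\tau}{l}(z - w)\big)\,d\tau\,d\mu_T(z)\,d\mu_T(w) \\
&\quad + \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} \int_0^l M q\big(l + \tau, z + \tfrac{\tau}{l}(w - z)\big)\,d\tau\,d\mu_T(w)\,d\mu_T(z) \\
&= \frac{d}{4l} + \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} \int_0^l M q\big(\tau, w + \tfrac{\tau}{l} y\big)\,d\tau\,d\mu_T(y)\,d\mu_T(w) \\
&\quad + \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} \int_0^l M q\big(l + \tau, z + \tfrac{\tau}{l} y\big)\,d\tau\,d\mu_T(z)\,d\mu_T(y) \\
&= \frac{d}{4l} + \int_0^l \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} M q\big(\tau, w + \tfrac{\tau}{l} y\big) + M q\big(l + \tau, w + \tfrac{\tau}{l} y\big)\,d\mu_T(w)\,d\mu_T(y)\,d\tau \\
&= \frac{d}{4l} + \int_0^t \int_{\mathbb{T}^d} M q(\tau,w)\,d\mu_T(w)\,d\tau < +\infty.
\end{align*}
□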
In the proof of the next theorem we are going to use the measurable selection theorem
(see [43, Theorems III.22 and III.23]): if $(A, \mathcal{A}, \nu)$ is a measure space, X is a Polish space
and $E \subset A \times X$ is $\mathcal{A}_\nu \otimes \mathcal{B}(X)$-measurable, where $\mathcal{A}_\nu$ is the $\nu$-completion of $\mathcal{A}$, then:
(ii) there exists a $(\mathcal{A}_\nu, \mathcal{B}(X))$-measurable map $\sigma : \pi_A(E) \to X$ such that $(x, \sigma(x)) \in E$
for $\nu$-a.e. $x \in \pi_A(E)$.
The next theorem will provide a new necessary optimality condition involving not
only the path that should be followed between x and y (which, as we proved, should
minimize the Lagrangian $L_{\bar p}$ in (3.1.8)), but also the “weights” given to the paths. We
observe that, when a variation of these weights is performed, new flows $\tilde\eta$ between $\eta$
and $\gamma$ are built which need not be of bounded compression, and for which $(e_t)_\# \tilde\eta$ might
even be singular with respect to $\mu_T$; therefore we cannot use them directly in the variational
principle (3.6.8). However, this difficulty can be overcome by the smoothing procedure
in Remark 3.6.5.
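We also remark that, since $\tau \mapsto \|p_\varepsilon(\tau,\cdot)\|_\infty$ is integrable in (s, t), for any $\varepsilon > 0$ the cost
$c^{s,t}_{p_\varepsilon}$ is bounded both from above and below. Next, we show that
\[ c^{s,t}_{\bar p}(x,y) \ge \limsup_{\varepsilon \downarrow 0} c^{s,t}_{p_\varepsilon}(x,y) \qquad \forall (x,y) \in \mathbb{T}^d \times \mathbb{T}^d. \tag{3.6.20} \]
Indeed, let $\omega \in H^1\big([s,t]; \mathbb{T}^d\big)$ with $\omega(s) = x$, $\omega(t) = y$ and $M p(\tau,\omega) \in L^1(s,t)$ (if there is
no such $\omega$, there is nothing to prove). By the pointwise bound $|p_\varepsilon| \le M p$ and Lebesgue’s
theorem, we get
\[ \int_s^t \frac12 |\dot\omega(\tau)|^2 - \bar p(\tau,\omega)\,d\tau = \lim_{\varepsilon \downarrow 0} \int_s^t \frac12 |\dot\omega(\tau)|^2 - p_\varepsilon(\tau,\omega)\,d\tau. \]
By the $L^1(L^\infty)$ bound on $M p_\varepsilon$, the curve $\omega$ is admissible also for the variational problem
defining $c^{s,t}_{p_\varepsilon}$, therefore the above limit provides an upper bound on $\limsup_\varepsilon c^{s,t}_{p_\varepsilon}(x,y)$. By
minimizing with respect to $\omega$ we obtain (3.6.20).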
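By (3.6.19) and the pointwise bound $\bar p \ge -|\bar p|$ we infer that the positive part of
$W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a)$ is $\mu_T$-integrable. Let now $\delta > 0$ be fixed, and let us consider the compact
space $X := \mathcal{P}(\mathbb{T}^d \times \mathbb{T}^d)$ and the $\mathcal{B}(\mathbb{T}^d)_{\mu_T} \otimes \mathcal{B}(X)$-measurable set
\[ E := \Big\{ (a,\lambda) \in \mathbb{T}^d \times X : \lambda \in \Gamma(\eta^s_a, \gamma^t_a),\ \int_{\mathbb{T}^d \times \mathbb{T}^d} c^{s,t}_{\bar p}(x,y)\,d\lambda < \delta + \Big( W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) \vee -\frac{1}{\delta} \Big) \Big\} \]
(we skip the proof of the measurability, that is based on tedious but routine arguments).
Since $W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) < +\infty$ for $\mu_T$-a.e. a, we obtain that for $\mu_T$-a.e. $a \in \mathbb{T}^d$ there exists
$\lambda \in \Gamma(\eta^s_a, \gamma^t_a)$ with $(a,\lambda) \in E$. Thanks to the measurable selection theorem we can select
a Borel family $a \mapsto \lambda_a \in \mathcal{P}(\mathbb{T}^d \times \mathbb{T}^d)$ such that $\lambda_a \in \Gamma(\eta^s_a, \gamma^t_a)$ and
\[ \int_{\mathbb{T}^d \times \mathbb{T}^d} c^{s,t}_{\bar p}(x,y)\,d\lambda_a < \delta + \Big( W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) \vee -\frac{1}{\delta} \Big) \qquad \text{for } \mu_T\text{-a.e. } a \in \mathbb{T}^d. \]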
By Lemma 3.6.9 and Remark 3.6.10 we get
and
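\[ \int_{\mathbb{T}^d} \int_{\mathbb{T}^d \times \mathbb{T}^d} K^{s,t}_p(x) + K^{s,t}_p(y)\,d\lambda_a\,d\mu_T(a) = \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} K^{s,t}_p\,d(\eta^s_a + \gamma^t_a)\,d\mu_T(a) < +\infty \]
(we used the pointwise bound $M p_\varepsilon \le M p$ and the fact that $q \mapsto K^{s,t}_q$ has a monotone
dependence upon $M q$, see (3.6.16)). Therefore (3.6.20) and Fatou’s lemma give
\[ \delta + \int_{\mathbb{T}^d} \Big( W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) \vee -\frac{1}{\delta} \Big)\,d\mu_T(a) \ge \limsup_{\varepsilon \downarrow 0} \int_{\mathbb{T}^d} \int_{\mathbb{T}^d \times \mathbb{T}^d} c^{s,t}_{p_\varepsilon}(x,y)\,d\lambda_a\,d\mu_T(a). \tag{3.6.21} \]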
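Still thanks to the measurable selection theorem, we can find a Borel map $(x,y,a) \mapsto \omega^{x,y}_{a,\varepsilon} \in C\big([s,t]; \mathbb{T}^d\big)$
with $\omega^{x,y}_{a,\varepsilon}(s) = x$, $\omega^{x,y}_{a,\varepsilon}(t) = y$, $M p_\varepsilon(\tau, \omega^{x,y}_{a,\varepsilon}) \in L^1(s,t)$ and
\[ \int_s^t \frac12 |\dot\omega^{x,y}_{a,\varepsilon}|^2 - p_\varepsilon(\tau, \omega^{x,y}_{a,\varepsilon})\,d\tau < \delta + c^{s,t}_{p_\varepsilon}(x,y) \qquad \text{for } \lambda_a \otimes \mu_T\text{-a.e. } (x,y,a). \]
Let $\lambda^\varepsilon = \lambda^\varepsilon_a \otimes \mu_T$ be the push-forward, under the map $(x,y,a) \mapsto \omega^{x,y}_{a,\varepsilon}$, of the measure
$\lambda_a \otimes \mu_T$; by construction this measure fulfils $(e_s)_\# \lambda^\varepsilon_a = \eta^s_a$, $(e_t)_\# \lambda^\varepsilon_a = \gamma^t_a$ (because the
marginals of $\lambda_a$ are $\eta^s_a$ and $\gamma^t_a$), therefore it connects $\eta^s$ to $\gamma^t$ in [s, t]. Then, from (3.6.21)
we get
\[ 2\delta + \int_{\mathbb{T}^d} \Big( W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) \vee -\frac{1}{\delta} \Big)\,d\mu_T(a) \ge \limsup_{\varepsilon \downarrow 0} \int_{C([s,t];\mathbb{T}^d) \times \mathbb{T}^d} \int_s^t \frac12 |\dot\omega(\tau)|^2 - p_\varepsilon(\tau,\omega)\,d\tau\,d\lambda^\varepsilon(\omega,a). \]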
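Eventually, Remark 3.6.5 provides us with a flow with bounded compression $\hat\lambda^\varepsilon$ connecting
$\eta^{s,\varepsilon}$ to $\gamma^{t,\varepsilon}$ in [s, t] with
\[ 2\delta + \int_{\mathbb{T}^d} \Big( W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a) \vee -\frac{1}{\delta} \Big)\,d\mu_T(a) \ge \limsup_{\varepsilon \downarrow 0} \int_{C([s,t];\mathbb{T}^d) \times \mathbb{T}^d} \int_s^t \frac12 |\dot\omega(\tau)|^2 - \bar p(\tau,\omega)\,d\tau\,d\hat\lambda^\varepsilon(\omega,a). \tag{3.6.22} \]
Since $\eta^{s,\varepsilon} \to \eta^s$ and $\gamma^{t,\varepsilon} \to \gamma^t$ in $(\Gamma(\mathbb{T}^d), \delta)$, we can find (by scaling $\eta$ from [0, s] to
$[0, s_\varepsilon]$ and from [t, 1] to $[t_\varepsilon, 1]$, and using repeatedly the concatenation, see Remark 3.3.2)
generalized flows $\nu^\varepsilon$ between $\gamma$ and $\eta$ in [0, 1], $s_\varepsilon \uparrow s$, $t_\varepsilon \downarrow t$ satisfying:
(c) the action of $\nu^\varepsilon$ in [0, s] converges to $\delta^2(\eta, \eta^s) = s^2 \delta^2(\eta,\gamma)$, and the action of $\nu^\varepsilon$ in
[t, 1] converges to $\delta^2(\gamma^t, \gamma) = (1-t)^2 \delta^2(\eta,\gamma)$.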
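Since $\nu^\varepsilon$ is a flow with bounded compression connecting $\eta$ to $\gamma$ we use (3.6.8) and the
incompressibility in $[0,1] \setminus [s,t]$ to obtain
\[ \int_{\tilde\Omega(\mathbb{T}^d)} \int_0^1 \frac12 |\dot\omega(\tau)|^2\,d\tau\,d\nu^\varepsilon(\omega,a) - \int_{\tilde\Omega(\mathbb{T}^d)} \int_s^t \bar p(\tau,\omega)\,d\tau\,d\nu^\varepsilon(\omega,a) \ge \delta^2(\eta,\gamma) \tag{3.6.23} \]
for all $\varepsilon > 0$. Taking into account that (b) and (c) imply
\[ \int_{\tilde\Omega(\mathbb{T}^d)} \int_0^s \frac12 |\dot\omega(\tau)|^2\,d\tau\,d\nu^\varepsilon(\omega,a) \to s\,\delta^2(\eta,\gamma) \]
and
\[ \int_{\tilde\Omega(\mathbb{T}^d)} \int_t^1 \frac12 |\dot\omega(\tau)|^2\,d\tau\,d\nu^\varepsilon(\omega,a) \to (1-t)\,\delta^2(\eta,\gamma), \]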
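A byproduct of the above proof is that equalities hold in (3.6.17), (3.6.18), and
therefore
\begin{align*}
&\int_{\mathbb{T}^d} \int_{\Omega(\mathbb{T}^d)} \Big( \int_s^t \frac12 |\dot\omega(\tau)|^2 - \bar p(\tau,\omega)\,d\tau - c^{s,t}_{\bar p}(\omega(s), \omega(t)) \Big)\,d\eta_a(\omega)\,d\mu_T(a) \tag{3.6.25} \\
&\quad = \int_{\mathbb{T}^d} \int_{\Omega(\mathbb{T}^d)} \int_s^t \frac12 |\dot\omega(\tau)|^2 - \bar p(\tau,\omega)\,d\tau\,d\eta_a(\omega)\,d\mu_T(a) - \int_{\mathbb{T}^d} W_{c^{s,t}_{\bar p}}(\eta^s_a, \gamma^t_a)\,d\mu_T(a) = 0.
\end{align*}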
This yields in particular also the first optimality condition. However, as the proof of
Theorem 3.6.11 is much more technical than the one presented in Theorem 3.6.8, we
decided to present both.
Now we show that the optimality conditions in Theorems 3.6.8 and 3.6.11 are also
sufficient, even in the case of a general compact manifold without boundary D.
Theorem 3.6.12 (Sufficient condition). Assume that $\eta = \eta_a \otimes \mu$ is a generalized
incompressible flow in D between $\eta$ and $\gamma$, and assume that for some map q the following
properties hold:
(a) $M q \in L^1((0,1) \times D)$ and $\eta$ is concentrated on q-minimizing paths;
(b) the plan $(e_0, e_1)_\# \eta_a$ is optimal, relative to the cost $c^{0,1}_q$ defined in (3.6.13), for
$\mu_D$-a.e. a.
Then $\eta$ is optimal and q is the pressure field. In addition, if (a), (b) are replaced by
(a’) $M q \in L^1_{loc}((0,1) \times D)$ and $\eta$ is concentrated on locally q-minimizing paths;
(b’) for all intervals $[s,t] \subset (0,1)$, the plan $(e_s, e_t)_\# \eta_a$ is optimal, relative to the cost
$c^{s,t}_q$ defined in (3.6.13), for $\mu_D$-a.e. a,
This proves that η is optimal. Moreover, by using the inequality in (3.6.26) with a flow
ν with bounded compression, one obtains
By the incompressibility constraint, in the right hand side $\bar p$ can be replaced by any
function q whose spatial means vanish and, if $M q \in L^1\big([0,1] \times \mathbb{T}^d\big)$, the resulting integral
bounds from above $\int_{\mathbb{T}^d} W_{c^{0,1}_q}(\eta_a, \gamma_a)\,d\mu_T(a)$, as we proved in (3.6.26). □
under the global constraint given by the incompressibility of the flow (3.3.15). By what
we have already proved, the existence of minimizing pairs (c, v) with finite action holds
when, for instance, $D = [0,1]^d$ or $D = \mathbb{T}^d$ is the flat d-dimensional torus (see Section
3.3.2). Moreover minimizing pairs (c, v) satisfy the following two properties:
(a) (Constancy of kinetic energy) The map $t \mapsto \int |v|^2(t,x,a)\,dc_t$ coincides a.e. in
(0, T) with a constant ($2T^{-1}$ times the minimal action);
(b) (Weak solution to Euler’s equations) There exists a distribution p in $(0,T) \times D$
satisfying
\[ \nabla p = -\partial_t \Big( \int_D v(t,x,a)\,dc_{t,x}(a) \Big) - \operatorname{div} \Big( \int_D v(t,x,a) \otimes v(t,x,a)\,dc_{t,x}(a) \Big), \]
In this section we slightly refine the deep analysis made in [35] of the regularity of
the gradient of the pressure field: Brenier proved that the distributions $\partial_{x_i} p$ are locally
finite measures in $(0,T) \times D$, but this information is not sufficient (due to a lack of
time regularity) to imply that p is a function. As shown in Corollary 3.7.4, a sufficient
condition, that gives also $p \in L^2_{loc}\big((0,T); L^{d/(d-1)}_{loc}(D)\big)$, is that
\[ \partial_{x_i} p \in L^2_{loc}\big((0,T); \mathcal{M}_{loc}(D)\big), \qquad i = 1, \dots, d. \]
The proof of this regularity property is the main goal of this section. The fact that p is
a function at least in some $L^1_{loc}(L^r_{loc})$ space, for some $r > 1$, plays an important role in the
analysis, developed in Section 3.6, of the necessary and sufficient optimality conditions
for action-minimizing curves in $\Gamma(D)$. Indeed, these conditions involve the Lagrangian
\[ L_p(\gamma) := \int \frac12 |\dot\gamma(t)|^2 - p(t,\gamma(t))\,dt, \]
3.7. Regularity of the pressure field 125
the (locally) minimizing curves for Lp and the value function induced by Lp , and none
of these objects makes sense if p is only a measure in the time variable.
From now on, we fix a minimizing pair (c, v), and we shall denote by
\[ A^* := \frac12 \int_0^T \int_{D \times D} |v|^2(t,x,a)\,dc(t,x,a) = T\, \frac12 \int_{D \times D} |v|^2(t,x,a)\,dc_t(x,a) \]
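\[ \sup_{(F,\Phi) \in E} \{ -\alpha(-F,-\Phi) - \beta(F,\Phi) \} = \inf_{(\tilde c, \tilde v \tilde c) \in E^*} \{ \alpha^*(\tilde c, \tilde v \tilde c) + \beta^*(\tilde c, \tilde v \tilde c) \}, \]
and
\[ \beta^*(\tilde c, \tilde v \tilde c) := \begin{cases} 0 & \text{if } \langle c - \tilde c, \partial_t \phi + p \rangle + \langle v c - \tilde v \tilde c, \nabla_x \phi \rangle = 0 \ \ \forall\, p, \phi, \\ +\infty & \text{otherwise.} \end{cases} \]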
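Thus it is simple to check that $\beta^*(\tilde c, \tilde v \tilde c) = 0$ if and only if the two constraints (3.3.14)
and (3.3.15) are satisfied.
One therefore deduces that the minimum of the action coincides with the dual problem
$\sup_{(F,\Phi) \in E} \{ -\alpha(-F,-\Phi) - \beta(F,\Phi) \}$, which more concretely can be written as
with
\[ \partial_t \phi + \frac12 |\nabla_x \phi|^2 + p \le 0. \]
Thus, the duality tells us that, for any $\varepsilon > 0$, there exist $p_\varepsilon(t,x)$ and $\phi_\varepsilon(t,x,a)$ satisfying
\[ \partial_t \phi_\varepsilon + \frac12 |\nabla_x \phi_\varepsilon|^2 + p_\varepsilon \le 0 \]
and
\[ \Big\langle \frac12 |v|^2, c \Big\rangle \le \langle c, \partial_t \phi_\varepsilon + p_\varepsilon \rangle + \langle v c, \nabla_x \phi_\varepsilon \rangle + \varepsilon^2. \]
As shown in [35, Section 3.2], from this one deduces the estimate
\[ \frac12 \int |v - \nabla_x \phi_\varepsilon|^2\,dc \le \varepsilon^2. \tag{3.7.1} \]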
We remark that, up to adding to $\phi_\varepsilon$ a function of time, one can always assume $\int_D p_\varepsilon(t,x)\,dx = 0$
for all $t \in [0,T]$. As shown in [35, Section 3.4], the family $p_\varepsilon$ is compact in the sense of
distributions, so that there exists a cluster point p. Moreover, since any limit point p of
$p_\varepsilon$ is seen to satisfy (3.6.7) in the sense of distributions for any minimizing pair (c, v), $\nabla p$
is uniquely determined, and this enforces the convergence of the whole family $(\nabla p_\varepsilon)_{\varepsilon > 0}$
to $\nabla p$ in the sense of distributions.
Let us now prove the following regularity result on $\nabla_x \phi_\varepsilon$: we present a proof slightly
different from the one in [35].
Proposition 3.7.1. Let $\tau \in (0,T)$, let $w : D \to \mathbb{R}^d$ be a smooth divergence-free vector
field parallel to $\partial D$ and let $e^{sw}(x)$ be the measure-preserving flow in D generated by w.
Then, for $\eta < \tau$ we have
\[ \int_\tau^{T-\tau} \int_{D \times D} \big| \nabla_x \phi_\varepsilon(t + \eta, e^{\delta w}(x), a) - \nabla_x \phi_\varepsilon(t,x,a) \big|^2\,dc \le L(\varepsilon^2 + \eta^2 + \delta^2), \tag{3.7.2} \]
Proof. In the sequel we fix a cut-off function $\zeta : [0,T] \to [0,1]$ identically equal to 1 on
$[\tau, T-\tau]$. We recall the following estimate (Proposition 3.1 in [35]), which follows by
the “quasi optimality” of $(p_\varepsilon, \phi_\varepsilon)$ in the dual problem:
\[ \frac12 \int \big| (\partial_t + v^\eta \cdot \nabla_x) e^{\delta\zeta w} - \nabla_x \phi_\varepsilon \circ e^{\delta\zeta w} \big|^2\,dc^\eta \le \varepsilon^2 + \frac12 \int \big| (\partial_t + v^\eta \cdot \nabla_x) e^{\delta\zeta w} \big|^2\,dc^\eta - \frac12 \int |v|^2\,dc, \tag{3.7.3} \]
(here $e^{\delta\zeta w}(x)$ is the flow generated by w starting from x, at time $\delta\zeta$) where $(v^\eta, c^\eta)$ is the
“reparameterization” of (v, c) given by
\[ c^\eta = c^\eta(t)\,dt = c_{t + \eta\zeta(t)}\,dt, \qquad v^\eta(t,x,a) = (1 + \eta\zeta'(t))\, v(t + \eta\zeta(t), x, a). \]
The minimality of (v, c) gives $\int |v^\eta|^2\,dc^\eta \ge \int |v|^2\,dc$, and the constancy of $t \mapsto \int |v|^2(t,x,a)\,dc_t$
gives
\[ \int |v^\eta|^2\,dc^\eta - \int |v|^2\,dc = \int (\eta^2 (\zeta')^2 + 2\eta\zeta')\,dc \le C\eta^2, \tag{3.7.4} \]
\[ + \int |v^\eta|^2\,dc^\eta - \int |v|^2\,dc. \]
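Defining
\begin{align*}
f(\delta,\varepsilon,\eta) &:= \int \big| v^\eta - \nabla_x \phi_\varepsilon \circ e^{\delta\zeta w} \big|^2\,dc^\eta = \int \big| (1 + \eta\zeta')\, v(t + \eta\zeta, x, a) - \nabla_x \phi_\varepsilon(t + \eta\zeta, e^{\delta\zeta w}(x), a) \big|^2\,dc \\
&\ge \int_\tau^{T-\tau} \int_{D \times D} \big| v - \nabla_x \phi_\varepsilon(t + \eta, e^{\delta w}(x), a) \big|^2\,dc
\end{align*}
we see that it suffices to bound f from above. Since $e^{\delta\zeta w} x - x = \delta\zeta(t) w(x) + O(\delta^2)$ (in
the $C^1$ norm in spacetime), by Schwarz inequality, (3.7.4) and (3.7.5) we get
\[ f \le C \sqrt{f}\, \delta + 2\varepsilon^2 + C(\delta\eta + \delta^2) + C\eta^2, \]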
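Proof. For $\zeta \in C_c^\infty(\tau, T-\tau)$ nonnegative, $\eta \in (0, \tau/2)$ and $\delta, \varepsilon > 0$ we consider the
following expression:
\begin{align*}
I = I(\zeta,\delta,\eta,\varepsilon) :&= \int_0^T \int_D \zeta(t) \Big| \int_0^1 \big[ p_\varepsilon(t + \eta\theta, e^{\delta w}(x)) - p_\varepsilon(t + \eta\theta, x) \big]\,d\theta \Big|\,dx\,dt \\
&= \int \zeta(t) \Big| \int_0^1 \big[ p_\varepsilon(t + \eta\theta, e^{\delta w}(x)) - p_\varepsilon(t + \eta\theta, x) \big]\,d\theta \Big|\,dc(t,x,a).
\end{align*}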
Our goal is to bound I from above. This will be achieved in the following (many) steps:
$I \le I_1 + I_2 + I_3$ and estimate of $I_2$, $I_3$; $I_1 \le 2\|\zeta\|_\infty \varepsilon^2 - (I_4 + I_5 + I_6)$ and estimate of $I_5$
and $I_6$; $I_4 = I_7 + I_8$ and estimate of $I_8$; $I_7 = 2I_9 + I_{10}$ and estimate of $I_9$; $I_{10} = I_{11} + I_{12}$
and estimate of $I_{12}$; finally $I_{11} = I_{13} + I_{14}$ and estimate of $I_{13}$ and $I_{14}$. In order to avoid
a cumbersome notation, during this proof we denote by C a generic constant depending
only on $(w, \tau, T, A^*)$, whose specific value can change from line to line.
We now consider $\lambda_\varepsilon(t,x,a) := -\big( \partial_t \phi_\varepsilon + \frac12 |\nabla_x \phi_\varepsilon|^2 + p_\varepsilon \big) \ge 0$, and we recall that
$\int \lambda_\varepsilon\,dc \le \varepsilon^2$. We have
\[ I \le I_1 + I_2 + I_3, \]
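where
\begin{align*}
I_1 &:= \int \zeta(t) \Big| \int_0^1 \big[ \lambda_\varepsilon(t + \eta\theta, e^{\delta w}(x), a) - \lambda_\varepsilon(t + \eta\theta, x, a) \big]\,d\theta \Big|\,dc, \\
I_2 &:= \int \zeta(t) \Big| \int_0^1 \big[ \partial_t \phi_\varepsilon(t + \eta\theta, e^{\delta w}(x), a) - \partial_t \phi_\varepsilon(t + \eta\theta, x, a) \big]\,d\theta \Big|\,dc, \\
I_3 &:= \int \zeta(t) \Big| \int_0^1 \Big[ \frac12 |\nabla_x \phi_\varepsilon|^2(t + \eta\theta, e^{\delta w}(x), a) - \frac12 |\nabla_x \phi_\varepsilon|^2(t + \eta\theta, x, a) \Big]\,d\theta \Big|\,dc.
\end{align*}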
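By (3.7.2) we have
\[ \|\nabla_x \phi_\varepsilon(t + \eta\theta, e^{\delta w}(x), a)\|_{L^2(\zeta^2 c)} \le \|\nabla_x \phi_\varepsilon(t,x,a)\|_{L^2(\zeta^2 c)} + \sqrt{L}\, \|\zeta\|_\infty (\varepsilon + \eta + \delta) \qquad \forall \theta \in (0,1). \]
Therefore writing $|A|^2 - |B|^2$ as $(A - B) \cdot (A + B)$ and using (3.7.2) once more, we can
estimate
\[ I_3 \le C(\varepsilon + \eta + \delta) \Big( \int \zeta^2(t) |\nabla_x \phi_\varepsilon|^2(t,x,a)\,dc \Big)^{1/2} + C \|\zeta\|_\infty^2 (\varepsilon^2 + \eta^2 + \delta^2). \tag{3.7.7} \]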
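For $I_2$ we first integrate with respect to $\theta$ and then use the mean value theorem to obtain
\begin{align*}
I_2 &\le \frac{\delta}{\eta} \int \zeta(t) \int_0^1 \big| \big[ \nabla_x \phi_\varepsilon(t + \eta, e^{\sigma\delta w}(x), a) - \nabla_x \phi_\varepsilon(t, e^{\sigma\delta w}(x), a) \big] \cdot w(e^{\sigma\delta w}(x)) \big|\,d\sigma\,dc \\
&\le C \frac{\delta}{\eta} \int_0^1 \int \zeta(t) \big| \nabla_x \phi_\varepsilon(t + \eta, e^{\sigma\delta w}(x), a) - \nabla_x \phi_\varepsilon(t, e^{\sigma\delta w}(x), a) \big|\,dc\,d\sigma \\
&\le C \frac{\delta}{\eta} (\varepsilon + \eta + \delta) \|\zeta\|_{L^2(0,T)}. \tag{3.7.8}
\end{align*}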
R
Let us now consider I1 : using λε ≥ 0 and λε dc ≤ ε2 , we obtain
Z Z 1
£ ¤
I1 ≤ ζ(t) λε (t + ηθ, eδw (x), a) + λε (t + ηθ, x, a) dθdc
0
Z Z 1
2
£ ¤
≤ 2kζk∞ ε + ζ(t) λε (t + ηθ, eδw (x), a) + λε (t + ηθ, x, a) − 2λε (t, x, a) dθdc
0
≤ 2kζk∞ ε2 − I4 − I5 − I6 ,
where
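\[
\begin{aligned}
I_4 &:= \int \zeta(t)\int_0^1 \big[ \partial_t\phi_\varepsilon(t+\eta\theta, e^{\delta w}(x), a) + \partial_t\phi_\varepsilon(t+\eta\theta, x, a) - 2\partial_t\phi_\varepsilon(t,x,a) \big]\, d\theta\, dc, \\
I_5 &:= \frac12\int \zeta(t)\int_0^1 \big[ |\nabla_x\phi_\varepsilon|^2(t+\eta\theta, e^{\delta w}(x), a) + |\nabla_x\phi_\varepsilon|^2(t+\eta\theta, x, a) - 2|\nabla_x\phi_\varepsilon|^2(t,x,a) \big]\, d\theta\, dc, \\
I_6 &:= \int \zeta(t)\int_0^1 \big[ p_\varepsilon(t+\eta\theta, e^{\delta w}(x)) + p_\varepsilon(t+\eta\theta, x) - 2p_\varepsilon(t,x) \big]\, d\theta\, dc.
\end{aligned}
\]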
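For I8, using once more that $\lambda_\varepsilon \ge 0$ and $\int \lambda_\varepsilon\, dc \le \varepsilon^2$, we have the bound
\[
\begin{aligned}
|I_8| &\le 2\left| \int\!\!\int_0^1 \big[\zeta(t-\theta\eta) - \zeta(t)\big]\,\lambda_\varepsilon(t,x,a)\, d\theta\, dc \right| + \left| \int\!\!\int_0^1 \big[\zeta(t-\theta\eta) - \zeta(t)\big]\,|\nabla_x\phi_\varepsilon|^2(t,x,a)\, d\theta\, dc \right| \\
&\quad + 2\left| \int\!\!\int_0^1 \big[\zeta(t-\theta\eta) - \zeta(t)\big]\,p_\varepsilon(t,x,a)\, d\theta\, dc \right| \\
&\le 4\|\zeta\|_\infty\varepsilon^2 + \left| \int\!\!\int_0^1 \big[\zeta(t-\theta\eta) - \zeta(t)\big]\,|\nabla_x\phi_\varepsilon|^2(t,x,a)\, d\theta\, dc \right|,
\end{aligned}
\]
where in the last inequality we used that $\int p_\varepsilon\, dc_t = \int p_\varepsilon\, dx = 0$. Using also the fact that $t \mapsto \int |v|^2(t,x,a)\, dc_t$ does not depend on t we get
\[
|I_8| \le 4\|\zeta\|_\infty\varepsilon^2 + 2\|\zeta\|_\infty \int \Big| |\nabla_x\phi_\varepsilon|^2(t,x,a) - |v|^2(t,x,a) \Big|\, dc. \tag{3.7.11}
\]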
We have, as for I2 ,
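\[
\begin{aligned}
|I_9| &= \frac1\eta \left| \int \zeta(t)\Big[ \big(\phi_\varepsilon(t+\eta, e^{\delta w}(x), a) - \phi_\varepsilon(t+\eta, x, a)\big) - \big(\phi_\varepsilon(t, e^{\delta w}(x), a) - \phi_\varepsilon(t, x, a)\big) \Big]\, dc \right| \\
&= \frac{\delta}{\eta} \left| \int \zeta(t)\int_0^1 \big[ \nabla_x\phi_\varepsilon(t+\eta, e^{\sigma\delta w}(x), a) - \nabla_x\phi_\varepsilon(t, e^{\sigma\delta w}(x), a) \big]\cdot w(e^{\sigma\delta w}(x))\, d\sigma\, dc \right| \\
&\le C\,\frac{\delta}{\eta}\,(\varepsilon+\eta+\delta)\,\|\zeta\|_{L^2(0,T)}.
\end{aligned}
\tag{3.7.12}
\]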
For I10 , we use the continuity equation ∂t c + divx (vc) = 0 (see (3.3.16)) and add and
subtract ζ(t) to get
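\[
\begin{aligned}
I_{10} &= \int\!\!\int_0^1\!\!\int_0^1 \partial_t\big[ \zeta(t-(1-\sigma)\eta\theta)\,\partial_t\phi_\varepsilon(t+\eta\theta\sigma, x, a) \big]\,\eta\theta\, d\sigma\, d\theta\, dc \\
&= -\int\!\!\int_0^1\!\!\int_0^1 \zeta(t-(1-\sigma)\eta\theta)\,\partial_t\nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a)\,\eta\theta\, d\sigma\, d\theta\, dc \\
&= -\int\!\!\int_0^1\!\!\int_0^1 \big[\zeta(t-(1-\sigma)\eta\theta) - \zeta(t)\big]\,\partial_t\nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a)\,\eta\theta\, d\sigma\, d\theta\, dc \\
&\quad - \int\!\!\int_0^1\!\!\int_0^1 \zeta(t)\,\partial_t\nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a)\,\eta\theta\, d\sigma\, d\theta\, dc \\
&= -\int\!\!\int_0^1\!\!\int_0^1 \big[\zeta(t-(1-\sigma)\eta\theta) - \zeta(t)\big]\,\partial_t\nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a)\,\eta\theta\, d\sigma\, d\theta\, dc \\
&\quad - \int\!\!\int_0^1 \zeta(t)\,\big[\nabla_x\phi_\varepsilon(t+\eta\theta, x, a) - \nabla_x\phi_\varepsilon(t, x, a)\big]\cdot v(t,x,a)\, d\theta\, dc \\
&=: I_{11} + I_{12}.
\end{aligned}
\]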
Now we see that, using (3.7.2) and the Schwarz inequality, we easily get
\[
|I_{12}| \le C(\varepsilon+\eta)\left(\int \zeta^2(t)\,|\nabla_x\phi_\varepsilon|^2\, dc\right)^{1/2} + C\|\zeta\|_\infty^2(\varepsilon^2+\eta^2). \tag{3.7.13}
\]
and
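\[
I_{14} := \int\!\!\int_0^1\!\!\int_0^1 \big[\zeta'(t-(1-\sigma)\eta\theta) - \zeta'(t)\big]\,\nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a)\,\eta\theta\, d\sigma\, d\theta\, dc.
\]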
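Recalling that $t \mapsto \int |v|^2(t,x,a)\, dc_t$ is constant, by (3.7.1) we have
\[
|I_{13}| \le \left| \int\!\!\int_0^1 \big[\zeta(t-\eta\theta) - \zeta(t)\big]\,\big(\nabla_x\phi_\varepsilon(t,x,a) - v(t,x,a)\big)\cdot v(t,x,a)\, d\theta\, dc \right| \le C\|\zeta\|_\infty\,\varepsilon. \tag{3.7.14}
\]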
Finally, by (3.7.2) we can bound I14 with
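\[
\begin{aligned}
|I_{14}| &\le \|\zeta''\|_\infty\,\eta^2 \int_{\tau/2}^{T-\tau/2}\!\!\int_{D\times A}\int_0^1\!\!\int_0^1 \big| \nabla_x\phi_\varepsilon(t+\eta\theta\sigma, x, a)\cdot v(t,x,a) \big|\, d\sigma\, d\theta\, dc \\
&\le \|\zeta''\|_\infty\,\eta^2\, C\,\big(\|\nabla_x\phi_\varepsilon\|_2 + C(\varepsilon+\eta)\big).
\end{aligned}
\tag{3.7.15}
\]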
Collecting (3.7.7), (3.7.8), (3.7.9), (3.7.10), (3.7.12), (3.7.13), (3.7.14), (3.7.15) we can
bound from above I as follows:
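\[
\begin{aligned}
& C(\varepsilon+\eta+\delta)\left(\int \zeta^2(t)\,|\nabla_x\phi_\varepsilon|^2\, dc\right)^{1/2} + C\|\zeta\|_\infty^2(\varepsilon^2+\eta^2+\delta^2) \\
&\quad + I_8 + C\,\frac{\delta}{\eta}\,(\varepsilon+\eta+\delta)\,\|\zeta\|_{L^2(0,T)} + \|\zeta''\|_\infty\,\eta^2\, C\,\big(\|\nabla_x\phi_\varepsilon\|_2 + C(\varepsilon+\eta)\big) + 2\|\zeta\|_\infty\varepsilon^2 + C\|\zeta\|_\infty\,\varepsilon.
\end{aligned}
\]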
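Now, recalling the definition of I, we integrate $p_\varepsilon\zeta$ against a function $f \in C_c^\infty\big((0,T)\times D\big)$
and pass to the limit as ε → 0, with η = δ frozen, to obtain
\[
\frac1\delta \left| \int_0^1 \big\langle q,\ \zeta(t)\big[ f(t-\delta\theta, e^{-\delta w}(x)) - f(t-\delta\theta, x) \big] \big\rangle\, d\theta \right| \le C\|f\|_\infty\big(\|\zeta\|_{L^2(0,T)} + \delta\|\zeta''\|_\infty + \delta\|\zeta\|_\infty\big)
\]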
particular, if we denote by $\bar q_\varepsilon$ the mean value of $q_\varepsilon$ on $D'$, $q_\varepsilon - \bar q_\varepsilon$ is uniformly bounded
in the space $L^2_{loc}\big((0,T); L^{1^*}(D')\big)$, and if q is any weak limit point (in the duality with
$L^2_{loc}\big((0,T); L^d(D')\big)$) we easily get ∇q = ∇p and $q \in L^2_{loc}\big((0,T); BV(D')\big)$.
In the case D = Td the proof is analogous: it suffices to apply Remark 3.7.3.
Chapter 4
4.1 Introduction
Let M be a smooth manifold without boundary. We denote by T M the tangent bundle
and by π : T M → M the canonical projection. A point in T M will be denoted by (x, v)
with x ∈ M and v ∈ Tx M = π −1 (x). In the same way a point of the cotangent bundle
T ∗ M will be denoted by (x, p) with x ∈ M and p ∈ Tx∗ M a linear form on the vector space
Tx M . We will suppose that g is a complete Riemannian metric on M . For v ∈ Tx M , the
norm ‖v‖_x is g_x(v, v)^{1/2} . We will denote by ‖ · ‖_x the dual norm on T ∗ M . Moreover, for
every pair x, y ∈ M , d(x, y) will denote the Riemannian distance from x to y.
We will assume in the whole chapter that H : T ∗ M → R is a Hamiltonian of class
$C^{k,\alpha}$, with k ≥ 2, α ∈ [0, 1], which satisfies the three following conditions:
(H1) C²-strict convexity: ∀(x, p) ∈ T ∗ M , the second derivative along the fibers
$\frac{\partial^2 H}{\partial p^2}(x,p)$ is strictly positive definite;
(H2) uniform superlinearity: for every K ≥ 0 there exists a finite constant C(K)
such that
\[
\forall (x,p) \in T^*M, \qquad H(x,p) \ge K\|p\|_x + C(K);
\]
1
This chapter is based on a joint work with Albert Fathi and Ludovic Rifford [67].
By the Weak KAM Theorem we know that, under the above conditions, there is
c(H) ∈ R such that the Hamilton-Jacobi equation
H(x, dx u) = c
admits a global viscosity solution u : M → R for c = c(H) and does not admit such
solution for c < c(H), see [62], [52], [65], [68], [96]. In fact, if M is assumed to be compact,
then c(H) is the only value of c for which the Hamilton-Jacobi equation above admits
a viscosity solution. The constant c(H) is called the critical value, or the Mañé critical
value of H. In the sequel, a viscosity solution u : M → R of H(x, dx u) = c(H) will be
called a critical viscosity solution or a weak KAM solution, while a viscosity subsolution u
of H(x, dx u) = c(H) will be called a critical viscosity subsolution (or critical subsolution
if u is at least C 1 ).
We recall that the Lagrangian L : T M → R associated to the Hamiltonian H is
defined by
\[
\forall (x,v) \in TM, \qquad L(x,v) := \max_{p \in T_x^*M} \{ p(v) - H(x,p) \}.
\]
Since H is of class at least C² and satisfies the three conditions (H1)-(H3), it is well-known
(see for instance [65] or [68, Lemma 2.1]) that L is finite everywhere, of class at
least C², strictly convex and superlinear in each fiber Tx M , and satisfies
\[
\forall (x,p) \in T_x^*M, \qquad H(x,p) = \max_{v \in T_xM} \{ p(v) - L(x,v) \}.
\]
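As a numerical illustration of this duality (a minimal sketch, not part of the original argument), one can approximate the fiberwise Legendre transform on a grid. The example assumes a one-dimensional fiber and the model Hamiltonian H(p) = p²/2, for which L(v) = v²/2; the function name `legendre`, the grids and the test Hamiltonian are all hypothetical choices made for the illustration.

```python
def legendre(f, grid):
    """Return g(w) = max over q in grid of (q*w - f(q)): a discrete Legendre transform."""
    values = [(q, f(q)) for q in grid]
    return lambda w: max(q * w - fq for q, fq in values)

# Hypothetical model Hamiltonian on a single fiber: H(p) = p^2 / 2.
H = lambda p: 0.5 * p * p

# Momentum grid on [-10, 10] with step 0.01 (an arbitrary discretization).
p_grid = [i * 0.01 for i in range(-1000, 1001)]
L = legendre(H, p_grid)        # L(v) = max_p { p*v - H(p) }

# Velocity grid on [-5, 5]; transforming back should recover H near the origin.
v_grid = [i * 0.01 for i in range(-500, 501)]
H_back = legendre(L, v_grid)   # H(p) = max_v { p*v - L(v) }

for v in (-2.0, 0.0, 1.5):
    print(v, L(v), 0.5 * v * v)   # the last two columns should nearly agree
```

The grids must be wide enough to contain the maximizer; outside that range the discrete transform only gives a lower bound.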
where the infimum is taken over all the absolutely continuous paths γ : [0, t] → M with
γ(0) = x and γ(t) = y. The Peierls barrier is the function h : M × M → R defined by
A := {x ∈ M | h(x, x) = 0} .
is a semi-distance on the projected Aubry set. We define the quotient Aubry set (AM , dM )
to be the metric space obtained by identifying two points in A if their semi-distance dM
vanishes. In [105], Mather formulated the following problem:
In [104], Mather brought a positive answer to that problem in low dimension. More
precisely, he proved that if M has dimension two, or if the Lagrangian is the kinetic
energy associated to a Riemannian metric on M in dimension ≤ 3, then the quotient
Aubry set is totally disconnected. In fact, Mather mentioned in [105] that it would be
even more interesting to be able to prove that the quotient Aubry set has vanishing
one-dimensional Hausdorff measure. The aim of the present chapter is to show that such
a property is satisfied under various assumptions. Let us state our results.
and denote by $(A^0_M, d_M)$ the quotiented metric space. In fact, at the very end of his
paper [104], Mather noticed that the argument used in the case where L is a kinetic
energy in dimension 3 proves the total disconnectedness of the quotient Aubry set in
dimension 3 as long as $A^0_M$ is empty. Our result concerning the stationary projected
Aubry set is the following:
Theorem 4.1.2. If dim M ≥ 3 and H of class $C^{k,1}$ with k ≥ 2 dim M − 3, then $(A^0_M, d_M)$
has vanishing one-dimensional Hausdorff measure. Moreover, if α ∈ (0, 1] is such that
$\alpha\left(\frac{k+1}{2} + 1\right) \ge \dim M$ then $(A^0_M, d_M)$ has vanishing α-dimensional Hausdorff measure. In
particular, if H is $C^\infty$ then $(A^0_M, d_M)$ has zero Hausdorff dimension.
This result is in some sense optimal: for each integer d > 0, and each ε > 0, Mather
has constructed on the torus $T^d = R^d/Z^d$ a Tonelli Lagrangian L of class $C^{2d-3,1-\varepsilon}$ such
that Ã is connected, contained in the fixed points of the Euler-Lagrange flow, and the
Mather quotient (AM , dM ) is isometric to an interval, see [105].
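To make the regularity threshold of Theorem 4.1.2 concrete, the following sketch computes the smallest integer k satisfying α((k+1)/2 + 1) ≥ dim M, which rearranges to k ≥ 2 dim M/α − 3. The helper name `min_regularity` is hypothetical, and the floor k ≥ 2 is an assumption taken from the standing hypothesis of the chapter rather than from the theorem itself.

```python
from fractions import Fraction
from math import ceil

def min_regularity(dim_m, alpha=Fraction(1)):
    """Smallest integer k >= 2 with alpha * ((k + 1) / 2 + 1) >= dim_m."""
    alpha = Fraction(alpha)
    if not (0 < alpha <= 1):
        raise ValueError("alpha must lie in (0, 1]")
    # alpha * ((k + 1)/2 + 1) >= n   <=>   k >= 2n/alpha - 3
    return max(2, ceil(Fraction(2 * dim_m) / alpha - 3))

for n in (3, 4, 5):
    print(n, min_regularity(n))   # alpha = 1 reproduces k = 2n - 3
```

Exact rational arithmetic is used so that borderline cases of the inequality are not misclassified by floating-point rounding.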
As a corollary of the above theorem, we have the following result which was more or less
already proved by Mather in [105, §19 page 1722] (see also the work of Sorrentino [124],
where the author uses a strategy similar to ours to prove analogous results).
Corollary 4.1.3. Assume that H is of class C 2 and that its associated Lagrangian L
satisfies the following conditions:
1. ∀x ∈ M, minv∈Tx M L(x, v) = L(x, 0);
The theorems obtained in the first part of the chapter together with applications in
dynamics developed in Section 6 give an answer to this question if dim M ≤ 3.
The outline is the following: Sections 2 and 3 are devoted to preparatory results.
Section 4 is devoted to the proofs of Theorems 4.1.1, 4.1.2 and 4.1.4. Sections 5 and 6
present applications in dynamics.
Proof. Let x, y ∈ A be fixed. First, we notice that if u1 , u2 are two critical viscosity
subsolutions, then we have
(u1 − u2 )(y) − (u1 − u2 )(x) = (u1 (y) − u1 (x)) + (u2 (x) − u2 (y))
≤ h(x, y) + h(y, x) = dM (x, y).
(u1 − u2 )(y) − (u1 − u2 )(x) = (h(x, y) − h(y, y)) − (h(x, x) − h(y, x))
= h(x, y) + h(y, x) = dM (x, y),
since h(x, x) = h(y, y) = 0. Since u1 , u2 are both critical viscosity solutions, we obtain
easily the first and the second equality. The last inequality is an immediate consequence
of the Theorem of Fathi and Siconolfi recalled above.
The proofs of the next two lemmas can be found in [114]. It should be noted that
Norton's Lemma 4.2.2 is an elegant generalization of Morse's original lemma, see
[111].
1. A0 is countable;
Lemma 4.2.3. For any C 1 -embedded compact disk B, there is a constant C > 0 such
that for all x, y ∈ B there is a C 1 path in B from x to y with length less than C|x − y|.
The proof of Lemma 4.2.4 that we present here is derived from [18] (compare [72])
who proved that if E ⊂ Rn is a measurable set, f : E → R is continuous, and n ≥ 2 is
such that f satisfies
|f (x) − f (y)| ≤ C|x − y|n ∀x, y ∈ E,
then f (E) has Lebesgue measure zero.
Then we have
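\[
\mathcal{H}^\alpha(\Psi(E_2)) \le \sum_{i\in I} \big(\operatorname{diam}_X \Psi(B_i)\big)^\alpha \le M \sum_{i\in I} (\operatorname{diam} B_i)^n \le M\varepsilon.
\]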
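where 5B denotes the ball concentric to B with radius 5 times that of B. We can thus
consider the covering of f (E1 ) given by $\bigcup_{1\le i\le N_n} \{f(5B \cap E_1)\}_{B\in G}$. In this way, by (4.2.4),
we get
\[
\begin{aligned}
\mathcal{H}^\alpha(\Psi(E_1)) &\le \sum_{B\in G} \big(\operatorname{diam}_X \Psi(5B \cap E_1)\big)^\alpha \le 4^n M^\alpha P^{\alpha-n} \sum_{B\in G} (\operatorname{diam} 5B)^n \\
&\le 20^n M^\alpha P^{\alpha-n} \sum_{B\in G} (\operatorname{diam} B)^n \\
&\le 20^n M^\alpha P^{\alpha-n} \mathscr{L}^n(\Omega) \le 2\cdot 20^n M^\alpha P^{\alpha-n} \mathscr{L}^n(E),
\end{aligned}
\]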
4.3 Existence of $C^{1,1}_{loc}$ critical subsolution on noncompact manifolds
In [20], using some kind of Lasry-Lions regularization (see [89]), Bernard proved the
existence of C 1,1 critical subsolutions on compact manifolds. Here, adapting his proof,
we show that the same result holds in the noncompact case and we make clear that
the Lipschitz constant of the derivative of the $C^{1,1}_{loc}$ critical subsolution can be uniformly
bounded on compact subsets of M .
Theorem 4.3.1. Assume that H is of class C². For every open subset O of M which
is relatively compact in M , there is a constant L = L(O) > 0 such that if u : M → R
is a critical viscosity subsolution, then there exists a $C^{1,1}_{loc}$ critical subsolution v : M → R
whose restriction to the projected Aubry set is equal to u and such that the mapping
x ↦ dx v is L-Lipschitz on O.
Before proving Theorem 4.3.1, we observe that the following result holds:
Lemma 4.3.2. There is a constant K′ > 0 such that any critical viscosity subsolution
u : M → R is K′-Lipschitz on M , that is,
where $A(1) := \sup_{x\in M} \{L(x,v) \mid \|v\|_x \le 1\}$ is finite thanks to the uniform superlinearity
of H in the fibers. Thus, one has
Proof of Theorem 4.3.1. Let $K_n$ be an increasing sequence of compact sets such that
$K_n \subset \mathring{K}_{n+1}$ and $\bigcup_n K_n = M$. We consider the two Lax-Oleinik semigroups $T_t^-$ and $T_t^+$
defined by
\[
T_t^- u(x) := \inf_{y\in M} \{ u(y) + h_t(y,x) + c(H)t \}, \qquad T_t^+ u(x) := \sup_{y\in M} \{ u(y) - h_t(x,y) - c(H)t \},
\]
for every x ∈ M . In [65], Fathi proved that those two semigroups preserve the set
of critical viscosity subsolutions and that, for all t > 0 and each continuous function
u, the function Tt+ u is locally semi-convex, while Tt− u is locally semi-concave. In [20],
the idea for proving the existence of C 1,1 critical subsolution on compact manifolds is
the following. It is a known fact that a function is C 1,1 if and only if it is both semi-
concave and semi-convex. Let now u be a critical viscosity subsolution. If we apply the
and such that for each x ∈ K4 and each p ∈ D− v(x) there is f ∈ F such that f (x) = v(x)
and dx f = p.
Proof. Let x ∈ K4 be fixed. From the definition of $T_t^+ u(x)$, we have
\[
\begin{aligned}
T_t^+ u(x) &\ge u(x) - h_t(x,x) - c(H)t \\
&\ge u(x) - tL(x,0) - c(H)t \\
&\ge u(x) - (A(0) + c(H))t,
\end{aligned}
\]
where $A(0) := \sup_{x\in M} \{L(x,0)\}$ is finite thanks to the uniform superlinearity of H in the
fibers. On the other hand, by Lemma 4.3.2 and (4.3.1), we have for every y ∈ M ,
\[
\begin{aligned}
u(y) - h_t(x,y) - c(H)t &\le u(x) + K' d(x,y) - 2K' d(x,y) - C(K')t - c(H)t \\
&\le u(x) - K' d(x,y) - (C(K') + c(H))t.
\end{aligned}
\]
Therefore, the supremum in the definition of $T_t^+ u(x)$ is necessarily attained at a point
$y_x \in M$ satisfying
\[
d(x, y_x) \le \frac{A(0) - C(K')}{K'}\, t.
\]
Denote by $K_x$ the set of y ∈ M such that $d(x,y) \le \frac{A(0) - C(K')}{K'}\, t$, and by K the union of
$K_x$ for x ∈ K4 . K is a compact subset of M . By Proposition 6.2.17 of the Appendix,
there is a compact set K̃ ⊂ M and a constant A > 0 such that every curve γ : [0, t] → M
with γ(0), γ(t) ∈ K and $\int_0^t L(\gamma(s), \dot\gamma(s))\, ds = h_t(\gamma(0), \gamma(t))$ satisfies
\[
\gamma([0,t]) \subset \tilde K, \quad\text{and}\quad \|\dot\gamma(s)\|_{\gamma(s)} \le A \quad \forall s \in [0,t]. \tag{4.3.2}
\]
Let x ∈ K4 . By construction of $K_x$, there is $y_x \in K_x$ such that
\[
T_t^+ u(x) = u(y_x) - \int_0^t L(\gamma_x(s), \dot\gamma_x(s))\, ds - c(H)t,
\]
for all $x' \in V_x$. The function $\phi_{x,y_x}$ is smooth and satisfies $\phi_{x,y_x}(x) = 0$. By construction
(because of the compactness of the set K̃), it is clear that the set of functions $\{\phi_{x,y_x}\}_{x\in K_4}$
can be bounded in $C^2\big(\mathring{K}_4\big)$. More generally, the whole family $G := \{\phi_{x,y}\}_{x\in K_4,\, y\in K_x}$ is
bounded in C².
Since K4 is compact, up to working in local charts, and using standard arguments to
extend the C 2 functions of our family constructed in charts to an open neighborhood of
K4 in such a way to preserve a global C 2 bound, we can assume that we are in Rn . Thus,
by [119, Proposition 6] applied to −Tt+ u, we obtain that v = Tt+ u is σ-semiconvex on
K4 , with the constant σ depending only on the C 2 bound of the family G (and therefore
is independent of the subsolution u). Now, by [119, Proposition 7], for any x ∈ K4 and
any p ∈ D− v(x) there exists a parabola Px,p with second derivative bounded by σ which
touches v from below at x with $d_x P_{x,p} = p$. By Lemma 4.3.2 we have the global bound
$\|p\|_x \le K'$, and therefore $F := \{P_{x,p}\}$ is the desired family.
We claim that, for t1 , s1 > 0 small enough, the function $u_1 := T^-_{s_1} T^+_{t_1} u$ is $C^{1,1}$ on K2
and that the Lipschitz constant of its derivative can be bounded independently of u. In
order to prove this claim, we will show that, for s small enough, we have
\[
T_s^- T_t^+ u(x) = \min_{y\in K_3} \big\{ T_t^+ u(y) + h_s(y,x) + c(H)s \big\}, \qquad \forall x \in K_2. \tag{4.3.3}
\]
Once we have proved this, the problem of proving $C^{1,1}$ regularity in K2 will be
exactly the same as in the compact case and so the proof in [20] will work.
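The regularizing effect of the two semigroups can be observed on a toy discretization. The sketch below is only an illustration under strong simplifying assumptions: M is replaced by a uniform grid on an interval, the Lagrangian is L(x, v) = v²/2, so that h_t(x, y) = |x − y|²/(2t) and c(H) = 0, and the sample function u is merely Lipschitz (it is not claimed to be a critical subsolution). The names `T_plus`, `T_minus` and `second_differences` are hypothetical.

```python
def h(t, x, y):
    """Minimal action between x and y in time t for L(x, v) = v**2 / 2 on the line."""
    return (x - y) ** 2 / (2.0 * t)

def T_plus(u, grid, t, c=0.0):
    """Discrete T_t^+ u(x) = max_y { u(y) - h_t(x, y) - c t }."""
    return [max(uy - h(t, x, y) - c * t for y, uy in zip(grid, u)) for x in grid]

def T_minus(u, grid, s, c=0.0):
    """Discrete T_s^- u(x) = min_y { u(y) + h_s(y, x) + c s }."""
    return [min(uy + h(s, y, x) + c * s for y, uy in zip(grid, u)) for x in grid]

def second_differences(f):
    """Centered second differences f[i-1] - 2 f[i] + f[i+1] at interior grid points."""
    return [f[i - 1] - 2.0 * f[i] + f[i + 1] for i in range(1, len(f) - 1)]

dx = 0.02
grid = [i * dx for i in range(-100, 101)]   # uniform grid on [-2, 2]
u = [-abs(x) for x in grid]                 # Lipschitz, with a concave kink at 0

t, s = 0.4, 0.1
v = T_plus(u, grid, t)    # semi-convex: v + x^2/(2t) is convex
w = T_minus(v, grid, s)   # semi-concave: w - x^2/(2s) is concave

print(min(second_differences(v)) >= -dx * dx / t - 1e-12)
print(max(second_differences(w)) <= dx * dx / s + 1e-12)
```

In this quadratic model the one-sided bounds come from writing v(x) + x²/(2t) as a maximum of affine functions of x, and similarly for w; the two-sided C^{1,1} bound for the composition is the part that requires the argument of [20].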
Indeed, always as in [20], for s small enough $T_s^-(F)$ is a bounded subset of $C^2\big(\mathring{K}_3\big)$
and, by (4.3.3), one can write
that implies that Ts− Tt+ u is C 1,1 on K2 . Moreover, we can assume that s is sufficiently
small so that Ts− (F ) is bounded in C 2 by a constant σ 0 which is still independent of u,
and this implies that the C 1,1 bound is independent of the particular subsolution u. Let
us now prove (4.3.3).
Set $v := T_t^+ u$ and fix s > 0. We recall that v is a critical viscosity subsolution.
Since v is K 0 -Lipschitz on M , we deduce that for any x, y ∈ M ,
But, taking y = x in the formula defining Ts− v(x) yields for any x ∈ M ,
Consequently, we deduce that, for every x ∈ M , the infimum in the definition of $T_s^- v(x)$
is necessarily attained at a point $y_x \in M$ satisfying
\[
d(x, y_x) \le \frac{A(0) - C(K')}{K'}\, s.
\]
So (4.3.3) follows taking s such that $\frac{C - c(H)}{A(0) + c(H)}\, s \le \operatorname{dist}(K_2, K_3^c)$. As we said above,
now the proof given in [20] allows us to say that u1 is $C^{1,1}$ in K2 . Let us now define
$u_2(x) := T^-_{s_2} T^+_{t_2} u_1(x)$. Arguing as above we get that, for t2 , s2 small enough, u2 is $C^{1,1}$
in K3 . We now claim that, taking t2 , s2 sufficiently small, we also have that
\[
\operatorname{Lip}_{K_1}(d_x u_2) \le \Big(1 + \frac12\Big) \operatorname{Lip}_{K_2}(d_x u_1), \tag{4.3.4}
\]
for a fixed compact $K_l$ we have a uniform bound for the Lipschitz constant of $d_x u_n$ on
$K_l$ for any n. This implies that $u_\infty \in C^{1,1}_{loc}$. □
Let u1 , u2 : M → R be two fixed weak KAM solutions. It is well known that both the
mappings x ∈ M ↦ dx u1 , x ∈ M ↦ dx u2 coincide and are locally Lipschitz on A (see
[65]). Thus there is a constant C > 0 which does not depend on u1 and u2 such that if
we set v := u1 − u2 , we have
|v(y) − v(x)| ≤ C|x − y|2 , ∀x, y ∈ A ∩ O.
Since u1 and u2 are arbitrary, we get
dM (x, y) ≤ C|x − y|2 , ∀x, y ∈ A ∩ O.
for all t < t′ ∈ R. In particular, each such u is differentiable at each point of γ and
satisfies
\[
d_{\gamma(t)} u = \frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t)), \qquad \forall t \in R
\]
(see [65]). Therefore we deduce that the function v : M → R defined as v := u1 − u2
is constant along the orbits of the Aubry set. Since this is true for any pair of KAM
solutions, this implies that dM (γ(s), γ(t)) = 0 for all s, t ∈ R. As a consequence, it suffices
to prove that the set (A0 ∩ S, dM ) has vanishing one-dimensional Hausdorff measure for
each open surface S ⊂ M which is locally transverse to the orbits of the Aubry set. As
above, this is a consequence of the fact that the mapping x 7→ dx u is locally Lipschitz
on A (with a constant which does not depend on u) and Lemma 4.2.4.
and
\[
\tilde p(x) := \frac{\partial L}{\partial v}(x, 0).
\]
The function H̃ is of class C k,1 on O and satisfies for every x ∈ O,
Moreover, we notice that, by strict convexity of H and the fact that O is relatively
compact, there exists α ≥ 0 such that for every x ∈ O,
\[
H(x,p) \le 0 \implies |p - \tilde p(x)| \le \alpha\sqrt{-\tilde H(x)}. \tag{4.4.1}
\]
A0 = ∪i∈N Ai .
First, since A0 is countable, it has zero Hausdorff dimension. Let us now show that each
Ai has vanishing one-dimensional Hausdorff measure. Let i ≥ 1 be fixed. Since H̃ is a
$C^{k,1}$ function vanishing on $A^0$, by (4.2.1) we know that
\[
-\tilde H(x) = \big| \tilde H(x) - \tilde H(y) \big| \le M_i |x-y|^{k+1}, \qquad \forall y \in A_i,\ \forall x \in B_i.
\]
So θ is something like the Poincaré map (or first return map) associated with γ̄ and S, it
is well defined in a small neighbourhood N ⊂ E of (x̄, v̄) and it is of class C k−1 . Denote
by dSg (·, ·) the distance on S which corresponds to the Riemannian metric induced by g
on S. We recall that the mapping (x, y) 7→ dSg (x, y)2 is smooth in a small neighbourhood
NS of x̄ in S. Without loss of generality we can assume that θ(N ) ⊂ NS . Define the
mapping Ψ : N → R by
\[
\Psi(x,v) := \left( \int_0^{\tau(x,v)} L(\gamma_{x,v}(s), \dot\gamma_{x,v}(s))\, ds + c(H)\tau(x,v) \right)^2 + d_g^S(\theta(x,v), x)^2, \qquad \forall (x,v) \in N.
\]
By construction, there exists δ(x̄) > 0 such that, for every (x, v) ∈ Ãp ∩ N , we have
where T̄ := T (x̄). Denote by Ãx̄ the set of (x, v) ∈ Ãp ∩ N such that T (x) ∈ (T̄ −
δ(x̄), T̄ + δ(x̄)), and by Ax̄ its projection on M . Furthermore, we notice that for every
(x, v) ∈ N , if we consider a minimizing geodesic with unit speed (for the Riemannian
metric g on S) γ : [0, dSg (θ(x, v), x)] → S joining θ(x, v) to x, we have
\[
h_{\rho(x,v)}(x,x) \le \int_0^{\tau(x,v)} L(\gamma_{x,v}(s), \dot\gamma_{x,v}(s))\, ds + \int_0^{d_g^S(\theta(x,v),x)} L(\gamma(s), \dot\gamma(s))\, ds,
\]
Hence, if we denote by J an upper bound for |L(x, v)| with $x \in N_S$ and $v \in T_x S$ satisfying
$|v|_g \le 1$, we obtain for every (x, v) ∈ N ,
\[
\begin{aligned}
h_{\rho(x,v)}(x,x) + c(H)\rho(x,v) &\le \left| \int_0^{\tau(x,v)} L(\gamma_{x,v}(s), \dot\gamma_{x,v}(s))\, ds + c(H)\tau(x,v) \right| + (c(H) + J)\, d_g^S(\theta(x,v), x) \\
&\le (1 + c(H) + J)\sqrt{\Psi(x,v)}.
\end{aligned}
\tag{4.4.2}
\]
Without loss of generality, we can assume from now that we work in Rn . From Lemma
4.2.2, we can decompose the set Ãx̄ as
Proof. Let x ∈ K be such that H(x, dx u) < c(H). For every y ∈ M , we set ε(y) :=
c(H) − H(y, dy u). Since the mapping y ↦ ε(y) is continuous, there exists a constant $C_L$
such that
\[
\varepsilon(y) \le C_L, \qquad \forall y \in K.
\]
Moreover, since y ↦ dy u is L-Lipschitz on O, there exists $K_L > 0$ such that the mapping
y ↦ H(y, dy u) is $K_L$-Lipschitz on O and $\frac{C_L}{2K_L} \le \operatorname{dist}(K, O^c)$. Hence $B\big(x, \frac{\varepsilon(x)}{2K_L}\big) \subset O$
and so we have
\[
H(y, d_y u) \le c(H) - \frac{\varepsilon(x)}{2}, \qquad \forall y \in B\Big(x, \frac{\varepsilon(x)}{2K_L}\Big).
\]
Since K is compact and L is uniformly superlinear in the fibers, there exists an upper
bound A for kwky over the set of (y, w) with y ∈ K such that the corresponding periodic
orbit in T M minimizes ht (y, y) for some t ≥ t0 (this follows directly from the proof of
Proposition 6.2.17 in the Appendix). Let γ : [0, t] → M be such that γ(0) = γ(t) = x
and satisfying
\[
h_t(x,x) = \int_0^t L(\gamma(s), \dot\gamma(s))\, ds.
\]
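Thus we have
\[
u(\gamma(s_0(x))) - u(x) \le \int_0^{s_0(x)} L(\gamma(s), \dot\gamma(s))\, ds + \Big( c(H) - \frac{\varepsilon(x)}{2} \Big) s_0(x).
\]
In consequence, we obtain
\[
\begin{aligned}
0 &\le \int_0^t L(\gamma(s), \dot\gamma(s))\, ds + c(H)t - \frac{\varepsilon(x)}{2}\, s_0(x) \\
&= \{h_t(x,x) + c(H)t\} - \frac{\varepsilon(x)}{2}\, s_0(x).
\end{aligned}
\]
Therefore we have
\[
\{h_t(x,x) + c(H)t\} \ge \frac{\varepsilon(x)}{2}\, s_0(x) = \frac{\varepsilon^2(x)}{4AK_L},
\]
which means that, as long as $s_0(x) \le t_0$,
\[
c(H) - H(x, d_x u) \le 2\sqrt{AK_L}\, \{h_t(x,x) + c(H)t\}^{1/2}, \qquad \forall t \ge t_0.
\]
Thus, in order to conclude, we need to have $s_0(x) = \frac{\varepsilon(x)}{2AK_L} \le t_0$ for all x ∈ K, which is
the case if
\[
\frac{C_L}{2AK_L} \le t_0 \iff AK_L \ge \frac{C_L}{2t_0}.
\]
Then we conclude
\[
c(H) - H(x, d_x u) \le \max\left\{ 2\sqrt{AK_L},\ 2\sqrt{\frac{C_L}{t_0}} \right\} \{h_t(x,x) + c(H)t\}^{1/2}, \qquad \forall t \ge t_0,\ \forall x \in K.
\]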
Returning to the proof of Theorem 4.1.4, we notice that, without loss of general-
ity, we can assume that we work in a compact set K which is included in a relatively
compact open subset O of $R^n$ with n = dim M . So, let us denote by L = L(O) the
uniform Lipschitz constant given by Theorem 4.3.1. Fix from now on a pair of $C^{1,1}_{loc}$ critical
subsolutions v1 , v2 : M → R such that dx v1 , dx v2 are L-Lipschitz on O. We notice that,
by strict convexity of the Hamiltonian, there exists a constant β > 0 such that for every
x ∈ O,
\[
H\Big(x, \frac{p_1 + p_2}{2}\Big) \le \frac{H(x,p_1) + H(x,p_2)}{2} - \beta|p_1 - p_2|^2, \qquad \forall p_1, p_2 \in T_x^*M.
\]
Hence, we have for every x ∈ O,
\[
H\Big(x, \frac{d_x v_1 + d_x v_2}{2}\Big) \le c(H) - \beta|d_x v_1 - d_x v_2|^2.
\]
By Lemma 4.4.1, we deduce that
\[
|d_x v_1 - d_x v_2|^2 \le \frac{\hat M}{\beta}\, \{h_t(x,x) + c(H)t\}^{1/2}, \qquad \forall t \ge \bar T - \delta(\bar x),\ \forall x \in K. \tag{4.4.4}
\]
Let $(x,v), (y,w) \in \tilde A_i$. From Lemma 4.2.3, there is a C¹ path (x(·), v(·)) : [0, 1] → N
in $\tilde B_i$ from (x, v) to (y, w) with length less than $C_i|(x,v) - (y,w)|$. Since dx v1 = L(x, v)
and dy v1 = L(y, w), there is a constant D > 0 such that for every s ∈ [0, 1],
\[
\begin{aligned}
|(x(s), v(s)) - (x,v)| &\le C_i |(x,v) - (y,w)| \\
&\le C_i \sqrt{1 + D^2}\, |x - y|.
\end{aligned}
\]
Hence, by (4.4.3), we have for every s ∈ [0, 1],
\[
\Psi(x(s), v(s)) \le M_i C_i^{k-1} \big(1 + D^2\big)^{\frac{k-1}{2}} |x-y|^{k-1}.
\]
By (4.4.2), this means that for every s ∈ [0, 1],
\[
h_{\rho(x(s),v(s))}(x(s), x(s)) + c(H)\rho(x(s), v(s)) \le (1 + c(H) + J) D_i |x-y|^{\frac{k-1}{2}},
\]
for a certain constant $D_i$. By (4.4.4) we finally deduce that there exists some constant
$E_i > 0$ such that, for every s ∈ [0, 1],
\[
\big| d_{x(s)} v_1 - d_{x(s)} v_2 \big| \le E_i |x-y|^{\frac{k-1}{8}}.
\]
Remark 4.4.3. It can be shown that for every compact subset K ⊂ M , there is a
constant CK > 0 such that
h(x, x) ≤ CK d(x, A)2 , ∀x ∈ K,
where d(x, A) denotes the Riemannian distance from x to the set A (which is assumed
to be nonempty). Therefore, from Theorem 4.4.2, we deduce that if there are l ∈ N and
a function G : M → R of class $C^{k',1}$ with k′ ≥ 2l(dim M − 1) − 1 such that
\[
d(x, A)^l \le G(x), \qquad \forall x \in M,
\]
then (AM , dM ) has vanishing one-dimensional Hausdorff measure.
Theorem 4.6.1. The set Ĩ(u) is invariant under the Euler-Lagrange flow $\phi^L_t$. Moreover,
if (x, v) ∈ Ĩ(u), then dx u exists, and we have
\[
d_x u = \frac{\partial L}{\partial v}(x,v) \quad\text{and}\quad H(x, d_x u) = c(H).
\]
It follows that:
Moreover the following results hold (see [65] or [63, Théorème 1]).
Theorem 4.6.2. 1) Two weak KAM solutions that coincide on A are equal everywhere.
2) For every u ∈ SS, there is a unique weak KAM solution u− : M → R such that
u− = u on A; moreover, the two functions u and u− are also equal on I(u).
(the equivalence between the definition with SS and the one with S− can be easily shown
from the results of [65]). The projected Aubry set A is simply the image π(Ã). We also
have
\[
A := \bigcap_{u \in SS} I(u) = \bigcap_{u \in S_-} I(u).
\]
Both à and Ñ are compact subsets of T M invariant under the Euler-Lagrange flow φLt
of L.
Theorem 4.6.3 (Mañé). Each point of the invariant set à is chain-recurrent for the
restriction φLt |Ã . Moreover, the invariant set Ñ is chain-transitive for the restriction
φLt |Ñ .
Corollary 4.6.4. The restriction φLt |à to the invariant subset à is chain-transitive if
and only if à is connected.
Proof. This is an easy well-known result in the theory of Dynamical Systems: Suppose
θt , t ∈ R, is a flow on the compact metric space X. If every point of X is chain-recurrent
for θt , then θt is chain-transitive if and only if X is connected.
We give now the general relationship between uniqueness of weak KAM solutions and
the quotient Mather set.
Proof. For every fixed x ∈ M , the function y 7→ h(x, y) is a weak KAM solution.
Therefore if we assume that any two weak KAM solutions differ by a constant, then for
x1 , x2 ∈ M we can find a constant Cx1 ,x2 such that
This implies
∀x1 , x2 ∈ A, h(x1 , x2 ) + h(x2 , x1 ) = 0.
Which means that dM (x1 , x2 ) = 0, for every x1 , x2 ∈ A.
To prove the converse, let us recall that for every critical subsolution u, we have
à = Ĩ(u) = Ñ ,
where u is any element in S− . But, by Mañé’s Theorem 4.6.3, the invariant set Ñ is
chain-transitive for the flow φt , hence it is connected by Corollary 4.6.4.
We now denote by XL the Euler-Lagrange vector field of L, that is the vector field
on T M that generates φLt . We recall that an important property of XL is that
Notice that by part 2) of Theorem 4.6.2, if L satisfies the strong Mather condition,
then for every pair of critical sub-solutions u1 , u2 , the image (u1 − u2 )(A) ⊂ R is also of
Lebesgue measure 0. By the results proved in this chapter, we get:
(4) The Lagrangian is of class Ck,1 , with k ≥ 2 dim M − 3, and every point of à is
fixed under the Euler-Lagrange flow φLt .
(5) The Lagrangian is of class Ck , with k ≥ 8 dim M − 7, and each point of à either
is fixed under the Euler-Lagrange flow φLt or its orbit in the Aubry set is periodic
with (strictly) positive period.
Proof. First of all, we recall that, from Theorem 4.6.3, each point of A is chain-recurrent
for the restriction φLt |Ã . By [69, Theorem 1.5], we can find a C1 critical viscosity
subsolution u1 : M → R which is strict outside A, i.e. for every x ∉ A we have
H(x, dx u1 ) < c(H). We define θ on T M by θ = (u1 − u) ◦ π. By Proposition 4.6.1, we
know that at each point (x, v) of Ĩ(u) the derivative of θ exists and depends continuously
on (x, v) ∈ Ĩ(u). By Proposition 4.6.6, at each point of (x, v) of Ĩ(u), we have
with the last inequality an equality if and only if dx u1 = dx u, and this implies H(x, dx u1 ) =
c(H). Since u1 is strict outside A, we conclude that XL · θ < 0 on Ĩ(u) \ Ã. Suppose
that (x0 , v0 ) ∈ Ĩ(u) \ Ã. By invariance of both à and Ĩ(u), every point on the orbit
φLt (x0 , v0 ), t ∈ R is also contained in Ĩ(u) \ Ã, therefore t 7→ c(t) := θ(φt (x0 , v0 )) is
(strictly) decreasing , and so we have c(1) < c(0). Observe now that θ(Ã) = (u1 − u)(A)
has measure 0 by the strong Mather condition, therefore we can find c ∈]c(1), c(0)[\θ(Ã).
By what we have seen, the directional derivative XL · θ is < 0 at every point of the level
set Lc = {(x, v) ∈ Ĩ(u) | θ(x, v) = c}. Since θ is everywhere non-increasing on the orbits
of φLt and XL · θ < 0 on Lc , we get
Consider the compact set Kc = {(x, v) ∈ Ĩ(u) | θ(x, v) ≤ c}. Using again that θ is
non-increasing on the orbits of φLt |Ĩ(u) , we have
We fix now some metric on Ĩ(u) defining its topology. We consider then the compact set
$\phi^L_1(K_c)$. It is contained in the open set $K_c \setminus L_c$ = {(x, v) ∈ Ĩ(u) | θ(x, v) < c}. We can
therefore find ε > 0 such that the ε-neighborhood $V_\varepsilon(\phi^L_1(K_c))$ of $\phi^L_1(K_c)$ in Ĩ(u) is also
contained in $K_c$. Since for t ≥ 1, we have $\phi^L_{t-1}(K_c) \subset K_c$, and therefore $\phi^L_t(K_c) \subset \phi^L_1(K_c)$,
it follows that
\[
V_\varepsilon\Big( \bigcup_{t\ge 1} \phi^L_t(K_c) \Big) \subset K_c.
\]
It is now easy to conclude that every ε-pseudo orbit for $\phi^L_t|_{\tilde I(u)}$ that starts in $K_c$ remains
in $K_c$. Since $\theta(\phi^L_1(x_0, v_0)) = c(1) < c < c(0) = \theta(x_0, v_0)$, no α-pseudo orbit starting at
(x0 , v0 ) can return to (x0 , v0 ), for α ≤ ε such that the ball of center $\phi^L_1(x_0, v_0)$ and radius
α, in Ĩ(u), is contained in $K_c$. Therefore (x0 , v0 ) cannot be chain recurrent.
Theorem 4.6.10. Let L be a Tonelli Lagrangian on the compact manifold M . If L
satisfies the strong Mather condition, then the following statements are equivalent:
(1) The Aubry set Ã, or its projection A, is connected.
(2) The Aubry set à is chain-transitive for the restriction of the Euler-Lagrange flow
φLt |Ã .
(5) There exists u ∈ SS such that Ĩ(u) is chain-recurrent for the restriction φt |Ĩ(u) of
the Euler-Lagrange flow.
Proof. From Corollary 4.6.4, we know that (1) and (2) are equivalent.
If (1) is true then for u1 , u2 ∈ S− , the image (u1 − u2 )(A) is a sub-interval of R, but
by the strong Mather condition, it is also of Lebesgue measure 0, therefore u1 − u2 is
constant. Hence (1) implies (3).
If (3) is true then (4) follows from Proposition 4.6.5.
Suppose now that (4) is true. Since for every u ∈ SS, we have à ⊂ Ĩ(u) ⊂ Ñ , we
obtain Ĩ(u) = Ñ . But Ñ is chain-transitive for the restriction φLt |Ñ . Hence (4) implies
(5).
If (5) is true for some u ∈ SS, then every point of Ĩ(u) is chain-recurrent for the
restriction φLt |Ĩ(u) . Lemma 4.6.9 then implies that à = Ĩ(u), and we therefore satisfy
(2).
Recent research activity has been devoted to study transport equations with rough
coefficients, showing that a well-posedness result for the transport equation in a certain
subclass of functions allows to prove existence and uniqueness of a flow for the associated
ODE. The first result in this direction is due to DiPerna and P.-L.Lions [56], where
the authors study the connection between the transport equation and the associated
ODE γ̇ = b(t, γ), showing that existence and uniqueness for the transport equation is
equivalent to a sort of well-posedness of the ODE which says, roughly speaking, that
the ODE has a unique solution for $\mathscr{L}^d$-almost every initial condition (here and in the
sequel, $\mathscr{L}^d$ denotes the Lebesgue measure in $R^d$). In that paper they also show that the
transport equation $\partial_t u + \sum_i b_i \partial_i u = c$ is well-posed in $L^\infty$ if b = (b1 , . . . , bn ) is Sobolev
and satisfies suitable global conditions (including $L^\infty$-bounds on the spatial divergence),
which yields the well-posedness of the ODE.
In [4] (see also [5]), using a slightly different philosophy, Ambrosio studied the connection
between the continuity equations $\partial_t u + \sum_i \partial_i(b_i u) = c$ and the ODE γ̇ = b(t, γ).
This different approach allows him to develop the general theory of the so-called Regular
Lagrangian Flows (see [5, Remark 31] for a detailed comparison with the DiPerna-Lions
axiomatization), which relates existence and uniqueness for the continuity equation with
well-posedness of the ODE, without assuming any regularity on the vector field b. Indeed,
since the transport equation is in a conservative form, it has a meaning in the sense of
distributions even when b is only $L^\infty_{loc}$ and u is $L^1_{loc}$. Thus, a general theory is developed
in [4] under very general hypotheses, showing as in [56] that existence and uniqueness
for the continuity equation is equivalent to a sort of well-posedness of the ODE. After
having proved this, in [4] the well-posedness of the continuity equations in $L^\infty$ is proved
in the case of vector fields with BV regularity whose distributional divergence belongs to
$L^\infty$ (for other similar results on the well-posedness of the transport/continuity equation,
see also [47, 48, 90, 83]).
1 This chapter is based on the work in [77].
Our aim is to develop a stochastic counterpart of this theory: in our setting the conti-
nuity equation becomes the Fokker-Planck equation, while the ODE becomes an SDE.
Let us consider the following SDE
\[
\begin{cases}
dX = b(t,X)\,dt + \sigma(t,X)\,dB(t),\\
X(0) = x,
\end{cases}
\tag{5.1.1}
\]
where $b : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : [0,T]\times\mathbb{R}^d \to \mathscr{L}(\mathbb{R}^r,\mathbb{R}^d)$ are bounded (here $\mathscr{L}(\mathbb{R}^r,\mathbb{R}^d)$
denotes the vector space of linear maps from $\mathbb{R}^r$ to $\mathbb{R}^d$) and B is an r-dimensional
Brownian motion on a probability space $(\Omega,\mathcal{A},\mathbb{P})$. We want to study the existence and
uniqueness of martingale solutions for this equation. Let us define $a(t,x) := \sigma(t,x)\sigma^*(t,x)$
(that is, $a_{ij} := \sum_k \sigma_{ik}\sigma_{jk}$). We consider the so-called Fokker-Planck equation
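\[
\begin{cases}
\partial_t \mu_t + \sum_i \partial_i (b_i \mu_t) - \frac12 \sum_{ij} \partial_{ij}(a_{ij}\mu_t) = 0 & \text{in } [0,T]\times\mathbb{R}^d,\\
\mu_0 = \bar\mu & \text{in } \mathbb{R}^d.
\end{cases}
\tag{5.1.2}
\]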
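As a small illustration of the definition $a := \sigma\sigma^*$ used in (5.1.2), the following sketch computes $a$ from a given $\sigma$ at one fixed point $(t,x)$ and evaluates the quadratic form $\langle \xi, a\xi\rangle$, which is non-negative because it equals $|\sigma^*\xi|^2$. The matrix `sigma` (with d = 2, r = 3) and the helper names are made up for the example and do not come from the text.

```python
# Illustrative sketch: a = sigma sigma^* at one fixed (t, x).
# The entries of `sigma` are arbitrary example values (d = 2, r = 3).

def diffusion_matrix(sigma):
    """Return a with a[i][j] = sum_k sigma[i][k] * sigma[j][k]."""
    d = len(sigma)
    r = len(sigma[0])
    return [[sum(sigma[i][k] * sigma[j][k] for k in range(r)) for j in range(d)]
            for i in range(d)]

def quadratic_form(a, xi):
    """Return <xi, a xi>."""
    d = len(a)
    return sum(xi[i] * a[i][j] * xi[j] for i in range(d) for j in range(d))

sigma = [[1.0, 0.0, 2.0],
         [0.0, 1.0, -1.0]]
a = diffusion_matrix(sigma)          # symmetric by construction
value = quadratic_form(a, [1.0, -3.0])  # equals |sigma^T xi|^2, hence >= 0
```

The symmetry and non-negativity seen here are exactly the properties encoded later by requiring $a$ to take values in $S_+(\mathbb{R}^d)$.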
We recall that, for a (possibly signed) measure $\mu = \mu(t,x) = \mu_t(x)$, being a solution of
(5.1.2) simply means that
\[
\frac{d}{dt}\int_{\mathbb{R}^d} \varphi(x)\,d\mu_t(x) = \int_{\mathbb{R}^d} \Big[\sum_i b_i(t,x)\partial_i\varphi(x) + \frac12\sum_{ij} a_{ij}(t,x)\partial_{ij}\varphi(x)\Big]\,d\mu_t(x) \qquad \forall \varphi \in C_c^\infty(\mathbb{R}^d)
\tag{5.1.3}
\]
in the distributional sense on [0, T], and the initial condition means that $\mu_t$ $w^*$-converges
to $\bar\mu$ (i.e. converges in the duality with $C_c(\mathbb{R}^d)$) as $t \to 0$. We observe that, since the
equation (5.1.2) is in divergence form, it makes sense without any regularity assumption
on a and b, provided that
\[
\int_0^T\!\!\int_A \big(|b(t,x)| + |a(t,x)|\big)\,d|\mu_t|(x)\,dt < +\infty \qquad \forall A \subset\subset \mathbb{R}^d
\]
(here and in the sequel, $|\mu_t|$ denotes the total variation of $\mu_t$). Since b and a will always
be assumed to be bounded, in the definition of measure-valued solution of the PDE we
assume that
\[
\int_0^T |\mu_t|(A)\,dt < +\infty \qquad \forall A \subset\subset \mathbb{R}^d, \tag{5.1.4}
\]
so that (5.1.2) surely makes sense. However, if µt is singular with respect to the Lebesgue
measure L d , then the products b(t, ·)µt and a(t, ·)µt are sensitive to modification of b(t, ·)
and a(t, ·) in L d -negligible sets. Since in the case of singular measures the coefficients
a and b will be assumed to be continuous, while in the case of coefficients in L∞ the
measures will be assumed to be absolutely continuous, (5.1.2) will always make sense.
Recall also that it is not restrictive to consider only solutions $t \mapsto \mu_t$ of the Fokker-Planck
equation that are $w^*$-continuous on [0, T], i.e. continuous in the duality with $C_c(\mathbb{R}^d)$ (see
Lemma 5.2.1). Thus, we can assume that µt is defined for all t and even at the endpoints
of [0, T ].
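The weak formulation (5.1.3) can be checked numerically in the simplest possible situation. The sketch below assumes d = 1 and constant coefficients b and a (illustrative values, not taken from the text), so that the law of the solution started from a point $x_0$ is the Gaussian $N(x_0 + bt,\, at)$. As test function it uses $\varphi(x) = e^{-x^2}$, which is smooth and rapidly decaying but not compactly supported; this is an additional simplifying assumption. The time derivative of $\int \varphi\,d\mu_t$ is approximated by a central difference and compared with $\int (b\varphi' + \tfrac12 a\varphi'')\,d\mu_t$.

```python
import math

B_DRIFT = 0.3   # constant drift b (illustrative value)
A_DIFF = 0.5    # constant diffusion coefficient a = sigma^2 (illustrative value)
X0 = 0.2        # starting point of the SDE (illustrative value)

def phi(x):
    return math.exp(-x * x)

def generator_phi(x):
    """L phi = b phi' + (1/2) a phi'' for phi(x) = exp(-x^2)."""
    first = -2.0 * x * math.exp(-x * x)
    second = (4.0 * x * x - 2.0) * math.exp(-x * x)
    return B_DRIFT * first + 0.5 * A_DIFF * second

def gaussian_average(f, t, n=20000, width=12.0):
    """Trapezoidal approximation of the integral of f against N(X0 + b t, a t)."""
    mean = X0 + B_DRIFT * t
    std = math.sqrt(A_DIFF * t)
    lo = mean - width * std
    h = 2.0 * width * std / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        density = math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
        total += weight * f(x) * density
    return total * h

def weak_form_residual(t, dt=1e-4):
    """d/dt <mu_t, phi> (central difference) minus <mu_t, L phi>."""
    lhs = (gaussian_average(phi, t + dt) - gaussian_average(phi, t - dt)) / (2.0 * dt)
    rhs = gaussian_average(generator_phi, t)
    return lhs - rhs

residual = weak_form_residual(0.7)   # should be close to zero
```

The residual is only as small as the finite-difference and quadrature errors allow; the sketch is meant to make the meaning of (5.1.3) concrete, not to replace the distributional definition.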
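For simplicity of notation, we define
\[
L_t := \sum_i b_i(t,\cdot)\partial_i + \frac12\sum_{ij} a_{ij}(t,\cdot)\partial_{ij},
\]
so that (5.1.2) can be written as
\[
\partial_t \mu_t = L_t^*\mu_t \qquad \text{in } [0,T]\times\mathbb{R}^d,
\]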
where $L_t^*$ denotes the (formal) adjoint of $L_t$ in $L^2(\mathbb{R}^d)$. Using Itô's formula it is simple
to check that, if $X(t,x,\omega) \in L^2(\Omega, C([0,T],\mathbb{R}^d))$ is a family of solutions of (5.1.1),
measurable in $(t,x,\omega)$, then the measure $\mu_t$ defined by
\[
\int f(x)\,d\mu_t(x) := \int E[f(X(t,x,\omega))]\,d\mu(x) \qquad \forall f \in C_c(\mathbb{R}^d)
\]
In the sequel, we will deal with families {νx }x∈Rd of probability measures that are
measurable with respect to x according to the following standard definition.
and in the sequel, S+ (Rd ) denotes the set of symmetric and non-negative definite d × d
matrices).
Theorem 5.1.3. Let us assume that a : [0, T ] × Rd → S+ (Rd ) and b : [0, T ] × Rd → Rd
are bounded functions such that:
1. $\sum_j \partial_j a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$ for i = 1, . . . , d,
• Conclusions
In Section 5.5 we apply the theory developed in Paragraph 5.3.1 to obtain, in the cases
considered above, the generic well-posedness of the associated SDE.
Finally, in the last section we generalize an important uniqueness result of Stroock
and Varadhan (see Theorem 5.2.2 and the remarks at the end of Theorem 5.5.4).
that is (et )# νx = (et )# ν̃x (observe in particular that, if A = Rd and we have uniqueness
for the PDE for any initial time s ≥ 0, by Theorem 5.2.2 we get that νx = ν̃x for any
x ∈ Rd ).
(a) ⇒ (b): this implication follows by Theorem 5.2.7, which provides, for every finite
non-negative measure-valued solution of the PDE, the representation
\[
\int_{\mathbb{R}^d} \varphi\,d\mu_t = \int_{\mathbb{R}^d\times\Gamma_T} \varphi(\gamma(t))\,d\nu_x(\gamma)\,d\mu_0(x), \tag{5.2.1}
\]
where, for $\mu_0$-a.e. x, $\nu_x$ is a martingale solution of the SDE starting from x (at time 0).
Therefore, by the uniqueness of $(e_t)_\#\nu_x$, we obtain that solutions of the PDE are unique.
□
We now prove that, if νx is a martingale solution of the SDE starting from x (at time
0) for µ0 -a.e. x, the right hand side of (5.2.1) always defines a non-negative solution of
the PDE. We recall that a locally finite measure in Rd is a possibly signed measure with
locally finite total variation.
Lemma 5.2.4. Let $\mu_0$ be a locally finite measure on $\mathbb{R}^d$, and let $\{\nu_x\}_{x\in\mathbb{R}^d}$ be a measurable
family of probability measures on $\Gamma_T$ such that $\nu_x$ is a martingale solution of the SDE
starting from x (at time 0) for $|\mu_0|$-a.e. x. Define on $\Gamma_T$ the measure $\nu := \int_{\mathbb{R}^d} \nu_x\,d\mu_0(x)$,
and assume that
\[
\int_0^T\!\!\int_{\mathbb{R}^d\times\Gamma_T} \chi_{B_R}(\gamma(t))\,d\nu_x(\gamma)\,d|\mu_0|(x)\,dt < +\infty \qquad \forall R > 0. \tag{5.2.2}
\]
Remark 5.2.5. Property (5.2.2) is trivially true if, for example, $|\mu_0|(\mathbb{R}^d) < +\infty$.
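Proof. Let us first show that the map $t \mapsto \langle \mu_t^\nu, \varphi\rangle$ is absolutely continuous for any
$\varphi \in C_c^\infty(\mathbb{R}^d)$. We recall that a real valued map $t \mapsto f(t)$ is said to be absolutely continuous
if, for any ε > 0 there exists δ > 0 such that, given any family of disjoint intervals
$(s_k, t_k) \subset [0,T]$, the following implication holds:
\[
\sum_k |t_k - s_k| \le \delta \quad\Rightarrow\quad \sum_k |f(t_k) - f(s_k)| \le \varepsilon.
\]
Take R > 0 such that $\mathrm{supp}(\varphi) \subset B_R$, and let $I = \cup_{k=1}^n (s_k, t_k)$ be a subset of [0, T] with
$(s_k, t_k)$ disjoint and such that $|t_k - s_k| \le 1$. For $\mu_0$-a.e. x, by the definition of martingale
solution we have
\begin{align*}
\int_{\Gamma_T} \varphi(\gamma(t_k))\,d\nu_x(\gamma) - \int_{\Gamma_T} \varphi(\gamma(s_k))\,d\nu_x(\gamma)
&= \int_{s_k}^{t_k}\!\!\int_{\Gamma_T} L_t\varphi(\gamma(t))\,d\nu_x(\gamma)\,dt \\
&= \int_{s_k}^{t_k}\!\!\int_{\Gamma_T} \sum_i b_i(t,\gamma(t))\partial_i\varphi(\gamma(t))\,d\nu_x(\gamma)\,dt \\
&\quad + \frac12\int_{s_k}^{t_k}\!\!\int_{\Gamma_T} \sum_{ij} a_{ij}(t,\gamma(t))\partial_{ij}\varphi(\gamma(t))\,d\nu_x(\gamma)\,dt.
\end{align*}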
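Thus
\[
\sum_{k=1}^n |\langle \mu_{t_k}^\nu, \varphi\rangle - \langle \mu_{s_k}^\nu, \varphi\rangle| \le \|\varphi\|_{C^2}\Big[\|b\|_\infty + \frac12\|a\|_\infty\Big] \sum_{k=1}^n \int_{s_k}^{t_k}\!\!\int_{\mathbb{R}^d\times\Gamma_T} \chi_{B_R}(\gamma(t))\,d\nu_x(\gamma)\,d|\mu_0|(x)\,dt,
\]
which shows that the map $t \mapsto \langle \mu_t^\nu, \varphi\rangle$ is absolutely continuous thanks to (5.2.2) and the
absolute continuity property of the integral. So, in order to conclude that $\mu_t^\nu$ solves the
PDE, it suffices to compute the time derivative of $t \mapsto \langle \mu_t^\nu, \varphi\rangle$, and, by the computation
we made above, one simply gets
\begin{align*}
\frac{d}{dt}\langle \mu_t^\nu, \varphi\rangle &= \int_{\mathbb{R}^d} \Big(\frac{d}{dt}\int_{\Gamma_T} \varphi(\gamma(t))\,d\nu_x(\gamma)\Big)\,d\mu_0(x) \\
&= \int_{\mathbb{R}^d}\int_{\Gamma_T} L_t\varphi(\gamma(t))\,d\nu_x(\gamma)\,d\mu_0(x) = \langle \mu_t^\nu, L_t\varphi\rangle.
\end{align*}
□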
Remark 5.2.6. We observe that, by the definition of $\mu_t^\nu$, the following implications hold:
1. $\mu_0 \ge 0 \Rightarrow \forall t \ge 0$, $\mu_t^\nu \ge 0$ and $\mu_t^\nu(\mathbb{R}^d) = \mu_0(\mathbb{R}^d)$ (the total mass can also be infinite);
2. $\mu_0$ signed $\Rightarrow \forall t \ge 0$, $|\mu_t^\nu|(\mathbb{R}^d) \le |\mu_0|(\mathbb{R}^d)$ (the total variation can also be infinite).
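Step 2: tightness. It is clear that the measures $\mu_0^\varepsilon = \mu_0 * \rho_\varepsilon$ are tight. So, if we define
$\nu^\varepsilon := \int_{\mathbb{R}^d} \nu_x^\varepsilon\,d\mu_0^\varepsilon$, we have
For any $\varphi \in C_c^\infty(\mathbb{R}^d)$, let us define $A_\varphi := \|\varphi\|_{C^2}\big[\|b\|_\infty + \frac12\|a\|_\infty\big]$. Since for every
$\varphi \in C_c^\infty(\mathbb{R}^d)$ and any 0 < ε < 1
\[
\varphi(\gamma(t)) - \int_0^t \Big(\sum_i b_i^\varepsilon(u,\gamma(u))\partial_i\varphi(\gamma(u)) + \frac12\sum_{ij} a_{ij}^\varepsilon(u,\gamma(u))\partial_{ij}\varphi(\gamma(u))\Big)\,du
\]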
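Since each $\nu_x^{\varepsilon_n}$ is a martingale solution, we know that for any t ∈ [s, T] and for any
$\varphi \in C_c^\infty(\mathbb{R}^d)$
\begin{align*}
&\int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(t)) - \int_0^t L_u^n\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x) \\
&\qquad = \int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(s)) - \int_0^s L_u^n\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x)
\end{align*}
\[
\tilde L_t^n := \sum_i \tilde b_i^{\varepsilon_n}(t,\cdot)\partial_i + \frac12\sum_{ij} \tilde a_{ij}^{\varepsilon_n}(t,\cdot)\partial_{ij},
\]
where $\tilde b_i^{\varepsilon_n}$ and $\tilde a_{ij}^{\varepsilon_n}$ are defined analogously to $b_i^{\varepsilon_n}$ and $a_{ij}^{\varepsilon_n}$. Thus we can write
\begin{align*}
&\int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(t)) - \varphi(\gamma(s)) - \int_s^t \tilde L_u^n\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x) \\
&\qquad = \int_{\mathbb{R}^d\times\Gamma_T} \Big[\int_s^t (L_u^n - \tilde L_u^n)\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x).
\end{align*}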
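\begin{align*}
&\Big|\int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(t)) - \varphi(\gamma(s)) - \int_s^t \tilde L_u^n\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x)\Big| \\
&\quad \le \int_{\mathbb{R}^d\times\Gamma_T} \Big[\int_s^t \big|(L_u^n - \tilde L_u^n)\varphi(\gamma(u))\big|\,du\Big]\Phi_s(\gamma)\,d\nu_x^{\varepsilon_n}(\gamma) f(x)\,d\mu_0^{\varepsilon_n}(x) \\
&\quad \le \int_{\mathbb{R}^d\times\Gamma_T} \Big[\int_s^t \big|(L_u^n - \tilde L_u^n)\varphi(\gamma(u))\big|\,du\Big]\,d\nu_x^{\varepsilon_n}(\gamma)\,d\mu_0^{\varepsilon_n}(x) \\
&\quad = \int_s^t\!\!\int_{\mathbb{R}^d} \big|(L_u^n - \tilde L_u^n)\varphi(x)\big|\,d\mu_u^{\varepsilon_n}(x)\,du \\
&\quad \le \sum_i \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\Big(\frac{(b_i(u,\cdot)\mu_u)*\rho_{\varepsilon_n}}{\mu_u^{\varepsilon_n}} - \frac{(\tilde b_i(u,\cdot)\mu_u)*\rho_{\varepsilon_n}}{\mu_u^{\varepsilon_n}}\Big)\partial_i\varphi\Big|(x)\,d\mu_u^{\varepsilon_n}(x)\,du \\
&\qquad + \frac12\sum_{ij} \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\Big(\frac{(a_{ij}(u,\cdot)\mu_u)*\rho_{\varepsilon_n}}{\mu_u^{\varepsilon_n}} - \frac{(\tilde a_{ij}(u,\cdot)\mu_u)*\rho_{\varepsilon_n}}{\mu_u^{\varepsilon_n}}\Big)\partial_{ij}\varphi\Big|(x)\,d\mu_u^{\varepsilon_n}(x)\,du \\
&\quad \le \sum_i \int_s^t\!\!\int_{\mathbb{R}^d} |b_i(u,\cdot) - \tilde b_i(u,\cdot)|(x)\,\partial_i\varphi*\rho_{\varepsilon_n}(x)\,d\mu_u(x)\,du \\
&\qquad + \frac12\sum_{ij} \int_s^t\!\!\int_{\mathbb{R}^d} |a_{ij}(u,\cdot) - \tilde a_{ij}(u,\cdot)|(x)\,\partial_{ij}\varphi*\rho_{\varepsilon_n}(x)\,d\mu_u(x)\,du.
\end{align*}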
Since ã and b̃ are continuous, ãεn and b̃εn converge to ã and b̃ locally uniformly. So we
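Choosing two sequences of continuous functions $(\tilde b_k)_{k\in\mathbb{N}}$ and $(\tilde a_k)_{k\in\mathbb{N}}$ converging
respectively to b and a in $L^1([0,T]\times\mathbb{R}^d, \eta)$, with $\eta := \int_0^T \mu_t\,dt$, we finally obtain
\[
\int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(t)) - \varphi(\gamma(s)) - \int_s^t L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) f(x)\,d\mu_0(x) = 0,
\]
that is
\begin{align*}
&\int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(t)) - \int_0^t L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) f(x)\,d\mu_0(x) \\
&\qquad = \int_{\mathbb{R}^d\times\Gamma_T} \Big[\varphi(\gamma(s)) - \int_0^s L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) f(x)\,d\mu_0(x).
\end{align*}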
By the arbitrariness of f we get that, for any 0 ≤ s ≤ t ≤ T, and for any $\mathcal{F}_s$-measurable
function $\Phi_s$, we have
\begin{align*}
&\int_{\Gamma_T} \Big[\varphi(\gamma(t)) - \int_0^t L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) \\
&\qquad = \int_{\Gamma_T} \Big[\varphi(\gamma(s)) - \int_0^s L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) \qquad \text{for } \mu_0\text{-a.e. } x
\end{align*}
(here the $\mu_0$-a.e. depends on s and t but not on $\Phi_s$).
Taking now s, t ∈ [0, T] ∩ Q, we deduce that, for $\mu_0$-a.e. x,
\begin{align*}
&\int_{\Gamma_T} \Big[\varphi(\gamma(t)) - \int_0^t L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma) \\
&\qquad = \int_{\Gamma_T} \Big[\varphi(\gamma(s)) - \int_0^s L_u\varphi(\gamma(u))\,du\Big]\Phi_s(\gamma)\,d\nu_x(\gamma)
\end{align*}
for any s, t ∈ [0, T] ∩ Q and for any $\mathcal{F}_s$-measurable function $\Phi_s$. By the continuity of the
above equality with respect to both s and t, and the continuity in time of the filtration
$\mathcal{F}_s$, we conclude that $\nu_x$ is a martingale solution for $\mu_0$-a.e. x. □
(this result can also be proved more directly using as test functions in (5.1.2) a suitable
sequence $(\varphi_n)_{n\in\mathbb{N}} \subset C_c^\infty(\mathbb{R}^d)$, with $0 \le \varphi_n \le 1$ and $\varphi_n \nearrow 1$, and, even in the case
when the measures $\mu_t$ are signed, under the assumption $|\mu_t|(\mathbb{R}^d) \le C$ one obtains the
constancy of the map $t \mapsto \mu_t(\mathbb{R}^d)$).
1. for µ0 -a.e. x, νx is a martingale solution of the SDE starting from x (at time 0);
More generally, one can analogously define a $\mu_0$-SLF starting at time s with s ∈ (0, T)
by requiring that $\nu_x$ is a martingale solution of the SDE starting from x at time s.
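Remark 5.3.2. If $\{\nu_x\}_{x\in\mathbb{R}^d}$ is a $\mu_0$-SLF, then it is also a $\mu_0'$-SLF for any $\mu_0' \in \mathcal{M}_+(\mathbb{R}^d)$
with $\mu_0' \le C\mu_0$. Indeed, this easily follows by the inequality
\[
0 \le \int_{\mathbb{R}^d} (e_t)_\#\tilde\nu_x\,d\mu_0'(x) \le C\int_{\mathbb{R}^d} (e_t)_\#\tilde\nu_x\,d\mu_0(x).
\]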
and
\[
\mathcal{L}_+ := \big\{ u \in L^\infty([0,T], L^1_+(\mathbb{R}^d)) \cap L^\infty([0,T], L^\infty_+(\mathbb{R}^d)) \;\big|\; u \in C([0,T], w^*\text{-}L^\infty(\mathbb{R}^d)) \big\}.
\]
Under an existence and uniqueness result for the PDE in the class L+ , we prove existence
and uniqueness of SLF.
Theorem 5.3.3 (Existence of SLF starting from a fixed measure). Let us suppose
that, for some initial datum µ0 = ρ0 L d ∈ M+ (Rd ), with ρ0 ∈ L∞ (Rd ), there exists a
solution of the PDE in L+ . Then there exists a µ0 -SLF.
Proof. It suffices to apply Theorem 5.2.7 to the solution of the PDE in L+ . ¤
Let us assume now that forward uniqueness for the PDE holds in the class $\mathcal{L}_+$ for
any initial time, that is, for any s ∈ [0, T], for any $\rho_s \in L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d)$, if we denote
by $\rho_t\mathscr{L}^d$ and $\tilde\rho_t\mathscr{L}^d$ two solutions of the PDE in the class $\mathcal{L}_+$ starting from $\rho_s\mathscr{L}^d$ at time
s, then
\[
\rho_t = \tilde\rho_t \qquad \text{for any } t \in [s,T].
\]
Before stating and proving our main theorem, we first introduce some notation that
will be used also in the last section.
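Let $\mathcal{B}$ be the Borel σ-algebra on $\Gamma_T = C([0,T],\mathbb{R}^d)$, and define the filtrations $\mathcal{F}_t :=
\sigma[e_s \mid 0 \le s \le t]$ and $\mathcal{F}^t := \sigma[e_s \mid t \le s \le T]$. Let $\mathcal{P}(\Gamma_T)$ be the set of probability measures
on $\Gamma_T$. Now, given $\nu \in \mathcal{P}(\Gamma_T)$, we denote by
\[
\Gamma_T \ni \gamma \mapsto \nu^\gamma_{\mathcal{F}_t} \in \mathcal{P}(\Gamma_T)
\]
-
\[
\nu(A \cap B) = \int_A \nu^\gamma_{\mathcal{F}_t}(B)\,d\nu(\gamma) \qquad \forall A \in \mathcal{F}_t,\ \forall B \in \mathcal{B}. \tag{5.3.1}
\]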
Since $\Gamma_T$ is a Polish space and every σ-algebra $\mathcal{F}_t$ is countably generated, such a function
exists and is unique, up to ν-null sets. In particular, up to changing this function in a
ν-null set, the following fact holds:
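\[
\nu^\gamma_{\mathcal{M}^{t_1,\dots,t_n}}\big(\{\tilde\gamma \mid \tilde\gamma(t_i) = \gamma(t_i)\ \forall i = 1,\dots,n\}\big) = 1 \qquad \forall\gamma \in \Gamma_T. \tag{5.3.3}
\]
If $\gamma(t_i) = x_i$ for i = 1, . . . , n, then we will also use the notation $\nu^\gamma_{\mathcal{M}^{t_1,\dots,t_n}} = \nu^{x_1,\dots,x_n}_{\mathcal{M}^{t_1,\dots,t_n}}$.
By (5.3.1) one can check that $\int_{\Gamma_T} \nu^{\tilde\gamma}_{\mathcal{F}_{t_n}}\,d\nu^\gamma_{\mathcal{M}^{t_1,\dots,t_n}}(\tilde\gamma)$ is a regular conditional probability
distribution of ν given $\mathcal{M}^{t_1,\dots,t_n}$, which implies by uniqueness that
\[
\nu^\gamma_{\mathcal{M}^{t_1,\dots,t_n}} = \int_{\Gamma_T} \nu^{\tilde\gamma}_{\mathcal{F}_{t_n}}\,d\nu^\gamma_{\mathcal{M}^{t_1,\dots,t_n}}(\tilde\gamma) \qquad \text{for } \nu\text{-a.e. } \gamma. \tag{5.3.4}
\]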
We now want to use an analogous argument to deduce that, for any 0 < t1 < t2 < . . . <
tn ≤ T ,
(et1 , . . . , etn )# νx = (et1 , . . . , etn )# ν̃x for µ0 -a.e. x. (5.3.6)
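The idea is that, given a measure $\tilde\mu_s = \tilde\rho_s\mathscr{L}^d \in \mathcal{M}_+(\mathbb{R}^d)$, with $\tilde\rho_s \in L^\infty$, once we have a
$\tilde\mu_s$-SLF starting at time s we can multiply $\tilde\mu_s$ by a function $\psi_s \in C_c(\mathbb{R}^d)$ with $\psi_s \ge 0$, and
by Remark 5.3.2 our $\tilde\mu_s$-SLF is also a $\psi_s\tilde\mu_s$-SLF starting at time s. Using this argument
n times at different times and the time marginals uniqueness, we will obtain (5.3.6).
Fix $0 < t_1 < \dots < t_n \le T$. Take $\psi_0 \ge 0$ with $\psi_0 \in C_c(\mathbb{R}^d)$ and $\int_{\mathbb{R}^d} \psi_0\,d\mu_0 = 1$, and
denote by $\mu_{t_1}^{\psi_0}$ the value at time $t_1$ of the (unique) solution in $\mathcal{L}_+$ of the PDE starting
from $\psi_0\mu_0$ (which is induced both by $\{\nu_x\}$ and $\{\tilde\nu_x\}$ by uniqueness, see equation (5.3.5)).
Let $\{\nu_{x,t_1}\}_{x\in\mathbb{R}^d}$ and $\{\tilde\nu_{x,t_1}\}_{x\in\mathbb{R}^d}$ be the families of probability measures on $\Gamma_T$ given by
the disintegration of
\[
\nu^{\psi_0} := \int_{\mathbb{R}^d} \nu_x\,\psi_0(x)\,d\mu_0(x) \qquad\text{and}\qquad \tilde\nu^{\psi_0} := \int_{\mathbb{R}^d} \tilde\nu_x\,\psi_0(x)\,d\mu_0(x)
\]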
It is easily seen that $\{\nu_{x,t_1}\}$ and $\{\tilde\nu_{x,t_1}\}$ are regular conditional probability distributions,
given $\mathcal{M}^{t_1} = \sigma[e_{t_1}]$, of $\nu^{\psi_0}$ and $\tilde\nu^{\psi_0}$ respectively (that is, with the notation introduced
before, $\nu_{x,t_1} = (\nu^{\psi_0})^x_{\mathcal{M}^{t_1}}$ and $\tilde\nu_{x,t_1} = (\tilde\nu^{\psi_0})^x_{\mathcal{M}^{t_1}}$). Thus, looking at $\{\nu_{x,t_1}\}$ and $\{\tilde\nu_{x,t_1}\}$ as their
restriction to $C([t_1,T],\mathbb{R}^d)$, $\{\nu_{x,t_1}\}$ and $\{\tilde\nu_{x,t_1}\}$ are $\mu_{t_1}^{\psi_0}$-SLF starting at time $t_1$. Indeed, by
the stability of martingale solutions with respect to regular conditional probability (see
[125, Chapter 6]), $\{\nu_{x,t_1}\}$ and $\{\tilde\nu_{x,t_1}\}$ are martingale solutions of the SDE starting from
x at time $t_1$ for $\mu_{t_1}^{\psi_0}$-a.e. x (see also the remarks at the end of the proof of Proposition
5.6.1), while (ii) of Definition 5.3.1 is trivially true since $\{\nu_x\}$ and $\{\tilde\nu_x\}$ are $\psi_0\mu_0$-SLF.
As before, since $\{\nu_{x,t_1}\}$ and $\{\tilde\nu_{x,t_1}\}$ are also $\psi_1\mu_{t_1}^{\psi_0}$-SLF for any $\psi_1 \in C_c(\mathbb{R}^d)$ with $\psi_1 \ge 0$,
using again the uniqueness of the PDE in $\mathcal{L}_+$ we get
\[
\int_{\mathbb{R}^d\times\Gamma_T} \varphi(e_{t_2}(\gamma))\,d\nu_{x,t_1}(\gamma)\,\psi_1(x)\,d\mu_{t_1}^{\psi_0}(x) = \int_{\mathbb{R}^d\times\Gamma_T} \varphi(e_{t_2}(\gamma))\,d\tilde\nu_{x,t_1}(\gamma)\,\psi_1(x)\,d\mu_{t_1}^{\psi_0}(x)
\]
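by (5.3.8) we obtain
\begin{align*}
&\int_{\mathbb{R}^d\times\Gamma_T} \varphi(e_{t_2}(\gamma))\,\psi_1(e_{t_1}(\gamma))\,d\nu_x(\gamma)\,\psi_0(x)\,d\mu_0(x) \\
&\qquad = \int_{\mathbb{R}^d\times\Gamma_T} \varphi(e_{t_2}(\gamma))\,\psi_1(e_{t_1}(\gamma))\,d\tilde\nu_x(\gamma)\,\psi_0(x)\,d\mu_0(x)
\end{align*}
for any non-negative $\psi_0, \psi_1, \varphi \in C_c(\mathbb{R}^d)$ (the constraint $\int_{\mathbb{R}^d} \psi_0\,d\mu_0 = 1$ can be easily
removed multiplying the above equality by a positive constant). Iterating this argument,
we finally get
\begin{align*}
&\int_{\mathbb{R}^d\times\Gamma_T} \psi_n(e_{t_n}(\gamma)) \cdots \psi_1(e_{t_1}(\gamma))\,d\nu_x(\gamma)\,\psi_0(x)\,d\mu_0(x) \\
&\qquad = \int_{\mathbb{R}^d\times\Gamma_T} \psi_n(e_{t_n}(\gamma)) \cdots \psi_1(e_{t_1}(\gamma))\,d\tilde\nu_x(\gamma)\,\psi_0(x)\,d\mu_0(x),
\end{align*}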
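Remark 5.3.5. Suppose that forward uniqueness for the PDE holds in the class $\mathcal{L}_+$,
and take $\mu_0 = \rho_0\mathscr{L}^d$ and $\tilde\mu_0 = \tilde\rho_0\mathscr{L}^d$, with $\rho_0, \tilde\rho_0 \in L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d)$. If $\{\nu_x\}$ is a $\mu_0$-SLF
and $\{\tilde\nu_x\}$ is a $\tilde\mu_0$-SLF, then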
In fact, by Remark 5.3.2 {νx } and {ν̃x } are both µ0 ∧ µ̃0 -SLF, and thus we conclude by
the uniqueness result proved above.
By Theorems 5.3.3 and 5.3.4, and by the remark above, we obtain the following:
Corollary 5.3.6 (Existence and uniqueness of SLF). Let us assume that we have
forward existence and uniqueness for the PDE in $\mathcal{L}_+$. Then there exists a measurable
selection of martingale solutions $\{\nu_x\}_{x\in\mathbb{R}^d}$ which is a $\mu_0$-SLF for any $\mu_0 = \rho_0\mathscr{L}^d$ with
$\rho_0 \in L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d)$, and if $\{\tilde\nu_x\}_{x\in\mathbb{R}^d}$ is a $\tilde\mu_0$-SLF for a fixed $\tilde\mu_0 = \tilde\rho_0\mathscr{L}^d$ with
$\tilde\rho_0 \in L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d)$, then $\nu_x = \tilde\nu_x$ for $\mathscr{L}^d$-a.e. $x \in \mathrm{supp}(\tilde\mu_0)$.
Proof. It suffices to consider a SLF starting from a Gaussian measure (which exists by
Theorem 5.3.3), and to apply Remark 5.3.5. □
From now on, the above selection of martingale solutions $\{\nu_x\}$, which is uniquely
determined $\mathscr{L}^d$-a.e., will be called the SLF (starting at time 0 and relative to (b, a)).
We finally prove a stability result for SLF:
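Theorem 5.3.7 (Stability of SLF starting from a fixed measure). Let us suppose
that $b^n, b : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $a^n, a : [0,T]\times\mathbb{R}^d \to S_+(\mathbb{R}^d)$ are uniformly bounded
functions, and that we have forward existence and uniqueness for the PDE in $\mathcal{L}_+$ with
coefficients (b, a). Let $\mu_0 = \rho_0\mathscr{L}^d \in \mathcal{M}_+(\mathbb{R}^d)$, with $\rho_0 \in L^\infty(\mathbb{R}^d)$, and let $\{\nu_x^n\}_{x\in\mathbb{R}^d}$
and $\{\nu_x\}_{x\in\mathbb{R}^d}$ be $\mu_0$-SLF for $(b^n, a^n)$ and (b, a) respectively. Define $\nu^n := \int_{\mathbb{R}^d} \nu_x^n\,d\mu_0(x)$,
$\nu := \int_{\mathbb{R}^d} \nu_x\,d\mu_0(x)$. Assume that:
Then $\nu^n \rightharpoonup^* \nu$ in $\mathcal{M}(\Gamma_T)$.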
Proof. Since $(b^n, a^n)$ are uniformly bounded in $L^\infty$, as in Step 2 of the proof of
Theorem 5.2.7 one proves that the sequence of probability measures $(\nu^n)$ on $\mathbb{R}^d\times\Gamma_T$ is
tight. In order to conclude, we must show that any limit point of $(\nu^n)$ is ν.
Let $\tilde\nu$ be any limit point of $(\nu^n)$. We claim that $\tilde\nu$ is concentrated on martingale solutions
of the SDE with coefficients (b, a). Indeed, let us define $\tilde\mu_t := (e_t)_\#\tilde\nu$. Since $\mu_t^n \to \tilde\mu_t$
narrowly and $\rho_t^n$ are non-negative functions bounded in $L^\infty(\mathbb{R}^d)$, we get $\tilde\mu_t = \rho_t\mathscr{L}^d$
for a certain non-negative function $\rho_t \in L^\infty(\mathbb{R}^d)$. We now observe that the argument
used in Step 3 of the proof of Theorem 5.2.7 was using only the property that, for any
$\varphi \in C_c^\infty(\mathbb{R}^d)$,
\begin{align*}
&\limsup_{n\to+\infty} \sum_i \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\big(b_i^n(u,x) - \tilde b_i(u,x)\big)\partial_i\varphi(x)\Big|\,\rho_u^n(x)\,dx\,du \\
&\qquad \le \sum_i \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\big(b_i(u,x) - \tilde b_i(u,x)\big)\partial_i\varphi(x)\Big|\,\rho_u(x)\,dx\,du, \\
&\limsup_{n\to+\infty} \sum_{ij} \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\big(a_{ij}^n(u,x) - \tilde a_{ij}(u,x)\big)\partial_{ij}\varphi(x)\Big|\,\rho_u^n(x)\,dx\,du \\
&\qquad \le \sum_{ij} \int_s^t\!\!\int_{\mathbb{R}^d} \Big|\big(a_{ij}(u,x) - \tilde a_{ij}(u,x)\big)\partial_{ij}\varphi(x)\Big|\,\rho_u(x)\,dx\,du
\end{align*}
for any $\tilde b : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $\tilde a : [0,T]\times\mathbb{R}^d \to S_+(\mathbb{R}^d)$ bounded and continuous. This
property simply follows by (i) and the $w^*$-convergence of $\rho_t^n$ to $\rho_t$ in $L^\infty([0,T]\times\mathbb{R}^d)$.
Since $t \mapsto \rho_t\mathscr{L}^d$ is $w^*$-continuous in the sense of measures, the $w^*$-continuity of $t \mapsto \rho_t$ in
$L^\infty(\mathbb{R}^d)$ follows. Thus, if we write $\tilde\nu := \int_{\mathbb{R}^d} \tilde\nu_x\,d\mu_0(x)$ (considering the disintegration of $\tilde\nu$
with respect to $\mu_0 = (e_0)_\#\tilde\nu$), we have proved that $\{\tilde\nu_x\}$ is a $\mu_0$-SLF for (b, a). Therefore,
by Theorem 5.3.4, we conclude that $\nu = \tilde\nu$. □
We remark that the theory just developed could be extended to more general
situations. Indeed the key property of the convex class $\mathcal{L}_+$ is the following monotonicity
property:
\[
0 \le \tilde\mu_t \le \mu_t \in \mathcal{L}_+ \quad\Rightarrow\quad \tilde\mu_t \in \mathcal{L}_+
\]
(see also [5, Section 3]).
Lemma 5.3.8. Let us assume a = 0. Then $\nu_{x,s}$ is a martingale solution of the SDE
(which, in this case, is just an ODE) starting from x at time s if and only if it is
concentrated on integral curves of the ODE, that is, for $\nu_{x,s}$-a.e. γ,
\[
\gamma(t) - \gamma(s) = \int_s^t b(\tau,\gamma(\tau))\,d\tau \qquad \forall t \in [s,T].
\]
Proof. It is clear from the definition of martingale solution that, if $\nu_{x,s}$ is concentrated on
integral curves of the ODE, then it is a martingale solution. Let us prove the converse
implication. By the definition of martingale solution and the fact that a = 0, it is a
known fact that
\[
M_t := \gamma(t) - \gamma(s) - \int_s^t b(\tau,\gamma(\tau))\,d\tau, \qquad t \in [s,T],
\]
is a $\nu_{x,s}$-martingale with zero quadratic variation. This implies that also $M_t^2$ is a
martingale, and since $M_s = 0$ we get
\[
0 = E^{\nu_{x,s}}[M_t^2] = \int_{\Gamma_T} \Big(\gamma(t) - \gamma(s) - \int_s^t b(\tau,\gamma(\tau))\,d\tau\Big)^2\,d\nu_{x,s}(\gamma) \qquad \forall t \in [s,T],
\]
Thus, in the case a = 0, a martingale solution of the SDE starting from x is simply a
measure on $\Gamma_T$ concentrated on integral curves of b. By the results in [4] we know that, if
we have forward uniqueness for the PDE in $\mathcal{L}_+$, then any measure ν on $\Gamma_T$ concentrated
on integral curves of b such that its time marginals induce a solution of the PDE in $\mathcal{L}_+$
is concentrated on a graph, i.e. there exists a function $x \mapsto X(\cdot,x) \in \Gamma_T$ such that
\[
\nu = X(\cdot,x)_\#\mu_0, \qquad \text{with } \mu_0 := (e_0)_\#\nu
\]
(see for instance [7, Theorem 18]). Then, if we assume forward uniqueness for the PDE in
$\mathcal{L}_+$, our SLF coincides exactly with the RLF in [4]. Applying the stability result proved
in the above paragraph, we obtain that, as the noise tends to 0, our SLF converges to
the RLF associated to the ODE $\dot\gamma = b(\gamma)$. So we have a vanishing viscosity result for
RLF.
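Corollary 5.3.9. Let us suppose that $b : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ is uniformly bounded, and that
we have forward existence and uniqueness for the PDE in $\mathcal{L}_+$ with coefficients (b, 0). Let
$\{\nu_x^\varepsilon\}_{x\in\mathbb{R}^d}$ and $\{\nu_x\}_{x\in\mathbb{R}^d}$ be the SLF relative to (b, εI) and (b, 0) respectively (existence and
uniqueness of martingale solutions for the SDE with coefficients (b, εI), together with the
measurability of the family $\{\nu_x^\varepsilon\}_{x\in\mathbb{R}^d}$, follows by [125, Theorem 7.2.1]). Let $\mu_0 = \rho_0\mathscr{L}^d \in
\mathcal{M}_+(\mathbb{R}^d)$, with $\rho_0 \in L^\infty(\mathbb{R}^d)$, and define $\nu^\varepsilon := \int_{\mathbb{R}^d} \nu_x^\varepsilon\,d\mu_0(x)$, $\nu := \int_{\mathbb{R}^d} \nu_x\,d\mu_0(x)$.
Set $\rho_t^\varepsilon\mathscr{L}^d = \mu_t^\varepsilon := (e_t)_\#\nu^\varepsilon$, and assume that for any t ∈ [0, T]
\[
\|\rho_t^\varepsilon\|_{L^\infty(\mathbb{R}^d)} \le C \qquad \text{for a certain constant } C = C(T).
\]
Then $\nu^\varepsilon \rightharpoonup^* \nu$ in $\mathcal{M}(\Gamma_T)$.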
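To get a feeling for the vanishing viscosity statement, one can look at the trivially regular case of a constant drift in dimension one, which is an assumption made only for this sketch (the corollary is of course aimed at rough b). For a path starting from x, the time marginal with coefficients (b, ε) is the Gaussian $N(x + bt,\, \varepsilon t)$, and for $\varphi(y) = e^{-y^2}$ the expectation has a closed form. The sketch compares it with the value $\varphi(x + bt)$ obtained along the integral curve of the ODE; all numerical values are illustrative.

```python
import math

def smoothed_value(x, t, b, eps):
    """E[phi(x + b t + sqrt(eps) B_t)] for phi(y) = exp(-y^2), in closed form.

    If Y ~ N(m, v), then E[exp(-Y^2)] = exp(-m^2 / (1 + 2 v)) / sqrt(1 + 2 v).
    """
    var = eps * t
    mean = x + b * t
    return math.exp(-mean * mean / (1.0 + 2.0 * var)) / math.sqrt(1.0 + 2.0 * var)

def transported_value(x, t, b):
    """phi evaluated along the integral curve t -> x + b t of the ODE."""
    mean = x + b * t
    return math.exp(-mean * mean)

# Gap between the stochastic and the deterministic value as the noise vanishes.
gaps = [abs(smoothed_value(1.0, 1.0, 0.3, eps) - transported_value(1.0, 1.0, 0.3))
        for eps in (1e-1, 1e-2, 1e-3, 1e-4)]
```

This only tests convergence of one time marginal against one test function, which is much weaker than the convergence $\nu^\varepsilon \rightharpoonup^* \nu$ on path space stated in the corollary.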
In [4], the uniqueness of RLF implies the semigroup law (see [4], [5] for more details).
In our case, by the uniqueness of SLF, we have as a consequence that the Chapman-
Kolmogorov equation holds:
Proposition 5.3.10. For any s ≥ 0, let $\{\nu_{x,s}\}_{x\in\mathbb{R}^d}$ denote the unique SLF starting at
time s. Let us denote by $\nu_{s,x}(t,dy)$ the probability measure on $\mathbb{R}^d$ given by $\nu_{s,x}(t,\cdot) :=
(e_t)_\#\nu_{s,x}$. Then, for any 0 ≤ s < t < u ≤ T,
\[
\int_{\mathbb{R}^d} \nu_{t,y}(u,\cdot)\,\nu_{s,x}(t,dy) = \nu_{s,x}(u,\cdot) \qquad \text{for } \mathscr{L}^d\text{-a.e. } x.
\]
Proof. Let us define
\[
\tilde\nu_{s,x} :=
\begin{cases}
\nu_{s,x} & \text{on } C([s,t],\mathbb{R}^d),\\
\int_{\mathbb{R}^d} \nu_{t,y}\,\nu_{s,x}(t,dy) & \text{on } C([t,T],\mathbb{R}^d).
\end{cases}
\]
This gives a family of martingale solutions starting from x at time s (see [125]), and,
using that $\{\nu_{x,s}\}$ and $\{\nu_{x,t}\}$ are SLF starting at time s and t respectively, it is simple to
check that $\{\tilde\nu_{s,x}\}_{x\in\mathbb{R}^d}$ is a SLF starting at time s. Thus, by Theorem 5.3.4, we have the
thesis. □
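The Chapman-Kolmogorov equation can be verified by hand in the classical case of constant coefficients in dimension one, where the transition measure $\nu_{s,x}(t,\cdot)$ is the Gaussian $N(x + b(t-s),\, a(t-s))$. The sketch below works with densities and evaluates the composition integral by the trapezoidal rule; the values of b and a, the grid parameters and the function names are assumptions of the example.

```python
import math

B_DRIFT = 0.3   # constant drift (illustrative value)
A_DIFF = 0.5    # constant diffusion coefficient (illustrative value)

def kernel(s, x, t, y):
    """Density at y of N(x + b (t - s), a (t - s)), i.e. of nu_{s,x}(t, .)."""
    var = A_DIFF * (t - s)
    mean = x + B_DRIFT * (t - s)
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def composed_kernel(s, x, t, u, z, n=4000, width=12.0):
    """Trapezoidal approximation of the integral over y of p(t,y;u,z) p(s,x;t,y)."""
    mean = x + B_DRIFT * (t - s)
    std = math.sqrt(A_DIFF * (t - s))
    lo = mean - width * std
    h = 2.0 * width * std / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * kernel(t, y, u, z) * kernel(s, x, t, y)
    return total * h

direct = kernel(0.0, 0.1, 1.0, 0.6)                   # density of nu_{s,x}(u, .) at z
composed = composed_kernel(0.0, 0.1, 0.4, 1.0, 0.6)   # composition through time t = 0.4
```

In this smooth setting the identity holds for every x; in the proposition it is only claimed for $\mathscr{L}^d$-a.e. x, since the SLF itself is determined only almost everywhere.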
where a = (aij ) is symmetric and non-negative definite (that is, a : [0, T ]×Rd → S+ (Rd )).
(the above computation is admissible since $f \in C_b^{1,2}$). This implies in particular that
\[
0 = \int_{\mathbb{R}^d} f(0,x)\,d\mu_0(x) = \int_{\mathbb{R}^d} f(t,x)\,d\mu_t(x) = \int_{\mathbb{R}^d} \psi(x)\,d\mu_t(x).
\]
We remark that, in the uniformly parabolic case, the above proof still works under
weaker regularity assumptions. Indeed, in that case, one has existence of a measurable
family of martingale solutions of the SDE and of a solution $f \in C_b^{1,2}([0,t]\times\mathbb{R}^d)$ of the
adjoint equation if a and b are just Hölder continuous (see [125, Theorem 3.2.1]). So we
get:
Proposition 5.4.2. Let us assume that a : [0, T ]×Rd → S+ (Rd ) and b : [0, T ]×Rd → Rd
are bounded functions such that:
1. $\langle \xi, a(t,x)\xi\rangle \ge \alpha|\xi|^2$ $\forall (t,x) \in [0,T]\times\mathbb{R}^d$, for some α > 0;
2. $|b(t,x) - b(s,y)| + \|a(t,x) - a(s,y)\| \le C\big(|x-y|^\delta + |t-s|^\delta\big)$ $\forall (t,x),(s,y) \in [0,T]\times\mathbb{R}^d$,
for some δ ∈ (0, 1], C ≥ 0.
Then, for any finite measure µ0 there exists a unique finite measure-valued solution of
(5.4.1) starting from µ0 .
The proof of the above theorem is quite standard, except for the uniqueness result in
the large space $L^2$, which is indeed quite technical and involved. The motivation for
this more general result is that $L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$, and $L^1_+(\mathbb{R}^d) \cap L^\infty_+(\mathbb{R}^d)$ is
the space where we need well-posedness of the PDE if we want to apply the theory on
martingale solutions developed in the last section (see Theorems 5.1.3 and 5.5.1).
We now give some properties of the family of solutions of (5.4.2):
(a) u0 ≥ 0 ⇒ u ≥ 0;
(c) if moreover
\[
\frac{a}{1+|x|^2} \in L^2([0,T]\times\mathbb{R}^d), \qquad \frac{b}{1+|x|} \in L^2([0,T]\times\mathbb{R}^d),
\]
We observe that, by the above results together with Proposition 5.4.2, we obtain:
2. $|b(t,x) - b(s,y)| + \|a(t,x) - a(s,y)\| \le C\,(|x-y|^\gamma + |t-s|^\gamma)$ $\forall (t,x),(s,y) \in [0,T]\times\mathbb{R}^d$,
for some γ ∈ (0, 1], C ≥ 0;
3. $\sum_j \partial_j a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$ for i = 1, . . . , d, $\big(\sum_i \partial_i b_i - \frac12\sum_{ij} \partial_{ij} a_{ij}\big)^- \in L^\infty([0,T]\times\mathbb{R}^d)$;
4. $\frac{a}{1+|x|^2} \in L^2([0,T]\times\mathbb{R}^d)$, $\frac{b}{1+|x|} \in L^2([0,T]\times\mathbb{R}^d)$.
Then, for any $\mu_0 \in \mathcal{M}_+(\mathbb{R}^d)$ there exists a unique finite measure-valued solution $\mu_t \in
\mathcal{M}_+(\mathbb{R}^d)$ of (5.1.2) starting from $\mu_0$. Moreover, if $\mu_0 = \rho_0\mathscr{L}^d$ with $\rho_0 \in L^2(\mathbb{R}^d)$,
then $\mu_t \ll \mathscr{L}^d$ for all t ∈ [0, T].
Proof. Existence and uniqueness of finite measure-valued solutions follows by Proposition
5.4.2. So the only thing to prove is that, if $\rho_0 \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ is non-negative,
then $\mu_t \in \mathcal{M}_+(\mathbb{R}^d)$ and $\mu_t \ll \mathscr{L}^d$ for all t ∈ [0, T]. This simply follows by the fact that
the solution u ∈ Y provided by Theorem 5.4.3 belongs to $L^1_+(\mathbb{R}^d)$ by Proposition 5.4.4,
and thus coincides with $\mu_t$ by uniqueness in the set of finite measure-valued solutions.
□
In order to prove the results stated before, we need the following theorem of J.-L. Lions
(see [93]):
Theorem 5.4.6. Let H be a Hilbert space, provided with a norm | · |, and inner product
(·, ·). Let Φ ⊂ H be a subspace endowed with a prehilbertian norm ‖ · ‖, such that the
injection $\Phi \hookrightarrow H$ is continuous. We consider a bilinear form B : H × Φ → R such that:
- $H \ni u \mapsto B(u,\varphi)$ is continuous on H for any fixed ϕ ∈ Φ;
- there exists α > 0 such that $B(\varphi,\varphi) \ge \alpha\|\varphi\|^2$ for any ϕ ∈ Φ.
Then, for any linear continuous form L on Φ there exists v ∈ H such that
\[
B(v,\varphi) = L(\varphi) \qquad \forall\varphi \in \Phi.
\]
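In finite dimension the statement of Theorem 5.4.6 reduces to solving a linear system, which may help to fix ideas. The sketch below takes $H = \Phi = \mathbb{R}^2$, the form $B(u,\varphi) = \langle \varphi, Mu\rangle$ with a non-symmetric matrix M whose symmetric part is 2I (so that $B(\varphi,\varphi) = 2|\varphi|^2$), and $L(\varphi) = \langle \varphi, f\rangle$. The matrix, the vector f and the helper names are chosen only for illustration; the interest of the theorem lies in the infinite-dimensional case, where Φ is not complete for its own norm.

```python
def solve_2x2(m, f):
    """Solve m v = f for a 2x2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(f[0] * m[1][1] - m[0][1] * f[1]) / det,
            (m[0][0] * f[1] - m[1][0] * f[0]) / det]

def bilinear(m, u, phi):
    """B(u, phi) = <phi, M u>."""
    return sum(phi[i] * m[i][j] * u[j] for i in range(2) for j in range(2))

def linear_form(f, phi):
    """L(phi) = <phi, f>."""
    return sum(p * q for p, q in zip(phi, f))

M = [[2.0, 1.0], [-1.0, 2.0]]   # symmetric part 2 I, hence B(phi, phi) = 2 |phi|^2
f = [1.0, 3.0]
v = solve_2x2(M, f)             # the v with B(v, phi) = L(phi) for every phi
```

Here v is unique because M is invertible; note that the theorem as stated above only asserts existence.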
Proof of Theorem 5.4.3. We will first prove existence and uniqueness of a solution in the
space Y. Once this is done, we will show that, if u is a weak solution of (5.4.2)
belonging to $L^2([0,T]\times\mathbb{R}^d)$ and $\partial_t a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$ for i, j = 1, . . . , d, then u belongs
to Y, and so it coincides with the unique solution provided before.
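The change of unknown
\[
v(t,x) = e^{-\lambda t}u(t,x)
\]
leads to the equation
\[
\begin{cases}
\partial_t v + \sum_i \partial_i(\tilde b_i v) - \frac12\sum_{ij} \partial_i(a_{ij}\partial_j v) + \lambda v = 0 & \text{in } [0,T]\times\mathbb{R}^d,\\
v_0 = u_0,
\end{cases}
\tag{5.4.4}
\]
where $\tilde b_i := b_i - \frac12\sum_j \partial_j a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$. Assuming that λ satisfies $\lambda > \frac12\|(\sum_i \partial_i\tilde b_i)^-\|_\infty$,
we will prove existence and uniqueness for u.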
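For the reader's convenience, here is a short sketch of the computation behind (5.4.4). It assumes that (5.4.2), which is not reproduced in this part of the text, has the same form as (5.1.2), namely $\partial_t u + \sum_i \partial_i(b_i u) - \frac12\sum_{ij}\partial_{ij}(a_{ij}u) = 0$.

```latex
\begin{align*}
\sum_i \partial_i(b_i u) - \frac12\sum_{ij}\partial_{ij}(a_{ij}u)
  &= \sum_i \partial_i\Big(b_i u - \frac12\sum_j \big(\partial_j a_{ij}\,u + a_{ij}\,\partial_j u\big)\Big) \\
  &= \sum_i \partial_i(\tilde b_i u) - \frac12\sum_{ij}\partial_i(a_{ij}\,\partial_j u),
  \qquad \tilde b_i := b_i - \frac12\sum_j \partial_j a_{ij}, \\
\partial_t v &= e^{-\lambda t}\,\partial_t u - \lambda v
  \qquad\text{for } v = e^{-\lambda t}u .
\end{align*}
```

Multiplying the equation for u by $e^{-\lambda t}$ and using the two identities above gives $\partial_t v + \lambda v + \sum_i \partial_i(\tilde b_i v) - \frac12\sum_{ij}\partial_i(a_{ij}\partial_j v) = 0$, which is (5.4.4); the initial datum is unchanged because $e^{-\lambda\cdot 0} = 1$.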
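Step 1: existence in Y. We want to apply Theorem 5.4.6.
Let us take $H := L^2([0,T], H^1(\mathbb{R}^d))$, $\Phi := \{\varphi \in C^\infty([0,T]\times\mathbb{R}^d) \mid \mathrm{supp}\,\varphi \subset\subset [0,T)\times\mathbb{R}^d\}$.
Φ is endowed with the norm
\[
\|\varphi\|_\Phi^2 := \|\varphi\|_H^2 + \frac12\int_{\mathbb{R}^d} |\varphi(0,x)|^2\,dx.
\]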
and thus v ∈ Y. In order to give a meaning to the initial condition and to show the
uniqueness, we recall that for functions in Y there exists a well-defined notion of trace
at 0 in $L^2(\mathbb{R}^d)$, and the following Gauss-Green formula holds:
\[
\int_0^T\!\!\int_{\mathbb{R}^d} \big(\partial_t u\,\tilde u + \partial_t\tilde u\,u\big)\,dx\,dt = \int_{\mathbb{R}^d} u(T,x)\tilde u(T,x)\,dx - \int_{\mathbb{R}^d} u(0,x)\tilde u(0,x)\,dx \qquad \forall u, \tilde u \in Y
\tag{5.4.5}
\]
(both facts follow by a standard approximation with smooth functions and by the
fact that, if u is smooth and compactly supported in $[0,T)\times\mathbb{R}^d$, $\int_{\mathbb{R}^d} u^2(0,x)\,dx \le
2\|\partial_t u\|_{H^*}\|u\|_H$). Thus, by (5.4.4) and (5.4.5), we obtain that v satisfies
\[
\int_{\mathbb{R}^d} (v(0,x) - u_0(x))\varphi(0,x)\,dx = 0 \qquad \forall\varphi \in \Phi,
\]
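\[
\ge \Big(\lambda - \frac12\Big\|\Big(\sum_i \partial_i\tilde b_i\Big)^-\Big\|_\infty\Big)\int_0^T\!\!\int_{\mathbb{R}^d} v^2\,dx\,dt.
\]
Since $\lambda > \frac12\|(\sum_i \partial_i\tilde b_i)^-\|_\infty$, we get v = 0.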
Remark 5.4.7. We observe that the above proof still works for the PDE
\[
\begin{cases}
\partial_t u + \sum_i \partial_i(b_i u) - \frac12\sum_{ij} \partial_{ij}(a_{ij}u) = U & \text{in } [0,T]\times\mathbb{R}^d,\\
u(0) = u_0,
\end{cases}
\]
and all the rest of the proof works without any changes.
Thanks to this remark, we can now prove uniqueness in the larger space $L^2([0,T]\times\mathbb{R}^d)$
under the assumption $\partial_t a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$ for i, j = 1, . . . , d.
Step 3: uniqueness in $L^2$. If $u \in L^2([0,T]\times\mathbb{R}^d)$ is a (distributional) solution of
(5.4.1), then
\[
\partial_t u - \frac12\sum_{ij} \partial_i(a_{ij}\partial_j u) = -\sum_i \partial_i(\tilde b_i u) \in L^2([0,T], H^{-1}(\mathbb{R}^d)).
\]
By Remark 5.4.7, there exists $\tilde u \in Y$ solution of the above equation, with the same initial
condition. Let us define $w := u - \tilde u \in L^2([0,T]\times\mathbb{R}^d)$. Then w is a distributional solution
of
\[
\begin{cases}
\partial_t w - A(\partial_x)w := \partial_t w - \frac12\sum_{ij} \partial_i(a_{ij}\partial_j w) = 0 & \text{in } [0,T]\times\mathbb{R}^d,\\
w(0) = 0.
\end{cases}
\]
In order to conclude the proof, it suffices to prove that w = 0.
Step 3.1: regularization. Let us consider the PDE
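(this is an elliptic problem degenerate in the time variable). Applying Theorem 5.4.6,
with $H = \Phi := L^2([0,T], H^1(\mathbb{R}^d))$,
\[
B(u,\varphi) := \int_0^T\!\!\int_{\mathbb{R}^d} \Big(u\varphi + \frac{\varepsilon}{2}\sum_{ij} a_{ij}\,\partial_j u\,\partial_i\varphi\Big)\,dx\,dt,
\qquad
L(\varphi) := \int_0^T\!\!\int_{\mathbb{R}^d} w\varphi\,dx\,dt,
\]
we find a unique solution $w_\varepsilon$ of (5.4.6) in $L^2([0,T], H^1(\mathbb{R}^d))$, that is $w_\varepsilon = (I - \varepsilon A(\partial_x))^{-1}w$,
with $(I - \varepsilon A(\partial_x)) : L^2([0,T], H^1(\mathbb{R}^d)) \to L^2([0,T], H^{-1}(\mathbb{R}^d))$ isomorphism. Now we want
to find the equation solved by $w_\varepsilon$. We observe that, since $(I - \varepsilon A(\partial_x))^{-1}$ commutes with
$A(\partial_x)$ and $\partial_t w = A(\partial_x)w$, the parabolic equation solved by $w_\varepsilon$ formally reads
\[
\partial_t w_\varepsilon - A(\partial_x)w_\varepsilon = [\partial_t, (I - \varepsilon A(\partial_x))^{-1}]w.
\]
Formally computing the commutator between $\partial_t$ and $(I - \varepsilon A(\partial_x))^{-1}$, one obtains
\[
\partial_t w_\varepsilon - A(\partial_x)w_\varepsilon = \varepsilon(I - \varepsilon A(\partial_x))^{-1}\sum_{ij} \partial_j(\partial_t a_{ij}\,\partial_i w_\varepsilon) \tag{5.4.7}
\]
in the distributional sense (see (5.4.9) below). Let us assume for a moment that (5.4.7)
has been rigorously justified, and let us see how we can conclude.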
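Step 3.2: Gronwall argument. By (5.4.7) it follows that $\partial_t w_\varepsilon \in L^2([0,T], H^{-1}(\mathbb{R}^d))$.
Thus, recalling that $w_\varepsilon \in L^2([0,T], H^1(\mathbb{R}^d))$, we can multiply (5.4.7) by $w_\varepsilon$ and integrate
on $\mathbb{R}^d$, obtaining
\[
\frac12\frac{d}{dt}\int_{\mathbb{R}^d} |w_\varepsilon|^2\,dx + \alpha\int_{\mathbb{R}^d} |\nabla_x w_\varepsilon|^2\,dx \le -\varepsilon\int_{\mathbb{R}^d} \sum_{ij} (\partial_t a_{ij})\,\partial_i w_\varepsilon\,\partial_j\big((I - \varepsilon A(\partial_x))^{-1}w_\varepsilon\big)\,dx.
\]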
3
which implies, for ε small enough (say ε ≤ 4 Cα 2 ),
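\[
\|w_\varepsilon(t)\|^2_{L^2(\mathbb{R}^d)} \le C\sqrt{\frac{\varepsilon}{\alpha}}\,\|w_\varepsilon\|^2_{L^2([0,T]\times\mathbb{R}^d)} \qquad \forall t \in [0,T].
\]
By Gronwall inequality $w_\varepsilon = 0$, and thus by (5.4.6) w = 0.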
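Step 3.3: rigorous justification of (5.4.7). In order to conclude the proof of the
theorem, we only need to rigorously justify (5.4.7).
Let $(a_{ij}^n)_{n\in\mathbb{N}}$ be a sequence of smooth functions bounded in $L^\infty$, such that $\langle a^n\xi, \xi\rangle \ge
\frac{\alpha}{2}|\xi|^2$, $\sum_j \partial_j a_{ij}^n$ and $\partial_t a_{ij}^n$ are uniformly bounded, and $a_{ij}^n \to a_{ij}$, $\sum_j \partial_j a_{ij}^n \to \sum_j \partial_j a_{ij}$,
$\partial_t a_{ij}^n \to \partial_t a_{ij}$ a.e.
We now compute $[\partial_t, (I - \varepsilon A^n(\partial_x))^{-1}]$, where $A^n(\partial_x) := \sum_{ij} \partial_i(a_{ij}^n\partial_j\,\cdot)$: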
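\begin{align*}
[\partial_t, (I - \varepsilon A^n(\partial_x))^{-1}] &= \Big[\partial_t, \sum_{k\ge 0} \varepsilon^k A^n(\partial_x)^k\Big] = \sum_{k\ge 0} \varepsilon^k [\partial_t, A^n(\partial_x)^k] \\
&= \varepsilon\sum_{k=0}^\infty \sum_{i=0}^{k-1} \big(\varepsilon A^n(\partial_x)\big)^i [\partial_t, A^n(\partial_x)] \big(\varepsilon A^n(\partial_x)\big)^{k-i-1} \\
&= \varepsilon\sum_{i=0}^\infty \sum_{k>i} \big(\varepsilon A^n(\partial_x)\big)^i [\partial_t, A^n(\partial_x)] \big(\varepsilon A^n(\partial_x)\big)^{k-i-1} \\
&= \varepsilon(I - \varepsilon A^n(\partial_x))^{-1}[\partial_t, A^n(\partial_x)](I - \varepsilon A^n(\partial_x))^{-1},
\end{align*}
\[
\tag{5.4.9}
\]
where at the second equality we used the algebraic identity $[A, B^k] = \sum_{i=0}^{k-1} B^i[A,B]B^{k-i-1}$.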
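Both the algebraic identity $[A,B^k] = \sum_{i=0}^{k-1} B^i[A,B]B^{k-i-1}$ and the resulting resolvent formula $[A,(I-\varepsilon B)^{-1}] = \varepsilon(I-\varepsilon B)^{-1}[A,B](I-\varepsilon B)^{-1}$ can be tested on matrices, where no convergence issue arises. The sketch below does this with exact rational arithmetic on 2×2 matrices; the matrices A and B, the value of ε and the helper functions are arbitrary choices for the example and play the roles of $\partial_t$ and $A^n(\partial_x)$ only by analogy.

```python
from fractions import Fraction

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(p, q):
    return [[x + y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]

def matsub(p, q):
    return [[x - y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]

def scale(c, p):
    return [[c * x for x in row] for row in p]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matpow(p, k):
    result = identity(len(p))
    for _ in range(k):
        result = matmul(result, p)
    return result

def commutator(p, q):
    """[p, q] = p q - q p."""
    return matsub(matmul(p, q), matmul(q, p))

def power_commutator_sum(a, b, k):
    """sum_{i=0}^{k-1} b^i [a, b] b^{k-i-1}."""
    c = commutator(a, b)
    total = scale(0, identity(len(a)))
    for i in range(k):
        term = matmul(matmul(matpow(b, i), c), matpow(b, k - i - 1))
        total = matadd(total, term)
    return total

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

A = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(-1)]]
B = [[Fraction(0), Fraction(1)], [Fraction(3), Fraction(1)]]
eps = Fraction(1, 10)

# Identity for powers, with k = 4.
lhs_power = commutator(A, matpow(B, 4))
rhs_power = power_commutator_sum(A, B, 4)

# Resolvent formula, with R = (I - eps B)^{-1}.
R = inverse_2x2(matsub(identity(2), scale(eps, B)))
lhs_resolvent = commutator(A, R)
rhs_resolvent = scale(eps, matmul(matmul(R, commutator(A, B)), R))
```

For matrices the resolvent formula holds whenever $I - \varepsilon B$ is invertible, independently of the Neumann series; for the unbounded operators of the text the series manipulation is formal, which is why the identity is then justified by approximation.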
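Thus, for any $\varphi, \psi \in C_c^\infty([0,T]\times\mathbb{R}^d)$, we have
\begin{align*}
\int_0^T\!\!\int_{\mathbb{R}^d} \psi\,\partial_t\big((I - \varepsilon A^n(\partial_x))^{-1}\varphi\big)\,dx\,dt
&= \int_0^T\!\!\int_{\mathbb{R}^d} \psi\,\big[(I - \varepsilon A^n(\partial_x))^{-1}\partial_t\varphi\big]\,dx\,dt \\
&\quad + \varepsilon\int_0^T\!\!\int_{\mathbb{R}^d} \psi\,\big[(I - \varepsilon A^n(\partial_x))^{-1}[\partial_t, A^n(\partial_x)](I - \varepsilon A^n(\partial_x))^{-1}\varphi\big]\,dx\,dt.
\end{align*}
\[
\tag{5.4.10}
\]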
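We now want to pass to the limit in the above identity as n → ∞. Since $(I - \varepsilon A^n(\partial_x))^{-1}$
is selfadjoint in $L^2([0,T]\times\mathbb{R}^d)$ and it commutes with $A^n(\partial_x)$, we get
\begin{align*}
&\int_0^T\!\!\int_{\mathbb{R}^d} \psi\,\big[(I - \varepsilon A^n(\partial_x))^{-1}[\partial_t, A^n(\partial_x)](I - \varepsilon A^n(\partial_x))^{-1}\varphi\big]\,dx\,dt \\
&\quad = \int_0^T\!\!\int_{\mathbb{R}^d} \big[(I - \varepsilon A^n(\partial_x))^{-1}\psi\big]\big[[\partial_t, A^n(\partial_x)](I - \varepsilon A^n(\partial_x))^{-1}\varphi\big]\,dx\,dt \\
&\quad = -\int_0^T\!\!\int_{\mathbb{R}^d} \big[\partial_t\big((I - \varepsilon A^n(\partial_x))^{-1}\psi\big)\big]\big[(I - \varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi\big]\,dx\,dt \\
&\qquad - \int_0^T\!\!\int_{\mathbb{R}^d} \big[(I - \varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\psi\big]\big[\partial_t\big((I - \varepsilon A^n(\partial_x))^{-1}\varphi\big)\big]\,dx\,dt.
\end{align*}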
By (5.4.9) we have
¡ ¢
∂t (I−εAn (∂x ))−1 ϕ = (I−εAn (∂x ))−1 ∂t ϕ+ε(I−εAn (∂x ))−1 [∂t , An (∂x )](I−εAn (∂x ))−1 ϕ,
P
and, observing that [∂t , An (∂x )] = ij ∂i (∂t anij ∂j ·), we deduce that the right hand side is
uniformly bounded in L2 ([0, T ], H 1 (Rd )). In the same way one obtains
¡ ¢ ¡ ¢
∂t (I − εAn (∂x ))−1 An (∂x )ϕ = (I − εAn (∂x ))−1 ∂t An (∂x )ϕ
+ ε(I − εAn (∂x ))−1 [∂t , An (∂x )](I − εAn (∂x ))−1 An (∂x )ϕ
= (I − εAn (∂x ))−1 [∂t , An (∂x )]ϕ
+ (I − εAn (∂x ))−1 An (∂x )∂t ϕ
+ ε(I − εAn (∂x ))−1 [∂t , An (∂x )](I − εAn (∂x ))−1 An (∂x )ϕ,
and, as above, the right hand side is uniformly bounded in $L^2([0,T],H^1(\mathbb{R}^d))$. Thus $\partial_t\big((I-\varepsilon A^n(\partial_x))^{-1}\varphi\big)$ is uniformly bounded in $L^2([0,T],H^1(\mathbb{R}^d)) \subset L^2([0,T]\times\mathbb{R}^d)$ (the same obviously holds for $\psi$ in place of $\varphi$), while $(I-\varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi$ is uniformly bounded in $H^1([0,T]\times\mathbb{R}^d)$ (again the same fact holds for $\psi$ in place of $\varphi$). Therefore, since $H^1_{loc}([0,T]\times\mathbb{R}^d) \hookrightarrow L^2_{loc}([0,T]\times\mathbb{R}^d)$ compactly, all we have to check is that
\[
\partial_t\big((I-\varepsilon A^n(\partial_x))^{-1}\varphi\big) \to \partial_t\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)
\]
and
\[
(I-\varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi \to (I-\varepsilon A(\partial_x))^{-1}A(\partial_x)\varphi
\]
in the sense of distributions (indeed, by what we have shown above, $\partial_t\big((I-\varepsilon A^n(\partial_x))^{-1}\varphi\big)$ will converge weakly in $L^2$ while $(I-\varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi$ will converge strongly in $L^2_{loc}$, and therefore it is not difficult to see that the product converges to the product of the
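limits). We observe that, since the solution of
\[
\Phi - \varepsilon A(\partial_x)\Phi = \varphi \tag{5.4.11}
\]
belonging to $L^2([0,T],H^1(\mathbb{R}^d))$ is unique, and any limit point of $(I-\varepsilon A^n(\partial_x))^{-1}\varphi$ belongs to $L^2([0,T],H^1(\mathbb{R}^d))$ and is a distributional solution of (5.4.11), one obtains that
\[
(I-\varepsilon A^n(\partial_x))^{-1}\varphi \to (I-\varepsilon A(\partial_x))^{-1}\varphi
\]
in the distributional sense, which implies the convergence of $\partial_t\big((I-\varepsilon A^n(\partial_x))^{-1}\varphi\big)$ to $\partial_t\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)$. Regarding $(I-\varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi$, let us take $\chi \in C_c^\infty([0,T]\times\mathbb{R}^d)$. Then we consider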
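\[
\int_0^T\int_{\mathbb{R}^d} \big[A^n(\partial_x)\varphi\big]\,\big[(I-\varepsilon A^n(\partial_x))^{-1}\chi\big]\,dx\,dt
= -\int_0^T\int_{\mathbb{R}^d} \sum_{ij} a^n_{ij}\,\partial_j\varphi\;\partial_i\big((I-\varepsilon A^n(\partial_x))^{-1}\chi\big)\,dx\,dt.
\]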
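Recalling that $(I-\varepsilon A^n(\partial_x))^{-1}\chi$ is uniformly bounded in $L^2([0,T],H^1(\mathbb{R}^d))$, we get that $\partial_j\big((I-\varepsilon A^n(\partial_x))^{-1}\chi\big)$ converges to $\partial_j\big((I-\varepsilon A(\partial_x))^{-1}\chi\big)$ weakly in $L^2([0,T]\times\mathbb{R}^d)$ while $a^n_{ij} \to a_{ij}$ a.e., and so the convergence of $(I-\varepsilon A^n(\partial_x))^{-1}A^n(\partial_x)\varphi$ to $(I-\varepsilon A(\partial_x))^{-1}A(\partial_x)\varphi$ follows.
Thus we are able to pass to the limit in (5.4.10), and we get $\partial_t\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big) \in L^2([0,T],H^1(\mathbb{R}^d))$ and
\[
\begin{aligned}
\int_0^T\int_{\mathbb{R}^d} \psi\,\partial_t\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)\,dx\,dt
&= \int_0^T\int_{\mathbb{R}^d} \psi\,\big[(I-\varepsilon A(\partial_x))^{-1}\partial_t\varphi\big]\,dx\,dt \\
&\quad + \varepsilon\int_0^T\int_{\mathbb{R}^d} \psi\,\big[(I-\varepsilon A(\partial_x))^{-1}[\partial_t, A(\partial_x)](I-\varepsilon A(\partial_x))^{-1}\varphi\big]\,dx\,dt.
\end{aligned}
\]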
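Observing that $(I-\varepsilon A(\partial_x))^{-1}$ is selfadjoint in $L^2([0,T]\times\mathbb{R}^d)$ (for instance, this can be easily proved by approximation), we have that the second integral in the right hand side can be written as
\[
\begin{aligned}
&\int_0^T\int_{\mathbb{R}^d} \psi\,\big[(I-\varepsilon A(\partial_x))^{-1}[\partial_t, A(\partial_x)](I-\varepsilon A(\partial_x))^{-1}\varphi\big]\,dx\,dt \\
&\quad= \int_0^T\int_{\mathbb{R}^d} \big[(I-\varepsilon A(\partial_x))^{-1}\psi\big]\,\big[[\partial_t, A(\partial_x)]\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)\big]\,dx\,dt.
\end{aligned}
\]
Using now that $[\partial_t, A(\partial_x)] = \sum_{ij}\partial_i(\partial_t a_{ij}\,\partial_j\,\cdot\,)$ in the sense of distributions, it can be easily proved by approximation that the right hand side above coincides with
\[
-\int_0^T\int_{\mathbb{R}^d} \sum_{ij}(\partial_t a_{ij})\,\partial_i\big((I-\varepsilon A(\partial_x))^{-1}\psi\big)\,\partial_j\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)\,dx\,dt.
\]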
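This implies that (5.4.12) holds also for $\psi \in L^2([0,T]\times\mathbb{R}^d)$, and that $(I-\varepsilon A(\partial_x))^{-1}\varphi$ is an admissible test function in the equation $\partial_t w - A(\partial_x)w = 0$. By these two facts we obtain
\[
\begin{aligned}
0 &= \int_0^T\int_{\mathbb{R}^d} w\,\big[(\partial_t + A(\partial_x))(I-\varepsilon A(\partial_x))^{-1}\varphi\big]\,dx\,dt \\
&= \int_0^T\int_{\mathbb{R}^d} w\,\big[(I-\varepsilon A(\partial_x))^{-1}(\partial_t + A(\partial_x))\varphi\big]\,dx\,dt \\
&\quad - \varepsilon\int_0^T\int_{\mathbb{R}^d} \sum_{ij}(\partial_t a_{ij})\,\partial_i\big((I-\varepsilon A(\partial_x))^{-1}w\big)\,\partial_j\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)\,dx\,dt \\
&= \int_0^T\int_{\mathbb{R}^d} w_\varepsilon\,\big[(\partial_t + A(\partial_x))\varphi\big]\,dx\,dt - \varepsilon\int_0^T\int_{\mathbb{R}^d} \sum_{ij}(\partial_t a_{ij})\,\partial_i w_\varepsilon\,\partial_j\big((I-\varepsilon A(\partial_x))^{-1}\varphi\big)\,dx\,dt,
\end{aligned}
\]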
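Proof of Proposition 5.4.4. (a) Arguing as in the first part of the proof of Theorem 5.4.3, with the same notation we have
\[
\begin{aligned}
0 &= \int_0^T\int_{\mathbb{R}^d} \Big(\partial_t v + \sum_i \partial_i(\tilde b_i v) - \frac12\sum_{ij}\partial_i(a_{ij}\partial_j v) + \lambda v\Big)\,v^-\,dx\,dt \\
&= \frac12\int_0^T\int_{\mathbb{R}^d} \Big[-\frac{d}{dt}(v^-)^2 - \sum_i \tilde b_i\,\partial_i\big((v^-)^2\big) - \sum_{ij} a_{ij}\,\partial_i v^-\,\partial_j v^- - 2\lambda (v^-)^2\Big]\,dx\,dt \\
&\le -\frac12\int_{\mathbb{R}^d} (v^-)^2(T,x)\,dx - \Big(\lambda - \frac12\Big\|\Big(\sum_i\partial_i\tilde b_i\Big)^-\Big\|_\infty\Big)\int_0^T\int_{\mathbb{R}^d} (v^-)^2\,dx\,dt \\
&\le -\Big(\lambda - \frac12\Big\|\Big(\sum_i\partial_i\tilde b_i\Big)^-\Big\|_\infty\Big)\int_0^T\int_{\mathbb{R}^d} (v^-)^2\,dx\,dt,
\end{aligned}
\]
and then $v^- = 0$.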
(b) It suffices to observe that the above argument works for every $v \in Y$ such that $v(0) \ge 0$ and
\[
\partial_t v + \sum_i \partial_i(\tilde b_i v) - \frac12\sum_{ij}\partial_i(a_{ij}\partial_j v) \ge 0.
\]
Applying this remark to the function $v := \|u_0\|_{L^\infty(\mathbb{R}^d)} - u e^{-\lambda t}$ with $\lambda > \|(\sum_i \partial_i\tilde b_i)^-\|_\infty$, and then letting $\lambda \to \|(\sum_i \partial_i\tilde b_i)^-\|_\infty$, the thesis follows.
(c) The argument we use here is reminiscent of the one that we will use in the next para-
graph for renormalized solutions. Indeed, in order to prove the thesis, we will implicitly
prove that, if u ∈ L2 ([0, T ], H 1 (Rd )) is a solution of (5.4.2), it is also a renormalized
solution (see Definition 5.4.9).
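Let us define
\[
\beta_\varepsilon(s) := \Big(\sqrt{s^2 + \varepsilon^2} - \varepsilon\Big) \in C^2(\mathbb{R}).
\]
Observing that $|\beta_\varepsilon(u)| \le |u|$, and using Hölder inequality and the inequalities
\[
\frac{1}{R}\,\chi_{\{R\le|x|\le 2R\}} \le \frac{3}{1+|x|}\,\chi_{\{|x|\ge R\}}, \qquad
\frac{1}{R^2}\,\chi_{\{R\le|x|\le 2R\}} \le \frac{5}{1+|x|^2}\,\chi_{\{|x|\ge R\}}, \tag{5.4.14}
\]
we get
\[
\begin{aligned}
\int_{\mathbb{R}^d} \varphi_R\,\beta_\varepsilon(u(t))\,dx
&\le \int_{\mathbb{R}^d} \varphi_R\,\beta_\varepsilon(u(0))\,dx + 2\varepsilon\int_0^t\int_{\{|x|\le 2R\}} \Big(\sum_i \partial_i\tilde b_i\Big)^-\,dx\,ds \\
&\quad + \|\varphi\|_{C^2}\bigg(6\,\Big\|\frac{b}{1+|x|}\Big\|_{L^2([0,T]\times\{|x|\ge R\})} + 5\,\Big\|\frac{a}{1+|x|^2}\Big\|_{L^2([0,T]\times\{|x|\ge R\})}\bigg)\,\|u\|_{L^2([0,t]\times\mathbb{R}^d)}.
\end{aligned}
\]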
Then (5.4.1) satisfies the comparison principle in $L^1(\mathbb{R}^d) \cap L^\infty(\mathbb{R}^d)$. In particular solutions of the PDE in $\mathcal{L}$, if they exist, are unique.
Since we do not assume any ellipticity of the PDE, in order to prove the above result
we use the technique of renormalized solutions, which was first introduced in the study
of the Boltzmann equation by DiPerna and P.-L.Lions [54, 55], and then applied in the
context of transport equations by many authors (see for example [56, 28, 47, 48, 4]).
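Let $u \in L^\infty_{loc}([0,T]\times\mathbb{R}^d)$ and assume that
\[
c := \partial_t u + \sum_i b_i\,\partial_i u - \frac12\sum_{ij} a_{ij}\,\partial_{ij} u \in L^1_{loc}([0,T]\times\mathbb{R}^d). \tag{5.4.15}
\]
We say that $u$ is a renormalized solution of (5.4.15) if, for any convex function $\beta : \mathbb{R} \to \mathbb{R}$ of class $C^2$, we have
\[
\partial_t\beta(u) + \sum_i b_i\,\partial_i\beta(u) - \frac12\sum_{ij} a_{ij}\,\partial_{ij}\beta(u) \le c\,\beta'(u).
\]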
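Now, since
\[
\begin{aligned}
\sum_{ij} a_{ij}\,\partial_{ij}\beta(u)
&= \sum_{ij}\partial_j\big(a_{ij}\,\partial_i\beta(u)\big) - \sum_{ij}\partial_j a_{ij}\,\partial_i\beta(u) \\
&= \sum_{ij}\partial_{ij}\big(a_{ij}\,\beta(u)\big) - 2\sum_{ij}\partial_i\big((\partial_j a_{ij})\,\beta(u)\big) + \Big(\sum_{ij}\partial_{ij}a_{ij}\Big)\beta(u),
\end{aligned}
\]
the above expression can be simplified, and we obtain that a solution of the Fokker-Planck equation is renormalized if and only if
\[
\partial_t\beta(u) + \sum_i \partial_i\big(b_i\,\beta(u)\big) - \frac12\sum_{ij}\partial_{ij}\big(a_{ij}\,\beta(u)\big)
\le \Big(\sum_i \partial_i b_i - \frac12\sum_{ij}\partial_{ij}a_{ij}\Big)\big(\beta(u) - u\,\beta'(u)\big). \tag{5.4.16}
\]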
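As a sanity check on the identity displayed just before (5.4.16), the two product-rule steps can be written out term by term. This is only a sketch, assuming (as that identity implicitly does) that $a_{ij} = a_{ji}$ and that all derivatives are understood in the sense of distributions:

```latex
\begin{aligned}
\partial_{ij}\big(a_{ij}\beta(u)\big)
  &= \partial_i\big((\partial_j a_{ij})\beta(u)\big) + \partial_i\big(a_{ij}\,\partial_j\beta(u)\big), \\
(\partial_j a_{ij})\,\partial_i\beta(u)
  &= \partial_i\big((\partial_j a_{ij})\beta(u)\big) - (\partial_{ij}a_{ij})\,\beta(u).
\end{aligned}
% Summing over i, j and using the symmetry of a:
\sum_{ij}\partial_j\big(a_{ij}\,\partial_i\beta(u)\big) - \sum_{ij}\partial_j a_{ij}\,\partial_i\beta(u)
  = \sum_{ij}\partial_{ij}\big(a_{ij}\beta(u)\big)
    - 2\sum_{ij}\partial_i\big((\partial_j a_{ij})\beta(u)\big)
    + \Big(\sum_{ij}\partial_{ij}a_{ij}\Big)\beta(u).
```

The first line accounts for one copy of $\partial_i((\partial_j a_{ij})\beta(u))$ and the second line for the other, which is where the factor $2$ comes from.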
It is not difficult to prove the following:
Lemma 5.4.10. Assume that there exist $p, q \in [1,\infty]$ such that
\[
\frac{a}{1+|x|^2} \in L^1([0,T],L^p(\mathbb{R}^d)), \qquad \frac{b}{1+|x|} \in L^1([0,T],L^q(\mathbb{R}^d)),
\]
and that
\[
\Big(\sum_i \partial_i b_i - \frac12\sum_{ij}\partial_{ij}a_{ij}\Big)^- \in L^1_{loc}([0,T]\times\mathbb{R}^d).
\]
Setting $a, b = 0$ for $t < 0$, assume moreover that any solution $u \in \mathcal{L}$ of the Fokker-Planck equation in $(-\infty,T)\times\mathbb{R}^d$ is renormalized. Then the comparison principle holds in $\mathcal{L}$.
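Proof. By the linearity of the equation, it suffices to prove that
\[
u_0 \le 0 \;\Rightarrow\; u(t) \le 0 \quad \forall t \in [0,T].
\]
Fix a non-negative cut-off function $\varphi \in C_c^\infty(\mathbb{R}^d)$ with $\mathrm{supp}(\varphi) \subset B_2(0)$, and $\varphi = 1$ in $B_1(0)$, and take as renormalization function
\[
\beta_\varepsilon(s) := \frac12\Big(\sqrt{s^2+\varepsilon^2} + s - \varepsilon\Big) \in C^2(\mathbb{R}).
\]
Notice that $\beta_\varepsilon$ is convex and
\[
\beta_\varepsilon(s) \to s^+ \ \text{as } \varepsilon \to 0, \qquad \beta_\varepsilon(s) - s\,\beta_\varepsilon'(s) \in [-\varepsilon, 0].
\]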
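The stated properties of the renormalization function can be checked numerically on a grid. The sketch below is illustrative only: the function names, the sample grid and the tolerance `tol` are assumptions made here, and a finite grid check is of course not a proof.

```python
import math

def beta(s, eps):
    """Renormalization function (sqrt(s^2 + eps^2) + s - eps) / 2."""
    return 0.5 * (math.sqrt(s * s + eps * eps) + s - eps)

def beta_prime(s, eps):
    """Derivative of beta with respect to s."""
    return 0.5 * (s / math.sqrt(s * s + eps * eps) + 1.0)

def defect(s, eps):
    """beta(s) - s * beta'(s), claimed to lie in [-eps, 0]."""
    return beta(s, eps) - s * beta_prime(s, eps)

def check(eps, samples, tol=1e-9):
    for s in samples:
        # the quantity multiplying the divergence term stays in [-eps, 0]
        assert -eps - tol <= defect(s, eps) <= tol
        # beta is within eps of the positive part s^+ = max(s, 0)
        assert abs(beta(s, eps) - max(s, 0.0)) <= eps + tol
        # convexity: the derivative is non-decreasing
        assert beta_prime(s, eps) <= beta_prime(s + 0.1, eps) + tol
    return True

samples = [k / 10.0 for k in range(-50, 51)]
for eps in (1.0, 0.1, 1e-3):
    check(eps, samples)
```

A direct computation gives $\beta_\varepsilon(s) - s\beta_\varepsilon'(s) = \tfrac12\big(\varepsilon^2/\sqrt{s^2+\varepsilon^2} - \varepsilon\big)$, which is consistent with the interval checked above.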
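By (5.4.16), we know that
\[
\partial_t\beta_\varepsilon(u) + \sum_i \partial_i\big(b_i\,\beta_\varepsilon(u)\big) - \frac12\sum_{ij}\partial_{ij}\big(a_{ij}\,\beta_\varepsilon(u)\big)
\le \Big(\sum_i \partial_i b_i - \frac12\sum_{ij}\partial_{ij}a_{ij}\Big)\big(\beta_\varepsilon(u) - u\,\beta_\varepsilon'(u)\big)
\]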
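Observing that $|\beta_\varepsilon(u)| \le |u|$, by Hölder inequality and the inequalities (5.4.14) we can bound the first integral in the right hand side, uniformly with respect to $\varepsilon$, with
\[
\begin{aligned}
&\|\varphi\|_{C^2}\int_{\{|x|\ge R\}} \Big(3\,\frac{|b(t,x)|}{1+|x|} + \frac52\,\frac{|a(t,x)|}{1+|x|^2}\Big)\,|u(t,x)|\,dx \\
&\quad\le \|\varphi\|_{C^2}\bigg(3\,\Big\|\frac{b(t)}{1+|x|}\Big\|_{L^p(\{|x|\ge R\})}\,\|u(t)\|_{L^{p'}(\mathbb{R}^d)}
+ \frac52\,\Big\|\frac{a(t)}{1+|x|^2}\Big\|_{L^q(\{|x|\ge R\})}\,\|u(t)\|_{L^{q'}(\mathbb{R}^d)}\bigg)
\end{aligned}
\]
(recall that $u \in \mathcal{L}$, and thus $u \in L^\infty([0,T],L^r(\mathbb{R}^d))$ for any $r \in [1,\infty]$), while the second integral is bounded by
\[
\varepsilon\int_{\{|x|\le 2R\}} \Big(\sum_i \partial_i b_i - \frac12\sum_{ij}\partial_{ij}a_{ij}\Big)^-\,dx.
\]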
2. $a \in L^\infty([0,T],S_+(\mathbb{R}^d))$
Then any distributional solution $u \in L^\infty_{loc}([0,T]\times\mathbb{R}^d)$ of (5.4.15) is renormalized.
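Proof. We take $\eta$, a smooth convolution kernel in $\mathbb{R}^d$, and we mollify the equation with respect to the spatial variable obtaining
\[
\partial_t u_\varepsilon + \sum_i b_i\,\partial_i u_\varepsilon - \frac12\sum_{ij} a_{ij}\,\partial_{ij}u_\varepsilon = c * \eta_\varepsilon - r_\varepsilon, \tag{5.4.17}
\]
where
\[
r_\varepsilon := \sum_i (b_i\,\partial_i u) * \eta_\varepsilon - \sum_i b_i\,\partial_i(u * \eta_\varepsilon), \qquad u_\varepsilon := u * \eta_\varepsilon.
\]
By the standard chain rule in Sobolev spaces we get that $u_\varepsilon$ is a renormalized solution, that is
\[
\partial_t\beta(u_\varepsilon) + \sum_i b_i\,\partial_i\beta(u_\varepsilon) - \frac12\sum_{ij} a_{ij}\,\partial_{ij}\beta(u_\varepsilon) \le (c * \eta_\varepsilon - r_\varepsilon)\,\beta'(u_\varepsilon)
\]
for any $\beta \in C^2(\mathbb{R})$ convex. Passing to the limit in the distributional sense as $\varepsilon \to 0$ in the above inequality, the convergence of all the terms is trivial except for $r_\varepsilon\beta'(u_\varepsilon)$.
Let $\sigma_\eta$ be any weak limit point of $r_\varepsilon\beta'(u_\varepsilon)$ in the sense of measures (such a cluster point exists since $r_\varepsilon\beta'(u_\varepsilon)$ is bounded in $L^1_{loc}$). Thus we get
\[
\partial_t\beta(u) + \sum_i b_i\,\partial_i\beta(u) - \frac12\sum_{ij} a_{ij}\,\partial_{ij}\beta(u) - c\,\beta'(u) \le -\sigma_\eta \le |\sigma_\eta|.
\]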
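Then, for any $\mu_0 = \rho_0\mathscr{L}^d \in \mathcal{M}_+(\mathbb{R}^d)$, with $\rho_0 \in L^1(\mathbb{R}^d) \cap L^\infty(\mathbb{R}^d)$, there exists a solution of (5.1.2) in $\mathcal{L}_+$. If moreover $b \in L^1([0,T],BV_{loc}(\mathbb{R}^d))$, $\sum_i \partial_i b_i \in L^1_{loc}([0,T]\times\mathbb{R}^d)$, and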
is uniformly bounded in $L^1([0,T],L^\infty(\mathbb{R}^d))$. Indeed, if we now consider the approximate solutions $\mu^n_t = \rho^n_t\mathscr{L}^d \in \mathcal{M}_+(\mathbb{R}^d)$, we know that
\[
\partial_t\rho^n_t + \sum_i \partial_i(b^n_i\rho^n_t) - \frac12\sum_{ij}\partial_{ij}(a^n_{ij}\rho^n_t) = 0,
\]
that is
\[
\partial_t\rho^n_t - \frac12\sum_{ij} a^n_{ij}\,\partial_{ij}\rho^n_t + \sum_i\Big(b^n_i - \sum_j \partial_j a^n_{ij}\Big)\partial_i\rho^n_t + \Big(\sum_i \partial_i b^n_i - \frac12\sum_{ij}\partial_{ij}a^n_{ij}\Big)\rho^n_t = 0.
\]
So we see that the approximate solutions are non-negative and uniformly bounded in $L^1 \cap L^\infty$ (the bound in $L^1$ follows by the constancy of the map $t \mapsto \|\rho^n_t\|_{L^1}$; observe that $\rho^n_t \ge 0$ and recall Remark 5.2.8). Therefore, any weak limit is a solution of the PDE in $\mathcal{L}_+$.
Uniqueness: it follows by Theorem 5.4.8. $\square$
5.5 Conclusions
Let us now combine the results proved in Sections 5.2 and 5.4 in order to get existence
and uniqueness of SLF. The first theorem follows directly by Corollary 5.3.6 and Theorem
5.1.3, while the second is a consequence of Corollary 5.3.6 and Theorem 5.1.4.
Then there exists a unique SLF (in the sense of Corollary 5.3.6).
If moreover $(b^n, a^n) \to (b, a)$ in $L^1_{loc}([0,T]\times\mathbb{R}^d)$ and $(\sum_i \partial_i b^n_i - \frac12\sum_{ij}\partial_{ij}a^n_{ij})^-$ are uniformly bounded in $L^1([0,T],L^\infty(\mathbb{R}^d))$, then the Feynman-Kac formula implies (ii) of Theorem 5.3.7 (see the proof of Theorem 5.4.12). Thus we have stability of SLF.
Theorem 5.5.2. Let us assume that $a : [0,T] \to S(\mathbb{R}^d)$ and $b : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ are bounded functions such that:
1. $b \in L^1([0,T],BV_{loc}(\mathbb{R}^d))$, $\sum_i \partial_i b_i \in L^1_{loc}([0,T]\times\mathbb{R}^d)$;
2. $(\sum_i \partial_i b_i)^- \in L^1([0,T],L^\infty(\mathbb{R}^d))$.
Then there exists a unique SLF (in the sense of Corollary 5.3.6).
If moreover $(b^n, a^n) \to (b, a)$ in $L^1_{loc}([0,T]\times\mathbb{R}^d)$ and $(\sum_i \partial_i b^n_i - \frac12\sum_{ij}\partial_{ij}a^n_{ij})^-$ are uniformly bounded in $L^1([0,T],L^\infty(\mathbb{R}^d))$, then the Feynman-Kac formula implies (ii) of Theorem 5.3.7 (see the proof of Theorem 5.4.12). Thus we have stability of SLF.
In particular, by Corollary 5.3.9 and the Feynman-Kac formula (see the proof of
Theorem 5.4.12), the following vanishing viscosity result for RLF holds:
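Let $\{\nu^\varepsilon_x\}_{x\in\mathbb{R}^d}$ be the unique SLF relative to $(b, \varepsilon I)$, with $\varepsilon > 0$, and $\{\nu_x\}_{x\in\mathbb{R}^d}$ be the RLF relative to $(b, 0)$ (which is uniquely determined $\mathscr{L}^d$-a.e. by the results in [4]). Then, as $\varepsilon \to 0$,
\[
\int_{\mathbb{R}^d} \nu^\varepsilon_x\,f(x)\,dx \rightharpoonup^* \int_{\mathbb{R}^d} \nu_x\,f(x)\,dx \quad \text{in } \mathcal{M}(\Gamma_T) \text{ for any } f \in C_c(\mathbb{R}^d).
\]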
2. $|b(t,x) - b(s,y)| + \|a(t,x) - a(s,y)\| \le C\,(|x-y|^\gamma + |t-s|^\gamma)$ $\forall (t,x), (s,y) \in [0,T]\times\mathbb{R}^d$, for some $\gamma \in (0,1]$, $C \ge 0$;
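3. $\sum_j \partial_j a_{ij} \in L^\infty([0,T]\times\mathbb{R}^d)$ for $i = 1,\dots,d$, $(\sum_i \partial_i b_i - \frac12\sum_{ij}\partial_{ij}a_{ij})^- \in L^\infty([0,T]\times\mathbb{R}^d)$;
4. $\frac{a}{1+|x|^2} \in L^2([0,T]\times\mathbb{R}^d)$, $\frac{b}{1+|x|} \in L^2([0,T]\times\mathbb{R}^d)$.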
Then, there exists a unique martingale solution starting from x (at time 0) for any
x ∈ Rd .
We remark that this result is not interesting by itself, since it can be proved that the
martingale problem starting from any x ∈ Rd at any initial time s ∈ [0, T ] is well-posed
also under weaker regularity assumptions (see [125, Chapters 6 and 7]). We stated it
just because we believe that it is an interesting example of how existence and uniqueness
at the PDE level can be combined with a refined analysis at the level of the uniqueness
of martingale solutions. It is indeed in this spirit that we generalize Theorem 5.2.2 in
the following section, hoping that it could be useful for further analogous applications.
Then, given two measurable families of probability measures $\{\nu^1_x\}_{x\in\mathbb{R}^d}$ and $\{\nu^2_x\}_{x\in\mathbb{R}^d}$ with $\nu^1_x, \nu^2_x \in C_x$, we have $\nu^1_x = \nu^2_x$ for $\mu_0$-a.e. $x$. In particular, by standard measurable selection theorems (see for instance [125, Chapter 12]), $C_x$ is a singleton for $\mu_0$-a.e. $x$.
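Proof. Let $\{\nu^1_x\}_{x\in\mathbb{R}^d}$ and $\{\nu^2_x\}_{x\in\mathbb{R}^d}$ be two measurable families of probability measures with $\nu^1_x, \nu^2_x \in C_x$, and fix $0 < t_1 < \dots < t_n \le T$.
Claim: for $\mu_0$-a.e. $x$, for $\nu^i_x$-a.e. $\gamma$ ($i = 1, 2$),
\[
\nu^{i,\tilde\gamma}_{x,\mathcal{F}_{t_n}} \in C_{\tilde\gamma(t_n),t_n} \quad \text{for } \nu^{i,\gamma}_{x,\mathcal{M}^{t_1,\dots,t_n}}\text{-a.e. } \tilde\gamma,
\]
where $\nu^{i,\gamma}_{x,\mathcal{M}^{t_1,\dots,t_n}} := (\nu^i_x)^\gamma_{\mathcal{M}^{t_1,\dots,t_n}}$.
This claim follows observing that, by assumption (iii), for $\mu_0$-a.e. $x$ there exists a subset $\Gamma_x \subset \Gamma_T$ such that $\nu^i_x(\Gamma_x) = 1$ and $\nu^{i,\gamma}_{x,\mathcal{F}_{t_n}} \in C_{\gamma(t_n),t_n}$ for any $\gamma \in \Gamma_x$. Thus, by (5.3.1) applied with $\nu := \nu^i_x$, $A := \Gamma_T$, $B := \Gamma_x$, and with $\mathcal{M}^{t_1,\dots,t_n}$ in place of $\mathcal{F}_{t_n}$, one obtains
\[
0 = \nu^i_x(\Gamma_x^c) = \int_{\Gamma_T} \nu^{i,\gamma}_{x,\mathcal{M}^{t_1,\dots,t_n}}(\Gamma_x^c)\,d\nu^i_x(\gamma),
\]
that is,
\[
\text{for } \nu^i_x\text{-a.e. } \gamma, \quad \nu^{i,\gamma}_{x,\mathcal{M}^{t_1,\dots,t_n}}(\Gamma_x) = 1.
\]
Let $A \subset \mathbb{R}^d$ be such that $\mu_0(A^c) = 0$ and assumption (i) is true for any $x \in A$. By assumption (iv), we have $\mu_{t_n}(A^c) = 0 = \int_{\mathbb{R}^d\times\Gamma_T} 1_{A^c}(\gamma(t_n))\,d\nu^i_x(\gamma)\,d\mu_0(x)$, that is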
is a martingale solution starting from γ(tn ) at time tn ). We now want to prove that, for
all $n \ge 1$, $0 < t_1 < \dots < t_n \le T$, we have that, for $\mu_0$-a.e. $x$,
\[
\int_{\Gamma_T} f_1(e_{t_1}(\gamma)) \dots f_n(e_{t_n}(\gamma))\,d\nu^1_x(\gamma) = \int_{\Gamma_T} f_1(e_{t_1}(\gamma)) \dots f_n(e_{t_n}(\gamma))\,d\nu^2_x(\gamma) \tag{5.6.3}
\]
for any fi ∈ Cc (Rd ). We observe that (5.6.3) is true for n = 1 by assumption (ii). We
want to prove it for any n by induction. Let us assume (5.6.3) true for n − 1, and let us
prove it for $n$.
We want to show that
\[
\int_{\Gamma_T} f_1(e_{t_1}(\gamma)) \dots f_n(e_{t_n}(\gamma))\,d\nu^1_x(\gamma) = \int_{\Gamma_T} f_1(e_{t_1}(\gamma)) \dots f_n(e_{t_n}(\gamma))\,d\nu^2_x(\gamma),
\]
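\[
\begin{aligned}
E^{\nu^i_x}[f_1(e_{t_1}) \dots f_n(e_{t_n})]
&= E^{\nu^i_x}\Big[E^{\nu^i_x}[f_1(e_{t_1}) \dots f_n(e_{t_n}) \mid \mathcal{M}^{t_1,\dots,t_{n-1}}]\Big] \\
&= E^{\nu^i_x}\Big[f_1(e_{t_1}) \dots f_{n-1}(e_{t_{n-1}})\,E^{\nu^i_x}[f_n(e_{t_n}) \mid \mathcal{M}^{t_1,\dots,t_{n-1}}]\Big] \\
&= E^{\nu^i_x}\big[f_1(e_{t_1}) \dots f_{n-1}(e_{t_{n-1}})\,\psi^i_x(e_{t_1},\dots,e_{t_{n-1}})\big],
\end{aligned}
\]
where $\psi^i_x(e_{t_1},\dots,e_{t_{n-1}}) := E^{\nu^i_x}[f_n(e_{t_n}) \mid \mathcal{M}^{t_1,\dots,t_{n-1}}]$. Let $\phi \in C_c(\mathbb{R}^d)$, and let us prove that
\[
\begin{aligned}
&\int_{\mathbb{R}^d} E^{\nu^1_x}\big[f_1(e_{t_1}) \dots f_{n-1}(e_{t_{n-1}})\,\psi^1_x(e_{t_1},\dots,e_{t_{n-1}})\big]\,\phi(x)\,d\mu_0(x) \\
&\quad= \int_{\mathbb{R}^d} E^{\nu^2_x}\big[f_1(e_{t_1}) \dots f_{n-1}(e_{t_{n-1}})\,\psi^2_x(e_{t_1},\dots,e_{t_{n-1}})\big]\,\phi(x)\,d\mu_0(x).
\end{aligned}
\tag{5.6.4}
\]
\[
\text{for } \mu_0\text{-a.e. } x, \quad \nu^{i,\gamma}_{x,\mathcal{M}^{t_1,\dots,t_{n-1}}} \in C_{\gamma(t_{n-1}),t_{n-1}} \quad \text{for } \nu^i_x\text{-a.e. } \gamma,
\]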
where the first equality in the above equation follows by the inductive hypothesis. Now, by (5.6.4) and the arbitrariness of $\phi$ and of $f_j$, with $j = 1,\dots,n$, we obtain that, for all $n \ge 1$, $0 < t_1 < \dots < t_n \le T$, we have
\[
\text{for } \mu_0\text{-a.e. } x, \quad (e_{t_1},\dots,e_{t_n})_\#\nu_x = (e_{t_1},\dots,e_{t_n})_\#\tilde\nu_x \quad \forall t_1,\dots,t_n \in [0,T].
\]
Considering only rational times, we get that there exists a subset $D \subset \mathbb{R}^d$, with $\mu_0(D^c) = 0$, such that, for any $x \in D$,
\[
(e_{t_1},\dots,e_{t_n})_\#\nu_x = (e_{t_1},\dots,e_{t_n})_\#\tilde\nu_x \quad \text{for any } t_1,\dots,t_n \in [0,T]\cap\mathbb{Q}.
\]
By continuity, this implies that, for any $x \in D$, $\nu_x = \tilde\nu_x$, as wanted. $\square$
The above result applies, for example, in the case when $C_{x,s}$ denotes the set of all martingale solutions starting from $x$. In particular, we remark that, by the above proof, one obtains the well-known fact that, if $\nu_x$ is a martingale solution starting from $x$ (at time 0), then, for any $0 \le t_1 \le \dots \le t_n \le T$, $\nu^\gamma_{x,\mathcal{M}^{t_1,\dots,t_n}}$ is a martingale solution starting from $\gamma(t_n)$ at time $t_n$. More generally, since martingale solutions are closed under convex combinations, if $\mu$ is a probability measure on $\mathbb{R}^d$, the average $\int_{\mathbb{R}^d} \nu^\gamma_{x,\mathcal{M}^{t_1,\dots,t_n}}\,d\mu(x)$ is a
If (i) holds, we can define $\mu_t := (e_t)_\#\int_{\mathbb{R}^d} \nu_x\,d\mu_0(x)$ for a measurable selection $\{\nu_x\}_{x\in\mathbb{R}^d}$ with $\nu_x \in C_x$, and this definition does not depend on the choice of $\nu_x \in C_x$. We now assume that:
Chapter 6
Appendix
Proof. The proof of 1)(i) is obvious. For the proof of (ii), we fix $x \in U$, and we find $i_0 \in \{1,\dots,k\}$ such that $\min_{i=1}^k f_i(x) = f_{i_0}(x)$. Since $f_{i_0}$ is $\omega_{i_0}$-semi-concave, we can find a linear map $l_x : \mathbb{R}^n \to \mathbb{R}$ such that
To prove 2), consider an open convex subset $C$ with $\bar C$ compact and contained in $U$. By compactness of $\bar C$ and continuity of $x \mapsto d_x f$, we can find a modulus $\omega$, which is a modulus of continuity for the map $x \mapsto d_x f$ on $C$. The Mean Value Formula in integral form
\[
f(y) - f(x) = \int_0^1 d_{tx+(1-t)y}f\,(y - x)\,dt,
\]
We now state and prove the first important consequences of the definition of semi-
concavity.
we have $\|l_x\| \le A$;
(ii) the function f is locally Lipschitz.
Proof. From the definition, it follows that a semi-concave function is locally bounded from above. We now show that $f$ is also locally bounded from below. Fix a (compact) cube $C$ contained in $U$ and let $\{y_1,\dots,y_{2^n}\}$ be the vertices of the cube. Then, for each $x \in C$, we can write $x = \sum_i \alpha_i y_i$, with $\sum_i \alpha_i = 1$. By the semi-concavity of $f$ we have, for each $i = 1,\dots,2^n$,
with B = DC ω(DC ), where DC is the diameter of the compact cube C. It follows that
We now know that $f$ is locally bounded. Using this fact, it is not difficult to show (i). In fact, suppose that the closed ball $\bar B(x_0, 2r)$, $r < +\infty$, is contained in $U$. For $x \in \bar B(x_0, r)$, we have $x - rv \in \bar B(x_0, 2r) \subset U$ for each $v \in \mathbb{R}^n$ with $\|v\| = 1$, and therefore
Since, by the compactness of $\bar B(x_0, 2r)$, we already know that $\tilde B = \sup_{z\in\bar B(x_0,2r)} |f(z)|$ is finite, this implies
\[
\|l_x\| \le \frac{2\tilde B}{r} + \omega(r).
\]
Since the compact set $K \subset U$ can be covered by a finite number of balls $\bar B(x_i, r_i)$, $i = 1,\dots,\ell$, we obtain (i).
To prove (ii), we consider a compact subset K ⊂ U , and we apply (i) to obtain the
constant A. We denote by DK the (finite) diameter of the compact set K. For each
x, y ∈ K,
Let us recall that a Lipschitz real valued function defined on an open subset of
a Euclidean space is differentiable almost everywhere (with respect to the Lebesgue
measure). Therefore by part (ii) of Lemma 6.1.5 above we obtain the following corollary:
Corollary 6.1.6. A locally semi-concave real valued function defined on an open subset
of a Euclidean space is differentiable almost everywhere with respect to the Lebesgue
measure.
In fact, in the case of semi-concave functions there is a better result which is given
in Theorem 6.1.8 below, whose proof can be found in [41, Section 4.1]. Let us first give
a definition:
1. E is contained in ∪j Kj ;
2. for each j there exists a hyperplane Hj ⊂ Rn = Hj ⊕Hj⊥ , where Hj⊥ is the Euclidean
orthogonal of Hj , such that Kj is contained in the graph of a Lipschitz function
fj : Aj → Hj⊥ defined on a compact subset Aj ⊂ Hj .
Note that in the definition above, by the graph property (ii), the compact subset Kj
has finite (n − 1)-dimensional Hausdorff measure. Therefore any (n − 1)-Lipschitz set
is contained in a Borel (in fact σ-compact) (n − 1)-Lipschitz set with σ-finite (n − 1)-
dimensional Hausdorff measure.
Proof. Since the nature of the result is local, without loss of generality we can assume that $f : U \to \mathbb{R}$ is semi-concave with modulus $\omega$. We now show that, for every $V'$ convex open subset whose closure $\bar V'$ is compact and contained in $V$, the restriction $f \circ F|_{V'} : V' \to \mathbb{R}$ is a semi-concave function. We set $C_{\bar V'} = \max_{z\in\bar V'} \|D_z F\|$, and we denote by $\hat\omega_{\bar V'}$ a modulus of continuity for the continuous function $z \mapsto D_z F$ on the compact subset $\bar V'$.
For each $x, y$ in the compact convex subset $\bar V' \subset V$, we have
\[
\begin{aligned}
f(F(y)) - f(F(x)) &\le \langle l_{F(x)}, F(y) - F(x)\rangle + \|F(y) - F(x)\|\,\omega(\|F(y) - F(x)\|) \\
&\le \langle l_{F(x)}, DF(x)(y - x)\rangle + \|l_{F(x)}\|\,\hat\omega_{\bar V'}(\|y - x\|)\,\|y - x\| \\
&\quad + C_{\bar V'}\,\|y - x\|\,\omega(C_{\bar V'}\|y - x\|);
\end{aligned}
\]
Since $F(\bar V')$ is a compact subset of $U$ we can apply part (i) of Lemma 6.1.5 to obtain that $\tilde C_{\bar V'} = \sup_{\bar V'} \|l_{F(x)}\|$ is finite. This implies that $f \circ F$ on $V'$ is semi-concave with the modulus
\[
\tilde\omega(r) = \tilde C_{\bar V'}\,\hat\omega_{\bar V'}(r) + C_{\bar V'}\,\omega(C_{\bar V'}\,r).
\]
If F is C2 , then its derivative DF is locally Lipschitz on U , and we can assume that ω̂V̄ 0
is a linear modulus. Therefore, if ω is a linear modulus, we obtain that ω̃ is also a linear
modulus.
Thanks to the previous lemma, we can define a locally semi-concave function (resp. a locally semi-concave function for a linear modulus) on a manifold as a function whose restriction to charts is, when computed in coordinates, locally semi-concave (resp. locally semi-concave for a linear modulus). Moreover, it suffices to check this local semi-concavity in charts for a family of charts whose domains of definition cover the manifold.
It is not difficult to see that Theorem 6.1.8 is valid on any (second countable) manifold,
since we can cover such a manifold by the domains of definition of a countable family of
charts.
Now we want to introduce the notion of uniformly semi-concave family of functions.
where DC is the diameter of the compact cube C. Using the fact that f (yj ) = inf i∈I fi (yj )
is finite, it follows that there exists A ∈ R such that
∀x ∈ C, ∀i ∈ I, fi (x) ≥ A.
Choose now ε > 0 such that B̄(x0 , ε) ⊂ C. If li : Rn → R is a linear form such that
∀y ∈ U, fi (y) ≤ fi (x0 ) + hli , y − x0 i + ky − x0 kω(ky − x0 k),
we obtain that, for every v ∈ Rn of norm 1,
A ≤ fi (x0 ) + hli , εvi + εω(ε).
Since $f_{i_n}(x_0) \searrow f(x_0)$, we can assume $f_{i_n}(x_0) \le M < +\infty$ for all $n$, which implies
\[
\|l_{i_n}\| \le \frac{M - A}{\varepsilon} + \omega(\varepsilon) < +\infty.
\]
Up to extracting a subsequence, we can assume $l_{i_n} \to l$ in $\mathbb{R}^{n*}$, the dual space of $\mathbb{R}^n$. Then, as for every $y \in U$ we have $f(y) \le f_{i_n}(y)$, passing to the limit in $n$ in the inequality
\[
f(y) \le f_{i_n}(x_0) + \langle l_{i_n}, y - x_0\rangle + \|y - x_0\|\,\omega(\|y - x_0\|),
\]
we get
\[
f(y) \le f(x_0) + \langle l, y - x_0\rangle + \|y - x_0\|\,\omega(\|y - x_0\|).
\]
Since x0 ∈ U is arbitrary, this concludes the proof.
since $\omega \ge 0$. Consider now the diffeomorphism $\varphi : \mathbb{R}^*_+ \to \mathbb{R}^*_+$, $\varphi(x) = x^2$. Then there does not exist a non-empty open subset $U \subset \mathbb{R}^*_+$, and a modulus of continuity $\omega$, such that the family $(f_k \circ \varphi|_U)_{k\in\mathbb{R}}$ is (uniformly) $\omega$-semi-concave. Suppose in fact, by contradiction, that
\[
f_k \circ \varphi(y) - f_k \circ \varphi(x) \le l_x(y - x) + |y - x|\,\omega(|y - x|),
\]
where $l_x$ may depend on $k$ but $\omega$ does not. Since $f_k \circ \varphi$ is differentiable we must have $l_x(y - x) = (f_k \circ \varphi)'(x)(y - x) = 2kx(y - x)$. Therefore we should have
is $\omega_{i,j}$-semi-concave on $\tilde U_i \times \tilde W_j$, for some modulus $\omega_{i,j}$. It is then clear that the family
$(c(\varphi_i^{-1}(\tilde x), \psi_j^{-1}(\tilde y)))_{\tilde y\in\tilde W_j}$
We end this section with another useful theorem. The proof we give is an adaptation
of the proof of [64, Lemma 3.8, page 494].
In particular, we get
Therefore
\[
|[l_{2,x} - l_{1,x}](v)| \le 2\|v\|_{euc}\,\omega(\|v\|_{euc}),
\]
for $v$ small enough. Since $l_{2,x} - l_{1,x}$ is linear it must be identically 0. We set $l_x = l_{2,x} = l_{1,x}$.
For $i = 1, 2$ and $y \in \mathring{B}$, we obtain from (6.1.1)
which implies, exchanging k with −k, and using that the modulus ω is non-decreasing
Since $\|y_2 - y_1\|_{euc} < 2$, we can apply the inequality (1.5.3) above with any $k$ such that $\|k\|_{euc} = (1 - r)\|y_2 - y_1\|_{euc}/2$. If we divide the inequality (1.5.3) by $\|k\|_{euc}$, and take the sup over all $k$ such that $\|k\|_{euc} = (1 - r)\|y_2 - y_1\|_{euc}/2$, we obtain
\[
\|d_{y_1}\varphi_1 - d_{y_2}\varphi_1\|_{euc} \le 2\Big[\frac{2}{1-r} + 1\Big]\,\omega\Big(\Big(1 + \frac{1-r}{2}\Big)\|y_2 - y_1\|_{euc}\Big).
\]
It follows that a modulus of continuity of $x \mapsto d_x\varphi_1$ on $E \cap r\mathring{B}$ is given by
\[
t \mapsto \frac{6 - 2r}{1 - r}\,\omega\Big(\frac{3 - r}{2}\,t\Big).
\]
This implies the continuity of the map $x \mapsto d_x\varphi_1$ on $E \cap r\mathring{B}$. It also shows that it is Lipschitz on $E \cap r\mathring{B}$ when $\omega$ is a linear modulus.
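For the reader's convenience, the passage from the estimate on $\|d_{y_1}\varphi_1 - d_{y_2}\varphi_1\|_{euc}$ to the stated modulus of continuity is just a rewriting of the two constants. The short computation below is a sketch of that arithmetic, with $t$ standing for $\|y_2 - y_1\|_{euc}$:

```latex
2\Big[\frac{2}{1-r} + 1\Big] = \frac{2\,(2 + 1 - r)}{1-r} = \frac{6 - 2r}{1-r},
\qquad
1 + \frac{1-r}{2} = \frac{3 - r}{2},
\qquad\text{so}\qquad
2\Big[\frac{2}{1-r} + 1\Big]\,\omega\Big(\Big(1 + \frac{1-r}{2}\Big)t\Big)
  = \frac{6-2r}{1-r}\,\omega\Big(\frac{3-r}{2}\,t\Big).
```

In particular, if $\omega(t) = Kt$ is linear, the right hand side equals $\frac{(3-r)^2}{1-r}\,K\,t$, which is the Lipschitz bound alluded to in the last sentence.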
Note that the integral is well defined with values in R ∪ {+∞}, because L is bounded
below, and s → L(γ(s), γ̇(s)) is defined a.e. and measurable. To make things simpler to
write, we set AL (γ) = +∞ if γ is not absolutely continuous.
(a) L is C1 ;
(c) there exist a complete Riemannian metric g on M and a constant C > −∞ such
that
∀(x, v) ∈ T M, L(x, v) ≥ kvkx + C
where k·kx is the norm on Tx M obtained from the Riemannian metric g;
We will say that $L$ is a Tonelli Lagrangian, if it is a weak Tonelli Lagrangian, and satisfies the following two strengthenings of conditions (a) and (b) above:
(a') $L$ is $C^2$;
(b') for every $(x, v) \in TM$, the second partial derivative $\frac{\partial^2 L}{\partial v^2}(x, v)$ is positive definite on $T_xM$.
Since above a compact subset of a manifold all Riemannian metrics are equivalent, if
condition (d) in the definition is satisfied for one particular Riemannian metric, then it
is satisfied for any other Riemannian metric.
Note that when L is a weak Tonelli Lagrangian on M , and U : M → R is a C1
function which is bounded below, then L + U , defined by (L + U )(x, v) = L(x, v) + U (x)
is a weak Tonelli Lagrangian. If moreover L is a Tonelli Lagrangian, and U is C2 and
bounded below, then L + U is a Tonelli Lagrangian. Therefore one can generate a lot of
(weak) Tonelli Lagrangians from the following simple example.
Example 6.2.5. Suppose that g is a complete smooth Riemannian metric on M , and
r > 1. We define the Lagrangian Lr,g on M by
and that these partial derivatives are continuous. Therefore $L$ is $C^1$. A simple computation gives
\[
\frac{\partial L_{r,g}}{\partial v}(x, v) = r\,\|v\|_x^{r-2}\,g_x(v, \cdot).
\]
We now prove conditions (c) and (d) of Definition 6.2.4 at once. In fact, if $A$ is given, we have
\[
L_{r,g}(x, v) = \|v\|_x^r \ge A\|v\|_x - A^{r/(r-1)},
\]
as one can see by considering separately the two cases $\|v\|_x^{r-1} \ge A$ and $\|v\|_x^{r-1} \le A$. The rest of the proof is easy.
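The superlinearity bound above only involves the scalar $t = \|v\|_x \ge 0$, so it can be sanity-checked on a grid. The sketch below is illustrative: the function names, the grids and the tolerance are assumptions made here, and the check does not replace the two-case argument in the text.

```python
def superlinear_gap(t, A, r):
    """t**r - (A*t - A**(r/(r-1))) for t = ||v||_x >= 0, A >= 0, r > 1.

    The inequality in the text says this quantity is non-negative.
    """
    return t ** r - (A * t - A ** (r / (r - 1.0)))

def min_gap(r, A_values, t_values):
    """Smallest gap over the sampled slopes A and norms t."""
    return min(superlinear_gap(t, A, r) for A in A_values for t in t_values)

t_values = [k / 20.0 for k in range(0, 401)]  # norms in [0, 20]
A_values = [k / 4.0 for k in range(0, 41)]    # slopes in [0, 10]
for r in (1.5, 2.0, 3.0):
    # small negative tolerance only to absorb floating-point rounding
    assert min_gap(r, A_values, t_values) >= -1e-9
```

The two cases of the proof are visible in the code: when $t^{r-1} \ge A$ the term $t^r - At$ is already non-negative, and otherwise $At \le A^{r/(r-1)}$.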
The completeness of the Riemannian metric in condition (c) of Definition 6.2.4 above
is crucial to guarantee that a set of the form
In fact in [38, Theorem 3.7, page 114], the existence of absolutely continuous mini-
mizers is valid under very general hypotheses on the Lagrangian L (the C1 hypothesis
on L is much stronger than necessary). We now come to the problem of regularity of
minimizers which uses the C1 hypothesis on L:
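which is an integrated form of the Euler-Lagrange equation. This implies that $\partial L/\partial v(\gamma(t), \dot\gamma(t))$ is a $C^1$ function of $t$ with
\[
\frac{d}{dt}\Big[\frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t))\Big] = \frac{\partial L}{\partial x}(\gamma(t), \dot\gamma(t)).
\]
Moreover, if $L$ is a $C^r$ Tonelli Lagrangian, with $r \ge 2$, then any minimizer is of class $C^r$.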
Proof. We will only sketch the proof. If L is a Tonelli Lagrangian, this theorem would
be a formulation of what is nowadays called Tonelli’s existence and regularity theory.
In that case its proof can be found in many places, for example [38], [45], or [65]. The
fact that the regularity of minimizers holds for C1 (or even less smooth) Lagrangians is
more recent. The fact that a minimizer is Lipschitz has been established by Clarke and
Vinter, see [44, Corollary 1, page 77, and Corollary 3.1, page 90] (again the hypothesis
L is C1 is stronger than the one required in this last work). The same fact under weaker
regularity assumptions on L has been proved in [6]. A short and elegant proof of the fact
that a minimizer for the class of absolutely continuous curves is necessarily Lipschitz has
been given by Clarke, see [46]. Once one knows that γ is Lipschitz, when L is C1 it is
possible to differentiate the action, see [38], [45], or [65], and, using an integration by
parts, one can show that γ satisfies the following integrated form of the Euler-Lagrange
equation for almost every t ∈ [t0 , t1 ], for some fixed linear form c:
\[
\frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t)) = c + \int_{t_0}^t \frac{\partial L}{\partial x}(\gamma(s), \dot\gamma(s))\,ds. \tag{6.2.2}
\]
But the continuity of the right hand side in (6.2.2) implies that $\partial L/\partial v(\gamma(t), \dot\gamma(t))$ extends continuously everywhere on $[t_0, t_1]$. Conditions (a) and (b) on $L$ imply that the global Legendre transform
\[
\mathscr{L} : TM \to T^*M, \qquad (x, v) \mapsto \Big(x, \frac{\partial L}{\partial v}(x, v)\Big),
\]
is continuous and injective, therefore a homeomorphism on its image by, for example,
Brouwer’s Theorem on the Invariance of Domain (see also Proposition 6.2.9 below). We
therefore conclude that γ̇(t) has a continuous extension to [t0 , t1 ]. Since γ is Lipschitz this
implies that γ is C1 . Equation (6.2.1) follows from (6.2.2), which now holds everywhere
by continuity.
In fact we will only use the cases when L is C2 , in which case this regularity of
minimizers will follow from the “usual” Tonelli regularity theory, or when L is of the
form $L(x, v) = \|v\|_x^p$, $p > 1$, where the norm is obtained from a $C^2$ Riemannian metric,
in which case the minimizers are necessarily geodesics which are of course as smooth as
the Riemannian metric, see Proposition 6.2.24 below.
To obtain further properties it is necessary to introduce the global Legendre trans-
form.
But this last quantity tends to $-\infty$, as $\|v\|_x \to +\infty$. Therefore the continuous function $v \mapsto p(v) - L(x, v)$ achieves a maximum at some point $v_p \in T_xM$. Since this function is $C^1$, its derivative at $v_p$ must be 0. This yields $p - \partial L/\partial v(x, v_p) = 0$. Hence $(x, p) = \mathscr{L}(x, v_p)$.
To prove injectivity of $\mathscr{L}$, it suffices to show that for $v, v' \in T_xM$, with $v \ne v'$, we have $\partial L/\partial v(x, v) \ne \partial L/\partial v(x, v')$. Consider the function $\varphi : [0,1] \to \mathbb{R}$, $t \mapsto L(x, tv + (1-t)v')$, which by condition (b) of Definition 6.2.4 is strictly convex. Since it is $C^1$, we must have $\varphi'(0) \ne \varphi'(1)$. In fact, if that was not the case, then the non-decreasing function $\varphi'$ would be constant on $[0,1]$, and $\varphi$ would be affine on $[0,1]$. This contradicts strict convexity. By a simple computation, we therefore get
\[
\frac{\partial L}{\partial v}(x, v')(v - v') = \varphi'(0) \ne \varphi'(1) = \frac{\partial L}{\partial v}(x, v)(v - v').
\]
This implies $\partial L/\partial v(x, v') \ne \partial L/\partial v(x, v)$. We now show that $\mathscr{L}$ is a homeomorphism.
Since this map is continuous, and bijective, we have to check that it is proper, i.e. inverse images under $\mathscr{L}$ of compact subsets of $T^*M$ are (relatively) compact. For this it suffices to show that for every compact subset $K \subset M$, and every $C < +\infty$, the set
\[
\Big\{(x, v) \in T_KM \;\Big|\; \Big\|\frac{\partial L}{\partial v}(x, v)\Big\|_x \le C\Big\}
\]
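\[
\frac{\partial L}{\partial v}(x, v)(v) \ge L(x, v) - L(x, 0).
\]
But $\|\partial L/\partial v(x, v)\|_x \ge \partial L/\partial v(x, v)(v/\|v\|_x)$, therefore by condition (d) of Definition 6.2.4, we conclude that
\[
\forall A \ge 0,\ \forall (x, v) \in T_KM, \quad \Big\|\frac{\partial L}{\partial v}(x, v)\Big\|_x \ge A - [C(K, A)/\|v\|_x].
\]
Taking $A = C + 1$, we get the inclusion
\[
\Big\{(x, v) \in T_KM \;\Big|\; \Big\|\frac{\partial L}{\partial v}(x, v)\Big\|_x \le C\Big\} \subset \{(x, v) \in T_KM \mid \|v\|_x \le C(K, C + 1)\},
\]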
and the compactness of the first set follows.
Suppose now that $L$ is a $C^r$ Tonelli Lagrangian with $r \ge 2$. Obviously $\mathscr{L}$ is $C^{r-1}$. By the inverse function theorem, to show that it is a $C^{r-1}$ diffeomorphism, it suffices to show that the derivative is invertible at each point of $TM$. But a simple computation in coordinates shows that the derivative of $\mathscr{L}$ at $(x, v)$ is given in matrix form by
\[
\begin{pmatrix}
\mathrm{Id} & 0 \\[2pt]
\dfrac{\partial^2 L}{\partial x\,\partial v}(x, v) & \dfrac{\partial^2 L}{\partial v^2}(x, v)
\end{pmatrix}.
\]
This is clearly invertible by (b') of Definition 6.2.4.
(d∗ ) for every compact subset K ⊂ M the restriction of H to TK∗ M = ∪x∈K Tx∗ M is
superlinear in the fibers of T ∗ M → M : this means that for every A ≥ 0, there
exists a finite constant C ∗ (A, K) such that
Since L(x, 0) ≤ C for x ∈ V , it follows that, for kvkeuc > C − C(R + 1),
This implies
Therefore the sup in the definition of $H(x, p)$ is attained at a point $v_{(x,p)}$ with $\|v_{(x,p)}\|_{euc} \le C - C(R + 1)$. Note that this point $v_{(x,p)}$ is unique (compare with the argument proving
that the Legendre transform is surjective). In fact, at its maximum $v_{(x,p)}$, the $C^1$ function $v \mapsto p(v) - L(x, v)$ must have 0 derivative, and therefore
\[
p = \frac{\partial L}{\partial v}(x, v_{(x,p)}).
\]
This means $(x, p) = \mathscr{L}(x, v_{(x,p)})$, but the Legendre transform is injective by Proposition 6.2.9.
Note, furthermore, that the map
\[
f : V \times \{\|p\|_{euc} \le R\} \times \{\|v\|_{euc} \le C - C(R + 1)\} \to \mathbb{R},
\]
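\[
\frac{\partial H}{\partial p} \circ \mathscr{L}(x, v) = v \quad \text{and} \quad \frac{\partial H}{\partial x} \circ \mathscr{L}(x, v) = -\frac{\partial L}{\partial x}(x, v), \tag{6.2.3}
\]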
which proves (a∗ ). Note that when L is a Cr Tonelli Lagrangian, by Proposition 6.2.9
the Legendre transform L is a Cr−1 global diffeomorphism. From the expression of
the partial derivatives above, we conclude that ∂H/∂p and ∂H/∂x are both Cr−1 . This
proves (a’∗ ).
We now prove (b'$^*$). Taking the derivative in $v$ of the first equality in (6.2.3)
\[
\frac{\partial H}{\partial p}\Big[x, \frac{\partial L}{\partial v}(x, v)\Big] = v,
\]
we obtain
\[
\frac{\partial^2 H}{\partial p^2}(\mathscr{L}(x, v)) \cdot \frac{\partial^2 L}{\partial v^2}(x, v) = \mathrm{Id}_{\mathbb{R}^m},
\]
where the dot $\cdot$ represents the usual product of matrices. This means that the matrix representative of $\partial^2 H/\partial p^2(x, p)$ is the inverse of the matrix of a positive definite quadratic form, therefore $\partial^2 H/\partial p^2(x, p)$ is itself positive definite.
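The relations between $H$, $L$ and the Legendre transform can be illustrated numerically in one dimension. The sketch below is not taken from the thesis: the Lagrangian $L(x,v) = v^4/4 + v^2/2 - \cos x$ is a hypothetical example chosen here because it is $C^2$, strictly convex and superlinear in $v$, and all function names and step sizes are assumptions. It checks both equalities in (6.2.3) and the fact that $\partial^2 H/\partial p^2$ is the inverse of $\partial^2 L/\partial v^2$, using finite differences.

```python
import math

# Hypothetical one-dimensional Tonelli Lagrangian, for illustration only.
def L(x, v):
    return 0.25 * v ** 4 + 0.5 * v ** 2 - math.cos(x)

def dL_dv(x, v):
    return v ** 3 + v

def d2L_dv2(x, v):
    return 3.0 * v ** 2 + 1.0

def dL_dx(x, v):
    return math.sin(x)

def legendre_inverse(x, p, iterations=60):
    """Solve dL/dv(x, v) = p for v with Newton's method (d2L/dv2 >= 1)."""
    v = 0.0
    for _ in range(iterations):
        v -= (dL_dv(x, v) - p) / d2L_dv2(x, v)
    return v

def H(x, p):
    """Hamiltonian sup_v [p*v - L(x, v)], evaluated at the maximizer."""
    v = legendre_inverse(x, p)
    return p * v - L(x, v)

def dH_dp(x, p, h=1e-5):
    return (H(x, p + h) - H(x, p - h)) / (2.0 * h)

def dH_dx(x, p, h=1e-5):
    return (H(x + h, p) - H(x - h, p)) / (2.0 * h)

def d2H_dp2(x, p, h=1e-3):
    return (H(x, p + h) - 2.0 * H(x, p) + H(x, p - h)) / (h * h)

for x, v in [(0.3, 0.7), (-1.2, -0.4), (2.0, 1.5)]:
    p = dL_dv(x, v)  # (x, p) is the Legendre transform of (x, v)
    assert abs(dH_dp(x, p) - v) < 1e-6              # first equality in (6.2.3)
    assert abs(dH_dx(x, p) + dL_dx(x, v)) < 1e-6    # second equality in (6.2.3)
    assert abs(d2H_dp2(x, p) * d2L_dv2(x, v) - 1.0) < 1e-4
```

Finite differences are used on purpose, so that the check does not rely on the closed-form derivatives of $H$ that the proposition is establishing.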
We prove (b$^*$). Suppose $p_1 \ne p_2$ are both in $T_x^*M$. Fix $t \in (0, 1)$, and set $p_3 = tp_1 + (1 - t)p_2$. The covectors $p_1, p_2, p_3$ are all distinct. Call $v_1, v_2, v_3$ elements in $T_xM$ such that $p_i = \partial L/\partial v(x, v_i)$. By injectivity of the Legendre transform, the tangent vectors $v_1, v_2, v_3$ are also all distinct. Moreover, for $i = 1, 2$ we have
\[
H(x, p_3) = p_3(v_3) - L(x, v_3) = t[p_1(v_3) - L(x, v_3)] + (1 - t)[p_2(v_3) - L(x, v_3)].
\]
Since the sup in the definition of $H(x, p)$ is attained at a unique point, and $v_1, v_2, v_3$ are all distinct, for $i = 1, 2$ we must have
It follows that
we obtain
\[
H(x, p) \ge \sup_{\|v\|_x \le A} p(v) + \inf_{x\in K,\,\|v\|_x \le A} -L(x, v).
\]
But $\sup_{\|v\|_x \le A} p(v) = A\|p\|_x$, and $C^*(A, K) = \inf_{x\in K,\,\|v\|_x \le A} -L(x, v)$ is finite by compactness.
230 6.0. Appendix
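\[
X_H(x,p) = \Big(\frac{\partial H}{\partial p}(x,p),\ -\frac{\partial H}{\partial x}(x,p)\Big).
\]
\[
\varphi^L_t = \mathcal{L}^{-1} \circ \varphi^H_t \circ \mathcal{L},
\]
where $\varphi^H_t$ is the partial flow of the $C^1$ vector field $X_H$.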
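\[
x(t) = \gamma(t) \quad\text{and}\quad p(t) = \frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t)).
\]
By Theorem 6.2.7, $x(t) = \gamma(t)$ is $C^1$ with $\dot x(t) = \dot\gamma(t)$. The fact that $p(t)$ is $C^1$ follows
again from Theorem 6.2.7, which also yields in local coordinates
\[
\dot p(t) = \frac{\partial L}{\partial x}(\gamma(t), \dot\gamma(t)).
\]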
Since $(x(t), p(t)) = \mathcal{L}(\gamma(t), \dot\gamma(t))$, we conclude from Proposition 6.2.11 that $t \mapsto (x(t), p(t))$
satisfies the ODE
\[
\dot x = \frac{\partial H}{\partial p}(x,p), \qquad
\dot p = -\frac{\partial H}{\partial x}(x,p).
\]
Therefore the Legendre transform of the speed curve of a minimizer is a solution of the
Hamiltonian vector field $X_H$.
If L is a Tonelli Lagrangian, by Proposition 6.2.11 the Hamiltonian H is $C^2$. Therefore
the vector field $X_H$ is $C^1$, and it defines a (partial) $C^1$ flow $\varphi^H_t$. The rest follows from
what was obtained above and the fact that the Legendre transform is $C^1$.
We recall the following definition.
Definition 6.2.14 (Energy). If L is a $C^1$ Lagrangian on the manifold M, its energy
$E : TM \to \mathbb{R}$ is defined by
\[
E(x,v) = H \circ \mathcal{L}(x,v) = \frac{\partial L}{\partial v}(x,v)(v) - L(x,v).
\]
Corollary 6.2.15 (Conservation of Energy). If L is a $C^1$ Lagrangian on the manifold
M, and γ : [a, b] → M is a $C^1$ minimizer for L, then the energy E is constant on the
speed curve
\[
s \mapsto (\gamma(s), \dot\gamma(s)).
\]
Proof. In fact $E(\gamma(s), \dot\gamma(s)) = H \circ \mathcal{L}(\gamma(s), \dot\gamma(s))$. But $s \mapsto \mathcal{L}(\gamma(s), \dot\gamma(s))$ is a solution
of the vector field $X_H$, and the Hamiltonian H is constant on orbits of $X_H$.
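For a concrete check of the corollary, one can carry out the computation by hand in a model case. This is a sketch under extra assumptions that are not in the statement: $M = \mathbb{R}^m$, $L(x,v) = \tfrac12|v|^2 - U(x)$ with $U$ of class $C^1$ (a name introduced here), and a minimizer $\gamma$ that is $C^2$, so that the Euler-Lagrange equation reads $\ddot\gamma = -\nabla U(\gamma)$.

```latex
\begin{align*}
E(x,v) &= \frac{\partial L}{\partial v}(x,v)(v) - L(x,v)
        = |v|^2 - \tfrac12 |v|^2 + U(x)
        = \tfrac12 |v|^2 + U(x), \\
\frac{d}{ds}\, E(\gamma(s),\dot\gamma(s))
       &= \langle \dot\gamma(s), \ddot\gamma(s)\rangle
          + \langle \nabla U(\gamma(s)), \dot\gamma(s)\rangle
        = \langle \dot\gamma(s),\, \ddot\gamma(s) + \nabla U(\gamma(s))\rangle = 0.
\end{align*}
```

The proof in the text does not need this extra regularity, since it only uses that H is constant along orbits of $X_H$.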
Proposition 6.2.16. If L is a weak Tonelli Lagrangian on the manifold M , then for
every compact subset K ⊂ M , and every C < +∞, the set
\[
\{(x,v) \in T_K M \mid E(x,v) \le C\}
\]
is compact, i.e. the map $E : TM \to \mathbb{R}$ is proper on every subset of the form $\pi^{-1}(K)$,
where K is a compact subset of M.
Proof. Since E = H ◦ L , this follows from the fact that H is proper and L is a
homeomorphism.
Proposition 6.2.17. Let L be a weak Tonelli Lagrangian on M . Suppose K is a
compact subset of M , and t > 0. Then we can find a compact subset K̃ ⊂ M and a
finite constant A, such that every minimizer γ : [0, t] → M with γ(0), γ(t) ∈ K satisfies
$\gamma([0,t]) \subset \tilde K$ and $\|\dot\gamma(s)\|_{\gamma(s)} \le A$ for every s ∈ [0, t].
Proof. We will use as a distance d the one coming from the complete Riemannian metric.
All closed balls of finite radius for this distance are compact (Hopf-Rinow theorem). We choose
x0 ∈ K, and R such that K ⊂ B(x0 , R) (we could take R = diam(K), the diameter of
K). We now pick x, y ∈ K. If α : [0, t] → M is a geodesic with α(0) = x, α(t) = y and
whose length is d(x, y) (such a geodesic exists by completeness), the inequality
implies that $\alpha([0,t]) \subset \bar B(x_0, 3R)$. Moreover $\|\dot\alpha(s)\|_{\alpha(s)} = d(x,y)/t \le 2R/t$ for every
s ∈ [0, t]. By compactness, the Lagrangian L is bounded on the set
\[
\|\dot\gamma(s_0)\|_{\gamma(s_0)} \le \theta - C.
\]
Moreover
\[
\gamma([0,t]) \subset \bar B(\gamma(0), t(\theta - C)) \subset \bar B(x_0, R + t(\theta - C)).
\]
We set $\tilde K = \bar B(x_0, R + t(\theta - C))$. If we define
Observing that the set $\tilde K$ does not depend on γ, this finishes the proof.
where the infimum is taken over all the absolutely continuous curves γ : [0, t] → M, with
γ(0) = x, and γ(t) = y, and $A_L(\gamma)$ is the action $\int_0^t L(\gamma(s), \dot\gamma(s))\,ds$ of γ.
Using a change of variable in the integral defining the action, it is not difficult to
see that $c_{t,L} = c_{1,L_t}$, where the Lagrangian $L_t$ on M is defined by $L_t(x,v) = tL(x, t^{-1}v)$.
Observe that $L_t$ is a (weak) Tonelli Lagrangian if L is.
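The change of variable mentioned above can be written out as follows. This is a sketch; the reparametrized curve $\sigma$ and the integration variable $u$ are names introduced here. Given an absolutely continuous curve $\gamma : [0,t] \to M$ with $\gamma(0) = x$ and $\gamma(t) = y$, set $\sigma(s) = \gamma(ts)$ for $s \in [0,1]$, so that $\dot\sigma(s) = t\,\dot\gamma(ts)$.

```latex
\begin{align*}
A_{L_t}(\sigma)
  &= \int_0^1 L_t(\sigma(s), \dot\sigma(s))\,ds
   = \int_0^1 t\,L\bigl(\sigma(s), t^{-1}\dot\sigma(s)\bigr)\,ds \\
  &= \int_0^1 L\bigl(\gamma(ts), \dot\gamma(ts)\bigr)\,t\,ds
   = \int_0^t L(\gamma(u), \dot\gamma(u))\,du
   = A_L(\gamma).
\end{align*}
```

Since $\gamma \mapsto \sigma$ is a bijection between absolutely continuous curves on $[0,t]$ and on $[0,1]$ with the same endpoints, taking the infimum on both sides gives $c_{t,L}(x,y) = c_{1,L_t}(x,y)$.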
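Proof. By the remark preceding the statement of the theorem, it suffices to prove this
for $c = c_{1,L}$. Let n be the dimension of M. Choose two charts $\varphi_i : U_i \xrightarrow{\sim} \mathbb{R}^n$, i = 0, 1,
on M. We will show that
\[
(\tilde x_0, \tilde x_1) \mapsto c\bigl(\varphi_0^{-1}(\tilde x_0), \varphi_1^{-1}(\tilde x_1)\bigr)
\]
is semi-concave on $\mathring{B} \times \mathring{B}$, where B is the closed Euclidean unit ball of center 0 in $\mathbb{R}^n$. By
Proposition 6.2.17, we can find a constant A such that for every minimizer γ : [0, 1] → M,
with $\gamma(i) \in \varphi_i^{-1}(B)$, i = 0, 1, we have
\[
\|\dot\gamma(s)\|_{\gamma(s)} \le A \quad\text{for every } s \in [0,1].
\]
We now pick δ > 0 such that for all $z_1, z_2 \in \mathbb{R}^n$, with $\|z_1\|_{\mathrm{euc}} \le 1$, $\|z_2\|_{\mathrm{euc}} = 2$,
\[
d\bigl(\varphi_i^{-1}(z_1), \varphi_i^{-1}(z_2)\bigr) \ge \delta, \quad i = 0, 1,
\]
where $\|\cdot\|_{\mathrm{euc}}$ denotes the Euclidean norm. Then we choose ε > 0 such that Aε < δ. It
follows that
\[
\gamma([0,\varepsilon]) \subset \varphi_0^{-1}\bigl(2\mathring{B}\bigr)
\quad\text{and}\quad
\gamma([1-\varepsilon,1]) \subset \varphi_1^{-1}\bigl(2\mathring{B}\bigr).
\]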
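\[
\|\dot{\tilde\gamma}_{h_i}(s)\|_{\mathrm{euc}} \le \|h_i\|_{\mathrm{euc}} + \|(\varphi_i \circ \gamma)'(s)\|_{\mathrm{euc}} \le 2 + \|(\varphi_i \circ \gamma)'(s)\|_{\mathrm{euc}}.
\]
Since we know that the speed of γ is bounded in M, we can find a constant $A_1$ such that
\[
L_i(z,v) = L\bigl(\varphi_i^{-1}(z), D[\varphi_i^{-1}](v)\bigr).
\]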
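Hence
\begin{align*}
& c\bigl(\varphi_0^{-1}(\tilde x_0 + h_0), \varphi_1^{-1}(\tilde x_1 + h_1)\bigr) - c\bigl(\varphi_0^{-1}(\tilde x_0), \varphi_1^{-1}(\tilde x_1)\bigr) \\
&\quad \le \int_0^{\varepsilon} \bigl[ L_0(\tilde\gamma_{h_0}(t), \dot{\tilde\gamma}_{h_0}(t)) - L_0(\varphi_0 \circ \gamma(t), (\varphi_0 \circ \gamma)'(t)) \bigr]\,dt \\
&\qquad + \int_{1-\varepsilon}^{1} \bigl[ L_1(\tilde\gamma_{h_1}(t), \dot{\tilde\gamma}_{h_1}(t)) - L_1(\varphi_1 \circ \gamma(t), (\varphi_1 \circ \gamma)'(t)) \bigr]\,dt.
\end{align*}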
We now call ω a common modulus of continuity for the derivatives $DL_0$ and $DL_1$ on
the compact set $\bar B(0,4) \times \bar B(0,A_1)$. Here $DL_0$ and $DL_1$ denote the total derivatives of
$L_0$ and $L_1$, i.e. with respect to all variables. When L has a derivative which is locally
Lipschitz, then $DL_0$ and $DL_1$ are also locally Lipschitz on $\mathbb{R}^n \times \mathbb{R}^n$, and the modulus ω
can be taken linear. Since $\tilde\gamma_{h_i}(s) \in \mathring{B}(0,4)$ and $\|\dot{\tilde\gamma}_{h_i}(s)\| \le A_1$, we get the estimate
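\begin{align*}
& c\bigl(\varphi_0^{-1}(\tilde x_0 + h_0), \varphi_1^{-1}(\tilde x_1 + h_1)\bigr) - c\bigl(\varphi_0^{-1}(\tilde x_0), \varphi_1^{-1}(\tilde x_1)\bigr) \\
&\quad \le \int_0^{\varepsilon} DL_0\bigl(\varphi_0 \circ \gamma(t), (\varphi_0 \circ \gamma)'(t)\bigr)\Bigl(\frac{\varepsilon - t}{\varepsilon}\,h_0,\ -\frac{1}{\varepsilon}\,h_0\Bigr)\,dt \\
&\qquad + \int_{1-\varepsilon}^{1} DL_1\bigl(\varphi_1 \circ \gamma(t), (\varphi_1 \circ \gamma)'(t)\bigr)\Bigl(\frac{t - (1-\varepsilon)}{\varepsilon}\,h_1,\ \frac{1}{\varepsilon}\,h_1\Bigr)\,dt \\
&\qquad + \omega\Bigl(\frac{1}{\varepsilon}\|h_0\|_{\mathrm{euc}}\Bigr)\frac{1}{\varepsilon}\|h_0\|_{\mathrm{euc}} + \omega\Bigl(\frac{1}{\varepsilon}\|h_1\|_{\mathrm{euc}}\Bigr)\frac{1}{\varepsilon}\|h_1\|_{\mathrm{euc}}.
\end{align*}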
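We observe that the sum of the first two terms in the right hand side is linear in $(h_0, h_1)$, while the
sum of the last two is bounded by
\[
\omega\Bigl(\frac{1}{\varepsilon}\|(h_0,h_1)\|_{\mathrm{euc}}\Bigr)\frac{1}{\varepsilon}\|(h_0,h_1)\|_{\mathrm{euc}}.
\]
It follows that $(\tilde x_0, \tilde x_1) \mapsto c\bigl(\varphi_0^{-1}(\tilde x_0), \varphi_1^{-1}(\tilde x_1)\bigr)$
is semi-concave for the modulus $\tilde\omega(r) = \frac{1}{\varepsilon}\,\omega\bigl(\frac{1}{\varepsilon} r\bigr)$
on $\mathring{B} \times \mathring{B}$, as wanted.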
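\[
(v,w) \mapsto \frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t))(w) - \frac{\partial L}{\partial v}(\gamma(0), \dot\gamma(0))(v),
\]
where γ : [0, t] → M is a minimizer for L with $\gamma(0) = x_0$, $\gamma(t) = y_0$, and $(v,w) \in
T_x M \times T_y M = T_{(x,y)}(M \times M)$.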
Proof. Again we will do it only for t = 1. If we use the notation introduced in the
previous proof, we see that a superdifferential of
\[
(\tilde x_0, \tilde x_1) \mapsto c\bigl(\varphi_0^{-1}(\tilde x_0), \varphi_1^{-1}(\tilde x_1)\bigr)
\]
is given by
\[
(h_0, h_1) \mapsto l_0(h_0) + l_1(h_1),
\]
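where
\begin{align*}
l_0(h_0) &= -\int_0^{\varepsilon} \Bigl[ \frac{t - \varepsilon}{\varepsilon}\,\frac{\partial L_0}{\partial x}\bigl(\varphi_0 \circ \gamma(t), (\varphi_0 \circ \gamma)'(t)\bigr)(h_0)
 + \frac{1}{\varepsilon}\,\frac{\partial L_0}{\partial v}\bigl(\varphi_0 \circ \gamma(t), (\varphi_0 \circ \gamma)'(t)\bigr)(h_0) \Bigr]\,dt, \\
l_1(h_1) &= \int_{1-\varepsilon}^{1} \Bigl[ \frac{t - (1-\varepsilon)}{\varepsilon}\,\frac{\partial L_1}{\partial x}\bigl(\varphi_1 \circ \gamma(t), (\varphi_1 \circ \gamma)'(t)\bigr)(h_1)
 + \frac{1}{\varepsilon}\,\frac{\partial L_1}{\partial v}\bigl(\varphi_1 \circ \gamma(t), (\varphi_1 \circ \gamma)'(t)\bigr)(h_1) \Bigr]\,dt.
\end{align*}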
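By Theorem 6.2.7, the curve $t \mapsto \varphi_0 \circ \gamma(t)$ is a $C^1$ extremal of $L_0$ and it satisfies the
following integrated form of the Euler-Lagrange equation:
\[
\frac{\partial L_0}{\partial v}\bigl(\varphi_0 \circ \gamma(t), (\varphi_0 \circ \gamma)'(t)\bigr) - \frac{\partial L_0}{\partial v}\bigl(\varphi_0 \circ \gamma(0), (\varphi_0 \circ \gamma)'(0)\bigr)
= \int_0^t \frac{\partial L_0}{\partial x}\bigl(\varphi_0 \circ \gamma(s), (\varphi_0 \circ \gamma)'(s)\bigr)\,ds.
\]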
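This gives us
\begin{align*}
l_0(h_0) = {} & -\frac{\partial L_0}{\partial v}\bigl(\varphi_0 \circ \gamma(0), (\varphi_0 \circ \gamma)'(0)\bigr)(h_0) \\
& - \frac{1}{\varepsilon} \int_0^{\varepsilon} \frac{d}{dt}\Bigl[ (t - \varepsilon) \int_0^t \frac{\partial L_0}{\partial x}\bigl(\varphi_0 \circ \gamma(s), (\varphi_0 \circ \gamma)'(s)\bigr)(h_0)\,ds \Bigr]\,dt.
\end{align*}
The second term in the right hand side is 0, since the function in brackets vanishes at both
t = 0 and t = ε, and so $l_0$ reinterpreted on $T_{x_0} M$
rather than on $\mathbb{R}^n$ gives $-\frac{\partial L}{\partial v}(\gamma(0), \dot\gamma(0))$. The treatment for $l_1$ is the same.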
We have avoided the first variation formula in the proof of Corollary 6.2.20, because
it is usually proven for $C^2$ variations of curves and $C^2$ Lagrangians. Of course, our
argument to prove this Corollary is basically a proof of the first variation formula for
$C^1$ Lagrangians. This is already known, and the proof is the standard one.
(UC) If $\gamma_i : [a_i, b_i] \to M$, i = 1, 2, are two L-minimizers such that $\gamma_1(t_0) = \gamma_2(t_0)$ and
$\dot\gamma_1(t_0) = \dot\gamma_2(t_0)$, for some $t_0 \in [a_1, b_1] \cap [a_2, b_2]$, then $\gamma_1 = \gamma_2$ on the whole interval
$[a_1, b_1] \cap [a_2, b_2]$.
Then, for every t > 0, the cost $c_{t,L} : M \times M \to \mathbb{R}$ satisfies the left (and the right) twist
condition of Definition 1.2.4.
Moreover, if $(x,y) \in \mathcal{D}(\Lambda^l_{c_{t,L}})$, then we have:
(i) there is a unique L-minimizer γ : [0, t] → M such that x = γ(0), and y = γ(t);
(ii) the speed $\dot\gamma(0)$ is uniquely determined by the equality
\[
\frac{\partial c_{t,L}}{\partial x}(x,y) = -\frac{\partial L}{\partial v}(x, \dot\gamma(0)).
\]
Proof. We first prove part (ii). Pick γ : [0, t] → M an L-minimizer with x = γ(0) and
y = γ(t). From Corollary 6.2.20 we obtain the equality
\[
\frac{\partial c_{t,L}}{\partial x}(x,y) = -\frac{\partial L}{\partial v}(x, \dot\gamma(0)). \tag{6.2.4}
\]
Since the $C^1$ map $v \mapsto L(x,v)$ is strictly convex, the Legendre transform $v \in T_x M \mapsto
\partial L/\partial v(x,v)$ is injective, and therefore $\dot\gamma(0) \in T_x M$ is indeed uniquely determined by
Equation (6.2.4) above. This proves (ii).
To prove statement (i), consider another L-minimizer $\gamma_1 : [0,t] \to M$ with $x = \gamma_1(0)$
and $y = \gamma_1(t)$. By what we just said, we also have
\[
\frac{\partial c_{t,L}}{\partial x}(x,y) = -\frac{\partial L}{\partial v}(x, \dot\gamma_1(0)).
\]
By the uniqueness already proved in statement (ii), we get $\dot\gamma_1(0) = \dot\gamma(0)$. It now follows
from condition (UC) that $\gamma = \gamma_1$ on the whole interval [0, t].
The twist condition follows easily. Consider $(x,y), (x,y_1) \in \mathcal{D}(\Lambda^l_{c_{t,L}})$ such that
\[
\frac{\partial c_{t,L}}{\partial x}(x,y) = \frac{\partial c_{t,L}}{\partial x}(x,y_1). \tag{6.2.5}
\]
By (i) there is a unique L-minimizer γ : [0, t] → M (resp. $\gamma_1 : [0,t] \to M$) such that
x = γ(0), y = γ(t) (resp. $x = \gamma_1(0)$, $y_1 = \gamma_1(t)$), and
\[
\frac{\partial c_{t,L}}{\partial x}(x,y) = -\frac{\partial L}{\partial v}(x, \dot\gamma(0))
\quad\text{and}\quad
\frac{\partial c_{t,L}}{\partial x}(x,y_1) = -\frac{\partial L}{\partial v}(x, \dot\gamma_1(0)).
\]
From equation (6.2.5), and the injectivity of the Legendre transform of L, it follows
that $\dot\gamma_1(0) = \dot\gamma(0)$. From condition (UC) we get $\gamma = \gamma_1$ on the whole interval [0, t]. In
particular, we obtain $y = \gamma(t) = \gamma_1(t) = y_1$.
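To see the statement at work in the simplest setting, consider the following model case. It is only a sketch, under the assumption (not taken from the text) that $M = \mathbb{R}^n$ and $L(x,v) = \tfrac12|v|^2$, for which the L-minimizers are straight segments run at constant speed and condition (UC) clearly holds.

```latex
\begin{align*}
\gamma(s) &= x + \frac{s}{t}\,(y - x), \quad s \in [0,t], &
\dot\gamma(0) &= \frac{y - x}{t}, \\
c_{t,L}(x,y) &= \int_0^t \tfrac12\,|\dot\gamma(s)|^2\,ds = \frac{|y - x|^2}{2t}, &
\frac{\partial c_{t,L}}{\partial x}(x,y) &= \frac{x - y}{t}
  = -\frac{\partial L}{\partial v}(x, \dot\gamma(0)).
\end{align*}
```

Here $\frac{\partial L}{\partial v}(x,v) = v$, so the last equality is exactly the one in (ii), and the map $y \mapsto \frac{\partial c_{t,L}}{\partial x}(x,y) = (x-y)/t$ is visibly injective, which is the left twist condition in this case.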
with equality if and only if γ is parameterized with $\|\dot\gamma(s)\|_x$ constant, i.e. proportionally
to arc-length. This of course implies
\[
(b-a)^{-r/s}\,\ell_g(\gamma)^r \le \int_a^b \|\dot\gamma(s)\|_x^r\,ds,
\]
Bibliography
[1] A.Abbondandolo & A.Figalli: High action orbits for Tonelli Lagrangians and
superlinear Hamiltonians on compact configuration spaces. J. Differential Equations,
234 (2007), no.2, 626-653.
[2] R.Abraham & J.E.Marsden: Foundations of mechanics. Second edition, revised
and enlarged. (1978) Benjamin/Cummings Publishing Co. Inc. Advanced Book Pro-
gram, Reading, Mass.
[3] L.Ambrosio: Lecture notes on optimal transport problems, in Mathematical Aspects
of Evolving Interfaces, Lecture Notes in Math. 1812, Springer-Verlag, Berlin/New
York (2003), 1-52.
[4] L.Ambrosio: Transport equation and Cauchy problem for BV vector fields. Invent.
Math., 158 (2004), no.2, 227-260.
[5] L.Ambrosio: Lecture notes on transport equation and Cauchy
problem for non-smooth vector fields. Preprint, 2005 (available at
http://cvgmt.sns.it/people/ambrosio).
[6] L.Ambrosio, O.Ascenzi & G.Buttazzo: Lipschitz regularity for minimizers of
integral functionals with highly discontinuous integrands. J. Math. Anal. Appl., 142
(1989), no. 2, 301-316.
[7] L.Ambrosio & G.Crippa: Existence, uniqueness, stability and differentiability
properties of the flow associated to weakly differentiable vector fields. UMI-Lecture
Notes, to appear.
[8] L.Ambrosio & A.Figalli: Geodesics in the space of measure-preserving maps and
plans. Arch. Ration. Mech. Anal., to appear.
[9] L.Ambrosio & A.Figalli: On the regularity of the pressure field of Brenier’s weak
solutions to incompressible Euler equations. Calc. Var. Partial Differential Equations,
31 (2007), no. 4, 497-509.
[10] L.Ambrosio, N.Fusco & D.Pallara: Functions of bounded variation and free
discontinuity problems. Oxford Mathematical Monographs, 2000.
[11] L.Ambrosio, N.Gigli & G.Savaré: Gradient flows in metric spaces and in the
Wasserstein space of probability measures. Lectures in Mathematics, ETH Zurich,
Birkhäuser (2005).
[12] L.Ambrosio, B.Kirchheim & A.Pratelli: Existence of optimal transport maps
for crystalline norms. Duke Math. J., 125 (2004), no. 2, 207-241.
[13] L.Ambrosio & A.Pratelli: Existence and stability results in the L1 theory
of optimal transportation. Lecture Notes in Mathematics, 1813, Springer Verlag,
Berlin/New York (2003), 123-160.
[14] L.Ambrosio, S.Lisini & G.Savaré: Stability of flows associated to gradient vec-
tor fields and convergence of iterated transport maps. Manuscripta Math., 121 (2006),
1-50.
[15] V.Arnold: Sur la géométrie différentielle des groupes de Lie de dimension infinie
et ses applications à l’hydrodynamique des fluides parfaits. (French) Ann. Inst. Fourier
(Grenoble), 16 (1966), fasc. 1, 319-361.
[16] V.Bangert: Analytische Eigenschaften konvexer Funktionen auf Riemannschen
Mannigfaltigkeiten. J. Reine Angew. Math., 307 (1979), 309-324.
[17] V.Bangert: Minimal measures and minimizing closed normal one-currents.
Geom. Funct. Anal., 9 (1999), 413-427.
[18] S.Bates: Toward a precise smoothness hypothesis in Sard’s Theorem. Proc. Amer.
Math. Soc., 117 (1993), no. 1, 279-283.
[19] J.-D. Benamou & Y. Brenier: A computational fluid mechanics solution to the
Monge-Kantorovich mass transfer problem. Numer. Math., 84 (2000), 375-393.
[20] P.Bernard: Existence of $C^{1,1}$ critical sub-solutions of the Hamilton-Jacobi equation
on compact manifolds. Annales scientifiques de l’ENS, to appear.
[21] P.Bernard: Young measures, superposition, and transport. In preparation.
[22] P.Bernard & B.Buffoni: Optimal mass transportation and Mather theory. J.
Eur. Math. Soc., 9 (2007), no. 1, 85-121.
[23] P.Bernard & B.Buffoni: The Monge problem for supercritical Mañé potential
on compact manifolds. Adv. Math., 207 (2006), no. 2, 691-706.
[24] M.Bernot: Irrigation and Optimal Transport, Ph.D. Thesis, École Normale
Supérieure de Cachan, 2005. Available at http://www.umpa.ens-lyon.fr/~mbernot.
[25] M.Bernot, V.Caselles & J.M.Morel: Traffic plans. Publ. Mat., 49 (2005),
no. 2, 417-451.
[27] M.Bernot & A.Figalli: Synchronized traffic plans and stability of optima.
ESAIM Control Optim. Calc. Var., to appear.
[31] Y.Brenier: The least action principle and the related concept of generalized flows
for incompressible perfect fluids. J. Amer. Math. Soc., 2 (1989), 225-255.
[33] Y.Brenier: The dual least action problem for an ideal, incompressible fluid. Arch.
Rational Mech. Anal., 122 (1993), 323-351.
[34] Y.Brenier: A homogenized model for vortex sheets. Arch. Rational Mech. Anal.,
138 (1997), 319-353.
[44] F.H.Clarke & R.B.Vinter: Regularity properties of solutions to the basic prob-
lem in the calculus of variations. Trans. Amer. Math. Soc., 289 (1985), 73-98.
[45] F.H. Clarke: Methods of dynamic and nonsmooth optimization. CBMS-NSF Re-
gional Conference Series in Applied Mathematics, 57 (1989), Society for Industrial
and Applied Mathematics (SIAM), Philadelphia, PA.
[52] G.Contreras: Action potential and weak KAM solutions. Calc. Var. Partial Dif-
ferential Equations, 13 (2001), no. 4, 427-458.
[53] D’Arcy Thompson: On Growth and Form. Cambridge University Press, 1942.
[55] R.J.DiPerna & P.-L.Lions: On the Cauchy problem for Boltzmann equations:
global existence and weak stability. Annals of Math., 130 (1989), no.2, 321-366.
[56] R.J.DiPerna & P.-L.Lions: Ordinary differential equations, transport theory and
Sobolev spaces. Invent. Math., 98 (1989), no.3, 511-547.
[57] R.M.Dudley: Real Analysis and Probability. Cambridge University Press, 2002.
[58] D.G.Ebin & J.Marsden: Groups of diffeomorphisms and the motion of an ideal
incompressible fluid. Annals of Math., 2 (1970), 102–163.
[60] L.C.Evans & W.Gangbo: Differential equations methods for the Monge-
Kantorovich mass transfer problem. Mem. Amer. Math. Soc., 137 (1999).
[61] L.C.Evans & R.F.Gariepy: Measure Theory and Fine Properties of Functions.
Studies in Advanced Mathematics, CRC Press, Boca Raton, FL, 1992.
[62] A.Fathi: Théorème KAM faible et théorie de Mather sur les systèmes lagrangiens.
C. R. Acad. Sci. Paris Sér. I Math., 324 (1997), no. 9, 1043-1046.
[67] A.Fathi, A.Figalli & L.Rifford: On a problem of Mather. Comm. Pure Appl.
Math., to appear.
[68] A.Fathi & E.Maderna: Weak KAM theorem on non compact manifolds. Non-
linear Differential Equations Appl., to appear.
[72] S.Ferry: When ε-boundaries are manifolds. Fund. Math., 90 (1976), no. 3, 199-210.
[73] A.Figalli: Trasporto ottimale su varietà non compatte. Degree Thesis (in English),
(2006) (available at http://cvgmt.sns.it/people/figalli).
[74] A.Figalli: The Monge problem on non-compact manifolds. Rend. Sem. Mat. Univ.
Padova, 117 (2007), 147-166.
[76] A.Figalli: A simple proof of the Morse-Sard theorem in Sobolev spaces. Proc.
Amer. Math. Soc., to appear.
[77] A.Figalli: Existence and uniqueness of martingale solutions for SDEs with rough
or degenerate coefficients. J. Funct. Anal., 254 (2008), no.1, 109-153.
[81] W.Gangbo: The Monge mass transfer problem and its applications, in Monge
Ampère equation: applications to geometry and optimization. Contemp. Math., 226
(1999), Amer. Math. Soc., Providence, RI, 79-104.
[82] E.N.Gilbert: Minimum cost communication networks. Bell System Tech. J., 46
(1967), 2209-2227.
[83] M.Hauray, C.LeBris & P.-L.Lions: Deux remarques sur les flots généralisées
d’équations différentielles ordinaires. [Two remarks on generalized flows for ordinary
differential equations]. C. R. Acad. Sc. Paris, Sér. I, Math, submitted.
[88] N.V.Krylov & M.Röckner: Strong solutions of stochastic equations with singu-
lar time dependent drift. Probab. Theory Related Fields, 131 (2005), no. 2, 154-196.
[90] C.LeBris & P.-L.Lions: Renormalized solutions of some transport equations with
partially $W^{1,1}$ velocities and applications. Annali di matematica pura e applicata, 183
(2004), 97-130.
[92] C.LeBris & P.-L.Lions: Generalized flows for stochastic differential equations
with irregular coefficients. In preparation.
[108] R.J.McCann: A convexity principle for interacting gases. Adv. Math., 128
(1997), 153-179.
[110] G.Monge: Mémoire sur la Théorie des Déblais et des Remblais. Hist. de l’Acad.
des Sciences de Paris (1781), 666-704.
[111] A.P.Morse: The behavior of a function on its critical set. Annals of Math., 40
(1939), 62-70.
[114] A.Norton: A Critical Set with Nonnull Image has Large Hausdorff Dimension.
Trans. Amer. Math. Soc., 296 (1986), no. 1, 367-376.
[121] A.I.Shnirelman: The geometry of the group of diffeomorphisms and the dynamics
of an ideal incompressible fluid. (Russian) Mat. Sb. (N.S.), 128 (170) (1985), no. 1,
82–109.
[122] A.I.Shnirelman: Generalized fluid flows, their approximation and applications.
Geom. Funct. Anal., 4 (1994), no. 5, 586–620.
[123] S.K.Smirnov: Decomposition of solenoidal vector charges into elementary
solenoids and the structure of normal one-dimensional currents. St. Petersburg Math.
J., 5 (1994), 841–867.
[124] A.Sorrentino: On the total disconnectedness of the quotient Aubry set. Preprint,
2006.
[125] D.W.Stroock & S.R.S.Varadhan: Multidimensional diffusion processes.
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Math-
ematical Sciences], 233. Springer-Verlag, Berlin-New York, 1979.
[126] K.-T.Sturm: On the geometry of metric measure spaces. I. Acta Math., 196
(2006), no. 1, 65-131.
[127] K.-T.Sturm: On the geometry of metric measure spaces. II. Acta Math., 196
(2006), no. 1, 133-177.
[128] K.-T.Sturm & M.K.von Renesse: Transport Inequalities, Gradient Estimates,
Entropy and Ricci Curvature. Comm. Pure Appl. Math., 58 (2005), no. 7, 923-940.
[129] V.N.Sudakov: Geometric problems in the theory of infinite-dimensional proba-
bility distributions. Proc. Steklov Inst. Math., 141 (1979), 1-178.
[130] N.S.Trudinger & X.J.Wang: On the Monge mass transfer problem. Calc. Var.
Partial Differential Equations, 13 (2001), 19-31.
[131] A.M.Turing: The chemical basis of morphogenesis. Phil. Trans. Soc. Lond.,
B237 (1952), 37-72.
[132] C.Villani: Topics in optimal transportation. Graduate Studies in Mathematics,
58 (2003), American Mathematical Society, Providence, RI.
[133] C.Villani: Optimal transport, old and new. Lecture notes, 2005 Saint-Flour summer
school, available online at http://www.umpa.ens-lyon.fr/~cvillani.
[134] R.Vinter: Optimal control. Systems & Control: Foundations & Applications,
(2000), Birkhäuser Boston Inc., Boston, MA.
[135] Q.Xia: Optimal paths related to transport problems. Commun. Contemp. Math.,
5 (2003), no.2, 251-279.